Study: PETSc ex19." Performance Analysis in the Uintah Software Development Cycle." Presented at ISHPC 2002 - Kernel Tuning and Analysis Utilities." Research at UO" Presented for the Brain Biology Machine Initiative February 2005 Characterization of Global Address Space Applications: A Case Study with NWChem." J. R. Hammond, S. Krishnamoorthy, S. Shende, N. A. Romero, A. D. Malony. To appear in Concurrency and Computation: Practice and Experience 2010 Technology for Complex Parallel Systems." Technology for Component Software," Allen D. Malony, Sameer S. Shende, Presentation at Performance Tools Workshop, Los Alamos Computer Science Institute Symposium (LACSI'02), Santa Fe, NM, Oct. 2002. Parallel Performance System." Performance System." Sameer Shende, Allen D. Malony. Presented at Lawrence Livermore National Laboratory January 2006 Modeling of Scaled Parallel Programs," Technical Report, Universitat Erlangen--Nurnberg, IMMD VII, 1994. Performance Monitoring Interface for OpenMP By Bernd Mohr, Allen D. Malony, Hans-Christian Hoppe, Frank Schlimbach, Grant Haab, Jay Hoeflinger, and Sanjiv Shah. Presented at EWOMP 2002 Sarje, Sukhyun Song, Douglas Jacobsen, Kevin A. Huck, Jeffrey K. Hollingsworth, Allen D. Malony, Samuel Williams, Leonid Oliker: Parallel Performance Optimizations on Unstructured Mesh-based Simulations. ICCS 2015: 2016-2025 and A. Malony, "Experimental Results for Vector Processing on the Alliant FX/8," CSRD Tech Report #549, UIUC, Feb. 1986. Salman, Allen D. Malony, Matthew J. Sottile: An Open Domain-Extensible Environment for Simulation-Based Scientific Investigation (ODESSI). ICCS (1) 2009: 23-32 Salman, Allen D. Malony, Sergei Turovets, Don M. Tucker: Use of Parallel Simulated Annealing for Computational Modeling of Human Head Conductivity. International Conference on Computational Science (1) 2007: 86-93 Salman, Allen D. Malony, Sergei Turovets, Vasily Volkov, David Ozog, Don M. 
Tucker: Next-generation human brain neuroimaging and the role of high-performance computing. HPCS 2013: 234-242 Salman, Sergei Turovets, Allen Malony, Jeff Eriksen, and Don Tucker, "Computational Modeling of Human Head Conductivity." Presented at International Conference on Computational Science. Salman , Sergei Turovets, Allen Malony, and Vasily Volkov, "Multi-Cluster, Mixed-Mode Computational Modeling of Human Head Conductivity." Presented at IWOMP 2005 Computing Laboratory, Los Alamos National Laboratory: PDT: Program Database Toolkit, Supercomputing '99 flyer, Los Alamos National Laboratory Publication LALP-99-204, November 1999. Computing Laboratory, Los Alamos National Laboratory: TAU: Tuning and Analysis Utilities, Supercomputing '99 flyer, Los Alamos National Laboratory Publication LALP-99-205, November 1999. Qawasmeh, Abid Muslim Malik, Barbara M. Chapman, Kevin A. Huck, Allen D. Malony: Open Source Task Profiling by Extending the OpenMP Runtime API. IWOMP 2013: 186-199 D. Malony , "ICONIC Grid – Improving Diagnosis of Brain Disorders." Presented at Super Computing Conference 2004 D. Malony, "Data Interpretation and Experiment Planning in Performance Tools," Joint International Conference on Measurement and Modeling of Computer Systems, Proceedings of the 1995 ACM SIGMETRICS joint international conference on Measurement and Modeling of Computer Systems, pp. 62-63, 1995. D. Malony, "Distributed Computational Architectures for Integrated Time-Dynamic Neuroimaging." Presented at The Hill Center November 2005 D. Malony, "Event-based Performance Perturbation: a Case Study," Proc. third ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'91), pp. 201-212, 1991. D. Malony, "High-Performance Computing, Computational Science, and NeuroInformatics Research." Presented at PNNL April 2004 D. Malony, "JED: Just an Event Display," Chapter, Performance Instrumentation and Visualization, (Eds: M. Simmons, R. Koskela), ACM Press, NY, pp. 
99-115, 1990. D. Malony, "Multi-Experiment Performance Data Management and Data Mining." Presented at UTK 2005 D. Malony, "Multiprocessor Instrumentation: Approaches for Cedar," Chapter, Instrumentation for Future Parallel Computing Systems (Eds: M. Simmons, R. Koskela, I. Bucher), ACM Press, NY, pp. 1-33, 1989. [bio04] Allen D. Malony "Neuroinformatics, the ICONIC Grid, and Oregon’s Science Industry." Presented to the 2004 Bioscience Conference D. Malony, "Performance Observability," Ph.D. Dissertation, University of Illinois at Urbana-Champaign, Technical Report UIUCDCS-R-90-1603, October 1990. D. Malony, "Performance Technology for Scientific (Parallel and Distributed) Component Software." Presented at Grid Performance Workshop 2002 D. Malony, "Performance Technology for Productive, High-End Parallel Computing." Presented at LLNL October 2004 D. Malony, "Performance Technology for Productive, High-End Parallel Computing." Presented at ORNL 2005 D. Malony, "Performance Technology for Productive, High-End Parallel Computing." Presented at PSC 2005 D. Malony, "Regular Processor Arrays," Proc., 2nd Symposium on the Frontiers of Massively Parallel Computation, IEEE, pp. 499-502, 1988. D. Malony, "Supercomputing Around the World," Proc. 1992 ACM/IEEE Conference on Supercomputing (mini symposium), pp. 126-129, 1992. [europar02] Allen D. Malony, "TAU Performance DataBase Framework (PerfDBF)." Presented at APART EuroPar 2002 workshop D. Malony, "TAU Performance DataBase Framework (PerfDBF)." Presented at EuroPar 2002 D. Malony, "The TAU Performance System." Presented at the DOE ACTS workshop September 2002 D. Malony: Through the Looking-Glass: From Performance Observation to Dynamic Adaptation. HPDC 2015: 1 D. Malony, Adnan Salman, Sergei Turovets, Don M. Tucker, Vasily Volkov, Kai Li, Jung Eun Song, Scott Biersdorff, Colin Davey, Chris Hoge, David K. Hammond: Computational Modeling of Human Head Electromagnetics for Source Localization of Milliscale Brain Dynamics. 
MMVR 2011: 329-335 D. Malony, B. Robert Helm, "A theory and architecture for automating performance diagnosis," Future Generation Computer Systems, Vol 18, Issue 1, Elsevier Science Publishers, Amsterdam, pg. 189-200, Sept. 2001. D. Malony, Daniel A. Reed, "A Hardware-based Performance Monitor for the Intel iPSC/2 Hypercube," Proc. 4th International Conference on Supercomputing (ICS'90), pp. 213-226, 1990. D. Malony, Daniel A. Reed, "Models for Performance Perturbation Analysis," Workshop on Parallel and Distributed Debugging, Proc. 1991 ACM/ONR workshop on Parallel and Distributed Debugging, pp. 15-25, 1991 D. Malony, Daniel A. Reed, Patrick J. McGuire, "MPF: A Portable Message Passing Facility for Shared Memory Multiprocessors," Proc. ICPP 1987: pp. 739-741, 1987. D. Malony, David H. Hammerslag, David J. Jablonowski, "Traceview: A Trace Visualization Tool," IEEE Software, 8(5), pp. 19-28, 1991. D. Malony, Gregory V. Wilson, "Future directions in parallel performance environments", Proceedings of the workshop on performance measurement and visualization on Performance measurement and visualization of parallel systems, Elsevier Science Publishers, B.V., Amsterdam, pp. 331-351, 1993. D. Malony, Helen D. Karatza, William J. Knottenbelt, Sally McKee: Topic 2: Performance Prediction and Evaluation. Euro-Par 2012: 52-53 D. Malony, John L. Larson, Daniel A. Reed, "Tracing Application Program Execution on the Cray X-MP and Cray 2," Proc. of the 1990 conference on Supercomputing, pp. 60-73, 1990. D. Malony, Kevin A. Huck: General Hybrid Parallel Profiling. PDP 2014: 204-212 D. Malony, Sameer S. Shende, Robert Bell, "The TAU Performance System." Presented at Super Computing Conference November 2002 D. Malony, Sameer Shende, "Advances in the TAU Performance System." Presented to Dagstuhl Conference August 2002 Allen D. Malony, Sameer Shende, "PERC Ideas." Presented at Performance Technology for Productive, High-End Parallel Computing D. 
Malony, Sameer Shende, "Performance Engineering Technology for Complex Scientific Component Software." Presented at Pasadena CCA Meeting January 2003 D. Malony, Sameer Shende, "Performance Technology for Complex Parallel and Distributed Systems," in "Quality of Parallel and Distributed Programs and Systems," (Eds. Peter Kacsuk and Gabriele Kotsis), Nova Science Publishers, Inc., New York, pp. 25-41, 2003. D. Malony, Sameer Shende, "Recent Advances in the TAU Performance System." Presented at LLNL September 2002 D. Malony, Sameer Shende, Alan Morris, "Phase-Based Parallel Performance Profiling" Presented at ParaCo 2005 D. Malony, Sameer Shende, Craig Rasmussen, Jaideep Ray, Matt Sottile, "Performance Technology for Component Software - TAU." D. Malony, Sameer Shende, Robert Ansell-Bell, "Parallel Program Analysis Framework for the DOE ACTS Toolkit." Presented at Super Computing Conference 2000 D. Malony, Sameer Shende, Robert Bell, "Online Performance Monitoring, Analysis, and Visualization of Large-Scale Parallel Applications." Presented at ParaCo 2003 D. Malony, Shangkar Mayanglambam, Laurent Morin, Matthew J. Sottile, Stéphane Bihan, Sameer Shende, François Bodin: Performance Tool Integration in a GPU Programming Environment: Experiences with TAU and HMPP. PARCO 2009: 685-692 D. Malony, Vassilis Mertsiotakis, Andreas Quick, "Automatic Scalability Analysis of Parallel Programs Based on Modeling Techniques," In G. Haring and G. Kotsis, editors, Proc. 7th International Conference on Modeling Techniques and Tools for Computer Performance Evaluation, LNCS, Springer, 1994. D. Malony, Wolfgang E. Nagel: Open trace - The open trace format (OTF) and open tracing for HPC. SC 2006: 24 D. Malony, and Joseph R. Pickert, "An Environment Architecture and its use in Performance Data Analysis," Center for Supercomputing Research and Development, Technical Report 829, University of Illinois, Urbana-Champaign, Illinois, Oct. 1988. D. 
Malony and Sameer Shende, "Models for On-the-Fly Compensation of Measurement Overhead in Parallel Performance Profiling," pp. 72-82, 2005. [ijpdsn99] Allen D. Malony and Steven T. Hackstadt, "Performance of a System for Interacting with Parallel Applications," International Journal of Parallel and Distributed Systems and Networks, special issue on Measurement of Program and System Performance, M. H. Mickle, ed., Vol. 2, No. 3, 1999, Acta Press, Anaheim, CA, pp. 155-170. Malony, Sameer Shende, Wyatt Spear, Chee Wai Lee, and Scott Biersdorff, "Advances in the TAU Performance System," in Proc. of the 5th International Workshop on Parallel Tools for High Performance Computing, September 2011, ZIH, Dresden, published as book "Tools for High Performance Computing," Eds. H. Brunst, M. Muller, W. Nagel, M. Resch, pp. 119-130, Springer, 2011. Ferscha, Allen D. Malony: Performance data mining: Automated diagnosis, adaption, and optimization. Future Generation Comp. Syst. 18(1): 127-130 (2001) Ferscha and Allen D. Malony, "Performance-Oriented Development of Irregular, Unstructured and Unbalanced Parallel Applications in the N-MAP Environment," Proc. 8th GI/ITG Conference on Measuring, Modeling and Evaluating Computing and Communication Systems, MMB '95, LNCS 977, Springer, Berlin, pp. 340-356, 1995. Nataraj, Allen D. Malony, Alan Morris, Dorian C. Arnold, Barton P. Miller: A framework for scalable, parallel performance monitoring. Concurrency and Computation: Practice and Experience 22(6): 720-735 (2010) Nataraj, Suravee Suthikulpanit, "KTAU: Kernel TAU." Abdullah Shahneous, et al. "ARCS: Adaptive Runtime Configuration Selection for Power-Constrained OpenMP Applications." Cluster Computing (CLUSTER), 2016 IEEE International Conference on. IEEE, 2016. and A. 
Nataraj, "Benchmarking the effects of operating system interference on extreme-scale parallel machines", Appears in Cluster Computing 2008 (pg 3-16) published by Springer Netherlands 1386-7857 (Print) 1573-7543 (Online) Volume 11, Number 1 / March, 2008 S. Shende, "A Portable, Extensible, and Scalable Tool for Parallel Performance Profile Analysis", Proc. EUROPAR 2003 conference, LNCS 2790, Springer, Berlin, pp. 17-26, 2003. [wompat02] Bernd Mohr, Allen D. Malony, Rudi Eigenmann, "On the Integration and Use of OpenMP Performance Tools in the SPEC OMP2001 Benchmarks." Presented at John von Neumann Institut fur Computing [wompat02] Bernd Mohr, Allen Malony, Rudi Eigenmann, "On the Integration and Use of OpenMP Performance Tools in the SPEC OMP2001 Benchmarks." Presented at John von Neumann Institut fur Computing Task-Containers as an Alternative to Runtime-Stacking. In Proceedings of the 23rd European MPI Users' Group Meeting (pp. 51-63). ACM. Performance Views in Charm++: Projections Meets TAU." International Conference on Parallel Processing, September 2009. [sc93] F. Bodin, P. Beckman, D. Gannon, S. Yang, S. Kesavan, A. Malony, B. Mohr, "Implementing a Parallel C++ Runtime System for Scalable Parallel Systems," Proceedings of the 1993 Supercomputing Conference, Portland, Oregon, November 1993, pp. 588-597. [hipc95] D. Brown, A. Malony, B. Mohr, "Language-based Parallel Program Interaction: the Breezy Approach," Proceedings of the International Conference on High Performance Computing (HiPC'95), India, December 1995. A. D. Malony, "A Distributed Performance Analysis Architecture for Clusters," In Proc. IEEE International Conference on Cluster Computing (Cluster 2003), IEEE Computer Society, pp. 73-83, Dec. 2003. Performance Analysis of Parallel Systems: Concepts and Experiences," Proc. PARCO 2003 Conference, in (J. Joubert, W. Nagel, F. Peters, W. 
Walter eds.), Parallel Computing: Software Technology, Algorithms, Architectures and Applications, Advances in Parallel Computing 13 Elsevier 2004, pp. 737-744, 2004. B. Chapman, "A Component Infrastructure for Performance and Power Modeling of Parallel Scientific Applications", in Component-Based High Performance Computing (CBHPC 2008), 2008. Estevan Moron, Allen D. Malony: Development of embedded multicore systems. ETFA 2011: 1-4 Estevan Moron, Antonio Ideguchi, Marcio Merino Fernandes, Allen D. Malony: From MultiTask to MultiCore: Design and Implementation Using an RTOS. ISPDC 2014: 111-118 Spark on Lustre. In International Conference on High Performance Computing (pp. 649-659). Springer International Publishing. : A language to build program analysis tools through static binary instrumentation," in Proc. 20th Annual International Conference on High Performance Computing, HiPC'13, Hyderabad, India, IEEE, December 2013. [sc98] Christopher W. Harrop, Steven T. Hackstadt, Janice E. Cuny, Allen D. Malony, and Laura S. Magde, "Supporting Runtime Tool Interaction for Parallel Simulations," Proceedings of Supercomputing '98 (SC98), Orlando, FL, November 7-13, 1998 (Best Student Paper Finalist). J. Kundu. "Logical Time in Visualizations Produced by Parallel Programs," Proceedings of Visualization '92, 1992, pp. 186-193. A. Ellsworth, Allen D. Malony, Barry Rountree, Martin Schulz: Dynamic power sharing for higher job throughput. SC 2015: 80:1-80:11 A. Ellsworth, Allen D. Malony, Barry Rountree, Martin Schulz: POW: System-wide Dynamic Reallocation of Limited Power in HPC. HPDC 2015: 145-148 A. Ellsworth, Tapasya Patki, Martin Schulz, Barry Rountree, Allen D. Malony: A Unified Platform for Exploring Power Management Strategies. E2SC@SC 2016: 24-30 A. Reed, Allen D. Malony, Bradley D. McCredie, "Parallel Discrete Event Simulation: a Shared Memory Approach," Proc. 1987 ACM SIGMETRICS conference on Measurement and Modeling of Computer Systems, 15(1), pp. 
36-38, 1987. A. Reed, Allen D. Malony, Bradley McCredie, "Parallel Discrete Event Simulation Using Shared Memory," IEEE Transactions on Software Engineering, 14(4), April 1988, pp. 541-553, 1988 M. Pressel, David Cronk, and Sameer Shende, "PENVELOPE: A New Approach to Rapidly Predicting the Performance of Computationally Intensive Scientific Applications on Parallel Computer Architectures," Proc. 2004 DOD Users Group Conference, Williamsburg, Virginia, IEEE Computer Society, pp. 314-318, 2004. [etpsc94] Darryl I. Brown, Steven T. Hackstadt, Allen D. Malony, Bernd Mohr, "Program Analysis Environments for Parallel Language Systems: The TAU Environment," Proc. of the Workshop on Environments and Tools For Parallel Scientific Computing, Townsend, TN, May 1994, pp. 162-171. E. Bernholdt, Benjamin A. Allan, Robert Armstrong, Felipe Bertrand, Kenneth Chiu, Tamara L. Dahlgren, Kostadin Damevski, Wael R. Elwasif, Thomas G. W. Epperly, Madhusudhan Govindaraju, Daniel S. Katz, James A. Kohl, Manoj Krishnan, Gary Kumfert, J. Walter Larson, Sophia Lefantzi, Michael J. Lewis, Allen D. Malony, Lois C. McInnes, Jarek Nieplocha, Boyana Norris, Steven G. Parker, Jaideep Ray, Sameer Shende, Theresa L. Windus, and Shujia Zhou, "A Component Architecture for High-Performance Scientific Computing," International Journal of High Performance Computing Applications, ACTS Collection Special Issue, SAGE Publications, 20(2):163 -- 202, Summer 2006. K. Hammond, Benoit Scherrer, Allen D. Malony: Incorporating anatomical connectivity into EEG source estimation via sparse approximation with cortical graph wavelets. ICASSP 2012: 573-576 Ozog, Allen D. Malony, Jeff R. Hammond, Pavan Balaji: WorkQ: A many-core producer/consumer execution model applied to PGAS computations. ICPADS 2014: 632-639 Ozog, Jay McCarty, Grant Gossett, Allen D. Malony, Marina Guenza: Fast equilibration of coarse-grained polymeric liquids. J. Comput. Science 9: 33-38 (2015) Ozog, Sameer Shende, Allen D. Malony, Jeff R. 
Hammond, James Dinan, Pavan Balaji: Inspector/executor load balancing algorithms for block-sparse tensor contractions. ICS 2013: 483-484 de St. Germain, Alan Morris, Steven G. Parker, Allen D. Malony, Sameer Shende: Performance Analysis Integration in the Uintah Software Development Cycle. International Journal of Parallel Programming 31(1): 35-53 (2003) Dou, Gwen A. Frishkoff, Jiawei Rong, Robert M. Frank, Allen D. Malony, Don M. Tucker: Development of NeuroElectroMagnetic ontologies(NEMO): a framework for mining brainwave ontologies. KDD 2007: 270-279 S. Shende, "Performance Instrumentation and Measurement for Terascale Systems," Proc. International Conference on Computational Science (ICCS 2003), LNCS 2660, Springer, Berlin, pp. 53-62, 2003. S. Shende, "Performance Instrumentation and Measurement for Terascale Systems," Proc. Terascale Performance Analysis Workshop, International Conference on Computational Science (ICCS 2003), 2003. Power Management with Argo. In Parallel and Distributed Processing Symposium Workshops, 2016 IEEE International (pp. 1118- 1121). IEEE. - massively parallel electronic structure calculations with Python-based software." International Conference on Computational Science 2011. , "Experimentally Characterizing the Behavior of Multiprocessor Memory Systems: A Case Study," IEEE Transactions on Software Engineering, 16(2), pp 216-223, 1990 and P.-C. Yew, "Performance Analysis on the Cedar System," CSRD Report No. 680, University of Illinois at Urbana-Champaign, Sept. 1987. Generic and Configurable Source-Code Instrumentation Component". Proceedings of the International Conference on Computational Science 2009. pp. 696-705, LNCS 5545. C. Hulette, Matthew J. Sottile, Allen D. Malony: A Type-Based Approach to Separating Protocol from Application Logic - A Case Study in Hybrid Computer Programming. Euro-Par 2012: 40-51 C. Hulette, Matthew J. Sottile, Allen D. Malony: Composing typemaps in Twig. GPCE 2012: 41-49 C. Hulette, Matthew J. 
Sottile, Allen D. Malony: WOOL: A Workflow Programming Language. eScience 2008: 71-78 Fursin, Renato Miceli, Anton Lokhmotov, Michael Gerndt, Marc Baboulin, Allen D. Malony, Zbigniew Chamski, Diego Novillo, Davide Del Vento: Collective mind: Towards practical and collaborative auto-tuning. Scientific Programming 22(4): 309-329 (2014) Y. Zhang, "Performance database technology for SciDAC applications", Journal of Physics: Conference Series, vol. 78, June 2007. Characterization of Global Address Space Applications: A Case Study with NWChem", Concurrency and Computation: Practice and Experience 24(2): 135-154, John Wiley and Sons, DOI:10.1002/cpe.1881, 2012. Childs, Scott Biersdorff, David Poliakoff, David Camp, Allen D. Malony: Particle advection performance over varied architectures and workloads. HiPC 2014: 1-10 [tr9605] Harold H. Hersey, Steven T. Hackstadt, Lars T. Hansen, and Allen D. Malony, "Viz: A Visualization Programming System," University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-96-05, April 1996. and Automating Performance Diagnosis: the Poirot approach," Proc. 9th International Parallel Processing Symposium (IPPS'95), pp. Brunst, Allen D. Malony, Sameer S. Shende, and Robert Bell, "Online Remote Trace Analysis of Parallel Applications on High-Performance Clusters", Proceedings of ISHPC'03 Conference, LNCS 2858, Springer, Berlin, pp. 440-449, 2003. B. Norris, "Capturing Performance Knowledge for Automated Analysis", in International Conference for High Performance Computing, Networking, Storage and Analysis (SC'08), 2008. Support for Parallel Performance Data Mining," Ph.D. Dissertation, University of Oregon, March 2009. A. Morris, "Design and Implementation of a Parallel Performance Data Management Framework," Proc. International Conference on Parallel Processing (ICPP 2005), IEEE Computer Society, 2005. A. D. 
Malony, "PerfExplorer: A Performance Data Mining Framework for Large-Scale Parallel Computing," in Proc. of SC 2005 Conference, ACM, 2005. and A. Morris. "TAUg: Runtime Global Performance Data Access using MPI." EuroPVM/MPI Conference, LNCS 4192, pp. 313-321, Springer, September 2006. and A. Morris. "TAUg: Runtime Global Performance Data Access using MPI." EuroPVM/MPI Conference, September 2006 A. Morris, "Knowledge Support and Automation for Performance Analysis with PerfExplorer 2.0", Large-Scale Programming Tools and Environments, special issue of Scientific Programming, vol. 16, no. 2-3, pp. 123--134. 2008. Parallel Performance Analysis with TAU, PerfDMF and PerfExplorer." Presented at International Conference on Parallel Computing (ParCo) September 2007. A. Morris, "Scalable, Automated Performance Analysis with TAU and PerfExplorer", in Parallel Computing (ParCo2007), (Aachen, Germany), 2007. Early Prototype of an Autonomic Performance Environment for Exascale." Published in Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers, ICS'13, ACM, DOI: 10.1145/2491661.2481434, 2013. A. Morris, “Parametric Studies in Eclipse with TAU and PerfExplorer,” Workshop on Productivity and Performance (PROPER 2008), EuroPar 2008, Las Palmas de Gran Canaria, Spain, August, 2008. Dongarra, Shirley Moore, Philip Mucci, Sameer Shende, and Allen Malony, "Performance Instrumentation and Measurement for Terascale Systems." 2003 Holistic Approach for Performance Measurement and Analysis for Petascale Applications.” International Conference on Computational Science (ICCS 2009), Baton Rouge, LA, 2009 [sigplan94] Janice Cuny, George Forman, Alfred Hough, Joydip Kundu, Calvin Lin, Lawrence Snyder, and David Stemple, "The Ariadne Debugger: Scalable Application of Event-Based Abstraction," SIGPLAN Notices, Vol. 28, No. 12, 1994, pp. 85-95. [etpsc96] Janice Cuny, Robert Dunn, Steven T. Hackstadt, Christopher Harrop, Harold H. Hersey, Allen D. 
Malony, and Douglas Toomey, "Building Domain-Specific Environments for Computational Science: A Case Study in Seismic Tomography," International Journal of Supercomputing Applications and High Performance Computing, Vol. 11, No. 3, Fall 1997. Also appearing in the Proceedings of the Workshop on Environments and Tools For Parallel Scientific Computing, Lyon, France, August 1996. [ijsahpc97] Janice Cuny, Robert Dunn, Steven T. Hackstadt, Christopher Harrop, Harold H. Hersey, Allen D. Malony, and Douglas Toomey, "Building Domain-Specific Environments for Computational Science: A Case Study in Seismic Tomography," International Journal of Supercomputing Applications and High Performance Computing, Vol. 11, No. 3, Fall 1997. Also appearing in the Proceedings of the Workshop on Environments and Tools For Parallel Scientific Computing, Lyon, France, August 1996. Besnard, Allen D. Malony, Sameer Shende, Marc Pérache, Patrick Carribault, Julien Jaeger: An MPI Halo-Cell Implementation for Zero-Copy Abstraction. EuroMPI 2015: 3:1-3:9 K. Hollingsworth, Allen D. Malony, Jesús Labarta, Thomas Fahringer: Performance Evaluation and Prediction. Euro-Par 2003: 87 [sc98] Jenifer L. Skidmore, Matthew J. Sottile, Janice E. Cuny, and Allen D. Malony, "A Prototype Notebook-Based Environment for Computational Tools," Proceedings of Supercomputing '98, Orlando, FL, November 1998. C. Linford, Tyler A. Simon, Sameer Shende, Allen D. Malony: Profiling Non-numeric OpenSHMEM Applications with the TAU Performance System. OpenSHMEM 2014: 105-119 D. Malony, Arndt Bode, Dieter Kranzlmüller: Topic 1: Support Tools and Environments. Euro-Par 2004: 38 and P.C. Yew, "Performance Analysis on the Cedar System", Chapter, Performance Evaluation of Supercomputers, Edited by J. Martin, Elsevier Science Publishers B.V. (North-Holland), pp. 109-142, 1987. Li, Allen D. Malony, Don M. Tucker: A Multiscale Morphological Approach to Topology Correction of Cortical Surfaces. 
MIAR 2006: 52-59 Li, Allen D. Malony, Don M. Tucker: Automatic brain mr image segmentation by relative thresholding and morphological image analysis. VISAPP (1) 2006: 354-364 Li, Allen D. Malony, Robert Bell, Sameer Shende, "A Framework for Online Performance Analysis and Visualization of Large-Scale Parallel Applications." Presented at PPAM 2003 B. Pugh, "Integrating Database Technology with Comparison-Based Parallel Performance Diagnosis: The Perftrack Performance Experiment Management Tool", in International Conference for High Performance Computing, Networking, Storage and Analysis (SC'05), (Washington, DC, USA), IEEE Computer Society, 2005. M. Frank, Allen D. Malony: Parallel ICA methods for EEG neuroimaging. IPDPS 2006 A. Glass, Gwen A. Frishkoff, Robert M. Frank, Colin Davey, Joseph Dien, Allen D. Malony, Don M. Tucker: A Framework for Evaluating ICA Methods of Artifact Removal from Multichannel EEG. ICA 2004: 1033-1040 A. Huck, Allen D. Malony, Sameer Shende, Doug W. Jacobsen: Integrated Measurement for Cross-Platform OpenMP Performance Analysis. IWOMP 2014: 146-160 A. Huck, Kristin Potter, Doug W. Jacobsen, Hank Childs, Allen D. Malony: Linking performance data into scientific visualization tools. VPA@SC 2014: 50-57 [sc05] Knowledge Engineering for Model-based Parallel Performance Diagnosis (Poster) By Li Li and Allen D. Malony Computer and Information Science Department, University of Oregon, Eugene, OR Mey, S. Biersdorff, K. Diethelm, D. Eschweiler, M. Geimer, M. Gerndt, D. Lorenz, A. Malony, W. Nagel, Y. Oleynik, P. Philippen, P. Saviankou, D. Schmidl, S. Shende, R. Tschüter, M. Wagner, B. Wesarg, and F. Wolf, "Score-P: A Joint Performance Measurement Run-Time Infrastructure for Periscope, Scalasca, TAU, and Vampir," in Proc. of the 5th International Workshop on Parallel Tools for High Performance Computing, September 2011, ZIH, Dresden, published as book "Tools for High Performance Computing," Eds. H. Brunst, M. Muller, W. Nagel, M. Resch, pp. 
79-92, Springer, 2011. [joydip96] J. Kundu, "Integrating Event- and State- Based Approaches to Debugging of Parallel Programs," PhD Thesis, University of Massachusetts, Amherst, MA 01003, 1996. [front95] J. Kundu and J. E. Cuny, "A Scalable, Visual Interface for Debugging with Event-Based Behavioral Abstraction," Frontiers of Massively Parallel Computing, 1995, pp. 472-479. [icpp95] J. Kundu and J. E. Cuny, "The Integration of Event- and State-Based Debugging in Ariadne," Proceedings of the International Conference on Parallel Processing (ICPP '95), August 1995, pp. II 130-134. Gallivan, Dennis Gannon, William Jalby, Allen D. Malony, Harry A. G. Wijshoff, "Behavioral Characterization of Multiprocessor Memory Systems: a Case Study," ACM SIGMETRICS Performance Evaluation Review, Vol. 17, Issue 1, pp. 79-88, May 1989. Gallivan, William Jalby, Allen Malony, Harry Wijshoff, "Performance Prediction of Loop Constructs on Multiprocessor Hierarchical-Memory Systems," Proc. 3rd International Conference on Supercomputing (ICS'86), pp. 433-442, 1986. A. Morris, "TAUmon: Scalable Online Performance Data Analysis in TAU", in 3rd Workshop on Productivity and Performance (PROPER 2010), 2010. Li, Allen D. Malony: Automatic Performance Diagnosis of Parallel Computations with Compositional Models. IPDPS 2007: 1-8 K. Huck, "Model-Based Relative Performance Diagnosis of Wavefront Parallel Computations", in International Conference on High Performance Computing and Communications (HPCC2006), (Munich, Germany), 2006. A. D. Malony, "Knowledge Engineering for Automatic Parallel Performance Diagnosis," (submitted to) Concurrency and Computation: Practice and Experience, John Wiley & Sons, 2005. Performance Diagnosis of Master-Worker Parallel Computations," Euro-Par 2006 Parallel Processing Conference September 2006 (LNCS 4128). Pages 35-46. Performance Analysis and Visualization of Large-Scale Parallel Applications", Poster SC 2002 conference. 
Automatic Performance Diagnosis of Parallel Computations." Department of Computer and Information Science University of Oregon. February 2007 "Neuroanatomical Segmentation in MRI Exploiting a priori Knowledge." Department of Computer and Information Science University of Oregon. March 2007 Tool Framework for Static and Dynamic Analysis of Object-Oriented Software with Templates." Proceedings of SC2000: High Performance Networking and Computing Conference, Dallas, November 2000. Tool Framework for Static and Dynamic Analysis of Object-Oriented Software with Templates." Talk at SC2000: High Performance Networking and Computing Conference, Dallas, November 2000. [spdt98] K. Lindlan, A. Malony, J. Cuny, S. Shende, and P. Beckman, "An IL Converter and Program Database for Analysis Tools," Proceedings of ACM SIGMETRICS Symposium on Parallel and Distributed Tools (SPDT '98), August 1998, pp. 153. Production OpenSHMEM Applications. In Workshop on OpenSHMEM and Related Technologies (pp. 219-224). Springer International Publishing. Engineering FUN3D at Scale with TAU Commander", Poster, SC'16 Conference, 2016. (Eds), "OpenMP Shared Memory Parallel Programming". Proceedings of the International Workshop, IWOMP 2005 and IWOMP 2006. and C. Lamb, "Parallel Performance Measurement of Heterogeneous Parallel Systems with GPUs." Presented at International Conference on Parallel Processing Sept 2011. Experimental Approach to Performance Measurement of Heterogeneous Parallel Applications using CUDA." Presented at International Conference on Supercomputing, Tsukuba, Japan 2010. Computational Architectures for Integrated Time-Dynamic Neuroimaging," HBP Neuroinformatics conference, 2000. Computational Architectures for Integrated Time-Dynamic Neuroimaging," presentation at Hill Center, 2001. Performance Analysis in Complex Scientific Software: Experiences with the Uintah Computational Framework," presentation at FZJ, ZAM, NIC Germany, 2002. [psc94] A. Malony, B. Mohr, P. Beckman, D. 
Gannon, "Program Analysis and Tuning Tools for a Parallel Object Oriented Language: An Experiment with the TAU System," Proceedings of the Workshop on Parallel Scientific Computing, Cape Cod, MA, October 1994. [ipps94] A. Malony, B. Mohr, P. Beckman, D. Gannon, S. Yang, F. Bodin, "Performance Analysis of pC++: A Portable Data-Parallel Programming System for Scalable Parallel Computers," Proceedings of the 8th International Parallel Processing Symposium (IPPS), Cancún, Mexico, April 1994, pp. 75-85. Technology for Complex Parallel and Distributed Systems," presentation at T.U.M. Germany, 2000. Tools Interface for OpenMP," a presentation to the OpenMP Futures Committee, 2001. B. K. Totty, "An Integrated Performance Data Collection Analysis, and Visualization System," Proc. Fourth Conference on Hypercube Concurrent Computers and Applications, Mar. 1989. Also appears as Technical Report UIUCDCS-R-89-1504, Center for Supercomputing Research and Development, U. of Ill., March 1989. Measurement Intrusion and Perturbation Analysis," IEEE Transactions on Parallel and Distributed Systems, 3(4) July 1992, pp. 433-450, 1992. Processor Arrays," CSRD Report No. 734, UIUC, Jan. 1988. in the TAU Performance System," Chapter, "Performance Analysis and Grid Computing," (Eds. V. Getov, M. Gerndt, A. Hoisie, A. Malony, B. Miller), Kluwer, Norwell, MA, pp. 129-144, 2003. R. Bell, "Online Performance Observation of Large-Scale Parallel Applications," Proc. Parco 2003 Symposium, in "Parallel Computing: Software Technology, Algorithms, Architectures and Applications," (Eds. G. R. Joubert, W. E. Nagel, F. J. Peters, and W. V. Walter), Advances in Parallel Computing, Vol. 13, Elsevier B.V., pp. 761-768, 2004. Program Analysis Framework for the DOE ACTS Toolkit," presentation at NERSC ACTS booth, SC'00, 2002. R. A. Bell, "TAU Performance System: Developments and Evolution," presentation at LLNL, 2001. B. 
Mohr, "Performance Technology for Complex Parallel Systems," Tutorial at SC'01 conference, Nov. 2001. and A. Nataraj. "Evolution of a Parallel Performance System," Second International Workshop on Tools for High Performance Computing. July 2008 A. Morris, "Phase-Based Parallel Performance Profiling," (to appear) Proc. of PARCO 2005 conference. F. Wolf, "Compensation of Measurement Overhead in Parallel Performance Profiling," in International Journal of High Performance Computing Applications (IJHPCA), Vol 21, No. 2, pp. 174-194, Summer 2007. S. S. Shende, "Overhead Compensation in Performance Profiling," Proc. Europar 2004 Conference, LNCS 3149, Springer, pp. 119-132, 2004. and S. Shende, "Performance Technology for Complex Parallel and Distributed Systems," Proc. Third Austrian-Hungarian Workshop on Distributed and Parallel Systems, DAPSYS 2000, "Distributed and Parallel Systems: From Concepts to Applications," (Eds. G. Kotsis and P. Kacsuk), Kluwer, Norwell, MA, pp. 37-46, 2000. Technology for Complex Parallel and Distributed Systems," presentation at DAPSYS 2000 conference, 2000. M. Sottile, "Performance Technology for Parallel and Distributed Component Software," Concurrency and Computation: Practice and Experience, Vol. 17, Issue 2-4, pp. 117-141, John Wiley & Sons, Ltd., Feb - Apr, 2005. M. Sottile. \fIComputational Experiments using Distributed Tools in a Web-based Electronic Notebook Environment\fP, Proceedings of HPCN Europe '99, LNCS 1593, Springer, Berlin, pp. 381-390, April 1999. Framework for Parallel Performance Analysis," presentation at PTOOLS meeting, 2000. for Parallel Computing: A Performance Evaluation Perspective," in J. Blazewicz et al. (Editors), Handbook on Parallel and Distributed Processing, Springer Verlag, pp. 342-363, 2000. Bubak, Wlodzimierz Funika, Marcin Koch, Dominik Dziok, Allen D. Malony, Marcin Smetek, Roland Wismüller: Towards the Performance Visualization of Web-Service Based Applications. PPAM 2005: 108-115 J.
Sottile, "The design of a general method for constructing coupled scientific simulations," M.S. Thesis, University of Oregon, 2001. J. Sottile, Geoffrey C. Hulette, Allen D. Malony: Workflow representation and runtime based on lazy functional streams. SC-WORKS 2009 [europar99] Matthew Sottile and Allen Malony, \fIINTERLACE: An Interoperation and Linking Architecture for Computational Engines\fP, Proceedings of EuroPar 99 Conference, LNCS 1685, Springer, Berlin, pp.135-138, 1999. S. Shende, "Research Initiatives for Plug-and-play Scientific Computing", J. Physics: Conference Series Vol. 78 No. 012046, doi:10.1088/1742-6596/78/1/012046, Proc. SciDAC Conference, 2007. O. McCracken, Allan Snavely, Allen Malony, "Performance Modeling for Dynamic Algorithm Selection," Proc. International Conference on Computational Science (ICCS'03), LNCS 2660, Springer, Berlin, pp. 749-758, 2003. T. Heath, Allen D. Malony, Diane T. Rover, "The Visual Display of Parallel Performance Data," IEEE Computer, 28(11), Nov. 1995, pp. 21-28, 1995. [ieeecomp95] Michael T. Heath, Allen D. Malony, and Diane T. Rover, \fIThe Visual Display of Parallel Performance Data\fP, IEEE Computer, Vol. 28, No. 11, November 1995, pp. 21-28. [ieeepdt95] Michael T. Heath, Allen D. Malony, and Diane T. Rover, \fIParallel Performance Visualization: From Practice To Theory\fP, IEEE Parallel and Distributed Technology, Vol. 3, No. 4, Winter 1995, pp. 44-60. [etpsc92] B. Mohr, \fIStandardization of Event Traces Considered Harmful or Is an Implementation of Object-Independent Event Trace Monitoring and Analysis Systems Possible?\fP, Proceedings of the CNRS-NSF Workshop on Environments and Tools For Parallel Scientific Computing, St. Hilaire du Touvet, France, Elsevier, Advances in Parallel Computing, Vol. 6, September 1992, pp. 103-124. Portable Parallel Program Analysis Environment for pC++, Proceedings of CONPAR 94 - VAPP VI, University of Linz, Austria, LNCS 854, September 1994, pp. 29-40. 
the Integration and Use of OpenMP Performance Tools in the SPEC OMP2001 Benchmarks," Presentation at the WOMPAT 2002 conference. F. Wolf, "Design and Prototype of a Performance Tool Interface for OpenMP," Proceedings of the LACSI Symposium, 2001. and Prototype of a Performance Tool Interface for OpenMP," The Journal of Supercomputing, 23, 105-128, 2002, Kluwer Academic Publishers. F. Wolf, "Towards a Performance Tool Interface for OpenMP: An Approach Based on Directive Rewriting," Presentation at EWOMP'01 Third European Workshop on OpenMP, Sept. 2001. F. Wolf, "Towards a Performance Tool Interface for OpenMP: An Approach Based on Directive Rewriting," Proceedings of EWOMP'01 Third European Workshop on OpenMP, Sept. 2001. B. Mohr, "A Scalable Approach to MPI Application Performance Analysis," in Proc. of EuroPVM/MPI 2005, (eds. B. Di Martino) LNCS 3666, Springer, pp. 309-316, 2005. K. Huck, "Design and Implementation of a Hybrid Parallel Performance Measurement System." International Conference on Parallel Processing, September 2010, pages 492-501. Nested OpenMP Parallelism in the TAU Performance System," (to appear) International Journal of Parallel Programming, Springer, LNCS, 2007. Nested OpenMP Parallelism in the TAU Performance System," (to appear) Proceedings of the IWOMP 2006 Conference, Springer, LNCS, 2007. Performance System", presentation at BGL Workshop, Tokyo 2006 and S. Shende. "Observing Performance Dynamics using Parallel Profile Snapshots," European Conference on Parallel Processing (EuroPar 2008). August 2008 Framework for Scalable, Parallel Performance Monitoring" published in Concurrency and Computation: Practice and Experience, Special Issue from STHEC'08 Workshop.
Search of Sweet-Spots in Parallel Performance Monitoring", Presented at International Conference on Cluster Computing, Tsukuba, Japan, September 2008 (ToM) : A Framework for Scalable Parallel Performance Monitoring", Presented at STHEC'08: International Workshop on Scalable Tools for High-End Computing, held in conjunction with the International Conference on Supercomputing (ICS 2008) Experiences with KTAU on the IBM BG/L," Proc. EUROPAR 2006 Conference, Springer, LNCS 4128, pp. 99-110, 2006. Experiences with KTAU on the IBM Blue Gene/L" Europar 2006. [cluster07] A. Nataraj, A. Malony, S. Shende, A. Morris, "Integrated Parallel Performance Views." Appears in Cluster Computing published by Springer Netherlands [cluster06] A. Nataraj, A. Malony, S. Shende, A. Morris, "Kernel-Level Measurement for Integrated Parallel Performance Views: the KTAU Project," Measurement for Integrated Parallel Performance Views: the KTAU Project," In Proc. Cluster 2006, IEEE Computer Society, 2006. Ghost in the Machine: Observing the Effects of Kernel Operation on Parallel Application Performance." Supercomputing Conference 2007. Ghost in the Machine: Observing the Effects of Kernel Operation on Parallel Application Performance," Presented at SuperComputing Conference 2007. Online Parallel Performance Measurement Over a Cluster Monitor." Presented at LACSI'06 (Los Alamos Computer Science Institute Symposium). Online Parallel Performance Monitoring." Presented at EuroPar 2007. (ToS) Low-Overhead Online Parallel Performance Monitoring." Presented at Euro-Par 2007. [ibm06] A. Nataraj, "TAU: Recent Advances KTAU: Kernel-Level Measurement for Integrated Parallel Performance Views, TAUg: Runtime Global Performance Data Access Using MPI." Chaimov, Allen D. Malony, Shane Canon, Costin Iancu, Khaled Z. Ibrahim, Jay Srinivasan: Scaling Spark on HPC Systems. HPDC 2016: 97-110 Chaimov, Boyana Norris, Allen D. Malony: Toward multi-target autotuning for accelerators.
ICPADS 2014: 534-541 Chaimov, Scott Biersdorff, Allen D. Malony: Tools for machine-learning-based empirical autotuning and specialization. IJHPCA 27(4): 403-411 (2013) Dale Trebon, "Performance Measurement and Modeling of Component Applications in a High Performance Computing Environment," M.S. Thesis, University of Oregon, June 2005 and S. Shende, "Computational Quality of Service for Scientific Components," Proceedings of the International Symposium on Component-Based Software Engineering (CBSE7), Edinburgh, Scotland, LNCS 3054, Springer, pp. 264-271, May 2004. Also available as Argonne National Laboratory preprint ANL/MCS-P1131-0204. hartree-fock application using upc++ and the new darray library. In Parallel and Distributed Processing Symposium, 2016 IEEE International (pp. 453-462). IEEE. UA? CG Workflow: High Performance Molecular Dynamics of Coarse-Grained Polymers. In Parallel, Distributed, and Network-Based Processing (PDP), 2016 24th Euromicro International Conference on (pp. 272-279). IEEE. Performance Analysis of SIMD Algorithms for Monte Carlo Simulations of Nuclear Reactor Cores. In Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International (pp. 733- 742). IEEE. Load Balancing Algorithms for Block-Sparse Tensor Contractions," in Proc. International Conference on Parallel Processing (ICPP'13), IEEE, 10.1109/ICPP.2013.12, 2013. Grubel, Hartmut Kaiser, Kevin A. Huck, Jeanine Cook: Using Intrinsic Performance Counters to Assess Efficiency in Task-Based Parallel Applications. IPDPS Workshops 2016: 1692-1701 S. Kassinos, "Test-driven coarray parallelization of a legacy Fortran Application," Proc. SE-HPCCSE 2013: 1st International Workshop on Software Engineering for Performance Computing in Computational Science and Engineering, workshop at SC'13, ACM SIGHPC, pp. 33-40, 2013. Analysis and Automatic Code Generation for Improved Fortran90 and C++ Interoperability," Proceedings of LACSI Symposium, 2001. A. D. 
Malony, "Bridging the language gap in scientific computing: the Chasm approach," Concurrency and Computation: Practice and Experience, Volume 18, Issue 2 (February 2006), pp. 151-162, John Wiley & Sons, 2006. A. Malony, "Performance Measurement and Modeling of Component Applications in a High Performance Computing Environment: A Case Study," Proc. 18th International Parallel and Distributed Processing Symposium (IPDPS'04), IEEE Computer Society, 2004. A. Malony, "Performance Measurement and Modeling of Component Applications in a High Performance Computing Environment: A Case Study," Technical Report SAND2003-8631, Sandia National Laboratories, Livermore, CA, Nov. 2003. Bell, Allen D. Malony, Sameer Shende, "ParaProf: A Portable, Extensible, and Scalable Tool for Parallel Performance Profile Analysis." Presented at EuroPar August 2003 Lim, Allen D. Malony, Boyana Norris, Nicholas Chaimov: Identifying Optimization Opportunities Within Kernel Execution in GPU Codes. Euro-Par Workshops 2015: 185-196 J. J. Mortensen, "Design and performance characterization of electronic structure calculations on massively parallel supercomputers: a case study of GPAW on the Blue Gene/P architecture", Concurrency and Computation: Practice and Experience, Dec. 2013, DOI: 10.1002/cpe.3199, John Wiley and Sons. Software Framework for Simulation-based Scientific Investigations." Department of Computer and Information Science, University of Oregon. March 2010 in electrical neuroinformatics: parallel computation for studying the volume conduction of brain electrical fields in human head tissues. Concurrency and Computation: Practice and Experience. Shende, "The TAU Performance System: Advances in Performance Mapping." Shende, "Tuning and Analysis Utilities." Shende, "Building Your Own Performance Evaluation Tools", talk at Portland State University, May 13, 2000. Shende, "Generating Proxy Components using PDT."
Presented at Boulder CCA Meeting April 2004 Shende, "TAU: New Directions", presentation at Parallel Software Tools Workshop, LACSI 2000 Symposium, Aug 28-30, 2000, Santa Fe, NM. Shende, Alan Morris, "Advances in the TAU Performance System." Shende, Allen D. Malony , "Performance Optimization and Tools for HPC Architectures using TAU." Presented at ERDC October 2004 [epvmmpi05] Sameer Shende, Allen D. Malony, Alan Morris, Felix Wolf, "Performance Profiling Overhead Compensation for MPI Programs ." Presented at EuroPVM-MPI Shende, Allen D. Malony, "Integration and Application of the TAU Performance System in Parallel Java Environments." Presented at SMPAG Java Interest Group May 2002 Shende, Allen D. Malony, "Performance Technology for Complex Parallel Systems", Talk at Army Research Lab (Aberdeen Proving Ground), MD, Sept. 2002. Shende, Allen D. Malony, "Recent Advances in the TAU Performance System," Presentation at PTOOLS'02 meeting, Knoxville, TN, Sept. 2002. Shende, Allen D. Malony, "TAU: Performance Technology for Productive, High Performance Computing." Presented at LLNL 2005 Shende, Allen D. Malony, "TAU Performance System ." Presented at ACTS Workshop 2005 Shende, Allen D. Malony, Robert Bell University of Oregon , "The TAU Performance Technology for Complex Parallel Systems." Presented at NRL D.C. BYOC Workshop August 2004 Shende, Allen D. Malony, Robert Bell, "The TAU Performance Technology for Complex Parallel Systems." Presented at NASA Stennis Space Center March 2004 Shende, Nancy Collins, "Using TAU Performance Technology in ESMF." Presented at ESMF Team Meeting July 2004 Shende, Steven T. Hackstadt, and Allen D. Malony, "Dynamic Performance Callstack Sampling: Merging TAU and DAQV-II," Proceedings of the Fourth International Workshop on Applied Parallel Computing (PARA98), June 14-17, 1998, LNCS 1541, Springer, Berlin, pp. 515-520, 1998. Shende, and Allen D. 
Malony, "Performance Tools for Parallel Java Environments," slides from talk at Second Workshop on Java for High Performance Computing, ICS 2000, Santa Fe, May 2000. Suresh Shende, Allen D. Malony, Alan Morris: Improving the Scalability of Performance Evaluation Tools. PARA (2) 2010: 441-451 Sharma, Allen D. Malony, Michael W. Berry, Priyamvada Sinvhal-Sharma, "Run-time monitoring of concurrent programs on the Cedar multiprocessor," Proc. 1990 conference on Supercomputing, pp. 784-793, 1990 R. Sarukkai, Allen D. Malony, "Perturbation Analysis of High Level Instrumentation for SPMD Programs," Proc. fourth ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'93), pp. 44-53, 1993. Turovets, Pieter Poolman, Adnan Salman, Allen D. Malony, Don M. Tucker: Conductivity Analysis for High-Resolution EEG. BMEI (2) 2008: 386-393 Turovets, Vasily Volkov, Aleksei Zherdetsky, Alena Prakonina, Allen D. Malony: A 3D Finite-Difference BiCG Iterative Solver with the Fourier-Jacobi Preconditioner for the Anisotropic EIT/EEG Forward Problem. Comp. Math. Methods in Medicine 2014: 426902:1-426902:12 (2014) Mayanglambam, Allen D. Malony, Matthew J. Sottile: Performance Measurement of Applications with GPU Acceleration using CUDA. PARCO 2009: 341-348 [mmb95] K. Shanmugam, A. Malony, B. Mohr, \fISpeedy: An Integrated Performance Extrapolation Tool for pC++ Programs\fP, Proceedings of the Joint Conference PERFORMANCE TOOLS '95 and MMB '95, September 1995, Heidelberg, Germany. Runtime Monitoring Framework for the TAU Profiling System", Proceedings of the Third International Symposium on Computing in Object-Oriented Parallel Environments (ISCOPE'99), LNCS 1732, Springer, Berlin, pp. 170-181, December 1999. [linux99] S. Shende, \fIProfiling and Tracing in Linux\fP, Proceedings of the Extreme Linux Workshop #2, USENIX, Monterey CA, June 1999. [spdt96] S. Shende, J. Cuny, L. Hansen, J. Kundu, S. McLaughry and O.
Wolf, \fIEvent and State-Based Debugging in TAU: A Prototype\fP, Proceedings of ACM SIGMETRICS Symposium on Parallel and Distributed Tools (SPDT '96), May, 1996, pp. 21-30. and Measurement Strategies for Flexible and Portable Empirical Performance Evaluation," Proceedings Tools and Techniques for Performance Evaluation Workshop, PDPTA'01, CSREA, Vol. 3, pp. 1150-1156, June 2001. and Measurement Strategies for Flexible and Portable Empirical Performance Evaluation," presentation at Tools and Techniques for Performance Evaluation Workshop, PDPTA'01, C.S.R.E.A., June 2001. [spdt98] S. Shende, A. D. Malony, J. Cuny, K. Lindlan, P. Beckman and S. Karmesin, \fIPortable Profiling and Tracing for Parallel Scientific Applications using C++\fP, Proceedings of ACM SIGMETRICS Symposium on Parallel and Distributed Tools (SPDT '98), August, 1998, pp. 134-145. A. D. Malony, "Integration and Application of TAU in Parallel Java Environments," Concurrency and Computation: Practice and Experience, Volume 15 (3-5), Mar-Apr 2003, Wiley, pp. 501-519, 2003. and A. D. Malony, "Integration and Application of the TAU Performance System in Parallel Java Environments," Proceedings of the Joint ACM Java Grande - ISCOPE 2001 Conference, pp. 87-96, June 2001. and A. D. Malony, "Integration and Application of the TAU Performance System in Parallel Java Environments," presentation at the Joint ACM Java Grande - ISCOPE 2001 Conference, June 2001. P. Beckman, "Performance and Memory Evaluation Using TAU," In Proc. for Cray User's Group Conference (CUG 2006), 2006. and Memory Evaluation using TAU," Presentation at the Cray User's Group conference (CUG'06), May 2006. and Memory Evaluation using the TAU Performance System," Presented in SIAM, 2006. for Performance Discovery and Optimization," Presented at SIAM, 2006. [para06] S. Shende, A. Malony, A. 
Morris, "Optimization of Instrumentation in Parallel Performance Evaluation Tools," Performance Research Laboratory, Department of Computer and Information Science University of Oregon, Eugene, OR, USA. of Instrumentation in Parallel Performance Evaluation Tools," in Proc. PARA 2006 Conference, Springer, LNCS, 2006. de St. Germain, "Performance Evaluation of Adaptive Scientific Applications using TAU," Parallel Computational Fluid Dynamics: Theory and Applications, Proceedings of the 2005 International Conference on Parallel Computational Fluid Dynamics, May 24-27. St. Germain, "Performance Evaluation of Adaptive Scientific Applications using TAU," chapter, Parallel Computational Fluid Dynamics - Theory and Applications, (eds - A. Deane et. al.), pp. 421-428, Elsevier B.V., 2006. A. Morris, "TAU Performance System," talk at ACTS Workshop, LBL, Aug. 2006. F. Wolf, "Performance Profiling Overhead Compensation for MPI Programs," in Proc. EuroPVM/MPI 2005 Conference, (eds. B. Di. Martino et. al.), LNCS 3666, Springer, pp. 359-367, 2005. [para06] S. Shende, A. Malony, A. Morris, "Workload Characterization using the TAU Performance System," Performance Research Laboratory, Department of Computer and Information Science University of Oregon, Eugene, OR, USA. Characterization using the TAU Performance System," in Proc. of PARA 2006 Conference, Springer, LNCS, 2006. A. D. Malony, "Performance Tools for Parallel Java Environments," Proc. Second Workshop on Java for High Performance Computing, ICS 2000, May 2000. Performance Interface for Component-Based Applications," Proc. International Workshop on Performance Modeling, Evaluation, and Optimization of Parallel and Distributed Systems, IPDPS'03, IEEE Computer Society, 278, 2003. K. Schuchardt, "Characterizing I/O Performance Using the TAU Performance System." Presented at ICPP Parco 2011 conference Exascale Mini-symposium. and A. D. 
Malony, "The TAU Parallel Performance System," International Journal of High Performance Computing Applications, SAGE Publications, 20(2):287-331, Summer 2006 Role of Instrumentation and Mapping in Performance Measurement," Ph.D. Dissertation, University of Oregon, August 2001. TAU Performance System: Advances in Performance Mapping," presentation at "Tools for Performance Analysis of large Scale Applications," workshop, LACSI 2001 Symposium, Santa Fe, NM, Oct. 15-18, 2001. and Analysis Utilities," presentation at ACTS Toolkit Workshop, "Solving Problems in Science and Engineering," LBNL, NERSC, Berkeley, CA, Oct. 10-13, 2001. Lefantzi, Jaideep Ray, and Sameer Shende, "Strong Scalability Analysis and Performance Evaluation of a SAMR CCA-based Reacting Flow Code," Poster, SC2003 Conference, Nov. 2003. Approach to Creating Performance Visualizations in a Parallel Profile Analysis Tool." Presented at the Workshop on Productivity and Performance (PROPER 2011), August 2011. TAU with Eclipse: A Performance Analysis System in an Integrated Development Environment," High Performance Computing and Communications (HPCC) Conference. September 2006 (LNCS 4208). Pages 230-239. Performance Analysis Tuning Part of the Software Development Cycle." UGC 2009, San Diego, CA, June 15-18, 2009. McLaughry, "Debugging Optimized Parallel Programs," Directed Research Project (DRP) report, University of Oregon, May 1997. [hackstadt94] Steven T. Hackstadt, \fIPrototyping Advanced Parallel Program and Performance Visualizations\fP, University of Oregon, Department of Computer and Information Science, Masters Thesis, June 1994. Also available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-95-15, June 1995. [hackstadt97] Steven T.
Hackstadt, \fIDomain-Specific Metacomputing for Computational Science: Achieving Specificity Through Abstraction\fP, University of Oregon, Department of Computer and Information Science, Oral Comprehensive Exam Position Paper, September 1997. Available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-97-08, November 1997. [tr9515] Steven T. Hackstadt, \fIPrototyping Advanced Parallel Program and Performance Visualizations\fP, University of Oregon, Department of Computer and Information Science, Masters Thesis, June 1994. Also available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-95-15, June 1995. [tr9708] Steven T. Hackstadt, \fIDomain-Specific Metacomputing for Computational Science: Achieving Specificity Through Abstraction\fP, University of Oregon, Department of Computer and Information Science, Oral Comprehensive Exam Position Paper, September 1997. Available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-97-08, November 1997. [shpcc94] Steven T. Hackstadt, Allen D. Malony, and Bernd Mohr, \fIScalable Performance Visualization for Data-Parallel Programs\fP, Proc. of the Scalable High Performance Computing Conference (SHPCC), Knoxville, TN, May 1994, pp. 342-349. Also available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-94-09, March 1994. [tr9409] Steven T. Hackstadt, Allen D. Malony, and Bernd Mohr, \fIScalable Performance Visualization for Data-Parallel Programs\fP, Proc. of the Scalable High Performance Computing Conference (SHPCC), Knoxville, TN, May 1994, pp. 342-349. Also available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-94-09, March 1994. [hpdc98] Steven T. Hackstadt, Christopher W. Harrop, and Allen D. 
Malony, \fIA Framework for Interacting with Distributed Programs and Data\fP, Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing (HPDC7), Chicago, IL, July 28-31, 1998, pp. 206-214. Also available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-98-02, June 1998. [tr9802] Steven T. Hackstadt, Christopher W. Harrop, and Allen D. Malony, \fIA Framework for Interacting with Distributed Programs and Data\fP, Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing (HPDC7), Chicago, IL, July 28-31, 1998. Also available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-98-02, June 1998. [dxcomm95] Steven T. Hackstadt and Allen D. Malony, \fIVisualizing Parallel Program and Performance Data with IBM Visualization Data Explorer\fP, IBM Visualization Data Explorer Communiqué Newsletter, Vol. 3, No. 1, March 1995, pp. 6-8. [europar96] Steven T. Hackstadt and Allen D. Malony, \fIDistributed Array Query and Visualization for High Performance Fortran\fP, Proc. of Euro-Par '96, Lyon, France, August 1996, pp. 55-63. Also available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-96-02, February 1996. [ieeecga95] Steven T. Hackstadt and Allen D. Malony, \fIVisualizing Parallel Programs and Performance\fP, IEEE Computer Graphics and Applications, Vol. 15, No. 4, July 1995, pp. 12-14. [istspie95] Steven T. Hackstadt and Allen D. Malony, \fICase Study: Applying Scientific Visualization to Parallel Performance Visualization\fP, Proc. of the IST&T/SPIE symposium on Electronic Imaging: Science and Technology, Conference on Visual Data Exploration and Analysis, San Jose, CA, February 1995, pp. 238-247. [parle94] Steven T. Hackstadt and Allen D. 
Malony, \fINext-Generation Parallel Performance Visualization: A Prototyping Environment for Visualization Development\fP, Proc. of the Parallel Architectures and Languages Europe (PARLE) Conference, Athens, Greece, July 1994, pp. 192-201. Also available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-93-23, October 1993. [pgipp96] Steven T. Hackstadt and Allen D. Malony, \fIDistributed Array Query and Visualization for High Performance Fortran\fP, Peak Performance Newsletter, Portland Group, Inc., Spring 1996. [tcs98] Steven T. Hackstadt and Allen D. Malony, \fIDAQV: Distributed Array Query and Visualization Framework\fP, Journal of Theoretical Computer Science, special issue on Parallel Computing, Vol. 196, No. 1-2, April 1998, pp. 289-317. [tr9321] Steven T. Hackstadt and Allen D. Malony, \fIData Distribution Visualization (DDV) for Performance Visualization\fP, University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-93-21, October 1993. [tr9323] Steven T. Hackstadt and Allen D. Malony, \fINext-Generation Parallel Performance Visualization: A Prototyping Environment for Visualization Development\fP, Proc. of the Parallel Architectures and Languages Europe (PARLE) Conference, Athens, Greece, July 1994, pp. 192-201. Also available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-93-23, October 1993. [tr9602] Steven T. Hackstadt and Allen D. Malony, \fIDistributed Array Query and Visualization for High Performance Fortran\fP, Proc. of Euro-Par '96, Lyon, France, August 1996, pp. 55-63. Also available as University of Oregon, Department of Computer and Information Science, Technical Report CIS-TR-96-02, February 1996. A. D. Malony, "Performance Modeling of Component Assemblies," Concurrency and Computation: Practice and Experience, CPE 1076, Special issue Compframe 2005, John Wiley, 2006. A. 
Malony, "Performance Modeling of Component Assemblies with TAU," Proc. Workshop on Component Models and Frameworks in High Performance Computing (CompFrame 2005). A. Malony, "Performance Modeling of Component Assemblies with TAU," presentation at the CompFrame05 conference, Atlanta, 2005. A. Malony, "An Approximate Method for Optimizing HPC component Applications in the Presence of Multiple Component Implementations," Technical Report SAND2003-8760C, Sandia National Laboratories, Livermore, CA, December 2003. Available from [http://infoserve.sandia.gov/sand_doc/2003/038760c.pdf] S. Shende, "On Using SCALEA for Performance Analysis of Distributed and Parallel Programs," Proceedings of SC'2001 conference, Nov. 2001. S. Smith, "SMARTS: Exploiting Temporal Locality and Parallelism through Vertical Execution," Proceedings of ACM International Conference on Supercomputing (ICS '99), pp. 302-310, 1999. Volkov, Aleksei Zherdetsky, Sergei Turovets, Allen D. Malony: A 3D Vector-Additive Iterative Solver for the Anisotropic Inhomogeneous Poisson Equation in the Forward EEG problem. ICCS (1) 2009: 511-520 A. Guarna, Jr., Dennis Gannon, David Jablonowski, Allen D. Malony, Yogesh Gaur, "Faust: An Integrated Environment for Parallel Programming," IEEE Software Vol. 6, No. 4, July/August 1989, pp. 20-27, 1989. Abu-Sufah, Allen D. Malony, "Vector Processing on the Alliant FX/8 Multiprocessor," Proc. of ICPP 1986, pp. 559-566, 1986. A. Morris, "Trace-Based Parallel Performance Overhead Compensation," in Proc. of HPCC 2005 Conference, (eds. L. T. Yang, et al.), LNCS 3726, Springer, pp. 617-628, 2005. Y. Zhang, "Performance Analysis of GYRO: A Tool Evaluation," Poster, Scientific Discovery through Advanced Computing Conference, (SciDAC 2005), 2005. Y. Zhang, "Performance analysis of GYRO: a tool evaluation", Journal of Physics: Conference Series 16 (2005) pp. 551-555, SciDAC 2005, Institute of Physics Publishing Ltd., 2005. Spear, Allen D.
Malony, Alan Morris, Sameer Shende: Performance Tool Workflows. ICCS (3) 2008: 276-285 Dai, Boyana Norris, Allen D. Malony: Autoperf: Workflow Support for Performance Experiments. WOSP-C@ICPE 2015: 11-16 X. Wu, "US QCD Computational Performance Studies with PERI," J. Physics: Conference Series Vol. 78, No. 012083, doi:10.1088/1742-6596/78/1/012083, Proc. of SciDAC 2007 conference, 2007. machine learning-based profiler for self-adaptive instrumentation of scientific workflows. Procedia Computer Science, 80, 1507-1518. St. Germain, A. Morris, S. G. Parker, A. D. Malony, and S. Shende, "Integrating Performance Analysis in the Uintah Software Development Cycle," Proceedings of the ISHPC'02 conference, LNCS 2327, Springer, Berlin, pp. 190-206, 2002. Rong, Dejing Dou, Gwen A. Frishkoff, Robert M. Frank, Allen D. Malony, Don M. Tucker: A Semi-Automatic Framework for Mining ERP Patterns. AINA Workshops (1) 2007: 329-334