Accounting for Memory Bank Contention and Delay in High-Bandwidth Multiprocessors (2009)
Guy E. Blelloch, Phillip B. Gibbons, Yossi Matias, Marco Zagha
This paper considers issues of memory performance in shared memory multiprocessors that provide a high-bandwidth network and in which the memory banks are slower than the processors. We are...
A Comparison of Sorting Algorithms for the Connection Machine CM-2 (2008)
Guy E. Blelloch, Bruce M. Maggs, C. Greg Plaxton, Stephen J. Smith, ...
We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM-2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and...
A Comparison of Sorting Algorithms for the Connection Machine CM-2 (2008)
Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, Marco Zagha
We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM-2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and...
Guy E. Blelloch, Yossi Matias, Marco Zagha
For years, the computation rate of processors has been much faster than the access rate of memory banks, and this divergence in speeds has been constantly increasing in recent years. As a result,...
Implementation of a Portable Nested Data-Parallel Language (2008)
Guy E. Blelloch, Siddhartha Chatterjee, Jonathan C. Hardwick, Jay Sipelstein, Marco Zagha
This paper gives an overview of the implementation of NESL, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as...
Implementation of a Portable Nested Data-Parallel Language (2008)
Guy E. Blelloch, Siddhartha Chatterjee, Jonathan C. Hardwick, Jay Sipelstein, Marco Zagha
This paper gives an overview of the implementation of Nesl, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as well...
Document Number, Edited Christina Cary, Production Kirsten Pekarek, Leo Dagum, Wesley Jones, ...
The contents of this document may not be copied or duplicated in any form, in whole
Optimizing the NAS Parallel BT Application for the POWER CHALLENGEarray (2008)
John Brown, Marco Zagha
The POWER CHALLENGEarray is a coarse-grained collection of large processor SMP nodes. This creates interesting parallelization opportunities for scalable applications. The NAS BT benchmark is a...
Cvl: A C Vector Library Manual Version 2 (2007)
Guy E. Blelloch, Siddhartha Chatterjee, Jonathan C. Hardwick, Margaret Reid-Miller, Jay Sipelstein, ...
Cvl is a library of low-level vector routines callable from C. This library includes a wide variety of vector operations such as elementwise function applications, scans, reduces and permutations....
Instrumentation System Design, Modeling, and Evaluation (Bibliography) (2007)
Abdul Waheed, Jerry C. Yan, S. R. Sarukkai, P. Mehra, Performance Measurement, ...
/sc96/proceedings/SC96PROC/ZAGHA/ INDEX.HTM. [234] W. Zhao, J. Stankovic, and K. Ramamritham, "A Window Protocol for Time Constrained Messages," IEEE Transactions on Computers, 39(9), Sep....
Cvl: A C Vector Library Manual Version 2 (2007)
Guy E. Blelloch, Siddhartha Chatterjee, Jonathan C. Hardwick, Margaret Reid-miller, Jay Sipelstein, Marco Zagha
Cvl is a library of low-level vector routines callable from C. This library includes a wide variety of vector operations such as elementwise function applications, scans, reduces and permutations....
Implementation of a Portable Nested Data-Parallel Language (2006)
Blelloch, Guy E., Chatterjee, Siddhartha, Hardwick, Jonathan C., Sipelstein, Jay, Zagha, Marco
This paper gives an overview of the implementation of NESL, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as...
Segmented Operations for Sparse Matrix Computation on Vector Multiprocessors (1998)
Blelloch, Guy E., Heroux, Michael A., Zagha, Marco
In this paper we present a new technique for sparse matrix multiplication on vector multiprocessors based on the efficient implementation of a segmented sum operation. We describe how the segmented...
Performance Evaluation of a New Parallel Preconditioner (1998)
Gremban, Keith D., Miller, Gary L., Zagha, Marco
Solution of partial differential equations by either the finite element or the finite difference methods often requires the solution of large, sparse linear systems. When the coefficient matrices...
NESL User's Manual (for NESL Version 3.1) (1998)
Blelloch, Guy E., Sipelstein, Jay, Hardwick, Jonathan C., Zagha, Marco
This manual is a supplement to the language definition of NESL version 3.1. It describes how to use the NESL system interactively and covers features for accessing on-line help, debugging, profiling,...
"May 1998."
An Experimental Analysis of Parallel Sorting Algorithms (1998)
Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, Marco Zagha
We have developed a methodology for predicting the performance of parallel algorithms on real parallel machines. The methodology consists of two steps. First, we characterize a machine by enumerating...
Accounting for memory bank contention and delay in high-bandwidth multiprocessors (1997)
Guy E. Blelloch, Phillip B. Gibbons, Yossi Matias, Marco Zagha
For years, the computation rate of processors has been much faster than the access rate of memory banks, and this divergence in speeds has been constantly increasing in recent years. As a...
Performance Analysis Using the MIPS R10000 Performance Counters (1996)
Marco Zagha, Brond Larson, Steve Turner, Marty Itzkowitz
Tuning supercomputer application performance often requires analyzing the interaction of the application and the underlying architecture. In this paper, we describe support in the MIPS R10000 for...
Performance Evaluation of a New Parallel Preconditioner (1995)
Keith D. Gremban, Gary L. Miller, Marco Zagha
The linear systems associated with large, sparse, symmetric, positive definite matrices are often solved iteratively using the preconditioned conjugate gradient method. We have developed a new class...
Performance Evaluation of a New Parallel Preconditioner (1995)
Keith D. Gremban, Gary L. Miller, Marco Zagha
purposes, notwithstanding any copyright notation thereon. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official...
NESL User's Manual (for NESL Version 3.1) (1995)
Guy E. Blelloch, Jay Sipelstein, Jonathan C. Hardwick, Marco Zagha
This manual is a supplement to the language definition of Nesl version 3.1. It describes how to use the Nesl system interactively and covers features for accessing on-line help, debugging, profiling,...
NESL User's Manual (For NESL Version 3.1) (1995)
Guy E. Blelloch, Jay Sipelstein, Jonathan C. Hardwick, Marco Zagha
This manual is a supplement to the language definition of NESL version 3.1. It describes how to use the NESL system interactively and covers features for accessing on-line help, debugging, profiling,...
Accounting for Memory Bank Contention and Delay in High-Bandwidth Multiprocessors (1995)
Guy E. Blelloch, Phillip B. Gibbons, Yossi Matias, Marco Zagha
This paper considers issues of memory performance in shared memory multiprocessors that provide a high-bandwidth network and in which the memory banks are slower than the processors. We are concerned...
Guy E. Blelloch, Jay Sipelstein, Jonathan C. Hardwick, Marco Zagha
This manual is a supplement to the language definition of Nesl version 3.1. It describes how to use the Nesl system interactively and covers features for accessing on-line help, debugging, profiling,...
Performance Evaluation of a New Parallel Preconditioner (1995)
Keith D. Gremban, Gary L. Miller, Marco Zagha
Solution of partial differential equations by either the finite element or the finite difference methods often requires the solution of large, sparse linear systems. When the coefficient matrices...
Implementation of a Portable Nested Data-Parallel Language (1994)
Guy Blelloch, Siddhartha Chatterjee, Jonathan C. Hardwick, Jay Sipelstein, Marco Zagha
This paper gives an overview of the implementation of Nesl, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as...
New Parallel Preconditioner (1994)
Keith D. Gremban, Marco Zagha, Gary L. Miller
Solution of partial differential equations by either the finite element or the finite difference methods often requires the solution of large, sparse linear systems. When the coefficient matrices...
Implementation of a Portable Nested Data-Parallel Language (1994)
Guy E. Blelloch, Siddhartha Chatterjee, Jonathan C. Hardwick, Jay Sipelstein, Marco Zagha
This paper gives an overview of the implementation of NESL, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as...
Performance Evaluation of a New Parallel Preconditioner (1994)
Keith D. Gremban, Marco Zagha, Gary L. Miller
Solution of partial differential equations by either the finite element or the finite difference methods often requires the solution of large, sparse linear systems. When the coefficient matrices...
Implementation of a Portable Nested Data-Parallel Language (1994)
Guy E. Blelloch, Siddhartha Chatterjee, Jonathan C. Hardwick, Jay Sipelstein, Marco Zagha
This paper gives an overview of the implementation of NESL, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as...
Guy E. Blelloch, Siddhartha Chatterjee, Jonathan C. Hardwick, Margaret Reid-Miller, Jay Sipelstein, ...
Cvl is a library of low-level vector routines callable from C. This library includes a wide variety of vector operations such as elementwise function applications, scans, reduces and permutations....
Segmented Operations for Sparse Matrix Computation on Vector Multiprocessors (1993)
Guy Blelloch, Michael A. Heroux, Marco Zagha
In this paper we present a new technique for sparse matrix multiplication on vector multiprocessors based on the efficient implementation of a segmented sum operation. We describe how the segmented...
Cvl: A C Vector Library - Manual Version 2 (1993)
Guy E. Blelloch, Siddhartha Chatterjee, Jonathan C. Hardwick, Margaret Reid-miller, Jay Sipelstein, Marco Zagha
CVL is a library of low-level vector routines callable from C. This library includes a wide variety of vector operations such as elementwise function applications, scans, reduces and permutations....
Segmented Operations for Sparse Matrix Computation on Vector Multiprocessors (1993)
Guy E. Blelloch, Michael A. Heroux, Marco Zagha
In this paper we present a new technique for sparse matrix multiplication on vector multiprocessors based on the efficient implementation of a segmented sum operation. We describe how the segmented...
Implementation of a Portable Nested Data-Parallel Language (1993)
Guy E. Blelloch, Jonathan C. Hardwick, Jay Sipelstein, Marco Zagha, Siddhartha Chatterjee
This paper gives an overview of the implementation of NESL, a portable nested data-parallel language. This language and its implementation are the first to fully support nested data structures as...
Solving Linear Recurrences with Loop Raking (1992)
Guy E. Blelloch, Siddhartha Chatterjee, Marco Zagha
We present a variation of the partition method for solving mth-order linear recurrences that is well-suited to vector multiprocessors. The algorithm fully utilizes both vector and multiprocessor...
Solving Linear Recurrences with Loop Raking (1992)
Guy Blelloch, Marco Zagha, Siddhartha Chatterjee
We present a variation of the partition method for solving linear recurrences that is well-suited to vector multiprocessors. The algorithm fully utilizes both vector and multiprocessor capabilities,...
Solving Linear Recurrences with Loop Raking (1992)
Guy E. Blelloch, Marco Zagha, Siddhartha Chatterjee
We present a variation of the partition method for solving linear recurrences that is well-suited to vector multiprocessors. The algorithm fully utilizes both vector and multiprocessor capabilities,...
A Comparison of Sorting Algorithms for the Connection Machine CM-2 (1991)
Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, ...
We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM-2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and...
Radix Sort For Vector Multiprocessors (1991)
We have designed a radix sort algorithm for vector multiprocessors and have implemented the algorithm on the CRAY Y-MP. On one processor of the Y-MP, our sort is over 5 times faster on large sorting...
A Comparison of Sorting Algorithms for the Connection Machine CM-2 (1991)
Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, Marco Zagha
We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM-2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and...
An Implementation of Back-Propagation Learning on GF11, a Large SIMD Parallel Computer (1990)
Current connectionist simulations require huge computational resources. We describe a neural network simulator for the IBM GF11, an experimental SIMD machine with 566 processors and a peak arithmetic...
Scan Primitives for Vector Computers (1990)
Siddhartha Chatterjee, Guy E. Blelloch, Marco Zagha
This paper describes an optimized implementation of a set of scan (also called all-prefix-sums) primitives on a single processor of a CRAY Y-MP, and demonstrates that their use leads to greatly...
Reference Manual (Version 1.1) (1990)
Guy Blelloch, Siddhartha Chatterjee, Fritz Knabe, Jay Sipelstein, Marco Zagha
This report introduces VCODE, an intermediate language for data-parallel computations. VCODE is designed to allow easy porting of data-parallel languages, such as C*, PARALATION LISP, and Fortran 8x,...