Abstract The Implementation of the Cilk-5 Multithreaded Language (2009)
Matteo Frigo, Charles E. Leiserson, Keith H. Randall
The fth release of the multithreaded language Cilk uses a provably good \work-stealing " scheduling algorithm similar to the rst system, but the language has been completely redesigned and...
Unbounded Transactional Memory (2009)
C. Scott, Ananian Krste, Asanović Bradley, C. Kuszmaul, Charles E. Leiserson, Sean Lie
Hardware transactional memory should support unbounded transactions: transactions of arbitrary size and duration. We describe a hardware implementation of unbounded transactional memory, called UTM,...
Adaptive Work Stealing with Parallelism Feedback (2008)
Kunal Agrawal, Yuxiong He, Charles E. Leiserson
Abstract We present an adaptive work-stealing thread scheduler, A-STEAL, for fork-join multithreaded jobs, like those written using the Cilk multithreaded language or the Hood work-stealinglibrary....
Abstract Adaptive Work Stealing with Parallelism Feedback (2008)
Kunal Agrawal, Yuxiong He, Charles E. Leiserson
We present an adaptive work-stealing thread scheduler, A-STEAL, for fork-join multithreaded jobs, like those written using the Cilk multithreaded language or the Hood work-stealing library. The...
Unbounded Transactional Memory (2008)
C. Scott, Ananian Krste, Asanović Bradley, C. Kuszmaul, Charles E. Leiserson, Sean Lie
Hardware transactional memory should support unbounded transactions: transactions of arbitrary size and duration. We describe a hardware implementation of unbounded transactional memory, called UTM,...
Abstract A Comparison of Sorting Algorithms for the Connection Machine CM-2 (2008)
Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, Marco Zagha
We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM-2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and...
Abstract Cilk: An Efficient Multithreaded Runtime System (2008)
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou
Cilk (pronounced “silk”) is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and...
Amund Tveit, Thomas H. Cormen, Charles E. Leiserson
Abstract. A lower bound of Omega(n 2 log(n)) is proved for the time complexity of calculating the inverse of a matrix nxn, over the real or complex numbers in the sequential computation case.
[David 00] David Solomon, Data Compression: The Complete Reference, Springer, 2000. (2008)
Rivest Thomas, H. Cormen, Charles E. Leiserson, Ronald L
[Abbott and Ahuja 90] A. L. Abbott and N. Ahuja, “Active surface reconstruction by integrating focus, vergence, stereo, and camera calibration”, Proceedings: Third
Thomas H. Corman, Charles E. Leiserson, Ronald L. Rivest, Introduction To
asymptotic notation with several parameters – conditional asymptotic notation – amortized analysis – NPcompleteness – NP-hard – recurrence equations – solving recurrence equations –...
Processing Letters, 18(3):147–150, 1984. (2008)
Ab Sanjeev Arora, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, Introduction Algorithms, ...
modern approach. Initial draft kindly placed by the authors on the web, 2006. [AGP94] W. R. Alford, A. Granville, and C. Pomerance. There are infinitely many Carmichael numbers. Annals of Mathematics,
[3] Jørgen Bang-Jensen and Gregory Gutin. Digraphs: Theory, Algorithms and Applications. (2008)
S. Ferenczi, C. Mauduit, A. Siegel, Elwyn R. Berlekamp, Hohn H. Conway, ...
[5] József Beck. On positional games. Journal of Combinatorial Theory, Series A, 30:117–
Abstract Adaptive Work Stealing with Parallelism Feedback (2008)
Kunal Agrawal, Yuxiong He, Charles E. Leiserson
We present an adaptive work-stealing thread scheduler, A-STEAL, for fork-join multithreaded jobs, like those written using the Cilk multithreaded language or the Hood work-stealing library. The...
Provably Efficient Adaptive Scheduling For Parallel Jobs (2008)
Yuxiong He, Wen Jing Hsu, Charles E. Leiserson
Abstract — Scheduling competing jobs on multiprocessors has always been an important issue for parallel and distributed systems. The challenge is to ensure global, systemwide efficiency while...
Abstract The Implementation of the Cilk-5 Multithreaded Language (2008)
Matteo Frigo, Charles E. Leiserson, Keith H. Randall
The fth release of the multithreaded language Cilk uses a provably good \work-stealing " scheduling algorithm similar to the rst system, but the language has been completely redesigned and...
Adversarial Analyses of Window Backoff Strategies for Simple Multiple-Access Channels (2008)
Michael A. Bender, Martin Farach-colton, Simai He, Bradley C. Kuszmaul, Charles E. Leiserson
Abstract — Backoff strategies have typically been analyzed by making statistical assumptions on the distribution of problem inputs. Although these analyses have provided valuable insights into the...
Unbounded Transactional Memory (2008)
C. Scott, Ananian Krste, Asanović Bradley, C. Kuszmaul, Charles E. Leiserson, Sean Lie
Hardware transactional memory should support unbounded transactions: transactions of arbitrary size and duration. We describe a hardware implementation of unbounded transactional memory, called UTM,...
Unbounded Transactional Memory (2008)
C. Scott, Ananian Krste, Asanović Bradley, C. Kuszmaul, Charles E. Leiserson, Sean Lie
Hardware transactional memory should support unbounded transactions: transactions of arbitrary size and duration. We describe a hardware implementation of unbounded transactional memory, called UTM,...
Brian Borchers, Jan Korst, Simulated Annealing, Thomas L. Magnanti, As Noga Alon, ...
I've compiled this list of books on combinatorial optimization and integer programming as a source of additional reading and project ideas for students in my graduate course in combinatorial...
C. Scott Ananian, Krste Asanović, Bradley C. Kuszmaul, Charles E. Leiserson
Researchers have proposed using transactional memory as a flexible method by which programs can read and modify disparate primary memory locations atomically in a single operation, much as a database...
Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, Charles E. Leiserson
A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...
Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, Charles E. Leiserson
A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...
An Experimental Comparison of Edge-Triggering and Level-Clocking (2007)
Charles E. Leiserson, A. Gould, Keith H. Randall, Keith H. Randall
Circuits implemented with two-phase level-clocked latches have the theoretical potential to operate faster and require less state than equivalent circuits implemented with edge-triggered latches. We...
Optimizing Two-Phase, Level-Clocked Circuitry (2007)
Extend Ed, Extended Abstract, Alexander T. Ishii, Charles E. Leiserson, Marios C. Papaefthymiou, O(v E V
) Alexander T. Ishii NEC C&C Research Laboratories Princeton, New Jersey 08540 Charles E. Leiserson Marios C. Papaefthymiou MIT Laboratory for Computer Science Cambridge, Massachusetts 02139...
Certified by......................................................... (2007)
Charles E. Leiserson, R. Morgenthaler
This thesis presents the theory, design, and implementation of Cilk (pronounced "silk") and Cilk-NOW. Cilk is a C-based language and portable runtime system for programming and...
Nathaniel A. Kushman, Volker Strumpen, Charles E. Leiserson, Arthur C. Smith
requirements for the degrees of
Certi ed by........................................................................ (2007)
Charles E. Leiserson, Arthur C. Smith, Guang-ien Cheng, Guang-ien Cheng
in partial ful llment oftherequirements for the degree of
Certified by fGE] ' '/-t------- (2007)
Marios Christos Papaefthymiou, Marios Christos Papaefthymiou, Charles E. Leiserson
In this paper we investigate properties of retirning, a circuit transformation which preserves the behavior of the circuit as a whole. We present an algorithm which transforms a given combinational...
Matteo Frigo, Charles E. Leiserson, Arthur C. Smith
notice and this permission notice are preserved on all copies.
and Creating Locality in Numerical Algorithms (2007)
Charles E. Leiserson, Sivan Avraham Toledo, Sivan Avraham Toledo
How do you determine the running time of a program without actually running it? How do you design an efficient out-of-core iterative algorithm? These are the two questions answered in this thesis....
Charles E. Leiserson, R. Morgenthaler, Christopher F. Joerg, Christopher F. Joerg
Although cost-effective parallel machines are now commercially available, the widespread use of parallel processing is still being held back, due mainly to the troublesome nature of parallel...
Certified by......................................................... (2007)
Charles E. Leiserson, R. Morgenthaler, Robert D. Blumofe
This thesis presents the theory, design, and implementation of Cilk (pronounced "silk") and Cilk-NOW. Cilk is a C-based language and portable runtime system for programming and...
Charles E. Leiserson, Satish Rao, Sivan Toledo
When a numerical computation fails to fit in the primary memory of a serial or parallel computer, a so-called "out-of-core " algorithm must be used which moves data between primary...
Provably Efficient Adaptive Scheduling for Parallel Jobs (2007)
He, Yuxiong, Hsu, Wen Jing, Leiserson, Charles E.
Scheduling competing jobs on multiprocessors has always been an important issue for parallel and distributed systems. The challenge is to ensure global, system-wide efficiency while offering a level...
Provably Efficient Adaptive Scheduling for Parallel Jobs (2007)
He, Yuxiong, Hsu, Wen Jing, Leiserson, Charles E.
Scheduling competing jobs on multiprocessors has always been an important issue for parallel and distributed systems. The challenge is to ensure global, system-wide efficiency while offering a level...
An empirical evaluation of work stealing with parallelism feedback (2006)
Kunal Agrawal, Yuxiong He, Charles E. Leiserson
A-STEAL is a provably good adaptive work-stealing thread scheduler that provides parallelism feedback to a multiprocessor job scheduler. A-STEAL uses a simple multiplicative-increase,...
Provably efficient two-level adaptive scheduling (2006)
Yuxiong He, Wen-jing Hsu, Charles E. Leiserson
Abstract. Multiprocessor scheduling in a shared multiprogramming environment can be structured in two levels, where a kernel-level job scheduler allots processors to jobs and a user-level thread...
Memory models for open-nested transactions (2006)
Kunal Agrawal, Charles E. Leiserson, Jim Sukha
Open nesting provides a loophole in the strict model of atomic transactions. Moss and Hosking suggested adapting open nesting for transactional memory, and Moss and a group at Stanford have proposed...
Memory models for open-nested transactions (2006)
Kunal Agrawal, Charles E. Leiserson, Jim Sukha
Open nesting provides a loophole in the strict model of atomic transactions. Moss and Hosking suggested adapting open nesting for transactional memory, and Moss and a group at Stanford have proposed...
An empirical evaluation of work stealing with parallelism feedback (2006)
Kunal Agrawal, Yuxiong He, Charles E. Leiserson
Abstract A-STEAL is a provably good adaptive work-stealingthread scheduler that provides parallelism feedback to a multiprocessor job scheduler. A-STEAL uses a simplemultiplicative-increase,...
Memory models for open-nested transactions (2006)
Kunal Agrawal, Charles E. Leiserson, Jim Sukha
Open nesting provides a loophole in the strict model of atomic transactions. Moss and Hosking suggested adapting open nesting for transactional memory, and Moss and a group at Stanford have proposed...
Provably efficient two-level adaptive scheduling (2006)
Yuxiong He, Wen-jing Hsu, Charles E. Leiserson
Abstract. Multiprocessor scheduling in a shared multiprogramming environment can be structured in two levels, where a kernel-level job scheduler allots processors to jobs and a user-level thread...
External-memory search trees with fast insertions (2006)
Jelani Nelson, Charles E. Leiserson, Bradley C. Kuszmaul, Jelani Nelson
This thesis provides both experimental and theoretical contributions regarding externalmemory dynamic search trees with fast insertions. The first contribution is the implementation of the buffered...
An empirical evaluation of work stealing with parallelism feedback (2006)
Kunal Agrawal, Yuxiong He, Charles E. Leiserson
Abstract A-STEAL is a provably good adaptive work-stealingthread scheduler that provides parallelism feedback to a multiprocessor job scheduler. A-STEAL uses a simplemultiplicative-increase,...
An empirical evaluation of work stealing with parallelism feedback (2006)
Kunal Agrawal, Yuxiong He, Charles E. Leiserson
A-STEAL is a provably good adaptive work-stealing thread scheduler that provides parallelism feedback to a multiprocessor job scheduler. A-STEAL uses a simple multiplicative-increase,...
Programming with Exceptions in JCilk (2005)
Danaher, John S., Lee, I-Ting Angelina, Leiserson, Charles E.
JCilk extends the Java language to provide call-return semantics for multithreading, much as Cilk does for C. Java's built-in thread model does not support the passing of exceptions or return values...
Programming with Exceptions in JCilk (2005)
Danaher, John S., Lee, I-Ting Angelina, Leiserson, Charles E.
JCilk extends the Java language to provide call-return semantics for multithreading, much as Cilk does for C. Java's built-in thread model does not support the passing of exceptions or return values...
The JCilk Language for Multithreaded Computing (2005)
Danaher, John, Lee, I-Ting Angelina, Leiserson, Charles E.
JCilk extends the Java language to provide call-return semantics for multithreading, much as Cilk does for C. Java's built-in thread model does not support the passing of exceptions or return values...
Provably good race detection that runs in parallel (2005)
Jeremy T. Fineman, Charles E. Leiserson, Arthur C. Smith, Jeremy T. Fineman
Certified by..................... (2005)
Jim Sukha, Charles E. Leiserson, Bradley C. Kuszmaul, Arthur C. Smith
Memory-mapped transactions combine the advantages of both memory mapping and transactions to provide a programming interface for concurrently accessing data on disk without explicit I/O or locking...
Unbounded Transactional Memory (2005)
C. Scott Ananian, Krste Asanović, Bradley C. Kuszmaul, Charles E. Leiserson, Sean Lie
Background: Programming in a shared-memory environment often requires the use of atomic regions for program correctness. Traditionally, atomicity is achieved through critical sections protected by...
C. Scott Ananian, Krste Asanović, Bradley C. Kuszmaul, Charles E. Leiserson, Sean Lie, M. A. Bender, ...
Unbounded transactional memory. In Proceedings of the 11th International Symposium on High-
Certified by.......................................................... (2005)
John Danaher, Charles E. Leiserson, Arthur C. Smith
JCilk extends the Java language to provide call-return semantics for multithreading, much as Cilk does for C. Java’s built-in thread model does not support the passing of exceptions or return...
Certified by.......................................................... (2005)
JCilk is a Java-based multithreaded programming language which extends Java to provide a dynamic threading model. Specifically, JCilk imports Cilk’s fork-join primitives spawn and sync into Java to...
Provably good race detection that runs in parallel (2005)
Jeremy T. Fineman, Charles E. Leiserson, Arthur C. Smith, Jeremy T. Fineman
Certified by.......................................................... (2005)
2 The JCilk Multithreaded Language
Thesis Supervisor Certified by.......................................................... (2005)
Jim Sukha, Charles E. Leiserson, Bradley C. Kuszmaul, Arthur C. Smith
Memory-mapped transactions combine the advantages of both memory mapping and transactions to provide a programming interface for concurrently accessing data on disk without explicit I/O or locking...
Certified by.......................................................... (2005)
Charles E. Leiserson, John Danaher
2
On-the-Fly Maintenance of Series-Parallel Relationships in Fork-Join Multithreaded Programs (2004)
Bender, Michael A., Fineman, Jeremy T., Gilbert, Seth, Leiserson, Charles E.
A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...
On-the-Fly Maintenance of Series-Parallel Relationships in Fork-Join Multithreaded Programs (2004)
Bender, Michael A., Fineman, Jeremy T., Gilbert, Seth, Leiserson, Charles E.
A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...
Certified by.;..............;.,. v....... Accepted by.. (2004)
Siddhartha Sen, Charles E. Leiserson
This thesis addresses the problem of scheduling multiple, concurrent, adaptively parallel jobs on a multiprogrammed shared-memory multiprocessor. Adaptively parallel jobs are jobs for which the...
Jeff M. Phillips, Thomas H. Cormen, Charles E. Leiserson
Union-Find is a simple and incredibly useful algorithm for finding connected sets in a graph, G = (V, E) that is often taught in an advanced algorithms class. Where n = |V | and m = |E|, Union-Find...
Dynamic processor allocation for adaptively parallel jobs (2004)
Siddhartha Sen, Charles E. Leiserson, Siddhartha Sen
This thesis addresses the problem of scheduling multiple, concurrent, adaptively parallel jobs on a multiprogrammed shared-memory multiprocessor. Adaptively parallel jobs are jobs for which the...
On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs (2004)
Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, Charles E. Leiserson
A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...
Siddhartha Sen, Charles E. Leiserson, Siddhartha Sen
This thesis addresses the problem of scheduling multiple, concurrent, adaptively parallel jobs on a multiprogrammed shared-memory multiprocessor. Adaptively parallel jobs are jobs for which the...
On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs (2004)
Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, Charles E. Leiserson
A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...
Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, Charles E. Leiserson
A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...
Adversarial Analyses of Window Backoff Strategies for Simple Multiple-Access Channels (2003)
Bender, Michael A., Farach-Colton, Martin, He, Simai, Kuszmaul, Bradley C., Leiserson, Charles E.
Backoff strategies have typically been analyzed by making statistical assumptions on the distribution of problem inputs. Although these analyses have provided valuable insights into the efficacy of...
Hardware Transactional Memory (2003)
Lie, Sean, Asanovic, Krste, Kuszmaul, Bradley C., Leiserson, Charles E.
This work shows how hardware transactional memory (HTM) can be implemented to support transactions of arbitrarily large size, while ensuring that small transactions run efficiently. Our...
A Media Player for Use in Distance Education (2003)
Huang, Kai, Leiserson, Charles E., Sarmenta, Luis F.G.
We have developed a media player for use in distance education. The player can incorporate several time-indexed sources, including video, audio, PowerPoint, and text index. We have converted all the...
A Media Player for Use in Distance Education (2003)
Huang, Kai, Leiserson, Charles E., Sarmenta, Luis F.G.
We have developed a media player for use in distance education. The player can incorporate several time-indexed sources, including video, audio, PowerPoint, and text index. We have converted all the...
Hardware Transactional Memory (2003)
Lie, Sean, Asanovic, Krste, Kuszmaul, Bradley C., Leiserson, Charles E.
This work shows how hardware transactional memory (HTM) can be implemented to support transactions of arbitrarily large size, while ensuring that small transactions run efficiently. Our...
Adversarial Analyses of Window Backoff Strategies for Simple Multiple-Access Channels (2003)
Bender, Michael A., Farach-Colton, Martin, He, Simai, Kuszmaul, Bradley C., Leiserson, Charles E.
Backoff strategies have typically been analyzed by making statistical assumptions on the distribution of problem inputs. Although these analyses have provided valuable insights into the efficacy of...
Transactions Everywhere (2003)
Kuszmaul, Bradley C., Leiserson, Charles E.
Arguably, one of the biggest deterrants for software developers who might otherwise choose to write parallel code is that parallelism makes their lives more complicated. Perhaps the most basic...
Transactions Everywhere (2003)
Kuszmaul, Bradley C., Leiserson, Charles E.
Arguably, one of the biggest deterrants for software developers who might otherwise choose to write parallel code is that parallelism makes their lives more complicated. Perhaps the most basic...
Transactions everywhere (2003)
Bradley C. Kuszmaul, Charles E. Leiserson, Sma Fellow
Abstract — Arguably, one of the biggest deterrants for software developers who might otherwise choose to write parallel code is that parallelism makes their lives more complicated. Perhaps the most...
Systolic Arrays for (VLSI). (2002)
Kung,H. T., Leiserson,Charles E.
A systolic system is a network of processors which rhythmically compute and pass data through the system. Physiologists use the work 'systole' to refer to the rhythmically recurrent contraction of...
A Layout for the Shuffle-Exchange Network. (2002)
Hoey,Dan, Leiserson,Charles E.
This paper describes a technique for producing a VLSI layout of the shuffle-exchange graph. It is based on the layout procedure which lays out a graph by bisecting the graph, recursively laying out...
Optimal Placement for River Routing. (2002)
Leiserson,Charles E., Pinter,Roy Y.
Programs for integrated circuit layout typically have two phases: placement and routing. The router should produce as efficient a layout as possible, but of course the quality of the routing depends...
Wafer-Scale Integration of Systolic Arrays, (2002)
Leighton,Frank Thomson, Leiserson,Charles E.
VLSI technologists are fast developing wafer-scale integration. Rather than partitioning a silicon wafer into chips as is usually done, the idea behind wafer-scale integration is to assemble an...
A Coherent VLSI Design Environment. (2002)
Glasser,Lance A., Leiserson,Charles E., Rivest,Ronald L.
This report covers the period from April 1, 1985 through September 30, 1985. The research discussed here is described in more detail in several published and unpublished reports cited. The CAD frame...
Randomized Routing on Fat-Trees, (2002)
Greenberg, Ronald I., Leiserson, Charles E.
Fat-trees are a class of routing networks for hardware-efficient parallel computation. This paper presents a randomized algorithm for routing messages on a fat-tree. The quality of the algorithm is...
Randomized Routing on Fat Trees. (2002)
Greenberg,Ronald I., Leiserson,Charles E.
Fat-trees are a class of routing networks for hardware-efficient parallel computation. This paper presents a randomized algorithm for routing messages on a fat-tree. The quality of the algorithm is...
Retiming Synchronous Circuitry. (2002)
Leiserson,Charles E., Saxe,James B.
This paper shows how the technique of retiming can be used to transform a given synchronous circuit into a more efficient circuit under a variety of different cost criteria. We model a circuit as a...
A Hyperconcentrator Switch for Routing Bit-Serial Messages (Extended Abstract), (2002)
Cormen,Thomas H., Leiserson,Charles E.
In highly parallel message routing networks, it is sometimes desirable to concentrate relatively few messages on many wires onto fewer wires. We have designed a VLSI chip for this purpose which is...
Communication-Efficient Parallel Graph Algorithms. (2002)
Leiserson,Charles E., Maggs,Bruce M.
Communication bandwidth is a resource ignored by most parallel random-access machine (PRAM) models. This paper shows that many graph problems can be solved in parallel, not only with polylogarithmic...
A Hyperconcentrator Switch for Routing Bit-Serial Messages. (2002)
Cormen,Thomas H., Leiserson,Charles E.
In highly parallel message routing networks, it is sometimes desirable to concentrate relatively few messages on many wires onto fewer wires. We have designed a VLSI chip for this purpose which is...
A Coherent VLSI Design Environment. (2002)
Dally, William J., Glasser, Lance A., Leiserson, Charles E.
The CAD frame Schema has been outfitted with a uniform protocol or aggregate data structures, which includes new control structures for examining and searching through the data structures. Circuit...
Introduction to Algorithms, second edition (2001)
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein
Using Cilk to Write Multiprocessor Chess Programs (2001)
Don Dailey, Charles E. Leiserson
This paper overviews the Cilk language, illustrating how Cilk supports the programming of parallel game-tree search and other chess mechanisms
Textbooks: 1. Introduction to Automata Theory, Languages, and Computation (2000)
Prepared Lesson Plans, John E. Hopcroft, H. Cormen, Charles E. Leiserson, Ronald L. Rivest
Portable High-Performance Programs (1999)
Matteo Frigo, Charles E. Leiserson
right notice and this permission notice are preserved on all copies.
Cache-oblivious algorithms (Extended Abstract) (1999)
Matteo Frigo, Charles E. Leiserson, Harald Prokop, Sridhar Ramachandran
This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these...
Cache-Oblivious Algorithms (Extended Abstract) (1999)
Matteo Frigo, Charles E. Leiserson, Harald Prokop, Sridhar Ramachandran, Z W(l
This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these...
Cache-Oblivious Algorithms (Extended Abstract) (1999)
Matteo Frigo, Charles E. Leiserson, Harald Prokop, Sridhar Ramachandran, Z W(l
SUBMITTED FOR PUBLICATION. Matteo Frigo Charles E. Leiserson Harald Prokop Sridhar Ramachandran MIT Laboratory for Computer Science, 545 Technology Square, Cambridge, MA 02139...
Cache-oblivious algorithms (extended abstract (1999)
Matteo Frigo, Charles E. Leiserson, Harald Prokop, Sridhar Ramachandran
Abstract This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms,...
Cache-oblivious algorithms (extended abstract (1999)
Matteo Frigo, Charles E. Leiserson, Harald Prokop, Sridhar Ramachandran
Abstract This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms,...
Optimixing Synchronous Systems. (1998)
Leiserson,Charles E., Saxe,John B.
The complexity of integrated-circuit chips produced today makes it feasible to build inexpensive, special-purpose subsystems that rapidly solve sophisticated problems on behalf of a general-purpose...
Wafer-Scale Integration of Systolic Arrays, (1998)
Leighton,Frank Thomson, Leiserson,Charles E.
VLSI technologies are fast developing wafer-scale integration. Rather than partitioning a silicon wafer into chips as is usually done, the idea behind wafer-scale integration is to assemble an entire...
A Space-Efficient Algorithm for Finding the Connected Components of Rectangles in the Plane. (1998)
Leiserson,Charles E., Phillips,Cynthia A.
An algorithm is presented for determining the connectivity of a set of N rectangles in the plane, a problem central to avoiding aliasing in VLSI design rule checkers. Previous algorithms for this...
Retiming Synchronous Circuitry. (1998)
Leiserson, Charles E., Saxe, James B.
This paper describes a circuit transformation called retiming in which registers are added at some points in a circuit and removed from others in such a way that he functional behavior of the circuit...
The Organization of Permutation Architectures with Bussed Interconnections. (1998)
Kilian, Joe, Kipnis, Shlomo, Leiserson, Charles E.
This paper explores the problem of efficiently permuting data stored in VLSI chips in accordance with a predetermined set of permutations. By connecting chips with shared bus interconnections, as...
The Organization of Permutation Architectures with Bussed Interconnections. (1998)
Kilian, Joe, Kipnis, Sholomo, Leiserson, Charles E.
This paper explores the problem of efficiently permuting data stored in VLSI chips in accordance with a predetermined set of permutations. By connecting chips with shared bus interconnections, as...
VLSI Theory and Parallel Supercomputing. (1998)
Since its inception, VLSI theory has expanded in many fruitful and interesting directions. One major branch is layout theory which studies the efficiency with which graphs can be embedded in the...
Partial contents: Micron-Scale Display Technology; Virtual Memory for Data-Parallel Computing; The Optimal Synthesis of VLSI Array Architectures from Algorithmic Descriptions; Clay-1: A Distributed...
A Timing Analysis of Level-Clocked Circuitry, (1998)
Ishii, Alexander T., Leiserson, Charles E.
This paper presents an algorithm for verifying proper timing in VLSI circuits where latches are controlled by the levels (high or low) of the controlling clocks rather than the transitions (edges) of...
Debugging multithreaded programs that incorporate userlevel locks (1998)
Charles E. Leiserson, Arthur C. Smith, Andrew F. Stark, Andrew F. Stark
requirements for the degrees of
An Experimental Analysis of Parallel Sorting Algorithms (1998)
Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, Marco Zagha
We have developed a methodology for predicting the performance of parallel algorithms on real parallel machines. The methodology consists of two steps. First, we characterize a machine by enumerating...
Programming parallel applications in cilk (1998)
Charles E. Leiserson, Aske Plaat
Cilk (pronounced "silk") is a C-based language for multithreaded parallel programming. Cilk makes it easy to program irregular parallel applications, especially as compared with...
Using de Bruijn Sequences to Index a 1 in a Computer Word (1998)
Charles E. Leiserson, Harald Prokop, Keith H. Randall
Some computers provide an instruction to find the index of a 1 in a computer word, but many do not. This paper provides a fast and novel algorithm based on de Bruijn sequences to solve this problem....
A Minicourse on Multithreaded Programming (1998)
Charles E. Leiserson, Harald Prokop
These notes contain two lectures that teach multithreaded algorithms using a Cilklike [7, 9, 11] model. These lectures were designed for the latter part of the MIT undergraduate class 6.046...
Space-Efficient Scheduling of Multithreaded Computations (1998)
Robert D. Blumofe, Charles E. Leiserson
.<F3.854e+05> This paper considers the problem of scheduling dynamic parallel computations to achieve linear speedup without using significantly more space per processor than that required for...
Detecting Data Races in Cilk Programs that Use Locks (1998)
Guang-ien Cheng, Mingdong Feng, Charles E. Leiserson, Keith H. Randall, Andrew F. Stark
When two parallel threads holding no locks in common access the same memory location and at least one of the threads modifies the location, a "data race" occurs, which is usually a bug....
Cilk: Efficient Multithreaded Computing (1998)
Keith H. Randall, Charles E. Leiserson, H. Randall
This thesis describes Cilk, a parallel multithreaded language for programming contemporary shared memory multiprocessors (SMP's). Cilk is a simple extension of C which provides constructs for...
The Implementation of the Cilk-5 Multithreaded Language (1998)
Matteo Frigo, Charles E. Leiserson, Keith H. Randall
The fifth release of the multithreaded language Cilk uses a provably good "work-stealing" scheduling algorithm similar to the first system, but the language has been completely redesigned...
Optimizing Two-Phase, Level-Clocked Circuitry (1997)
Alexander T. Ishii, Charles E. Leiserson, Marios C. Papaefthymiou
We investigate two strategies for reducing the clock period of a two-phase, level-clocked circuit: clock tuning, which adjusts the waveforms that clock the circuit, and retiming, which relocates...
Efficient Detection of Determinacy Races in Cilk Programs (1997)
Mingdong Feng, Charles E. Leiserson
A parallel multithreaded program that is ostensibly deterministic may nevertheless behave nondeterministically due to bugs in the code. These bugs are called determinacy races, and they result when...
Efficient Detection of Determinacy Races in Cilk Programs (1997)
Mingdong Feng, Charles E. Leiserson
A parallel multithreaded program that is ostensibly deterministic may nevertheless behave nondeterministically due to bugs in the code. These bugs are called determinacy races, and they result when...
Dag-Consistent Distributed Shared Memory (1996)
Robert D. Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, Keith H. Randall
We introduce dag consistency, a relaxed consistency model for distributed shared memory which is suitable for multithreaded programming. We have implemented dag consistency in software for the Cilk...
An Analysis of Dag-Consistent Distributed Shared-Memory Algorithms (1996)
Robert Blumofe, Matteo Frigo, Christopher F Joerg, Charles E Leiserson, Keith H Randall
In this paper, we analyze the performance of parallel multithreaded algorithms that use dag-consistent distributed shared memory. Specifically, we analyze execution time, page faults, and space...
An Analysis of Dag-Consistent Distributed Shared-Memory Algorithms (1996)
Robert D. Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, Keith H. Randall
In this paper, we analyze the performance of parallel multithreaded algorithms that use dag-consistent distributed shared memory. Specifically, we analyze execution time, page faults, and space...
Dag-Consistent Distributed Shared Memory (1996)
Robert Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, Keith H. Randall
We introduce dag consistency, a relaxed consistency model for distributed shared memory which is suitable for multithreaded programming. We have implemented dag consistency in software for the Cilk...
Randomized Routing on Fat-Trees (1996)
Ronald I. Greenberg, Charles E. Leiserson
Fat-trees are a class of routing networks for hardware-efficient parallel computation. This paper presents a randomized algorithm for routing messages on a fat-tree. The quality of the algorithm is...
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou
Cilk (pronounced “silk”) is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and...
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou
Cilk (pronounced “silk”) is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and...
Dag-consistent distributed shared memory (1996)
Robert D. Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, Keith H. Randall
We introduce dag consistency, a relaxed consistency model for distributed shared memory which is suitable for multithreaded programming. We have implemented dag consistency in software for the Cilk...
Executing Multithreaded Programs Efficiently (1995)
Robert D. Blumofe, Charles E. Leiserson, Robert D. Blumofe
right to do so. by::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Cilk: An efficient multithreaded runtime system (1995)
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou
Cilk (pronounced “silk”) is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and...
Cilk: An efficient multithreaded runtime system (1995)
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou
Cilk (pronounced "silk") is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both...
Executing Multithreaded Programs Efficiently (1995)
Charles E. Leiserson, Frederic R. Morgenthaler, Robert D. Blumofe, Robert D. Blumofe
Cilk: An Efficient Multithreaded Runtime System (1995)
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou
Cilk (pronounced "silk") is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically...
Heterogeneous Multithreaded Computing (1995)
Charles E. Leiserson, F. R. Morgenthaler, Howard J. Lu, Howard J. Lu
This thesis studies issues introduced by supporting multithreaded computing in environments composed of workers heterogeneous with respect to data representation, speed, and scale. We implement a...
Cilk 1.2 (Version Beta 1) Reference Manual (1995)
Robert D. Blumofe, Matteo Frigo, Michael Halbherr, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, ...
This document describes Cilk
Cilk: An Efficient Multithreaded Runtime System (1995)
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou
Cilk (pronounced "silk") is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically...
Parallel Algorithms for the Circuit Value Update Problem (1995)
Charles E. Leiserson, Keith H. Randall
The circuit value update problem is the problem of updating values in a representation of a combinational circuit when some of the inputs are changed. We assume for simplicity that each combinational...
Cilk: An efficient multithreaded runtime system (1995)
Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou
Cilk (pronounced “silk”) is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and...
Heterogeneous multithreaded computing (1995)
Howard J. Lu, Charles E. Leiserson, F. R. Morgenthaler, Howard J. Lu
This thesis studies issues introduced by supporting multithreaded computing in environments composed of workers heterogeneous with respect to data representation, speed, and scale. We implement a...
Synchronized MIMD Computing (1994)
Bradley C. Kuszmaul, Charles E. Leiserson, F. R. Morgenthaler, Bradley C. Kuszmaul
Synchronized MIMD Computing (1994)
Charles E. Leiserson, F. R. Morgenthaler, Bradley C. Kuszmaul, Bradley C. Kuszmaul
Scheduling Multithreaded Computations by Work Stealing (1994)
Robert D. Blumofe, Charles E. Leiserson
This paper studies the problem of efficiently scheduling fully strict (i.e., well-structured) multithreaded computations on parallel computers. A popular and practical method of scheduling this kind...
Synchronized MIMD Computing (1994)
Charles E. Leiserson, F. R. Morgenthaler, Bradley C. Kuszmaul, Bradley C. Kuszmaul
Efficient Out-of-Core Algorithms for Linear Relaxation Using Blocking Covers (1993)
Charles E. Leiserson, Satish Rao, Sivan Toledo
When a numerical computation fails to fit in the primary memory of a serial or parallel computer, a so-called "out-of-core" algorithm, which moves data between primary and secondary...
Space-Efficient Scheduling of Multithreaded Computations (1993)
Robert D. Blumofe, Charles E. Leiserson
. This paper considers the problem of scheduling dynamic parallel computations to achieve linear speedup without using significantly more space per processor than that required for a single-processor...
Space-Efficient Scheduling of Multithreaded Computations (Extended Abstract) (1993)
Robert D. Blumofe, Charles E. Leiserson
This paper considers the problem of scheduling dynamic parallel computations to achieve linear speedup without using significantly more space per processor than that required for a single-processor...
Managing Storage for Multithreaded Computations (1992)
Charles E. Leiserson, Campbell L. Searle, Robert D. Blumofe, Robert D. Blumofe
at the
Optimizing Two-Phase, Level-Clocked Circuitry (1992)
Extended Abstract, Alexander T. Ishii, Charles E. Leiserson, Marios C. Papaefthymiou, O(v E V
) Alexander T. Ishii NEC C&C Research Laboratories Princeton, New Jersey 08540 Charles E. Leiserson Marios C. Papaefthymiou MIT Laboratory for Computer Science Cambridge, Massachusetts 02139...
A comparison of sorting algorithms for the connection machine cm-2 (1991)
Guy E. Blelloch, Charles E. Leiserson, Stephen J. Smith, Thinking Machines Corp
Abstract We have evaluated the implementation of many parallel sorting algorithms proposed in the literature. We selected the three most promising and implemented highly tuned versions on the...
A Comparison of Sorting Algorithms for the Connection Machine CM-2 (1991)
Guy Blelloch Carnegie, Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, ...
We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM-2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and...
A Comparison of Sorting Algorithms for the Connection Machine CM-2 (1991)
Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, Marco Zagha
We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM-2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and...
Efficient Interconnection Schemes for VLSI and Parallel Computation (1989)
Charles E. Leiserson, Arthur C. Smith, Ronald I. Greenberg, Ronald I. Greenberg
This thesis is primarily concerned with two problems of interconnecting components in VLSI technologies. In the first case, the goal is to construct efficient interconnection networks for...
An application of number theory to the organization of raster-graphics memory (1986)
Benny Chor, Charles E. Leiserson, L. Rivest, James B. Shearer
Abstract. A high-resolution raster-graphics display is usually combined with processing power and a memory organization that facilitates basic graphics operations. For many applications, including...
A Hyperconcentrator Switch for Routing Bit-Serial Messages (1986)
Charles E. Leiserson, Thomas H. Cormen, Thomas H. Cormen
In highly parallel message routing networks, it is sometimes desirable to concentrate relatively few messages on many wires onto fewer wires. We have designed a VLSI chip for this purpose which is...