Charles E. Leiserson

Abstract The Implementation of the Cilk-5 Multithreaded Language (2009)

Matteo Frigo, Charles E. Leiserson, Keith H. Randall

The fth release of the multithreaded language Cilk uses a provably good \work-stealing " scheduling algorithm similar to the rst system, but the language has been completely redesigned and...

Unbounded Transactional Memory (2009)

C. Scott, Ananian Krste, Asanović Bradley, C. Kuszmaul, Charles E. Leiserson, Sean Lie

Hardware transactional memory should support unbounded transactions: transactions of arbitrary size and duration. We describe a hardware implementation of unbounded transactional memory, called UTM,...

Adaptive Work Stealing with Parallelism Feedback (2008)

Kunal Agrawal, Yuxiong He, Charles E. Leiserson

Abstract We present an adaptive work-stealing thread scheduler, A-STEAL, for fork-join multithreaded jobs, like those written using the Cilk multithreaded language or the Hood work-stealinglibrary....

Abstract Adaptive Work Stealing with Parallelism Feedback (2008)

Kunal Agrawal, Yuxiong He, Charles E. Leiserson

We present an adaptive work-stealing thread scheduler, A-STEAL, for fork-join multithreaded jobs, like those written using the Cilk multithreaded language or the Hood work-stealing library. The...

Unbounded Transactional Memory (2008)

C. Scott, Ananian Krste, Asanović Bradley, C. Kuszmaul, Charles E. Leiserson, Sean Lie

Hardware transactional memory should support unbounded transactions: transactions of arbitrary size and duration. We describe a hardware implementation of unbounded transactional memory, called UTM,...

Abstract A Comparison of Sorting Algorithms for the Connection Machine CM-2 (2008)

Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, Marco Zagha

We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM-2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and...

Abstract Cilk: An Efficient Multithreaded Runtime System (2008)

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou

Cilk (pronounced “silk”) is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and...

References (2008)

Amund Tveit, Thomas H. Cormen, Charles E. Leiserson

Abstract. A lower bound of Omega(n 2 log(n)) is proved for the time complexity of calculating the inverse of a matrix nxn, over the real or complex numbers in the sequential computation case.

[David 00] David Solomon, Data Compression: The Complete Reference, Springer, 2000. (2008)

Rivest Thomas, H. Cormen, Charles E. Leiserson, Ronald L

[Abbott and Ahuja 90] A. L. Abbott and N. Ahuja, “Active surface reconstruction by integrating focus, vergence, stereo, and camera calibration”, Proceedings: Third

Min-max heaps – Deaps – Leftist heaps –Binomial heaps – Fibonacci heaps – Skew heaps- Lazy-binomial heaps. 3. Search Structures 9 (2008)

Thomas H. Corman, Charles E. Leiserson, Ronald L. Rivest, Introduction To

asymptotic notation with several parameters – conditional asymptotic notation – amortized analysis – NPcompleteness – NP-hard – recurrence equations – solving recurrence equations –...

Processing Letters, 18(3):147–150, 1984. (2008)

Ab Sanjeev Arora, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, Introduction Algorithms, ...

modern approach. Initial draft kindly placed by the authors on the web, 2006. [AGP94] W. R. Alford, A. Granville, and C. Pomerance. There are infinitely many Carmichael numbers. Annals of Mathematics,

[3] Jørgen Bang-Jensen and Gregory Gutin. Digraphs: Theory, Algorithms and Applications. (2008)

S. Ferenczi, C. Mauduit, A. Siegel, Elwyn R. Berlekamp, Hohn H. Conway, ...

[5] József Beck. On positional games. Journal of Combinatorial Theory, Series A, 30:117–

Abstract Adaptive Work Stealing with Parallelism Feedback (2008)

Kunal Agrawal, Yuxiong He, Charles E. Leiserson

We present an adaptive work-stealing thread scheduler, A-STEAL, for fork-join multithreaded jobs, like those written using the Cilk multithreaded language or the Hood work-stealing library. The...

Provably Efficient Adaptive Scheduling For Parallel Jobs (2008)

Yuxiong He, Wen Jing Hsu, Charles E. Leiserson

Abstract — Scheduling competing jobs on multiprocessors has always been an important issue for parallel and distributed systems. The challenge is to ensure global, systemwide efficiency while...

Abstract The Implementation of the Cilk-5 Multithreaded Language (2008)

Matteo Frigo, Charles E. Leiserson, Keith H. Randall

The fth release of the multithreaded language Cilk uses a provably good \work-stealing " scheduling algorithm similar to the rst system, but the language has been completely redesigned and...

Adversarial Analyses of Window Backoff Strategies for Simple Multiple-Access Channels (2008)

Michael A. Bender, Martin Farach-colton, Simai He, Bradley C. Kuszmaul, Charles E. Leiserson

Abstract — Backoff strategies have typically been analyzed by making statistical assumptions on the distribution of problem inputs. Although these analyses have provided valuable insights into the...

Unbounded Transactional Memory (2008)

C. Scott, Ananian Krste, Asanović Bradley, C. Kuszmaul, Charles E. Leiserson, Sean Lie

Hardware transactional memory should support unbounded transactions: transactions of arbitrary size and duration. We describe a hardware implementation of unbounded transactional memory, called UTM,...

Unbounded Transactional Memory (2008)

C. Scott, Ananian Krste, Asanović Bradley, C. Kuszmaul, Charles E. Leiserson, Sean Lie

Hardware transactional memory should support unbounded transactions: transactions of arbitrary size and duration. We describe a hardware implementation of unbounded transactional memory, called UTM,...

Theory and Applications. Kluwer Academic Publishers, 1987. Another textbook on simulated annealing (2008)

Brian Borchers, Jan Korst, Simulated Annealing, Thomas L. Magnanti, As Noga Alon, ...

I've compiled this list of books on combinatorial optimization and integer programming as a source of additional reading and project ideas for students in my graduate course in combinatorial...

Abstract On-the-Fly Maintenance of Series-Parallel Relationships in Fork-Join Multithreaded Programs (2008)

Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, Charles E. Leiserson

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...

Abstract On-the-Fly Maintenance of Series-Parallel Relationships in Fork-Join Multithreaded Programs (2008)

Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, Charles E. Leiserson

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...

An Experimental Comparison of Edge-Triggering and Level-Clocking (2007)

Charles E. Leiserson, A. Gould, Keith H. Randall, Keith H. Randall

Circuits implemented with two-phase level-clocked latches have the theoretical potential to operate faster and require less state than equivalent circuits implemented with edge-triggered latches. We...

Optimizing Two-Phase, Level-Clocked Circuitry (2007)

Extend Ed, Extended Abstract, Alexander T. Ishii, Charles E. Leiserson, Marios C. Papaefthymiou, O(v E V

) Alexander T. Ishii NEC C&C Research Laboratories Princeton, New Jersey 08540 Charles E. Leiserson Marios C. Papaefthymiou MIT Laboratory for Computer Science Cambridge, Massachusetts 02139...

Certified by......................................................... (2007)

Charles E. Leiserson, R. Morgenthaler

This thesis presents the theory, design, and implementation of Cilk (pronounced "silk") and Cilk-NOW. Cilk is a C-based language and portable runtime system for programming and...

Certified by fGE] ' '/-t------- (2007)

Marios Christos Papaefthymiou, Marios Christos Papaefthymiou, Charles E. Leiserson

In this paper we investigate properties of retirning, a circuit transformation which preserves the behavior of the circuit as a whole. We present an algorithm which transforms a given combinational...

and Creating Locality in Numerical Algorithms (2007)

Charles E. Leiserson, Sivan Avraham Toledo, Sivan Avraham Toledo

How do you determine the running time of a program without actually running it? How do you design an efficient out-of-core iterative algorithm? These are the two questions answered in this thesis....

by (2007)

Charles E. Leiserson, R. Morgenthaler, Christopher F. Joerg, Christopher F. Joerg

Although cost-effective parallel machines are now commercially available, the widespread use of parallel processing is still being held back, due mainly to the troublesome nature of parallel...

Certified by......................................................... (2007)

Charles E. Leiserson, R. Morgenthaler, Robert D. Blumofe

This thesis presents the theory, design, and implementation of Cilk (pronounced "silk") and Cilk-NOW. Cilk is a C-based language and portable runtime system for programming and...

1 Efficient Out-of-Core Algorithms for Linear Relaxation Using Blocking Covers (Extended Abstract) (2007)

Charles E. Leiserson, Satish Rao, Sivan Toledo

When a numerical computation fails to fit in the primary memory of a serial or parallel computer, a so-called "out-of-core " algorithm must be used which moves data between primary...

Provably Efficient Adaptive Scheduling for Parallel Jobs (2007)

He, Yuxiong, Hsu, Wen Jing, Leiserson, Charles E.

Scheduling competing jobs on multiprocessors has always been an important issue for parallel and distributed systems. The challenge is to ensure global, system-wide efficiency while offering a level...

Provably Efficient Adaptive Scheduling for Parallel Jobs (2007)

He, Yuxiong, Hsu, Wen Jing, Leiserson, Charles E.

Scheduling competing jobs on multiprocessors has always been an important issue for parallel and distributed systems. The challenge is to ensure global, system-wide efficiency while offering a level...

An empirical evaluation of work stealing with parallelism feedback (2006)

Kunal Agrawal, Yuxiong He, Charles E. Leiserson

A-STEAL is a provably good adaptive work-stealing thread scheduler that provides parallelism feedback to a multiprocessor job scheduler. A-STEAL uses a simple multiplicative-increase,...

Provably efficient two-level adaptive scheduling (2006)

Yuxiong He, Wen-jing Hsu, Charles E. Leiserson

Abstract. Multiprocessor scheduling in a shared multiprogramming environment can be structured in two levels, where a kernel-level job scheduler allots processors to jobs and a user-level thread...

Memory models for open-nested transactions (2006)

Kunal Agrawal, Charles E. Leiserson, Jim Sukha

Open nesting provides a loophole in the strict model of atomic transactions. Moss and Hosking suggested adapting open nesting for transactional memory, and Moss and a group at Stanford have proposed...

Memory models for open-nested transactions (2006)

Kunal Agrawal, Charles E. Leiserson, Jim Sukha

Open nesting provides a loophole in the strict model of atomic transactions. Moss and Hosking suggested adapting open nesting for transactional memory, and Moss and a group at Stanford have proposed...

An empirical evaluation of work stealing with parallelism feedback (2006)

Kunal Agrawal, Yuxiong He, Charles E. Leiserson

Abstract A-STEAL is a provably good adaptive work-stealingthread scheduler that provides parallelism feedback to a multiprocessor job scheduler. A-STEAL uses a simplemultiplicative-increase,...

Memory models for open-nested transactions (2006)

Kunal Agrawal, Charles E. Leiserson, Jim Sukha

Open nesting provides a loophole in the strict model of atomic transactions. Moss and Hosking suggested adapting open nesting for transactional memory, and Moss and a group at Stanford have proposed...

Provably efficient two-level adaptive scheduling (2006)

Yuxiong He, Wen-jing Hsu, Charles E. Leiserson

Abstract. Multiprocessor scheduling in a shared multiprogramming environment can be structured in two levels, where a kernel-level job scheduler allots processors to jobs and a user-level thread...

External-memory search trees with fast insertions (2006)

Jelani Nelson, Charles E. Leiserson, Bradley C. Kuszmaul, Jelani Nelson

This thesis provides both experimental and theoretical contributions regarding externalmemory dynamic search trees with fast insertions. The first contribution is the implementation of the buffered...

An empirical evaluation of work stealing with parallelism feedback (2006)

Kunal Agrawal, Yuxiong He, Charles E. Leiserson

Abstract A-STEAL is a provably good adaptive work-stealingthread scheduler that provides parallelism feedback to a multiprocessor job scheduler. A-STEAL uses a simplemultiplicative-increase,...

An empirical evaluation of work stealing with parallelism feedback (2006)

Kunal Agrawal, Yuxiong He, Charles E. Leiserson

A-STEAL is a provably good adaptive work-stealing thread scheduler that provides parallelism feedback to a multiprocessor job scheduler. A-STEAL uses a simple multiplicative-increase,...

Programming with Exceptions in JCilk (2005)

Danaher, John S., Lee, I-Ting Angelina, Leiserson, Charles E.

JCilk extends the Java language to provide call-return semantics for multithreading, much as Cilk does for C. Java's built-in thread model does not support the passing of exceptions or return values...

Programming with Exceptions in JCilk (2005)

Danaher, John S., Lee, I-Ting Angelina, Leiserson, Charles E.

JCilk extends the Java language to provide call-return semantics for multithreading, much as Cilk does for C. Java's built-in thread model does not support the passing of exceptions or return values...

The JCilk Language for Multithreaded Computing (2005)

Danaher, John, Lee, I-Ting Angelina, Leiserson, Charles E.

JCilk extends the Java language to provide call-return semantics for multithreading, much as Cilk does for C. Java's built-in thread model does not support the passing of exceptions or return values...

Certified by..................... (2005)

Jim Sukha, Charles E. Leiserson, Bradley C. Kuszmaul, Arthur C. Smith

Memory-mapped transactions combine the advantages of both memory mapping and transactions to provide a programming interface for concurrently accessing data on disk without explicit I/O or locking...

Unbounded Transactional Memory (2005)

C. Scott Ananian, Krste Asanović, Bradley C. Kuszmaul, Charles E. Leiserson, Sean Lie

Background: Programming in a shared-memory environment often requires the use of atomic regions for program correctness. Traditionally, atomicity is achieved through critical sections protected by...

Certified by.......................................................... (2005)

John Danaher, Charles E. Leiserson, Arthur C. Smith

JCilk extends the Java language to provide call-return semantics for multithreading, much as Cilk does for C. Java’s built-in thread model does not support the passing of exceptions or return...

Certified by.......................................................... (2005)

Charles E. Leiserson

JCilk is a Java-based multithreaded programming language which extends Java to provide a dynamic threading model. Specifically, JCilk imports Cilk’s fork-join primitives spawn and sync into Java to...

Thesis Supervisor Certified by.......................................................... (2005)

Jim Sukha, Charles E. Leiserson, Bradley C. Kuszmaul, Arthur C. Smith

Memory-mapped transactions combine the advantages of both memory mapping and transactions to provide a programming interface for concurrently accessing data on disk without explicit I/O or locking...

On-the-Fly Maintenance of Series-Parallel Relationships in Fork-Join Multithreaded Programs (2004)

Bender, Michael A., Fineman, Jeremy T., Gilbert, Seth, Leiserson, Charles E.

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...

On-the-Fly Maintenance of Series-Parallel Relationships in Fork-Join Multithreaded Programs (2004)

Bender, Michael A., Fineman, Jeremy T., Gilbert, Seth, Leiserson, Charles E.

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...

Certified by.;..............;.,. v....... Accepted by.. (2004)

Siddhartha Sen, Charles E. Leiserson

This thesis addresses the problem of scheduling multiple, concurrent, adaptively parallel jobs on a multiprogrammed shared-memory multiprocessor. Adaptively parallel jobs are jobs for which the...

References (2004)

Jeff M. Phillips, Thomas H. Cormen, Charles E. Leiserson

Union-Find is a simple and incredibly useful algorithm for finding connected sets in a graph, G = (V, E) that is often taught in an advanced algorithms class. Where n = |V | and m = |E|, Union-Find...

Dynamic processor allocation for adaptively parallel jobs (2004)

Siddhartha Sen, Charles E. Leiserson, Siddhartha Sen

This thesis addresses the problem of scheduling multiple, concurrent, adaptively parallel jobs on a multiprogrammed shared-memory multiprocessor. Adaptively parallel jobs are jobs for which the...

On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs (2004)

Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, Charles E. Leiserson

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...

Work-Stealing Jobs (2004)

Siddhartha Sen, Charles E. Leiserson, Siddhartha Sen

This thesis addresses the problem of scheduling multiple, concurrent, adaptively parallel jobs on a multiprogrammed shared-memory multiprocessor. Adaptively parallel jobs are jobs for which the...

On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs (2004)

Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, Charles E. Leiserson

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...

Leiserson.On-the-fly maintenance of series-parallel relationships in fork-join multithreaded programs (2004)

Michael A. Bender, Jeremy T. Fineman, Seth Gilbert, Charles E. Leiserson

A key capability of data-race detectors is to determine whether one thread executes logically in parallel with another or whether the threads must operate in series. This paper provides two...

Adversarial Analyses of Window Backoff Strategies for Simple Multiple-Access Channels (2003)

Bender, Michael A., Farach-Colton, Martin, He, Simai, Kuszmaul, Bradley C., Leiserson, Charles E.

Backoff strategies have typically been analyzed by making statistical assumptions on the distribution of problem inputs. Although these analyses have provided valuable insights into the efficacy of...

Hardware Transactional Memory (2003)

Lie, Sean, Asanovic, Krste, Kuszmaul, Bradley C., Leiserson, Charles E.

This work shows how hardware transactional memory (HTM) can be implemented to support transactions of arbitrarily large size, while ensuring that small transactions run efficiently. Our...

A Media Player for Use in Distance Education (2003)

Huang, Kai, Leiserson, Charles E., Sarmenta, Luis F.G.

We have developed a media player for use in distance education. The player can incorporate several time-indexed sources, including video, audio, PowerPoint, and text index. We have converted all the...

A Media Player for Use in Distance Education (2003)

Huang, Kai, Leiserson, Charles E., Sarmenta, Luis F.G.

We have developed a media player for use in distance education. The player can incorporate several time-indexed sources, including video, audio, PowerPoint, and text index. We have converted all the...

Hardware Transactional Memory (2003)

Lie, Sean, Asanovic, Krste, Kuszmaul, Bradley C., Leiserson, Charles E.

This work shows how hardware transactional memory (HTM) can be implemented to support transactions of arbitrarily large size, while ensuring that small transactions run efficiently. Our...

Adversarial Analyses of Window Backoff Strategies for Simple Multiple-Access Channels (2003)

Bender, Michael A., Farach-Colton, Martin, He, Simai, Kuszmaul, Bradley C., Leiserson, Charles E.

Backoff strategies have typically been analyzed by making statistical assumptions on the distribution of problem inputs. Although these analyses have provided valuable insights into the efficacy of...

Transactions Everywhere (2003)

Kuszmaul, Bradley C., Leiserson, Charles E.

Arguably, one of the biggest deterrants for software developers who might otherwise choose to write parallel code is that parallelism makes their lives more complicated. Perhaps the most basic...

Transactions Everywhere (2003)

Kuszmaul, Bradley C., Leiserson, Charles E.

Arguably, one of the biggest deterrants for software developers who might otherwise choose to write parallel code is that parallelism makes their lives more complicated. Perhaps the most basic...

Transactions everywhere (2003)

Bradley C. Kuszmaul, Charles E. Leiserson, Sma Fellow

Abstract — Arguably, one of the biggest deterrants for software developers who might otherwise choose to write parallel code is that parallelism makes their lives more complicated. Perhaps the most...

Systolic Arrays for (VLSI). (2002)

Kung,H. T., Leiserson,Charles E.

A systolic system is a network of processors which rhythmically compute and pass data through the system. Physiologists use the work 'systole' to refer to the rhythmically recurrent contraction of...

A Layout for the Shuffle-Exchange Network. (2002)

Hoey,Dan, Leiserson,Charles E.

This paper describes a technique for producing a VLSI layout of the shuffle-exchange graph. It is based on the layout procedure which lays out a graph by bisecting the graph, recursively laying out...

Optimal Placement for River Routing. (2002)

Leiserson,Charles E., Pinter,Roy Y.

Programs for integrated circuit layout typically have two phases: placement and routing. The router should produce as efficient a layout as possible, but of course the quality of the routing depends...

Wafer-Scale Integration of Systolic Arrays, (2002)

Leighton,Frank Thomson, Leiserson,Charles E.

VLSI technologists are fast developing wafer-scale integration. Rather than partitioning a silicon wafer into chips as is usually done, the idea behind wafer-scale integration is to assemble an...

A Coherent VLSI Design Environment. (2002)

Glasser,Lance A., Leiserson,Charles E., Rivest,Ronald L.

This report covers the period from April 1, 1985 through September 30, 1985. The research discussed here is described in more detail in several published and unpublished reports cited. The CAD frame...

Randomized Routing on Fat-Trees, (2002)

Greenberg, Ronald I., Leiserson, Charles E.

Fat-trees are a class of routing networks for hardware-efficient parallel computation. This paper presents a randomized algorithm for routing messages on a fat-tree. The quality of the algorithm is...

Randomized Routing on Fat Trees. (2002)

Greenberg,Ronald I., Leiserson,Charles E.

Fat-trees are a class of routing networks for hardware-efficient parallel computation. This paper presents a randomized algorithm for routing messages on a fat-tree. The quality of the algorithm is...

Retiming Synchronous Circuitry. (2002)

Leiserson,Charles E., Saxe,James B.

This paper shows how the technique of retiming can be used to transform a given synchronous circuit into a more efficient circuit under a variety of different cost criteria. We model a circuit as a...

A Hyperconcentrator Switch for Routing Bit-Serial Messages (Extended Abstract), (2002)

Cormen,Thomas H., Leiserson,Charles E.

In highly parallel message routing networks, it is sometimes desirable to concentrate relatively few messages on many wires onto fewer wires. We have designed a VLSI chip for this purpose which is...

Communication-Efficient Parallel Graph Algorithms. (2002)

Leiserson,Charles E., Maggs,Bruce M.

Communication bandwidth is a resource ignored by most parallel random-access machine (PRAM) models. This paper shows that many graph problems can be solved in parallel, not only with polylogarithmic...

A Hyperconcentrator Switch for Routing Bit-Serial Messages. (2002)

Cormen,Thomas H., Leiserson,Charles E.

In highly parallel message routing networks, it is sometimes desirable to concentrate relatively few messages on many wires onto fewer wires. We have designed a VLSI chip for this purpose which is...

A Coherent VLSI Design Environment. (2002)

Dally, William J., Glasser, Lance A., Leiserson, Charles E.

The CAD frame Schema has been outfitted with a uniform protocol or aggregate data structures, which includes new control structures for examining and searching through the data structures. Circuit...

Using Cilk to Write Multiprocessor Chess Programs (2001)

Don Dailey, Charles E. Leiserson

This paper overviews the Cilk language, illustrating how Cilk supports the programming of parallel game-tree search and other chess mechanisms

Portable High-Performance Programs (1999)

Matteo Frigo, Charles E. Leiserson

right notice and this permission notice are preserved on all copies.

Cache-oblivious algorithms (Extended Abstract) (1999)

Matteo Frigo, Charles E. Leiserson, Harald Prokop, Sridhar Ramachandran

This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these...

Cache-Oblivious Algorithms (Extended Abstract) (1999)

Matteo Frigo, Charles E. Leiserson, Harald Prokop, Sridhar Ramachandran, Z W(l

This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms, these...

Cache-Oblivious Algorithms (Extended Abstract) (1999)

Matteo Frigo, Charles E. Leiserson, Harald Prokop, Sridhar Ramachandran, Z W(l

SUBMITTED FOR PUBLICATION. Matteo Frigo Charles E. Leiserson Harald Prokop Sridhar Ramachandran MIT Laboratory for Computer Science, 545 Technology Square, Cambridge, MA 02139...

Cache-oblivious algorithms (extended abstract (1999)

Matteo Frigo, Charles E. Leiserson, Harald Prokop, Sridhar Ramachandran

Abstract This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms,...

Cache-oblivious algorithms (extended abstract (1999)

Matteo Frigo, Charles E. Leiserson, Harald Prokop, Sridhar Ramachandran

Abstract This paper presents asymptotically optimal algorithms for rectangular matrix transpose, FFT, and sorting on computers with multiple levels of caching. Unlike previous optimal algorithms,...

Optimixing Synchronous Systems. (1998)

Leiserson,Charles E., Saxe,John B.

The complexity of integrated-circuit chips produced today makes it feasible to build inexpensive, special-purpose subsystems that rapidly solve sophisticated problems on behalf of a general-purpose...

Wafer-Scale Integration of Systolic Arrays, (1998)

Leighton,Frank Thomson, Leiserson,Charles E.

VLSI technologies are fast developing wafer-scale integration. Rather than partitioning a silicon wafer into chips as is usually done, the idea behind wafer-scale integration is to assemble an entire...

A Space-Efficient Algorithm for Finding the Connected Components of Rectangles in the Plane. (1998)

Leiserson,Charles E., Phillips,Cynthia A.

An algorithm is presented for determining the connectivity of a set of N rectangles in the plane, a problem central to avoiding aliasing in VLSI design rule checkers. Previous algorithms for this...

Retiming Synchronous Circuitry. (1998)

Leiserson, Charles E., Saxe, James B.

This paper describes a circuit transformation called retiming in which registers are added at some points in a circuit and removed from others in such a way that he functional behavior of the circuit...

The Organization of Permutation Architectures with Bussed Interconnections. (1998)

Kilian, Joe, Kipnis, Shlomo, Leiserson, Charles E.

This paper explores the problem of efficiently permuting data stored in VLSI chips in accordance with a predetermined set of permutations. By connecting chips with shared bus interconnections, as...

The Organization of Permutation Architectures with Bussed Interconnections. (1998)

Kilian, Joe, Kipnis, Sholomo, Leiserson, Charles E.

This paper explores the problem of efficiently permuting data stored in VLSI chips in accordance with a predetermined set of permutations. By connecting chips with shared bus interconnections, as...

VLSI Theory and Parallel Supercomputing. (1998)

Leiserson, Charles E.

Since its inception, VLSI theory has expanded in many fruitful and interesting directions. One major branch is layout theory which studies the efficiency with which graphs can be embedded in the...

Proceedings of the MIT Student Workshop on VLSI and Parallel Systems Held in Dedham, Massachusetts on 21 July 1992, (1998)

Leiserson, Charles E.

Partial contents: Micron-Scale Display Technology; Virtual Memory for Data-Parallel Computing; The Optimal Synthesis of VLSI Array Architectures from Algorithmic Descriptions; Clay-1: A Distributed...

A Timing Analysis of Level-Clocked Circuitry, (1998)

Ishii, Alexander T., Leiserson, Charles E.

This paper presents an algorithm for verifying proper timing in VLSI circuits where latches are controlled by the levels (high or low) of the controlling clocks rather than the transitions (edges) of...

An Experimental Analysis of Parallel Sorting Algorithms (1998)

Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, Marco Zagha

We have developed a methodology for predicting the performance of parallel algorithms on real parallel machines. The methodology consists of two steps. First, we characterize a machine by enumerating...

Programming parallel applications in cilk (1998)

Charles E. Leiserson, Aske Plaat

Cilk (pronounced "silk") is a C-based language for multithreaded parallel programming. Cilk makes it easy to program irregular parallel applications, especially as compared with...

Using de Bruijn Sequences to Index a 1 in a Computer Word (1998)

Charles E. Leiserson, Harald Prokop, Keith H. Randall

Some computers provide an instruction to find the index of a 1 in a computer word, but many do not. This paper provides a fast and novel algorithm based on de Bruijn sequences to solve this problem....

A Minicourse on Multithreaded Programming (1998)

Charles E. Leiserson, Harald Prokop

These notes contain two lectures that teach multithreaded algorithms using a Cilklike [7, 9, 11] model. These lectures were designed for the latter part of the MIT undergraduate class 6.046...

Space-Efficient Scheduling of Multithreaded Computations (1998)

Robert D. Blumofe, Charles E. Leiserson

.<F3.854e+05> This paper considers the problem of scheduling dynamic parallel computations to achieve linear speedup without using significantly more space per processor than that required for...

Detecting Data Races in Cilk Programs that Use Locks (1998)

Guang-ien Cheng, Mingdong Feng, Charles E. Leiserson, Keith H. Randall, Andrew F. Stark

When two parallel threads holding no locks in common access the same memory location and at least one of the threads modifies the location, a "data race" occurs, which is usually a bug....

Cilk: Efficient Multithreaded Computing (1998)

Keith H. Randall, Charles E. Leiserson, H. Randall

This thesis describes Cilk, a parallel multithreaded language for programming contemporary shared memory multiprocessors (SMP's). Cilk is a simple extension of C which provides constructs for...

The Implementation of the Cilk-5 Multithreaded Language (1998)

Matteo Frigo, Charles E. Leiserson, Keith H. Randall

The fifth release of the multithreaded language Cilk uses a provably good "work-stealing" scheduling algorithm similar to the first system, but the language has been completely redesigned...

Optimizing Two-Phase, Level-Clocked Circuitry (1997)

Alexander T. Ishii, Charles E. Leiserson, Marios C. Papaefthymiou

We investigate two strategies for reducing the clock period of a two-phase, level-clocked circuit: clock tuning, which adjusts the waveforms that clock the circuit, and retiming, which relocates...

Efficient Detection of Determinacy Races in Cilk Programs (1997)

Mingdong Feng, Charles E. Leiserson

A parallel multithreaded program that is ostensibly deterministic may nevertheless behave nondeterministically due to bugs in the code. These bugs are called determinacy races, and they result when...

Efficient Detection of Determinacy Races in Cilk Programs (1997)

Mingdong Feng, Charles E. Leiserson

A parallel multithreaded program that is ostensibly deterministic may nevertheless behave nondeterministically due to bugs in the code. These bugs are called determinacy races, and they result when...

Dag-Consistent Distributed Shared Memory (1996)

Robert D. Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, Keith H. Randall

We introduce dag consistency, a relaxed consistency model for distributed shared memory which is suitable for multithreaded programming. We have implemented dag consistency in software for the Cilk...

An Analysis of Dag-Consistent Distributed Shared-Memory Algorithms (1996)

Robert Blumofe, Matteo Frigo, Christopher F Joerg, Charles E Leiserson, Keith H Randall

In this paper, we analyze the performance of parallel multithreaded algorithms that use dag-consistent distributed shared memory. Specifically, we analyze execution time, page faults, and space...

An Analysis of Dag-Consistent Distributed Shared-Memory Algorithms (1996)

Robert D. Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, Keith H. Randall

In this paper, we analyze the performance of parallel multithreaded algorithms that use dag-consistent distributed shared memory. Specifically, we analyze execution time, page faults, and space...

Dag-Consistent Distributed Shared Memory (1996)

Robert Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, Keith H. Randall

We introduce dag consistency, a relaxed consistency model for distributed shared memory which is suitable for multithreaded programming. We have implemented dag consistency in software for the Cilk...

Randomized Routing on Fat-Trees (1996)

Ronald I. Greenberg, Charles E. Leiserson

Fat-trees are a class of routing networks for hardware-efficient parallel computation. This paper presents a randomized algorithm for routing messages on a fat-tree. The quality of the algorithm is...

Abstract (1996)

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou

Cilk (pronounced “silk”) is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and...

Abstract (1996)

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou

Cilk (pronounced “silk”) is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and...

Dag-consistent distributed shared memory (1996)

Robert D. Blumofe, Matteo Frigo, Christopher F. Joerg, Charles E. Leiserson, Keith H. Randall

We introduce dag consistency, a relaxed consistency model for distributed shared memory which is suitable for multithreaded programming. We have implemented dag consistency in software for the Cilk...

Executing Multithreaded Programs Efficiently (1995)

Robert D. Blumofe, Charles E. Leiserson, Robert D. Blumofe

right to do so. by::::::::::::::::::::::::::::::::::::::::::::::::::::::::

Cilk: An efficient multithreaded runtime system (1995)

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou

Cilk (pronounced “silk”) is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and...

Cilk: An efficient multithreaded runtime system (1995)

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou

Cilk (pronounced &quot;silk&quot;) is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both...

Cilk: An Efficient Multithreaded Runtime System (1995)

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou

Cilk (pronounced "silk") is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically...

Heterogeneous Multithreaded Computing (1995)

Charles E. Leiserson, F. R. Morgenthaler, Howard J. Lu, Howard J. Lu

This thesis studies issues introduced by supporting multithreaded computing in environments composed of workers heterogeneous with respect to data representation, speed, and scale. We implement a...

Cilk: An Efficient Multithreaded Runtime System (1995)

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou

Cilk (pronounced "silk") is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically...

Parallel Algorithms for the Circuit Value Update Problem (1995)

Charles E. Leiserson, Keith H. Randall

The circuit value update problem is the problem of updating values in a representation of a combinational circuit when some of the inputs are changed. We assume for simplicity that each combinational...

Cilk: An efficient multithreaded runtime system (1995)

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, Yuli Zhou

Cilk (pronounced “silk”) is a C-based runtime system for multithreaded parallel programming. In this paper, we document the efficiency of the Cilk work-stealing scheduler, both empirically and...

Heterogeneous multithreaded computing (1995)

Howard J. Lu, Charles E. Leiserson, F. R. Morgenthaler, Howard J. Lu

This thesis studies issues introduced by supporting multithreaded computing in environments composed of workers heterogeneous with respect to data representation, speed, and scale. We implement a...

Scheduling Multithreaded Computations by Work Stealing (1994)

Robert D. Blumofe, Charles E. Leiserson

This paper studies the problem of efficiently scheduling fully strict (i.e., well-structured) multithreaded computations on parallel computers. A popular and practical method of scheduling this kind...

Efficient Out-of-Core Algorithms for Linear Relaxation Using Blocking Covers (1993)

Charles E. Leiserson, Satish Rao, Sivan Toledo

When a numerical computation fails to fit in the primary memory of a serial or parallel computer, a so-called "out-of-core" algorithm, which moves data between primary and secondary...

Space-Efficient Scheduling of Multithreaded Computations (1993)

Robert D. Blumofe, Charles E. Leiserson

. This paper considers the problem of scheduling dynamic parallel computations to achieve linear speedup without using significantly more space per processor than that required for a single-processor...

Space-Efficient Scheduling of Multithreaded Computations (Extended Abstract) (1993)

Robert D. Blumofe, Charles E. Leiserson

This paper considers the problem of scheduling dynamic parallel computations to achieve linear speedup without using significantly more space per processor than that required for a single-processor...

Optimizing Two-Phase, Level-Clocked Circuitry (1992)

Extended Abstract, Alexander T. Ishii, Charles E. Leiserson, Marios C. Papaefthymiou, O(v E V

) Alexander T. Ishii NEC C&C Research Laboratories Princeton, New Jersey 08540 Charles E. Leiserson Marios C. Papaefthymiou MIT Laboratory for Computer Science Cambridge, Massachusetts 02139...

A comparison of sorting algorithms for the connection machine cm-2 (1991)

Guy E. Blelloch, Charles E. Leiserson, Stephen J. Smith, Thinking Machines Corp

Abstract We have evaluated the implementation of many parallel sorting algorithms proposed in the literature. We selected the three most promising and implemented highly tuned versions on the...

A Comparison of Sorting Algorithms for the Connection Machine CM-2 (1991)

Guy Blelloch Carnegie, Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, ...

We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM-2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and...

A Comparison of Sorting Algorithms for the Connection Machine CM-2 (1991)

Guy E. Blelloch, C. Greg Plaxton, Charles E. Leiserson, Stephen J. Smith, Bruce M. Maggs, Marco Zagha

We have implemented three parallel sorting algorithms on the Connection Machine Supercomputer model CM-2: Batcher's bitonic sort, a parallel radix sort, and a sample sort similar to Reif and...

Efficient Interconnection Schemes for VLSI and Parallel Computation (1989)

Charles E. Leiserson, Arthur C. Smith, Ronald I. Greenberg, Ronald I. Greenberg

This thesis is primarily concerned with two problems of interconnecting components in VLSI technologies. In the first case, the goal is to construct efficient interconnection networks for...

An application of number theory to the organization of raster-graphics memory (1986)

Benny Chor, Charles E. Leiserson, L. Rivest, James B. Shearer

Abstract. A high-resolution raster-graphics display is usually combined with processing power and a memory organization that facilitates basic graphics operations. For many applications, including...

A Hyperconcentrator Switch for Routing Bit-Serial Messages (1986)

Charles E. Leiserson, Thomas H. Cormen, Thomas H. Cormen

In highly parallel message routing networks, it is sometimes desirable to concentrate relatively few messages on many wires onto fewer wires. We have designed a VLSI chip for this purpose which is...