Chapter 7 Workloads for Programmable Network Interfaces (2008)
Patrick Crowley, Marc E. Fiuczynski, Jean-loup Baer, Brian Bershad
Keywords: simultaneous multithreaded microprocessor, networking applications. Abstract: Network equipment vendors are increasingly incorporating a programmable microprocessor on network interfaces to meet the...
Fundamentals of Programming Languages (2008)
Wayne Amsbury, Jean-loup Baer, Randal Bryant, Peter Calingaert, M. Carberry, H. Khalil, ...
Etude critique et données de compilation du langage Cobol (2008)
The aim of this study is to present briefly (in French) the Cobol language, to compare it with the commercial languages that preceded it, and to critique it as a language...
The Structure and Performance of Interpreters (2008)
Theodore H. Romer, Dennis Lee, Geoffrey M. Voelker, Alec Wolman, Wayne A. Wong, Jean-loup Baer, ...
Interpreted languages have become increasingly popular due to demands for rapid program development, ease of use, portability, and safety. Beyond the general impression that they are...
Instruction Cache Fetch Policies for Speculative Execution (2008)
Current trends in processor design are pointing to deeper and wider pipelines and superscalar architectures. The efficient use of these resources requires speculative execution, a technique whereby...
The Impact of Timeliness for Hardware-based Prefetching from Main Memory (2007)
Among the techniques to hide or tolerate memory latency, data prefetching has been shown to be quite effective. However, this efficiency is often limited to prefetching into the first-level cache....
A Hybrid Framework for Network Processor System Analysis (2007)
Patrick Crowley, Jean-loup Baer
This paper introduces a modeling framework for network processing systems. The framework is composed of independent application, system and traffic models which describe router functionality, system...
Memory hierarchy design for a multiprocessor look-up engine (2003)
Jean-loup Baer, Douglas Low, Patrick Crowley, Neal Sidhwaney
We investigate the implementation of IP look-up for core routers using multiple microengines and a tailored memory hierarchy. The main architectural concerns are limiting the number of and contention...
Worst-Case Execution Time Estimation of Hardware-assisted Multithreaded Processors (2003)
Patrick Crowley, Jean-loup Baer
This paper introduces a method for bounding the worst-case performance of programs running on multithreaded processors, such as the embedded cores found within network processors (NPs). Worst-case...
Joshua Abram Redstone, Henry Levy, Susan Eggers, ...
This is to certify that I have examined this copy of a doctoral dissertation by
James D. Fix, Richard Ladner, Richard Anderson, ...
Reading Committee: Date: and that any and all revisions required by the final examining committee have been made.
Techniques Utilizing Memory Reference Characteristics for Improved Performance (2002)
Wayne A. Wong, Jean-loup Baer, Carl Ebeling, Richard Ladner, ...
and have found that it is complete and satisfactory in all respects,
A Modeling Framework for Network Processor Systems (2002)
Patrick Crowley, Jean-loup Baer
This paper introduces a modeling framework for network processing systems. The framework is composed of independent application, system and traffic models which describe router functionality, system...
Characterizing processor architectures for programmable network interfaces (2000)
Patrick Crowley, Marc E. Fiuczynski, Jean-loup Baer, Brian N. Bershad
The rapid advancements of networking technology have boosted potential bandwidth to the point that the cabling is no longer the bottleneck. Rather, the bottlenecks lie at the crossing points, the...
On the performance of multithreaded architectures for network processors (2000)
Patrick Crowley, Marc E. Fiuczynski, Jean-loup Baer
With the ever-increasing performance and flexibility requirements seen in today's networks, we have seen the development of programmable network processors. Network processors are used both in...
On the use of trace sampling for architectural studies of desktop applications (1999)
Patrick Crowley, Jean-loup Baer
This paper examines the feasibility of performing architectural studies with trace sampling for a suite of desktop application traces on Windows NT. This paper makes three contributions: we compare...
Reducing startup latency in web and desktop applications (1999)
Dennis Lee, Jean-loup Baer, Brian Bershad, Tom Anderson
Application startup latency has become a performance problem for both desktop applications and web applications. In this paper, we show that much of the latency experienced during application startup...
Workloads for Programmable Network Interfaces (1999)
Patrick Crowley, Marc E. Fiuczynski, Jean-loup Baer, Brian Bershad
Network equipment vendors are increasingly incorporating a programmable microprocessor on network interfaces to meet the performance and functionality requirements of present and emerging...
On the Performance Potential of Dynamic Cache Line Sizes (1999)
Eric Anderson, Peter Van Vleet, Lindsay Brown, Jean-loup Baer, Anna R. Karlin
In this paper we present offline algorithms for determining the optimal sequence of loads, superloads and bypasses for direct-mapped caches. We evaluate potential gains in terms of miss rate and...
A Notation for Describing Multiple Views of VLSI Circuits (1998)
Jean-Loup Baer, Meei-Chiueh Liem, Larry McMurchie, Rudolf Nottrott, Lawrence Snyder
A declarative hierarchical notation is introduced that allows the parametric representation of entire families of VLSI circuits. Layout, schematic diagrams and network structure are all accommodated...
Execution characteristics of desktop applications on Windows NT (1998)
Dennis C. Lee, Patrick J. Crowley, Jean-loup Baer, Thomas E. Anderson, Brian N. Bershad
This paper examines the performance of desktop applications running on the Microsoft Windows NT operating system on Intel x86 processors, and contrasts these applications to the programs in the...
Trace Sampling for Desktop Applications on Windows NT (1998)
Patrick Crowley, Jean-loup Baer
This paper examines trace sampling for a suite of desktop application traces on Windows NT. This paper makes two contributions: we compare the accuracy of several sampling techniques to determine...
On the Use of Trace Sampling for Architectural Studies of Desktop Applications (1998)
Patrick Crowley, Jean-loup Baer
This paper examines the feasibility of performing architectural studies with trace sampling for a suite of desktop application traces on Windows NT. This paper makes three contributions: we compare...
On the Use of Trace Sampling for Architectural Studies of Desktop Applications (1998)
Patrick Crowley, Jean-loup Baer
Trace-driven simulation is a common approach for evaluating memory systems. Unfortunately, it also demands large amounts of space and time, particularly for large caches and long running...
This paper presents methods to reduce memory latency in the main memory subsystem below the board-level cache. We consider conventional page-mode DRAMs and cached DRAMs. Evaluation is performed via...
On the Effectiveness of Code Reordering Algorithms for the Alpha and IA32 Architectures (1997)
Ori Gershony, Jean-loup Baer, Dennis Lee
The impact of instruction cache misses and branch mispredictions on performance is becoming increasingly important for processors that issue multiple instructions per cycle. Mechanisms that address...
The Structure and Performance of Interpreters (1996)
Theodore H. Romer, Dennis Lee, Geoffrey M. Voelker, Alec Wolman, Wayne A. Wong, Jean-loup Baer, ...
Interpreted languages have become increasingly popular due to demands for rapid program development, ease of use, portability, and safety. Beyond the general impression that they are...
Recent developments in shared-memory multiprocessor systems advocate using off-the-shelf hardware to provide basic communication mechanisms and using software to implement cache coherence policies....
Effective Hardware-based Data Prefetching for High-performance Processors (1995)
Memory latency and bandwidth are progressing at a much slower pace than processor performance. In this paper, we describe and evaluate the performance of three variations of a hardware...
Instruction Cache Fetch Policies for Speculative Execution (1995)
Dennis Lee, Jean-loup Baer, Brad Calder, Dirk Grunwald
Current trends in processor design are pointing to deeper and wider pipelines and superscalar architectures. The efficient use of these resources requires speculative execution, a technique whereby...
On the Performance of a Bus-based Multiprocessor Cluster Architecture (1995)
Craig Anderson, Jean-loup Baer
The focus of this paper is on the evaluation of a hierarchical cluster architecture where a cluster consists of a single bus shared-memory multiprocessor and where the interconnect is a tree...
A Parallel Trace-driven Simulator: Implementation and Performance (1994)
The simulation of parallel architectures requires an enormous amount of CPU cycles and, in the case of trace-driven simulation, of disk storage. In this paper, we consider the evaluation of the...
Design and Evaluation of a Subblock Cache Coherence Protocol for Bus-Based Multiprocessors (1994)
Craig Anderson, Jean-loup Baer
Parallel applications exhibit a wide variety of memory reference patterns. Designing a memory architecture that serves all applications well is not easy. However, because tolerating or reducing...
Cache-Based Data Distribution Constrained Scheduling (1994)
Sang L. Min, Jae H. Nam, Myung S. Park, Jean-loup Baer
The primary goal of processor scheduling is to assign tasks in a parallel program to processors, so as to minimize the execution time. Most existing approaches to processor scheduling for...
A Comparative Study of Conservative and Optimistic Trace-driven Simulations (1994)
In this paper, we consider the evaluation of the memory hierarchy of multiprocessor systems via parallel trace-driven simulation. We study two parallel simulation schemes: a conservative one using an...
Optimistic Trace-driven Simulation (1994)
Parallel simulation of multiprocessor architectures is a promising direction because a parallel system provides the high computation and storage capabilities that are required by detailed...
A performance study of software and hardware data prefetching schemes (1994)
Prefetching, i.e., exploiting the overlap of processor computations with data accesses, is one of several approaches for tolerating memory latencies. Prefetching can be either hardware-based or...
A Multi-Level Hierarchical Cache Coherence Protocol for Multiprocessors (1993)
Craig Anderson, Jean-loup Baer
In order to meet the computational needs of the next decade, shared-memory processors must be scalable. Though single shared-bus architectures have been successful in the past, limited bus bandwidth...
A Performance Study of Memory Consistency Models (1992)
Richard N. Zucker, Jean-loup Baer
Recent advances in technology are such that the speed of processors is increasing faster than memory latency is decreasing. Therefore the relative cost of a cache miss is becoming more important....
Reducing Memory Latency via Non-blocking and Prefetching Caches (1992)
Tien-Fu Chen, Jean-loup Baer
Non-blocking caches and prefetching caches are two techniques for hiding memory latency by exploiting the overlap of processor computations with data accesses. A non-blocking cache allows execution...
On Synchronization Patterns in Parallel Programs (1991)
Jean-loup Baer, Richard N. Zucker
Efficient synchronization is a key element in obtaining good speed-up from parallel programs. The overhead introduced by synchronization, especially lock manipulation, can sometimes remove any...
Cache coherence protocols: Evaluation using a multiprocessor simulation model (1986)
James Archibald, Jean-loup Baer
Using simulation, we examine the efficiency of several distributed, hardware-based solutions to the cache coherence problem in shared-bus multiprocessors. For each of the approaches, the associated...
Computer Systems Architecture (1980)
Contents: Historical survey of computer systems architecture; Description of computer systems; Arithmetic algorithms; Powerful central processors; Memory...
Etude critique et données de compilation du langage Cobol (1963)
The aim of this study is to present briefly (in French) the Cobol language, to compare it with the commercial languages that preceded it, and to critique it as a language...