Scott Mahlke

Publication list details

Period

1991 - 2009

Count

82

Co-authors

A System Solution for High-Performance, Low Power SDR (2009)

Yuan Lin, Hyunseok Lee, Yoav Harel, Mark Woh, Scott Mahlke, Trevor Mudge, ...

One central challenge in the realization of Software Defined Radio (SDR) is to provide a programmable solution that meets the challenging high-performance, low-power requirements, while providing an...

Hierarchical Coarse-grained Stream Compilation for Software Defined Radio (2008)

Yuan Lin, Manjunath Kudlur, Scott Mahlke, Trevor Mudge

Software Defined Radio (SDR) is an emerging embedded domain where the physical layer of wireless protocols is implemented in software rather than the traditional application specific hardware. The...

Modulo Scheduling for Highly Customized Datapaths to Increase Hardware Reusability (2008)

Kevin Fan, Hyunchul Park, Manjunath Kudlur, Scott Mahlke

In the embedded domain, custom hardware in the form of ASICs is often used to implement critical parts of applications when performance and energy efficiency goals cannot be met with software...

VEAL: Virtualized Execution Accelerator for Loops (2008)

Nathan Clark, Amir Hormati, Scott Mahlke

Performance improvement solely through transistor scaling is becoming more and more difficult, thus it is increasingly common to see domain specific accelerators used in conjunction with general...

Control CPR: A Branch Height Reduction Optimization for EPIC Architectures (2008)

Michael Schlansker, Scott Mahlke, Richard Johnson

The challenge of exploiting high degrees of instruction-level parallelism is often hampered by frequent branching. Both exposed branch latency and low branch throughput can restrict parallelism....

Exploiting Narrow Accelerators with Data-Centric Subgraph Mapping (2008)

Amir Hormati, Nathan Clark, Scott Mahlke

The demand for high performance has driven acyclic computation accelerators into extensive use in modern embedded and desktop architectures. Accelerators that are ideal from a software perspective,...

Analog ASICs (2008)

Yuan Lin, Hyunseok Lee, Mark Woh, Yoav Harel, Sangwon Seo, Rob Mullenix, ...

Software Defined Radio (SDR): Use of software routines instead of ASICs for physical layer operations of wireless 802.11...

SPEX: A programming language for software defined radio (2008)

Yuan Lin, Robert Mullenix, Mark Woh, Scott Mahlke, Trevor Mudge, Alastair Reid, ...

High-throughput, low-power Software Defined Radio (SDR) solutions require multi-core SIMD DSP processors to meet real-time performance requirements. Given the difficulty in programming traditional...

StageNet: A Reconfigurable CMP Fabric for Resilient Systems (2008)

Shantanu Gupta, Shuguang Feng, Jason Blome, Scott Mahlke

Though CMOS feature size scaling has been the source of dramatic performance gains, this scaling has led to mounting reliability concerns due to increasing power densities and on-chip temperatures....

Architecting a Reliable CMP Switch Architecture (2008)

Kypros Constantinides, Stephen Plaza, Jason Blome, Valeria Bertacco, Scott Mahlke, Todd Austin, Bin Zhang, Michael Orshansky

As silicon technologies move into the nanometer regime, transistor reliability is expected to wane as devices become subject to extreme process variation, particle-induced transient errors, and...

Probabilistic Predicate-Aware Modulo Scheduling (2008)

Mikhail Smelyanskiy, Scott Mahlke, Edward S. Davidson

Predicated execution enables the removal of branches by converting segments of branching code into sequences of conditional operations. An important side effect of this transformation is that the...

The Next Generation Challenge for Software Defined Radio (2008)

Mark Woh, Sangwon Seo, Hyunseok Lee, Yuan Lin, Scott Mahlke, Chaitali Chakrabarti, ...

Wireless communication for mobile terminals has been a high performance computing challenge. It requires almost super computer performance while consuming very little power. This...

BY SEPARATING CONTROL AND DATA PROCESSING AND BY EMPLOYING ULTRAWIDE SIMD (2008)

Yuan Lin, Hyunseok Lee, Mark Woh, Yoav Harel, Scott Mahlke, Trevor Mudge, ...

... Communication has become one of the central uses of computing technology, and applications that facilitate interpersonal communication, such as desktop publishing, graphic design, e-mail, and...

Compiler-directed Synthesis of Multifunction Loop Accelerators (2008)

Kevin Fan, Manjunath Kudlur, Hyunchul Park, Scott Mahlke

Complex algorithms and increased functionality are expanding the computation demands of embedded systems. Hardware accelerators are commonly used to meet these demands by executing critical...

A System Solution for High-Performance, Low Power SDR (2008)

Yuan Lin, Hyunseok Lee, Yoav Harel, Mark Woh, Scott Mahlke, Trevor Mudge, ...

One central challenge in the realization of Software Defined Radio (SDR) is to provide a programmable solution that meets the challenging high-performance, low-power requirements, while providing an...

Uncovering hidden loop level parallelism in sequential applications (2008)

Hongtao Zhong, Mojtaba Mehrara, Steve Lieberman, Scott Mahlke

As multicore systems become the dominant mainstream computing technology, one of the most difficult challenges the industry faces is the software. Applications with large amounts of explicit...

Self-calibrating online wearout detection (2007)

Jason Blome, Shuguang Feng, Shantanu Gupta, Scott Mahlke

Technology scaling, characterized by decreasing feature size, thinning gate oxide, and non-ideal voltage scaling, will become a major hindrance to microprocessor reliability in future technology...

Data access partitioning for fine-grain parallelism on multicore architectures (2007)

Michael Chu, Rajiv Ravindran, Scott Mahlke

The recent design shift towards multicore processors has spawned a significant amount of research in the area of program parallelization. The future abundance of cores on a single chip requires...

Liquid SIMD: Abstracting SIMD hardware using lightweight dynamic mapping (2007)

Nathan Clark, Amir Hormati, Sami Yehia, Scott Mahlke, Krisztián Flautner

Microprocessor designers commonly utilize SIMD accelerators and their associated instruction set extensions to provide substantial performance gains at a relatively low cost for media applications....

BulletProof: A Defect-Tolerant CMP Switch Architecture (2006)

Kypros Constantinides, Stephen Plaza, Jason Blome, Bin Zhang, Valeria Bertacco, Scott Mahlke, ...

As silicon technologies move into the nanometer regime, transistor reliability is expected to wane as devices become subject to extreme process variation, particle-induced transient errors, and...

SODA: A Low-power Architecture For Software Radio (2006)

Yuan Lin, Hyunseok Lee, Mark Woh, Yoav Harel, Scott Mahlke, Trevor Mudge, ...

The physical layer of most wireless protocols is traditionally implemented in custom hardware to satisfy the heavy computational requirements while keeping power consumption to a minimum. These...

Design and Implementation of Turbo Decoders for Software Defined Radio (2006)

Yuan Lin, Scott Mahlke, Trevor Mudge, Chaitali Chakrabarti

Software Defined Radio (SDR) is an emerging paradigm for wireless terminals, in which the physical layer of communication protocols is implemented in software rather than by ASICs. Many of the current...

Modulo Graph Embedding: Mapping Applications onto Coarse-Grained Reconfigurable Architectures (2006)

Hyunchul Park, Kevin Fan, Manjunath Kudlur, Scott Mahlke

Coarse-grained reconfigurable architectures (CGRAs) present an appealing hardware platform by providing the potential for high computation throughput, scalability, low cost and energy efficiency....

Scalable Subgraph Mapping for Acyclic Computation Accelerators (2006)

Nathan Clark, Amir Hormati, Scott Mahlke, Sami Yehia

Computer architects are constantly faced with the need to improve performance and increase the efficiency of computation in their designs. To this end, it is increasingly common to see acyclic...

Streamroller: Automatic Synthesis of Prescribed Throughput Accelerator Pipelines (2006)

Manjunath Kudlur, Kevin Fan, Scott Mahlke

In this paper, we present a methodology for designing a pipeline of accelerators for an application. The application is modeled using sequential C language with simple stylizations. The synthesis of...

Increasing Hardware Efficiency with Multifunction Loop Accelerators (2006)

Kevin Fan, Manjunath Kudlur, Hyunchul Park, Scott Mahlke

To meet the conflicting goals of high-performance low-cost embedded systems, critical application loop nests are commonly executed on specialized hardware accelerators. These loop accelerators are...

Online timing analysis for wearout detection (2006)

Jason A. Blome, Shuguang Feng, Shantanu Gupta, Scott Mahlke

CMOS feature size scaling has long been the source of dramatic performance gains. However, because voltage levels have not scaled in step, feature size scaling has come at the cost of increased...

SODA: A low-power architecture for software radio (2006)

Yuan Lin, Hyunseok Lee, Mark Woh, Yoav Harel, Scott Mahlke, Trevor Mudge, ...

The physical layer of most wireless protocols is traditionally implemented in custom hardware to satisfy the heavy computational requirements while keeping power consumption to a minimum. These...

BulletProof: A defect-tolerant CMP switch architecture (2006)

Kypros Constantinides, Stephen Plaza, Jason Blome, Bin Zhang, Valeria Bertacco, Scott Mahlke, ...

As silicon technologies move into the nanometer regime, transistor reliability is expected to wane as devices become subject to extreme process variation, particle-induced transient errors, and...

Scalable subgraph mapping for acyclic computation accelerators (2006)

Nathan Clark, Amir Hormati, Scott Mahlke

Computer architects are constantly faced with the need to improve performance and increase the efficiency of computation in their designs. To this end, it is increasingly common to see acyclic...

A microarchitectural analysis of soft error propagation in a production-level embedded microprocessor (2005)

Jason Blome, Scott Mahlke, Daryl Bradley, Krisztián Flautner

Current trends in device scaling continue to cause an increasing risk of transient faults in microprocessors due to high energy strikes from radiated particles. In this work, we present a thorough...

Assessing SEU vulnerability via circuit-level timing analysis (2005)

Kypros Constantinides, Stephen Plaza, Jason Blome, Bin Zhang, Valeria Bertacco, Scott Mahlke, ...

Recently, there has been a growing concern that, in relation to process technology scaling, the soft-error rate will become a major challenge in designing reliable systems. In this work, we introduce...

A distributed control path architecture for VLIW processors (2005)

Hongtao Zhong, Scott Mahlke, Michael Schlansker

VLIW architectures are popular in embedded systems because they offer high-performance processing at low cost and energy. The major problem with traditional VLIW designs is that they do not scale...

An Architecture Framework for Transparent Instruction Set Customization in Embedded Processors (2005)

Nathan Clark, Jason Blome, Michael Chu, Scott Mahlke, Stuart Biles, Krisztian Flautner

Instruction set customization is an effective way to improve processor performance. Critical portions of application dataflow graphs are collapsed for accelerated execution on specialized hardware....

Exploring the Design Space of LUT-based Transparent Accelerators (2005)

Sami Yehia, Nathan Clark, Scott Mahlke, Krisztian Flautner

Instruction set customization accelerates the performance of applications by compressing the length of critical dependence paths and reducing the demands on processor resources. With instruction set...

Assessing SEU Vulnerability via Circuit-Level Timing Analysis (2005)

Kypros Constantinides, Stephen Plaza, Jason Blome, Bin Zhang, Valeria Bertacco, Scott Mahlke, ...

Recently, there has been a growing concern that, in relation to process technology scaling, the soft-error rate will become a major challenge in designing reliable systems. In this work, we introduce...

A Distributed Control Path Architecture for VLIW Processors (2005)

Hongtao Zhong, Kevin Fan, Scott Mahlke, Michael Schlansker

VLIW architectures are popular in embedded systems because they offer high-performance processing at low cost and energy. The major problem with traditional VLIW designs is that they do not scale...

Cost Sensitive Modulo Scheduling in a Loop Accelerator Synthesis System (2005)

Kevin Fan, Manjunath Kudlur, Hyunchul Park, Scott Mahlke

Scheduling algorithms used in compilers traditionally focus on goals such as reducing schedule length and register pressure or producing compact code. In the context of a hardware synthesis system...

Software Defined Radio - A High Performance Embedded Challenge (2005)

Hyunseok Lee, Yuan Lin, Yoav Harel, Mark Woh, Scott Mahlke, Trevor Mudge, ...

Wireless communication is one of the most computationally demanding workloads. It is performed by mobile terminals ("cell phones") and must be accomplished by a small battery powered system.

A Microarchitectural Analysis of Soft Error Propagation in a Production-Level Embedded Microprocessor (2005)

Jason Blome, Scott Mahlke, Daryl Bradley, Krisztian Flautner

Current trends in device scaling continue to cause an increasing risk of transient faults in microprocessors due to high energy strikes from radiated particles. In this work, we present a thorough...

An architecture framework for transparent instruction set customization in embedded processors (2005)

Nathan Clark, Jason Blome, Michael Chu, Scott Mahlke, Stuart Biles, Krisztián Flautner

Instruction set customization is an effective way to improve processor performance. Critical portions of application dataflow graphs are collapsed for accelerated execution on specialized hardware....

FLASH: Foresighted latency-aware scheduling heuristic for processors with customized datapaths (2004)

Manjunath Kudlur, Kevin Fan, Michael Chu, Rajiv Ravindran, Nathan Clark, Scott Mahlke

Application-specific instruction set processors (ASIPs) have the potential to meet the challenging cost, performance, and power goals of future embedded processors by customizing the hardware to suit...

Application-Specific Processing on a General-Purpose Core via Transparent Instruction Set Customization (2004)

Nathan Clark, Manjunath Kudlur, Hyunchul Park, Scott Mahlke, Krisztian Flautner

Application-specific instruction set extensions are an effective way of improving the performance of processors. Critical computation subgraphs can be accelerated by collapsing them into new...

A Programmable Vector Coprocessor Architecture for Wireless Applications (2004)

Yuan Lin, Nadev Baron, Hyunseok Lee, Scott Mahlke, Trevor Mudge

The physical layers of most wireless protocols are traditionally implemented in ASICs due to the heavy computation requirements. These solutions are costly to design and hardwired solutions that offer...

Automatic Synthesis of Customized Local Memories for Multicluster Application Accelerators (2004)

Manjunath Kudlur, Kevin Fan, Michael Chu, Scott Mahlke

Distributed local memories, or scratchpads, have been shown to effectively reduce cost and power consumption of application-specific accelerators while maintaining performance. The design of the local...

Memory System Design Space Exploration for Low-Power, Real-Time Speech Recognition (2004)

Rajeev Krishna, Scott Mahlke, Todd Austin

The recent proliferation of computing technology has brought added interest to natural I/O interface technologies such as speech recognition. Unfortunately, the computational and memory demands of...

A Programmable Vector Coprocessor Architecture for Wireless Applications (2004)

Yuan Lin, Nadev Baron, Hyunseok Lee, Scott Mahlke, Trevor Mudge

The physical layers of most wireless protocols are traditionally implemented in ASICs due to the heavy computation requirements. These solutions are costly to design and hardwired solutions that...

Automatic Design of Application Specific Instruction Set Extensions Through Dataflow Graph Exploration (2003)

Nathan Clark, Hongtao Zhong, Wilkin Tang, Scott Mahlke

General-purpose processors are often incapable of achieving the challenging cost, performance, and power demands of high-performance applications. To meet these demands, most systems employ a number...

Processor acceleration through automated instruction set customization (2003)

Nathan Clark, Hongtao Zhong, Scott Mahlke

Application-specific extensions to the computational capabilities of a processor provide an efficient mechanism to meet the growing performance and power demands of embedded applications. Hardware,...

Processor Acceleration Through Automated Instruction Set Customization (2003)

Nathan Clark, Hongtao Zhong, Scott Mahlke

Application-specific extensions to the computational capabilities of a processor provide an efficient mechanism to meet the growing performance and power demands of embedded applications. Hardware, in...

Architectural Optimizations for Low-Power, Real-Time Speech Recognition (2003)

Rajeev Krishna, Scott Mahlke, Todd Austin

The proliferation of computing technology to low power domains such as hand-held devices has led to increased interest in portable interface technologies, with particular interest in speech...

Systematic register bypass customization for application-specific processors (2003)

Kevin Fan, Nathan Clark, Michael Chu, K. V. Manjunath, Rajiv Ravindran, Mikhail Smelyanskiy, ...

Register bypass provides additional datapaths to eliminate data hazards in processor pipelines. The difficulty with register bypass is that the cost of the bypass network is substantial and grows...

Region-based hierarchical operation partitioning for multicluster processors (2003)

Michael Chu, Kevin Fan, Scott Mahlke

Clustered architectures are a solution to the bottleneck of centralized register files in superscalar and VLIW processors. The main challenge associated with clustered architectures is compiler...

Automatically Generating Custom Instruction Set Extensions (2002)

Nathan Clark, Wilkin Tang, Scott Mahlke

General-purpose processors that are utilized as cores are often incapable of achieving the challenging cost, performance, and power demands of high-performance audio, video, and networking...

Insights into the memory demands of speech recognition algorithms (2002)

Rajeev Krishna, Scott Mahlke, Todd Austin

The vision of pervasive computing is one of invisible computers interacting with humans in all aspects of their lives. These invisible computers can be embedded in anything from specialized portable...

Education (2001)

Advisor Prof, Scott Mahlke, Advisor Prof, Rajat Moona

To work in a challenging environment involved in the design and development of architecture and compilation techniques for high performance processors.

Bitwidth sensitive code generation in a custom embedded accelerator design system (2001)

Scott Mahlke, Rajiv Ravindran, Michael Schlansker, Robert Schreiber, Timothy Sherwood

An ever larger variety of embedded ASICs is being designed and deployed to satisfy an explosively growing demand for new

Bitwidth Cognizant Architecture Synthesis of Custom Hardware Accelerators (2001)

Scott Mahlke, Rajiv Ravindran, Michael Schlansker, Robert Schreiber, Timothy Sherwood

Keywords: application-specific design, architecture synthesis, bitwidth, clustering, embedded system, hardware accelerator, operation scheduling, resource allocation. PICO is a system for automatically...

Bitwidth Cognizant Architecture Synthesis of Custom Hardware Accelerators (2001)

Scott Mahlke, Rajiv Ravindran, Michael Schlansker, Robert Schreiber, Timothy Sherwood

Keywords: application-specific design, architecture synthesis, bitwidth, clustering, embedded system, hardware accelerator, operation scheduling, resource allocation. PICO is a system for automatically...

Bitwidth Cognizant Architecture Synthesis of Custom Hardware Accelerators (2001)

Scott Mahlke, Rajiv Ravindran, Michael Schlansker, Robert Schreiber, Timothy Sherwood

PICO is a system for automatically synthesizing embedded hardware accelerators from loop nests specified in the C programming language. A key issue confronted when designing such accelerators is the...

Bitwidth Cognizant Architecture Synthesis of Custom Hardware Accelerators (2001)

Scott Mahlke, Rajiv Ravindran, Michael Schlansker, Robert Schreiber, Timothy Sherwood

Program-in chip-out (PICO) is a system for automatically synthesizing embedded hardware accelerators from loop nests specified in the C programming language. A key issue confronted when...

High-Level Synthesis of Nonprogrammable Hardware Accelerators (2000)

Robert Schreiber, Shail Aditya, B. Ramakrishna Rau, Vinod Kathail, Scott Mahlke, ...

The PICO-N system automatically synthesizes embedded nonprogrammable accelerators to be used as co-processors for functions expressed as loop nests in C. The output is synthesizable VHDL that defines...

High-Level Synthesis of Nonprogrammable Hardware Accelerators (2000)

Robert Schreiber, B. Ramakrishna Rau, Darren Cronquist, Mukund Sivaraman, Shail Aditya, ...

Keywords: high-level synthesis, ASIC, systolic array. The PICO-NPA system automatically synthesizes nonprogrammable accelerators (NPAs) to be used as co-processors for functions expressed as loop nests in C....

Control CPR: A Branch Height Reduction Optimization for EPIC Architectures (1999)

Michael Schlansker, Scott Mahlke, Richard Johnson

The challenge of exploiting high degrees of instruction-level parallelism is often hampered by frequent branching. Both exposed branch latency and low branch throughput can restrict parallelism....

Control CPR: A branch height reduction optimization for EPIC architectures (1999)

Michael Schlansker, Scott Mahlke, Richard Johnson

Keywords: ILP, critical path reduction, compilers. © Copyright Hewlett-Packard Company 1999. The challenge of exploiting high degrees of instruction-level parallelism is often hampered by frequent branching....

Automatic and efficient evaluation of memory hierarchies for embedded systems (1999)

Santosh G. Abraham, Scott A. Mahlke

Keywords: hierarchical evaluation, automatic design, embedded system, cache simulation, cache modeling. © Copyright Hewlett-Packard Company 1999. Automation is the key to the design of future embedded systems...

Achieving High Levels of Instruction-Level Parallelism With Reduced Hardware Complexity (1997)

Michael S. Schlansker, B. Ramakrishna Rau, Scott Mahlke, Vinod Kathail, Richard Johnson, Sadun Anik, ...

Keywords: instruction-level parallelism, VLIW processors, superscalar processors, overlapped execution, out-of-order execution, speculative execution, branch prediction, instruction scheduling, compile-time...

Register connection: A new approach to adding registers into instruction set architectures (1993)

Tokuzo Kiyohara, Scott Mahlke, William Chen, Roger Bringmann, Richard Hank, Sadun Anik, ...

Code optimization and scheduling for superscalar and superpipelined processors often increase the register requirement of programs. For existing instruction sets with a small to moderate number of...

Register Connection: A New Approach to Adding Registers into Instruction Set Architectures (1993)

Tokuzo Kiyohara, Scott Mahlke, William Chen, Roger Bringmann, Richard Hank, Sadun Anik, ...

Code optimization and scheduling for superscalar and superpipelined processors often increase the register requirement of programs. For existing instruction sets with a small to moderate number of...

Sentinel Scheduling for VLIW and Superscalar Processors (1992)

Scott Mahlke, William Y. Chen, B. Ramakrishna Rau, Michael S. Schlansker

Speculative execution is an important source of parallelism for VLIW and superscalar processors. A serious challenge with compiler-controlled speculative execution is to accurately detect and report...

Scalar Program Performance on Multiple-Instruction-Issue Processors with a Limited Number of Registers (1992)

Scott Mahlke, William Y. Chen, Pohua P. Chang

In this paper the performance of multiple-instruction-issue processors with variable register file sizes is examined for a set of scalar programs. We make several important observations. First,...

The Effect Of Compiler Optimizations On Available Parallelism In Scalar Programs (1991)

Scott Mahlke, Nancy J. Warter, William Y. Chen, Pohua P. Chang

In this paper we analyze the effect of compiler optimizations on fine grain parallelism in scalar programs. We characterize three levels of optimization: classical, superscalar, and multiprocessor....
