Modulo Scheduling for Highly Customized Datapaths to Increase Hardware Reusability (2008)
Kevin Fan, Hyunchul Park, Manjunath Kudlur, Scott Mahlke
In the embedded domain, custom hardware in the form of ASICs is often used to implement critical parts of applications when performance and energy efficiency goals cannot be met with software...
Compiler-directed Synthesis of Multifunction Loop Accelerators (2008)
Kevin Fan, Manjunath Kudlur, Hyunchul Park, Scott Mahlke
Complex algorithms and increased functionality are expanding the computation demands of embedded systems. Hardware accelerators are commonly used to meet these demands by executing critical...
Modulo Graph Embedding: Mapping Applications onto Coarse-Grained Reconfigurable Architectures (2006)
Hyunchul Park, Kevin Fan, Manjunath Kudlur, Scott Mahlke
Coarse-grained reconfigurable architectures (CGRAs) present an appealing hardware platform by providing the potential for high computation throughput, scalability, low cost and energy efficiency....
Increasing Hardware Efficiency with Multifunction Loop Accelerators (2006)
Kevin Fan, Manjunath Kudlur, Hyunchul Park, Scott Mahlke
To meet the conflicting goals of high-performance low-cost embedded systems, critical application loop nests are commonly executed on specialized hardware accelerators. These loop accelerators are...
Cost Sensitive Modulo Scheduling in a Loop Accelerator Synthesis System (2005)
Kevin Fan, Manjunath Kudlur, Hyunchul Park, Scott Mahlke
Scheduling algorithms used in compilers traditionally focus on goals such as reducing schedule length and register pressure or producing compact code. In the context of a hardware synthesis system...
Nathan Clark, Manjunath Kudlur, Hyunchul Park, Scott Mahlke, Krisztian Flautner
Application-specific instruction set extensions are an effective way of improving the performance of processors. Critical computation subgraphs can be accelerated by collapsing them into new...
Polymorphic Pipeline Array: A Flexible Multicore Accelerator for Mobile Multimedia Applications
Mobile computing in the form of smart phones, netbooks, and PDAs has become an integral part of our everyday lives. Moving ahead to the next generation of mobile devices, we believe that multimedia...