In recent years multi-core
computer systems have left the realm of high-performance computing
and virtually all of today's desktop computers and embedded computing
systems are equipped with several processing cores. Still, no
single parallel programming model has found widespread support
and parallel programming remains an art for the majority of application
programmers. In addition, there exists a plethora of sequential
legacy applications for which automatic parallelization is the
only realistic hope to benefit from the potentially increased
processing power of modern multi-core systems.
In this talk we present a novel approach to extracting and exploiting
parallelism from sequential applications. We use profiling to
overcome the limitations of static data and control flow analysis,
enabling more aggressive parallelization. A key contribution of
this work is a whole-program representation that supports profiling,
parallelism extraction and exploitation. We demonstrate how this
enhances conventional parallelization by incorporating support
for array and coupled reduction operations as well as multi-level
loop partitioning and pipeline stage replication.
We have applied our technique to two different forms of
parallelism, namely data and pipeline parallelism. First, we demonstrate
the effectiveness of our parallelization strategy in extracting
data-level parallelism using the NAS and SPEC FP benchmarks. Our
approach not only yields significant improvements when compared
with state-of-the-art parallelizing compilers, but comes close
to and sometimes exceeds the performance of manually parallelized
codes. Second, we present an enhanced code generation methodology
which targets both pipeline and data parallelism. We have evaluated
it on a set of multimedia and stream processing benchmarks and demonstrate
speedups of up to 4.7 on an eight-core Intel Xeon machine.
Parallelizing Sequential Applications Using a Profile-driven Approach

Date: 14.06.2010
Time: 10:15-11:30
Location: Seminar Room I, FORTH, Heraklion, Crete
Host: Dimitrios S. Nikolopoulos
Georgios Tournavitis is currently a PhD student at the Institute for
Computing Systems Architecture (ICSA) of the University of Edinburgh.
His research interests lie in the general areas of compilation and
programming languages for parallel architectures. More specifically,
he is interested in compiler-based and runtime techniques that enable
compilers to extract high-level parallelization skeletons from
sequential applications. Most recently, he also started working on
compiler-directed optimizations for saving static power in the cache
hierarchy of Chip Multi-Processors. He holds an Engineering Diploma
and an MSc in Computer Engineering from the University of Patras,
Greece. As part of his MSc project, he designed and implemented a
multi-threaded Software Distributed Shared Memory (SDSM) system for
clusters of Multi-Processors.