my email address: my last name at cis dot upenn dot edu

(215) 746-4223

Levine Hall 572
3330 Walnut Street

Philadelphia, PA 19104-3409

I work on making multiprocessors easier to program by leveraging changes in both computer architectures and parallel programming models.

I am looking for new PhD students interested in systems and computer architecture. If you are interested in these topics, please apply to our PhD program and drop me an email as well.

Teaching

In Fall 2023 I'm teaching CIS 6010: GPGPU Programming and Architecture.

Students

I'm lucky to be working with the following great students:

Kelly Shiptoski (PhD)
Yuxuan Zhang (PhD)

Former students

Omar Navarro Leija (PhD 2022). First employment: Bolt Labs
Gautam Mohan (Master's 2020). First employment: Amazon
Yuanfeng Peng (PhD 2019). First employment: Google
Nicholas Renner (Master's 2019, now a PhD student at NYU)
Nimit Singhania (PhD 2018, co-advised with Rajeev Alur). First employment: Google
Christian DeLozier (PhD 2018). First employment: Assistant Professor at United States Naval Academy
Kavya Lakshminarayanan (Master's 2018). First employment: Microsoft
Richard Zang (Master's 2018). First employment: Microsoft
Sana Kamboj (Master's 2017). First employment: Qualcomm
Ariel Eizenberg (Master's 2016). First employment: Government of Israel
Brooke Fugate (Master's 2015, co-advised with André DeHon)
Liang Luo (Master's 2015, then a PhD student at the University of Washington)
Akshitha Sriraman (Master's 2015, then a PhD student at the University of Michigan)

Recent Publications (full list)

Many of the paper links below use the ACM's Author-izer service, which tracks download statistics and provides a small kickback to various ACM Special Interest Groups for each download.

OCOLOS: Online COde Layout OptimizationS
Yuxuan Zhang, Tanvir Ahmed Khan, Gilles Pokam, Baris Kasikci, Heiner Litz and Joseph Devietti

IEEE/ACM International Symposium on Microarchitecture (MICRO '22), October 2022

Selected for IEEE Micro Top Picks 2023

The processor front-end has become an increasingly important bottleneck in recent years due to growing application code footprints, particularly in data centers. First-level instruction caches and branch prediction engines have not been able to keep up with this code growth, leading to more front-end stalls and lower Instructions Per Cycle (IPC). Profile-guided optimizations performed by compilers represent a promising approach, as they rearrange code to maximize instruction cache locality and branch prediction efficiency along a relatively small number of hot code paths. However, these optimizations require continuous profiling and rebuilding of applications to ensure that the code layout matches the collected profiles. If an application’s code is frequently updated, it becomes challenging to map profiling data from a previous version onto the latest version, leading to ignored profiling data and missed optimization opportunities.

In this paper, we propose OCOLOS, the first online code layout optimization system for unmodified applications written in unmanaged languages. OCOLOS allows profile-guided optimization to be performed on a running process, instead of being performed offline and requiring the application to be re-launched. By running online, profile data is always relevant to the current execution and always maps perfectly to the running code. OCOLOS demonstrates how to achieve robust online code replacement in complex multithreaded applications like MySQL and MongoDB, without requiring any application changes. Our experiments show that OCOLOS can accelerate MySQL by up to 1.41x, the Verilator hardware simulator by up to 2.20x, and a build of the Clang compiler by up to 1.14x.