CIS 6010: Special Topics in Computer Architecture: GPGPU Architecture and Programming, Fall 2023

Course Information

instructor: Joe Devietti
when: Monday/Wednesday 12-1:30pm
where: Towne 305
contact: email, Canvas

office hours: by appointment

Course Description

Graphics Processing Units (GPUs) have become extremely popular and are used to accelerate an increasingly diverse set of non-graphics workloads. This seminar will examine modern GPU architectures, the programming models used to write general-purpose code for GPUs, and the complexities of programming such highly parallel architectures. There will be a special emphasis on concurrency correctness issues as they relate to GPUs, including GPU memory consistency models and GPU concurrency bugs. Graduate-level coursework in computer architecture (e.g., CIS 5710) will be very helpful.

Course Materials

No textbooks are required; links to all readings will be provided on this website.

Grading

Project: 50%
Participation: 30%
Assignments: 20%

There will be no exams.

Submit homework via Canvas.

The class project can be done in groups of up to 3. The project is open-ended: it should be related to GPUs, but the specifics are up to you. Choosing a project that incorporates your interests (research or otherwise) is a great idea! Here are some project ideas:

Rewrite your matrix multiply code from the homework to operate on 16-bit (“half”) floating point elements instead of 32-bit floats. Update the cuBLAS code correspondingly, and use Tensor Cores to accelerate your implementation.
Build a series of scalable locking implementations in CUDA, from simple spin-locks to something like MCS locks. The lack of cache coherence on GPUs should add an interesting wrinkle. Useful resources are Michael Scott’s webpage and the SSync library from EPFL.
Choose a GPU-related paper (e.g., one that we’ve read in class, though others are fine as well) that has source code available, and try to reproduce some of the results from it.
Pick a non-trivial open GitHub issue for an application written in CUDA, and try to resolve it. As a starting point, here are some popular GitHub repositories with CUDA code.
Port an application or algorithm of interest to you to CUDA, and benchmark its performance.

Course Schedule

This schedule is subject to change.

Date	Topic	Presenter
Wed 30 Aug	Intro	Joe
Mon 4 Sep	no class - Labor Day
Wed 6 Sep	General-Purpose Graphics Processor Architectures (accessible via Penn VPN), Chapters 1 & 2	Joe
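Mon 11 Sep	General-Purpose Graphics Processor Architectures, Sections 3.1 - 3.3	Joe
Wed 13 Sep	General-Purpose Graphics Processor Architectures, Sections 3.4 - 3.6	Joe
Mon 18 Sep	General-Purpose Graphics Processor Architectures, Chapter 4	Joe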
Wed 20 Sep	Real-world GPU design	Joe
Mon 25 Sep	no class - Yom Kippur
Wed 27 Sep	CUDA Programming Guide	Joe
Mon 2 Oct	GEMM and HW1	Joe
Wed 4 Oct	CUDA topics, Roofline Model	Joe
Mon 9 Oct	A Primer on Memory Consistency and Cache Coherence, Chapter 3 (SC)	Joe
Wed 11 Oct	A Primer on Memory Consistency and Cache Coherence, Chapter 4 (TSO)	Joe
Mon 16 Oct	A Primer on Memory Consistency and Cache Coherence, Chapter 5 (XC)	Joe
Wed 18 Oct	Dynamic Warp Formation	Katelyn & Nathan
Mon 23 Oct	The Dual-Path Execution Model for Efficient GPU Control Flow	Chengjun & Shuhan
Wed 25 Oct	Dynamic Warp Subdivision for Integrated Branch and Memory Divergence Tolerance	Dvisha & Prateek & Siddhant
Mon 30 Oct	Heterogeneous-Race-Free Memory Models	Paul & Zihao
Wed 1 Nov	GPU concurrency: Weak Behaviours and Programming Assumptions (slides)	Dvisha & Katelyn & Ryan
Mon 6 Nov	A Formal Analysis of the NVIDIA PTX Memory Consistency Model	Chengjun & Harish & Shuhan
Wed 8 Nov	Cache Coherence for GPU Architectures	Zhiyao & Zhilei
Mon 13 Nov	Cache-Conscious Wavefront Scheduling	Harish & Nathan & Ryan
Wed 15 Nov	SIMR: Single Instruction Multiple Request Processing for Energy-Efficient Data Center Microservices	Linus & Xitong & Yinda
Mon 20 Nov	Understanding The Security of Discrete GPUs	Linus & Zhiyao
Wed 22 Nov	no class - Thanksgiving
Mon 27 Nov	GPUfs: integrating a file system with GPUs	Prateek & Siddhant & Zihao
Wed 29 Nov	GPUnet: Networking Abstractions for GPU Programs	Xitong & Yinda
Mon 4 Dec	gpucc: An Open-Source GPGPU Compiler	Paul & Zhilei
Wed 6 Dec	Project Presentations
Mon 11 Dec	Project Presentations