Wed 17 Jun 2026 16:10 - 16:30 at Flatirons 3 - GPU Programming Chair(s): Gilbert Louis Bernstein

We introduce Kuiper, a language for safe and verified efficient CPU/GPU programming embedded as an
extensible library within the F* dependently typed language. We rely on F*’s support for dependent types and its
associated Pulse concurrent separation logic to develop a program logic in which to prove CPU/GPU programs
safe, data-race free, and functionally correct. Our model of the GPU includes several intricacies, including the
memory hierarchy, kernel launches, and synchronization within a single comprehensive framework. To do so,
we extend the Pulse program logic with a novel notion of located resources and a new connective to structure
reasoning about massively parallel programs, and present new proof rules to lift the per-thread view of GPU
kernels to an end-to-end correctness specification.

We have used Kuiper to program and prove correct a variety of GPU kernels, including full functional
correctness proofs of an optimized matrix multiplication using two levels of block tiling and tensor cores. In
doing so, we have developed a range of libraries to enable programs and proofs at a high level of abstraction
but without imposing any runtime overhead. These allow Kuiper programs to be polymorphic (over types,
operations, memory layout, and more) and compile to efficient, specialized CUDA code, while enabling a
novel form of verified auto-tuning. Our experimental evaluation confirms that Kuiper programs match the
performance of their handwritten CUDA counterparts and are competitive with closed source, state-of-the-art
kernels in cuBLAS.

Wed 17 Jun

Displayed time zone: Mountain Time (US & Canada) change

16:10 - 17:50
GPU ProgrammingPLDI Research Papers at Flatirons 3
Chair(s): Gilbert Louis Bernstein University of Washington
16:10
20m
Talk
Kuiper: Correct and Efficient GPU Programming with Dependent Types and Separation Logic
PLDI Research Papers
Guido Martínez Microsoft Research, Bastian Köpcke TU Berlin, Jonáš Fiala ETH Zurich, Gabriel Ebner Microsoft Research, Tahina Ramananandro Microsoft Research, Michel Steuwer TU Berlin, Tyler Sorensen Microsoft Research, Nikhil Swamy Microsoft Research
DOI Pre-print
16:30
20m
Talk
Modular GPU Programming with Typed PerspectivesDistinguished Paper
PLDI Research Papers
Manya Bansal Massachusetts Institute of Technology, Daniel Sainati University of Pennsylvania, Joseph W. Cutler University of Pennsylvania, Saman Amarasinghe Massachusetts Institute of Technology, Jonathan Ragan-Kelley Massachusetts Institute of Technology
DOI
16:50
20m
Talk
[TOPLAS] StreamAlloc: A Framework for Analyzing and Transforming CUDA Code to Enable Asynchronous Execution
PLDI Research Papers
Soumik Kumar Basu IIT Hyderabad, Jyothi Vedurada IIT Hyderabad
17:10
20m
Talk
SIMT-Step Execution: A Flexible Operational Semantics For GPU Subgroup Behavior
PLDI Research Papers
Zheyuan Chen University of California at Santa Cruz, Naomi Rehman University of California at Santa Barbara, Guido Martínez Microsoft Research, Tyler Sorensen Microsoft Research; University of California at Santa Cruz
DOI Pre-print
17:30
20m
Talk
Uniformity Analysis in the WebGPU Shading Language
PLDI Research Papers
James Lee-Jones Imperial College London, John Wickerson Imperial College London, Alastair F. Donaldson Imperial College London
DOI