Kuiper: Correct and Efficient GPU Programming with Dependent Types and Separation Logic (PLDI 2026 - PLDI Research Papers)

Mon 15 - Fri 19 June 2026 Boulder, Colorado, United States

Who

Guido Martínez, Bastian Köpcke, Jonáš Fiala, Gabriel Ebner, Tahina Ramananandro, Michel Steuwer, Tyler Sorensen, Nikhil Swamy

Track

PLDI 2026 PLDI Research Papers

Time Zone

The program is currently displayed in (GMT-06:00) Mountain Time (US & Canada).

Use conference time zone: (GMT-06:00) Mountain Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

By setting a time band, the program will dim events that are outside this time window. This is useful for (virtual) conferences with a continuous program (with repeated sessions).
The time band will also limit the events that are included in the personal iCalendar subscription service.

Display full programSpecify a time band

Save

When

Wed 17 Jun 2026 16:10 - 16:30 at Flatirons 3 - GPU Programming Chair(s): Gilbert Louis Bernstein

Abstract

We introduce Kuiper, a language for safe and verified efficient CPU/GPU programming embedded as an
extensible library within the F* dependently typed language. We rely on F*’s support for dependent types and its
associated Pulse concurrent separation logic to develop a program logic in which to prove CPU/GPU programs
safe, data-race free, and functionally correct. Our model of the GPU includes several intricacies, including the
memory hierarchy, kernel launches, and synchronization within a single comprehensive framework. To do so,
we extend the Pulse program logic with a novel notion of located resources and a new connective to structure
reasoning about massively parallel programs, and present new proof rules to lift the per-thread view of GPU
kernels to an end-to-end correctness specification.

We have used Kuiper to program and prove correct a variety of GPU kernels, including full functional
correctness proofs of an optimized matrix multiplication using two levels of block tiling and tensor cores. In
doing so, we have developed a range of libraries to enable programs and proofs at a high level of abstraction
but without imposing any runtime overhead. These allow Kuiper programs to be polymorphic (over types,
operations, memory layout, and more) and compile to efficient, specialized CUDA code, while enabling a
novel form of verified auto-tuning. Our experimental evaluation confirms that Kuiper programs match the
performance of their handwritten CUDA counterparts and are competitive with closed source, state-of-the-art
kernels in cuBLAS.

Link to Preprint

https://mtzguido.github.io/pubs/kuiper-pldi26.pdf

DOI

https://doi.org/10.1145/3808280

Guido Martínez

Microsoft Research

United States

Bastian Köpcke

TU Berlin

Germany

Jonáš Fiala

ETH Zurich

Switzerland

Gabriel Ebner

Microsoft Research

United States

Tahina Ramananandro

Microsoft Research

United States

Michel Steuwer

TU Berlin

Germany

Tyler Sorensen

Microsoft Research

United States

Nikhil Swamy

Microsoft Research

United States

Time Zone

The program is currently displayed in (GMT-06:00) Mountain Time (US & Canada).

Use conference time zone: (GMT-06:00) Mountain Time (US & Canada)Select other time zone

The GMT offsets shown reflect the offsets at the moment of the conference.

Time Band

Display full programSpecify a time band

Save

Session Program

Wed 17 Jun
Displayed time zone: Mountain Time (US & Canada) change

16:10 - 17:50	GPU ProgrammingPLDI Research Papers at Flatirons 3 Chair(s): Gilbert Louis Bernstein University of Washington

16:10 20m Talk		Kuiper: Correct and Efficient GPU Programming with Dependent Types and Separation Logic PLDI Research Papers Guido Martínez Microsoft Research, Bastian Köpcke TU Berlin, Jonáš Fiala ETH Zurich, Gabriel Ebner Microsoft Research, Tahina Ramananandro Microsoft Research, Michel Steuwer TU Berlin, Tyler Sorensen Microsoft Research, Nikhil Swamy Microsoft Research DOI Pre-print
16:30 20m Talk		Modular GPU Programming with Typed PerspectivesDistinguished Paper PLDI Research Papers Manya Bansal Massachusetts Institute of Technology, Daniel Sainati University of Pennsylvania, Joseph W. Cutler University of Pennsylvania, Saman Amarasinghe Massachusetts Institute of Technology, Jonathan Ragan-Kelley Massachusetts Institute of Technology DOI
16:50 20m Talk		[TOPLAS] StreamAlloc: A Framework for Analyzing and Transforming CUDA Code to Enable Asynchronous Execution PLDI Research Papers Soumik Kumar Basu IIT Hyderabad, Jyothi Vedurada IIT Hyderabad
17:10 20m Talk		SIMT-Step Execution: A Flexible Operational Semantics For GPU Subgroup Behavior PLDI Research Papers Zheyuan Chen University of California at Santa Cruz, Naomi Rehman University of California at Santa Barbara, Guido Martínez Microsoft Research, Tyler Sorensen Microsoft Research; University of California at Santa Cruz DOI Pre-print
17:30 20m Talk		Uniformity Analysis in the WebGPU Shading Language PLDI Research Papers James Lee-Jones Imperial College London, John Wickerson Imperial College London, Alastair F. Donaldson Imperial College London DOI