AI-powered agents are increasingly being deployed in production settings: they plan, call tools, maintain state, and act over live services and data. Yet the formal foundations needed to make these systems safe, reliable, and trustworthy remain underdeveloped. Ensuring correctness in these settings demands a new discipline, which we term agentic engineering: a research agenda that brings together ideas from programming languages, formal verification, software engineering, and neuro-symbolic reasoning to specify, analyze, test, monitor, and repair agents at scale.

This workshop aims to bring together researchers and practitioners interested in the foundations and practice of safe agentic systems. Our goal is to foster a community around principled approaches for building agents with precise specifications, verifiable behaviors, and runtime safeguards that hold up in real-world deployments. More broadly, the workshop seeks to strengthen connections across the programming languages, formal methods, software engineering, and machine learning communities in order to advance the principles of safe and reliable agentic engineering.

The workshop will feature invited talks and peer-reviewed contributions spanning theory, systems, tools, and practical experience.

Plenary

This program is tentative and subject to change.

You're viewing the program in a time zone which is different from your device's time zone change time zone

Mon 15 Jun

Displayed time zone: Mountain Time (US & Canada) change

09:00 - 10:10
KeynotePAgE at Meadows CD
Chair(s): Shraddha Barke Microsoft Research, Redmond
09:00
10m
Day closing
Opening remarks
PAgE
Shraddha Barke Microsoft Research, Redmond
09:10
60m
Keynote
Formal Methods for Frontier AI Systems
PAgE
Gagandeep Singh University of Illinois Urbana-Champaign
10:10 - 10:40
Coffee BreakCatering at Flatirons Foyer
10:10
30m
Coffee break
Break
Catering

10:40 - 12:20
PAgE Session 1PAgE at Meadows CD
Chair(s): Shraddha Barke Microsoft Research, Redmond
10:40
20m
Talk
Agentic Code Reasoning
PAgE
Shubham Ugare Meta, Satish Chandra Meta Platforms, Inc.
Pre-print
11:00
20m
Talk
Towards Verified Code Reasoning by LLMs
PAgE
Meghana Aparna Sistla Google DeepMind, Gogul Balakrishnan Google, Patrick Rondon Google, José Pablo Cambronero Google, USA, Michele Tufano Google, Satish Chandra Meta Platforms, Inc.
11:20
20m
Talk
Testing LLM-Generated Distributed Protocol Code
PAgE
Ankush Das Boston University, Brendan Coyne Boston University
11:40
20m
Talk
The Next Frontier for AI-Generated Kernels: Correctness
PAgE
Guido Martínez Microsoft Research, Tyler Sorensen University of California at Santa Cruz
DOI Pre-print
12:00
20m
Talk
Testing, Credible Compilation, and Verification in the Axon Verified Compiler in Lean and Claude Code
PAgE
Martin C. Rinard Massachusetts Institute of Technology
12:20 - 13:40
12:20
80m
Lunch
Lunch
Catering

13:40 - 15:20
PAgE Session 2PAgE at Meadows CD
13:40
20m
Talk
Combining Agentic AI and Lightweight Formal Methods To Find Bugs in a Production Hypervisor
PAgE
Hiroyuki Katsura University of Cambridge, Kayvan Memarian University of Cambridge, Peter Sewell University of Cambridge
14:00
20m
Talk
From Workarounds to Root Causes: Experience Using Agentic Workflows to Debug Complex Browser GPU Compiler Stacks
PAgE
Abhijit Ramesh UC Santa Cruz, Reese Levine University of California at Santa Cruz, Tyler Sorensen University of California at Santa Cruz
14:20
20m
Talk
Event-based Design Abstractions for Agent Harnesses
PAgE
McCoy Becker , Matin Ghavami Massachusetts Institute of Technology, Fabian Zaiser Massachusetts Institute of Technology, Timothy O'Donnell , Martin C. Rinard Massachusetts Institute of Technology, Joshua B. Tenenbaum Massachusetts Institute of Technology, Vikash Mansinghka Massachusetts Institute of Technology
14:40
20m
Talk
Lumos: Let there be Language Model System Certification
PAgE
Isha Chaudhary , Vedaant Jain , Prineet Parhar , Kavya Sachdeva , Avaljot Singh University of Illinois Urbana-Champaign, Sayan Ranu , Gagandeep Singh University of Illinois Urbana-Champaign
Pre-print
15:00
20m
Talk
Lazy Validation and Self-Healing for Agentic Programs
PAgE
Theodoros Tsampouris , Eleftherios Ioannidis Microsoft Research, Andreas Symeonidis Aristotle University of Thessaloniki
15:20 - 15:50
Coffee BreakCatering at Flatirons Foyer
15:20
30m
Coffee break
Break
Catering

15:50 - 17:00
PAgE KeynotePAgE at Meadows CD
15:50
60m
Keynote
Can Coding Agents Write Verifiably Correct Software?
PAgE
Shan Lu Microsoft; University of Chicago
16:50
10m
Day closing
Closing Remarks
PAgE
Shraddha Barke Microsoft Research, Redmond

Call for Papers

PAgE (Principles of Agentic Engineering) aims to bring together the formal methods and AI communities to advance the principles of safe agentic engineering. It will feature peer-reviewed papers and invited talks from experts in the field. We welcome submissions describing research results, artifacts, datasets, case studies, and experience reports. Topics of interest include, but are not limited to:

  • Specifications and type-safe interfaces for tool use

  • Memory and state management for agents

  • Reliability & groundedness in planning and tool use

  • Static & dynamic analysis and verification of agentic plans

  • Safe integration with software engineering workflows

  • Testing and debugging of agentic workflows

  • Runtimes, compilers and virtual machines for agentic programs

  • Evaluation methods and benchmarks for trustworthy agents, including safety, reliability, and cost/latency tradeoffs

  • Safety, security, and privacy of multi-agent systems

  • Neuro-symbolic and formal methods approaches for agent reasoning and control

We also welcome submissions that explore the use of AI agents in the research and development process itself, including agent-assisted ideation, implementation, experimentation, artifact generation, and writing. Such submissions should clearly describe the workflow used, the extent of human oversight, and any lessons learned regarding reliability, reproducibility, and safety. Please ensure that all submissions are anonymized for review.

Submission Tracks

1. Archived Papers:

Archived submissions should present completed research on topics related to the focus of the workshop. Full papers may be up to 10 pages, excluding bibliography, and must use the standard SIGPLAN conference two-column format. Full papers may be up to 10 pages, excluding bibliography, and should present completed research on topics related to the focus of the workshop. They will be reviewed according to the usual criteria of relevance, soundness, novelty, and significance. Accepted archived papers will be published in the ACM Digital Library. Formatting instructions and templates can be found on the SIGPLAN Author Information page.

2. Non-archival Submissions:

We also invite non-archival submissions describing work in progress, position papers, experience reports, negative results, demos, artifact or dataset descriptions, and talk proposals. These submissions may be up to 6 pages, excluding bibliography, and should also use the standard SIGPLAN conference two-column format. Accepted non-archival submissions will be presented at the workshop but will not appear in the ACM Digital Library.

Please note the extended deadline.

Contributions should be submitted via HotCRP: https://page2026.hotcrp.com/u/1/

For any questions, please contact the workshop chairs (add sbarke@microsoft.com).