LLM-Integrated Declarative Program Analysis (SOAP 2026)

Mon 15 - Fri 19 June 2026 Boulder, Colorado, United States

Who

Sara Baradaran, Amirmohammad Nazari, Mukund Raghothaman

Track

SOAP 2026

Abstract

Program analysis tools such as CodeQL let programmers express questions about their codebases as declarative queries, which are then evaluated over structured representations of the code. These tools are versatile, with broad applications in bug finding and vulnerability discovery. Still, they cannot answer questions that rely on semantic judgments that program analysis alone can neither express nor decide, such as identifying string literals that contain private information or detecting violations of variable naming conventions.

In this paper, we present SemQL, a system that extends declarative program analysis frameworks with the ability to invoke an LLM as an external oracle. SemQL allows developers to write queries that combine structural reasoning with semantic, extra-analytic judgments. We show the real-world value of such a system by collecting a set of analytic questions that require semantic reasoning beyond what can be deduced from the structure of the code alone. We present an algorithm that evaluates these queries efficiently while minimizing costly oracle invocations, and demonstrate its effectiveness on practical program analysis tasks.
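The idea of combining structural filters with an oracle while keeping oracle calls low can be illustrated with a minimal Python sketch. This is not SemQL's algorithm or syntax, which the abstract does not specify. It only shows two generic cost-saving tactics: apply cheap structural predicates before consulting the oracle, and memoize oracle answers so repeated inputs are asked once. The names `StringLiteral`, `CachedOracle`, `private_literals`, and `fake_llm` are hypothetical, and a keyword check stands in for a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class StringLiteral:
    """Hypothetical extracted fact: a string literal and its location."""
    file: str
    line: int
    value: str


class CachedOracle:
    """Wraps an expensive yes/no oracle and memoizes answers per distinct input."""

    def __init__(self, ask: Callable[[str], bool]) -> None:
        self._ask = ask
        self._cache: Dict[str, bool] = {}
        self.calls = 0  # number of real (uncached) oracle invocations

    def __call__(self, text: str) -> bool:
        if text not in self._cache:
            self.calls += 1
            self._cache[text] = self._ask(text)
        return self._cache[text]


def private_literals(
    literals: Iterable[StringLiteral], oracle: Callable[[str], bool]
) -> List[StringLiteral]:
    """Return non-test string literals that the oracle judges to be private."""
    hits = []
    for lit in literals:
        # Cheap structural predicate first: skip test files without asking the oracle.
        if lit.file.startswith("tests/"):
            continue
        # Expensive semantic judgment last, and only once per distinct value.
        if oracle(lit.value):
            hits.append(lit)
    return hits


def fake_llm(text: str) -> bool:
    """Assumption: stand-in for an LLM call; flags anything that looks like an email."""
    return "@" in text


if __name__ == "__main__":
    facts = [
        StringLiteral("app/db.py", 10, "alice@example.com"),
        StringLiteral("app/mail.py", 22, "alice@example.com"),  # same value: cache hit
        StringLiteral("tests/test_db.py", 5, "bob@example.com"),  # filtered structurally
        StringLiteral("app/ui.py", 3, "Submit"),
    ]
    oracle = CachedOracle(fake_llm)
    for hit in private_literals(facts, oracle):
        print(hit.file, hit.line, hit.value)
    print("oracle calls:", oracle.calls)
```

In this sketch, four candidate facts lead to only two oracle invocations: one literal is excluded by the structural filter and one repeats an already-judged value. A real system would also have to decide how to order joins and oracle predicates inside the query engine, which this example does not attempt.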

Sara Baradaran

University of Southern California

United States

Amirmohammad Nazari

University of Southern California

United States

Mukund Raghothaman

University of Southern California
