Lumos: Let there be Language Model System Certification
This program is tentative and subject to change.
As Language Model Systems (LMS) are deployed across a wide range of applications, aligning them with human ethics has be- come crucial. We introduce a domain-specific language, Lumos, for specifying and statistically certifying LMS behaviors. Lumos is the first imperative probabilistic programming language over graphs, with constructs to sample independent and identically distributed prompts for statistical certification of LMS. It abstracts large prompt distributions as graphs, forming random prompts from subgraphs.
Lumos gives a systematic way to reason about, specify, and certify novel properties. Using Lumos, we develop the first safety specifications for vision-language models (VLMs) in autonomous driving scenarios and the first specifications for privacy preser- vation in LMS responses. We show that the state-of-the-art VLM Qwen-VL exhibits critical safety failures, producing incorrect and unsafe responses with at least 90% probability in right-turn scenar- ios under rainy driving conditions, revealing substantial safety risks. We further integrate a prompt-level deterministic verifier to obtain guarantees over the privacy of the LLM generation distribution over a prompt distribution. Our results reveal that Qwen-3 (4B) has at least 24% chance of revealing private email information under adversarial prompt distributions. Lumos is simple to program in, requiring only a few constructs, as evidenced by state-of-the-art large language models generating correct Lumos specifications in zero-shot settings. Lumos is the first principled, language-based framework for specifying and certifying LMS behaviors, paving the way towards widespread adoption of LMS certification.