Discovering and Learning Probabilistic Models of Black-Box AI Capabilities

Daniel Bramblett, Rushang Karia, Adrian Ciotinga, Ruthvick Suresh, Pulkit Verma, YooJung Choi, and Siddharth Srivastava.
(preprint), 2025

arXiv

Abstract

Black-box AI (BBAI) systems such as foundational models are increasingly being used for sequential decision making. To ensure that such systems are safe to operate and deploy, it is imperative to develop efficient methods that can provide a sound and interpretable representation of the BBAI’s capabilities. This paper shows that PDDL-style representations can be used to efficiently learn and model an input BBAI’s planning capabilities. It uses the Monte-Carlo tree search paradigm to systematically create test tasks, acquire data, and prune the hypothesis space of possible symbolic models. Learned models describe a BBAI’s capabilities, the conditions under which they can be executed, and the possible outcomes of executing them along with their associated probabilities. Theoretical results show soundness, completeness and convergence of the learned models. Empirical results with multiple BBAI systems illustrate the scope, efficiency, and accuracy of the presented methods.

Citation

@article{Bramblett25,
    author    = {Bramblett, Daniel and Karia, Rushang and Ciotinga, Adrian and Suresh, Ruthvick and Verma, Pulkit and Choi, YooJung and Srivastava, Siddharth},
    title     = {Discovering and Learning Probabilistic Models of Black-Box AI Capabilities},
    journal   = {arXiv preprint arXiv:2512.16733},
    year      = {2025},
    month     = {dec},
}
