GSoC 2025 Midterm - An AI Agent for Jenkins Failure Diagnosis
Hello, Jenkins Community!
I’m Chirag Gupta, and this is the midterm update for my Google Summer of Code 2025 project: "Domain-specific LLM based on actual Jenkins usage using ci.jenkins.io data". The project’s vision is to use AI to speed up the often complex process of diagnosing build failures in Jenkins.
For a detailed overview, please refer to the project page.
We’ve just passed the midterm evaluations, so I’d like to share a progress update on the project.
The Pivot: From Fine-Tuning to a Flexible Agentic System
One of the most significant developments has been a pivot from the original goal of fine-tuning one or more LLMs to building a more future-proof and universal agentic architecture.
Why the change?
-
Flexibility & User Choice: A single fine-tuned model locks users in. Our new agentic framework allows users to plug in any capable LLM, from cloud services like OpenAI and Claude to self-hosted models.
-
Future-Proofing: A specialized, fine-tuned model for Jenkins is still a future goal, but it can now be integrated as just one of many options within the agent, rather than being the entire system.
-
Extensibility: The agent’s capabilities are now defined by its tools, not just its training data. This makes it far easier to add new functionalities over time, like interacting with live Jenkins instances.
Midterm Accomplishments: A Functional Prototype
We have successfully developed a fully functional prototype that establishes this new core architecture. This prototype proves the viability of the agent-based diagnosis model.
-
Interactive CLI: We built a user-friendly command-line interface using Typer and Rich. It guides the user through the diagnosis process, handles file I/O, and presents the final report in an easy-to-read format.
-
Multi-Agent Pipeline: The core logic operates on a "Chain of Responsibility" model:
-
A Router Agent first classifies the failure type.
-
A Specialist Agent then uses a suite of tools to perform an in-depth investigation.
-
An optional Critic Agent enables a self-correction loop, reviewing the diagnosis for quality and forcing a retry if the report is flawed.
-
Advanced RAG Tool: We integrated a sophisticated Retrieval-Augmented Generation (RAG) pipeline using LightRAG. This tool provides the agent with external knowledge and features a hybrid stack of local sentence-transformers for embeddings and Cohere for high-quality reranking.
-
Robust Logging & Sandboxing: The CLI features a dual-logging system for both application debugging and detailed AI interaction auditing. For safety and reproducibility, each diagnosis runs in an isolated, timestamped directory, ensuring the user’s original workspace files are never touched.
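To make the Router, Specialist, and Critic flow described above more concrete, here is a minimal sketch in Python. It is not the project's actual code: the names `LLM`, `Diagnosis`, `route`, `investigate`, `critique`, and `diagnose`, the prompts, and the `max_retries` parameter are all assumptions made for illustration, and the Specialist's tool calls are left out.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical type: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


@dataclass
class Diagnosis:
    failure_type: str
    report: str
    attempts: int


def route(llm: LLM, log: str) -> str:
    """Router Agent: classify the failure type from the build log."""
    return llm(f"Classify this Jenkins failure:\n{log}").strip()


def investigate(llm: LLM, failure_type: str, log: str, feedback: str = "") -> str:
    """Specialist Agent: write a report (tool use is omitted in this sketch)."""
    return llm(
        f"Diagnose a {failure_type} failure.\n"
        f"Reviewer feedback: {feedback}\nLog:\n{log}"
    )


def critique(llm: LLM, report: str) -> Optional[str]:
    """Critic Agent: return None to accept, or feedback text to force a retry."""
    verdict = llm(f"Review this diagnosis. Reply OK or explain the flaw:\n{report}").strip()
    return None if verdict == "OK" else verdict


def diagnose(llm: LLM, log: str, use_critic: bool = True, max_retries: int = 2) -> Diagnosis:
    failure_type = route(llm, log)
    feedback: Optional[str] = ""
    report = ""
    attempt = 0
    # One initial attempt plus up to max_retries retries driven by the critic.
    for attempt in range(1, max_retries + 2):
        report = investigate(llm, failure_type, log, feedback or "")
        if not use_critic:
            break
        feedback = critique(llm, report)
        if feedback is None:  # critic accepted the report
            break
    return Diagnosis(failure_type, report, attempt)
```

Because the model is passed in as a plain callable, the loop can be exercised with a scripted stand-in before wiring up any real provider.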
What’s Next? The Road to the Target Architecture
The next phase will focus on evolving the prototype into the powerful, integrated system envisioned in our target architecture.
-
Expanding LLM Backend Support: We will build out the provider-agnostic LLM adapter to include support for a wider range of backends. This will give users the freedom to choose their preferred provider based on cost, performance, or privacy needs, including direct integrations for OpenAI, Anthropic (Claude), and Groq.
-
RAG for Jenkins Knowledge: Build a comprehensive vector store from the official Jenkins documentation, wikis, and community discussions to give the agent deep domain knowledge, with a choice of embedding models to suit users' needs.
-
Comprehensive Evaluation Framework: Create a framework using techniques like "LLM-as-a-Judge" to rigorously test the quality of the diagnoses and produce valuable insights for the community.
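One way a provider-agnostic LLM adapter like the one mentioned above could be shaped is a small interface plus a registry of backends. The sketch below is only an assumption about the design, not the project's implementation: `LLMBackend`, `EchoBackend`, `register`, and `get_backend` are hypothetical names, and no real provider SDK is called.

```python
from typing import Callable, Dict, Protocol


class LLMBackend(Protocol):
    """Hypothetical interface every provider adapter would satisfy."""

    def complete(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in backend for local testing; a real one would wrap a provider SDK."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


# Maps a backend name to a zero-argument factory that builds the backend.
_REGISTRY: Dict[str, Callable[[], LLMBackend]] = {}


def register(name: str, factory: Callable[[], LLMBackend]) -> None:
    _REGISTRY[name] = factory


def get_backend(name: str) -> LLMBackend:
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(
            f"Unknown backend {name!r}; available: {sorted(_REGISTRY)}"
        ) from None


register("echo", EchoBackend)
```

With this shape, the agents depend only on `complete`, so adding a backend means registering one more factory rather than touching the pipeline.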
Acknowledgements
A heartfelt thank you to my mentors, Kris Stern, Shivay Lamba, Bruno Verachten, Harsh Pratap Singh, and Vutukuri Sreenivas. Their expertise, guidance, and timely reviews have been really helpful in refining the project’s technical roadmap and navigating the challenges.
I’d also like to thank the organization admins Kris Stern, Bruno Verachten, and Alyssa Tong for always checking in and offering help; your kindness and support mean a lot.
Excited for the second phase of the project!
Project Repository: chiru12/jenkins-domain-LLM