Industry
AI · Product Operations
Client
Personal Project
OpsPilot — A Multi-Agent AI System for Autonomous Product Operations

I built this because I wanted to understand agentic AI from the inside, not the outside.
The problem didn't start with technology. It started with a conversation. Talking to senior PMs and ICs at other companies, I kept hearing the same complaint: the actual thinking work, the prioritisation, the strategy, the decisions that require real judgment, was getting crowded out by the operational layer surrounding it. Reading messages. Interpreting intent. Deciding what each one needs. First-drafting from scratch. Thirty times a day across Gmail and Slack.

AI assistants exist. But they wait to be asked. You still read the message, decide what it needs, frame the prompt, and review the output. The assistant reduces drafting time. It doesn't reduce the cognitive overhead of triage. You're still the operator.

I wanted to build something that didn't wait. A system that reads the signal, understands what it requires, decides what to produce, and produces it without being prompted for each task individually. That became OpsPilot.

The Wrong First Version

The first version was a single prompt, one call to Gemini doing classification, routing, and generation simultaneously. It worked on clean inputs. It fell apart on realistic ones.

The specific failure: on ambiguous messages like "can we get a PRD for the search bar?", the model began generating content before resolving whether this was a real request or a passing comment. Structurally correct output. Wrong in substance. Classification and generation were corrupting each other by running inside the same context at the same time.

The fix was four separate agents, each with exactly one job:
• Input Agent: reads raw message, converts to standard format
• Classifier Agent: one question only, what type of task is this?
• Decision Agent: routes to the correct output type
• Execution Agent: generates the final artifact

Accuracy improved. Debuggability improved dramatically. When something broke, I could see exactly which agent caused it instead of tracing through a monolithic prompt.
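To make the four-agent split concrete, here is a minimal sketch of how such a pipeline could be wired together. It is an illustration under stated assumptions, not OpsPilot's actual code: every name (`PipelineState`, `ROUTES`, the task and output types) is hypothetical, and simple keyword rules stand in for the model calls so the example runs without any API. In particular, the keyword rule does not resolve the "real request or passing comment" ambiguity described above; a real classifier agent would.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    source: str  # e.g. "gmail" or "slack"
    text: str


@dataclass
class PipelineState:
    message: Message
    task_type: str = ""
    output_type: str = ""
    artifact: str = ""
    trace: list = field(default_factory=list)  # one (agent, note) entry per stage


def input_agent(source: str, raw: str) -> PipelineState:
    # Job: convert the raw message to a standard format (here: trimmed text).
    message = Message(source=source.lower(), text=" ".join(raw.split()))
    state = PipelineState(message=message)
    state.trace.append(("input", f"normalised message from {message.source}"))
    return state


def classifier_agent(state: PipelineState) -> PipelineState:
    # Job: answer one question only, what type of task is this?
    # Placeholder rules; a real system would call a model here.
    text = state.message.text.lower()
    if "prd" in text:
        state.task_type = "document_request"
    elif text.endswith("?"):
        state.task_type = "question"
    else:
        state.task_type = "fyi"
    state.trace.append(("classifier", state.task_type))
    return state


# Hypothetical routing table from task type to output type.
ROUTES = {
    "document_request": "prd_draft",
    "question": "reply_draft",
    "fyi": "no_action",
}


def decision_agent(state: PipelineState) -> PipelineState:
    # Job: route to the correct output type. No generation happens here.
    state.output_type = ROUTES[state.task_type]
    state.trace.append(("decision", state.output_type))
    return state


def execution_agent(state: PipelineState) -> PipelineState:
    # Job: generate the final artifact (stubbed as a labelled string).
    if state.output_type != "no_action":
        state.artifact = f"[{state.output_type}] for: {state.message.text}"
    state.trace.append(("execution", f"produced {len(state.artifact)} chars"))
    return state


def run_pipeline(source: str, raw: str) -> PipelineState:
    state = input_agent(source, raw)
    for agent in (classifier_agent, decision_agent, execution_agent):
        state = agent(state)
    return state


state = run_pipeline("Slack", "  can we get a PRD   for the search bar? ")
for agent_name, note in state.trace:
    print(f"{agent_name}: {note}")
```

Because each stage only reads and writes its own fields on the shared state, a wrong result can be traced to the stage whose `trace` entry looks wrong, which is the debuggability benefit described above.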
What It Does

• Mission Control: all Gmail and Slack messages in one unified view
• Auto-Execute: one click triggers the full pipeline, no prompting per task
• Glass-Box Reasoning: transparent log of how the AI interpreted each message and why it routed the way it did. Built deliberately because black-box outputs break trust, and if you can't see the reasoning you can't catch the errors
• Document Analysis Workspace: upload PDFs or images and have a full conversation with the document, ask specific questions, request sections, get contextual answers
• Knowledge Base: all outputs saved, editable, and downloadable across sessions
• Gmail OAuth: processes real inbox messages, not simulated data
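One way a Glass-Box Reasoning log could be structured is as a per-message list of steps, each recording which agent decided what and why. The sketch below is an assumption about shape only; `ReasoningStep`, `ReasoningLog`, and the field names are hypothetical and not taken from OpsPilot.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ReasoningStep:
    agent: str       # which agent made the call
    conclusion: str  # what it decided
    rationale: str   # why, in plain language


class ReasoningLog:
    """Collects the reasoning steps for a single message (hypothetical shape)."""

    def __init__(self, message_id: str):
        self.message_id = message_id
        self.steps = []

    def record(self, agent: str, conclusion: str, rationale: str) -> None:
        self.steps.append(ReasoningStep(agent, conclusion, rationale))

    def render(self) -> str:
        # Human-readable view, one numbered line per step.
        lines = [f"Message {self.message_id}"]
        for i, step in enumerate(self.steps, start=1):
            lines.append(
                f"{i}. {step.agent}: {step.conclusion} (because {step.rationale})"
            )
        return "\n".join(lines)

    def to_json(self) -> str:
        # Machine-readable view, suitable for saving alongside the output.
        return json.dumps(
            {"message_id": self.message_id, "steps": [asdict(s) for s in self.steps]}
        )


log = ReasoningLog("m1")
log.record("classifier", "document_request", "the message asks for a PRD")
log.record("decision", "prd_draft", "document requests route to a draft")
print(log.render())
```

Keeping the rationale as a separate field, rather than folding it into the conclusion, is what lets a reviewer spot a step where the conclusion looks plausible but the reasoning behind it is wrong.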

