Agents Need CI, Not Vibes: Evaluating Microsoft 365 Copilot Agents

Microsoft 365 Copilot agents are crossing the line from demo artifacts into software products. Once that happens, manual spot checks are not enough. A production agent needs a release discipline: evaluation datasets, judge configuration, thresholds, CI/CD gates, evidence packages, and regression memory. Not as governance theatre. As the shortest safe path from “nice demo” to “we can ship this and explain why.” This is the blueprint I would use to move a Copilot agent from vibe-based confidence to governed delivery. ...

May 15, 2026 · 18 min · 3701 words · Pavel Nasovich

Copilot Cowork Under the Hood: Frontier, Work IQ, and the OneDrive Skills Model

On March 9, 2026, Microsoft introduced Copilot Cowork as the move from “Copilot can answer” to “Copilot can carry work forward.” On March 30, 2026, Microsoft said Cowork was available through the Frontier program. As of the Microsoft Learn and Support documentation updated in late March and early April 2026, Cowork is explicitly documented as a preview/prerelease capability that is gated through Frontier and still evolving. S1 S2 S3 S5 S8 That date sequence matters, because Cowork is not just another prompt box. It is Microsoft 365 Copilot’s first serious “plan to action” surface for long-running work: you describe an outcome, Cowork turns it into a plan, grounds it in your tenant context, loads skills, asks for approvals on sensitive steps, and keeps state in a visible task view while it works. S1 S2 S4 S8 S10 ...

April 8, 2026 · 23 min · 4754 words · Pavel Nasovich