Enterprise MCP Gateway on Azure: A Production Blueprint for Secure Tool Calling

Most teams are still wiring MCP the wrong way. They let every client talk directly to every tool server, bolt on auth late, and discover too late that “agent integration” silently became a new control plane with no owner, no inventory, and no reliable audit trail. Azure is now mature enough to do this properly, but the platform story is split across API Management, App Service or Functions authorization, Microsoft Foundry, and Microsoft Entra. The hard part is not learning each product in isolation. The hard part is deciding where identity, mediation, delegation, and logging must live so a tool call is still explainable after the fifth preview feature lands. [S1] [S2] [S3] [S4] [S5] [S6] ...

April 10, 2026 · 22 min · 4546 words · Pavel Nasovich

Beyond Pattern Matching: What Formal Reasoning Engines Can Verify Today

Remove the verifier and most “formal reasoning engines” collapse into persuasive autocomplete. The real progress from 2023 to March 2026 did not come from models suddenly learning pure deduction. It came from changing the system boundary: retrieval narrowed the search space, proof assistants and solvers rejected invalid steps, and repair loops turned deterministic failures into usable feedback. That pattern runs from LeanDojo to AlphaProof to VERINA and WybeCoder. S1 S2 S7 S9 That distinction matters because the headline numbers are finally good enough to expose both the progress and the limit. On March 16, 2026, the latest VERINA revision still showed a large gap between “code that runs” and “code that is proved”: the best model reached 72.6% code correctness and 52.3% specification soundness/completeness, but only 4.9% proof success in one trial. On March 31, 2026, WybeCoder pushed much further by making verification itself agentic, solving 74% of VERINA tasks at moderate compute. The lesson is blunt: if correctness matters, the winning move is not “more chain-of-thought.” It is a tighter verifier loop. S7 S9 ...

April 9, 2026 · 13 min · 2673 words · Pavel Nasovich

Copilot Cowork Under the Hood: Frontier, Work IQ, and the OneDrive Skills Model

On March 9, 2026, Microsoft introduced Copilot Cowork as the move from “Copilot can answer” to “Copilot can carry work forward.” On March 30, 2026, Microsoft said Cowork was available through the Frontier program. As of late March and early April 2026, the Microsoft Learn and Support documentation still explicitly describes Cowork as a preview/prerelease capability, gated through Frontier and still evolving. S1 S2 S3 S5 S8 That date sequence matters, because Cowork is not just another prompt box. It is Microsoft 365 Copilot’s first serious “plan to action” surface for long-running work: you describe an outcome, Cowork turns it into a plan, grounds it in your tenant context, loads skills, asks for approvals on sensitive steps, and keeps state in a visible task view while it works. S1 S2 S4 S8 S10 ...

April 8, 2026 · 23 min · 4754 words · Pavel Nasovich

PlugMem Under the Hood: Why Knowledge-Centric Memory Changes LLM Agents

Most agent-memory systems still do the lazy thing: store raw interaction history, retrieve a few chunks, and hope the base model compresses the mess at inference time. PlugMem starts from a much stronger assumption. The useful part of experience is sparse, structured, and should be compiled before retrieval. That is why this paper matters. PlugMem was submitted to arXiv on February 6, 2026, published on the Microsoft Research site on March 6, 2026, and the PDF metadata marks it as an ICML 2026 proceedings paper. As of April 5, 2026, the code and benchmark artifacts are public. The claim is ambitious but concrete: a single task-agnostic memory module, attached unchanged to very different agents, can beat both raw-memory baselines and several task-specific memory systems while using much less agent-side context. S1 S2 S3 S4 ...

April 5, 2026 · 16 min · 3405 words · Pavel Nasovich

I Ran pi on Gemma 4 26B A4B via llama.cpp. Here Is What Broke First

On April 4, 2026, I ran pi against a local llama.cpp endpoint serving unsloth/gemma-4-26B-A4B-it-GGUF:UD-Q4_K_XL. I wanted a clear answer to one question: has a quantized open model crossed the line from “cute local demo” to “serious enough to matter” for agentic coding? My answer is yes, but only if you are honest about where it breaks. This stack is already good enough to read local instructions, load skills, use tools, follow a plan, write files, run commands, and recover from some failures. It is not good enough to be trusted on precision-heavy work without hard validation. The first things to degrade were not general fluency or vibe. The first things to degrade were exactness, path resolution, layout arithmetic, and long-context responsiveness. ...

April 4, 2026 · 11 min · 2333 words · Pavel Nasovich

TurboQuant Under the Hood: Google's 3-Bit Attack on the LLM Memory Wall

Most AI efficiency launches are either smaller weights, benchmark theater, or a kernel trick dressed up as a new paradigm. TurboQuant is more interesting than that. On March 24, 2026, Google Research published TurboQuant as a practical compression stack for KV caches and vector search. The public claim was blunt: at least 6x KV-cache reduction, up to 8x attention-logit speedup on H100, and no training or fine-tuning required. Underneath the marketing, the real contribution is cleaner and more important: Google found a way to make extreme low-bit vector quantization behave like a systems primitive instead of a fragile research demo. S1 ...

March 26, 2026 · 15 min · 3033 words · Pavel Nasovich

Microsoft's Agentic Modernization Stack: Azure Copilot, GitHub Copilot, and the Control Plane Nobody Is Talking About

Microsoft’s biggest AI play in 2026 is not a new model, a new IDE, or a new assistant. It is an emerging connected modernization control plane: Azure Copilot owns migration and operational intelligence, GitHub Copilot owns application transformation execution, and Operations Center gives enterprises a single surface to observe, steer, and govern the resulting cloud estate. Each layer is individually useful. Together they describe something more interesting: a vertical stack that can convert a multi-year legacy migration program into a continuous, agentic workflow, with humans kept in the decision seat rather than removed from it. ...

March 24, 2026 · 15 min · 3106 words · Pavel Nasovich

The Real GitHub Copilot Publishing Factory: How I Turned a Hugo Blog into a Repo-Aware Content System

Most “Copilot for blogging” setups are fake. They give the model a nicer prompt, maybe a scaffold script, and then act surprised when the output breaks the repo. That approach fails the moment the repository has real structure: Hugo page bundles instead of one flat posts/ folder; local images and downloadable assets; theme overrides on top of a vendored submodule; deploy config and build rules; old posts with inconsistent front matter styles; companion materials like quizzes, flashcards, or social copy. I wanted something stricter: a repo where GitHub Copilot could take a scoped topic, research it, scaffold the right bundle, write into the right files, validate the result, and stop before touching generated output. ...

March 24, 2026 · 12 min · 2380 words · Pavel Nasovich

When the Scanner Turned: Inside the Trivy Supply Chain Attack and the Rise of CanisterWorm

In March 2026, attackers turned Aqua Security’s Trivy ecosystem into a credential-harvesting distribution channel. This was not one bug, one poisoned package, or one bad release. It was a chained failure across GitHub Actions trust, secret rotation, mutable tags, runner memory, registry publishing, and npm’s default willingness to execute third-party code. On February 27 and February 28, 2026, the Trivy story started the way a lot of modern software compromises start: not with a zero-day in the scanner, but with automation glued together too loosely around trust. An autonomous agent dubbed hackerbot-claw found a dangerous pull_request_target pattern in Aqua Security’s Trivy repository, exploited it, and stole a privileged aqua-bot token. That first breach was bad enough on its own. The real disaster came after the first incident was supposedly contained. ...

March 24, 2026 · 17 min · 3422 words · Pavel Nasovich

GitHub-Native Autonomous Intake for Copilot: From Structured Issues to Draft PRs

Most autonomous content demos are fake. They show a model taking a prompt and emitting a draft, but they skip the part that actually matters in a working repository: intake structure, validation, repo rules, PR flow, and failure handling. For this blog, I wanted a GitHub-native pipeline where an idea could start as a structured issue, get normalized into a deterministic brief, be assigned to GitHub Copilot, and come back as a draft PR that still respected the repo. ...

March 24, 2026 · 9 min · 1891 words · Pavel Nasovich