Microsoft Presidio: How I Learned to Stop Writing Regex and Love the Recognizer

So, What is This Presidio Thing?

Alright, let’s get technical. Presidio is an open-source library from Microsoft for finding and anonymizing PII. Think of it as a two-part system:

The Analyzer Engine: This is the detective. It scans your text for anything that looks like PII: names, phone numbers, credit cards, you name it. It’s not just one big regex; it’s a whole team of “recognizers” that use a mix of NLP (like spaCy), regular expressions, and even checksums to be extra sure.

The Anonymizer Engine: This is the guy with the big black marker. Once the Analyzer finds the PII, the Anonymizer scrubs it out. But it’s smarter than just deleting things. You can tell it how to anonymize: redact, mask, replace with fake data, or even encrypt it if you need to get the original data back later.

The best part? It’s all pluggable. Don’t like the default NLP model? Swap it out. Need to find a super-specific, internal-only customer ID format? You can write your own recognizer for it. That matters, because it means you’re not stuck with whatever Microsoft decided was a good idea in 2018. ...
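The “regex plus checksum” trick the Analyzer’s recognizers use is easy to sketch in plain Python. This is a toy illustration of the idea, not Presidio’s actual API; the function names here are made up for the example:

```python
import re

def luhn_valid(number: str) -> bool:
    """Luhn checksum: the extra signal a credit-card recognizer uses to
    separate real card numbers from random 16-digit strings."""
    digits = [int(d) for d in number][::-1]
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:       # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def mask_cards(text: str) -> str:
    """Find candidate card numbers with a regex, confirm with the
    checksum, then mask all but the last four digits."""
    def repl(m):
        raw = m.group(0)
        digits = re.sub(r"[ -]", "", raw)
        if luhn_valid(digits):
            return "*" * (len(digits) - 4) + digits[-4:]
        return raw  # regex hit, but the checksum says it's not a card
    return re.sub(r"\b(?:\d[ -]?){13,16}\b", repl, text)

print(mask_cards("Card: 4111111111111111, ref: 1234 5678 9012 3456"))
```

Note how the second number survives untouched: it matches the regex shape but fails the checksum, which is exactly the false-positive filtering a plain regex can’t do on its own.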

August 19, 2025 · 6 min · 1243 words · Pavel Nasovich

II-Search-4B: A Love Letter to Small Models (Or How I Learned to Stop Worrying and Embrace 4B Parameters)

Okay so. Picture this. It’s August 2025, I’m drowning in API costs from o3, my idea’s runway is… let’s not talk about it… and I stumble across this random Hugging Face model called II-Search-4B. Four billion parameters. FOUR. In an era where everyone’s flexing their 405B models like it’s a #### measuring contest. My first thought? “This is gonna s*ck.” My second thought, three weeks later? “Holy crap this actually works.” ...

August 8, 2025 · 8 min · 1574 words · Pavel Nasovich

Qwen3-30B-A3B Deep Dive: How 128 Experts Achieve Frontier Performance at 10% Active Parameters

Qwen3-30B-A3B represents a paradigm shift in large language model efficiency, achieving flagship-level performance with only 3.3 billion active parameters from a 30.5 billion total parameter pool. This Mixture-of-Experts (MoE) model, released by Alibaba’s Qwen team, demonstrates that intelligent parameter activation can outperform brute-force scaling, scoring 91.0 on ArenaHard while using 10x fewer active parameters than comparable dense models. The model’s hybrid thinking architecture enables controllable reasoning depth, supporting both rapid responses and deep analytical tasks through dynamic computational allocation. ...
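The “only a few experts per token” idea behind that 3.3B-active figure can be sketched with a toy top-k router. Qwen3-30B-A3B routes each token to 8 of its 128 experts; the sketch below uses 2 of 8, and the expert functions and router scores are made-up stand-ins, not the model’s actual weights:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, router_scores, k=2):
    """Route input x to the top-k experts by router score and mix their
    outputs by renormalized weights. The remaining experts are never
    evaluated, which is where the compute savings come from."""
    topk = sorted(range(len(experts)),
                  key=lambda i: router_scores[i], reverse=True)[:k]
    weights = softmax([router_scores[i] for i in topk])
    y = sum(w * experts[i](x) for w, i in zip(weights, topk))
    return y, topk

# 8 toy "experts": each just scales the input by a different factor
experts = [lambda x, s=s: s * x for s in range(1, 9)]
scores = [0.1, 2.0, 0.3, 1.5, 0.0, 0.2, 0.4, 0.1]
y, active = moe_forward(3.0, experts, scores, k=2)
print(active)  # only 2 of the 8 experts actually ran
```

All parameters exist in memory, but per token only the routed slice does any work; that is the sense in which a 30.5B model can run with 3.3B active parameters.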

August 7, 2025 · 7 min · 1403 words · Pavel Nasovich

How I Stopped My Copilot Studio Agent From Hallucinating (And You Can Too)

Hey folks! So, picture this: it’s Friday afternoon, I’m about to deploy this beautiful support agent for our internal tools, and everything looks perfect. The agent answers questions, explains our processes, knows all our custom fields… Then someone from QA asks about a feature that doesn’t exist. And my agent? It writes a whole dissertation about this imaginary feature. Complete with parameters, best practices, and “common troubleshooting steps.” Narrator: There were no troubleshooting steps. There was no feature. ...

August 2, 2025 · 17 min · 3604 words · Pavel Nasovich

Measuring GitHub Copilot Productivity: What Actually Works (and What Doesn't)

So you’re trying to figure out if GitHub Copilot is worth it? Join the club. I’ve been down this rabbit hole for the past few months, and honestly, it’s messier than the vendor slides suggest. Here’s the thing - everyone wants that magic number. “Copilot will make your developers X% more productive!” But after digging through actual data from teams using it (and yeah, running our own experiments), the reality is… complicated. ...

August 1, 2025 · 4 min · 720 words · Pavel Nasovich

Figma MCP: How I Learned to Stop Worrying and Let AI Read My Designs

Hey folks! So, picture this: it’s 2 AM, I’m on my third energy drink, and my PM messages me - “can you make the button look EXACTLY like the Figma design?” For the 47th time. That day. Narrator: He could not, in fact, make it look exactly like the design. But then I discovered Figma MCP, and let me tell you - it’s like someone finally gave AI glasses to actually SEE what designers meant instead of guessing. Today I’m gonna share how to set this thing up without losing your sanity. Mostly. ...

July 31, 2025 · 9 min · 1771 words · Pavel Nasovich

I Fed My Entire Codebase to an AI With Repomix. Here's What I Learned

A journey into the lazy, brilliant, and slightly terrifying world of AI-assisted development

Look, let’s be honest. We’ve all been there. It’s 1 AM, you’re staring at a janky codebase that’s grown more tangled than your headphone cables, and you have a brilliant, desperate idea: “I’ll just ask ChatGPT.” You start copy-pasting files, but by the third one, the AI has the memory of a goldfish and asks, “So, what were we talking about again?” Context window slammed shut. Face, meet palm.

I needed a better way to get my AI assistant to understand the beautiful mess I’d created. That’s when I stumbled upon Repomix—a tool that promised to package my entire repository into a single, “AI-friendly” file. The premise is simple, absurd, and undeniably a product of our times: we now need specialized tools just to format our code for our robot overlords. My inner cynic called it a “digital meat grinder.” My inner lazy genius called it a “superpower.” As it turns out, they were both right. ...

July 24, 2025 · 7 min · 1363 words · Pavel Nasovich

GitHub Copilot Agent Mode: EU Data Residency & AI Act Compliance Checklist

1 What “Agent mode” actually does

GitHub Copilot’s Agent mode lets developers type a high-level goal; the LLM then plans, edits code, invokes tools, and loops until tests pass⁴ (learn.microsoft.com). Behind the scenes, Visual Studio, VS Code and Copilot Chat call the same Azure OpenAI endpoint used by Copilot Chat and Copilot for Azure⁵ (learn.microsoft.com). ...
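That plan / edit / run-tests loop can be sketched as a tiny harness. This is a toy skeleton of the pattern, not Copilot’s implementation; every name here (agent_loop, propose_edit, run_tests) is invented for illustration:

```python
def agent_loop(goal, propose_edit, run_tests, max_iters=5):
    """Minimal agent-mode skeleton: keep proposing edits and
    re-running the test suite until it passes or we give up."""
    history = []
    for i in range(max_iters):
        edit = propose_edit(goal, history)   # an LLM call in the real thing
        passed, feedback = run_tests(edit)   # a tool invocation
        history.append((edit, feedback))
        if passed:
            return edit, i + 1
    raise RuntimeError(f"no passing edit after {max_iters} iterations")

# Toy stand-ins: the "model" fixes the bug on its second attempt
attempts = iter(["return a - b", "return a + b"])
edit, iters = agent_loop(
    goal="make add(a, b) return the sum",
    propose_edit=lambda goal, hist: next(attempts),
    run_tests=lambda e: (e == "return a + b",
                         "OK" if e == "return a + b" else "FAIL"),
)
print(edit, iters)
```

The compliance angle follows directly from the loop: each iteration is another round trip to the model endpoint, so where that endpoint lives determines where your code travels.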

July 8, 2025 · 4 min · 685 words · Pavel Nasovich

ByteDance's AI breakthrough reshapes how computers are used

How UI-TARS actually works

UI-TARS represents a fundamental departure from traditional GUI automation tools by integrating perception, reasoning, action, and memory into a single end-to-end model. Unlike frameworks that rely on wrapped commercial models with predefined workflows, UI-TARS uses a pure-vision approach that processes screenshots directly. The architecture comprises four tightly integrated components:

Perception System: Processes screenshots to understand GUI elements, their relationships, and context. The model identifies buttons, text fields, and interactive components with sub-5-pixel accuracy, allowing for precise interactions. ...
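The “four components in one model” design can be caricatured in a few lines. Everything below is a made-up stand-in (rule-based lookups instead of a vision-language model), just to show how perception, reasoning, action, and memory share a single step function rather than being separate wrapped tools:

```python
from dataclasses import dataclass, field

@dataclass
class ToyGuiAgent:
    """One end-to-end loop: screenshot in, action out, with a running
    memory of past steps. Purely illustrative; the real model is a
    vision-language transformer, not if/else rules."""
    memory: list = field(default_factory=list)

    def perceive(self, screenshot):
        # stand-in for the vision encoder: "find" elements in the frame
        return screenshot["elements"]

    def reason_and_act(self, goal, elements):
        # stand-in for the reasoning head: pick an element matching the goal
        for el in elements:
            if goal in el["label"]:
                return {"type": "click", "x": el["x"], "y": el["y"]}
        return {"type": "scroll"}

    def step(self, goal, screenshot):
        elements = self.perceive(screenshot)
        action = self.reason_and_act(goal, elements)
        self.memory.append(action)  # memory feeds the next step
        return action

agent = ToyGuiAgent()
frame = {"elements": [{"label": "Submit order", "x": 320, "y": 480}]}
action = agent.step("Submit", frame)
print(action)
```

The contrast with wrapper frameworks is the point: there is no predefined workflow graph, only a perception-to-action mapping applied repeatedly over raw screenshots.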

May 13, 2025 · 9 min · 1889 words · Pavel Nasovich

Running Local LLMs on Microsoft Surface Pro 11: NPU Acceleration with DeepSeek Models

Introduction

The computing industry is witnessing a paradigm shift with the integration of dedicated AI accelerators in consumer devices. Microsoft’s Copilot+ PCs, including the Surface Pro 11, represent a strategic investment in on-device AI processing capabilities. The ability to run sophisticated Large Language Models (LLMs) locally, without constant cloud connectivity, offers compelling advantages in terms of privacy, latency, and offline functionality.

This report investigates the practical aspects of developing and deploying local LLMs on the Surface Pro 11 (Snapdragon X Elite) with 32GB RAM, focusing specifically on leveraging Neural Processing Unit (NPU) acceleration through the ONNX Runtime and the implementation of the DeepSeek R1 7B and 14B distilled models. By examining the developer experience, performance characteristics, and comparing with Apple’s M4 silicon, we aim to provide a comprehensive understanding of the current state and future potential of on-device AI processing. ...

May 12, 2025 · 18 min · 3751 words · Pavel Nasovich