Microsoft Presidio: How I Learned to Stop Writing Regex and Love the Recognizer

So, What is This Presidio Thing?

Alright, let’s get technical. Presidio is an open-source library from Microsoft for finding and anonymizing PII. Think of it as a two-part system:

The Analyzer Engine: This is the detective. It scans your text for anything that looks like PII: names, phone numbers, credit cards, you name it. It’s not just one big regex; it’s a whole team of “recognizers” that use a mix of NLP (like spaCy), regular expressions, and even checksums to be extra sure.

The Anonymizer Engine: This is the guy with the big black marker. Once the Analyzer finds the PII, the Anonymizer scrubs it out. But it’s smarter than just deleting things. You can tell it how to anonymize: redact, mask, replace with fake data, or even encrypt it if you need to get the original data back later.

The best part? It’s all pluggable. Don’t like the default NLP model? Swap it out. Need to find a super-specific, internal-only customer ID format? You can write your own recognizer for it. That matters, because it means you’re not stuck with whatever Microsoft decided was a good idea in 2018. ...
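The “regex plus checksum” trick the Analyzer’s recognizers use is easy to sketch in plain Python. This is a toy illustration of the idea, not Presidio’s actual API; the function names here are made up for the example:

```python
import re

def luhn_valid(number: str) -> bool:
    """Luhn checksum: the extra signal a credit-card recognizer uses to
    separate real card numbers from random 16-digit strings."""
    digits = [int(d) for d in number][::-1]
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:       # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def mask_cards(text: str) -> str:
    """Find candidate card numbers with a regex, confirm with the
    checksum, then mask all but the last four digits."""
    def repl(m):
        raw = m.group(0)
        digits = re.sub(r"[ -]", "", raw)
        if luhn_valid(digits):
            return "*" * (len(digits) - 4) + digits[-4:]
        return raw  # regex hit, but the checksum says it's not a card
    return re.sub(r"\b(?:\d[ -]?){13,16}\b", repl, text)

print(mask_cards("Card: 4111111111111111, ref: 1234 5678 9012 3456"))
```

Note how the second number survives untouched: it matches the regex shape but fails the checksum, which is exactly the false-positive filtering a plain regex can’t do on its own.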

August 19, 2025 · 6 min · 1243 words · Pavel Nasovich

II-Search-4B: A Love Letter to Small Models (Or How I Learned to Stop Worrying and Embrace 4B Parameters)

Okay so. Picture this. It’s August 2025, I’m drowning in API costs from o3, my idea’s runway is… let’s not talk about it… and I stumble across this random Hugging Face model called II-Search-4B. Four billion parameters. FOUR. In an era where everyone’s flexing their 405B models like it’s a #### measuring contest. My first thought? “This is gonna s*ck.” My second thought, three weeks later? “Holy crap this actually works.” ...

August 8, 2025 · 8 min · 1574 words · Pavel Nasovich

Qwen3-30B-A3B Deep Dive: How 128 Experts Achieve Frontier Performance at 10% Active Parameters

Qwen3-30B-A3B represents a paradigm shift in large language model efficiency, achieving flagship-level performance with only 3.3 billion active parameters from a 30.5 billion total parameter pool. This Mixture-of-Experts (MoE) model, released by Alibaba’s Qwen team, demonstrates that intelligent parameter activation can outperform brute-force scaling, scoring 91.0 on ArenaHard while using 10x fewer active parameters than comparable dense models. The model’s hybrid thinking architecture enables controllable reasoning depth, supporting both rapid responses and deep analytical tasks through dynamic computational allocation. ...
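The “only a few experts per token” idea behind that 3.3B-active figure can be sketched with a toy top-k router. Qwen3-30B-A3B routes each token to 8 of its 128 experts; the sketch below uses 2 of 8, and the expert functions and router scores are made-up stand-ins, not the model’s actual weights:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x, experts, router_scores, k=2):
    """Route input x to the top-k experts by router score and mix their
    outputs by renormalized weights. The remaining experts are never
    evaluated, which is where the compute savings come from."""
    topk = sorted(range(len(experts)),
                  key=lambda i: router_scores[i], reverse=True)[:k]
    weights = softmax([router_scores[i] for i in topk])
    y = sum(w * experts[i](x) for w, i in zip(weights, topk))
    return y, topk

# 8 toy "experts": each just scales the input by a different factor
experts = [lambda x, s=s: s * x for s in range(1, 9)]
scores = [0.1, 2.0, 0.3, 1.5, 0.0, 0.2, 0.4, 0.1]
y, active = moe_forward(3.0, experts, scores, k=2)
print(active)  # only 2 of the 8 experts actually ran
```

All parameters exist in memory, but per token only the routed slice does any work; that is the sense in which a 30.5B model can run with 3.3B active parameters.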

August 7, 2025 · 7 min · 1403 words · Pavel Nasovich

How I Stopped My Copilot Studio Agent From Hallucinating (And You Can Too)

Hey folks! So, picture this: it’s Friday afternoon, I’m about to deploy this beautiful support agent for our internal tools, and everything looks perfect. The agent answers questions, explains our processes, knows all our custom fields… Then someone from QA asks about a feature that doesn’t exist. And my agent? It writes a whole dissertation about this imaginary feature. Complete with parameters, best practices, and “common troubleshooting steps.” Narrator: There were no troubleshooting steps. There was no feature. ...

August 2, 2025 · 17 min · 3604 words · Pavel Nasovich

Measuring GitHub Copilot Productivity: What Actually Works (and What Doesn't)

So you’re trying to figure out if GitHub Copilot is worth it? Join the club. I’ve been down this rabbit hole for the past few months, and honestly, it’s messier than the vendor slides suggest. Here’s the thing - everyone wants that magic number. “Copilot will make your developers X% more productive!” But after digging through actual data from teams using it (and yeah, running our own experiments), the reality is… complicated. ...

August 1, 2025 · 4 min · 720 words · Pavel Nasovich

Figma MCP: How I Learned to Stop Worrying and Let AI Read My Designs

Hey folks! So, picture this: it’s 2 AM, I’m on my third energy drink, and my PM messages me - “can you make the button look EXACTLY like the Figma design?” For the 47th time. That day. Narrator: He could not, in fact, make it look exactly like the design. But then I discovered Figma MCP, and let me tell you - it’s like someone finally gave AI glasses to actually SEE what designers meant instead of guessing. Today I’m gonna share how to set this thing up without losing your sanity. Mostly. ...

July 31, 2025 · 9 min · 1771 words · Pavel Nasovich

I Fed My Entire Codebase to an AI With Repomix. Here's What I Learned

A journey into the lazy, brilliant, and slightly terrifying world of AI-assisted development

Look, let’s be honest. We’ve all been there. It’s 1 AM, you’re staring at a janky codebase that’s grown more tangled than your headphone cables, and you have a brilliant, desperate idea: “I’ll just ask ChatGPT.” You start copy-pasting files, but by the third one, the AI has the memory of a goldfish and asks, “So, what were we talking about again?” Context window slammed shut. Face, meet palm.

I needed a better way to get my AI assistant to understand the beautiful mess I’d created. That’s when I stumbled upon Repomix—a tool that promised to package my entire repository into a single, “AI-friendly” file. The premise is simple, absurd, and undeniably a product of our times: we now need specialized tools just to format our code for our robot overlords. My inner cynic called it a “digital meat grinder.” My inner lazy genius called it a “superpower.” As it turns out, they were both right. ...

July 24, 2025 · 7 min · 1363 words · Pavel Nasovich

GitHub Copilot Agent Mode: EU Data Residency & AI Act Compliance Checklist

1 What “Agent mode” actually does

GitHub Copilot’s Agent mode lets developers type a high-level goal; the LLM then plans, edits code, invokes tools, and loops until tests pass⁴ (learn.microsoft.com). Behind the scenes, Visual Studio, VS Code and Copilot Chat call the same Azure OpenAI endpoint used by Copilot Chat and Copilot for Azure⁵ (learn.microsoft.com). ...
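That plan / edit / run-tests loop can be sketched as a tiny harness. This is a toy skeleton of the pattern, not Copilot’s implementation; every name here (agent_loop, propose_edit, run_tests) is invented for illustration:

```python
def agent_loop(goal, propose_edit, run_tests, max_iters=5):
    """Minimal agent-mode skeleton: keep proposing edits and
    re-running the test suite until it passes or we give up."""
    history = []
    for i in range(max_iters):
        edit = propose_edit(goal, history)   # an LLM call in the real thing
        passed, feedback = run_tests(edit)   # a tool invocation
        history.append((edit, feedback))
        if passed:
            return edit, i + 1
    raise RuntimeError(f"no passing edit after {max_iters} iterations")

# Toy stand-ins: the "model" fixes the bug on its second attempt
attempts = iter(["return a - b", "return a + b"])
edit, iters = agent_loop(
    goal="make add(a, b) return the sum",
    propose_edit=lambda goal, hist: next(attempts),
    run_tests=lambda e: (e == "return a + b",
                         "OK" if e == "return a + b" else "FAIL"),
)
print(edit, iters)
```

The compliance angle follows directly from the loop: each iteration is another round trip to the model endpoint, so where that endpoint lives determines where your code travels.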

July 8, 2025 · 4 min · 685 words · Pavel Nasovich

ByteDance's AI breakthrough reshapes how computers are used

How UI-TARS actually works

UI-TARS represents a fundamental departure from traditional GUI automation tools by integrating perception, reasoning, action, and memory into a single end-to-end model. Unlike frameworks that rely on wrapped commercial models with predefined workflows, UI-TARS uses a pure-vision approach that processes screenshots directly. The architecture comprises four tightly integrated components:

Perception System: Processes screenshots to understand GUI elements, their relationships, and context. The model identifies buttons, text fields, and interactive components with sub-5-pixel accuracy, allowing for precise interactions. ...
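The “four components in one model” design can be caricatured in a few lines. Everything below is a made-up stand-in (rule-based lookups instead of a vision-language model), just to show how perception, reasoning, action, and memory share a single step function rather than being separate wrapped tools:

```python
from dataclasses import dataclass, field

@dataclass
class ToyGuiAgent:
    """One end-to-end loop: screenshot in, action out, with a running
    memory of past steps. Purely illustrative; the real model is a
    vision-language transformer, not if/else rules."""
    memory: list = field(default_factory=list)

    def perceive(self, screenshot):
        # stand-in for the vision encoder: "find" elements in the frame
        return screenshot["elements"]

    def reason_and_act(self, goal, elements):
        # stand-in for the reasoning head: pick an element matching the goal
        for el in elements:
            if goal in el["label"]:
                return {"type": "click", "x": el["x"], "y": el["y"]}
        return {"type": "scroll"}

    def step(self, goal, screenshot):
        elements = self.perceive(screenshot)
        action = self.reason_and_act(goal, elements)
        self.memory.append(action)  # memory feeds the next step
        return action

agent = ToyGuiAgent()
frame = {"elements": [{"label": "Submit order", "x": 320, "y": 480}]}
action = agent.step("Submit", frame)
print(action)
```

The contrast with wrapper frameworks is the point: there is no predefined workflow graph, only a perception-to-action mapping applied repeatedly over raw screenshots.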

May 13, 2025 · 9 min · 1889 words · Pavel Nasovich

Running Local LLMs on Microsoft Surface Pro 11: NPU Acceleration with DeepSeek Models

Introduction

The computing industry is witnessing a paradigm shift with the integration of dedicated AI accelerators in consumer devices. Microsoft’s Copilot+ PCs, including the Surface Pro 11, represent a strategic investment in on-device AI processing capabilities. The ability to run sophisticated Large Language Models (LLMs) locally, without constant cloud connectivity, offers compelling advantages in terms of privacy, latency, and offline functionality.

This report investigates the practical aspects of developing and deploying local LLMs on the Surface Pro 11 (Snapdragon X Elite) with 32GB RAM, focusing specifically on leveraging Neural Processing Unit (NPU) acceleration through the ONNX Runtime and the implementation of the DeepSeek R1 7B and 14B distilled models. By examining the developer experience, performance characteristics, and comparing with Apple’s M4 silicon, we aim to provide a comprehensive understanding of the current state and future potential of on-device AI processing. ...

May 12, 2025 · 18 min · 3751 words · Pavel Nasovich