RunAnywhere (YC W26): The Real Bet Behind Fast AI Inference on Apple Silicon

Preface: How I Read This Research Pack The local research bundle on RunAnywhere is broad, but it is not uniform. Some files are direct performance summaries, some are opinionated strategy memos, and some are clearly derivative study aids built from the same underlying source set. After reading the full bundle, then re-checking the public web evidence on March 12, 2026, my conclusion is narrower and more useful: RunAnywhere is not just a “fastest inference on Apple Silicon” demo. It is trying to become the runtime, packaging, and fleet-management layer for on-device AI, with MetalRT acting as the Apple-Silicon flagship proof point. S1 S2 S3 S4 ...

March 11, 2026 · 16 min · 3211 words · Pavel Nasovich

The Great Immich Migration: From v1.113.0 to v2.5.6

How a “simple” photo library upgrade turned into a deep dive through PostgreSQL version migrations, deprecated vector extensions, and the kind of database surgery you hope to never need. The Starting Point My Immich instance had been happily humming along at v1.113.0 for months on my Unraid server. I was running the community all-in-one imagegenius/immich container variant, which bundles the server, microservices, machine learning, and Redis into one image, backed by an NVIDIA GPU for CUDA-accelerated ML and a shared PostgreSQL 14 instance that also served a pile of other workloads. ...

March 9, 2026 · 7 min · 1380 words · Pavel Nasovich

FinOps Toolkit Framework Playbook: Secure Hubs, AI Agents, and a 90-Day Execution Model

Preface: Why This Version Exists Most FinOps programs fail in the same place: they build good dashboards and still ship bad decisions. The root cause is rarely tooling. It is usually one of these: Ingestion is not trustworthy (missing prices, missing months, duplicates after scope changes). Ownership is fuzzy (nobody is on the hook for a recommendation becoming a change). The loop is discontinuous (big cost projects twice a year instead of an operating rhythm). This playbook focuses on the parts that create trust: data contracts, scope design, versioning, and operational gates. ...

March 2, 2026 · 13 min · 2693 words · Pavel Nasovich

The Linear Revolution at ICLR 2026: Mamba-3, EFLA, and the End of the Quadratic Bottleneck

1. Introduction: From Rio to the Future of Efficiency The Fourteenth International Conference on Learning Representations (ICLR 2026) in Rio de Janeiro has solidified a paradigm shift that many of us in the AI architecture space have long anticipated: the transition from “approximate” efficiency to “exact” sub-quadratic modeling. For years, the industry accepted the quadratic compute and linear memory bottlenecks of standard Transformers as an unavoidable tax on quality. Rio 2026 has definitively challenged this notion. S4 S5 ...

March 1, 2026 · 6 min · 1200 words · Pavel Nasovich

Why Your OpenCode Skill Shows No Output (And How to Fix It)

TL;DR: OpenCode’s run mode captures subprocess stdout and only returns it after the command finishes. If your skill launches a long-running pipeline, the user sees nothing for minutes or hours. The fix: write a plain-text progress log that users can tail -f from a second terminal. The 30-Second Fix If you just want the solution:

    from datetime import datetime
    from pathlib import Path

    def log_progress(output_folder: str, msg: str) -> None:
        log_path = Path(output_folder) / "progress.log"
        log_path.parent.mkdir(parents=True, exist_ok=True)
        with open(log_path, "a") as f:
            f.write(f"[{datetime.now().strftime('%H:%M:%S')}] {msg}\n")
            f.flush()  # Critical for tail -f!

Then tell users: tail -f <output>/progress.log ...

February 28, 2026 · 6 min · 1205 words · Pavel Nasovich

Microsoft Agent Framework in 2026: Enterprise Architecture Playbook

Reading time: ~45 min | Audience: platform leads, principal engineers, AI architects, security teams | Primary goal: build agent systems that stay stable under real enterprise pressure Preface: Why I Wrote This Version Quick context on why this exists. I have read too many agent posts that sound convincing and then collapse in real enterprise environments. Usually they miss one of three things: They stop at demos. They show code without operations. They draw architecture without incident behavior. This version is for teams that already shipped something and now need clear answers in design review, security review, and on-call: ...

February 23, 2026 · 15 min · 3074 words · Pavel Nasovich

The Future of Code in 2026: 8 Agentic Trends Reshaping Software Development

Executive Summary In 2026, software teams are moving from AI-assisted coding to AI-orchestrated delivery. The winning pattern is not “humans out”; it is “humans at the highest-leverage layer” where architecture, decomposition, quality gates, and risk decisions happen. This post synthesizes the NotebookLM research notebook b4a11a4b-0e9f-4615-ad14-96d0c8bee177 and highlights eight shifts changing engineering strategy right now. ...

February 13, 2026 · 4 min · 718 words · Pavel Nasovich

Beyond RAG: The Power of Temporal Memory for AI Agents with Graphiti

Introduction: The Memory Gap in Modern AI Systems The central challenge facing the next generation of advanced AI agents is the “Context Retention Challenge.” While architectures like Retrieval-Augmented Generation (RAG) have given agents access to vast external knowledge bases, they are often architecturally insufficient for dynamic enterprise environments where data, user preferences, and operational context evolve continuously. Traditional RAG systems, often powered by vector databases, are fundamentally reliant on static data sources. This design treats each interaction as an isolated event, preventing the system from building long-term memory or modeling the complex, relational dependencies inherent in the real world. ...

November 26, 2025 · 13 min · 2656 words · Pavel Nasovich

Azure AI Foundry: Building Multi-Agent Systems Without Losing Your Mind (Much)

Hey folks! So, picture this: it’s 3 AM, I’m staring at my fourth attempt to orchestrate multiple AI agents, and my code looks like someone tried to solve the traveling salesman problem with spaghetti. The agents are talking to each other… sometimes. When they feel like it. When Mercury is in retrograde. And then Microsoft drops Azure AI Foundry and promises it’ll solve all my multi-agent orchestration problems. The Problem With Building AI Agents (Or: Why I Started Drinking More Coffee) Let me paint you a picture. You want to build an AI system that actually does something useful. Not just a chatbot that tells you the weather - we’ve all built that demo. I’m talking about real work. Multiple agents, each doing their specialized thing, talking to each other, handling errors, not hallucinating too much. ...

August 27, 2025 · 7 min · 1365 words · Pavel Nasovich

From Mermaid Gantt to Enterprise Roadmaps: A Journey Through Python and Procrastination

Hey folks! So, picture this: it’s a regular day, and someone drops a Mermaid Gantt diagram in my lap. “Can you make this… prettier?” they ask. “You know, for the executives. Make it look professional.” I look at this ASCII-art-meets-timeline monstrosity and think: how hard could it be? Narrator: It was about to become a whole journey. The Problem: When Gantt Charts Aren’t Gantt-y Enough You know Mermaid, right? That thing where you write text and it becomes diagrams? It’s great for documentation. Throw some code in your markdown, boom - instant diagram. But here’s the thing: Mermaid Gantt charts look like… well, they look like what a developer thinks executives want to see. ...

August 25, 2025 · 6 min · 1133 words · Pavel Nasovich