As of May 2026, the strongest pattern in AI coding is not “give the agent a bigger context window.” It is the emergence of a controlled agent operating layer around the repository.
That layer has a few recognizable parts: canonical instructions in version control, path-scoped rules near the code they govern, task specs before implementation, bounded subagents, MCP/tool allowlists, sandboxing, audit logs, cost-aware model routing, and a verification loop that does not confuse “the agent says it passed” with evidence.
The practical thesis is simple: agent performance now depends more on context governance, tool control, and verification than on raw model size or raw context volume. Bigger windows help, but unfiltered context also carries stale decisions, wrong assumptions, secrets, prompt-injection payloads, and cost. The winning teams are building memory hygiene and deterministic controls into the SDLC itself.
At a Glance
| Area | 2026 pattern | Failure mode |
|---|---|---|
| Context | Short root instructions plus linked specs, ADRs, and path-local rules | One giant instruction file that rots and crowds out task context |
| AGENTS.md / CLAUDE.md / Copilot instructions | Vendor-neutral root contract bridged into tool-specific files | Duplicated instructions that drift across tools |
| Multi-agent work | One lead agent, bounded subagents, explicit file ownership | Free-form swarms editing the same files |
| MCP and tools | Default-deny allowlist, read-only first, logged calls | Unreviewed servers with broad write tools |
| Security | Agents run like junior developers with shell access, not trusted automata | Production credentials, broad network, no audit trail |
| Token economics | Route expensive models to high-ambiguity work and cheap/local models to routine work | Long-running agents burning frontier tokens on search and formatting |
| Local models | Useful for search, summarization, boilerplate, tests, and migration helpers | Assuming local models replace frontier review everywhere |
| Brownfield codebases | Combine code intelligence, deterministic migration tools, and agent review | Spending tokens on migrations that tools can already perform |
The Context Problem Is Not a Window Problem
The discussion around coding-agent context loss often starts with the right symptom and the wrong remedy. Yes, agents forget. They lose earlier constraints after compaction. They reread files. They follow stale assumptions. They sometimes ignore repository rules that were clearly stated an hour earlier.
But “more context” is only part of the answer. The harder problem is context selection.
OpenAI’s Codex engineering write-up is blunt on this point: a huge AGENTS.md becomes counterproductive because it competes with the task, code, and relevant docs. The better pattern is a short AGENTS.md as a map, with structured docs as the system of record.
Claude Code’s current docs land in the same place from another angle. CLAUDE.md can import files, exist at multiple levels, and reload root project memory after compaction, but Anthropic also tells teams to treat memory files like code: prune them, debug them, and keep them clear enough that rules do not get lost.
GitHub’s Copilot instruction model also points away from a single prompt blob. It supports organization, repository, path-specific, and agent instructions, with path-specific .github/instructions/**/*.instructions.md, .github/copilot-instructions.md, and AGENTS.md all participating in repository customization.
So the 2026 answer is not a bigger prompt. It is a three-layer context system.
Layer 1: The Canonical Repo Contract
Use root AGENTS.md as the portable, vendor-neutral contract. The open AGENTS.md project describes it as a README for agents: a predictable place for context and instructions that help coding agents work in a project.
The format also matters politically. In December 2025, OpenAI, Anthropic, and Block helped launch the Agentic AI Foundation under the Linux Foundation, with AGENTS.md, MCP, and goose as founding contributions. That does not make every tool perfectly interoperable, but it signals that repo-level agent instructions are becoming infrastructure, not a niche preference.
A good root file should stay short:
# AGENTS.md
## Operating rules
- Work on a feature branch.
- Do not push to main.
- Open a PR; do not self-merge.
- Do not edit secrets, production config, migrations, or CI without explicit approval.
## Project map
- Backend: services/api
- Frontend: apps/web
- Shared contracts: packages/contracts
- ADRs: docs/adr
- Feature specs: specs
## Commands
- Install: pnpm install
- Typecheck: pnpm typecheck
- Unit tests: pnpm test
- Lint: pnpm lint
## Definition of done
- Tests added or updated.
- Existing checks pass.
- PR includes risk, rollback, and validation evidence.
This file should not be a project encyclopedia. It is the table of contents and safety contract.
Tool-specific files then bridge into it:
# CLAUDE.md
@AGENTS.md
## Claude Code
- Use plan mode for changes touching auth, billing, or migrations.
- Ask before dependency, infra, or workflow changes.
# .github/copilot-instructions.md
Read AGENTS.md first. Follow the repository workflow and post the validation evidence in the PR.
The point is not to pretend all agents behave the same. The point is to prevent each tool from getting a different version of the project truth.
Layer 2: Path-Scoped Rules
Monorepos need local rules near the code they govern.
/
+-- AGENTS.md
+-- CLAUDE.md
+-- .github/
| +-- copilot-instructions.md
| +-- instructions/
| | +-- frontend.instructions.md
| | +-- database.instructions.md
| | +-- security.instructions.md
+-- apps/web/AGENTS.md
+-- services/api/AGENTS.md
+-- packages/contracts/AGENTS.md
Path-specific rules are where you put facts like:
# apps/web/AGENTS.md
- Use existing components from packages/ui.
- Do not add a new CSS framework.
- Prefer server components unless client state is required.
- Before finishing: pnpm --filter web test && pnpm --filter web lint.
This keeps the root file small and lets frontend, backend, database, and security rules evolve independently. It also matches the way GitHub and Claude now expose scoped instruction surfaces.
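On the GitHub side, each path-scoped file pairs a glob with its rules via applyTo front matter. The mechanism is GitHub's documented one; the database rules below are illustrative:

```markdown
---
applyTo: "services/api/migrations/**,**/*.sql"
---
- Every schema change ships with a reversible down migration.
- Never drop or rename a column without an approved deprecation plan.
- Migrations are tested against seed data before the PR is opened.
```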
Layer 3: Dynamic Work State
Do not put volatile work state into global instructions. Put it in structured artifacts:
specs/<feature>/
+-- spec.md
+-- plan.md
+-- tasks.md
+-- acceptance.md
+-- validation.md
docs/adr/
memory/
backlog.md
This is where the source discussion’s backlog.md instinct is correct. Agents need a durable way to recover what happened before, but not by dumping the whole backlog into every turn.
The research signal is now catching up with the practice. SWE Context Bench frames the problem directly: coding agents need to accumulate, retrieve, and apply prior experience across related repository tasks, and current benchmarks have historically treated tasks too independently.
The operating rule I would use:
Every task begins by loading the relevant spec, plan, ADRs, and nearby instructions. Nothing else gets loaded by default.
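A minimal sketch of that rule as a repo script, assuming the layout above (the script name and paths are illustrative):

```sh
#!/usr/bin/env sh
# load-context.sh <feature> -- print the files an agent should read first.
FEATURE="$1"
# Task state: spec and plan for this feature only.
ls "specs/$FEATURE"/spec.md "specs/$FEATURE"/plan.md 2>/dev/null
# Durable decisions that constrain the change.
ls docs/adr/*.md 2>/dev/null
# Path-local rules near the code being touched.
find apps services packages -maxdepth 2 -name AGENTS.md 2>/dev/null
```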
The Agentic SDLC Loop
The reliable workflow in 2026 looks like this:
```mermaid
flowchart LR
    A["Issue / transcript / requirement"] --> B["Spec"]
    B --> C["Plan"]
    C --> D["Tasks"]
    D --> E["Tests / acceptance checks"]
    E --> F["Implementation"]
    F --> G["Validation evidence"]
    G --> H["PR review"]
    H --> I["Spec / ADR / memory update"]
```
This loop is showing up everywhere under different names.
GitHub Spec Kit calls it specification-driven development: specs become the primary artifact, implementation plans translate intent into code, and acceptance scenarios become tests.
Claude Code recommends the same shape operationally: explore, plan, implement, then commit, with planning used when scope is uncertain or multi-file.
GitHub Copilot cloud agent guidance emphasizes good task descriptions, MCP for additional context, and custom agents for recurring workflows with focused expertise and scoped tools.
The important caveat: tests are necessary, but not sufficient.
UTBoost’s evaluation of SWE-bench found insufficient tests in real benchmark tasks and hundreds of erroneous patches that were incorrectly labeled as passing. That maps directly to production work: a green test suite is not a proof of correctness when tests are incomplete.
So the definition of done for agent work should include the following (a PR-template sketch that encodes it appears after the list):
- Unit, integration, and relevant end-to-end checks.
- Lint and typecheck.
- Changed-files summary with rationale.
- Risk and rollback notes.
- Human review before merge.
- For UI, browser or visual evidence.
- For security-sensitive work, secret scan, dependency scan, and dedicated review.
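A minimal PR-template sketch that encodes the checklist (section names are illustrative):

```markdown
## Summary
Changed files and rationale.

## Risk and rollback
Blast radius, affected consumers, revert plan.

## Validation evidence
- [ ] Unit / integration / e2e output linked
- [ ] Lint and typecheck pass
- [ ] Browser or visual evidence for UI changes
- [ ] Secret scan and dependency scan for security-sensitive work
- [ ] Human reviewer assigned
```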
Multi-Agent Orchestration: Specialists, Not a Swarm
The productive multi-agent pattern is not “launch seven agents and hope.” It is bounded specialization with explicit ownership.
Good roles:
| Role | Owns | Should not own |
|---|---|---|
| Lead / orchestrator | Task plan, branch, integration, final validation | Blindly accepting subagent patches |
| Researcher | Source discovery, code archaeology, constraints | Editing production code |
| Planner | Implementation plan, file ownership, risk list | Unreviewed implementation |
| Backend implementer | API, services, tests in assigned paths | Frontend or infra without handoff |
| Frontend implementer | UI files, browser checks, visual fixes | Backend contract changes without agreement |
| QA agent | Test gaps, reproduction, regression checks | Declaring success from source reading alone |
| Security reviewer | Threats, permissions, secret exposure, dependency risk | Shipping the change |
| Migration agent | Deterministic refactoring tool runs, diff inspection | Rewriting thousands of lines manually with tokens |
Claude’s own best-practice docs now describe parallel sessions, worktrees, writer/reviewer patterns, and allowed-tool fan-out for batch work.
GitHub custom agents are also moving this direction. Agent profiles are Markdown files with YAML front matter, and the tools field controls what the agent can access, including MCP tools. If a custom agent inherits every configured tool, that is a design smell for enterprise use.
A practical agent assignment looks more like this:
role: frontend-implementer
allowed_paths:
- apps/web/**
- packages/ui/**
denied_paths:
- infra/**
- .github/workflows/**
- secrets/**
allowed_tools:
- read
- edit
- shell:test
- playwright-local
requires_approval:
- dependency changes
- migrations
- external network
The two safest orchestration patterns are:
- Single lead, bounded subagents: the lead owns the plan and branch; subagents inspect, test, or propose patches; the lead integrates; CI validates; humans review.
- Parallel worktrees: backend, frontend, and QA agents work in isolated checkouts with disjoint write scopes, then the lead reconciles.
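For the second pattern, the setup is plain git worktree commands (branch names are illustrative):

```sh
# One isolated checkout per implementation agent, with disjoint write scopes.
git worktree add -b feat/x-backend  ../repo-backend
git worktree add -b feat/x-frontend ../repo-frontend
git worktree add -b feat/x-qa       ../repo-qa
# Each agent works only in its own checkout; the lead reconciles and merges.
```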
MCP Is the Tool Plane, So Treat It Like One
MCP is becoming the standard integration plane for agent tools and data. The official specification defines clients and servers exchanging JSON-RPC messages, with servers exposing resources, prompts, and tools, and with authorization support for HTTP transports.
That is powerful. It is also the place where a coding agent crosses from “suggesting text” into “taking action.”
GitHub’s Copilot cloud agent docs say Copilot can use tools from MCP servers autonomously once they are configured, and warn teams to review third-party MCP servers and explicitly restrict the tools field to the necessary tooling.
OWASP’s MCP Security Cheat Sheet is more concrete. It calls out tool poisoning, rug-pull attacks, tool shadowing, supply-chain compromise, message tampering, and sandbox escapes. It also recommends logging MCP tool invocations with parameters, user context, and timestamps.
The enterprise policy should be boring:
mcp_policy:
default: deny
allowed_servers:
- github-readonly
- gitlab-readonly
- sourcegraph-code-search
- playwright-local
- searxng-internal
write_servers:
- github-pr-only
forbidden:
- arbitrary-shell-over-mcp
- unreviewed-public-mcp
- personal-google-drive
- production-database
required_controls:
- owner
- purpose
- data_classification
- allowed_repositories
- allowed_actions
- auth_scope
- logging_location
- review_date
Read-only first. Write tools only where the action is reviewable, reversible, and logged.
Security Model: Junior Developer With Shell Access
The safe mental model is:
A coding agent is a fast junior developer with shell access and unreliable judgment.
That sounds harsh, but it leads to good controls:
- No production credentials.
- No direct production database access.
- No broad organization tokens.
- No unreviewed third-party MCP servers.
- No self-merge.
- No protected-file edits without approval.
- Ephemeral dev environments or containers.
- Branch protections and required review.
- Secret scanning, SAST, dependency scanning, and tests before merge.
- Prompt, tool, shell, file-write, and network-call logs.
OWASP’s Agentic Applications Top 10 exists because autonomous systems now plan, act, and make decisions across workflows. That is exactly what coding-agent pipelines do.
GitHub’s own cloud-agent documentation is a useful example of mature controls: restricted repository scope, branch constraints, secrets only from the dedicated copilot environment, signed commits, session-log links, default firewalling, CodeQL, secret scanning, and dependency analysis. It also states the limitation plainly: generated code still requires review and testing, especially in critical or sensitive applications.
The firewall is a mitigation, not a total boundary. GitHub documents that the cloud-agent firewall applies to processes started via the agent’s Bash tool and does not cover MCP servers or configured setup steps, so MCP governance and setup-step review still matter.
For regulated or critical-infrastructure environments, OWASP should not be the only reference point. NIST’s AI RMF is broader governance scaffolding for trustworthiness and risk management, and NIST started work in April 2026 on an AI RMF profile for trustworthy AI in critical infrastructure. That matters because enterprise coding agents are not just developer tools once they can read repositories, call tools, and affect release pipelines.
Hooks are useful, but they are not magic. Claude Code hooks run commands with the user’s full permissions, so they must be reviewed like any other script with filesystem access.
The correct split is:
| Mechanism | Use it for |
|---|---|
| Instructions | Intent and conventions |
| Skills | Repeatable procedures |
| Hooks | Deterministic enforcement and audit |
| CI | Objective validation |
| Human review | Accountability |
Do not rely on a prompt for a rule that can be enforced by a hook, script, branch protection rule, or CI check.
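As a concrete example, Claude Code can enforce the protected-file rule with a PreToolUse hook in .claude/settings.json. The hook schema and the exit-code-2 blocking behavior are from Anthropic's hooks docs; the script path is illustrative:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "./scripts/block-protected-paths.sh" }
        ]
      }
    ]
  }
}
```

The script receives the tool call as JSON on stdin and exits with code 2 to block it:

```sh
#!/usr/bin/env sh
# Hook input arrives as JSON on stdin; exit code 2 blocks the tool call
# and returns stderr to the agent.
FILE=$(jq -r '.tool_input.file_path // empty')
case "$FILE" in
  secrets/*|infra/*|.github/workflows/*)
    echo "Blocked: $FILE is a protected path; request approval" >&2
    exit 2
    ;;
esac
```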
Token Economics Changed the Architecture
The cost discussion is no longer theoretical. GitHub announced that all Copilot plans move to usage-based billing on June 1, 2026, replacing premium request units with GitHub AI Credits calculated from input, output, and cached tokens using listed model API rates.
That changes agent design.
A quick chat and a multi-hour autonomous session cannot be treated as the same unit of work anymore. Long-running agents that reread the repo, summarize logs repeatedly, spawn subagents casually, and run frontier models for mechanical tasks will turn into visible cost centers.
The practical model-routing policy:
| Use frontier models for | Use cheaper or local models for |
|---|---|
| Ambiguous architecture | File discovery |
| Root-cause analysis across modules | Log summarization |
| Security-sensitive changes | Formatting and markdown cleanup |
| API and data-model design | Boilerplate generation |
| Migration planning | Search over known code patterns |
| Final PR review | Mechanical migrations with deterministic tools |
Caching helps, but only if the cached prefix is stable. Anthropic’s prompt-caching docs show the economics clearly: cache writes cost more than base input, cache reads cost less, and the default cache lifetime is short unless you explicitly pay for longer duration.
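In Anthropic's API the split is explicit: stable blocks carry a cache_control marker while the volatile task stays uncached. A sketch (the model name is a placeholder; check current pricing and cache lifetimes):

```sh
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "content-type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5",
    "max_tokens": 1024,
    "system": [
      {"type": "text",
       "text": "<stable coding standards and architecture summary>",
       "cache_control": {"type": "ephemeral"}}
    ],
    "messages": [{"role": "user", "content": "<volatile task prompt>"}]
  }'
```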
So context strategy becomes cost strategy:
- Cache stable tool schemas, system prompts, architecture summaries, and coding standards.
- Do not cache volatile task state, timestamps, failed assumptions, or giant unfiltered documents.
- Track cost per PR, cost per accepted change, and cost per failed run.
- Limit subagent fan-out.
- Restart or compact around task boundaries instead of carrying stale context forever.
Local Models Are Useful, But Not a Religion
Local execution is now practical for many auxiliary agent tasks.
llama.cpp supports local inference with an OpenAI-compatible server, GGUF models, quantization from very low bit widths through 8-bit, and hardware backends including Metal, CUDA, Vulkan, SYCL, and CPU/GPU hybrid execution.
Open-weight coding models are also becoming more agent-shaped. The Qwen3-Coder-Next technical report describes an open-weight coding model trained for coding-agent workflows, with efficient active-parameter inference and executable-environment training. Those are vendor and paper claims, but the direction is clear: local models are no longer only autocomplete toys.
The best enterprise use is not “replace frontier models.” It is measured routing.
Create an internal benchmark:
| Task type | Count |
|---|---|
| Real bug fixes | 20 |
| Code search and navigation | 20 |
| Test generation | 10 |
| Refactoring | 10 |
| Documentation and summarization | 10 |
Measure pass rate, human edit distance, wall time, cost, and security violations. Then route by observed performance in your repo, not by leaderboard reputation.
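The output of that benchmark can then be encoded as a routing policy. A hypothetical sketch (file name, model labels, and triggers are all illustrative):

```yaml
# model-routing.yaml (hypothetical)
default: local-coder            # e.g., a GGUF model served by llama.cpp
routes:
  - tasks: [architecture, security-review, migration-planning, final-pr-review]
    model: frontier
  - tasks: [code-search, summarization, formatting, boilerplate]
    model: local-coder
escalate:
  when: [low-confidence, tests-failed-twice]
  to: frontier
```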
Brownfield Codebases Need Tools, Not Tokens
For 10-year-old enterprise systems, the agent should not be the thing that rewrites everything by hand. It should be the orchestrator around code intelligence, deterministic migration tools, tests, and review.
The minimum stack for brownfield work:
| Need | Better primitive | Agent role |
|---|---|---|
| Cross-repo discovery | Sourcegraph, ripgrep, language servers, code indexes | Find relevant files, summarize architecture, explain risk |
| Mechanical Java migrations | OpenRewrite recipes | Select recipe, run it, inspect diffs, fix residual failures |
| .NET modernization | GitHub Copilot modernization chat agent / Visual Studio tooling | Produce plan, apply targeted fixes, validate each commit |
| Large refactors | Batch-change tooling and worktrees | Split ownership, run tests, reconcile conflicts |
| Regression safety | CI, generated tests, production traces | Expand checks before trusting the patch |
OpenRewrite is the clearest example: its Java 21 migration guide shows an automated recipe path from Java 17 to Java 21, with Gradle and Maven integration. That is exactly the kind of deterministic transformation an agent should invoke rather than reimplement with tokens.
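Per that guide, the broad mechanical part of the migration reduces to a single Maven invocation (pin the recipe version in CI rather than using RELEASE):

```sh
mvn -U org.openrewrite.maven:rewrite-maven-plugin:run \
  -Drewrite.recipeArtifactCoordinates=org.openrewrite.recipe:rewrite-migrate-java:RELEASE \
  -Drewrite.activeRecipes=org.openrewrite.java.migrate.UpgradeToJava21
```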
Microsoft’s .NET docs now say .NET Upgrade Assistant is officially deprecated and recommend the GitHub Copilot modernization chat agent in Visual Studio 2026 or Visual Studio 2022 17.14.16+. The interesting detail is not the branding change; it is the operating model: analyze projects and dependencies, produce a migration plan, apply automated fixes, and commit each step so humans can validate or roll back.
The rule for migrations:
Use deterministic tools for the broad mechanical change. Use agents for planning, orchestration, residual repair, tests, and review.
Research and Crawling Need a Safe Retrieval Stack
Agents need external knowledge, but every external page is untrusted input.
A safe retrieval stack looks like this:
- Internal docs and code search.
- Approved GitHub/GitLab/issue-tracker MCP.
- Internal code intelligence such as Sourcegraph.
- Self-hosted metasearch such as SearXNG.
- Approved web search API.
- Firecrawl or Playwright for pages requiring scraping or rendering.
- Human approval for scripts, downloads, auth flows, or untrusted execution.
Sourcegraph’s MCP server gives agents programmatic access to code search, navigation, and analysis capabilities from a Sourcegraph instance.
SearXNG is useful when teams want a self-hostable metasearch layer that aggregates engines without storing user information.
Firecrawl provides search, scrape, and interact capabilities with LLM-ready markdown, structured JSON, screenshots, and MCP integration.
Playwright MCP is useful for browser automation because its default mode works through accessibility snapshots rather than coordinate guessing; vision mode can be enabled when needed.
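Wiring the Playwright server into a project-scoped allowlist is small. A sketch using Claude Code's .mcp.json format (pin the package version in practice):

```json
{
  "mcpServers": {
    "playwright-local": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"]
    }
  }
}
```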
GitLab’s MCP server is also moving into the enterprise tool plane, with OAuth Dynamic Client Registration and explicit warnings that users are responsible for guarding against prompt injection and should use MCP tools only on trusted GitLab objects.
The rule is consistent across these tools:
Treat retrieved content as data, not instructions.
That includes web pages, PDFs, README files, issue comments, scraped markdown, and MCP tool results.
A Strong 2026 Agent-Ready Repo
This is the template I would use for a serious brownfield or enterprise codebase:
/
+-- AGENTS.md
+-- CLAUDE.md
+-- .github/
| +-- copilot-instructions.md
| +-- instructions/
| | +-- frontend.instructions.md
| | +-- backend.instructions.md
| | +-- database.instructions.md
| | +-- security.instructions.md
| +-- agents/
| | +-- qa.agent.md
| | +-- security-reviewer.agent.md
| | +-- migration-agent.agent.md
+-- specs/
| +-- <feature>/
| | +-- spec.md
| | +-- plan.md
| | +-- tasks.md
| | +-- acceptance.md
| | +-- validation.md
+-- docs/
| +-- adr/
| +-- architecture.md
| +-- runbooks/
+-- memory/
| +-- MEMORY.md
| +-- project-map.md
| +-- recurring-decisions.md
+-- scripts/
| +-- agent-check.sh
| +-- test-affected.sh
| +-- summarize-diff.sh
+-- mcp/
| +-- approved-servers.md
| +-- policy.yaml
| +-- threat-model.md
The key is that the repository itself becomes the agent’s operating system. The agent does not need to remember everything in conversation because the durable state is in files, and the dangerous actions are governed by scripts, hooks, CI, branch protections, and review.
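The scripts/agent-check.sh gate referenced above might be as simple as this sketch (the commands mirror the root AGENTS.md; adjust to the real toolchain):

```sh
#!/usr/bin/env sh
# Single gate an agent must pass before opening a PR.
set -e
pnpm lint
pnpm typecheck
pnpm test
# Changed-files summary for the PR's validation evidence.
git diff --stat origin/main...HEAD
```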
Cross-Validation Table
| Claim | Sources that agree | Caveat |
|---|---|---|
| Repo-level instructions are now a core primitive | GitHub Copilot docs, OpenAI Codex guidance, Claude memory docs, AGENTS.md format | Keep them short; nested and path-scoped files beat monoliths. |
| Correct reusable context improves agents | SWE Context Bench, Claude memory docs, GitHub instruction surfaces | Incorrect or unfiltered context can be neutral or harmful. |
| Spec-driven development is the safer alternative to vibe coding | GitHub Spec Kit, Claude best practices, GitHub cloud-agent workflows | Specs still need review; vague specs become another failure point. |
| Multi-agent work needs role and tool boundaries | Claude parallel-session guidance, GitHub custom agents | Full autonomy remains unreliable for production-quality work. |
| MCP is powerful but high-risk | MCP spec, GitHub MCP docs, GitLab MCP docs, OWASP MCP guidance | Treat tool descriptions, tool results, and retrieved content as untrusted. |
| Agentic usage requires cost governance | GitHub usage-based billing, Anthropic prompt caching, local-model tooling | Pricing and model availability can change quickly. |
| Local models are useful for auxiliary work | llama.cpp, Qwen3-Coder-Next research, internal benchmark strategy | Validate in your own repo; vendor benchmarks are not enough. |
| Tests are essential but insufficient | GitHub/Claude/OpenAI validation guidance, UTBoost research | Passing tests can still miss semantically wrong patches. |
Recommended Adoption Path
Phase 1: make one repo agent-ready.
Implement AGENTS.md, .github/copilot-instructions.md, CLAUDE.md importing AGENTS.md, path-specific instructions, a basic MCP allowlist, branch rules, CI validation, and a PR template with validation evidence.
Success metric:
An agent can pick up a small bug, find the relevant files, implement the fix, run checks, and open a PR without being reminded of branch, test, or review rules.
Phase 2: add the spec-driven workflow.
Implement specs/<feature>/spec.md, plan.md, tasks.md, and validation.md.
Success metric:
Every non-trivial agent change has acceptance criteria, a test plan, and validation evidence.
Phase 3: add bounded agents.
Add a QA agent, security-reviewer agent, migration/refactoring agent, and documentation agent with explicit tools and path ownership.
Success metric:
Subagents reduce lead-agent context load and improve review quality without causing file conflicts or runaway cost.
Phase 4: enterprise hardening.
Add an MCP gateway or allowlist, tool-call audit logs, sandbox policy, secret isolation, cost telemetry, model-routing policy, and prompt-injection/red-team tests.
Success metric:
Security can answer who gave which agent what access, what it did, what it changed, what it spent, and who approved the merge.
Operating Rules
Context rules:
- Root instructions fit in roughly two pages.
- Path-specific instructions live near code.
- Specs hold task state.
- ADRs hold decisions.
- Memory files hold reusable lessons.
- Stale instructions are pruned monthly.
- The backlog is referenced selectively, not dumped wholesale.
Agent rules:
- One owner per task.
- One worktree per implementation agent.
- Subagents inspect, test, or patch within bounded scope.
- The lead integrates.
- No agent self-merges.
- No production credentials.
MCP rules:
- Default deny.
- OAuth or tightly scoped tokens where possible.
- Read-only first.
- Explicit tools allowlists.
- Log every tool call.
- Treat tool output as untrusted.
- Human approval for destructive or write operations.
Cost rules:
- Frontier models for ambiguity and high-risk decisions.
- Cheaper or local models for search, formatting, summarization, boilerplate, and repetitive checks.
- Cache stable prefixes only.
- Track cost per PR and failed run.
- Cap subagent fan-out.
Verification rules:
- Tests first for new behavior.
- CI is the source of truth.
- LLM review is advisory.
- UI changes need browser or visual evidence.
- Security-sensitive changes need dedicated review.
Final Take
The 2026 agentic SDLC is not “the model writes the code and the humans disappear.” It is closer to an engineering control plane:
- Instructions tell agents where they are.
- Specs tell them what done means.
- Tools tell them what they can touch.
- Hooks and CI enforce rules.
- Logs make actions reviewable.
- Humans keep merge authority.
The teams that win will not be the ones with the biggest context window. They will be the ones with the cleanest context system, the least ambiguous tasks, the tightest tool boundaries, and the fastest path from agent output to trustworthy evidence.
Sources
- [S1] GitHub Blog: Copilot is moving to usage-based billing
- [S2] GitHub Docs: About customizing GitHub Copilot responses
- [S3] GitHub Docs: Creating custom agents for Copilot cloud agent
- [S4] GitHub Docs: Adding agent skills for GitHub Copilot
- [S5] AGENTS.md open format
- [S6] OpenAI: Harness engineering - leveraging Codex in an agent-first world
- [S7] Claude Code Docs: How Claude remembers your project
- [S8] Claude Code Docs: Best practices
- [S9] GitHub Docs: MCP and Copilot cloud agent
- [S10] Claude Code Docs: Parallel sessions and worktrees
- [S11] Model Context Protocol specification overview
- [S12] SWE Context Bench
- [S13] UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench
- [S15] OWASP MCP Security Cheat Sheet
- [S16] OWASP Top 10 for Agentic Applications 2026
- [S17] Claude Code Docs: Hooks security considerations
- [S18] Anthropic Docs: Prompt caching
- [S19] llama.cpp
- [S20] Qwen3-Coder-Next Technical Report
- [S21] OpenAI: Agentic AI Foundation
- [S22] GitHub Spec Kit: Specification-Driven Development
- [S23] Sourcegraph MCP Server
- [S24] SearXNG documentation
- [S25] Firecrawl documentation
- [S26] Playwright MCP vision mode
- [S27] GitLab MCP server
- [S28] DeerFlow GitHub repository
- [S29] Hermes Agent terminal backend documentation
- [S30] OpenClaw security model documentation
- [S31] GitHub Docs: Responsible use of Copilot cloud agent
- [S32] GitHub Docs: Customizing the Copilot cloud-agent firewall
- [S33] OpenRewrite Docs: Migrate to Java 21
- [S34] Microsoft Learn: .NET Upgrade Assistant overview
- [S35] NIST AI Risk Management Framework
- [S36] NIST Concept Note: AI RMF Profile for Critical Infrastructure