The Declarative Agent Trap

Introduction

In 1986, Fred Brooks argued that the shift from assembly language to high-level programming languages delivered at least a five-fold productivity gain. The AI agent industry is making a similar move right now: replacing imperative Python scripts with declarative text files that describe what an agent should do, not how. A PayPal study found the approach cut development time by 67% and compressed 500-line workflows into 50. The Linux Foundation formed a consortium. Every major vendor shipped a framework. Consensus was reached.

Here is the problem with consensus: it makes people stop thinking about the hard parts. The industry adopted declarative agent definitions because the productivity gains are real. But 36.8% of community-contributed agent skills contain security flaws. Skill-based prompt injection succeeds 95.1% of the time. Declarative files are loaded into context as trusted instructions, so they bypass the guardrails that protect against untrusted input. The same simplicity that lets a product manager write a SKILL.md file lets an attacker write a malicious one.

This article is not an argument against declarative agents. The evidence for them is overwhelming. It is an argument that the industry's current adoption pattern, grabbing the productivity gains while ignoring the security and governance implications, is setting up a wave of failures that will discredit the approach before it matures.

Key Findings

What the Research Shows

The productivity gains are real and large. PayPal's declarative DSL cut development from 48 hours to 16, reduced modification time by 76%, and compressed 500+ lines into under 50.
Adoption outpaced security by 18 months. 60,000+ projects adopted AGENTS.md before the first systematic security audit. Snyk found 36.8% of community skills contain flaws. Cisco scanned 31,000 and found 26% vulnerable.
Skill-based injection is nearly undefendable. SkillJect achieves 95.1% attack success, compared to 10.9% for direct command injection. Skills are loaded as trusted instructions, bypassing existing guardrails.
Version control is governance, not just convenience. When agent behavior lives in text files, every change is a commit with an author and a diff. ISACA and Microsoft now mandate this level of auditability.
Domain experts can author behavior — and attackers can too. The same low barrier that lets a PM write a SKILL.md file let one threat actor upload 677 malicious packages to ClawHub with only a week-old GitHub account.
Every major vendor shipped a framework. Microsoft, Salesforce, IBM, Stanford's DSPy, and Anthropic all have declarative agent tooling. The Linux Foundation formed a consortium to standardize it.
Declarative has hard limits. Complex decision trees, iterative reasoning, and workflows exceeding context windows still require imperative code. The industry knows this but rarely says it.

Detailed Analysis

The Assembly Language of AI

The core principle is old. Declarative programming expresses what you want computed, not how to compute it. SQL is declarative: you describe the data you want. A Python for-loop is imperative: you describe how to fetch it step by step.

Applied to AI agents, this means specifying the inputs, outputs, and constraints of a task rather than scripting prompt templates, API calls, and control flow. DSPy, the Stanford project whose creators compare their work to the shift from assembly to C, embodies this at the framework level. DSPy lets developers write "signatures" (natural-language typed declarations like "consume questions and return answers") instead of crafting prompt strings. An optimizer then automatically refines the underlying prompts by maximizing task-specific metrics.

The results are striking. In one demonstration, a DSPy optimizer raised a ReAct agent's accuracy from 24% to 51% with no human prompt authoring. For comparison, LangChain's codebase in September 2023 contained over 50 strings exceeding 1,000 characters and 54 files dedicated to prompt engineering. DSPy's codebase contains zero hand-written prompts.

IBM's Prompt Declaration Language takes a complementary path: a YAML-based format with 15 block types that handles chat templates, message accumulation, and model interactions declaratively. Salesforce built Agent Script, a DSL that compiles into an "Agent Graph" consumed by its Atlas Reasoning Engine. Microsoft's Foundry workflows offer bidirectional visual and YAML editing where each save creates an immutable version integrated with CI/CD pipelines.

The pattern is consistent across vendors: describe what the agent should accomplish, and let the framework handle execution.

The Productivity Story That Ended the Conversation

PayPal's internal declarative language now processes millions of daily e-commerce interactions across product search, personalization, and cart management. What took 500+ lines of imperative code fits in under 50 lines of specification, with sub-100ms runtime overhead. Numbers like these made the adoption decision feel obvious. Nobody asked what they were trading for the convenience.

Skills in Markdown

The framework layer handles how agents execute. A separate layer, built on Markdown files, handles what agents know how to do.

Anthropic published the Agent Skills specification on December 18, 2025, defining a skill as a directory containing a SKILL.md file: YAML frontmatter for metadata, Markdown body for instructions. The design uses progressive disclosure. Agents load only a skill's name and description at startup, then pull the full content into context when the skill becomes relevant. The recommended body size is under 500 lines with fewer than 5,000 tokens.

Simon Willison called the specification "deliciously tiny." Microsoft, OpenAI, GitHub, Atlassian, Figma, and Cursor adopted it within weeks. Skills written for Claude Code work in GitHub Copilot by copying SKILL.md files between directories.

AGENTS.md serves a complementary role: a project-level README for AI agents, adopted by over 60,000 open-source projects since August 2025. Martin Fowler's team observed that nearly all forms of AI coding context engineering "ultimately involve a bunch of markdown files with prompts."

The distinction between these two layers matters. DSPy operates at the programming level, abstracting how LLM calls are optimized. SKILL.md operates at the knowledge level, encoding what an agent can do in a specific domain. Both are declarative. Both are typically used together.

The practical consequence: domain experts (product managers, compliance officers, customer support leads) can define and modify agent behavior through the same pull request workflow that engineers use for code. dbt's team invested dozens of hours of focused expert work per skill. Mintlify auto-generates skill.md files for documentation sites, regenerating them with every update. Distribution hubs and CLI tools make installation a one-command operation. The barrier to authoring agent behavior is now literacy, not programming skill.

Read that last sentence again. Literacy is the only prerequisite for defining what an AI agent does in production. That is simultaneously the strongest argument for declarative agents and the clearest description of their attack surface.

Version Control as Governance

When agent behavior lives in text files, it inherits the full power of version control.

Every change becomes a commit. Every commit has an author, a timestamp, and a diff. Teams can review behavioral changes through pull requests, roll back to previous versions, and maintain audit trails showing exactly what an agent was instructed to do at any point in time.

This is not a convenience; it is becoming a compliance requirement. ISACA argues that agent logic should be maintained under version control and audited with the same rigor as business logic. Microsoft mandates that actions of every agent must be auditable. Decagon's Agent Versioning system applies CI/CD principles where every agent change is a versioned commit, with product teams using a console for reviewing diffs and rolling back.

The institutional infrastructure is forming to support this. The Agentic AI Foundation, established December 2025 under the Linux Foundation, governs foundational projects including MCP, goose, and AGENTS.md. Platinum members include AWS, Anthropic, Google, Microsoft, and OpenAI.

For regulated industries, this matters enormously. When a regulator asks "what was your AI doing on March 15th?", a git log provides a precise, verifiable answer. Imperative code scattered across application logic cannot offer the same clarity.

The Governance That Exists on Paper

Microsoft's declarative agents must pass Responsible AI validation checks before deployment. Combined with version-controlled definitions, this creates an auditable chain from specification to production. The tooling is there. The question is how many teams outside Microsoft actually use it. Snyk's data suggests the answer is: not nearly enough.

The Security Problem

Ease of authoring creates a corresponding ease of attack. The same simplicity that lets domain experts write agent skills in Markdown also lets attackers craft malicious ones.

Snyk's ToxicSkills study analyzed nearly 4,000 agent skills and found that 13.4% contain critical-level security issues: malware distribution, prompt injection, and exposed secrets. At any severity level, 36.8% have at least one flaw. Cisco's broader scan of 31,000 skills found vulnerabilities in 26%.

The attack surface is novel and potent. The SkillJect framework demonstrated that skill-based prompt injection succeeds 95.1% of the time, compared to only 10.9% for direct command injection. Skills are loaded into context as trusted instructions, so malicious content in a SKILL.md file bypasses the guardrails that protect against untrusted user input. Traditional antivirus cannot detect this: the file is plain text, and malicious commands exist only momentarily during execution.

Community marketplace infrastructure is not yet equipped to handle this. A single threat actor uploaded 677 malicious packages to the ClawHub marketplace, exploiting a minimum barrier of just a one-week-old GitHub account. Cisco responded by releasing an AI Defense Skill Scanner with end-to-end supply chain security. The security community is catching up, but the gap between adoption velocity and security tooling is real.

This is not an argument against declarative definitions. Imperative code has security vulnerabilities too. But the declarative approach concentrates trust in a small number of text files that are easy to modify and hard to audit automatically. Organizations adopting this paradigm need to treat skill files with the same security posture they apply to dependency management: signed sources, automated scanning, and human review for anything that enters production.

Where This Goes

Gartner predicts that over 40% of agentic AI projects will be canceled by end of 2027 due to escalating costs, unclear value, or inadequate controls. Declarative agent definitions directly address two of those three factors. They reduce costs by compressing development cycles. They improve controls by making agent behavior auditable and version-controlled. But "inadequate controls" is doing a lot of work in that sentence, and it is precisely the factor that declarative adoption has not solved.

The approach has clear limits. Complex decision trees, iterative reasoning loops, and workflows that exceed context window constraints (roughly one million tokens today, while enterprise monorepos routinely exceed that) still require imperative code. Microsoft's documentation acknowledges this explicitly. Declarative is a powerful default, not a universal solution.

The trajectory is unmistakable. When Fred Brooks documented the productivity gains of moving from assembly to high-level languages, the industry did not go back. The shift from hand-crafted prompts to declarative text files follows the same logic. But Brooks also noted that the move to high-level languages introduced an entire class of new vulnerabilities that took decades to address. We are at month 18 of a similar transition, with adoption curves moving faster than security tooling. The declarative agent pattern will win. The question is how many breaches happen before the industry treats SKILL.md files with the same rigor it learned to apply to package.json.

Timeline

Key Events

Oct 2023 DSPy paper published by Stanford NLP, introducing declarative signatures and automated prompt optimization.
Oct 2024 IBM releases Prompt Declaration Language, a YAML-based declarative prompt programming framework.
Aug 2025 OpenAI releases AGENTS.md, a Markdown-based project specification for AI agents.
Oct 2025 Microsoft announces Agent Framework with YAML/JSON declarative agent definitions.
Oct 2025 Anthropic launches Skills for Claude with the SKILL.md format.
Dec 2025 Salesforce announces Agent Script powered by Atlas Reasoning Engine.
Dec 2025 Agentic AI Foundation formed under the Linux Foundation with MCP, goose, and AGENTS.md as founding projects.
Dec 2025 Agent Skills published as open standard; adopted by Microsoft, OpenAI, GitHub, Atlassian, Figma, and Cursor.
Dec 2025 PayPal publishes declarative language study showing 67% development time reduction and 10x code compression.
Feb 2026 Snyk ToxicSkills study finds 36.8% of agent skills contain security flaws.
Feb 2026 SkillJect paper demonstrates 95.1% attack success rate via skill-based prompt injection.

The Declarative Agent Trap

What the Research Shows

The Assembly Language of AI

Skills in Markdown

Version Control as Governance

The Security Problem

Where This Goes

Key Events

References

Standards & Official Documentation

Academic Research

Security Research

Industry Analysis & Commentary

Run Your Own Research