Skill Architecture Patterns — Lessons from Open-Source Repos

Structural patterns for how skills are organized, packaged, and composed — extracted from Anthropic’s official skills repo and ComposioHQ’s awesome-claude-skills.

Data sources:


1. The Canonical Skill Structure

Both repos converge on this minimal structure:

skill-name/
├── SKILL.md          # REQUIRED: YAML frontmatter + instructions
├── LICENSE.txt        # OPTIONAL: per-skill license
├── scripts/           # OPTIONAL: helper scripts (Python, Bash, JS)
├── reference/         # OPTIONAL: domain knowledge files
├── templates/         # OPTIONAL: output templates
├── examples/          # OPTIONAL: example inputs/outputs
├── agents/            # OPTIONAL: sub-agent definitions
├── assets/            # OPTIONAL: fonts, images, schemas
└── themes/            # OPTIONAL: pre-built configurations

SKILL.md Anatomy

---
name: lowercase-hyphenated-name
description: One line explaining WHEN to activate and WHAT it does
license: "Complete terms in LICENSE.txt" | "Proprietary" | omitted
---

# Skill Title (optional — some skills skip this)

[Body: instructions, workflows, constraints, examples]

Key Design Decisions

Decision Anthropic Official Community
name format lowercase-hyphenated lowercase-hyphenated (enforced by CONTRIBUTING.md)
description purpose Trigger condition for Claude Trigger condition (same intent)
license field 15/17 skills include it ~10% include it
Body size 30-590 lines 30-500 lines
Sections Varies by complexity 5-section template recommended

2. Description Writing — The Trigger Mechanism

The description field is the most important line in any skill. It determines when Claude activates the skill. Analysis of 50+ descriptions reveals distinct patterns:

Pattern A: Action + Domain + Trigger Keywords

description: >-
  Guide for creating high-quality MCP servers that enable LLMs to
  interact with external services. Use when building MCP servers to
  integrate external APIs.

Structure: {what it does}. {when to use it}.

Pattern B: Capability List

description: >-
  Toolkit for interacting with and testing local web applications
  using Playwright. Supports verifying frontend functionality,
  debugging UI behavior, capturing browser screenshots, and viewing
  browser logs.

Structure: {primary capability}. Supports {capability}, {capability}, {capability}.

Pattern C: Negative Boundary

description: >-
  Suite of tools for creating elaborate HTML artifacts using React,
  Tailwind CSS, shadcn/ui. Use for complex artifacts requiring state
  management — not for simple single-file HTML artifacts.

Structure: {what it does}. Use for {scope} — not for {anti-scope}.

Pattern D: Emotional Hook (Community)

description: >-
  Automatically creates user-facing changelogs from git commits.
  Turns hours of manual changelog writing into minutes of automated
  generation.

Structure: {what it does}. {value proposition — time saved}.

Best Practice

Combine A + C: State the action, the domain, the trigger condition, AND the negative boundary.

description: >-
  Migrate .NET Framework projects to .NET Core multi-targeting.
  Use when the user says "start migration" or provides a .csproj path.
  NOT for greenfield .NET projects or projects already on .NET Core.

3. The Five-Section Community Template

From awesome-claude-skills/CONTRIBUTING.md — the recommended sections for any skill:

# {Skill Name}

## When to Use
- [Trigger conditions — bullet list]
- Keywords: [comma-separated trigger words]

## What This Skill Does
- [Capabilities — bullet list]
- [Limitations — explicit boundaries]

## How to Use
### Basic Usage
[Simplest invocation]

### Advanced Usage
[With options/flags]

## Example
[Complete, realistic input → output]

## Tips
- [Power user advice]
- [Common pitfalls]
- [Related workflows]

Analysis: Which Sections Matter Most?

Section Impact on Quality Prevalence
How to Use (steps) Critical — provides the execution path ~90% of skills
Example (populated) High — grounds the output format ~65% of skills
When to Use (scope) Medium — prevents mis-activation ~40% of skills
Tips (pitfalls) Medium — prevents common errors ~50% of skills
What This Skill Does Low — redundant with description ~30% of skills

4. Marketplace Packaging

Anthropic’s marketplace.json groups 17 skills into 3 installable packages:

{
  "plugins": [
    {
      "name": "document-skills",
      "skills": ["xlsx", "docx", "pptx", "pdf"]
    },
    {
      "name": "example-skills",
      "skills": ["algorithmic-art", "brand-guidelines", "canvas-design", ...]
    },
    {
      "name": "claude-api",
      "skills": ["claude-api"]
    }
  ]
}

Grouping Strategy

Strategy When to Use Example
By domain Skills share resources (schemas, helpers) document-skills: xlsx + docx + pptx + pdf
By audience Skills target same user type example-skills: creative + testing + building
Standalone Skill is self-contained and specialized claude-api

Resource Sharing Within Packages

The document-skills package shares:

This reduces total package size vs. 4 independent skills.


5. Script Integration Patterns

Pattern A: Self-Documenting Script (–help)

Always run `python scripts/with_server.py --help` before first use.
Use the script as a black box — do not modify it.

Used by: webapp-testing, skill-creator

Pattern B: Init + Build Pipeline

1. Run `bash scripts/init-artifact.sh` (creates project structure)
2. [User develops in the created structure]
3. Run `bash scripts/bundle-artifact.sh` (produces single output file)

Used by: web-artifacts-builder, artifacts-builder

Pattern C: Validation Script

Before packaging, run validation:
`python scripts/package_skill.py --validate`
Checks: frontmatter present, description non-empty, no broken references

Used by: skill-creator

Pattern D: Data Processing Script

Use bundled scripts for file processing:
- `scripts/read_pdf.py` — extract text from PDF
- `scripts/create_pdf.py` — generate PDF with reportlab
- `scripts/merge_pdfs.py` — combine multiple PDFs

Used by: pdf, docx, pptx, xlsx


6. Reference File Organization

Flat References (simple skills)

mcp-builder/
├── SKILL.md
└── reference/
    ├── mcp_best_practices.md
    ├── node_mcp_server.md
    ├── python_mcp_server.md
    └── evaluation.md

Hierarchical References (complex skills)

claude-api/
├── SKILL.md
├── shared/           # Cross-language
│   ├── models.md
│   ├── prompt-caching.md
│   └── tool-use-concepts.md
├── python/
│   └── claude-api/
│       ├── README.md
│       ├── streaming.md
│       └── tool-use.md
├── typescript/
│   └── claude-api/
│       ├── README.md
│       └── ...
└── [5 more languages]

Theme/Variant References

theme-factory/
├── SKILL.md
└── themes/
    ├── arctic-frost.md
    ├── ocean-depths.md
    ├── sunset-boulevard.md
    └── [7 more themes]

Key Insight: Reference Files as Progressive Disclosure Tier 3

All reference files follow the Progressive Disclosure pattern (Pattern 100):


7. Sub-Agent Architecture

Only 2 of 49 non-Composio skills use sub-agents, both from Anthropic:

skill-creator: 3 Sub-Agents

Agent Role When Spawned Memory
grader Evaluate test results After each eval run Isolated (no skill context)
comparator Blind A/B comparison After two versions run Isolated (doesn’t know version labels)
analyzer Extract improvements After comparison Has both transcripts + results

doc-coauthoring: 1 Sub-Agent

Agent Role When Spawned Memory
naive-reader Fresh-eyes review After document draft Isolated (only has the document)

Key Architectural Rule

Sub-agents get LESS context, not MORE. Their value comes from what they DON’T know:


8. Composio’s Universal Template

832 skills use a near-identical template with only 3 variables:

Variable Example Values
{service_name} “slackbot”, “atlassian”, “googledrive”
{toolkit_name} “SLACKBOT”, “ATLASSIAN”, “GOOGLEDRIVE”
{specific_tools} Varies — some skills list tools, most don’t

Template Quality Tiers

Tier Count Characteristic
Generic ~780 3-step (Search → Connect → Execute), no specific tools listed
Enriched ~50 Lists specific tool names, parameter tables, common patterns
Custom ~2 Fully custom content beyond the template

Lesson: Templates Enable Scale, Not Quality

Composio achieved 832 skills by templating. But the generic skills are low-quality — they provide no domain-specific guidance. The enriched variants (like googledocs-automation) that list specific tools and pitfalls are significantly more useful.

For our plugin: Templates are useful for generating boilerplate (PR descriptions, branch names) but the high-value content (error handling rules, fix strategies) must be hand-crafted per domain.


9. Cross-Repo Comparison Summary

Dimension Anthropic Official Community Our Plugin (migration plugin)
Skills count 17 32 unique + 832 template 21 skills + 3 agents + 4 commands
Avg SKILL.md size ~200 lines ~120 lines ~150 lines
Progressive Disclosure Explicit 3-tier Rare Explicit (<always-loaded> / <on-demand>)
Sub-agents 2 skills use them 0 skills 3 agents (build-repair, ado, dependency)
Script bundling 5 skills (Python, Bash) 0 skills 2 skills (Python)
Eval framework Built-in (skill-creator) None dev-prompt-review (custom)
Safety boundaries Provider guards, anti-slop Minimal Extensive (7 guardrails, 4 tiers, 99 patterns)
Domain knowledge SDK docs, font catalogs Algorithm internals, SaaS APIs Error handling rules, package maps, project maps
Output verification Visual QA sub-agent None Build loop (7 attempts)

What We Can Learn From Each

From Anthropic Official:

From Community:

From Composio: