Bundling Scripts and Reference Docs in a Skill

A skill is more than a SKILL.md file. The folder around it can carry executable scripts, reusable templates, and deep-dive reference docs the agent loads on demand. This guide covers how to organise those assets so the skill stays small at routing time and powerful at invocation time.

Past the simplest “one-file skill,” every well-built skill grows assets around its SKILL.md: helper scripts, reusable templates, deeper reference documents the agent reads on demand. Done well, this lets the skill stay tiny at routing time — only name + description — and rich at invocation time, with the body pointing the agent at exactly the assets it needs for the current job.

Done badly, you end up with a 900-line SKILL.md that the agent reads every time it’s invoked, eating context for things it’ll never reach. This guide describes the layout that scales consistently.

The standard folder layout

```text
my-skill/
├── SKILL.md            # manifest + body, read at invocation
├── scripts/            # optional: executables called by the agent
│   ├── extract.py
│   └── format.js
├── templates/          # optional: text snippets the body references
│   ├── email.txt
│   └── report.md
└── reference/          # optional: deep docs the agent reads on demand
    ├── api-spec.md
    └── edge-cases.md
```

Three things to notice:

  • Only SKILL.md is read by default at invocation. Everything else is referenced from the body and loaded lazily.
  • Subfolders are conventional, not magic. The agent doesn’t auto-discover scripts — the body has to point at them.
  • The folder name doubles as the skill name in the manifest. Keep them in sync.

scripts/ — code the agent runs

If your skill needs to do something deterministic — parse a file, transform data, hit an API — write it as a script in scripts/ and tell the agent how to run it from the body of SKILL.md:

````markdown
## Extracting tables from PDFs

Use the bundled extractor:

```bash
python scripts/extract.py --pdf {input_path} --out {output_path}
```

It writes one CSV per page. Errors go to stderr.
````

Document the exact command. The agent will copy it nearly verbatim. Spell out the flags, the input/output expectations, and what success looks like (a file on disk? a number on stdout?). Ambiguity here costs you reliable invocations.
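On the script side, it helps if the command-line interface matches the documented command exactly. A minimal sketch of how extract.py might declare its flags with argparse; treating --out as a directory is an assumption based on the "one CSV per page" note, and the help strings are illustrative:

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Declare the same flags that SKILL.md tells the agent to pass."""
    parser = argparse.ArgumentParser(
        prog="extract.py",
        description="Extract tables from a PDF, writing one CSV per page.",
    )
    # Both flags are required so a half-formed command fails immediately.
    parser.add_argument("--pdf", required=True, type=Path, help="path to the input PDF")
    # Assumption: --out names a directory that receives the per-page CSV files.
    parser.add_argument("--out", required=True, type=Path, help="directory for the CSV files")
    return parser


args = build_parser().parse_args(["--pdf", "report.pdf", "--out", "tables"])
print(args.pdf, args.out)
```

Because both flags are marked required, argparse exits with a usage message on stderr when either is missing, which gives the agent something concrete to correct.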

Language choice

Python and Node are the lowest-friction choices because most agent runtimes have them. If you reach for Bash, Go, or Rust, document the runtime requirement and check for it at the start of the script.
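The same preflight idea applies in any language. Here is a rough sketch in Python, kept in that language for consistency with the bundled extract.py. The minimum version and the pdftotext dependency are illustrative assumptions, not requirements stated by this guide:

```python
import shutil
import sys

MIN_PYTHON = (3, 9)             # assumed minimum interpreter version
REQUIRED_TOOLS = ("pdftotext",)  # hypothetical external dependency


def missing_requirements(min_python=MIN_PYTHON, tools=REQUIRED_TOOLS):
    """Return a list of human-readable problems; an empty list means all clear."""
    problems = []
    if sys.version_info[:2] < tuple(min_python):
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ is required")
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' was not found on PATH")
    return problems
```

A script would call missing_requirements() before doing any real work, print each problem to stderr, and exit non-zero if the list is not empty.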

Error handling

Scripts should fail loud, not silent. Exit with a non-zero code, write the error message to stderr, and ideally add a one-line hint about how to fix it. The agent reads stderr and will surface useful messages back to the user.
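A minimal sketch of that pattern, assuming the script receives its input path via the --pdf flag from the earlier command. The function name check_input and the message wording are hypothetical:

```python
import sys
from pathlib import Path


def check_input(path_text: str) -> int:
    """Return 0 if the input file is usable, otherwise a non-zero exit code."""
    path = Path(path_text)
    if not path.is_file():
        # Error first, then a one-line hint, both on stderr.
        print(f"error: input file not found: {path}", file=sys.stderr)
        print("hint: pass the path to an existing PDF via --pdf", file=sys.stderr)
        return 1
    return 0


# A real script would hand the code to the shell, for example:
#     raise SystemExit(check_input(args.pdf))
```

Returning the exit code instead of calling sys.exit deep inside the function keeps the check easy to test while still ending in a non-zero exit at the top level.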

templates/ — text snippets the body reuses

Anything that’s “fill in the blanks” — a reusable email body, a structured report, a JSON skeleton — belongs in templates/:

```markdown
## When the user wants a status email

Use templates/status-email.txt as the starting point. Replace
{{project}} and {{eta}} with the values gathered earlier in the
conversation. Don't paraphrase the body; the wording is approved.
```

The point of templates is consistency. If the same skeleton needs to come out 50 times across different runs, storing it as a template and referencing it from the body beats regenerating it each time.
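If you would rather have a script do the substitution deterministically than leave it to the agent, a small helper is enough. This is a sketch under the assumption that placeholders use the {{name}} form shown above; the render function and the sample values are hypothetical:

```python
import re

# Matches placeholders of the form {{name}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render(template: str, values: dict) -> str:
    """Fill every {{name}} from values; refuse to emit a half-filled template."""
    missing = sorted({name for name in PLACEHOLDER.findall(template) if name not in values})
    if missing:
        raise KeyError(f"no value for placeholder(s): {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda match: str(values[match.group(1)]), template)


template = "Project {{project}} is on track. Expected delivery: {{eta}}."
print(render(template, {"project": "Atlas", "eta": "Friday"}))
# prints: Project Atlas is on track. Expected delivery: Friday.
```

Raising on a missing value is a deliberate choice here: a template that goes out with a literal {{eta}} in it defeats the consistency the template was meant to provide.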

reference/ — docs the agent reads on demand

The deepest tier. Stuff that’s too long to put in SKILL.md but too important to leave out:

  • API specs (“here’s every endpoint, with shape, with example”)
  • Edge case catalogues (“when input has Unicode in the keys, do X”)
  • Worked examples (“here’s a full transcript of a real invocation”)

From the body, point the agent at the file:

```markdown
## Edge cases

If the input file has more than 10,000 rows, switch strategies.
Read reference/large-files.md for the full procedure.
```

The agent will fetch reference/large-files.md only when the body tells it to. That keeps the default invocation lightweight.

Why this matters. Every token the agent reads is a token spent. Lazy-loading reference material means a skill can carry 30 pages of edge-case handling without paying for it on the 95% of invocations that don’t hit those edges.

What does not belong in the folder

  • Secrets. Skills are typically committed to source control. Don’t bundle API keys — read them from the environment instead.
  • Large binaries. A 50MB ML model in scripts/ bloats the install. Fetch it on first use, cache it, and add the cache to .gitignore.
  • Vendored dependencies. Use a requirements.txt / package.json and let the runtime install. Don’t commit node_modules.
  • Old versions of SKILL.md. Source control handles history. Don’t keep SKILL.old.md alongside the live one.
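For the secrets rule, one way to read a key from the environment while still failing loudly when it is absent. The variable name MY_SKILL_API_KEY used in the usage note is a placeholder, not a convention defined anywhere in this guide:

```python
import os
import sys


def require_env(name: str) -> str:
    """Return the value of an environment variable, or exit with a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        print(f"error: environment variable {name} is not set", file=sys.stderr)
        print(f"hint: export {name}=<your key> before invoking the skill", file=sys.stderr)
        raise SystemExit(1)
    return value
```

A script would call something like require_env("MY_SKILL_API_KEY") once at startup, so a missing key is reported before any work is done and never needs a fallback value committed to the repository.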

Final checklist

  • Folder name = name: in frontmatter
  • SKILL.md body points at every asset it expects to use, with exact paths
  • Scripts have entry-point comments + clear errors
  • Reference docs are referenced, not loaded by default
  • No secrets, no large binaries, no vendored dependencies
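Parts of this checklist can be checked mechanically. The sketch below covers only the first two items, and it rests on two assumptions: that the frontmatter carries a plain `name:` line, and that assets live under the three conventional subfolders. The function name lint_skill is hypothetical:

```python
import re
from pathlib import Path

# Assumption: assets are referenced as scripts/..., templates/... or reference/...
ASSET_PATH = re.compile(r"\b(?:scripts|templates|reference)/[\w./-]+")
NAME_FIELD = re.compile(r"^name:\s*(.+?)\s*$", re.MULTILINE)


def lint_skill(skill_dir: Path) -> list:
    """Return a list of checklist violations found in one skill folder."""
    problems = []
    text = (skill_dir / "SKILL.md").read_text(encoding="utf-8")

    # Check 1: folder name matches the name: field.
    folder = skill_dir.resolve().name
    match = NAME_FIELD.search(text)
    if match is None:
        problems.append("no `name:` field found in SKILL.md")
    elif match.group(1).strip("\"'") != folder:
        problems.append(f"name: is {match.group(1)!r} but the folder is {folder!r}")

    # Check 2: every asset path mentioned in SKILL.md exists on disk.
    # rstrip drops a sentence-ending period that the pattern may have swallowed.
    referenced = {ref.rstrip(".") for ref in ASSET_PATH.findall(text)}
    for ref in sorted(referenced):
        if not (skill_dir / ref).exists():
            problems.append(f"SKILL.md mentions {ref}, which does not exist")
    return problems
```

Running lint_skill(Path("my-skill")) in CI and failing on a non-empty result would catch a renamed folder or a deleted reference file before the agent does.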

For the file at the top of the folder, see Anatomy of a Skill. For making the routing reliable so this work actually gets reached, see Writing Skill Descriptions That Actually Trigger and How Skill Trigger Matching Actually Works.