Reference implementation and reusable workflow for optimizing prompt-like markdown files, Agent Skills, and agent instruction files with Agent Lightning, official evals/evals.json manifests, and a multi-agent training loop.
- Why use this repo
- What this repo provides
- Ways to use it
- Requirements
- Installation
- Quick start
- Documentation
- Development
- License
Use this project when you need to:
- optimize prompts and prompt-like markdown files
- optimize Agent Skills contracts and supporting assets
- optimize agent instruction files such as `AGENTS.md`, `copilot-instructions.md`, `CLAUDE.md`, or `*.agent.md`
- run an automatic optimization loop locally, on a schedule, or through a reusable GitHub Agentic Workflow
The repository ships a coordinated set of custom agents for prompt and skill optimization workflows.
| Agent | Role |
|---|---|
| `trainer` | Canonical orchestrator for trainer-skill execution, workspace coordination, and optimization-loop handoffs. |
| `researcher` | Owns public-source discovery, source triage, and research brief generation before synthesis or optimization. |
| `teacher` | Reviews optimization artifacts or user-supplied context and recommends what should improve next. |
| `student` | Drafts and revises candidates from teacher guidance, with optional engineer-agent coaching when specialized advice is needed. |
| `judge` | Scores outputs, candidates, and traces with rubric-driven evaluation. |
| `conservator` | Reviews changes against training history and repository context to avoid regressions. |
| `adversary` | Stress-tests prompts, datasets, and evaluators for failure modes before finalization. |
| `engineer` | Applies prompt-engineering, context-engineering, and Trace-oriented implementation guidance. |
The repo exposes reusable Agent Skills for workflow authoring, prompt engineering, judging, and trainer-loop execution.
The skills in this repo can be added by running:
npx skills add Tyler-R-Kendrick/copilot-auto-training
| Skill | Purpose |
|---|---|
| `create-workflow` | Create or update GitHub Agentic Workflows with `gh aw`, frontmatter, MCP setup, compilation, and validation guidance. |
| `learn` | Capture user corrections from the active conversation and update the most relevant instructions, docs, evals, or tests so the same mistake is less likely to happen again. |
| `engineer-prompt` | Improve prompts and context plans with the smallest effective prompt-engineering technique. |
| `engineer-code` | Apply Microsoft Trace to train Python prompts, helpers, and agent components. |
| `engineer-copilot-agent` | Improve GitHub Copilot custom-agent contracts, tool routing, and handoff design with bounded repo-aware guidance. |
| `judge-rubric` | Build formal rubrics before scoring candidates or artifacts. |
| `judge-outcome` | Compare final outputs when end-state quality is the primary evidence. |
| `judge-trajectory` | Evaluate traces, tool usage, and side effects when process quality matters. |
| `researcher-research` | Research grounded source material, datasets, and benchmarks before synthesis. |
| `trainer-train` | Own the end-to-end trainer loop contract for one selected target, including workspace setup, stage sequencing, steering, validation, and write-back decisions once concrete stage capabilities are known. |
| `trainer-synthesize` | Build official eval manifests plus explicit `train.jsonl` and `val.jsonl` datasets. |
| `trainer-optimize` | Run single-shot Agent Lightning optimization against explicit datasets. |
| `trainer-election` | Elect the strongest candidate from existing scored workspace artifacts. |
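As a rough illustration of what electing from already-scored artifacts can look like, the sketch below picks the highest-scoring record from a list. The `ScoredCandidate` type, its fields, and the tie-breaking rule are assumptions made for this example; they are not the `trainer-election` skill's actual data model or scoring logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredCandidate:
    # Hypothetical record; the real workspace artifact schema may differ.
    name: str
    score: float


def elect(candidates: list[ScoredCandidate]) -> ScoredCandidate:
    """Return the highest-scoring candidate; ties go to the earliest entry."""
    if not candidates:
        raise ValueError("no scored candidates to elect from")
    # max() returns the first maximal element, so earlier candidates win ties.
    return max(candidates, key=lambda c: c.score)


winner = elect([
    ScoredCandidate("baseline", 0.71),
    ScoredCandidate("candidate-a", 0.84),
    ScoredCandidate("candidate-b", 0.84),
])
print(winner.name)  # candidate-a
```

Keeping the election step a pure function over existing scores mirrors the table's wording: it selects among artifacts that were scored earlier rather than scoring anything itself.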
The repository publishes a reusable train-prompt GitHub Agentic Workflow that selects one prompt-like file, runs the trainer loop, and opens a pull request only when validation passes and the diff is meaningful.
The repository also publishes a single installable Copilot CLI plugin, copilot-training, that bundles its skills, agents, hooks, and MCP runtime assets.
The local MCP server under tools/agent-skills-mcp exposes repository skills to tool discovery so built-in agents and workflows can discover and invoke them consistently.
The optimization loop initializes a workspace, prepares an engineer review, fills any missing research or dataset gaps, runs optimization, optionally performs election, validates the result, and only then writes back the selected candidate and opens a pull request.
flowchart TD
select["Select target<br/>prompt file"]:::setup --> workspace["Initialize local<br/>.trainer-workspace"]:::setup
workspace --> review["Engineer review<br/>goals, risks, plan"]:::setup
review --> research["Research<br/>researcher agent"]:::stage
research --> synthesize["Synthesize<br/>datasets and evals"]:::stage
synthesize --> optimize["Optimize<br/>trainer-optimize"]:::stage
optimize --> election{"Multiple candidates?"}:::decision
election -->|Yes| elect["Election<br/>trainer-election"]:::stage
election -->|No| validate["Validate<br/>python -m pytest -q"]:::validation
elect --> validate
validate -->|Pass + diff| writeback["Write back<br/>selected candidate"]:::success
validate -->|Fail or no diff| checkpoint["Keep artifacts<br/>for review/resume"]:::warning
writeback --> pr["Create PR<br/>or fallback issue"]:::success
classDef setup fill:#E8F1FF,stroke:#2563EB,stroke-width:1.5px,color:#0F172A;
classDef stage fill:#EEFCE8,stroke:#16A34A,stroke-width:1.5px,color:#052E16;
classDef decision fill:#FEF3C7,stroke:#D97706,stroke-width:1.5px,color:#451A03;
classDef validation fill:#F3E8FF,stroke:#9333EA,stroke-width:1.5px,color:#3B0764;
classDef success fill:#DCFCE7,stroke:#15803D,stroke-width:1.5px,color:#14532D;
classDef warning fill:#FEE2E2,stroke:#DC2626,stroke-width:1.5px,color:#450A0A;
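The stage order described above and drawn in the flowchart can be sketched as a small planning function. This is only a sketch under stated assumptions: `RunState`, `plan_stages`, `finish`, and the stage names are hypothetical and do not come from the trainer's real implementation.

```python
from dataclasses import dataclass


@dataclass
class RunState:
    # All fields are illustrative assumptions, not the trainer's real state model.
    has_research: bool = False
    has_datasets: bool = False
    candidate_count: int = 1


def plan_stages(state: RunState) -> list[str]:
    """List the stages a run passes through before validation gates write-back."""
    stages = ["init_workspace", "engineer_review"]
    if not state.has_research:
        stages.append("research")  # fill a missing research gap
    if not state.has_datasets:
        stages.append("synthesize")  # fill a missing dataset gap
    stages.append("optimize")
    if state.candidate_count > 1:
        stages.append("election")  # only needed when several candidates exist
    stages.append("validate")
    return stages


def finish(validation_passed: bool, has_diff: bool) -> str:
    """Map the validation outcome to the terminal step shown in the flowchart."""
    if validation_passed and has_diff:
        return "write_back_and_open_pr"
    return "keep_artifacts_for_review"


print(plan_stages(RunState(has_research=True, candidate_count=3)))
# ['init_workspace', 'engineer_review', 'synthesize', 'optimize', 'election', 'validate']
```

Separating the plan from the terminal decision keeps the rule "write back only after validation" in one place.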
Use the reusable workflow when another repository already stores prompt-like markdown in git and you want scheduled or manual optimization runs that produce reviewable pull requests.
Prerequisites for the target repository:
- GitHub Agentic Workflows is initialized with `gh aw init`
- `COPILOT_GITHUB_TOKEN` is configured for the chosen engine; see docs/getting-started.md for the required secrets, PAT permissions, roles, and screenshot references
- the repository contains prompt-like markdown files the workflow can select
- `python -m pytest -q` is the correct repository validation command, or the imported workflow is adjusted to the repository's validation command and recompiled
Install the workflow with an explicit path:
gh aw add Tyler-R-Kendrick/copilot-auto-training/.github/workflows/train-prompt.md --name train-prompt
If you are publishing your own fork, replace `Tyler-R-Kendrick/copilot-auto-training` with your fork's `OWNER/REPO` path in the install command above and those below.
Update it later with:
gh aw update train-prompt
The imported workflow will:
- select exactly one prompt-like file
- create or update that file's local `.trainer-workspace/<prompt-name>/`
- use the `researcher` agent for grounded source discovery plus packaged `researcher-research`, `trainer-synthesize`, `trainer-optimize`, and `trainer-election` assets from this repository through the bundled runtime
- open a pull request only when the optimization produced a meaningful diff and `python -m pytest -q` passed
The workflow source lives in .github/workflows/train-prompt.md. Frontmatter changes require recompiling it with gh aw compile train-prompt.
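The pull-request gate described above combines two checks: validation passed, and the diff is meaningful. A minimal sketch follows. It assumes that "meaningful" means the change survives whitespace normalization, which is an interpretation for this example rather than the workflow's documented rule, and the function names are hypothetical.

```python
import subprocess
import sys


def validation_passes(repo_root: str) -> bool:
    """Run the README's validation command and report whether it exited cleanly."""
    result = subprocess.run([sys.executable, "-m", "pytest", "-q"], cwd=repo_root)
    return result.returncode == 0


def has_meaningful_diff(original: str, candidate: str) -> bool:
    """Assumed rule: ignore trailing whitespace and blank-line-only changes."""
    def normalize(text: str) -> list[str]:
        return [line.rstrip() for line in text.splitlines() if line.strip()]

    return normalize(original) != normalize(candidate)


def should_open_pr(validation_passed: bool, original: str, candidate: str) -> bool:
    """Open a pull request only when both conditions hold."""
    return validation_passed and has_meaningful_diff(original, candidate)


print(should_open_pr(True, "Classify the ticket.\n", "Classify the ticket.  \n\n"))  # False
print(should_open_pr(True, "Classify the ticket.\n", "Classify the support ticket.\n"))  # True
```

In a real run, `validation_passes(".")` would supply the first argument; it is not called here so the example stays free of side effects.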
Fork the repository or create a new repository from it when you want the whole toolchain, examples, workflows, skills, and docs in one place.
Create a project from the template with:
gh repo create <new-repo> --template Tyler-R-Kendrick/copilot-auto-training
For local iteration inside this repository, clone it and ask Copilot to run the trainer on a prompt-like file:
run @trainer on #<prompt-name>
Use the plugin marketplace when you want these skills available inside Copilot CLI without copying files by hand.
Register the marketplace:
copilot plugin marketplace add Tyler-R-Kendrick/copilot-auto-training
Install the published plugin:
copilot plugin install copilot-training@copilot-training
If you prefer a direct repository import instead of marketplace registration, install from the subdirectory path:
copilot plugin install Tyler-R-Kendrick/copilot-auto-training:plugins/copilot-training
The installable plugin bundles live under plugins/, and the marketplace manifest lives at .github/plugin/marketplace.json. For the full import flow, see docs/copilot-cli-plugins.md.
Use this repository as either a repo template or a locally cloned copy, then follow the GitHub Agentic Workflows cross-repository guidance to point it at the repositories you want to optimize.
- Python 3.11+
- dependencies from requirements.txt
- an authenticated Copilot session plus the `COPILOT_MODEL` setting documented in docs/getting-started.md
python3.12 -m venv .venv
source .venv/bin/activate
python -m pip install -r requirements.txt
Inside the devcontainer, .devcontainer/post-start.sh repairs or recreates .venv with Python 3.12 and installs requirements.txt automatically when the environment is missing, stale, or broken. The Copilot coding-agent bootstrap workflow at .github/workflows/copilot-setup-steps.yml reuses that same script so the hosted agent gets the repository's shared setup plus gh aw.
Run the smallest example in this repository through copilot chat:
run @trainer on #:examples/first-run/prompts/classify_support.md --debug-only
Run a small optimization pass:
run @trainer on #:examples/first-run/prompts/classify_support.md --iterations 2 --beam-width 2 --branch-factor 2
For the full setup, configuration, and artifact walkthrough, start with docs/getting-started.md.
- docs/getting-started.md: installation, configuration, examples, and outputs
- docs/copilot-cli-plugins.md: how to register this repo as a Copilot CLI marketplace and install its plugins
- docs/dashboard.md: how to open and use the Agent Lightning dashboard
- docs/troubleshooting.md: common setup, dataset, runtime, and dashboard issues
- examples/first-run/README.md: smallest runnable example in the repo
- skills/trainer-optimize/README.md: overview of the prompt optimization skill
- skills/trainer-optimize/references/dataset-format.md: dataset schema and scoring guidance
Key entry points:
- skills/trainer-optimize/scripts/run_optimize.py: optimization runtime
- skills/trainer-optimize/scripts/generate_jsonl.py: CSV-to-JSONL dataset bootstrapper
- tests/test_run_optimize.py: end-to-end behavior coverage
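To give a feel for what a CSV-to-JSONL bootstrap step involves, here is a minimal sketch that converts CSV text into train and validation JSONL strings. It is not the code in `generate_jsonl.py`: the `input` and `expected` column names, the 80/20 split, and the seeded shuffle are all assumptions for illustration, and the actual schema is the one documented in skills/trainer-optimize/references/dataset-format.md.

```python
import csv
import io
import json
import random


def csv_to_jsonl_splits(csv_text: str, val_fraction: float = 0.2, seed: int = 0) -> tuple[str, str]:
    """Convert CSV text into (train_jsonl, val_jsonl) strings using a seeded shuffle."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    random.Random(seed).shuffle(rows)
    # Hold out at least one row for validation when there is more than one row.
    val_count = max(1, round(len(rows) * val_fraction)) if len(rows) > 1 else 0
    val_rows, train_rows = rows[:val_count], rows[val_count:]

    def dump(items: list[dict]) -> str:
        # One JSON object per line is the JSONL convention.
        return "".join(json.dumps(item) + "\n" for item in items)

    return dump(train_rows), dump(val_rows)


# Hypothetical columns; real datasets follow the repository's dataset-format reference.
sample_csv = (
    "input,expected\n"
    "reset my password,account\n"
    "card was charged twice,billing\n"
    "app crashes on launch,bug\n"
    "how do I export data,how-to\n"
    "cancel my plan,billing\n"
)
train_jsonl, val_jsonl = csv_to_jsonl_splits(sample_csv)
print(len(train_jsonl.splitlines()), len(val_jsonl.splitlines()))  # 4 1
```

Fixing the seed keeps the split reproducible, which matters when train and validation scores are compared across runs.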
Skill layout:
- `skills/`, `.agents/skills/`, and `.claude/skills/` are the canonical in-repo skill roots.
- `plugins/copilot-training/` is the single installable Copilot CLI plugin for this repo; its `skills/`, `agents/`, `hooks/`, and `mcps/` entries symlink back to the canonical repo sources rather than copying them.
- `.agents/skills/` is the managed symlink mirror maintained by `.github/hooks/sync-skill-links.py` so the repo does not keep copied skill directories.
- The helper can also link skills from `~/skills` and `~/.agents/skills` into `.agents/skills/` when those home-level roots exist.
- Local home-skill symlinks created by the watcher are ignored by `.agents/skills/.gitignore` so they do not dirty the repository.
- Use `python .github/hooks/sync-skill-links.py --check` to verify that `.agents/skills/` exactly matches the discovered skill roots.
- The launcher at `.github/hooks/ensure-skill-link-watcher.sh` performs an immediate sync and starts a background watcher so future additions to `~/skills` and `~/.agents/skills` are linked automatically during the session.
- The write-time hook in `.github/hooks/prompt-workflow-reminder.json` starts that launcher automatically after file edits.
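The idea behind a `--check` style verification of a symlink mirror can be sketched as below. This is not the logic of `sync-skill-links.py`: the helper names are hypothetical, and the sketch assumes every immediate subdirectory of a root counts as a skill, which the real script may define differently.

```python
import tempfile
from pathlib import Path


def discover_skills(roots: list[Path]) -> dict[str, Path]:
    """Map skill name to resolved source directory for each subdirectory of the roots."""
    found: dict[str, Path] = {}
    for root in roots:
        if not root.is_dir():
            continue  # missing roots (for example an absent ~/skills) are skipped
        for entry in sorted(root.iterdir()):
            if entry.is_dir():
                found.setdefault(entry.name, entry.resolve())  # first root wins
    return found


def mirror_problems(mirror: Path, roots: list[Path]) -> list[str]:
    """Describe every way the symlink mirror differs from the discovered skills."""
    expected = discover_skills(roots)
    actual = {p.name: p for p in mirror.iterdir() if p.is_symlink()} if mirror.is_dir() else {}
    problems = [f"missing link: {name}" for name in expected if name not in actual]
    problems += [f"stale link: {name}" for name in actual if name not in expected]
    problems += [
        f"wrong target: {name}"
        for name, link in actual.items()
        if name in expected and link.resolve() != expected[name]
    ]
    return sorted(problems)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "skills"
    (root / "judge-rubric").mkdir(parents=True)
    mirror = Path(tmp) / ".agents" / "skills"
    mirror.mkdir(parents=True)
    print(mirror_problems(mirror, [root]))  # ['missing link: judge-rubric']
    (mirror / "judge-rubric").symlink_to(root / "judge-rubric", target_is_directory=True)
    print(mirror_problems(mirror, [root]))  # []
```

An empty problem list corresponds to the "exactly matches" condition; a check mode would exit non-zero otherwise.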
The repository currently ships official eval manifests for the trainer and engineering skills plus a smaller onboarding example under examples/first-run.
This project is licensed under the terms of the LICENSE file.