Iterate & Deploy

Refine through testing, then package for real users

Phase 4: Iterate and Refine

No prompt is right on the first try. Plan for at least 2-3 major iterations. Each iteration should address a specific category of improvement.

The Iteration Cycle

Write prompt → Test with real inputs → Identify gaps → Refine → Repeat

What to Focus on at Each Iteration

Iteration | Focus Area | Key Questions
v1 → v2 | Structure & length | Is the prompt too long? Are sections redundant? Can examples be extracted?
v2 → v3 | Actionability | Does the output help users do something? Are recommendations specific enough?
v3 → v4 | Edge cases | Does it handle ambiguous inputs? Multi-role jobs? Non-standard industries?

Refactoring Principles

Extract, don't delete

Move worked examples to separate files instead of removing them

Compress, don't lose

Condense verbose rubrics into inline scales (e.g., 1=Routine | 5=Creative)

Add actionability

If users read the output and ask "now what?", strengthen recommendations

Test the edges

Feed unusual inputs: vague titles, multi-SOC roles, niche industries

Phase 5: Package and Deploy

A GPT isn't just a prompt — it's a complete package. Here's everything you need:

Component | Purpose
Name | Clear, searchable, describes the function
Description | One sentence explaining what it does
System Prompt | Your refined instructions (the final version)
Knowledge Files | Curated data files uploaded to the GPT
Conversation Starters | Pre-written prompts showing users how to begin
Capabilities | Platform features to enable (web search, code interpreter, etc.)
Supporting Materials | Templates, cheat sheets, setup guides for end users
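The components above can be kept together in a simple, platform-agnostic manifest so the package survives a move between tools. A minimal sketch in Python (the file names and capability labels are illustrative, not a schema any platform requires):

```python
# Hypothetical, platform-agnostic description of a GPT package.
# All file names and values are placeholders -- the point is keeping
# every component from the table above in one portable place.
gpt_package = {
    "name": "AI Task Augmentation Analyst",
    "description": "Analyzes a job description and scores tasks for AI augmentation.",
    "system_prompt_file": "system_prompt_v3.md",
    "knowledge_files": [
        "worked_example_marketing_coordinator.md",
        "onet_task_statements.txt",
    ],
    "conversation_starters": [
        "Analyze this job description for AI augmentation:",
        "What are the top tasks we can automate for this role?",
        "Generate AI Wins Dashboard for this job description.",
        "Give me risk vs moat scores for this position.",
    ],
    "capabilities": ["web_search", "code_interpreter"],
    "supporting_materials": ["user_cheat_sheet.pdf"],
}

# Quick completeness check before deploying to any platform.
required = {"name", "description", "system_prompt_file",
            "knowledge_files", "conversation_starters"}
missing = required - gpt_package.keys()
assert not missing, f"Package incomplete: {missing}"
```

Because the manifest lives outside any one platform, rebuilding the GPT elsewhere becomes a copy-paste exercise rather than an archaeology project.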

Conversation Starters That Work

Write 3-4 starters that demonstrate the tool's range. Each shows a different use case and primes the user to provide the right input:

“Analyze this job description for AI augmentation:”
“What are the top tasks we can automate for this role?”
“Generate AI Wins Dashboard for this job description.”
“Give me risk vs moat scores for this position.”

Phase 6: Test with Real Users

Test Type | What to Try | Watch For
Happy path | Standard job description with clear tasks | Output matches expected format?
Ambiguous input | Vague title like “Analyst” with no context | Does the GPT ask for clarification?
Edge case | Highly physical role (e.g., construction) | Correctly flags tasks as Human-led?
Multi-role | JD spanning 2-3 SOC codes | Acknowledges ambiguity?
Stress test | Very long or very short job descriptions | Output quality holds?
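The test matrix above can double as a repeatable regression checklist. A minimal sketch in Python, where `run_gpt` is a hypothetical stand-in for however you send input to your deployed GPT (every platform differs):

```python
# Each case pairs an input style with the behavior to watch for.
# run_gpt is a placeholder -- swap in your platform's actual call.
def run_gpt(job_description: str) -> str:
    return "stub output"  # replace with a real API call

TEST_CASES = [
    ("happy_path", "Standard JD with clear tasks", "matches expected format"),
    ("ambiguous", "Title 'Analyst' with no context", "asks for clarification"),
    ("edge_case", "Construction laborer JD", "flags tasks as Human-led"),
    ("multi_role", "JD spanning 2-3 SOC codes", "acknowledges ambiguity"),
    ("stress", "A 10-page or one-line JD", "output quality holds"),
]

def run_suite():
    results = {}
    for name, jd, watch_for in TEST_CASES:
        output = run_gpt(jd)
        # Record output next to its watch-for criterion; judging GPT
        # output against the criterion still needs a human reviewer.
        results[name] = {"output": output, "watch_for": watch_for}
    return results
```

Re-running the same five inputs after every prompt change is the cheapest way to catch regressions before real users do.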

Feedback Questions to Ask Testers

  1. Was the output useful for making a decision?
  2. Was anything confusing or unexpected?
  3. What would you do differently with this information?
  4. What's missing?

Case Study: The Prompt Iteration Journey (v1 → v3)

This is the real evolution of the AI Task Augmentation Analyst prompt, showing what changed at each version and why.

v1

The Kitchen Sink

Characteristics:

  • ~228 lines — comprehensive but long
  • Full Identity & Purpose section
  • Complete scoring rubrics with detailed tables
  • Worked example embedded directly in the prompt
  • Risk × Moat quadrant diagram included

What worked:

  • Thorough — covered every edge case
  • Self-contained — everything in one file
  • The worked example produced consistent outputs

What didn't:

  • Token-heavy — example consumed context window
  • Redundant sections overlapped
  • Tried to be both a reference manual and a set of instructions

Lesson: Your first version should be exhaustive. Get everything out of your head and onto the page. You'll refine later.

v2

Compress and Extract

What changed from v1:

Change | Why
Reduced ~228 lines to ~97 lines | Less token usage, faster processing
Extracted worked example to separate file | Keeps the prompt focused on instructions
Compressed rubrics to inline format | Same information, fewer tokens
Condensed Identity to one paragraph | Removed verbose opening

Lesson: Separate "instructions" from "reference material." The prompt tells the AI how to think. Knowledge files provide the data and examples it references.

v3

Make It Actionable

What changed from v2:

Change | Why
Expanded Step 8 to full Pilot Launch Plan | Users said "great analysis, but now what?"
Added pilot scoring criteria table | Makes pilot selection systematic, not gut-feel
Added structured recommendation format (6 parts) | Task, rationale, AI approach, metrics, scope, rollback
Added Pilot Kickoff Questions | Helps users take immediate action

Lesson: The biggest improvement often isn't in the analysis — it's in the "so what?" Your GPT should bridge the gap between insight and action.

The Evolution at a Glance

v1

Exhaustive

Get everything down. Be thorough. Don't self-edit yet.

v2

Compressed

Separate instructions from data. Cut redundancy. Extract examples.

v3

Actionable

Close the "now what?" gap. Add recommendations, next steps, follow-ups.

Iteration principle: Each version should make ONE major improvement, not rewrite everything. Preserve what works, fix what doesn't.

Common Pitfalls

Prompt too long

AI ignores later instructions. Fix: extract examples and reference data to knowledge files.

No scoring rubric

Same task gets different scores each run. Fix: define explicit 1-5 scales with criteria per level.
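One way to make a scale explicit enough to reuse is to write every level out as data rather than prose. A sketch of the idea (the level labels here are illustrative, not the actual rubric from the prompt):

```python
# An explicit 1-5 scale: every score has a criterion, so the same
# task should land on the same score across runs. Labels are
# illustrative placeholders, not the real rubric.
AUTOMATION_SCALE = {
    1: "Routine: repetitive, rule-based, no judgment required",
    2: "Mostly routine: occasional simple judgment calls",
    3: "Mixed: routine core with context-dependent decisions",
    4: "Mostly creative: novel situations dominate",
    5: "Creative: open-ended judgment, relationships, or taste",
}

def validate_score(score: int) -> str:
    """Reject scores outside the defined scale instead of guessing."""
    if score not in AUTOMATION_SCALE:
        raise ValueError(f"Score {score} is outside the 1-5 scale")
    return AUTOMATION_SCALE[score]
```

The same table can be compressed into a one-line inline scale for the prompt itself while the full criteria live in a knowledge file.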

Output without action

Users say “interesting, but now what?” Fix: add a recommendation section with next steps.

No fallback chain

AI hallucinates data when files aren't loaded. Fix: build a priority chain with confidence flags.
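The priority-chain idea can be sketched outside the prompt too. A hedged illustration in Python, assuming hypothetical source names, where each loader is tried in order of reliability and the answer carries a confidence flag the output can disclose:

```python
# Priority chain: try the most reliable data source first and tag the
# answer with a confidence flag. Source names and loaders are
# hypothetical stand-ins for whatever your GPT actually consults.
def load_onet_file():
    return None  # stand-in: returns None when the knowledge file isn't loaded

def load_cached_summary():
    return None  # stand-in for a secondary source

SOURCES = [
    ("onet_file", load_onet_file, "high"),
    ("cached_summary", load_cached_summary, "medium"),
]

def get_task_data():
    for name, loader, confidence in SOURCES:
        data = loader()
        if data is not None:
            return {"source": name, "confidence": confidence, "data": data}
    # Last resort: the model's general knowledge, loudly flagged.
    return {"source": "model_knowledge", "confidence": "low",
            "data": "general estimates -- verify against O*NET"}
```

Instructing the GPT to follow the same order, and to state which tier it used, turns silent hallucination into a visible, low-confidence answer.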

Platform lock-in

Can't move to another tool. Fix: keep prompt and data platform-agnostic, package separately.

Source Files

These are the actual files used in the AI Task Augmentation Analyst GPT. View, copy, or download them to use as a starting point for your own custom GPT.


System Prompt (v3 — Latest)

The final prompt used in the GPT instructions field


Worked Example Analysis

Marketing Coordinator analysis — uploaded as a knowledge file


O*NET Data Priority Guide

Download these files from O*NET and upload as knowledge files in the order below

Tier 1 — Start here
  • Occupation Data
  • Task Statements
  • Task Ratings
Tier 2 — Add next
  • Work Activities
  • Skills
  • Technology Skills
Tier 3 — If space allows
  • Abilities
  • Knowledge
  • Work Context

Key Takeaways

  • Plan for 2-3 iterations minimum — no prompt is right on the first try
  • v1 = exhaustive, v2 = compressed, v3 = actionable — this is the natural evolution
  • Package the full experience — name, description, starters, knowledge files, and supporting materials
  • Test with real users and diverse inputs — happy path, ambiguous, edge cases, and stress tests
  • Close the “now what?” gap — the biggest improvement is often in recommendations, not analysis