The Prompt Engineer's Pattern Book
The field has converged on a small set of reusable patterns that solve most prompting problems. The real skill is knowing which pattern fits which problem.
In a companion article, The Anatomy of a Prompt, we covered the building blocks: system prompts define identity and constraints, few-shot examples demonstrate format and style, and chain-of-thought prompting enables multi-step reasoning. Those are the raw materials.
This article covers the patterns, recurring structures that practitioners use to solve common prompting problems. Like design patterns in software engineering, these are not inventions. They are observations about what works, distilled from thousands of experiments and production deployments.6
Four patterns account for most of what experienced prompt engineers do: persona prompting, template patterns, meta-prompting, and self-consistency. Each solves a different class of problem. Knowing which to reach for, and when to combine them, separates effective prompting from trial and error.7
Persona Prompting
Assign the model a specific role with the persona pattern, and the quality of its output changes. This is one of the most widely observed effects in prompt engineering, and one of the most underestimated.8
The naive version is familiar: "You are an expert in X." But effective persona prompting goes further. A well-constructed persona defines expertise, constraints, communication style, and even what the model should not do.
The Difference a Persona Makes
Consider a tax question posed two ways:
# Generic prompt User: Can I deduct my home office if I'm a W-2 employee? # Persona prompt System: You are a licensed CPA with 15 years of experience in individual tax preparation. You specialize in employment-related deductions. You always cite the relevant IRS publication. When the answer depends on specific circumstances, you list those circumstances rather than guessing. User: Can I deduct my home office if I'm a W-2 employee?
The generic prompt produces a plausible answer that may or may not mention the Tax Cuts and Jobs Act of 2017, which eliminated the home office deduction for W-2 employees. The persona prompt reliably produces a response that cites IRS Publication 587, distinguishes between W-2 and self-employment scenarios, and flags the 2017 law change.
Same model. Same question. Different context, different output distribution.
Why Personas Work
Language models learn associations during training. Text written by tax professionals has different statistical properties than text written by general audiences. It uses specific terminology, cites authoritative sources, hedges appropriately, and structures arguments differently.9
When you assign a persona, you bias the model toward generating tokens consistent with that kind of text. The persona doesn't give the model new knowledge. It shifts which knowledge gets activated and how it gets expressed.1011
This explains both the power and the limits of persona prompting. The model can produce text in the style of an expert, drawing on patterns it learned from expert-written text. But it cannot actually verify its claims against reality. A confident-sounding persona can make hallucinations more convincing, not less frequent.1213
Constructing Effective Personas
The best personas include four components:
- Identity: Who the model is (role, experience level, specialization)
- Expertise scope: What the model knows and, critically, what falls outside its scope
- Communication style: How the model should express itself (formal, technical, conversational)
- Behavioral constraints: What the model should do when uncertain, when it encounters edge cases, or when asked about topics outside its scope
Omit any of these, and the model fills in the gaps with its default behavior. That default is usually "helpful general assistant," which may not be what you want.
Template Patterns
A template is a prompt with fixed structure and variable content. The structure stays the same across invocations; only the inputs change. This is the most common pattern in production systems, and for good reason: templates make prompts testable, reusable, and version-controlled.14
Anatomy of a Template
Consider a template for document summarization:
# Template: Document Summary System: You are a technical writer who creates concise summaries of research documents. User: Summarize the following document in exactly 3 bullet points. Each bullet should be one sentence. Focus on findings, not methodology. DOCUMENT: {document_text} AUDIENCE: {audience_level} FORMAT: 3 bullet points, one sentence each
The curly-brace variables ({document_text}, {audience_level}) are the only parts that change. Everything else is fixed: the persona, the output format, the constraints. A developer can call this template thousands of times with different documents and get structurally consistent results.
Why Templates Matter
Templates solve four problems simultaneously:
Consistency. Every invocation uses the same instructions, so output variance comes from the input, not from prompt phrasing. This makes results comparable across runs.
Testability. Because the structure is fixed, you can build test suites. Feed known inputs, check that outputs meet quality criteria, and catch regressions when you update the template.
Reusability. A well-designed template can serve an entire team or product. New users do not need to understand prompt engineering; they fill in the blanks.
Version control. Templates are code. You can diff them, review them, roll them back. When output quality degrades after a model update, you can systematically test which template changes restore performance.
Template Composition
Templates compose naturally through prompt chaining. A code review system might chain three templates together, where each template produces structured output that feeds directly into the next step:15
# Template 1: Extract functions from a code file extract_functions(code) -> list of functions # Template 2: Review a single function review_function(function, language, style_guide) -> review comments # Template 3: Aggregate reviews into a report aggregate_reviews(reviews, severity_threshold) -> final report
Each template is simple and focused. The composition handles complexity. This mirrors how software engineers decompose systems into small, composable functions.
Templates can also include conditional sections. A customer support template might include a refund policy block only when the query is classified as a complaint, or a technical troubleshooting section only when the query involves a product defect. The conditional logic lives in the application code; the template provides the structure for each path.
Production systems run on templates.16
Meta-Prompting
Meta-prompting uses a language model to generate, critique, or improve prompts. It is the prompt-about-prompts pattern, and it addresses a fundamental challenge: writing good prompts is hard, and humans are not always the best judges of what will work.
The Basic Loop
The simplest form of meta-prompting asks the model to improve a prompt you have already written:
User: I wrote this prompt for a customer support chatbot: "You are a helpful customer service agent. Answer questions about our products." Critique this prompt. What is missing? What could cause problems in production? Suggest an improved version.
The model will typically identify missing constraints (what happens when the user asks about competitors?), missing formatting instructions (should responses be formal or casual?), and missing edge cases (what if the user is angry?). The improved version it generates is usually more robust than the original.
This works because the model has seen thousands of prompts during training, along with discussions about what makes prompts effective. It can apply that meta-knowledge to your specific case.
Automated Optimization
Zhou et al. (2022) formalized this insight in their paper "Large Language Models Are Human-Level Prompt Engineers." Their system, APE (Automatic Prompt Engineer), generates candidate prompts, scores them against a test set, and iteratively refines the best performers.4
The optimization loop is straightforward:
1. Generate candidate prompts for a task 2. Test each candidate on evaluation examples 3. Score candidates by output quality 4. Select the top performers 5. Mutate top prompts to generate new candidates 6. Repeat until convergence
On several benchmarks, APE-generated prompts matched or exceeded human-written ones. The system discovered phrasings that humans would not have tried, including some that were grammatically awkward but empirically effective.17
When Meta-Prompting Helps
Meta-prompting is most useful in two scenarios. First, when you are starting from scratch and need a reasonable first draft. Asking the model to generate a prompt for a task you describe is faster than writing one from nothing, and the result often covers edge cases you would have missed.
Second, when you have a prompt that mostly works but fails on specific inputs. Showing the model the prompt, the failing inputs, and the incorrect outputs lets it diagnose the problem and suggest targeted fixes.
Meta-prompting is less useful when the problem is not the prompt but the model's capabilities. If the task requires knowledge the model does not have, or reasoning it cannot perform, no amount of prompt refinement will help. Knowing the difference between "bad prompt" and "wrong tool" is essential. Otherwise, meta-prompting becomes circular: generating increasingly elaborate prompts that cannot fix a fundamental limitation.1819
Self-Consistency
Self-consistency is an ensemble method for language models. Generate multiple outputs for the same input, then aggregate the results. The idea is simple, and the results are surprisingly strong.3
Wang et al. (2022) introduced the technique as an extension of chain-of-thought prompting. Instead of generating a single reasoning chain, they sampled multiple chains (using temperature > 0 to introduce diversity) and took the majority answer.2
How It Works
Consider a math word problem. With standard chain-of-thought, the model generates one reasoning path and produces an answer. That path might contain an error. With self-consistency, you sample five or ten paths at elevated temperature and take the most common answer:
# Same prompt, 5 samples at temperature 0.7 Sample 1: ... 5 + 6 = 11 ... -> 11 Sample 2: ... 5 + 6 = 11 ... -> 11 Sample 3: ... 5 + 3 = 8 ... -> 8 (arithmetic error) Sample 4: ... 5 + 6 = 11 ... -> 11 Sample 5: ... 5 + 6 = 11 ... -> 11 Majority vote: 11 (4 out of 5)
Sample 3 made an error, but the majority vote corrects it. The technique assumes that correct reasoning paths are more likely than any specific incorrect path, even if individual samples are noisy.20
Why Diversity Matters
Self-consistency requires temperature > 0. At temperature 0, every sample is identical (or nearly so), and majority voting has no effect. The diversity comes from sampling: different token choices lead to different reasoning paths, which sometimes reach different conclusions.21
The optimal temperature varies by task. Too low, and all samples agree regardless of correctness. Too high, and the reasoning becomes incoherent. In practice, temperatures between 0.5 and 0.7 work well for most reasoning tasks.22
The Cost-Accuracy Tradeoff
More samples mean better accuracy, but also more cost and latency. The relationship follows a curve of diminishing returns:
For most applications, five samples hit the sweet spot. Beyond ten, the accuracy gains rarely justify the cost. But for high-stakes decisions where correctness matters more than latency or expense, twenty or more samples can be worthwhile.
When to Use Self-Consistency
Self-consistency works best for tasks with a single correct answer: math problems, factual questions, logic puzzles, classification. For open-ended generation (writing, creative tasks), there is no "correct" answer to vote on, so the technique does not apply in the same way.23
It also works best when the model is capable of solving the task but sometimes makes errors. If the model consistently gets something wrong across all samples, majority voting just confirms the wrong answer. Self-consistency corrects random errors, not systematic ones.24
Use it for high-stakes reasoning where reliability matters more than speed.5
Pattern Selection
Given a problem, which pattern should you reach for? The answer depends on what kind of problem you are solving.
Problem Type Recommended Pattern(s) ................................................................................................ Domain-specific Q&A Persona Structured output Template Batch processing Template Prompt development Meta-prompting High-stakes reasoning Self-consistency + Chain-of-thought Classification at scale Template + Self-consistency Customer-facing chatbot Persona + Template Complex analysis pipeline Template composition + Persona One-off exploration Meta-prompting + Persona
Patterns Compose
In practice, production systems combine multiple patterns. A medical triage chatbot might use a persona (experienced nurse practitioner), a template (structured intake form), and self-consistency (sample three responses and flag disagreements for human review). Each pattern handles a different aspect of the problem.
The composition follows a natural hierarchy. Persona prompting sets the foundation: who is the model in this context? Template patterns provide the structure: what inputs does it receive and what format should the output take? Self-consistency provides reliability: how confident can we be in the result? Meta-prompting sits outside the loop, used during development to refine the other three.
Start with the simplest pattern that could work. Add complexity only when testing reveals a gap that a new pattern addresses. Over-engineering a prompt is as real a problem as under-engineering one.
Closing
Patterns are not magic. They are structured ways to exploit how language models process context. Persona prompting biases token generation toward expert-style text. Templates enforce consistency and enable testing. Meta-prompting uses the model's own knowledge of prompting to refine your work. Self-consistency trades compute for reliability through ensemble voting. And when the task involves tool use, the ReAct pattern provides a structured loop for reasoning and acting.
None of these techniques require special APIs, fine-tuning, or advanced tooling. They are available to anyone with access to a language model. What separates effective practitioners from everyone else is not access to secret techniques, but judgment about which pattern fits which problem.1
That judgment comes from practice. Write prompts, test them, observe failures, diagnose the cause, apply the right pattern, and test again. The cycle is empirical, closer to experimental science than to software engineering.25
The patterns in this article, combined with the building blocks covered in The Anatomy of a Prompt, form a practical toolkit for most prompting problems. Master the small set. Know when each applies. That is the skill.26
References
- Brown, T., et al. "Language Models are Few-Shot Learners." NeurIPS, 2020.
- Wei, J., et al. "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS, 2022.
- Wang, X., et al. "Self-Consistency Improves Chain of Thought Reasoning in Language Models." ICLR, 2022.
- Zhou, Y., et al. "Large Language Models Are Human-Level Prompt Engineers." ICLR, 2022.
- Kojima, T., et al. "Large Language Models are Zero-Shot Reasoners." NeurIPS, 2022.
- White, J., et al. "A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT." arXiv, 2023.
Further Reading
- Jurafsky, Daniel & James H. Martin. "Speech and Language Processing," 3rd ed. (draft). Chapters 7 (prompting, conditional generation, temperature) and 10 (in-context learning, fine-tuning).
- Widdows, Dominic & Trevor Cohen. "Large Language Models: How They Work and Why They Matter." SemanticVectors Publishing, 2025. Chapters 1, 4-7.
- Alammar, Jay & Maarten Grootendorst. "Hands-On Large Language Models." O'Reilly Media, 2024. Chapter 6.
- Raschka, Sebastian. "Build a Large Language Model (From Scratch)." Manning, 2024. Chapter 7.
- Extended grounding notes for all citations: Sources.