
The Prompt Engineer's Pattern Book

The field has converged on a small set of reusable patterns that solve most prompting problems. The real skill is knowing which pattern fits which problem.

In Brief

Four patterns account for most of what effective practitioners do with prompts: persona prompting, which biases the model toward expert-style text; template patterns, which enforce consistency and enable testing; meta-prompting, which uses the model's own knowledge of prompting to refine your work; and self-consistency, which trades compute for reliability through ensemble voting. None of these is a trick. They act as regularizers that exploit the way transformer attention favors specific and recent context. A persona activates expert-style patterns the model learned during training, a template enforces structure that makes outputs comparable and testable, meta-prompting uses the model's ability to reason about prompts as objects, and self-consistency applies an old machine-learning insight (ensemble methods reduce variance) to language-model inference.

The patterns compose naturally. A medical triage chatbot might use a persona (experienced nurse practitioner), a template (structured intake form), and self-consistency (three sampled responses with disagreements flagged for human review), because each pattern handles a different aspect of the problem. The deeper principle is that patterns are structured exploitations of how language models actually work: you are not commanding the model, you are constructing context that makes the desired behavior the path of least resistance through the probability distribution. That understanding explains why some phrasings are more effective than others (specificity, recency, and attention all favor some framings over others) and why no single phrasing is universally correct, because different models, different training data, and different tasks shift the probability landscape. The discipline is testing before deploying and measuring which pattern combination produces the target output on your actual distribution of inputs, rather than trusting that the cleverest phrasing will land most of the time.

In a companion article, The Anatomy of a Prompt, we covered the building blocks: system prompts define identity and constraints, few-shot examples demonstrate format and style, and chain-of-thought prompting enables multi-step reasoning. Those are the raw materials.

This article covers the patterns: recurring structures that practitioners use to solve common prompting problems. Like design patterns in software engineering, these are not inventions. They are observations about what works, distilled from thousands of experiments and production deployments.⁶

Four patterns account for most of what experienced prompt engineers do: persona prompting, template patterns, meta-prompting, and self-consistency. Each solves a different class of problem. Knowing which to reach for, and when to combine them, separates effective prompting from trial and error.⁷

. . .

Persona Prompting

The persona pattern assigns the model a specific role, and that assignment changes the quality of its output. This is one of the most widely observed effects in prompt engineering, and one of the most underestimated.⁸

The naive version is familiar: "You are an expert in X." But effective persona prompting goes further. A well-constructed persona defines expertise, constraints, communication style, and even what the model should not do.

The Difference a Persona Makes

Consider a tax question posed two ways:

# Generic prompt
User: Can I deduct my home office if I'm a W-2 employee?

# Persona prompt
System: You are a licensed CPA with 15 years of experience
in individual tax preparation. You specialize in
employment-related deductions. You always cite the
relevant IRS publication. When the answer depends on
specific circumstances, you list those circumstances
rather than guessing.

User: Can I deduct my home office if I'm a W-2 employee?

The generic prompt produces a plausible answer that may or may not mention the Tax Cuts and Jobs Act of 2017, which eliminated the home office deduction for W-2 employees. The persona prompt reliably produces a response that cites IRS Publication 587, distinguishes between W-2 and self-employment scenarios, and flags the 2017 law change.

Same model. Same question. Different context, different output distribution.

Why Personas Work

Language models learn associations during training. Text written by tax professionals has different statistical properties than text written by general audiences. It uses specific terminology, cites authoritative sources, hedges appropriately, and structures arguments differently.⁹

When you assign a persona, you bias the model toward generating tokens consistent with that kind of text. The persona doesn't give the model new knowledge. It shifts which knowledge gets activated and how it gets expressed.¹⁰¹¹

This explains both the power and the limits of persona prompting. The model can produce text in the style of an expert, drawing on patterns it learned from expert-written text. But it cannot actually verify its claims against reality. A confident-sounding persona can make hallucinations more convincing, not less frequent.¹²¹³

Constructing Effective Personas

The best personas include four components:

Identity: Who the model is (role, experience level, specialization)
Expertise scope: What the model knows and, critically, what falls outside its scope
Communication style: How the model should express itself (formal, technical, conversational)
Behavioral constraints: What the model should do when uncertain, when it encounters edge cases, or when asked about topics outside its scope

Omit any of these, and the model fills in the gaps with its default behavior. That default is usually "helpful general assistant," which may not be what you want.
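The four components can be kept explicit in code so that none is dropped by accident. The following is a minimal sketch, not a standard API: the `Persona` class and its field names are hypothetical, and the scope sentence in the example is an illustrative addition to the CPA persona shown earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """Hypothetical container for the four persona components."""
    identity: str
    expertise_scope: str
    communication_style: str
    behavioral_constraints: str

    def to_system_prompt(self) -> str:
        # Refuse to render if any component is blank, so the model's
        # default behavior never silently fills a gap.
        for name, value in vars(self).items():
            if not value.strip():
                raise ValueError(f"persona component {name!r} is empty")
        return "\n".join([
            self.identity,
            self.expertise_scope,
            self.communication_style,
            self.behavioral_constraints,
        ])


cpa = Persona(
    identity="You are a licensed CPA with 15 years of experience "
             "in individual tax preparation.",
    # Illustrative scope statement; adjust to the real deployment.
    expertise_scope="You specialize in employment-related deductions. "
                    "Questions outside individual taxation are out of scope.",
    communication_style="You write formally and always cite the relevant "
                        "IRS publication.",
    behavioral_constraints="When the answer depends on specific circumstances, "
                           "you list those circumstances rather than guessing.",
)
print(cpa.to_system_prompt())
```

Failing on an empty component is a design choice: it turns a silent fallback to default behavior into an error you see during development.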

. . .

Template Patterns

A template is a prompt with fixed structure and variable content. The structure stays the same across invocations; only the inputs change. This is the most common pattern in production systems, and for good reason: templates make prompts consistent, testable, reusable, and easy to keep under version control.¹⁴

Anatomy of a Template

Consider a template for document summarization:

# Template: Document Summary

System: You are a technical writer who creates concise
summaries of research documents.

User: Summarize the following document in exactly 3 bullet
points. Each bullet should be one sentence. Focus on
findings, not methodology.

DOCUMENT:
{document_text}

AUDIENCE: {audience_level}
FORMAT: 3 bullet points, one sentence each

The curly-brace variables ({document_text}, {audience_level}) are the only parts that change. Everything else is fixed: the persona, the output format, the constraints. A developer can call this template thousands of times with different documents and get structurally consistent results.
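Filling such a template takes a few lines of standard-library Python. This sketch covers only the user message of the template above; the helper names `template_fields` and `render` are hypothetical, and the strict check on variable names is one possible policy rather than a requirement.

```python
from string import Formatter

SUMMARY_TEMPLATE = (
    "Summarize the following document in exactly 3 bullet points. "
    "Each bullet should be one sentence. Focus on findings, not methodology.\n\n"
    "DOCUMENT:\n{document_text}\n\n"
    "AUDIENCE: {audience_level}\n"
    "FORMAT: 3 bullet points, one sentence each"
)


def template_fields(template: str) -> set[str]:
    """Return the variable names that appear in curly braces."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def render(template: str, **values: str) -> str:
    """Fill the template, failing loudly on missing or unexpected variables."""
    expected = template_fields(template)
    if set(values) != expected:
        raise KeyError(f"expected {sorted(expected)}, got {sorted(values)}")
    return template.format(**values)


prompt = render(
    SUMMARY_TEMPLATE,
    document_text="Braces {like these} in the input are left untouched.",
    audience_level="general",
)
print(prompt)
```

Only the template string is parsed for placeholders, so curly braces inside the document text pass through unchanged.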

Why Templates Matter

Templates solve four problems simultaneously:

Consistency. Every invocation uses the same instructions, so output variance comes from the input, not from prompt phrasing. This makes results comparable across runs.

Testability. Because the structure is fixed, you can build test suites. Feed known inputs, check that outputs meet quality criteria, and catch regressions when you update the template.

Reusability. A well-designed template can serve an entire team or product. New users do not need to understand prompt engineering; they fill in the blanks.

Version control. Templates are code. You can diff them, review them, roll them back. When output quality degrades after a model update, you can systematically test which template changes restore performance.
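Testability is easiest to see with a concrete check. The sketch below validates the "3 bullet points, one sentence each" format from the summarization template; `check_summary` is a hypothetical helper, and its sentence detection is a rough heuristic that abbreviations such as "e.g." would trip.

```python
import re

BULLET = re.compile(r"^[-*•]\s+\S")
EXTRA_SENTENCE = re.compile(r"[.!?]\s+\S")


def check_summary(output: str) -> list[str]:
    """Return format violations; an empty list means the output passes."""
    problems = []
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if len(lines) != 3:
        problems.append(f"expected 3 bullets, found {len(lines)}")
    for number, line in enumerate(lines, start=1):
        if not BULLET.match(line):
            problems.append(f"line {number} is not a bullet")
        elif EXTRA_SENTENCE.search(line):
            # Heuristic: sentence-ending punctuation followed by more text.
            problems.append(f"line {number} has more than one sentence")
    return problems


good = "- Finding one holds.\n- Finding two holds.\n- Finding three holds."
bad = "- One. Two.\n- Three."
print(check_summary(good))
print(check_summary(bad))
```

Run a check like this over a fixed set of known inputs after every template or model change, and a regression shows up as a non-empty list.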

Template Composition

Templates compose naturally through prompt chaining. A code review system might chain three templates together, where each template produces structured output that feeds directly into the next step:¹⁵

# Template 1: Extract functions from a code file
extract_functions(code) -> list of functions

# Template 2: Review a single function
review_function(function, language, style_guide) -> review comments

# Template 3: Aggregate reviews into a report
aggregate_reviews(reviews, severity_threshold) -> final report

Each template is simple and focused. The composition handles complexity. This mirrors how software engineers decompose systems into small, composable functions.
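One way the three-step chain might look in Python is sketched below. Everything beyond the three function names is an assumption: the `Model` callable stands in for whatever client you use, the prompt wording is illustrative, and the `---` separator is an arbitrary convention for structured output.

```python
from typing import Callable

Model = Callable[[str], str]  # hypothetical: prompt in, completion out


def extract_functions(model: Model, code: str) -> list[str]:
    reply = model(
        "List each function in this file, separated by lines "
        f"containing only '---'.\n\n{code}"
    )
    return [chunk.strip() for chunk in reply.split("\n---\n") if chunk.strip()]


def review_function(model: Model, function: str, language: str,
                    style_guide: str) -> str:
    return model(
        f"Review this {language} function against the style guide.\n\n"
        f"STYLE GUIDE:\n{style_guide}\n\nFUNCTION:\n{function}"
    )


def aggregate_reviews(model: Model, reviews: list[str],
                      severity_threshold: str) -> str:
    joined = "\n\n".join(reviews)
    return model(
        "Combine these reviews into one report. Include only issues at "
        f"severity {severity_threshold} or higher.\n\n{joined}"
    )


def review_file(model: Model, code: str, language: str, style_guide: str,
                severity_threshold: str) -> str:
    # Each step's output is the next step's input.
    functions = extract_functions(model, code)
    reviews = [review_function(model, f, language, style_guide)
               for f in functions]
    return aggregate_reviews(model, reviews, severity_threshold)
```

Passing the model in as a plain callable keeps each step testable with a stub that returns canned text.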

Templates can also include conditional sections. A customer support template might include a refund policy block only when the query is classified as a complaint, or a technical troubleshooting section only when the query involves a product defect. The conditional logic lives in the application code; the template provides the structure for each path.
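A minimal sketch of that split follows, assuming the query has already been classified upstream. The category labels (`"complaint"`, `"product_defect"`), the section text, and the function name are all hypothetical.

```python
BASE_SECTION = "You are a customer support agent.\n\nCUSTOMER MESSAGE:\n{query}"
REFUND_SECTION = "REFUND POLICY:\n{refund_policy}"
TROUBLESHOOTING_SECTION = "TROUBLESHOOTING STEPS:\n{troubleshooting_steps}"


def build_support_prompt(query: str, category: str, refund_policy: str,
                         troubleshooting_steps: str) -> str:
    """Assemble the prompt; the branching lives here, not in the template."""
    sections = [BASE_SECTION.format(query=query)]
    if category == "complaint":
        sections.append(REFUND_SECTION.format(refund_policy=refund_policy))
    elif category == "product_defect":
        sections.append(TROUBLESHOOTING_SECTION.format(
            troubleshooting_steps=troubleshooting_steps))
    return "\n\n".join(sections)


print(build_support_prompt(
    query="My order arrived late and I want my money back.",
    category="complaint",
    refund_policy="Refunds are available within 30 days of delivery.",
    troubleshooting_steps="Restart the device, then check the power cable.",
))
```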

Production systems run on templates.¹⁶

. . .

Meta-Prompting

Meta-prompting uses a language model to generate, critique, or improve prompts. It is the prompt-about-prompts pattern, and it addresses a fundamental challenge: writing good prompts is hard, and humans are not always the best judges of what will work.

The Basic Loop

The simplest form of meta-prompting asks the model to improve a prompt you have already written:

User: I wrote this prompt for a customer support chatbot:

"You are a helpful customer service agent. Answer
questions about our products."

Critique this prompt. What is missing? What could
cause problems in production? Suggest an improved
version.

The model will typically identify missing constraints (what happens when the user asks about competitors?), missing formatting instructions (should responses be formal or casual?), and missing edge cases (what if the user is angry?). The improved version it generates is usually more robust than the original.

This works because the model has seen thousands of prompts during training, along with discussions about what makes prompts effective. It can apply that meta-knowledge to your specific case.

Automated Optimization

Zhou et al. (2022) formalized this insight in their paper "Large Language Models Are Human-Level Prompt Engineers." Their system, APE (Automatic Prompt Engineer), generates candidate prompts, scores them against a test set, and iteratively refines the best performers.⁴

The optimization loop is straightforward:

1. Generate candidate prompts for a task
2. Test each candidate on evaluation examples
3. Score candidates by output quality
4. Select the top performers
5. Mutate top prompts to generate new candidates
6. Repeat until convergence

On several benchmarks, APE-generated prompts matched or exceeded human-written ones. The system discovered phrasings that humans would not have tried, including some that were grammatically awkward but empirically effective.¹⁷
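The six steps can be sketched as a short search loop. This is written in the spirit of the loop above and is not a reimplementation of APE: `score` and `mutate` are hypothetical callables you would back with a model and an evaluation set, and "convergence" is simplified to "the best score stopped improving."

```python
from typing import Callable


def optimize_prompt(
    seed_prompts: list[str],
    score: Callable[[str], float],
    mutate: Callable[[str], list[str]],
    keep: int = 2,
    max_rounds: int = 5,
) -> str:
    """Return the best-scoring prompt found by a generate/score/mutate loop."""
    if not seed_prompts:
        raise ValueError("need at least one seed prompt")
    candidates = list(dict.fromkeys(seed_prompts))  # step 1, deduplicated
    best, best_score = candidates[0], float("-inf")
    for _ in range(max_rounds):
        ranked = sorted(candidates, key=score, reverse=True)  # steps 2-3
        top = ranked[:keep]                                   # step 4
        top_score = score(top[0])
        if top_score <= best_score:                           # step 6
            break
        best, best_score = top[0], top_score
        variants = [m for prompt in top for m in mutate(prompt)]  # step 5
        candidates = list(dict.fromkeys(top + variants))
    return best


# Toy stand-ins: prefer prompts close to 12 characters, mutate by appending.
toy_score = lambda prompt: -abs(len(prompt) - 12)
toy_mutate = lambda prompt: [prompt + "!"]
print(optimize_prompt(["abc"], toy_score, toy_mutate, max_rounds=20))
```

In a real system `score` is the expensive part, since each call runs the candidate prompt over the evaluation examples, so caching scores per candidate is worth considering.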

When Meta-Prompting Helps

Meta-prompting is most useful in two scenarios. First, when you are starting from scratch and need a reasonable first draft. Asking the model to generate a prompt for a task you describe is faster than writing one from nothing, and the result often covers edge cases you would have missed.

Second, when you have a prompt that mostly works but fails on specific inputs. Showing the model the prompt, the failing inputs, and the incorrect outputs lets it diagnose the problem and suggest targeted fixes.
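For the second scenario, the diagnosis request can itself be assembled from a template. A minimal sketch, assuming failures are recorded as input/output pairs; `FailureCase` and `build_diagnosis_prompt` are hypothetical names, and the wording of the request is illustrative.

```python
from dataclasses import dataclass


@dataclass
class FailureCase:
    input_text: str
    incorrect_output: str


def build_diagnosis_prompt(prompt: str, failures: list[FailureCase]) -> str:
    """Bundle the prompt, failing inputs, and incorrect outputs for critique."""
    lines = ["I wrote this prompt:", "", f'"{prompt}"', "",
             "It fails on the following inputs:"]
    for number, case in enumerate(failures, start=1):
        lines += ["",
                  f"Case {number} input: {case.input_text}",
                  f"Case {number} incorrect output: {case.incorrect_output}"]
    lines += ["", "Diagnose why the prompt fails on these inputs "
                  "and suggest a targeted fix."]
    return "\n".join(lines)


print(build_diagnosis_prompt(
    "You are a helpful customer service agent. Answer questions "
    "about our products.",
    [FailureCase("Is your product better than the competitor's?",
                 "Yes, the competitor's product is unreliable.")],
))
```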

Meta-prompting is less useful when the problem is not the prompt but the model's capabilities. If the task requires knowledge the model does not have, or reasoning it cannot perform, no amount of prompt refinement will help. Knowing the difference between "bad prompt" and "wrong tool" is essential. Otherwise, meta-prompting becomes circular: generating increasingly elaborate prompts that cannot fix a fundamental limitation.¹⁸¹⁹

. . .

Self-Consistency

Self-consistency is an ensemble method for language models. Generate multiple outputs for the same input, then aggregate the results. The idea is simple, and the results are surprisingly strong.³

Wang et al. (2022) introduced the technique as an extension of chain-of-thought prompting. Instead of generating a single reasoning chain, they sampled multiple chains (using temperature > 0 to introduce diversity) and took the majority answer.²

How It Works

Consider a math word problem. With standard chain-of-thought, the model generates one reasoning path and produces an answer. That path might contain an error. With self-consistency, you sample five or ten paths at elevated temperature and take the most common answer:

# Same prompt, 5 samples at temperature 0.7

Sample 1: ... 5 + 6 = 11 ...   -> 11
Sample 2: ... 5 + 6 = 11 ...   -> 11
Sample 3: ... 5 + 3 = 8 ...    -> 8  (arithmetic error)
Sample 4: ... 5 + 6 = 11 ...   -> 11
Sample 5: ... 5 + 6 = 11 ...   -> 11

Majority vote: 11 (4 out of 5)

Sample 3 made an error, but the majority vote corrects it. The technique assumes that the paths reaching the correct answer are, taken together, more likely than the paths reaching any single incorrect answer, even if individual samples are noisy.²⁰
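The voting step itself is a few lines of Python. This sketch assumes the final answer has already been extracted from each reasoning chain as a string; `majority_vote` is a hypothetical helper name.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> tuple[str, float]:
    """Return the most common answer and the fraction of samples behind it."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)


samples = ["11", "11", "8", "11", "11"]
print(majority_vote(samples))  # ('11', 0.8)
```

The agreement fraction is worth keeping alongside the answer: a narrow win is a signal that the result deserves less trust. On a tie, `Counter.most_common` returns the answer that was seen first.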

Why Diversity Matters

Self-consistency requires temperature > 0. At temperature 0, every sample is identical (or nearly so), and majority voting has no effect. The diversity comes from sampling: different token choices lead to different reasoning paths, which sometimes reach different conclusions.²¹

The optimal temperature varies by task. Too low, and all samples agree regardless of correctness. Too high, and the reasoning becomes incoherent. In practice, temperatures between 0.5 and 0.7 work well for most reasoning tasks.²²
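The effect of temperature is easiest to see on a toy distribution. The sketch below applies the standard temperature-scaled softmax to three made-up logits; the logit values are illustrative, not taken from any model.

```python
import math


def softmax_with_temperature(logits: list[float],
                             temperature: float) -> list[float]:
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; 0 is the greedy limit")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for temperature in (0.2, 0.7, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

As the temperature falls, probability mass concentrates on the top token and samples converge on one path. As it rises, the distribution flattens and unlikely tokens are sampled more often.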

The Cost-Accuracy Tradeoff

More samples mean better accuracy, but also more cost and latency. The relationship follows a curve of diminishing returns.

Figure: Accuracy gains flatten by ten samples while cost keeps climbing.

For most applications, five samples hit the sweet spot. Beyond ten, the accuracy gains rarely justify the cost. But for high-stakes decisions where correctness matters more than latency or expense, twenty or more samples can be worthwhile.
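A simplified model shows where the diminishing returns come from. Assume, purely for illustration, that each sample is independently correct with probability p and that the vote succeeds only when more than half of the samples are correct. Real samples are correlated and wrong answers rarely all agree, so treat the numbers as a rough guide rather than a prediction.

```python
from math import comb


def majority_accuracy(p: float, n: int) -> float:
    """P(more than half of n independent samples are correct)."""
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


# Odd sample counts avoid ties under the strict-majority rule.
for n in (1, 5, 11, 21):
    print(n, round(majority_accuracy(0.7, n), 3))
```

With p = 0.7, going from one sample to five lifts accuracy from 0.7 to about 0.837, and each further increase buys less. Under the same assumptions, a per-sample accuracy below 0.5 makes voting worse than a single sample.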

When to Use Self-Consistency

Self-consistency works best for tasks with a single correct answer: math problems, factual questions, logic puzzles, classification. For open-ended generation (writing, creative tasks), there is no "correct" answer to vote on, so the technique does not apply in the same way.²³

It also works best when the model is capable of solving the task but sometimes makes errors. If the model consistently gets something wrong across all samples, majority voting just confirms the wrong answer. Self-consistency corrects random errors, not systematic ones.²⁴

Use it for high-stakes reasoning where reliability matters more than speed.⁵

. . .

Pattern Selection

Given a problem, which pattern should you reach for? The answer depends on what kind of problem you are solving.

Problem Type                   Recommended Pattern(s)
................................................................................................
Domain-specific Q&A            Persona
Structured output              Template
Batch processing               Template
Prompt development             Meta-prompting
High-stakes reasoning          Self-consistency + Chain-of-thought
Classification at scale        Template + Self-consistency
Customer-facing chatbot        Persona + Template
Complex analysis pipeline      Template composition + Persona
One-off exploration            Meta-prompting + Persona

Patterns Compose

In practice, production systems combine multiple patterns. A medical triage chatbot might use a persona (experienced nurse practitioner), a template (structured intake form), and self-consistency (sample three responses and flag disagreements for human review). Each pattern handles a different aspect of the problem.
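The disagreement check in that example might look like the sketch below. It assumes each sampled response has already been reduced to a short disposition label; the labels, the `triage` name, and the unanimity rule are all hypothetical choices.

```python
from collections import Counter


def triage(responses: list[str]) -> dict:
    """Accept a disposition only when every sampled response agrees."""
    if not responses:
        raise ValueError("need at least one sampled response")
    counts = Counter(r.strip().lower() for r in responses)
    if len(counts) == 1:
        return {"disposition": next(iter(counts)),
                "needs_human_review": False,
                "votes": dict(counts)}
    # Any disagreement is escalated rather than resolved by majority.
    return {"disposition": None,
            "needs_human_review": True,
            "votes": dict(counts)}


print(triage(["Urgent", "urgent", "urgent "]))
print(triage(["urgent", "routine", "urgent"]))
```

Requiring unanimity instead of a majority is deliberately conservative: in a setting like triage, a split vote is treated as information for a human, not noise to be averaged away.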

The composition follows a natural hierarchy. Persona prompting sets the foundation: who is the model in this context? Template patterns provide the structure: what inputs does it receive and what format should the output take? Self-consistency provides reliability: how confident can we be in the result? Meta-prompting sits outside the loop, used during development to refine the other three.

Figure: The four patterns compose in a hierarchy. Persona forms the foundation, templates add structure, self-consistency layers on reliability, and meta-prompting wraps the whole stack as the evaluation loop.

Start with the simplest pattern that could work. Add complexity only when testing reveals a gap that a new pattern addresses. Over-engineering a prompt is as real a problem as under-engineering one.

. . .

Closing

Patterns are not magic. They are structured ways to exploit how language models process context. Persona prompting biases token generation toward expert-style text. Templates enforce consistency and enable testing. Meta-prompting uses the model's own knowledge of prompting to refine your work. Self-consistency trades compute for reliability through ensemble voting. When the task involves tool use, the ReAct pattern (not covered in this article) provides a structured loop for reasoning and acting.

None of these techniques require special APIs, fine-tuning, or advanced tooling. They are available to anyone with access to a language model. What separates effective practitioners from everyone else is not access to secret techniques, but judgment about which pattern fits which problem.¹

That judgment comes from practice. Write prompts, test them, observe failures, diagnose the cause, apply the right pattern, and test again. The cycle is empirical, closer to experimental science than to software engineering.²⁵

The patterns in this article, combined with the building blocks covered in The Anatomy of a Prompt, form a practical toolkit for most prompting problems. Master the small set. Know when each applies. That is the skill.²⁶

. . .

References

Brown, T., et al. "Language Models are Few-Shot Learners." NeurIPS, 2020.
Wei, J., et al. "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." NeurIPS, 2022.
Wang, X., et al. "Self-Consistency Improves Chain of Thought Reasoning in Language Models." ICLR, 2023 (arXiv preprint, 2022).
Zhou, Y., et al. "Large Language Models Are Human-Level Prompt Engineers." ICLR, 2023 (arXiv preprint, 2022).
Kojima, T., et al. "Large Language Models are Zero-Shot Reasoners." NeurIPS, 2022.
White, J., et al. "A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT." arXiv, 2023.
