Practical guide

5 Common Prompt Mistakes (And How to Fix Them)

The five prompt anti-patterns that break automation at scale. Each one looks innocent in testing but causes silent failures in production.

8 min read

1

Vague instructions that leave too much interpretation

The problem

Prompts like 'Summarize this article' or 'Write a good email' give the model no structure, no constraints, and no success criteria. The output will be different every time — sometimes good, often unusable.

Why it breaks at scale

LLMs are pattern matchers. Without explicit patterns to follow, they fall back on statistical averages that rarely match what you actually want. Vague prompts produce vague outputs.

Define the output shape explicitly

Before:

Summarize this article.

After:

Summarize this article in 3 bullet points. Each bullet must: (1) start with a category label in ALL CAPS, (2) contain one key finding from the article, and (3) stay under 25 words. If the article is too short to support 3 findings, return fewer bullets rather than padding.
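An explicit output shape is also something code can check. The sketch below is one possible validator for the "After" prompt, not part of the prompt itself. It assumes the model writes one bullet per line in the form "- LABEL: finding"; the bullet marker, the colon separator, and the name check_summary are all illustrative assumptions.

```python
import re

MAX_BULLETS = 3
MAX_WORDS = 25  # the prompt asks for "under 25 words" per bullet

# Assumed bullet layout: "- LABEL: finding text". The leading marker and the
# colon after the label are assumptions about how the model formats output.
BULLET_RE = re.compile(r"^[-*•]\s*([A-Z][A-Z0-9 &/]*):\s+(\S.*)$")


def check_summary(text: str) -> list[str]:
    """Return a list of rule violations; an empty list means the summary passes."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    problems = []
    if not 1 <= len(lines) <= MAX_BULLETS:
        problems.append(f"expected 1-{MAX_BULLETS} bullets, got {len(lines)}")
    for i, line in enumerate(lines, start=1):
        if BULLET_RE.match(line) is None:
            problems.append(f"bullet {i}: missing ALL CAPS label")
            continue
        # Word count includes the label, which keeps the check conservative.
        words = len(line.lstrip("-*• ").split())
        if words >= MAX_WORDS:
            problems.append(f"bullet {i}: {words} words, limit is under {MAX_WORDS}")
    return problems
```

A non-empty result is a reasonable trigger for a retry or for routing the item to manual review.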

Quick diagnostic

If your prompt is under 20 words and contains no numbers, lists, or constraints, it's probably too vague.
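This diagnostic is simple enough to run as a lint step. The following is a rough heuristic sketch of it: the 20-word threshold comes from the rule above, while the keyword list used to detect "constraints" is an assumption chosen for illustration and will miss many phrasings.

```python
import re

# Hypothetical keyword list; real prompts express constraints in many other ways.
CONSTRAINT_WORDS = {"must", "exactly", "never", "only", "under", "max", "maximum", "limit"}


def looks_too_vague(prompt: str, max_words: int = 20) -> bool:
    """Flag short prompts that contain no numbers, list markers, or constraint words."""
    total_words = len(prompt.split())
    tokens = re.findall(r"[a-z']+", prompt.lower())
    has_number = any(ch.isdigit() for ch in prompt)
    has_list = re.search(r"^\s*[-*•]\s", prompt, re.MULTILINE) is not None
    has_constraint = any(tok in CONSTRAINT_WORDS for tok in tokens)
    return total_words < max_words and not (has_number or has_list or has_constraint)
```

Treat a True result as a prompt worth a second look, not as proof that the prompt is bad.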

2

Missing context about who the output is for

The problem

A prompt might be technically clear, but if the model doesn't know the audience, expertise level, or use case, it will produce generic content that doesn't fit anyone.

Why it breaks at scale

The same information can be presented as a technical deep-dive for engineers, an executive summary for founders, or a beginner-friendly explainer for newcomers. Without audience context, the model guesses — and usually guesses wrong.

Set the role and audience upfront

Before:

Extract the key decisions from these meeting notes and list the action items.

After:

You are an operations assistant for a fast-moving startup. Extract key decisions from these meeting notes for a non-technical founder who wasn't in the room. List action items with: owner name, deadline, and priority (P0/P1/P2). If an owner or deadline isn't mentioned, return 'TBD' instead of guessing.
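One way to keep role and audience from being forgotten is to make them required inputs when the prompt is assembled. The sketch below assumes prompts are built in code; PromptContext and build_prompt are hypothetical names, and the "Audience:" and "Meeting notes:" labels are illustrative wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptContext:
    role: str      # e.g. "an operations assistant for a fast-moving startup"
    audience: str  # e.g. "a non-technical founder who wasn't in the room"


def build_prompt(ctx: PromptContext, task: str, source_text: str) -> str:
    """Assemble a prompt that always states the role and the audience first."""
    if not ctx.role.strip() or not ctx.audience.strip():
        raise ValueError("role and audience are both required")
    return (
        f"You are {ctx.role}.\n"
        f"Audience: {ctx.audience}.\n\n"
        f"{task}\n\n"
        f"Meeting notes:\n{source_text}"
    )
```

Failing fast on a blank role or audience turns a silent quality problem into an error that shows up before the prompt is ever sent.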

Quick diagnostic

If your prompt doesn't specify a role ('You are a...') or an audience ('...for X'), add both.

3

No error handling for missing or bad input

The problem

Most prompts assume happy-path inputs: the document exists, the data is complete, the user message is coherent. Real-world inputs are messy, incomplete, and sometimes nonsensical.

Why it breaks at scale

Without explicit error handling, the model will fabricate outputs even when the input is garbage. A prompt that processes invoices should reject corrupted PDFs — but if you didn't tell it to, it will make up numbers.

Add validation and rejection rules

Before:

Read the uploaded invoice and extract the vendor name, total amount, and line items.

After:

Read the uploaded invoice and extract the vendor name, total amount, and line items. BEFORE extracting, check: (1) Is the file readable? (2) Is it actually an invoice (not a receipt, contract, or other document)? If either check fails, return {status: 'rejected', reason: 'unreadable' | 'not_an_invoice'} and stop. Do not fabricate data.
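A rejection rule only helps if the calling code acts on it. The sketch below shows one way to route the model's reply, assuming the model is told to answer in strict JSON (double-quoted keys, unlike the shorthand in the prompt). The field names vendor_name, total_amount, and line_items and the "manual_review" action are hypothetical.

```python
import json

# Hypothetical field names for a successful extraction.
REQUIRED_FIELDS = ("vendor_name", "total_amount", "line_items")


def handle_invoice_reply(raw: str) -> dict:
    """Decide what to do with the model's reply instead of trusting it blindly."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"action": "manual_review", "reason": "unparseable_reply"}
    if not isinstance(data, dict):
        return {"action": "manual_review", "reason": "unexpected_shape"}
    # The model followed the rejection rule: pass its reason along.
    if data.get("status") == "rejected":
        return {"action": "manual_review", "reason": str(data.get("reason", "unspecified"))}
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        return {"action": "manual_review", "reason": "missing_fields:" + ",".join(missing)}
    return {"action": "accept", "invoice": data}
```

Every path that is not a complete extraction ends in review, so a bad input cannot quietly become a made-up total.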

Quick diagnostic

If your prompt processes external inputs (files, user messages, API data) and has no validation step, add one.

4

Open-ended outputs with no format or length constraints

The problem

Prompts that ask for 'a summary', 'a response', or 'an analysis' without format or length limits produce outputs that are unpredictable in both structure and cost. One input might generate 50 words, another 500.

Why it breaks at scale

Unbounded outputs create two problems: (1) downstream systems can't parse them reliably, and (2) token costs become unpredictable. A 500-word summary where 50 words would do uses roughly 10x the output tokens and is less useful to the reader.

Define the output format and length budget

Before:

Analyze the customer feedback and suggest improvements.

After:

Analyze the customer feedback and suggest improvements. Output format: JSON with keys {summary: string, themes: string[], suggestions: string[], sentiment: 'positive'|'neutral'|'negative'}. Constraints: summary must be under 100 words, themes array max 5 items, suggestions array max 3 items. If feedback is empty or incoherent, return {error: 'invalid_input'}.
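Once the format and budget are written into the prompt, the same limits can be enforced on the reply. This is a minimal sketch of such a check, assuming the model returns strict JSON; validate_analysis is a hypothetical name, and the limits are copied from the prompt above.

```python
import json

ALLOWED_SENTIMENTS = ("positive", "neutral", "negative")
LIST_LIMITS = (("themes", 5), ("suggestions", 3))
SUMMARY_WORD_LIMIT = 100  # the prompt asks for "under 100 words"


def validate_analysis(raw: str) -> list[str]:
    """Return constraint violations; an empty list means the reply is usable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    # The prompt's own error shape is a valid reply, not a violation.
    if "error" in data:
        return [] if data["error"] == "invalid_input" else ["unknown error value"]
    problems = []
    summary = data.get("summary")
    if not isinstance(summary, str):
        problems.append("summary must be a string")
    elif len(summary.split()) >= SUMMARY_WORD_LIMIT:
        problems.append("summary must be under 100 words")
    for key, limit in LIST_LIMITS:
        items = data.get(key)
        if not isinstance(items, list) or not all(isinstance(x, str) for x in items):
            problems.append(f"{key} must be a list of strings")
        elif len(items) > limit:
            problems.append(f"{key} has {len(items)} items, max is {limit}")
    if data.get("sentiment") not in ALLOWED_SENTIMENTS:
        problems.append("sentiment must be positive, neutral, or negative")
    return problems
```

Counting words by splitting on whitespace is an approximation, which is usually close enough for a budget check.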

Quick diagnostic

If your prompt doesn't specify a format (JSON, markdown, bullet list) or a length limit (words, items, characters), add both.

5

Missing safety rails for high-stakes outputs

The problem

Prompts that generate customer-facing content, make decisions, or trigger actions need safety constraints. Without them, the model can produce outputs that damage trust, expose sensitive information, or cause real harm.

Why it breaks at scale

LLMs don't have judgment — they have pattern matching. A prompt that generates Slack messages might accidentally include sensitive data. A prompt that classifies support tickets might mark a refund request as 'spam'. Safety rails prevent these failures.

Add explicit safety constraints and escalation rules

Before:

Read the customer message and draft a reply that solves their problem.

After:

Read the customer message and draft a reply that solves their problem. SAFETY RULES: (1) Never include pricing or account data in the reply. (2) If the customer mentions a security issue, return {escalate: true, reason: 'security'} instead of a draft. (3) If you're unsure about a technical fix, say 'I'll connect you with a specialist' instead of guessing. (4) Never promise refunds, credits, or policy exceptions.
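Safety rules in the prompt are worth backing up with a check in code, since the model may still slip. The sketch below routes a reply before anything is sent. It assumes an escalation arrives as strict JSON and a normal draft arrives as plain text. The keyword and price patterns are deliberately crude illustrations, and route_reply and its action names are hypothetical.

```python
import json
import re

# Crude illustrative patterns; they will produce false positives and misses.
PROMISE_RE = re.compile(r"\b(?:refund|credit|exception)s?\b", re.IGNORECASE)
PRICE_RE = re.compile(r"[$€£]\s?\d")


def route_reply(raw: str) -> dict:
    """Escalate, hold, or send a model reply based on the prompt's safety rules."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = None  # plain-text draft, the normal case
    if isinstance(data, dict) and data.get("escalate") is True:
        return {"action": "escalate", "reason": str(data.get("reason", "unspecified"))}
    flags = []
    if PROMISE_RE.search(raw):
        flags.append("mentions refund/credit/exception")
    if PRICE_RE.search(raw):
        flags.append("contains a price")
    if flags:
        return {"action": "hold_for_review", "flags": flags}
    return {"action": "send", "draft": raw}
```

Holding a flagged draft for a human costs a little latency. Sending a promised refund that nobody approved costs more.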

Quick diagnostic

If your prompt generates content that a customer will see, or makes decisions that trigger actions, add safety rails.

The pattern underneath all five mistakes

Every mistake in this list comes from the same root cause: assuming the model will figure out what you want.

LLMs are powerful but not telepathic. They don't know your audience, your constraints, your failure modes, or your safety requirements unless you tell them. The fix is always the same: be explicit about what success looks like, what failure looks like, and what the model should do in each case.

A prompt that works in testing but breaks in production usually failed to define one of these: output shape, audience, error handling, length budget, or safety rails.

Test your prompts against these mistakes

The Prompt Evaluator checks for all five of these anti-patterns automatically, plus 20 more signals across clarity, context, robustness, cost, and safety.

If you catch these mistakes every week, that is the signal to move from the free evaluator to a paid prompt QA workflow.