Section 01
Study Guide
Long-form notes and references.
Domain 4 shifts prompt engineering from 'writing instructions' to designing multi-layered contracts. Schemas, tool use, defensive parsing, and code-level guardrails work together to produce production-grade reliability from a probabilistic generator.
North Star
Use prompts for semantic problems, schemas for structural enforcement, and code for deterministic repair. Don't fight the model — architect around its probabilistic nature.
Glossary of Key Terms
- tool_use (Structured Generation Mode)
- An API mechanism that constrains model output to a defined input_schema, dramatically improving JSON validity even when no external tool actually executes.
- Dummy Tool
- A tool defined purely to force schema-constrained generation; never executed externally. Often called 'tool-use-as-generation-mode'.
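A minimal sketch of the dummy-tool pattern. The tool name, schema fields, and mocked response are all illustrative, and the `input_schema` shape assumes an Anthropic-style Messages API; nothing here executes a real tool.

```python
# Hypothetical "dummy tool": the schema is the point; it is never run.
record_ticket = {
    "name": "record_ticket",
    "description": "Record the structured fields extracted from a support ticket.",
    "input_schema": {
        "type": "object",
        "properties": {
            "category": {"type": "string",
                         "enum": ["billing", "bug", "feature_request"]},
            "urgency": {"type": "integer", "minimum": 1, "maximum": 5},
        },
        "required": ["category", "urgency"],
    },
}

def extract_tool_input(response_content):
    """Pull the schema-constrained arguments out of a tool_use content block."""
    for block in response_content:
        if block.get("type") == "tool_use":
            return block["input"]
    raise ValueError("model did not emit a tool_use block")

# Mocked response content, standing in for an API reply:
mocked = [{"type": "tool_use", "name": "record_ticket",
           "input": {"category": "billing", "urgency": 2}}]
fields = extract_tool_input(mocked)
```

The model is forced to emit arguments conforming to `input_schema`; the code then treats those arguments as the real output and never dispatches the "tool" anywhere.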
- Sentinel Value
- A reserved explicit placeholder (e.g., 'not_specified', 'unknown') that represents missing or unspecified data in a structured schema, replacing inconsistent omissions/nulls/blanks.
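The normalization side of this pattern can be sketched in plain Python (field names are illustrative):

```python
SENTINEL = "not_specified"

def with_sentinels(record, expected_fields):
    """Replace the many faces of 'missing' (absent key, None, empty string)
    with one explicit sentinel so downstream code branches on a single value."""
    return {
        field: (record.get(field)
                if record.get(field) not in (None, "") else SENTINEL)
        for field in expected_fields
    }

row = with_sentinels({"name": "Ada", "email": ""}, ["name", "email", "phone"])
# row == {"name": "Ada", "email": "not_specified", "phone": "not_specified"}
```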
- Few-Shot Curation
- The strategic selection of edge cases, decision boundaries, and contrastive pairs as in-context examples — chosen for informational value rather than quantity.
- Contrastive Example
- Two nearly-identical inputs with different outputs presented together so the model learns which features actually drive the decision.
- Decision Boundary
- The fuzzy zone between two similar labels where classification ambiguity exists — the highest-value target for few-shot examples.
- Chain-of-Thought (CoT)
- A technique in which the model works through a problem step by step before producing a conclusion.
- Grounded CoT
- CoT in which every reasoning step must cite explicit evidence from the provided source material, producing inspectable, verifiable, audit-friendly logic.
- Two-Pass Pipeline
- An architecture that splits a task into separate identification and execution stages to improve reliability, scalability, and interpretability.
- Decompose-and-Filter
- A pipeline pattern where one pass identifies/filters relevant items and another pass processes them — used for nested tasks, large volumes, or fuzzy-to-deterministic handoffs.
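A toy sketch of decompose-and-filter. The `TODO:` convention and both stub functions are illustrative; in production, pass 1 would be an LLM call that identifies items and pass 2 would process each one in isolation.

```python
def identify_pass(document):
    """Pass 1 (fuzzy): list candidate items. Stubbed deterministically here."""
    return [line for line in document.splitlines() if line.startswith("TODO:")]

def execute_pass(item):
    """Pass 2: process each identified item on its own, so a failure is
    scoped to one item instead of the whole document."""
    return {"task": item.removeprefix("TODO: "), "status": "queued"}

doc = "intro text\nTODO: rotate keys\nTODO: update schema\nfooter"
results = [execute_pass(item) for item in identify_pass(doc)]
```

Splitting the stages also makes each one independently testable and lets the second pass scale out over the filtered items.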
- Defensive Parsing
- Repairing minor structural drift (string→array, '3'→3, missing fields) in deterministic code rather than over-prompting the model to be perfectly formatted.
- Don't Fight the Model
- The principle that probabilistic formatting variability should be handled with code, not with longer prompts and retries.
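The two entries above can be sketched as one repair function. The drift cases handled (scalar where a list was expected, numeric strings, absent fields) come straight from the definition; the function and field names are illustrative.

```python
def repair(payload, defaults):
    """Deterministically patch common structural drift instead of re-prompting."""
    fixed = dict(defaults)           # missing fields fall back to defaults
    fixed.update(payload)
    for key, default in defaults.items():
        value = fixed[key]
        if isinstance(default, list) and not isinstance(value, list):
            fixed[key] = [value]     # "tags": "urgent"  ->  ["urgent"]
        if isinstance(default, int) and isinstance(value, str) and value.isdigit():
            fixed[key] = int(value)  # "count": "3"      ->  3
    return fixed

out = repair({"tags": "urgent", "count": "3"},
             defaults={"tags": [], "count": 0, "notes": ""})
# out == {"tags": ["urgent"], "count": 3, "notes": ""}
```

A dozen lines of deterministic repair replaces prompt bloat and retry loops for formatting-only failures.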
- Prefilling
- Supplying the beginning of the assistant's response to anchor the autoregressive generation path toward the intended task format and prevent false refusals.
- Autoregressive Anchoring
- The phenomenon where early generated tokens strongly bias the trajectory of subsequent tokens — exploited by prefilling.
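A minimal sketch of prefilling for JSON output. The messages shape assumes an Anthropic-style API where the final assistant turn seeds the response, and the completion below is simulated; the key detail is stitching the prefill back onto the continuation before parsing.

```python
import json

PREFILL = '{"'   # anchor the very first generated tokens inside a JSON object

# Request messages (shape is an assumption of this sketch):
messages = [
    {"role": "user", "content": "Extract the invoice fields as JSON."},
    {"role": "assistant", "content": PREFILL},  # the model continues from here
]

# The API returns only the continuation; re-attach the prefill before parsing.
simulated_completion = 'amount": 120, "currency": "EUR"}'
parsed = json.loads(PREFILL + simulated_completion)
```

Because the opening `{"` is already on the generation path, preamble text and false refusals are far less likely to appear before the JSON.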
- False Refusal
- The model incorrectly refusing a benign task because keywords (medical, security, legal) resemble risky topics during initial classification.
- Primacy Effect
- LLMs allocate disproportionate attention to the beginning of the context, making it the optimal location for critical safety constraints.
- Lost-in-the-Middle
- Reduced model attention to instructions buried in the middle of long contexts, weakening compliance with rules placed there.
- Defense-in-Depth (Prompt)
- Reinforcing safety rules across multiple layers — system prompt placement, schemas, tool permissions, infrastructure — so no single layer is solely trusted.
- Message Batches API
- An asynchronous bulk-processing mode (~24h window) offering ~50% lower cost in exchange for flexible latency, ideal for offline pipelines.
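A sketch of a bulk payload. The `{"custom_id", "params"}` request shape follows the Anthropic Message Batches API as an assumption, and the model name and documents are placeholders; `custom_id` is what joins asynchronous results back to your own records.

```python
documents = {"doc-001": "First contract text...",
             "doc-002": "Second contract text..."}

requests = [
    {
        "custom_id": doc_id,   # echoed back with each result
        "params": {
            "model": "example-model",   # placeholder model name
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": f"Summarize:\n{text}"}],
        },
    }
    for doc_id, text in documents.items()
]
```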
- Prompt Caching
- Reusing computation on repeated context prefixes (system prompts, knowledge bases, style guides) so the model does not recompute attention over the same prefix tokens on every request.
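A sketch of marking a stable prefix as cacheable. The `cache_control` block shape follows Anthropic's prompt caching API as an assumption; the style guide and question are placeholders. The essential property is that everything before the user turn is byte-identical across calls.

```python
STYLE_GUIDE = "...many thousands of tokens of style rules..."

system = [
    {"type": "text", "text": STYLE_GUIDE,
     "cache_control": {"type": "ephemeral"}},  # cache breakpoint after this block
]

def build_request(user_question):
    """Only the trailing user turn varies; the identical prefix is what
    allows repeated requests to hit the cache."""
    return {"system": system,
            "messages": [{"role": "user", "content": user_question}]}

req = build_request("Rewrite this paragraph in house style: ...")
```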