Claude Code Hooks: Deterministic Quality Gates for Maintainable Projects
How to use Claude Code Hooks to add quality gates the model can't skip: lint, tests, OpenAPI, and security before every commit.
Contributors: Ivan Garcia Villar
I’ve watched developers ship a complete feature with AI in 15 minutes. It works perfectly on the day. Three months later, nobody wants to touch that module — it’s become a knot of nested conditionals and cross-cutting dependencies. Here’s how to avoid that pattern using Claude Code Hooks: quality controls that run at deterministic points in the cycle and that the model cannot skip, even if it tries.
Why AI Generates Accidental Complexity
There’s an important difference between “easy” and “simple.” An agent can generate an endpoint that handles five business cases in a monolithic 200-line block. It’s easy: it works immediately, the tests pass, the user is happy. But it’s not simple: every new requirement forces you to touch the same block, side effects multiply, and whoever picks up the code six months later has to understand everything to change anything.
The problem isn’t just the model. In most of the projects where I’ve seen this degradation, there are three concrete causes: the agent doesn’t have enough context about the existing structure, the prior code already accumulates bad patterns that the agent follows as reference, or the model simply doesn’t find the relevant files and duplicates logic instead of reusing it. AI optimizes for solving the current prompt. Your job is to optimize for the project long-term.
Hooks are part of that answer. Not the only part, but the most underused.
What Are Claude Code Hooks
Hooks are shell commands, HTTP endpoints, prompts, or agents that run automatically at specific points in the Claude Code cycle. The fundamental difference from “telling Claude to do something”: hooks don’t depend on the model deciding to run them. They’re inflexible rules that execute independently of what the model has planned.
You configure them in .claude/settings.json (project scope, committable to the repo) or ~/.claude/settings.json (global scope, applies to all your projects). You can also manage them interactively with the /hooks command inside Claude Code.
The base structure looks like this:
```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/lint-changed-file.sh"
          }
        ]
      }
    ]
  }
}
```
Three levels of nesting: the event (PostToolUse), the matcher group that filters when it applies (Edit|Write), and the handler that defines what to run. The matcher is a regular expression against the tool name. $CLAUDE_PROJECT_DIR is an environment variable Claude Code injects with the project root path.
The most useful events for code quality are PreToolUse (before a tool runs, can block it), PostToolUse (after it runs successfully), and Stop (when Claude finishes responding). The full cycle includes many more, from SessionStart to WorktreeCreate, but these three cover 90% of quality use cases.
The Three Hook Types I Use in Production
Command Hooks: The Deterministic Base
The command type runs a shell script. The script receives the event context as JSON on stdin and communicates the result via exit code. The semantics depend on which event the hook fires from:
- In PreToolUse: exit 0 allows the action to proceed; exit 2 blocks it with an error message Claude sees and can use to self-correct; any other code fails silently without blocking.
- In PostToolUse: the file change has already happened. Exit 2 feeds stderr back to Claude so it can self-correct; it does not block or undo the action. Exit 0 continues normally.
This distinction matters. Don’t use a PostToolUse hook with exit 2 expecting it to prevent a change — it won’t. Use PreToolUse when you need prevention; use PostToolUse when you want correction feedback after the fact.
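As an illustration, a minimal PreToolUse guard might look like the sketch below. The blocked patterns are assumptions you'd tune to your own environment, not an exhaustive list:

```shell
#!/bin/bash
# Hypothetical PreToolUse guard: decide whether a bash command should be blocked.
# The pattern list is an assumption; adapt it to your project.
check_command() {
  case "$1" in
    *"rm -rf"*|*"git push --force"*|*"DROP TABLE"*) echo "block" ;;
    *) echo "allow" ;;
  esac
}

# In the actual hook script, the decision maps to exit codes:
#   CMD=$(cat | jq -r '.tool_input.command // empty')
#   if [ "$(check_command "$CMD")" = "block" ]; then
#     echo "Blocked potentially destructive command: $CMD" >&2
#     exit 2   # PreToolUse: exit 2 prevents the tool call
#   fi
#   exit 0
```

Because this runs on PreToolUse, exit 2 actually prevents the command; the same script on PostToolUse would only produce feedback after the fact.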
A concrete example: run lint only on the file Claude just modified.
```shell
#!/bin/bash
# .claude/hooks/lint-changed-file.sh
INPUT=$(cat)
FILE_PATH=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty')

# Only lint TypeScript/JavaScript files
if [[ "$FILE_PATH" != *.ts && "$FILE_PATH" != *.js ]]; then
  exit 0
fi

# Options must come before the "--" separator; everything after it is a file path
if ! npx eslint --max-warnings 0 -- "$FILE_PATH"; then
  echo "Lint failed in $FILE_PATH. Fix the errors before continuing." >&2
  exit 2
fi

exit 0
```
One important detail: don’t add 2>&1 to the ESLint command. For PostToolUse hooks with exit 2, Claude Code reads stderr to surface actionable feedback. With 2>&1, ESLint’s rule violations, line numbers, and error messages all land on stdout where they’re invisible to Claude. Let ESLint write directly to stderr.
The JSON that arrives on stdin includes tool_input.file_path for write events, tool_input.command for Bash, and session_id, cwd, and hook_event_name in all events. You can extract any of these fields with jq to condition the behavior.
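For instance, extracting those fields with jq from a sample payload (the payload below is a made-up illustration shaped like a PostToolUse event, not real session data):

```shell
#!/bin/bash
# Pull individual fields out of the hook event JSON with jq.
extract_field() {
  echo "$1" | jq -r "$2 // empty"
}

# Made-up sample payload shaped like a PostToolUse event
SAMPLE='{"session_id":"abc123","hook_event_name":"PostToolUse","cwd":"/repo","tool_input":{"file_path":"src/app.ts"}}'

extract_field "$SAMPLE" '.tool_input.file_path'   # src/app.ts
extract_field "$SAMPLE" '.hook_event_name'        # PostToolUse
extract_field "$SAMPLE" '.tool_input.command'     # prints nothing: field absent
```

The `// empty` fallback keeps the script from printing the literal string "null" when a field is missing, which makes the output safe to use in conditionals.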
Prompt Hooks: Semantic Judgment With a Fast Model
For conditions a linter can’t evaluate, the prompt type sends a prompt to a Claude model (Haiku by default, configurable with the model field) and expects a JSON decision in the format {"ok": true} or {"ok": false, "reason": "..."}.
One important limitation: the $ARGUMENTS placeholder is replaced with the full event JSON before the prompt is sent. For Write events this contains the complete file content — usable for structural analysis. For Edit events it only contains the old_string and new_string fragments. A prompt hook cannot reliably count total lines or nesting depth from a partial diff.
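A prompt hook entry in settings.json might look like this sketch; the matcher and prompt text are illustrative assumptions, not a canonical recipe:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write",
        "hooks": [
          {
            "type": "prompt",
            "prompt": "Does the file in this event follow the project's naming conventions for exported functions? Answer only {\"ok\": true} or {\"ok\": false, \"reason\": \"...\"}. $ARGUMENTS"
          }
        ]
      }
    ]
  }
}
```

Matching on Write rather than Edit sidesteps the partial-diff limitation: for Write events the prompt sees the complete file content.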
When to Use prompt vs agent: The Trade-Off Between Cost and Capability
The key question is: do you need to inspect the codebase, or just evaluate the event context?
Use prompt when:
- You only need to evaluate the data already in the event JSON ($ARGUMENTS)
- The decision is semantic but doesn't require reading the source code
- You want low latency (a single LLM call, typically Haiku: fast and cheap)
- Example: "Could this bash command affect production?" The command is already in the event JSON, so there's no need to open files.
Use agent when:
- You need to verify the real state of the codebase (read files, execute tests, search for patterns)
- The verification requires context beyond what the event provides
- You can tolerate higher latency (agent hooks default to a 60-second timeout and support up to 50 tool turns)
- Example: "Do all modified files have corresponding tests?" You need Glob, Read, and Grep to answer that.
The practical implication: detecting levels of nesting is conceptually valid as a quality gate, but it's a terrible use for an LLM. Nesting depth is a measurable, algorithmic property; a linter or a small script can determine it in milliseconds with absolute certainty. Sending file content to an LLM, paying for tokens, and waiting for a response when you could run a script or an AST parser locally is a waste of latency and cost.
Reserve agent hooks for verifications that genuinely require reasoning about code. A test suite pass/fail is a perfect example: the agent must run the test command, parse the output, and make a judgment call. Naming conventions, function length, and nesting depth are not. Use a command hook with a linter for those.
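To make that concrete, here's a rough command-hook helper that measures maximum brace-nesting depth deterministically. It's a naive sketch (it ignores braces inside strings and comments; a real setup would use an AST tool), but it costs zero tokens:

```shell
#!/bin/bash
# Naive max brace-nesting depth: counts { and } character by character.
# Ignores braces inside strings/comments -- an AST parser would be more precise.
max_nesting_depth() {
  awk '
    {
      for (i = 1; i <= length($0); i++) {
        c = substr($0, i, 1)
        if (c == "{") { depth++; if (depth > max) max = depth }
        else if (c == "}") depth--
      }
    }
    END { print max + 0 }
  ' "$1"
}
```

A command hook could run this on the file Claude just modified and exit 2 when the depth exceeds your threshold, giving the model immediate, deterministic feedback.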
For simpler semantic checks where the event content is sufficient — checking whether naming conventions were followed in a newly written file, for example — type: "prompt" with Haiku is faster and cheaper than spinning up an agent. Make the prompt ask a binary, concrete question. The more open-ended the question, the more variability in the response.
Agent Hooks: Verification With Access to Real Code
When verification requires inspecting files, searching for patterns, or running commands, the agent type launches a subagent with tool access (Read, Grep, Glob, Bash) that can investigate the real state of the project before returning its decision.
```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "agent",
            "prompt": "Verify that all unit tests pass. Run the test suite and check the results. $ARGUMENTS",
            "timeout": 120
          }
        ]
      }
    ]
  }
}
```
The agent hook has a default timeout of 60 seconds and can run up to 50 tool turns. The difference from a prompt hook: while a prompt only evaluates the event context, an agent can open files, search the codebase, and run commands before deciding. The agent returns the same decision format — {"ok": true} or {"ok": false, "reason": "..."} — but with access to the full toolkit.
The table below summarizes the differences between the three types:
| Type | What it runs | When to use it | Cost |
|---|---|---|---|
| command | Shell script | Deterministic rules: lint, formatting, tests | Minimal |
| prompt | LLM call (Haiku) | Semantic judgment: naming, structure, conceptual coverage | Low |
| agent | Subagent with tools | Verification against real code: tests, dependencies, consistency | Medium |
A command hook running lint takes 1–3 seconds. An agent hook running the full test suite can take 30–120 seconds. Match the hook type to the feedback speed you need at each point in the cycle.
Verification Gates Before Claude Finalizes
The place where I’ve gotten the most value from hooks is as a gate before Claude declares the task complete. Using the Stop event with a verification agent, I can enforce concrete conditions before the model says “I’m done.”
These are the three checks I run in serious projects:
1. Mandatory tests before finishing
The most basic. A Stop hook with type: "agent" that runs the test suite and blocks if any fail.
2. OpenAPI documentation kept current
I use this to ensure any change to API endpoints is reflected in the spec. There are two approaches depending on how rigorous you need to be:
Simple approach: check if the file was modified
This script checks whether route files changed in the current session and, if so, whether openapi.json was also updated:
```shell
#!/bin/bash
# .claude/hooks/check-openapi-freshness.sh
INPUT=$(cat)  # consume the hook event JSON (unused in this script)

# Check if any route files were modified in this session
ROUTES_MODIFIED=$(git diff --name-only HEAD | grep -E 'routes|controllers' | wc -l)

if [ "$ROUTES_MODIFIED" -gt 0 ]; then
  # git diff --exit-code exits 0 if no changes, 1 if there are changes.
  # If openapi.json has no diff, it wasn't updated: block.
  if git diff --exit-code openapi.json > /dev/null 2>&1; then
    echo "openapi.json must be updated when route files change." >&2
    exit 2
  fi
fi

exit 0
```
Wire this to a Stop event in settings.json:
```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/check-openapi-freshness.sh"
          }
        ]
      }
    ]
  }
}
```
Robust approach: analyze code to verify documentation matches API changes
If you need stronger verification that the documentation actually reflects the code changes (not just that the file was touched), use an agent hook. This approach analyzes the modified route files and compares them against the OpenAPI spec, ensuring endpoints, methods, and parameters match:
```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "agent",
            "prompt": "Check all the documents modified in the commit and verify the documentation is updated following the standards defined in the skill. Specifically: 1) Identify all API route files that were modified in this session. 2) For each modified route, extract the endpoint paths, HTTP methods, and parameter definitions from the code. 3) Verify that these match exactly in the openapi.json or openapi.yaml file. 4) Check that request/response schemas are documented. If the API routes changed but the OpenAPI spec does not reflect those changes, respond with {\"ok\": false, \"reason\": \"[list the mismatches]\"}. If everything is documented correctly, respond with {\"ok\": true}.",
            "timeout": 120
          }
        ]
      }
    ]
  }
}
```
The agent approach is more powerful because it doesn’t just check if the file was modified — it uses Read, Glob, and pattern analysis to ensure the documentation content actually matches the code. Use this when your project has strict API contract requirements or when false positives from the simple script become a problem.
3. Basic security analysis
An agent hook on Stop that looks for problematic patterns before Claude finishes: hardcoded credentials, SQL queries built with string concatenation, or new endpoints added without authentication middleware. This type of verification requires inspecting real code, so type: "agent" is the right choice.
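For the hardcoded-credentials part specifically, it's worth pairing the agent with a cheap deterministic pre-filter. The sketch below is hypothetical; the regex only catches the most obvious assignment patterns, and a production setup would lean on a dedicated scanner like gitleaks:

```shell
#!/bin/bash
# Naive secret scan: flags lines assigning string literals to key-like names.
# The pattern is a rough illustration; tune it to your stack.
scan_for_secrets() {
  grep -niE '(api[_-]?key|secret|password|token)[[:space:]]*[:=][[:space:]]*"[^"]+"' "$1"
}
```

Wired as a command hook, a match (grep exit 0) can be converted to exit 2 so the cheap cases get blocked instantly, and the agent is left to judge only the grey areas.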
One practical note: this agent reads files to scan them, which means it may transmit the very credentials it’s looking for when it sends that content to the API. The better approach is to pass only file paths in the prompt and let the agent do its own local reads. Smaller prompt, lower cost, and sensitive content doesn’t need to leave the machine:
```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "agent",
            "prompt": "Review the files modified in this session looking for: 1) hardcoded credentials or secrets, 2) SQL queries using string concatenation instead of parameters, 3) new endpoints without authentication middleware. Use your tools to read the modified files locally. If you find any issue, respond {\"ok\": false, \"reason\": \"[description of the problem and affected file]\"}. If everything looks fine, {\"ok\": true}.",
            "timeout": 90
          }
        ]
      }
    ]
  }
}
```
The minimum viable setup for any project that will last more than three months: a test hook on Stop. Without it, anything Claude generates can break existing code without anyone noticing until someone runs the tests manually.
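The shape of that minimum gate is simple. In this sketch the test command is a parameter because it's project-specific (npm test, pytest, go test, and so on; the names here are placeholders):

```shell
#!/bin/bash
# Generic Stop-gate sketch: run the project's test command and signal a
# blocking failure (return 2) with feedback on stderr if it fails.
run_test_gate() {
  test_cmd="$1"   # e.g. "npm test" or "pytest -q"; project-specific
  if ! $test_cmd > /dev/null 2>&1; then
    echo "Test suite failed. Fix the failing tests before finishing." >&2
    return 2
  fi
  return 0
}

# In the actual hook script: run_test_gate "npm test"; exit $?
```

Exit 2 on Stop feeds the stderr message back to Claude, which typically responds by investigating and fixing the failing tests before trying to finish again.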
Token Cost Isn’t a Problem, It’s a Trade-Off
Model-based and agent hooks consume additional tokens. A prompt hook with Haiku evaluating a naming convention can cost 200–400 tokens per invocation. An agent hook running the test suite costs, in my experience, between 2,000 and 10,000+ tokens depending on how many files the agent reads — a simple run that only checks test output stays at the low end; one that reads multiple files for security analysis can push well past the top.
The right framing isn’t “can I afford this cost?” but “what does it cost me not to have it?” A security bug that reached production because nobody verified AI-generated code costs more in debugging time, reputation damage, and incidents than weeks of hook usage. For serious long-term development, the token cost is a justifiable trade-off.
What does make sense to optimize is which hook type to use where. The hierarchy is clear:
- Use a command hook (shell script) first — no token cost, runs in milliseconds, deterministic. Best for: lint, format checks, test execution, pattern detection via regex or AST parsing.
- Use a prompt hook with Haiku second — low token cost (200–400 tokens), single LLM call, suitable for semantic judgments where the event context is sufficient. Best for: naming conventions, variable clarity, code style decisions.
- Use an agent hook last — higher token cost (2,000–10,000+ tokens), can read files and run tools, necessary only for verifications that require reasoning about code state. Best for: test suites, security scanning, cross-module consistency, dependency analysis.
The most common mistake: reaching for an agent to verify something that bash or a linter can check in one line. Nesting depth, line counts, and bracket balancing are algorithmic — don’t pay for an LLM to solve them.
Common Mistakes
Hooks That Are Too Strict from the Start
The most frequent mistake when starting out: configuring a PreToolUse hook that blocks 80% of Claude’s actions because the rules are too aggressive. The symptom is Claude in a loop, trying the same action in slightly different ways.
Start permissive. A PostToolUse hook that runs lint but doesn’t block (a command type that always exits 0, showing warnings on stderr) is a good first step. Convert warnings to blocks only once you’ve calibrated that the rules generate zero false positives in your project.
Excessive Agent Context
Giving a worker agent access to 100 files “just in case” has the opposite effect: the model loses focus and makes changes to unrelated modules. The symptom is asking “fix this bug in the payments module” and Claude modifying configuration files or authentication tests.
Minimum viable context. Add files to the context only when they’re strictly necessary for the task. A UserPromptSubmit hook with an agent that verifies task scope before proceeding can help catch prompts that are too ambiguous before they execute.
Full Tests as a PostToolUse Hook
Configuring a PostToolUse hook that runs the full test suite after every file change is the fastest way to abandon hooks entirely. Every minor edit takes two minutes waiting for 300 tests.
Divide responsibilities: lint and unit tests for the modified file in PostToolUse (fast, 3–15 seconds), full suite in the Stop hook or as a CI gate. Feedback loop speed matters for hooks to remain useful rather than feel like friction.
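In settings.json, that split might look like the sketch below; run-full-suite.sh is a hypothetical script name standing in for whatever runs your complete suite:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/lint-changed-file.sh"
          }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/run-full-suite.sh",
            "timeout": 300
          }
        ]
      }
    ]
  }
}
```

Fast, per-file feedback stays in the edit loop; the expensive, comprehensive check runs once at the end.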
Using LLM Hooks for Deterministic Checks
A tempting but costly mistake: using an agent or prompt hook to verify properties that can be checked algorithmically — nesting depth, line counts, bracket matching, or presence of specific patterns. These checks have a single correct answer and should run deterministically.
Use a shell script (command hook) with regex or an AST parser instead. They’re orders of magnitude faster, consume zero tokens, and always give the same result. Reserve LLM hooks for genuinely semantic decisions: “Is this variable name clear?” or “Does this error message provide enough context?”
The counter-argument (“what if the rules are subjective?”) is valid for naming and style, but even then, you save vastly more by using a linter (deterministic, milliseconds) than a model (tokens, latency, variability).
Vague Prompts in Model Hooks
A prompt hook that says “evaluate whether the code is good and return ok or not” generates inconsistent responses. Haiku will interpret “good” differently each invocation. The practical result is hooks that sometimes block, sometimes don’t, for the same reasons.
Model hooks work best with binary, concrete questions: “does this SQL query use parameterized statements instead of string concatenation?” or “are there hardcoded API keys in this code?” One question, one answer, predictable result.
Implementation Checklist
- Create .claude/hooks/ in the project and add scripts to version control
- Treat .claude/** changes as code requiring the same review as production scripts; add them to CODEOWNERS if the project has one (in open-source repos, a malicious contributor could add hooks that run arbitrary code on every developer's machine)
- Configure a PostToolUse hook with an Edit|Write matcher for lint on the modified file
- Configure a Stop hook with type: "agent" to verify the test suite passes
- Adjust the test hook timeout to the actual suite runtime (default: 60s for agent hooks, 600s for command hooks)
- Make scripts executable with chmod +x .claude/hooks/*.sh
- Verify hooks appear in /hooks and trigger correctly
- Add a security hook for projects with sensitive data or public endpoints
- Document in CLAUDE.md what each hook does and why it exists
Sources
- Hooks reference — Claude Code documentation — input/output schemas, hook types, lifecycle events, and configuration examples.
- Automate workflows with hooks — Claude Code guide — practical guide with ready-to-use examples and common troubleshooting.
Frequently Asked Questions
Why hooks and not just a good CLAUDE.md?
CLAUDE.md tells the model how it should behave. Hooks define what happens regardless of how the model decides to behave. For context and style preferences, CLAUDE.md is the right tool. For quality guarantees that can’t be broken, hooks are the answer. Use them together: CLAUDE.md for instructions, hooks for invariants.
Do hooks work in Claude Code CLI, Cursor, and other editors?
Hooks execute when Claude Code itself runs; they are not editor features. They work in any context where Claude Code runs: the CLI in a terminal, and automated pipelines via the API. They do not run in Cursor or other editors that use their own model integration layer, even if .claude/settings.json exists in the repo. One documented exception: PermissionRequest hooks don't fire in non-interactive mode (-p). For automated pipelines, use PreToolUse instead.
What if the verification hook keeps failing and blocks everything?
Check the error message in Claude’s transcript. The text you write to stderr at exit 2 reaches the model directly as feedback. If the hook is rejecting systematically, either the rules are too strict for the current state of the project, or there’s a bug in the script. You can temporarily disable all hooks with "disableAllHooks": true in settings while you debug. For detailed debugging, run claude --debug or activate verbose mode with Ctrl+O.
Can I share a project’s hooks with the whole team?
Yes. Hooks in .claude/settings.json can be committed to the repository and apply to any team member who uses Claude Code on that project. Hooks in ~/.claude/settings.json are local to your machine. For hooks with absolute paths or developer-specific configurations, use .claude/settings.local.json, which is gitignored by default. The practical recommendation: lint and test hooks in the project (.claude/settings.json), desktop notifications in global (~/.claude/settings.json). Treat changes to .claude/** with the same review rigor as production scripts — if the project has a CODEOWNERS file, include it.