Reflection in AI Agents: Self-Reflection and Cross-Reflection
How an AI agent evaluates and improves its own output using self-reflection and cross-reflection. Step-by-step code guide for developers getting started.
Library
Technical articles: agents, architecture, tools and design decisions.
LLMs can't see the current time or your database. Tool use gives them that access. I explain the loop, tool definition, and the most common mistakes.
Learn how to run AI tasks in parallel: Sectioning divides the work, Voting repeats for higher reliability. Code guide for junior developers.
The Router pattern classifies user input and delegates it to the right agent. Learn how to implement it with rules, semantics, and fallback chains.
GoF patterns don't disappear in agentic systems; they transform. Command = tool call, Mediator = orchestrator, Adapter = MCP. The complete map with code.
Learn how to divide difficult tasks into sequential LLM calls. With line-by-line commented TypeScript code and no external frameworks.
Detect names, emails, and passwords in text without sending data to the cloud. Learn how it works and how to use it from Python.
How a chatbot with intent classification works, how one based on an LLM works, and when to choose each approach with concrete examples.
Set up autoMode.environment so the classifier understands your infrastructure and stops interrupting you with false positives.
An LLM predicts the next most probable token. It doesn't understand. So how does it produce outputs that seem to require real comprehension?
What is prompt injection, OWASP's #1 vulnerability for LLMs, how the attack works, and how to protect your app from day one.
How to build AI automations using Zapier, Make, or n8n. Five real workflows, comparison table, and a framework for choosing your first automation.
Guardrails prevent AI agents from deleting data, leaking information, or entering loops. Learn what they are and how to implement them with code from scratch.
Learn how to use an AI model to automatically evaluate another agent's responses. Rubrics, evaluation types, and the most common pitfalls.
Learn how to set up Projects in Claude from scratch: custom instructions, knowledge files, and real-world examples for different profiles.
AI models break text into tokens. Spanish needs more tokens than English to say the same thing. Here's why and what impact it has on your projects.
What temperature is in language models, how it affects responses, and how to choose the right value for your task. Examples with the same prompt.
What a language model's context window is, how it's measured in tokens, and concrete practices to leverage it from day one.
The 6 architectural decisions for your first production AI agent: precise objective, memory, tools with minimum privilege, and human oversight.
The methodology I use to make Claude Code improve itself: measuring, proposing hypotheses, iterating, and validating results.
Skills, hooks, CLAUDE.md: the complete map of Claude Code's 8 tools and when to use each one. A practical guide for beginners.
How to transform text into numerical vectors and build real semantic search. From cosine similarity to RAG, with diagrams, TypeScript code, and interactive exercises.
The 5 prompt engineering patterns: zero-shot, few-shot, CoT, role prompting, and structured output with real TypeScript code.
MCP (Model Context Protocol) standardizes how AI agents connect to external tools. Explanation with line-by-line commented Python examples.
How to build a production RAG with semantic chunking, hybrid search and reranking. The real decisions that determine whether your system retrieves well or fails silently.
AI generates code in seconds but doesn't know your dependencies. The testing strategy that works with coding agents.
Technical guide to measuring the real reliability of AI agents in production: task completion rate, deterministic evals, LLM-as-judge, and silent degradation detection.
The language you use affects how your AI reasons. Three concrete mechanisms by which TypeScript improves code generated by agents like Claude Code.
Gartner predicts over 40% of enterprise AI agent projects will be canceled before 2027. Tool loop, prompt injection, and MCP explained without the hype.
Every time the cost of programming has fallen, the number of developers has grown. That's why I think we're in the best era to be a developer.
How to use Claude Code Hooks to add quality gates the model can't skip: lint, tests, OpenAPI, and security before every commit.
There's a technique that lets you run dozens of tools in parallel without the model seeing any intermediate result. Nobody talks about it because it's buried in the specification. Once you discover it, you'll rethink how you design any agent pipeline.
Data shows poorly written AGENTS.md files reduce success rates by 2% and increase costs by 20%. Here's how to write yours correctly.