We built HALO (Hierarchical Agent Loop Optimizer), an open-source tool for debugging and optimizing AI agents using their execution traces. It’s a loop. Run your agent, feed the trac…

The installer downloads the latest release for your platform and sets up the desktop app. On macOS, it uses a signed, notarized DMG. You can also install directly from the GitHub releases page.

If you're looking for a hosted, plug-and-play version of HALO, please sign up for inference.net and follow the instructions here.

HALO is a methodology for building recursively self-improving agent harnesses using RLMs. This repository contains:

HALO is best suited to finding issues in production agent deployments. High-traffic environments tend to generate more data, with higher variance across executions, which produces the kind of issues HALO is designed to identify.

A general-purpose harness like Claude Code is the wrong tool for trace analysis. This isn’t because the model isn’t smart, but because traces can get extremely long, and you need a specialized toolkit to make observations about systemic agent behavior. In our testing, harnesses like Claude Code would often overfit to an error present in one or a few traces rather than generalize to harness-level problems. This led us to create a specialized form of an RLM.

HALO uses the canonical OpenAI env vars: OPENAI_API_KEY for credentials and OPENAI_BASE_URL for OpenAI-compatible providers. If OPENAI_BASE_URL is unset, HALO uses https://api.openai.com/v1.

Run halo --help to see all CLI options. The CLI mirrors the model/provider settings exposed by the Python SDK's ModelConfig and ModelProviderConfig.

HALO can emit OpenInference-shaped traces of its own LLM, tool, and agent activity. Telemetry is off by default; nothing is emitted unless you pass --telemetry. When telemetry is enabled and CATALYST_OTLP_TOKEN is set, spans are uploaded to inference.net Catalyst over OTLP. If it is unset, spans are written to a local JSONL file at ./halo-telemetry-{run_id}.jsonl in the current working directory.
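The environment-variable rules described above (the base URL fallback and the choice of telemetry destination) can be summarized in a short sketch. This is not HALO's actual implementation: the function names `resolve_base_url` and `resolve_telemetry_sink`, the return shapes, and the choice to treat an empty variable as unset are all assumptions made for illustration. Only the variable names, the default URL, and the JSONL file name pattern come from the text.

```python
import os
from pathlib import Path

# Default stated in the README for when OPENAI_BASE_URL is unset.
DEFAULT_BASE_URL = "https://api.openai.com/v1"


def resolve_base_url(env=None):
    """Hypothetical helper: pick the API base URL from the environment.

    Assumption: an empty OPENAI_BASE_URL is treated the same as unset.
    """
    env = os.environ if env is None else env
    return env.get("OPENAI_BASE_URL") or DEFAULT_BASE_URL


def resolve_telemetry_sink(telemetry_enabled, run_id, env=None, cwd=None):
    """Hypothetical helper: decide where telemetry spans would go.

    Returns a (kind, path) tuple, an illustrative shape only:
      ("off", None)    telemetry flag not passed, nothing is emitted
      ("otlp", None)   CATALYST_OTLP_TOKEN is set, spans go out over OTLP
      ("jsonl", path)  no token, spans go to a local JSONL file
    """
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else Path(cwd)

    if not telemetry_enabled:
        return ("off", None)
    if env.get("CATALYST_OTLP_TOKEN"):
        return ("otlp", None)
    # File name pattern from the README: ./halo-telemetry-{run_id}.jsonl
    return ("jsonl", cwd / f"halo-telemetry-{run_id}.jsonl")


if __name__ == "__main__":
    print(resolve_base_url())
    print(resolve_telemetry_sink(True, run_id="example-run"))
```

Passing `env` and `cwd` explicitly keeps the sketch easy to test without touching the real process environment; the actual CLI and SDK may resolve these settings differently.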