Latest

Joseph Renner 03.26.26

Evaluating AI Agents: A Hybrid Deterministic and Rubric-Based Framework

How Hebbia measures agent quality at scale with a hybrid evaluation methodology.

Engineering
Lukas Schmit 03.19.26

FFTxt: 30k Parameters Is All You Need

Hebbia researchers leveraged classic signal processing techniques to build a text detection model smaller than most of the images it classifies.

Engineering
Jake Skinner, Davis Li, Adithya Ramanathan 09.19.25

Reaching Autonomous Consensus on Agentic Outputs

We built a statistically rigorous, consensus-based framework for evaluating LLM outputs and used it to benchmark today's leading models on the tasks that matter most to finance professionals.

Engineering
William Luer 07.30.25

A Look Inside Hebbia's "Deeper" Research Agent

We built a multi-agent system that goes beyond public web search to synthesize insights from any data source, including proprietary ones.

Engineering
Lucas Haarmann and Bowen Zhang 06.17.25

The Multi-Agent Redesign Behind Matrix

At the end of last year, we returned to the drawing board and redesigned Matrix Agent.

Engineering
Ben Devore 03.17.25

The Distributed System Behind Hebbia's High-Scale AI

We built a distributed LLM request scheduler that intelligently routes billions of tokens per day across multiple providers so high-priority work always gets through, even under rate limits.

Engineering
Adithya Ramanathan 02.14.25

Goodbye, RAG: How Hebbia solved Information Retrieval for LLMs

After pioneering semantic search and RAG, we found that both fell short on the hardest questions, so we scrapped them and built a new information retrieval system from scratch.

Engineering