schemadex¶

Schema introspection and resolution toolkit for SQL agents.

schemadex is a Rust core with Python bindings that turns a live database into something an LLM agent can actually consume: a token-budgeted, sample-aware, fuzzy-resolvable view of every table and column, cached on disk so the agent doesn't re-pay the introspection cost on every turn.

Install¶

pip install schemadex                            # core + all backends
pip install "schemadex[langchain,langgraph]"     # with framework adapters

10-line example¶

from schemadex import SchemaCache

cache = SchemaCache.from_url("postgres://localhost/mydb")

for name in cache.list_tables():
    print(name)

result = cache.resolve("public.orders", "customer_idd")
print(result.matched, result.confidence)   # 'customer_id', 0.98

prompt, tokens = cache.describe_for_agent(max_tokens=1500, hint="orders by region")

Benchmarks¶

Synthetic adversarial corpus: 38 column-name hallucinations modeled after real LLM SQL agent failures — case flips, missing/added underscores, plural/singular drift, semantic near-misses.

Two harnesses, same corpus:

Harness	Baseline	Treatment	Δ
Literal lookup (no LLM)	0.0%	94.7%	+94.7 pp
`qwen2.5-coder:3b` via Ollama (real LLM)	97.4%	100.0%	+2.6 pp

Same model in both halves of the LLM row — the difference is describe_for_agent ranking + resolve_column post-correction. The single miss in the LLM baseline is "What is the orderid on shipments?" — the model picked id instead of order_id; resolve_column fixed it.

Median per-record latency: literal harness ~7 µs / 3 µs (schemadex overhead is in the noise); LLM harness ~285 ms (LLM-bound on an M2 Max).

This is a micro-benchmark of the resolution path, not an end-to-end LLM agent comparison on BIRD/Spider. See Benchmark for methodology.

Why¶

LLM SQL agents fail in the same three ways over and over:

They hallucinate column names because they got the schema in a thousand-token blob and forgot half of it.
They retry the same broken query because they don't know 'No Delay' is the sentinel value covering 80% of delay_code.
They re-introspect on every step because there's nowhere obvious to cache the schema.

schemadex fixes all three:

Resolution: resolve_column(table, candidate) returns a confidence + alternatives instead of letting the agent guess.
Sampling: sample_values=True collects top-K, percentiles, and flags any value over 40% frequency as a sentinel.
Cache: introspect once, persist to ~/.cache/schemadex/<db>/, refresh on DDL change. On a local 50-table SQLite, warm reads are ~47× cold; on remote Postgres the ratio grows since cold is network-bound.

Backends¶

Backend	Feature flag	Status
Postgres	`postgres`	Stable
SQLite	`sqlite`	Stable
DuckDB	`duckdb_backend`	Stable
MySQL	`mysql`	Stable
BigQuery	`bigquery`	Scaffold
Snowflake	`snowflake`	Scaffold
MSSQL	`mssql`	Scaffold

MCP server¶

schemadex ships an MCP server. Install with pip install "schemadex[mcp]" and wire it into Claude Code by adding to ~/.claude/mcp.json:

{
  "mcpServers": {
    "schemadex": {
      "command": "schemadex-mcp",
      "args": ["--url", "sqlite:///path/to/db.sqlite"]
    }
  }
}

The agent then has list_tables, describe_for_agent, resolve_column, and run_sql tools without any extra wiring.

Project status¶

Pre-1.0. API is still in motion.

License¶

Licensed under either of MIT or Apache-2.0 at your option.