schemadex¶
Schema introspection and resolution toolkit for SQL agents.
schemadex is a Rust core with Python bindings that turns a live database into something an LLM agent can actually consume: a token-budgeted, sample-aware, fuzzy-resolvable view of every table and column, cached on disk so the agent doesn't re-pay the introspection cost on every turn.
Install¶
pip install schemadex # core + all backends
pip install "schemadex[langchain,langgraph]" # with framework adapters
10-line example¶
from schemadex import SchemaCache
cache = SchemaCache.from_url("postgres://localhost/mydb")
for name in cache.list_tables():
print(name)
result = cache.resolve("public.orders", "customer_idd")
print(result.matched, result.confidence) # 'customer_id', 0.98
prompt, tokens = cache.describe_for_agent(max_tokens=1500, hint="orders by region")
Benchmarks¶
Synthetic adversarial corpus: 38 column-name hallucinations modeled after real LLM SQL agent failures — case flips, missing/added underscores, plural/singular drift, semantic near-misses.
Two harnesses, same corpus:
| Harness | Baseline | Treatment | Δ |
|---|---|---|---|
| Literal lookup (no LLM) | 0.0% | 94.7% | +94.7 pp |
qwen2.5-coder:3b via Ollama (real LLM) |
97.4% | 100.0% | +2.6 pp |
Same model in both halves of the LLM row — the difference is describe_for_agent ranking + resolve_column post-correction. The single miss in the LLM baseline is "What is the orderid on shipments?" — the model picked id instead of order_id; resolve_column fixed it.
Median per-record latency: literal harness ~7 µs / 3 µs (schemadex overhead is in the noise); LLM harness ~285 ms (LLM-bound on an M2 Max).
This is a micro-benchmark of the resolution path, not an end-to-end LLM agent comparison on BIRD/Spider. See Benchmark for methodology.
Why¶
LLM SQL agents fail in the same three ways over and over:
- They hallucinate column names because they got the schema in a thousand-token blob and forgot half of it.
- They retry the same broken query because they don't know
'No Delay'is the sentinel value covering 80% ofdelay_code. - They re-introspect on every step because there's nowhere obvious to cache the schema.
schemadex fixes all three:
- Resolution:
resolve_column(table, candidate)returns a confidence + alternatives instead of letting the agent guess. - Sampling:
sample_values=Truecollects top-K, percentiles, and flags any value over 40% frequency as a sentinel. - Cache: introspect once, persist to
~/.cache/schemadex/<db>/, refresh on DDL change. On a local 50-table SQLite, warm reads are ~47× cold; on remote Postgres the ratio grows since cold is network-bound.
Backends¶
| Backend | Feature flag | Status |
|---|---|---|
| Postgres | postgres |
Stable |
| SQLite | sqlite |
Stable |
| DuckDB | duckdb_backend |
Stable |
| MySQL | mysql |
Stable |
| BigQuery | bigquery |
Scaffold |
| Snowflake | snowflake |
Scaffold |
| MSSQL | mssql |
Scaffold |
MCP server¶
schemadex ships an MCP server. Install with pip install "schemadex[mcp]" and wire it into Claude Code by adding to ~/.claude/mcp.json:
{
"mcpServers": {
"schemadex": {
"command": "schemadex-mcp",
"args": ["--url", "sqlite:///path/to/db.sqlite"]
}
}
}
The agent then has list_tables, describe_for_agent, resolve_column, and run_sql tools without any extra wiring.
Project status¶
Pre-1.0. API is still in motion.
License¶
Licensed under either of MIT or Apache-2.0 at your option.