AgentData is a self-hostable semantic layer and data fabric that auto-discovers the business entities inside your databases and serves them to people and AI agents as plain-language questions over REST and MCP — without writing SQL and without your row data ever leaving your environment. It's a commercial data-management product, with integration pipelines on the roadmap.
The gap between your data and the AI agents that want to use it
Postgres here, Snowflake there, an S3 lake, a legacy SQL Server. AgentData clusters equivalent tables across all of them into one business entity.
AgentData profiles every table, classifies it into an entity and role, infers relationships, and emits Cube + dbt-semantic YAML you simply review and approve.
Only the semantic model and the question reach the LLM. SQL runs locally against read-only sources; only the result returns. Run it fully air-gapped if you need to.
One validated query path backs both a REST API and an MCP server — so Claude, ChatGPT or your own agents query governed data, not raw tables.
Two paths: one writes the model, one reads the data — both governed
Point AgentData at a source and it runs: profile → classify → cluster → relate → emit. Equivalent tables across sources collapse into a single entity (Order, Customer, Product…), relationships are inferred, and a Model Registry is written for human review.
Approve-by-default review, drag-and-drop merges (union / join / SCD), calculated columns, and metrics taught in plain language ("revenue = price × qty − discount").
A question runs: plan → validate → dispatch → execute → result. The planner generates SQL from the approved model only, validates it, and executes it locally against your read-only sources.
Conversational memory ("just the top 2 of those"), multilingual questions (ask in Hebrew or any language), and saved-and-approved queries that guide the planner for everyone.
One config-driven splitter routes each source to the right adapter and engine — native, Trino or Athena.
Profiles, classifies and clusters tables into business entities and emits standards-based YAML.
Ask in plain language, follow up, and ask in any language — answers show the plan, the model YAML and the SQL.
The same validated retrieval path backs both interfaces — multi-tenant by per-user API key.
Keep everything inside your boundary. Pluggable LLM backend, zero external egress when you need it.
Production concerns handled out of the box, from cost to audit.
AgentData ships a live, secure Model Context Protocol endpoint. Create a per-user API key in the Connect tab, then point any MCP client at it.
Exposes list_entities, describe_entity, query_metric and query_nl over the same validated path used by the REST API. query_nl works in any language and never hard-fails.
No. It connects read-only. Only the semantic model and your question reach the LLM — never row data. SQL is generated from the model and executed locally; only the result comes back.
PostgreSQL, MySQL, SQL Server, Oracle, Snowflake, Redshift and Synapse via SQLAlchemy, plus S3 + Glue/Athena and Azure Blob via a lake adapter.
Yes. Point it at a local OpenAI-compatible server (Ollama / vLLM / TGI) for zero external egress, or use AWS Bedrock over PrivateLink. Credentials are Fernet-encrypted at rest; adapters are read-only.
Instead of hand-modelling every cube and metric, AgentData auto-discovers entities across heterogeneous sources, collapses equivalent tables into one entity, infers relationships, and emits Cube + dbt-semantic YAML you review — then serves it to both people and AI agents through one validated path.
Book a 30-minute demo and we'll connect it to a sample of your stack — or open the live app and explore the Northwind demo yourself.