
DAVA Connect.

Discover relationships across datasets — FK candidates, value overlap, semantic links. The ring graph that finds the joins you forgot.

Upload two or more files. Connect runs a deterministic first pass (FK candidates, value-overlap, name Jaccard) and an optional LLM-augmented second pass that proposes semantic relationships across columns the heuristic missed.

Product · 03

What Connect ships with.

Each capability runs in the shared engine — the Norm pipeline, the Trust audit chain, the Decisioning mode toggle. Same substrate as the other four products.

01

Foreign-key discovery

Surfaces candidate FK relationships: which column in file A is referenced by which column in file B, with a confidence score per pair.
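One common way to score an FK candidate is the match rate: the share of values in the referencing column that also appear in a unique column of the other file. The sketch below illustrates that idea only. The function names, the 0.9 threshold, and the in-memory table layout are assumptions made for the example, not Connect's actual implementation or scoring.

```python
def fk_match_rate(child_values, parent_values):
    """Fraction of non-null child values that also appear in the parent column."""
    parent = set(parent_values)
    child = [v for v in child_values if v is not None]
    if not child:
        return 0.0
    return sum(1 for v in child if v in parent) / len(child)


def is_unique(values):
    """True when the non-null values contain no duplicates (key-like column)."""
    non_null = [v for v in values if v is not None]
    return len(non_null) == len(set(non_null))


def fk_candidates(tables, min_rate=0.9):
    """Return (child_table, child_col, parent_table, parent_col, rate) tuples.

    tables: {table_name: {column_name: [values, ...]}}  (hypothetical layout)
    min_rate: illustrative threshold, not a documented Connect setting.
    """
    found = []
    for p_table, p_cols in tables.items():
        for p_col, p_vals in p_cols.items():
            if not is_unique(p_vals):
                continue  # only key-like columns can be referenced
            for c_table, c_cols in tables.items():
                if c_table == p_table:
                    continue
                for c_col, c_vals in c_cols.items():
                    rate = fk_match_rate(c_vals, p_vals)
                    if rate >= min_rate:
                        found.append((c_table, c_col, p_table, p_col, rate))
    return found


tables = {
    "crm": {"customer_id": [1, 2, 3], "name": ["a", "b", "c"]},
    "orders": {"order_id": [10, 11, 12, 13], "customer_id": [1, 1, 2, 3]},
}
candidates = fk_candidates(tables)
```

Here `orders.customer_id` is the only column whose values all resolve to a unique column in the other file, so it is the only candidate returned.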

02

Value overlap + name Jaccard

Two passes: distinct-value overlap across columns, and name similarity (Jaccard on tokenized column names). Both contribute to the confidence score.
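The two signals can be sketched in a few lines of Python. Treat this as a minimal illustration: the tokenizer rules, the choice to normalize overlap by the smaller column, and the equal weighting in `combined_score` are all assumptions made for the example. The page does not say how Connect weights the two signals.

```python
import re


def tokenize(name):
    """Split a column name into lowercase tokens (snake_case, camelCase, spaces)."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)  # break camelCase
    return {t for t in re.split(r"[^a-z0-9]+", spaced.lower()) if t}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def name_jaccard(left_name, right_name):
    """Jaccard similarity on tokenized column names."""
    return jaccard(tokenize(left_name), tokenize(right_name))


def value_overlap(left_values, right_values):
    """Share of the smaller column's distinct values found in the other column."""
    left, right = set(left_values), set(right_values)
    if not left or not right:
        return 0.0
    return len(left & right) / min(len(left), len(right))


def combined_score(overlap, name_sim, w_overlap=0.5):
    """Illustrative blend only; the real weighting is not documented here."""
    return w_overlap * overlap + (1 - w_overlap) * name_sim


same_name = name_jaccard("customerId", "customer_id")
partial_name = name_jaccard("cust_id", "customer_id")
overlap = value_overlap([1, 2, 3, 4], [3, 4, 5])
```

With these definitions, `customerId` and `customer_id` tokenize to the same set, while `cust_id` and `customer_id` share only the `id` token.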

03

Semantic LLM pass (opt-in)

When enabled, Anthropic Sonnet 4.6 proposes additional relationships using column samples + table descriptions. The LLM never sees your full tables, only column profiles and a few sample values per column.
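To make "column profile" concrete, here is a rough sketch of reducing a column to summary statistics plus a handful of sample values before anything is sent to a model. The `ColumnProfile` fields, the sample size of 3, and the seeded sampling are hypothetical choices for illustration and may not match what Connect actually builds.

```python
import random
from dataclasses import dataclass, asdict


@dataclass
class ColumnProfile:
    # Hypothetical profile shape; field names are assumptions.
    name: str
    row_count: int
    null_fraction: float
    distinct_count: int
    samples: list


def profile_column(name, values, k=3, seed=0):
    """Summarize a column and keep at most k distinct sample values."""
    non_null = [v for v in values if v is not None]
    distinct = sorted(set(non_null), key=repr)  # stable order before sampling
    rng = random.Random(seed)                   # seeded so runs are repeatable
    samples = rng.sample(distinct, min(k, len(distinct)))
    null_fraction = 1 - len(non_null) / len(values) if values else 0.0
    return ColumnProfile(name, len(values), null_fraction, len(distinct), samples)


profile = profile_column("country", ["DE", "FR", None, "DE", "US", "IT"])
payload = asdict(profile)  # plain dict: summary stats plus a few samples only
```

The payload carries the column name, counts, and up to three distinct values, never the column itself.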

04

Ring graph visualization

The dashboard renders inferred relationships as chord arcs across a ring of tables. Click any arc to see the underlying evidence: shared values, FK match rate, semantic rationale.

SDK · python

Wire it in under a minute.

The Python SDK is the most mature. TypeScript follows the same shape. Both ship with strict types and async-first APIs.

Install

pip install dava-connect

main.py

import asyncio
from pathlib import Path

from dava_connect import Client

async def main():
    async with Client(api_key="dava_live_…") as c:
        # Read each file as bytes; read_bytes() closes the handle for us.
        files = [Path(f).read_bytes() for f in ("crm.csv", "orders.csv")]
        # semantic=True opts in to the LLM-augmented second pass.
        result = await c.discover(files, semantic=True)
        # One line per inferred relationship, with its kind and confidence.
        for edge in result.edges:
            print(f"{edge.from_table}.{edge.from_col} ↔ "
                  f"{edge.to_table}.{edge.to_col}  "
                  f"({edge.kind}, {edge.confidence:.2f})")

asyncio.run(main())

API surface

Endpoints you'll actually hit.

Same auth as the rest of DAVA: a bearer API key in the Authorization header (Authorization: Bearer dava_live_…), or a session cookie plus CSRF token for browser flows.

Method   Endpoint                 Purpose
POST     /v1/connect/discover     Multi-file upload; returns inferred relationships + confidence per edge.
GET      /v1/connect/jobs/{id}    Async job result for large file sets.
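For large file sets, a client would typically poll the jobs endpoint until the job finishes. The sketch below shows one way to structure that loop. The response shape is an assumption: the `status` field and its `done` / `failed` values are hypothetical, as are the helper names, so check the actual API response before relying on them. The HTTP call is injected as `fetch` so the loop works with any client.

```python
import time


def auth_headers(api_key):
    """Bearer header as described above."""
    return {"Authorization": f"Bearer {api_key}"}


def poll_job(fetch, job_id, interval=2.0, timeout=60.0, sleep=time.sleep):
    """Poll GET /v1/connect/jobs/{id} until the job reaches a terminal state.

    fetch: callable taking a path and returning the decoded JSON body (a dict).
    The "status" key and the "done" / "failed" values are assumed, not documented.
    """
    path = f"/v1/connect/jobs/{job_id}"
    deadline = time.monotonic() + timeout
    while True:
        body = fetch(path)
        if body.get("status") in ("done", "failed"):
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(interval)  # wait before asking again
```

In practice `fetch` would wrap an HTTP client call that sends `auth_headers(api_key)`. Keeping it injectable also makes the loop easy to exercise without a network.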

Bring us a connect problem we can prove.

We'll run your hardest dataset through DAVA Connect during a 5-day pilot. You keep the cleaned output and the evidence pack either way.