DAVA Norm.

Drop a messy CSV, get a clean structured table back. Snake-cased headers, type inference and coercion, trimmed whitespace, empty rows dropped. Plus Smart Tables: per-column PII tags and outlier counts.

Norm is the on-ramp. The same pipeline every other DAVA product runs on — but exposed directly so you can clean a file and walk away. Sandboxed adapters, deterministic output, hash-chained to your audit log.

Product · 01

What Norm ships with.

Each capability runs in the shared engine — the Norm pipeline, the Trust audit chain, the Decisioning mode toggle. Same substrate as the other four products.

01

Type inference + coercion

Strings that should be dates become dates. Numbers with commas become numbers. Boolean-like fields normalize to true/false. Per-column inferred type returned alongside the cleaned bytes.
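The coercion rules above can be sketched per-cell. This is an illustrative sketch, not the shipped adapter — the function name, the boolean vocabulary, and the date formats tried are all assumptions:

```python
# Hypothetical sketch of per-cell coercion: number (commas stripped),
# boolean-like, then date, falling back to string. Not the real adapter.
from datetime import datetime

BOOL_MAP = {"true": True, "yes": True, "false": False, "no": False}

def coerce(value: str):
    """Return (coerced_value, inferred_type) for a single cell."""
    v = value.strip()
    # Boolean-like fields normalize to true/false.
    if v.lower() in BOOL_MAP:
        return BOOL_MAP[v.lower()], "boolean"
    # Numbers with commas become numbers.
    try:
        return float(v.replace(",", "")), "number"
    except ValueError:
        pass
    # Strings that should be dates become dates.
    for fmt in ("%Y-%m-%d", "%d/%m/%Y", "%m/%d/%Y"):
        try:
            return datetime.strptime(v, fmt).date(), "date"
        except ValueError:
            pass
    return v, "string"
```

In the real pipeline the type is inferred per column, not per cell — the cell-level votes are aggregated before any coercion runs.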

02

Smart Tables: PII tagging

Every column comes back with a sensitivity hint — email, phone, SSN, IBAN, credit card (Luhn-checked), DOB, name-like. Surfaces in the dashboard so customers can mask before exporting.

03

Outlier counts per numeric column

IQR + z-score, configurable. Surfaced as a count so analysts can decide whether to investigate or coerce.
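The combined check can be sketched for one numeric column. The defaults shown (1.5×IQR fences, |z| > 3) are assumptions standing in for Norm's configurable thresholds:

```python
# Sketch of an IQR + z-score outlier count for one numeric column.
# The thresholds here (1.5×IQR, |z| > 3) are assumed defaults.
import statistics

def outlier_count(values: list[float],
                  iqr_k: float = 1.5, z_max: float = 3.0) -> int:
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - iqr_k * iqr, q3 + iqr_k * iqr
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)

    def is_outlier(v: float) -> bool:
        out_iqr = v < lo or v > hi
        out_z = sd > 0 and abs(v - mean) / sd > z_max
        return out_iqr or out_z

    return sum(is_outlier(v) for v in values)
```

Returning a count rather than flagged rows keeps the response small; the analyst decides whether the column is worth a second pass.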

04

Hash-chained provenance

The cleaned bytes ship with a SHA-256 of input, the adapter version that ran, and a chain pointer. Every subsequent operation on this file is linked to its origin.
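The chain structure can be sketched in a few lines. Field names and the adapter version string are illustrative assumptions; only the three ingredients (input SHA-256, adapter version, chain pointer) come from the description above:

```python
# Minimal sketch of a hash-chained provenance record: SHA-256 of the
# input bytes, the adapter version, and a pointer to the previous
# record's hash. Field names are hypothetical.
import hashlib
import json

def chain_record(input_bytes: bytes, adapter_version: str,
                 prev_hash: str) -> dict:
    record = {
        "input_sha256": hashlib.sha256(input_bytes).hexdigest(),
        "adapter_version": adapter_version,
        "prev": prev_hash,
    }
    # The record's own hash becomes the next record's chain pointer.
    record["hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    return record

genesis = chain_record(b"messy.csv bytes", "norm-adapter/0.0", "0" * 64)
nxt = chain_record(b"later operation", "norm-adapter/0.0", genesis["hash"])
```

Tampering with any earlier record changes its hash, which breaks every `prev` pointer downstream — that is what links each operation back to its origin.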

SDK · python

Wire it in under a minute.

The Python SDK is the most mature. TypeScript follows the same shape. Both ship with strict types and async-first APIs.

Install
pip install dava-norm
main.py
import asyncio
from dava_norm import Client

async def main():
    async with Client(api_key="dava_live_…") as c:
        with open("messy.csv", "rb") as f:
            result = await c.preview("messy.csv", f.read())
        print(f"{result.rows_in} → {result.rows_out} rows "
              f"({result.dropped_rows_empty} empty rows dropped)")
        for col in result.columns:
            print(f"  {col.name_in!r} → {col.name_out!r}  "
                  f"({col.inferred_type}, {col.sensitivity_tag}, "
                  f"{col.outlier_count} outliers)")
        with open("clean.csv", "w") as f:
            f.write(result.cleaned_csv)

asyncio.run(main())
API surface

Endpoints you'll actually hit.

Same auth as the rest of DAVA: a bearer API key in the Authorization header (Authorization: Bearer dava_live_…), or session cookie + CSRF for browser flows.

Method  Endpoint          Purpose
POST    /v1/norm/preview  Multipart upload of one CSV/TSV (≤ 5 MB). Returns cleaned bytes inline + per-column stats.
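Hitting the preview endpoint without the SDK is a single multipart POST. This sketch uses the `requests` package and only prepares the request (no network); the base URL and the multipart field name ("file") are assumptions — check the API reference for the exact contract:

```python
# Building the multipart POST to /v1/norm/preview by hand, without the
# SDK. Base URL and field name are hypothetical; the prepared request
# shows the exact headers and body that would be sent.
import requests

req = requests.Request(
    "POST",
    "https://api.dava.example/v1/norm/preview",  # hypothetical base URL
    headers={"Authorization": "Bearer dava_live_..."},
    files={"file": ("messy.csv", b"name,age\nA,1\n", "text/csv")},
).prepare()
# req.url, req.headers, and req.body now carry the bytes that a
# requests.Session().send(req) would put on the wire.
```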

Bring us a norm problem we can prove.

We'll run your hardest dataset through DAVA Norm during a 5-day pilot. You keep the cleaned output and the evidence pack either way.