SQL Identity Resolution
Production-grade deterministic identity resolution for modern data warehouses.
-
Multi-Platform
Run on DuckDB, Snowflake, BigQuery, or Databricks with the same logic.
-
Production Ready
Dry run mode, metrics export, audit logging, and data quality controls.
-
SQL-First & dbt Native
Pure SQL execution or use the dbt package for seamless integration.
-
Open Source
Full control, full transparency, zero licensing costs.
Why SQL Identity Resolution?
Your data never leaves your warehouse
Unlike SaaS solutions, SQL IDR runs entirely within your data platform. No data egress, no third-party processing, full compliance control.
What is Identity Resolution?
Identity resolution is the process of matching and merging records across multiple data sources to create a unified view of each entity (customer, user, account).
This solution uses deterministic matching with label propagation to create stable, explainable identity clusters.
graph LR
A[CRM: customer_123<br/>email: john@acme.com] --> IDR[Identity Resolution]
B[Ecommerce: order_456<br/>email: john@acme.com] --> IDR
C[Mobile: device_789<br/>phone: 555-0123] --> IDR
IDR --> D[Unified Identity<br/>resolved_id: customer_123]
Quick Start
Key Features
| Feature | Description |
|---|---|
| Deterministic Matching | Rule-based matching on email, phone, loyalty ID, etc. |
| Label Propagation | Graph-based clustering using connected components |
| Incremental Processing | Watermark-based delta processing |
| Dry Run Mode | Preview changes before committing |
| Metrics Export | Prometheus, DataDog, webhook integrations |
| Production Hardening | Max group size limits, identifier exclusions |
| Golden Profiles | Automatic best-record selection |
Architecture Overview
graph TB
subgraph Sources["Source Systems"]
S1[CRM]
S2[E-commerce]
S3[Mobile App]
end
subgraph Meta["Metadata Layer"]
M1[source_table]
M2[rule]
M3[identifier_mapping]
end
subgraph Process["Processing"]
P1[Build Edges]
P2[Label Propagation]
P3[Cluster Assignment]
end
subgraph Output["Output Layer"]
O1[Membership]
O2[Clusters]
O3[Golden Profiles]
O4[Metrics]
end
Sources --> Process
Meta --> Process
Process --> Output
Next Steps
-
Get running in 5 minutes
-
Set up rules and sources
-
Preview changes safely
-
Monitor your pipeline