Healthcare

Entity Resolution Pipeline

Key outcome

99.4% accuracy on patient matching

Context

A regional health network had grown through acquisitions, inheriting 12 different EHR systems with incompatible patient identifiers. Clinicians couldn't get a complete view of patient history without manual searches across systems.

The Problem

Patient records were fragmented: the same patient might appear with slightly different names, outdated addresses, or conflicting demographic data across systems. Name variations (nicknames, maiden names, typos) and address changes made deterministic, exact-match linking unreliable.

Why generic AI wouldn't work

Standard record linkage tools rely heavily on exact or near-exact matching. Healthcare data poses particular challenges: nicknames (Bob vs. Robert), maiden names, transcription errors in clinical settings, and legitimate near-duplicates (distinct family members at the same address). A false positive merges two different patients' records, which has serious safety implications.

The System We Designed

  • Probabilistic matching using multiple weighted attributes
  • Learned similarity functions for names, addresses, and dates
  • Transitive closure to link records across multiple systems
  • Cluster quality scoring to identify uncertain matches
  • Master patient index with full provenance tracking
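
To make the first and third points concrete, here is a minimal Python sketch of weighted attribute scoring followed by transitive closure. It is an illustration under stated assumptions, not the production design: the weights, the threshold, the field names, and the use of difflib in place of learned similarity functions are all hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical weights and threshold, chosen only for illustration.
WEIGHTS = {"name": 0.5, "dob": 0.3, "address": 0.2}
MATCH_THRESHOLD = 0.85


def similarity(a: str, b: str) -> float:
    """Generic string similarity in [0, 1]; a stand-in for a learned function."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(r1: dict, r2: dict) -> float:
    """Weighted sum of per-attribute similarities."""
    return sum(w * similarity(r1[f], r2[f]) for f, w in WEIGHTS.items())


def link_pairs(records: dict, threshold: float = MATCH_THRESHOLD) -> list:
    """Return every pair of record ids whose score reaches the threshold."""
    return [
        (a, b)
        for a, b in combinations(records, 2)
        if match_score(records[a], records[b]) >= threshold
    ]


def transitive_closure(ids, pairs) -> list:
    """Group ids into clusters with union-find: if A~B and B~C, then A, B, C share a cluster."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), set()).add(i)
    return sorted(groups.values(), key=sorted)


# Usage: ids are hypothetical "<system>-<local id>" strings.
records = {
    "ehr1-001": {"name": "Robert Smith", "dob": "1980-01-02", "address": "12 Oak St"},
    "ehr2-417": {"name": "Robert Smith", "dob": "1980-01-02", "address": "12 Oak St"},
    "ehr3-090": {"name": "Maria Gonzalez", "dob": "1975-11-30", "address": "900 Pine Avenue"},
}
clusters = transitive_closure(records, link_pairs(records))
```

A generic string matcher like this one scores "Bob" against "Robert" poorly, which is one reason the list above calls for learned similarity functions. Transitive closure also propagates mistakes, since one bad link merges two whole clusters, which is where cluster quality scoring comes in.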

Human-in-the-Loop & Explainability

Uncertain matches (those scoring below the confidence threshold) are routed to trained data stewards for review. Confirmed matches and rejections are used to continuously refine matching weights. Clinical users can also flag suspected duplicates from within their workflow.
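
The routing step can be sketched in a few lines of Python. This is a hedged illustration only: the ReviewQueue class, both threshold values, and the lower cut-off below which pairs are treated as non-matches are assumptions, not details of the deployed system.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds; the real values are not stated in this case study.
AUTO_LINK = 0.95     # at or above: link without review
REVIEW_FLOOR = 0.60  # below: treat as a non-match (assumed lower bound)


@dataclass
class ReviewQueue:
    pending: list = field(default_factory=list)   # (pair, score) awaiting a steward
    labelled: list = field(default_factory=list)  # (pair, score, is_match) for later weight tuning

    def route(self, pair: tuple, score: float) -> str:
        """Decide what happens to a candidate pair based on its match score."""
        if score >= AUTO_LINK:
            return "linked"
        if score < REVIEW_FLOOR:
            return "not_linked"
        self.pending.append((pair, score))
        return "needs_review"

    def record_decision(self, pair: tuple, is_match: bool) -> None:
        """Store a steward's verdict so it can feed back into the matching weights."""
        for item in self.pending:
            if item[0] == pair:
                self.pending.remove(item)
                self.labelled.append((pair, item[1], is_match))
                return
        raise KeyError(f"{pair} is not awaiting review")


# Usage with hypothetical record ids and scores.
queue = ReviewQueue()
queue.route(("ehr1-001", "ehr2-417"), 0.97)  # high confidence, linked automatically
queue.route(("ehr1-001", "ehr4-233"), 0.72)  # uncertain, queued for a steward
queue.record_decision(("ehr1-001", "ehr4-233"), is_match=False)
```

Keeping each steward verdict alongside the original score gives a labelled set that could later be used to re-tune weights or thresholds.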

Outcomes

  • 99.4% accuracy on patient matching (validated against manual review sample)
  • Unified patient view across all 12 systems
  • 30% reduction in duplicate lab orders
  • Clinician time saved: estimated 15 minutes per complex patient encounter

Governance & Compliance

HIPAA-compliant audit logging. All matching decisions traceable. Data steward override capability. Regular accuracy audits against gold standard datasets.

Reference available upon request. Some details have been generalized to protect client confidentiality.

Facing a similar challenge?

We'd be happy to discuss how we might approach your specific situation. Every engagement starts with understanding your unique context.
