Add entity_resolution template: resolution → exposure accumulation → reinsurance (Graph + Rules + Prescriptive) #88

Merged: cafzal merged 7 commits into main from worktree-entity-resolution-template on Jun 17, 2026

@cafzal (Collaborator) commented Jun 16, 2026

What

A new Financial Services template — the first in the library to cover entity resolution — built around one thesis: resolution is the step that makes the downstream reasoning correct. It resolves duplicate policyholder records scattered across an insurer's policy systems (auto / home / life) and an acquired book, then acts on the resolved view.

Until records are resolved, each insured's total exposure to the carrier is invisible — no single policy looks dangerous, yet a household's combined sum insured can breach an accumulation limit. The template makes that concrete.
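
The hidden-exposure effect described above can be sketched in a few lines of pandas. This is an illustration only, not the template's code: the column names (`record_id`, `party_id`, `sum_insured`) and all amounts are hypothetical, and it assumes a breach means total exposure strictly above the $1M limit quoted later in this PR.

```python
import pandas as pd

ACCUMULATION_LIMIT = 1_000_000  # the $1M limit quoted in this PR

# Hypothetical policies; party_id stands in for the output of resolution.
policies = pd.DataFrame(
    {
        "record_id": ["A1", "H1", "L1", "A2"],
        "party_id": ["P1", "P1", "P1", "P2"],
        "sum_insured": [150_000, 500_000, 450_000, 300_000],
    }
)

# Record-level view: count single policies over the limit.
record_breaches = int((policies["sum_insured"] > ACCUMULATION_LIMIT).sum())

# Party-level view: sum coverage per resolved party, then compare to the limit.
exposure = (
    policies.groupby("party_id", as_index=False)["sum_insured"]
    .sum()
    .rename(columns={"sum_insured": "total_exposure"})
)
# Assumption: a breach is exposure strictly above the limit.
exposure["breach"] = exposure["total_exposure"] > ACCUMULATION_LIMIT
exposure["excess"] = (exposure["total_exposure"] - ACCUMULATION_LIMIT).clip(lower=0)

print(record_breaches)
print(exposure)
```

In this toy data no single record is over the limit, yet party P1 is once its three policies are summed, which is the pattern the template is built to expose.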

Chain (Graph + Rules-based + Prescriptive)

  1. Pre-process (pandas): blocking cuts the 1,275 possible pairs (51 records taken two at a time) to 28 candidates, each scored on name / DOB / gov-ID / email / phone / address. Two bands, auto-merge (score ≥ 0.70) and a review queue (score in [0.55, 0.70)), keep automation high-precision and make the review queue load-bearing.
  2. Graph: weakly-connected-components over the auto-merge edges clusters records into insured parties, closing transitive chains a pairwise pass misses.
  3. Rules-based: match confidence tiers, a duplicate flag, and per-party total exposure with an accumulation-limit breach flag.
  4. Prescriptive: a reinsurance-cession knapsack — within a premium budget, cede the breached households that transfer the most excess exposure off the book.
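
Step 1 of the chain can be sketched as follows. The two thresholds come from the description above; everything else is an assumption for illustration: the blocking key (surname initial plus birth year), the column names, and the `band` helper are hypothetical and may differ from what the template actually uses.

```python
from itertools import combinations

import pandas as pd

AUTO_MERGE = 0.70    # auto-merge threshold quoted in the PR
REVIEW_FLOOR = 0.55  # lower edge of the review band quoted in the PR


def band(score: float) -> str:
    """Map a similarity score to one of the two bands, or reject it."""
    if score >= AUTO_MERGE:
        return "auto_merge"
    if score >= REVIEW_FLOOR:
        return "review"
    return "reject"


# Hypothetical records; the blocking key below is an assumed choice.
records = pd.DataFrame(
    {
        "record_id": ["R1", "R2", "R3", "R4"],
        "surname": ["Chen", "Chen", "Smith", "Smith"],
        "birth_year": [1980, 1980, 1975, 1990],
    }
)
records["block"] = records["surname"].str[0] + records["birth_year"].astype(str)

# Only records sharing a block are compared, instead of every possible pair.
candidates = [
    pair
    for _, group in records.groupby("block")
    for pair in combinations(sorted(group["record_id"]), 2)
]
all_pairs = len(records) * (len(records) - 1) // 2

print(all_pairs, candidates)
print(51 * 50 // 2)  # the 1,275 unblocked pairs for 51 records
```

Blocking trades a little recall for a large cut in comparisons, which is why the choice of key matters: a key that is too strict hides true duplicates before scoring ever sees them.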

Sample data

51 synthetic policy records across AUTO/HOME/LIFE/LEGACY (30 people, 16 multi-system), with per-policy coverage and three deliberate cases: a transitive chain that only clustering resolves, a same-name "John Smith" trap that scoring keeps apart, and a review-band duplicate whose $1.055M combined exposure is a breach hidden until a steward confirms it.

Verification (live engine)

  • 51 records → 30 insured parties; auto-resolution precision 1.000, recall 0.963, F1 0.981 (the one miss is the held review pair — recall gap parked in review, not auto-merged).
  • Accumulation: 0 policies breach at the record level → 4 households breach after resolution; confirming the review pair surfaces a 5th ($1.055M).
  • Reinsurance knapsack OPTIMAL: cede Fitzgerald + Chen — $927,000 of excess exposure for $111,240 of the $120,000 budget.
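
The reported figures can be cross-checked with plain arithmetic. The sketch below uses only numbers quoted in the list above; the variable names are illustrative, and the derived premium-to-excess ratio is an observation about those numbers, not a statement of how the template prices cession.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the verification list.
reported_f1 = round(f1(1.000, 0.963), 3)

# Knapsack result as reported in the verification list.
ceded_excess = 927_000
premium_spent = 111_240
budget = 120_000

premium_rate = round(premium_spent / ceded_excess, 4)  # derived ratio only
budget_left = budget - premium_spent

print(reported_f1, premium_rate, budget_left)  # 0.981 0.12 8760
```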

Includes runbook.md, an 8-step multi-reasoner walkthrough whose prompts were paste-tested: a fresh agent reproduced every figure above from the prompts and the RAI skills alone, never reading the template source. Local checks pass: ruff check sample-template v1, python scripts/generate_version_indexes.py --check, python -m py_compile.

A new Financial Services template demonstrating entity resolution: collapse
duplicate policyholder records spread across an insurer's policy systems
(auto / home / life) and an acquired book of business into one record per
insured party, so the customer view, total exposure, and screening all see
one person instead of several.

Pipeline:
- pandas blocking + field-similarity scoring (name, DOB, government-ID
  fragment, email, phone, address) propose candidate matches;
- a RelationalAI graph clusters accepted matches with weakly-connected-
  components, resolving transitive links a pairwise pass misses;
- rules derive a match-confidence tier (HIGH / MEDIUM / REVIEW) and an
  is_duplicate flag, bound back to the ontology.
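
To show why clustering closes transitive links that a pairwise pass misses, here is a minimal union-find sketch in plain Python. It is a stand-in for the idea only: the template runs weakly-connected-components in a RelationalAI graph, and the function name and record IDs below are hypothetical.

```python
def connected_components(nodes, edges):
    """Group nodes into clusters linked by undirected match edges."""
    parent = {n: n for n in nodes}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for n in nodes:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# R1~R2 and R2~R3 matched pairwise; R1~R3 never scored as a match.
nodes = ["R1", "R2", "R3", "R4"]
edges = [("R1", "R2"), ("R2", "R3")]
parties = connected_components(nodes, edges)
print(parties)  # [['R1', 'R2', 'R3'], ['R4']]
```

R1 and R3 land in the same party even though no edge joins them directly, which is the transitive case the pipeline description refers to.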

Verified end to end against a live RAI engine on the bundled 50-record
sample: 50 records resolve into 30 insured parties (20 duplicates collapsed),
pairwise precision / recall / F1 = 1.000.

github-actions Bot commented Jun 16, 2026

The docs preview for this pull request has been deployed to Vercel!

✅ Preview: https://relationalai-docs-kqti5cdry-relationalai.vercel.app/build/templates
🔍 Inspect: https://vercel.com/relationalai/relationalai-docs/EeXbBZaeRZkC28Rd8JGUChdcLJTF

Adds runbook.md: a lead-in problem statement, the distilled reasoner-chain
ASCII, and seven ordered, question-shaped prompts (build ontology -> examine
-> candidate generation -> graph clustering -> rules -> duplicate flag ->
golden record + evaluation), each labelled with its /rai-* skill and carrying
the expected response.

Verified by an independent paste-test: a fresh agent given only these prompts
and the RAI skills (never the template source) generated its own PyRel and ran
it against the live engine, reproducing 50 records -> 30 parties, 25 matches,
35 duplicates, and pairwise F1 = 1.000. The runbook notes the one
implementation-sensitive detail it surfaced (the MEDIUM/REVIEW tier boundary)
and that similarity scoring is computed Python-side (no PyRel string-sim
primitive). README links the runbook from "What's included" and the structure tree.
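
Since similarity is computed Python-side, a field-weighted score might look like the sketch below. It uses the standard-library `difflib.SequenceMatcher`; the field subset, the weights, and the exact-match treatment of DOB and email are all assumptions for illustration and are not taken from the template.

```python
from difflib import SequenceMatcher

# Hypothetical weights over a subset of the fields named in the PR.
WEIGHTS = {"name": 0.4, "dob": 0.4, "email": 0.2}


def name_similarity(a: str, b: str) -> float:
    """Fuzzy string similarity in [0, 1], case- and whitespace-insensitive."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def score(r1: dict, r2: dict) -> float:
    """Weighted score: fuzzy on name, exact match on DOB and email."""
    parts = {
        "name": name_similarity(r1["name"], r2["name"]),
        "dob": 1.0 if r1["dob"] == r2["dob"] else 0.0,
        "email": 1.0 if r1["email"].lower() == r2["email"].lower() else 0.0,
    }
    return sum(WEIGHTS[field] * value for field, value in parts.items())


# Two different people who happen to share a name.
john_a = {"name": "John Smith", "dob": "1980-01-01", "email": "js@example.com"}
john_b = {"name": "John Smith", "dob": "1975-06-30", "email": "john@example.org"}

print(score(john_a, john_b))
```

With these assumed weights an identical name alone cannot reach the 0.55 review floor, which is one way a same-name trap can be kept apart.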
…surance

Extends the template so resolution drives a correct downstream decision, not
just a correct count -- now a Graph + Rules-based + Prescriptive chain.

- Two-band matching: high-confidence pairs auto-merge (graph edges); borderline
  pairs go to a ReviewPair queue instead of merging. Auto-resolution is honest
  (precision 1.000, recall 0.963) and the review queue is load-bearing.
- Rules: per-policy coverage aggregates into ResolvedParty.total_exposure with
  an accumulation-limit breach flag. No single policy breaches the $1M limit;
  resolved, 4 households do -- exposure invisible until resolution.
- Prescriptive: a reinsurance-cession knapsack cedes the most excess exposure
  within a premium budget (OPTIMAL: $927k ceded for $111,240 of $120k).
- Data: adds per-policy coverage and a deliberate review-band duplicate whose
  $1.055M combined exposure is a 5th breach hidden until the review pair is
  confirmed. Generator kept in dev_temp (static CSVs ship as the input).
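
The cession decision is a 0/1 knapsack: choose the subset of breached households that maximizes ceded excess without the premiums exceeding the budget. The sketch below solves a toy instance by brute-force enumeration, which is not how the template's prescriptive reasoner works; the household names, excess amounts, and premiums are hypothetical, and only the $120,000 budget comes from this PR.

```python
from itertools import combinations

BUDGET = 120_000  # premium budget quoted in the PR

# Hypothetical breached households: (name, excess_exposure, cession_premium).
households = [
    ("H1", 500_000, 60_000),
    ("H2", 400_000, 50_000),
    ("H3", 300_000, 45_000),
    ("H4", 200_000, 70_000),
]


def best_cession(items, budget):
    """Enumerate every subset; keep the feasible one ceding the most excess."""
    best = ((), 0, 0)  # (names, excess ceded, premium spent)
    for k in range(len(items) + 1):
        for subset in combinations(items, k):
            premium = sum(p for _, _, p in subset)
            excess = sum(e for _, e, _ in subset)
            if premium <= budget and excess > best[1]:
                best = (tuple(name for name, _, _ in subset), excess, premium)
    return best


names, excess_ceded, premium_spent = best_cession(households, BUDGET)
print(names, excess_ceded, premium_spent)  # ('H1', 'H2') 900000 110000
```

Enumeration is fine for a handful of households but grows as 2^n, so a real book would call for a solver or dynamic programming.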

Verified end to end against the live engine; the runbook's prompts were
paste-tested by a fresh agent that reproduced every figure from the prompts
and the RAI skills alone. README + runbook + version indexes updated.
@cafzal changed the title from "Add entity_resolution template: graph + rules-based policyholder dedup" to "Add entity_resolution template: resolution → exposure accumulation → reinsurance (Graph + Rules + Prescriptive)" on Jun 16, 2026
…nners

Mirrors the README issues reviewers flagged on PR #80:
- Reword the front-matter description as one plain sentence (no colon-fragment).
- Lead "What this template is for" with the business problem and value, naming
  the reasoners in a single bold sentence rather than a technical breakdown.
- Add an assumed-knowledge line to "Who this is for".
- Replace arrow/inequality symbols in prose with words (code blocks and the
  data-flow diagram keep their arrows).

Also align the script's stage banners to the multi-reasoner convention so the
prescriptive stage reads as "Stage 3" instead of the single-reasoner
"Model the decision problem" / "Solve and check solution" headings. Comment-only;
logic and output unchanged from the validated run.

@jablonskidev (Contributor) left a comment

Here is my feedback.

Comment thread v1/entity_resolution/README.md Outdated
- Insurance
---

# Entity Resolution

Contributor

You do not need to add an H1 title at the top of the README.

Collaborator Author

Removed the H1 — the body now starts at ## What this template is for. Fixed in 3c215af.

Comment thread v1/entity_resolution/README.md Outdated
python entity_resolution.py
```

6. Expected output (abridged):

Contributor

Show only a tiny snippet (a few lines) so users can confirm success.

Collaborator Author

Trimmed the Quickstart output to a four-line snippet that confirms a successful run; the full printout and a step-by-step walkthrough are in runbook.md. Fixed in 3c215af.

@cafzal merged commit 3efa708 into main on Jun 17, 2026 (3 checks passed)
@cafzal deleted the worktree-entity-resolution-template branch on June 17, 2026 21:17