Add entity_resolution template: resolution → exposure accumulation → reinsurance (Graph + Rules + Prescriptive) #88
Merged
A new Financial Services template demonstrating entity resolution: collapse duplicate policyholder records spread across an insurer's policy systems (auto / home / life) and an acquired book of business into one record per insured party, so the customer view, total exposure, and screening all see one person instead of several.

Pipeline:
- pandas blocking + field-similarity scoring (name, DOB, government-ID fragment, email, phone, address) propose candidate matches;
- a RelationalAI graph clusters accepted matches with weakly-connected components, resolving transitive links a pairwise pass misses;
- rules derive a match-confidence tier (HIGH / MEDIUM / REVIEW) and an is_duplicate flag, bound back to the ontology.

Verified end to end against a live RAI engine on the bundled 50-record sample: 50 records resolve into 30 insured parties (20 duplicates collapsed), pairwise precision / recall / F1 = 1.000.
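As a rough illustration of the pipeline described above, the sketch below blocks on date of birth, scores name similarity, and clusters accepted matches so that transitive links resolve. It is a standard-library stand-in, not the template's code: the template uses pandas for blocking and a RelationalAI graph for weakly-connected components, whereas this uses `difflib` and a small union-find. The records, the single-field score, and the 0.90 threshold are all assumptions made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical records: (record_id, name, date_of_birth).
records = [
    ("A1", "Maria Lopez", "1980-04-12"),
    ("H7", "Maria Lopes", "1980-04-12"),
    ("L3", "Mari Lopes", "1980-04-12"),
    ("A2", "John Smith", "1975-01-30"),
    ("H9", "John Smith", "1991-08-02"),
]

THRESHOLD = 0.90  # assumed cut-off for accepting a match


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Blocking: only records that share a date of birth are compared.
blocks: dict[str, list[tuple[str, str, str]]] = {}
for rec in records:
    blocks.setdefault(rec[2], []).append(rec)

# Scoring: keep pairs whose name similarity clears the threshold.
matches = [
    (r1[0], r2[0])
    for block in blocks.values()
    for r1, r2 in combinations(block, 2)
    if similarity(r1[1], r2[1]) >= THRESHOLD
]

# Clustering: union-find yields the connected components of the match graph.
parent = {rec[0]: rec[0] for rec in records}


def find(x: str) -> str:
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x


for a, b in matches:
    parent[find(a)] = find(b)

clusters: dict[str, list[str]] = {}
for record_id in parent:
    clusters.setdefault(find(record_id), []).append(record_id)

resolved = sorted(sorted(members) for members in clusters.values())
print(matches)
print(resolved)
```

In this toy data A1 and L3 never match directly, yet they land in the same cluster through H7, which is the kind of transitive link a pairwise pass alone would miss. The two "John Smith" records fall in different blocks and are never compared.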
The docs preview for this pull request has been deployed to Vercel!
Adds runbook.md: a lead-in problem statement, the distilled reasoner-chain ASCII, and seven ordered, question-shaped prompts (build ontology -> examine -> candidate generation -> graph clustering -> rules -> duplicate flag -> golden record + evaluation), each labelled with its /rai-* skill and carrying the expected response.

Verified by an independent paste-test: a fresh agent given only these prompts and the RAI skills (never the template source) generated its own PyRel and ran it against the live engine, reproducing 50 records -> 30 parties, 25 matches, 35 duplicates, and pairwise F1 = 1.000.

The runbook notes the one implementation-sensitive detail it surfaced (the MEDIUM/REVIEW tier boundary) and that similarity scoring is computed Python-side (no PyRel string-sim primitive). README links the runbook from "What's included" and the structure tree.
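For readers unfamiliar with the pairwise metrics quoted above, here is a minimal sketch of how pairwise precision, recall, and F1 can be computed from predicted and true clusters. It is an illustration under assumed inputs, not the template's evaluation code; the function names and the toy clusters are hypothetical.

```python
from itertools import combinations


def cluster_pairs(clusters):
    """Return every unordered pair of records that share a cluster."""
    pairs = set()
    for members in clusters:
        pairs.update(frozenset(p) for p in combinations(sorted(members), 2))
    return pairs


def pairwise_prf(predicted, truth):
    """Pairwise precision, recall, and F1 of predicted clusters against truth."""
    pred, gold = cluster_pairs(predicted), cluster_pairs(truth)
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 1.0
    recall = true_positives / len(gold) if gold else 1.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical example: the prediction misses one link in a three-record chain.
truth = [["r1", "r2", "r3"], ["r4"], ["r5", "r6"]]
predicted = [["r1", "r2"], ["r3"], ["r4"], ["r5", "r6"]]

precision, recall, f1 = pairwise_prf(predicted, truth)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

A score of 1.000 on all three, as reported for the sample, means the predicted clusters pair up exactly the same records as the ground truth.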
…surance

Extends the template so resolution drives a correct downstream decision, not just a correct count -- now a Graph + Rules-based + Prescriptive chain.

- Two-band matching: high-confidence pairs auto-merge (graph edges); borderline pairs go to a ReviewPair queue instead of merging. Auto-resolution is honest (precision 1.000, recall 0.963) and the review queue is load-bearing.
- Rules: per-policy coverage aggregates into ResolvedParty.total_exposure with an accumulation-limit breach flag. No single policy breaches the $1M limit; resolved, 4 households do -- exposure invisible until resolution.
- Prescriptive: a reinsurance-cession knapsack cedes the most excess exposure within a premium budget (OPTIMAL: $927k ceded for $111,240 of $120k).
- Data: adds per-policy coverage and a deliberate review-band duplicate whose $1.055M combined exposure is a 5th breach hidden until the review pair is confirmed.

Generator kept in dev_temp (static CSVs ship as the input). Verified end to end against the live engine; the runbook's prompts were paste-tested by a fresh agent that reproduced every figure from the prompts and the RAI skills alone. README + runbook + version indexes updated.
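The reinsurance-cession step above is a 0/1 knapsack: pick the subset of breaching parties whose excess exposure is largest while total premium stays within budget. The sketch below solves a toy instance by exhaustive search. It is not the template's prescriptive model, which runs on the RAI engine; the party IDs, amounts, and the $100k budget are invented for illustration.

```python
from itertools import combinations

# Hypothetical candidates: (party_id, excess exposure above the limit, cession premium),
# all in dollars.
candidates = [
    ("P1", 300_000, 36_000),
    ("P2", 250_000, 30_000),
    ("P3", 400_000, 48_000),
    ("P4", 150_000, 18_000),
]
PREMIUM_BUDGET = 100_000  # assumed budget for this example


def best_cession(items, budget):
    """Exhaustively search subsets for the most ceded exposure within budget."""
    best = ((), 0, 0)  # (party ids, ceded exposure, premium spent)
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            premium = sum(item[2] for item in subset)
            ceded = sum(item[1] for item in subset)
            if premium <= budget and ceded > best[1]:
                best = (tuple(item[0] for item in subset), ceded, premium)
    return best


chosen, ceded, premium = best_cession(candidates, PREMIUM_BUDGET)
print(chosen, ceded, premium)
```

Exhaustive search is only reasonable for a handful of candidates; a solver or dynamic programme is the usual choice once the candidate list grows.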
…nners

Mirrors the README issues reviewers flagged on PR #80:

- Reword the front-matter description as one plain sentence (no colon-fragment).
- Lead "What this template is for" with the business problem and value, naming the reasoners in a single bold sentence rather than a technical breakdown.
- Add an assumed-knowledge line to "Who this is for".
- Replace arrow/inequality symbols in prose with words (code blocks and the data-flow diagram keep their arrows).

Also align the script's stage banners to the multi-reasoner convention so the prescriptive stage reads as "Stage 3" instead of the single-reasoner "Model the decision problem" / "Solve and check solution" headings. Comment-only; logic and output unchanged from the validated run.
jablonskidev
requested changes
Jun 17, 2026
jablonskidev
left a comment
Contributor
Here is my feedback.
    - Insurance
    ---

    # Entity Resolution
Contributor
You do not need to add a H1 title at the top of the README.
Collaborator
Author
Removed the H1 — the body now starts at ## What this template is for. Fixed in 3c215af.
    python entity_resolution.py
    ```

    6. Expected output (abridged):
Contributor
Show only a tiny snippet (a few lines) so users can confirm success.
Collaborator
Author
Trimmed the Quickstart output to a four-line snippet that confirms a successful run; the full printout and a step-by-step walkthrough are in runbook.md. Fixed in 3c215af.
…ckstart output to a tiny snippet
What
A new Financial Services template — the first in the library to cover entity resolution — built around one thesis: resolution is the step that makes the downstream reasoning correct. It resolves duplicate policyholder records scattered across an insurer's policy systems (auto / home / life) and an acquired book, then acts on the resolved view.
Until records are resolved, each insured's total exposure to the carrier is invisible — no single policy looks dangerous, yet a household's combined sum insured can breach an accumulation limit. The template makes that concrete.
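To make the accumulation point concrete, the following sketch sums per-policy coverage by resolved party and flags anything above the $1M limit named in this PR. It is plain Python for illustration, not the template's rules; the policy IDs, coverage amounts, and party assignments are hypothetical.

```python
from collections import defaultdict

ACCUMULATION_LIMIT = 1_000_000  # the $1M limit referred to in this PR

# Hypothetical policies: (record_id, coverage in dollars).
policies = [
    ("AUTO-1", 250_000),
    ("HOME-1", 600_000),
    ("LIFE-1", 400_000),
    ("AUTO-2", 300_000),
]

# Hypothetical resolution output: which resolved party each record belongs to.
record_to_party = {
    "AUTO-1": "party-1",
    "HOME-1": "party-1",
    "LIFE-1": "party-1",
    "AUTO-2": "party-2",
}

# Accumulate coverage per resolved party.
total_exposure = defaultdict(int)
for record_id, coverage in policies:
    total_exposure[record_to_party[record_id]] += coverage

# Compare the per-policy view with the per-party view.
single_policy_breach = any(coverage > ACCUMULATION_LIMIT for _, coverage in policies)
breaches = {
    party: total
    for party, total in total_exposure.items()
    if total > ACCUMULATION_LIMIT
}

print(single_policy_breach)
print(breaches)
```

No individual policy in the toy data exceeds the limit, but party-1's three policies together do, which is the effect the template demonstrates on its sample data.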
Chain (Graph + Rules-based + Prescriptive)
Sample data
51 synthetic policy records across AUTO/HOME/LIFE/LEGACY (30 people, 16 multi-system), with per-policy coverage and three deliberate cases: a transitive chain only clustering resolves, a same-name "John Smith" trap scoring keeps apart, and a review-band duplicate whose $1.055M combined exposure is a breach hidden until a steward confirms it.
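The review-band case can be pictured with a small two-band classifier: scores above an upper threshold become graph edges and merge automatically, scores in a middle band go to a review queue, and the rest are dropped. The thresholds, scores, and record IDs below are assumptions for illustration; the template's actual band boundaries are not stated in this PR.

```python
AUTO_MERGE_MIN = 0.90  # assumed lower bound for automatic merging
REVIEW_MIN = 0.70      # assumed lower bound for steward review


def band(score: float) -> str:
    """Classify a pair's match score into one of three bands."""
    if score >= AUTO_MERGE_MIN:
        return "AUTO_MERGE"
    if score >= REVIEW_MIN:
        return "REVIEW"
    return "NO_MATCH"


# Hypothetical scored candidate pairs: (record_a, record_b, score).
scored_pairs = [
    ("A1", "H7", 0.96),  # clear duplicate
    ("H7", "L3", 0.78),  # borderline, needs a steward
    ("A2", "H9", 0.41),  # same name, different person
]

graph_edges = [(a, b) for a, b, s in scored_pairs if band(s) == "AUTO_MERGE"]
review_queue = [(a, b) for a, b, s in scored_pairs if band(s) == "REVIEW"]

print(graph_edges)
print(review_queue)
```

Only `graph_edges` feed the clustering step, so a review-band pair contributes no combined exposure until someone confirms it.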
Verification (live engine)
Includes
runbook.md, an 8-step multi-reasoner walkthrough whose prompts were paste-tested: a fresh agent reproduced every figure above from the prompts and the RAI skills alone, never reading the template source. Local checks pass: ruff check sample-template v1, python scripts/generate_version_indexes.py --check, python -m py_compile.