Add transaction_screening_local template (rules + query on local DuckDB) #80
cafzal wants to merge 11 commits into
The docs preview for this pull request has been deployed to Vercel!
… section replaces enable_model_deployment)
…et to suspect-or-near-suspect (under_review)
… rate
Grow the ledger from 10 transfers/9 accounts to 75 transfers/54 accounts, embedding the structuring ring and large sender in a legitimate-traffic majority (C3xxx). 6 of 54 accounts flag (~11%), a realistic AML base rate. Add near-miss transfers (8,900 / 10,000 / 49,000 / 50,000) on the threshold edges that correctly stay clean. Update README and runbook counts to match (75 transfers / 870,000 moved / 6 suspects / 8 in investigation set).
Bump the local-DuckDB version floor from 1.11 to 1.12. The programmatic-config deploy path this template uses (create_config with a deployment section) was stabilized in 1.12 (PyRel #1730). Verified the script runs unchanged on 1.12.0 (75 transfers / 870,000 / 6 suspects / 8 under review).
Bump the local-DuckDB floor from 1.12 to 1.13. Verified the script runs unchanged on 1.13.0 (75 transfers / 870,000 / 6 suspects / 8 under review).
…ning-local-template # Conflicts: # v1/README.md
jablonskidev left a comment
Here is my feedback.
| @@ -0,0 +1,181 @@ | |||
| --- | |||
| title: "Transaction Screening (Local DuckDB)" | |||
| description: "Rules + query fraud-ring triage: structuring and large-sender flags, suspect classification, and one-hop investigation expansion via a relationship self-join." | |||
Use words and sentences rather than symbols and fragments.
| - Getting Started | ||
| --- | ||
| ## What this template is for |
This section should be a problem statement and motivation (1–2 paragraphs). Focus on the “why” and the value of RelationalAI, not on the technical details of the model or code. Use language that’s accessible to a broad audience.
| | Rules / logic (classification flags, chaining) | Optimization solve (`Problem`) | | ||
| | Relationship traversal (multi-hop self-joins, connectivity) | GNN training / inference | | ||
| ## Who this is for |
Include assumed knowledge.
| - Anyone who wants to try RelationalAI without provisioning Snowflake | ||
| - Developers prototyping an ontology, rules, and queries before pointing at production data | ||
| - Anyone learning the rules + relationship-traversal patterns on a legible dataset with a realistic low base rate of suspicious activity |
Use words rather than symbols.
| ## What's included | ||
| - **Model**: `Account`, the `transfers_to` relationship, and the classification + expansion rules |
Use words rather than symbols, throughout the README.
| - **Sample data**: a 75-transfer ledger across 54 accounts — a structuring ring and a large sender embedded in a legitimate-traffic majority (6 of 54 accounts flag) | ||
| - **Outputs**: printed tables (network overview, per-account volume, suspects, counterparties, investigation set) | ||
| ## Prerequisites |
Follow the structure here: https://github.com/RelationalAI/templates/blob/main/sample-template/README.md#prerequisites
| `data/transactions.csv` is a 75-transfer ledger across 54 accounts, with columns `id, src, dst, amount`. Most of it is legitimate small-business traffic (the `C3xxx` accounts — payroll runs, vendor invoices, retail transfers) that never flags. The suspicious activity is a small minority: `C2001–C2005` form a ring that cycles money in amounts just under the $10,000 reporting threshold (structuring), and `C1001` makes one large $60,000 transfer. In all, 6 of 54 accounts flag (~11%) — a realistic AML base rate. A few near-miss transfers (8,900; exactly 10,000; 49,000; exactly 50,000) sit right on the threshold edges and deliberately stay clean. | ||
| ## Model overview |
Follow the pattern here: https://github.com/RelationalAI/templates/blob/main/sample-template/README.md#model-overview
| model.where(Account.transfers_to(_other), _other.is_suspect()).define(Account.near_suspect()) | ||
| ``` | ||
| ## Customize this template |
Rewrite the README in prose (no symbol shorthand or sentence fragments) and align section structure with sample-template:
- What this template is for: problem statement and motivation, no capability table
- Who this is for: add assumed-knowledge note
- Prerequisites: split into Access and Tools
- Model overview: key entities, identifier, invariants, plus per-concept and relationship tables
- Customize this template: Use your own data / Tune parameters / Extend the model / Scale up
Front-matter description rewritten as a full sentence; template index regenerated. How-it-works code snippets unchanged (verbatim from the script).
…k and script
Extend the README review fixes (words, not symbol shorthand) to the other prose artifacts: drop '+'/'&'/'/' connectives from the runbook prose and chain diagram and from the script docstring/comments. Keep the prescribed '# Define semantic model & load data' banner and the runbook's structural chain-ASCII arrows. Behavior unchanged (comments/docstring only); script runs identically.
…nners
Mirrors the README issues reviewers flagged on PR #80:
- Reword the front-matter description as one plain sentence (no colon-fragment).
- Lead "What this template is for" with the business problem and value, naming the reasoners in a single bold sentence rather than a technical breakdown.
- Add an assumed-knowledge line to "Who this is for".
- Replace arrow/inequality symbols in prose with words (code blocks and the data-flow diagram keep their arrows).
Also align the script's stage banners to the multi-reasoner convention so the prescriptive stage reads as "Stage 3" instead of the single-reasoner "Model the decision problem" / "Solve and check solution" headings. Comment-only; logic and output unchanged from the validated run.
…sses PR #80 review feedback)
Rewrites the README to the sample-template standard the reviewer is applying to new templates: 'What this template is for' is now a broad-audience problem statement and motivation with a bold reasoning-type sentence; adds assumed knowledge, an Access/Tools Prerequisites split, a Model overview, and Customize subsections; and uses words and sentences rather than symbols and fragments throughout the prose. Also drops trailing commas after the last where(...) condition per the cleanup-template-code convention (a no-op; re-run confirms identical output). Validated: sample output matches the live run line-for-line, all code snippets are verbatim from the script, and the runbook reproduced blind end-to-end.
What
A beginner, rules-based template that runs entirely on a local DuckDB database — no Snowflake account or Native App. It triages a transfer ledger: classifies structuring and large-sender accounts, flags suspects, and builds an investigation set (suspects plus everyone who transacted directly with one) via a relationship self-join.
Local-development counterpart to the Snowflake-backed templates — same ontology → rules → query workflow, on an engine you can run anywhere.
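The triage described above can be sketched in plain Python, without RelationalAI or DuckDB, to show the shape of the logic. This is a minimal illustration under stated assumptions, not the template's code: the threshold constants are inferred from the dataset description (9,400 and 9,900 flag; 8,900, 10,000, 49,000 and 50,000 stay clean; 60,000 flags), the miniature ledger is made up, and the one-hop expansion is assumed to follow transfers in both directions.

```python
# Assumed thresholds, inferred from the PR description; the template's real values may differ.
STRUCTURING_FLOOR = 9_000      # assumed lower edge of the "just under the threshold" band
REPORTING_THRESHOLD = 10_000   # transfers at or above this do not count as structuring
LARGE_TRANSFER = 50_000        # assumed: only amounts strictly above this flag

# Made-up miniature ledger using the same columns as the CSV: id, src, dst, amount.
TRANSFERS = [
    ("T1", "C2001", "C2002", 9_400),   # structuring-sized
    ("T2", "C2002", "C2001", 9_900),   # structuring-sized
    ("T3", "C1001", "C3001", 60_000),  # large sender
    ("T4", "C3002", "C2001", 1_200),   # legitimate payer that touches a suspect
    ("T5", "C3003", "C3004", 8_900),   # near miss: below the assumed band
    ("T6", "C3004", "C3005", 10_000),  # near miss: exactly at the threshold
    ("T7", "C3005", "C3006", 49_000),  # near miss: below the large-transfer line
    ("T8", "C3006", "C3003", 50_000),  # near miss: exactly at the large-transfer line
]


def triage(transfers):
    """Return (suspects, under_review) as sets of account ids."""
    structuring = {
        src for _, src, _, amount in transfers
        if STRUCTURING_FLOOR <= amount < REPORTING_THRESHOLD
    }
    large_sender = {src for _, src, _, amount in transfers if amount > LARGE_TRANSFER}
    suspects = structuring | large_sender  # either flag makes an account a suspect

    # One-hop expansion: match each transfer against the suspect set on both ends.
    near_suspect = set()
    for _, src, dst, _ in transfers:
        if src in suspects:
            near_suspect.add(dst)
        if dst in suspects:
            near_suspect.add(src)

    return suspects, suspects | near_suspect


suspects, under_review = triage(TRANSFERS)
print("suspects:", sorted(suspects))
print("under review:", sorted(under_review))
```

In this toy ledger the four boundary transfers leave their senders unflagged, and the investigation set grows only by the two legitimate accounts that transacted with a suspect.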
Reasoning
is_structuring, is_large_sender, is_suspect (OR-chained), near_suspect, under_review.
transfers_to relationship (the local way to answer connectivity, without the graph reasoner).
Requires relationalai >= 1.13
Local DuckDB execution is enabled by a deployment section:
The programmatic-config deploy path this template relies on was stabilized in 1.12; 1.13 is the floor.
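A script that depends on this floor could guard against an older install before doing any work. The following is a hedged sketch rather than anything in the PR: `meets_floor` and `check_installed` are hypothetical helper names, and only the major and minor version components are compared.

```python
import re
from importlib.metadata import PackageNotFoundError, version

FLOOR = (1, 13)  # from the PR description: relationalai >= 1.13


def meets_floor(version_string, floor=FLOOR):
    """Compare the leading major.minor of a version string against the floor."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    return (int(match.group(1)), int(match.group(2))) >= floor


def check_installed(package="relationalai"):
    """Describe whether the installed package satisfies the floor."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    if meets_floor(installed):
        return f"{package} {installed} meets the {FLOOR[0]}.{FLOOR[1]} floor"
    return f"{package} {installed} is below the {FLOOR[0]}.{FLOOR[1]} floor"


print(check_installed())
```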
Files
Also regenerates the template index (v1/README.md + root README.md) from the template's front matter via scripts/generate_version_indexes.py.
Dataset
A 75-transfer ledger across 54 accounts with a realistic AML base rate: only 6 of 54 accounts (~11%) flag. The suspicious activity is a small, legible subgraph: C2001–C2005 form a structuring ring (transfers of 9,400–9,900, just under the 10,000 threshold) and C1001 makes one 60,000 transfer, all embedded in a legitimate C3xxx payment network (payroll, vendor invoices, retail transfers) that never flags. Four near-miss transfers (8,900; exactly 10,000; 49,000; exactly 50,000) sit on the rule boundaries and deliberately stay clean, exercising the sharp edges of the thresholds.
Verification
py_compile + ruff clean.
Notes / open items
fraud-detection (multi-reasoner, Snowflake) and commercial_underwriting (rules).