feat(datafabric): surface system fields under prompt v2 with business-field precedence by milind-jain-uipath · Pull Request #916 · UiPath/uipath-langchain-python

milind-jain-uipath · 2026-06-17T15:20:00Z

What

Surfaces the auto-added Data Fabric system/audit fields (Id, CreateTime, UpdateTime, CreatedBy, UpdatedBy) to the inner SQL agent and renders field descriptions in the schema table, while keeping business fields authoritative when a business field overlaps a system field's concept.

Changes

prompts/v1.py — adds a SYSTEM / AUDIT FIELDS step under FIELD SELECTION: when a business (non-system) field overlaps a system field's concept, prefer the business field, judged by field names and descriptions. System fields are whatever the backend tags is_system_field, not a hardcoded set.
datafabric_prompt_builder.py — system fields are surfaced (tagged system) and excluded from the derived query patterns; the schema table includes a Description column. Hidden fields stay filtered.
datafabric_tool/models.py — FieldSchema.is_system_field; display_type appends a system modifier.
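
The model change can be pictured with a minimal sketch. It assumes a dataclass-style `FieldSchema`; the attribute names other than `is_system_field` and `display_type`, the `DATETIME` type given to `CreateTime`, and the exact modifier format are illustrative guesses, not the code in `datafabric_tool/models.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSchema:
    """Hypothetical stand-in for the real model, for illustration only."""

    name: str
    sql_type: str
    description: str = ""
    # Populated from the backend tag, never inferred from a hardcoded name list.
    is_system_field: bool = False

    @property
    def display_type(self) -> str:
        # Append a "system" modifier so the schema table tags the field.
        modifiers = ["system"] if self.is_system_field else []
        return ", ".join([self.sql_type, *modifiers])


order_date = FieldSchema("OrderDate", "NVARCHAR", "Date the customer placed the order")
# The SQL type of CreateTime is an assumption; the PR does not state it.
create_time = FieldSchema(
    "CreateTime", "DATETIME", "System built-in field", is_system_field=True
)

print(order_date.display_type)
print(create_time.display_type)
```

Keeping the tag on the model means the prompt builder only has to read `display_type` and does not need its own list of system field names.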

Notes

Folded into the existing v1 prompt (no separate version / default flip) and applied unconditionally in the builder — no per-version rendering policy.

Tests

test_datafabric_prompt_builder.py: system fields surfaced + tagged + descriptions, query patterns exclude system fields, ECP strategy + domain guidance preserved. Datafabric + circular-import tests pass; mypy and ruff clean.
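
As a rough illustration of the query-pattern check, the sketch below uses a hypothetical `derive_query_patterns` helper and plain dictionaries. The real builder API and the real test code in `test_datafabric_prompt_builder.py` are not shown in this PR description, so every name here is an assumption.

```python
def derive_query_patterns(entity: str, fields: list[dict]) -> list[str]:
    """Hypothetical helper: build example queries from business fields only."""
    business = [f["name"] for f in fields if not f.get("is_system_field")]
    return [f"SELECT {name} FROM {entity} LIMIT 10" for name in business]


def test_query_patterns_exclude_system_fields() -> None:
    fields = [
        {"name": "OrderDate", "is_system_field": False},
        {"name": "CreateTime", "is_system_field": True},
    ]
    patterns = derive_query_patterns("OrderTest", fields)
    # Business fields appear in the derived patterns; system fields do not.
    assert any("OrderDate" in p for p in patterns)
    assert not any("CreateTime" in p for p in patterns)


test_query_patterns_exclude_system_fields()
```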

Local validation — system-field guidance (with vs without)

Data Fabric System-Field Guidance — Validation

A test to check that the prompt's SYSTEM / AUDIT FIELDS guidance makes the
agent pick the business field instead of a similar system field.

What we are checking

Every entity has auto-added system fields (Id, CreateTime, UpdateTime,
CreatedBy, UpdatedBy). When a business field means almost the same thing as
one of these, the agent should use the business field. We check that the
guidance makes this happen, and that without the guidance the agent picks the
system field instead.

Test entities

Created in Data Fabric (alpha, tenant entity/a4e, folder
860c6b05-98ba-44a4-970e-760ad956547d). Both entities have the same fields and
the same rows. The only difference is whether the business fields have
descriptions.

| Entity | Id | Field descriptions |
| --- | --- | --- |
| `OrderTest` | `d1ae6723-836a-f111-8fcb-0022482a9c89` | yes |
| `OrderTestNoDesc` | `3d1bae30-836a-f111-8fcb-0022482a9c89` | no |

Fields

| Field | SQL type | Description (`OrderTest`) | Overlaps with |
| --- | --- | --- | --- |
| `OrderNumber` | NVARCHAR | Human-facing order number shown to customers | `Id` |
| `Status` | NVARCHAR | Order status: Open, Shipped, or Cancelled | none |
| `Amount` | DECIMAL | Total order amount in USD | none |
| `OrderDate` | NVARCHAR | Date the customer placed the order | `CreateTime` |
| `SalesRep` | NVARCHAR | Name of the sales representative who owns the order | `CreatedBy` |

The system fields (Id, CreateTime, UpdateTime, CreatedBy, UpdatedBy)
are added by the backend with isSystemField=true, isHiddenField=false, and
the same generic description "System built-in field".
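
To make the rendering rule concrete, here is a minimal sketch of how a schema table could be built from backend field metadata. The `isSystemField` and `isHiddenField` keys come from the description above; the `name`, `sqlType`, and `description` keys, the `InternalFlag` field, the `DATETIME` type, and the table layout are assumptions for illustration, not the actual builder output.

```python
def render_schema_table(fields: list[dict]) -> str:
    """Render a schema table with a Description column; hidden fields are dropped."""
    rows = ["| Field | Type | Description |", "| --- | --- | --- |"]
    for field in fields:
        if field.get("isHiddenField"):
            continue  # hidden fields stay filtered
        type_label = field["sqlType"]
        if field.get("isSystemField"):
            type_label += ", system"  # tag system fields for the agent
        rows.append(f"| {field['name']} | {type_label} | {field.get('description', '')} |")
    return "\n".join(rows)


fields = [
    {"name": "OrderDate", "sqlType": "NVARCHAR",
     "description": "Date the customer placed the order"},
    {"name": "CreateTime", "sqlType": "DATETIME", "isSystemField": True,
     "isHiddenField": False, "description": "System built-in field"},
    # Hypothetical hidden field, included only to show the filter.
    {"name": "InternalFlag", "sqlType": "BIT", "isHiddenField": True},
]

table = render_schema_table(fields)
print(table)
```

Because every system field carries the same generic description, the `system` tag is the only signal in the table that separates it from a business field with a similar name.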

Seed data (8 rows)

OrderDate is a past date. CreateTime is the time the row was inserted
(~2026-06-17). They are different on purpose, so a wrong field choice is easy to
spot in the answer.

| OrderNumber | Status | Amount | OrderDate | SalesRep |
| --- | --- | --- | --- | --- |
| ORD-1001 | Shipped | 1250.00 | 2023-01-15 | Alice Chen |
| ORD-1002 | Open | 875.50 | 2023-03-22 | Bob Martinez |
| ORD-1003 | Cancelled | 430.00 | 2023-05-10 | Alice Chen |
| ORD-1004 | Shipped | 2100.75 | 2023-07-01 | Carla Diaz |
| ORD-1005 | Open | 99.99 | 2024-02-14 | Bob Martinez |
| ORD-1006 | Shipped | 560.25 | 2024-04-30 | Carla Diaz |
| ORD-1007 | Cancelled | 1500.00 | 2024-06-18 | Alice Chen |
| ORD-1008 | Open | 320.40 | 2024-09-09 | Dan Kapoor |
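
The gap between the two dates can be reproduced locally with a small sketch. It uses an in-memory SQLite table as a stand-in, which is an assumption: Data Fabric does not run on SQLite, and the column types here are simplified. Only the ORD-1001 row is loaded, with the `CreateTime` value reported for that row in the Arm A result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE OrderTestNoDesc ("
    "OrderNumber TEXT, Status TEXT, Amount REAL, "
    "OrderDate TEXT, SalesRep TEXT, CreateTime TEXT)"
)
conn.execute(
    "INSERT INTO OrderTestNoDesc VALUES (?, ?, ?, ?, ?, ?)",
    ("ORD-1001", "Shipped", 1250.00, "2023-01-15", "Alice Chen",
     "2026-06-17T19:32:02.5744477+00:00"),
)


def lookup(column: str) -> str:
    # Column names cannot be bound as parameters, so restrict them to an allow-list.
    if column not in {"OrderDate", "CreateTime"}:
        raise ValueError(f"unexpected column: {column}")
    row = conn.execute(
        f"SELECT {column} FROM OrderTestNoDesc WHERE OrderNumber = ? LIMIT 1",
        ("ORD-1001",),
    ).fetchone()
    return row[0]


business_answer = lookup("OrderDate")   # when the customer placed the order
system_answer = lookup("CreateTime")    # when the row was inserted
print(business_answer)
print(system_answer)
```

The two values differ by more than three years, so an answer built on the wrong column is obvious at a glance.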

Prompts compared

Both prompts show the system fields (tagged system) and show field
descriptions. The only difference is whether the SYSTEM / AUDIT FIELDS
guidance is in the prompt.

| Prompt | System fields shown | Guidance |
| --- | --- | --- |
| with guidance (shipped) | yes | yes |
| without guidance (control) | yes | no |

The "without guidance" prompt was a temporary local change, picked per run with
an env variable, and removed before merge.
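
A per-run toggle of this kind might look like the sketch below. The environment variable name `DATAFABRIC_OMIT_SYSTEM_FIELD_GUIDANCE`, the `build_prompt` function, and both prompt strings are hypothetical; the guidance text is a paraphrase of the PR description, not the shipped prompt, and the actual temporary change was not published.

```python
import os
from typing import Mapping, Optional

# Paraphrased placeholders, not the shipped prompt text.
BASE_PROMPT = "FIELD SELECTION: choose the fields that answer the question."
SYSTEM_FIELD_GUIDANCE = (
    "SYSTEM / AUDIT FIELDS: when a business field overlaps a system field's "
    "concept, prefer the business field."
)


def build_prompt(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the prompt, dropping the guidance when the control flag is set."""
    env = os.environ if env is None else env
    omit = env.get("DATAFABRIC_OMIT_SYSTEM_FIELD_GUIDANCE", "") == "1"
    parts = [BASE_PROMPT] if omit else [BASE_PROMPT, SYSTEM_FIELD_GUIDANCE]
    return "\n".join(parts)


print(build_prompt({}))
print(build_prompt({"DATAFABRIC_OMIT_SYSTEM_FIELD_GUIDANCE": "1"}))
```

Passing the environment in as a mapping keeps the two arms easy to compare in one process without mutating `os.environ`.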

How we ran it

A local agent run used a custom agent.json pointing at the two entities.

- Model: gpt-4.1-2025-04-14, temperature 0, engine basic-v2
- Same question each time: "when was order ORD-1001 created?"
- The input named the entity so the run hit the right one, e.g. "Using OrderTest, when was order ORD-1001 created?". Only the entity name changed; the question stayed the same.
- Three runs:
  - without guidance, on OrderTestNoDesc (no description)
  - with guidance, on OrderTestNoDesc (no description)
  - without guidance, on OrderTest (has description)

The word "created" is close to the system field name CreateTime on purpose.

Results

Arm A — without guidance, on `OrderTestNoDesc` (no description)

Tool call → OrderTestNoDesc(user_query="When was order ORD-1001 created?")

SQL → execute_sql:

SELECT OrderNumber, CreateTime FROM OrderTestNoDesc WHERE OrderNumber = 'ORD-1001' LIMIT 1

Result:

{'records': [{'CreateTime': '2026-06-17T19:32:02.5744477+00:00', 'OrderNumber': 'ORD-1001'}], 'total_count': 1}

Answer:

Order ORD-1001 was created on 2026-06-17T19:32:02.5744477+00:00.

Wrong — it used the system field CreateTime (the insert time).

Arm B — with guidance, on `OrderTestNoDesc` (no description)

Tool call → OrderTestNoDesc(user_query="When was order ORD-1001 created?")

SQL → execute_sql:

SELECT OrderDate FROM OrderTestNoDesc WHERE OrderNumber = 'ORD-1001' LIMIT 1

Result:

{'records': [{'OrderDate': '2023-01-15'}], 'total_count': 1}

Answer:

Order ORD-1001 was created on 2023-01-15.

Right — it used the business field OrderDate, with no description to help.

Arm C — without guidance, on `OrderTest` (has description)

Tool call → OrderTest(user_query="When was order ORD-1001 created?")

SQL → execute_sql:

SELECT OrderDate FROM OrderTest WHERE OrderNumber = 'ORD-1001' LIMIT 1

Result:

{'records': [{'OrderDate': '2023-01-15'}], 'total_count': 1}

Answer:

Order ORD-1001 was created on 2023-01-15.

Right — the description was enough to pick OrderDate, even with no guidance.

Summary

Same question, model, and setup every time. System fields shown in all runs.
Field the agent chose (✅ business OrderDate / ❌ system CreateTime):

| Prompt | `OrderTestNoDesc` (no description) | `OrderTest` (description) |
| --- | --- | --- |
| without guidance | `CreateTime` ❌ | `OrderDate` ✅ |
| with guidance | `OrderDate` ✅ | `OrderDate` ✅ (expected) |

Without the guidance, the agent only gets it right when a description helps
(Arm C right, Arm A wrong).
With the guidance, the agent gets it right even with no description (Arm B).

The with-guidance + OrderTest cell was not run. Both of its inputs (guidance on,
description present) already give the right answer on their own in the other runs.
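
The pass/fail call for each arm can also be checked mechanically. The sketch below is one possible way to do it, not part of the PR: it pulls the selected columns out of the SQL recorded for each arm and flags any system field. The hardcoded field set is acceptable here only because it mirrors the five fields listed in this write-up; the shipped prompt relies on the backend tag instead.

```python
import re

# The five auto-added fields named in this validation write-up.
SYSTEM_FIELDS = {"Id", "CreateTime", "UpdateTime", "CreatedBy", "UpdatedBy"}


def selected_columns(sql: str) -> list[str]:
    """Extract the comma-separated column list from a simple SELECT statement."""
    match = re.search(r"select\s+(.*?)\s+from\s", sql, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError("not a simple SELECT ... FROM statement")
    return [column.strip() for column in match.group(1).split(",")]


def used_system_field(sql: str) -> bool:
    return any(column in SYSTEM_FIELDS for column in selected_columns(sql))


# SQL copied from the three recorded runs.
arm_sql = {
    "A": "SELECT OrderNumber, CreateTime FROM OrderTestNoDesc WHERE OrderNumber = 'ORD-1001' LIMIT 1",
    "B": "SELECT OrderDate FROM OrderTestNoDesc WHERE OrderNumber = 'ORD-1001' LIMIT 1",
    "C": "SELECT OrderDate FROM OrderTest WHERE OrderNumber = 'ORD-1001' LIMIT 1",
}

verdicts = {arm: used_system_field(sql) for arm, sql in arm_sql.items()}
print(verdicts)
```

This only handles flat single-table SELECT statements like the ones above; aliases, functions, or subqueries would need a real SQL parser.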

Copilot

Pull request overview

This PR introduces a new Data Fabric SQL prompt version (v2) that surfaces Data Fabric’s auto-added system/audit fields to the inner SQL agent while encoding “business-field precedence” guidance to avoid the model choosing system columns when a business column better represents the user’s intent.

Changes:

Added prompt v2 with a dedicated SYSTEM / AUDIT FIELDS grounding section and made it the default prompt version.
Extended the prompt registry with schema-rendering policy flags (include system fields, render descriptions) and updated the builder to follow those flags rather than checking version names.
Updated schema/context modeling and rendering to tag system fields (system) and (for v2) include a Description column, while ensuring derived query patterns do not use system fields.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/agent/tools/test_datafabric_prompt_builder.py | Adds/updates tests to verify v1 baseline stability, v2 system-field surfacing + tagging, description-column rendering, query-pattern exclusion, and v2 defaulting. |
| src/uipath_langchain/agent/tools/datafabric_tool/prompts/v2.py | Adds the v2 SQL prompt template including explicit system/audit-field precedence rules and guidance. |
| src/uipath_langchain/agent/tools/datafabric_tool/prompts/registry.py | Registers v2 and adds policy flags to `PromptVersion`; flips `DEFAULT_PROMPT_VERSION` to v2. |
| src/uipath_langchain/agent/tools/datafabric_tool/models.py | Adds `FieldSchema.is_system_field` and appends a `system` modifier to `display_type`. |
| src/uipath_langchain/agent/tools/datafabric_tool/datafabric_prompt_builder.py | Gates inclusion of system fields and description rendering via registry flags; excludes system fields from derived query patterns. |

milind-jain-uipath · 2026-06-17T16:28:20Z

+    include_system_fields: bool = False
+    """Surface auto-added system/audit fields (Id, CreateTime, UpdateTime,
+    CreatedBy, UpdatedBy) in the entity schema table, tagged ``system``."""
+    render_field_descriptions: bool = False
+    """Render a Description column in the entity schema table."""


UIPath-Harshit · 2026-06-18T02:16:11Z

Update v1 instead. This is not yet being actively used by customers.

UIPath-Harshit · 2026-06-18T02:31:21Z

 }

-DEFAULT_PROMPT_VERSION = "v1"
+DEFAULT_PROMPT_VERSION = "v2"


Keep it at v1 and make the updates there.

UIPath-Harshit · 2026-06-18T02:33:07Z

+
+# Versions absent from this map fall back to the default (legacy) rendering:
+# no system fields, no description column.
+_POLICIES: dict[str, SchemaRenderPolicy] = {


I don't think this is warranted. Ideally this should be shaped on the server side, and a policy may not be needed to render it into the prompt. A policy in this context is basically a construct of how a process is explored and exploited. Read up on RL's policy evaluation and value iteration constructs to see how Claude may have redirected you here.

Got it, addressed.

Surface auto-added system/audit fields (Id, CreateTime, UpdateTime, CreatedBy, UpdatedBy) to the inner Data Fabric SQL agent and render field descriptions in the schema table, while keeping business fields authoritative when concepts overlap.

- v1 template: add a SYSTEM / AUDIT FIELDS step under FIELD SELECTION: when a business field overlaps a system field's concept, prefer the business field, judged by field names and descriptions. System fields are whatever the backend tags is_system_field, not a hardcoded set.
- prompt builder: system fields are surfaced (tagged `system`) and excluded from the derived query patterns; the schema table always includes a Description column. Hidden fields stay filtered.
- models: FieldSchema.is_system_field; display_type appends a `system` modifier.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

sonarqubecloud · 2026-06-18T05:48:16Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
100.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

Copilot AI review requested due to automatic review settings June 17, 2026 15:20

Copilot started reviewing on behalf of milind-jain-uipath June 17, 2026 15:20

Copilot AI reviewed Jun 17, 2026

milind-jain-uipath force-pushed the feat/datafabric-system-fields-prompt-v2 branch 3 times, most recently from eaa36bd to 37aa826 Compare June 17, 2026 16:28

milind-jain-uipath requested a review from UIPath-Harshit June 17, 2026 16:29

UIPath-Harshit reviewed Jun 18, 2026

milind-jain-uipath force-pushed the feat/datafabric-system-fields-prompt-v2 branch from 37aa826 to a94023c Compare June 18, 2026 04:56

Merge branch 'main' into feat/datafabric-system-fields-prompt-v2

23ef72d

UIPath-Harshit approved these changes Jun 18, 2026

milind-jain-uipath wants to merge 2 commits into main from feat/datafabric-system-fields-prompt-v2

3 participants
