feat(datafabric): fetch ontology R2RML alongside OWL #935

Open
sankalp-uipath wants to merge 2 commits into feat/datafabric-ontology-fetch-tool from feat/datafabric-ontology-r2rml-grounding

@sankalp-uipath

What

fetch_ontology now retrieves each configured ontology's R2RML mapping in addition to its OWL schema, and hands both to the inner SQL agent as grounding.

  • ontology_fetch_tool.py: OntologyFetcher fetches both owl and r2rml per ontology (via EntitiesService.get_ontology_file_async) and concatenates them. The R2RML block is framed as the ontology→entity/column mapping so the LLM can translate ontology terms into real column names for SQL.
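
The fetch-and-concatenate behavior described above can be sketched roughly as follows. This is an illustration only, not the PR's code: `StubEntitiesService`, its in-memory `files` dict, the `(name, file_type)` call shape, and the block headings are assumptions. Only the `file_type` values `owl` and `r2rml` come from the description.

```python
import asyncio


class StubEntitiesService:
    """Hypothetical stand-in for EntitiesService; the real signature may differ."""

    def __init__(self, files):
        self.files = files  # {(ontology_name, file_type): text}

    async def get_ontology_file_async(self, name, file_type):
        return self.files[(name, file_type)]


async def fetch_grounding(service, name):
    """Fetch OWL and R2RML for one ontology and join them into one text block."""
    owl = await service.get_ontology_file_async(name, file_type="owl")
    r2rml = await service.get_ontology_file_async(name, file_type="r2rml")
    # Label the R2RML part as a mapping so its role is clear in the prompt text.
    return (
        f"## Ontology '{name}' (OWL)\n{owl}\n\n"
        f"## Ontology-to-entity/column mapping for '{name}' (R2RML)\n{r2rml}"
    )


service = StubEntitiesService(
    {
        ("sales", "owl"): ":Customer a owl:Class .",
        ("sales", "r2rml"): 'rr:logicalTable [ rr:tableName "Customer" ] .',
    }
)
grounding = asyncio.run(fetch_grounding(service, "sales"))
print(grounding)
```

The OWL text always comes first and the mapping second, so the model reads the term definitions before the table/column names they resolve to.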

Why

The OWL gives the LLM class/property names; the R2RML mapping tells it which entity table/column each of those maps to — directly improving SQL accuracy. This is grounding text only (call it M1.5); the executable R2RML flow via an inference engine (Ontop) remains a later milestone.

Notes

  • OWL required, R2RML optional: a missing/empty/oversized OWL still falls back to "proceed using entity schemas"; a missing R2RML is skipped silently (no noise), since most ontologies have no mapping yet.
  • No new tool arguments / no LLM-controlled file type — names and folders stay pinned from config, so the model can't redirect the fetch.
  • No SDK or agent.json change: get_ontology_file_async already supported file_type="r2rml"; r2rml is in the allowlist.
  • Renamed _MAX_OWL_BYTES → _MAX_FILE_BYTES (now caps both file types).
  • Tests: both file types requested per ontology; R2RML-present produces a mapping block; absent R2RML is skipped without a fallback message. Full tests/agent/tools/ suite green (654).
  • Stacked on #911, "feat(datafabric): add fetch_ontology tool to DF inner SQL agent" (feat/datafabric-ontology-fetch-tool): base this PR on that branch, or merge #911 first. Unit/lint CI stays red until SDK 2.11.11 publishes (same cross-repo constraint as #911).
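
A minimal sketch of the "OWL required, R2RML optional" rule from the notes above, assuming a single helper handles both file types. The `fetch_file` helper, the injected `fetch` callable, the empty-file and size checks inside the `try`, and the 64 KiB value for `_MAX_FILE_BYTES` are all hypothetical. The log calls and the fallback message follow the snippet quoted in the review.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

# Hypothetical cap; the PR does not state the real value of _MAX_FILE_BYTES.
_MAX_FILE_BYTES = 64 * 1024


async def fetch_file(fetch, name, file_type, optional=False):
    """Return file text, "" for a skipped optional file, or a fallback message."""
    try:
        text = await fetch(name, file_type)
        if not text or not text.strip():
            raise ValueError("empty file")
        if len(text.encode("utf-8")) > _MAX_FILE_BYTES:
            raise ValueError("file exceeds size cap")
        return text
    except Exception as e:
        if optional:
            # Optional file: log quietly and contribute nothing to the output.
            logger.info(
                "Optional %s for ontology %r unavailable: %s", file_type, name, e
            )
            return ""
        logger.warning("Ontology fetch failed for %r: %s", name, e)
        return (
            f"Ontology '{name}' is unavailable ({type(e).__name__}). "
            "Proceed using the entity schemas in the system prompt."
        )


async def demo():
    async def only_owl(name, file_type):
        # Simulates an ontology that has an OWL file but no mapping yet.
        if file_type == "owl":
            return ":Customer a owl:Class ."
        raise FileNotFoundError(file_type)

    async def nothing(name, file_type):
        raise FileNotFoundError(file_type)

    owl = await fetch_file(only_owl, "sales", "owl")
    r2rml = await fetch_file(only_owl, "sales", "r2rml", optional=True)
    missing = await fetch_file(nothing, "sales", "owl")
    return owl, r2rml, missing


owl, r2rml, missing = asyncio.run(demo())
```

Here `r2rml` comes back as an empty string, so no mapping block and no fallback text would be emitted, while `missing` carries the "proceed using the entity schemas" message.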

Copilot AI review requested due to automatic review settings June 24, 2026 18:52

Copilot AI left a comment


Pull request overview

This PR extends the Data Fabric inner SQL agent’s fetch_ontology tool to retrieve an ontology’s optional R2RML mapping alongside its OWL schema, so the LLM is grounded not only in semantic terms but also in how they map onto concrete entity tables/columns.

Changes:

  • Fetch both owl and r2rml per configured ontology via EntitiesService.get_ontology_file_async, concatenating results with deterministic ordering and caching.
  • Treat R2RML as optional (silently skipped on failure) while keeping OWL as required (graceful degradation message on failure/oversize/empty).
  • Expand unit tests to assert both file types are requested and to verify the R2RML-present/absent behaviors.
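
The "deterministic ordering and caching" mentioned in the first change can be illustrated with a small sketch. This page does not show how the PR actually orders or caches results, so `CachedFetcher` and everything in it is hypothetical. The one library fact relied on is that `asyncio.gather` returns results in argument order rather than completion order.

```python
import asyncio


class CachedFetcher:
    """Hypothetical cache wrapper; not the PR's OntologyFetcher."""

    def __init__(self, fetch, names):
        self._fetch = fetch
        self._names = list(names)  # pinned order, e.g. taken from config
        self._cached = None
        self.calls = 0

    async def get(self):
        if self._cached is None:
            self.calls += 1
            # gather returns results in argument order, not completion order.
            parts = await asyncio.gather(*(self._fetch(n) for n in self._names))
            # Drop empty parts (e.g. skipped optional files) before joining.
            self._cached = "\n\n".join(p for p in parts if p)
        return self._cached


async def demo():
    async def fetch(name):
        # Make the first ontology finish last to show the order stays stable.
        await asyncio.sleep(0.02 if name == "a" else 0)
        return f"[{name}]"

    fetcher = CachedFetcher(fetch, ["a", "b"])
    first = await fetcher.get()
    second = await fetcher.get()
    return first, second, fetcher.calls


first, second, calls = asyncio.run(demo())
```

The second `get()` reuses the stored text, so repeated tool calls in one session would not refetch the files.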

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/uipath_langchain/agent/tools/datafabric_tool/ontology_fetch_tool.py | Fetches and formats OWL plus optional R2RML blocks, adds shared file size cap, and updates tool description. |
| tests/agent/tools/test_ontology_fetch_tool.py | Adds test coverage for requesting both file types and for R2RML present/absent output behavior. |

Comment on lines +96 to +101
    if optional:
        # Absent/oversized optional file — skip it without noise.
        logger.info(
            "Optional %s for ontology %r unavailable: %s", file_type, name, e
        )
        return ""
Comment on lines 165 to +170
      f"Fetch the OWL 2 QL ontologies (the authoritative semantic schema) "
    - f"for: {names}. Call this BEFORE writing SQL: it gives the exact "
    - "class and property names, value formats, and relationships so your "
    - "SQL uses the real schema instead of guesses. Takes no arguments."
    + f"and, when available, their R2RML mappings (ontology-to-entity/column "
    + f"mapping) for: {names}. Call this BEFORE writing SQL: it gives the "
    + "exact class and property names, value formats, relationships, and how "
    + "they map to entity columns, so your SQL uses the real schema instead "
    + "of guesses. Takes no arguments."

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment on lines 95 to 106
    except Exception as e:
        if optional:
            # Absent/oversized optional file — skip it without noise.
            logger.info(
                "Optional %s for ontology %r unavailable: %s", file_type, name, e
            )
            return ""
        logger.warning("Ontology fetch failed for %r: %s", name, e)
        return (
            f"Ontology '{name}' is unavailable ({type(e).__name__}). "
            "Proceed using the entity schemas in the system prompt."
        )

sankalp-uipath force-pushed the feat/datafabric-ontology-r2rml-grounding branch from f68e00e to 9f45a34 on June 25, 2026 08:17
sankalp-uipath force-pushed the feat/datafabric-ontology-r2rml-grounding branch from 9f45a34 to 2647337 on June 25, 2026 18:16