Lightdash Python SDK Guide

This guide covers the query builder, filtering system, and other features of the Lightdash Python SDK.

Quick Start

from lightdash import Client

client = Client(base_url="https://app.lightdash.cloud", token="your-token")
model = client.get_model("your_project_uuid", "your_model_name")

# Build and execute a query
result = (
    model.query()
    .metrics(model.metrics.revenue, model.metrics.profit)
    .dimensions(model.dimensions.country)
    .filter(model.dimensions.status == "active")
    .sort(model.metrics.revenue.desc())
    .limit(100)
    .execute()
)

df = result.to_df()

Query Builder

The SDK provides two patterns for building queries: single-call and chainable builder.

Single-Call Pattern

Pass all parameters at once—useful for simple queries:

query = model.query(
    metrics=[model.metrics.revenue, model.metrics.profit],
    dimensions=[model.dimensions.country],
    filters=model.dimensions.status == "active",
    sort=model.metrics.revenue.desc(),
    limit=100
)
result = query.execute()

Chainable Builder Pattern

Build queries incrementally with method chaining—each method returns a new immutable Query object:

query = (
    model.query()
    .metrics(model.metrics.revenue)
    .dimensions(model.dimensions.country, model.dimensions.date)
    .filter(model.dimensions.status == "active")
    .sort(model.metrics.revenue.desc())
    .limit(100)
)

Key characteristics:

Immutable: Each method returns a new Query object (safe for reuse)
Lazy evaluation: API calls only happen when .execute() is called
Order-independent: Methods can be called in any order
Composable: Create base queries and extend them

# Create a reusable base query
base = model.query().metrics(model.metrics.revenue).dimensions(model.dimensions.country)

# Extend it for different use cases
by_active = base.filter(model.dimensions.status == "active")
by_inactive = base.filter(model.dimensions.status == "inactive")

Filters

Filters let you constrain your query results. The SDK provides an intuitive, Pythonic API for creating filters.

Creating Filters with Operators

Use standard Python comparison operators on dimensions:

# Equality
f = model.dimensions.country == "USA"

# Inequality
f = model.dimensions.country != "USA"

# Numeric comparisons
f = model.dimensions.amount > 1000
f = model.dimensions.amount >= 1000
f = model.dimensions.amount < 500
f = model.dimensions.amount <= 500

Filter Helper Methods

For more complex filters, use helper methods:

# Check if value is in a list
f = model.dimensions.country.in_(["USA", "UK", "Canada"])

# String operations
f = model.dimensions.name.starts_with("John")
f = model.dimensions.name.ends_with("son")
f = model.dimensions.name.includes("Smith")

# Null checks
f = model.dimensions.email.is_null()
f = model.dimensions.email.is_not_null()

# Range checks
f = model.dimensions.order_date.between("2024-01-01", "2024-12-31")
f = model.dimensions.amount.not_between(100, 500)

Supported Operators by Data Type

Operator	Numeric	String	Boolean	Date
`is null`	Yes	Yes	Yes	Yes
`is not null`	Yes	Yes	Yes	Yes
`equals` / `is`	Yes	Yes	Yes	Yes
`is not`	Yes	Yes	-	Yes
`is less than`	Yes	-	-	-
`is greater than`	Yes	-	-	-
`starts with`	-	Yes	-	-
`ends with`	-	Yes	-	-
`includes`	-	Yes	-	-
`in the last`	-	-	-	Yes
`in the next`	-	-	-	Yes
`in the current`	-	-	-	Yes
`is before`	-	-	-	Yes
`is after`	-	-	-	Yes
`is between`	Yes	-	-	Yes
`is not between`	Yes	-	-	Yes

Combining Filters

Use & (AND) and | (OR) operators to combine filters:

# AND: Both conditions must be true
f = (model.dimensions.country == "USA") & (model.dimensions.amount > 1000)

# OR: Either condition must be true
f = (model.dimensions.status == "active") | (model.dimensions.status == "pending")

# Complex combinations
f = (
    (model.dimensions.country == "USA") &
    ((model.dimensions.amount > 1000) | (model.dimensions.priority == "high"))
)

Multiple .filter() calls on a query are combined with AND logic:

query = (
    model.query()
    .filter(model.dimensions.country == "USA")
    .filter(model.dimensions.amount > 1000)  # AND-ed with above
)

Multiple Filters on the Same Field

You can apply more than one condition to the same field—for example, to constrain a date dimension to a range:

query = model.query().filter(
    (model.dimensions.order_date >= "2026-01-01") &
    (model.dimensions.order_date <= "2026-01-31")
)

The same works across separate .filter() calls, since they are AND-ed together:

query = (
    model.query()
    .filter(model.dimensions.order_date >= "2026-01-01")
    .filter(model.dimensions.order_date <= "2026-01-31")
)

Filtering on Table Calculations

Define a table calculation, add it to the query, then filter on it with the same operators you use for dimensions. Table-calculation conditions are sent under filters.tableCalculations:

from lightdash import TableCalculation

profit_ratio = TableCalculation(
    name="profit_ratio",
    sql="${orders.profit} / ${orders.revenue}",
)

query = (
    model.query()
    .metrics(model.metrics.revenue, model.metrics.profit)
    .table_calculations(profit_ratio)
    .filter(profit_ratio > 0.2)  # only rows where the ratio exceeds 20%
)

Dimension and table-calculation filters can be combined with & (AND); each is serialized under its own key:

query = (
    model.query()
    .table_calculations(profit_ratio)
    .filter((model.dimensions.country == "USA") & (profit_ratio > 0.2))
)

Note on type: TableCalculation defaults type="number". The data type must be set for filter operators to compile — an untyped calculation is treated as a string by the API and rejects numeric operators like > or between. For a non-numeric calculation, pass type="string" (or date, timestamp, boolean).

Limitation — | (OR) across field types: Lightdash sends dimension, metric, and table-calculation filters as separate groups that are AND-ed together at the top level. An OR that mixes field types — e.g. (model.dimensions.country == "USA") | (profit_ratio > 0.2) — therefore cannot be expressed faithfully: it serializes as two independent groups that the server combines with AND, not OR. OR works as expected within a single field type (e.g. (profit_ratio > 0.8) | (profit_ratio < 0.2)).

Dimensions and Metrics

Access dimensions and metrics as attributes on the model:

# Access via attribute
country = model.dimensions.country
revenue = model.metrics.revenue

# List all available
all_dimensions = model.dimensions.list()
all_metrics = model.metrics.list()

Features:

Lazy loading: Fetched from API on first access, then cached
Fuzzy matching: Typos suggest closest matches
Tab completion: Works in Jupyter/IPython for discovery
Rich display: HTML rendering in notebooks

Sorting

Sort results using the .sort() method or Sort class:

from lightdash import Sort

# Using metric/dimension methods (recommended)
query = model.query().sort(model.metrics.revenue.desc())
query = model.query().sort(model.dimensions.country.asc())

# Multiple sorts
query = model.query().sort(
    model.metrics.revenue.desc(),
    model.dimensions.country.asc()
)

# Control null positioning
query = model.query().sort(model.dimensions.name.asc(nulls_first=True))

# Using Sort class directly
query = model.query().sort(Sort("orders_revenue", descending=True))

Results

Query results implement a unified ResultSet interface:

Converting Results

result = query.execute()

# To pandas DataFrame
df = result.to_df()  # or result.to_df(backend="pandas")

# To polars DataFrame
df = result.to_df(backend="polars")

# To list of dictionaries
records = result.to_records()

# To JSON string
json_str = result.to_json_str()

Iterating Over Results

# Iterate over rows
for row in result:
    print(row)

# Get total count
total = len(result)

Large Result Sets

There is no fixed SDK row cap. A query can return as many rows as the instance allows — its configured query.maxLimit (discovered automatically from /api/v1/health; e.g. 100,000 on Lightdash Cloud). Request more than that and the SDK raises a clear ValueError instead of letting the server silently return a truncated result:

# Fetch a large extract — pages are streamed transparently
result = model.query().metrics(model.metrics.revenue).limit(100_000).execute()
df = result.to_df()

# Asking for more than the instance allows fails loudly
model.query().limit(10_000_000).execute()
# ValueError: Limit 10000000 exceeds this instance's maximum query limit of 100000...

Large fetches page at the instance's maxPageSize to minimise round-trips, and every page uses the same size so no rows are skipped. To pull result sets larger than query.maxLimit, use a CSV/Excel export instead.

Pagination

For large result sets, results are paginated automatically:

result = query.execute()

# Access specific page
page_2 = result.page(2)

# Iterate through all pages
for page in result.iter_pages():
    process(page)

# Lazy DataFrame loading (polars only)
lazy_df = result.to_df_lazy()

Properties:

result.query_uuid - Unique identifier for the query
result.total_results - Total number of rows
result.total_pages - Number of pages
result.fields - Field metadata

Compiling to SQL

Use .compile() to get the warehouse SQL for a query without executing it. Nothing is run and no rows are fetched — it returns the exact SQL Lightdash would issue:

sql = (
    model.query()
    .metrics(model.metrics.revenue)
    .dimensions(model.dimensions.country)
    .filter(model.dimensions.status == "active")
    .compile()
)
print(sql)
# SELECT ... FROM ... WHERE ... GROUP BY ... LIMIT 500

This is handy for inspecting or debugging a query, or for running it directly against your warehouse when you need more rows than the query API returns — for example feeding it to BigQuery/bigframes, dbt, or a data pipeline:

import bigframes.pandas as bpd

sql = model.query().metrics(model.metrics.revenue).limit(10_000_000).compile()
df = bpd.read_gbq(sql)   # run it yourself, no Lightdash row cap

The query's limit is included in the SQL as LIMIT n; set a large limit (or strip the trailing clause) if you're running it yourself.

SQL Runner

Execute raw SQL queries directly against your data warehouse:

# Execute SQL
result = client.sql("SELECT * FROM orders WHERE status = 'active' LIMIT 100")
df = result.to_df()

# With custom limit
result = client.sql("SELECT * FROM orders", limit=1000)

# Introspection
tables = client.sql_runner.tables()
fields = client.sql_runner.fields("orders")
fields = client.sql_runner.fields("orders", schema="public")

Exception Handling

The SDK provides specific exceptions for different error conditions:

from lightdash import LightdashError, QueryError, QueryTimeout, QueryCancelled

try:
    result = query.execute()
except QueryTimeout as e:
    print(f"Query timed out: {e.query_uuid}")
except QueryCancelled as e:
    print(f"Query was cancelled: {e.query_uuid}")
except QueryError as e:
    print(f"Query failed: {e.message}")
except LightdashError as e:
    print(f"Lightdash error: {e.message} (status: {e.status_code})")

Exception hierarchy:

LightdashError - Base exception for all SDK errors
- QueryError - Query execution failed (HTTP 400)
- QueryTimeout - Query exceeded timeout (HTTP 408)
- QueryCancelled - Query was cancelled (HTTP 499)

Complete Example

from lightdash import Client, QueryError, QueryTimeout

# Initialize client
client = Client(
    base_url="https://app.lightdash.cloud",
    token="your-api-token"
)

# Get a model
model = client.get_model("project-uuid", "orders")

# Build a query with filters
query = (
    model.query()
    .metrics(model.metrics.total_revenue, model.metrics.order_count)
    .dimensions(model.dimensions.country, model.dimensions.order_date)
    .filter(
        (model.dimensions.status == "completed") &
        (model.dimensions.order_date >= "2024-01-01")
    )
    .sort(model.metrics.total_revenue.desc())
    .limit(50)
)

try:
    result = query.execute()
    df = result.to_df()
    print(f"Fetched {len(result)} rows")
    print(df.head())
except QueryTimeout:
    print("Query took too long - try adding more filters")
except QueryError as e:
    print(f"Query failed: {e.message}")

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Lightdash Python SDK Guide

Quick Start

Query Builder

Single-Call Pattern

Chainable Builder Pattern

Filters

Creating Filters with Operators

Filter Helper Methods

Supported Operators by Data Type

Combining Filters

Multiple Filters on the Same Field

Filtering on Table Calculations

Dimensions and Metrics

Sorting

Results

Converting Results

Iterating Over Results

Large Result Sets

Pagination

Compiling to SQL

SQL Runner

Exception Handling

Complete Example

Uh oh!

FilesExpand file tree

SDK_GUIDE.md

Latest commit

History

SDK_GUIDE.md

File metadata and controls

Lightdash Python SDK Guide

Quick Start

Query Builder

Single-Call Pattern

Chainable Builder Pattern

Filters

Creating Filters with Operators

Filter Helper Methods

Supported Operators by Data Type

Combining Filters

Multiple Filters on the Same Field

Filtering on Table Calculations

Dimensions and Metrics

Sorting

Results

Converting Results

Iterating Over Results

Large Result Sets

Pagination

Compiling to SQL

SQL Runner

Exception Handling

Complete Example