Skip to content

[FLINK-39950][sqlserver] Fix chunk key column search restricted to primary key columns#4443

Open
jubins wants to merge 1 commit into
apache:masterfrom
jubins:j-FLINK-39950-sqlserver-chunk-key-column-fix
Open

[FLINK-39950][sqlserver] Fix chunk key column search restricted to primary key columns#4443
jubins wants to merge 1 commit into
apache:masterfrom
jubins:j-FLINK-39950-sqlserver-chunk-key-column-fix

Conversation

@jubins

@jubins jubins commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

What is the purpose of the change

Fixes FLINK-39950SqlServerUtils.getSplitColumn() searches only table.primaryKeyColumns() when scan.incremental.snapshot.chunk.key-column is set, instead of searching all table columns. This makes it impossible to use a non-primary-key column as the chunk key on SQL Server, even though the feature is documented and works correctly in MySQL, Postgres, and Oracle connectors.

As a result, any user who sets scan.incremental.snapshot.chunk.key-column to a non-PK column gets a misleading ValidationException: Chunk key column 'X' doesn't exist in the primary key [...]. Additionally, tables without a primary key immediately throw before even checking whether a chunkKeyColumn was provided, making the no-PK + custom chunk key use case entirely broken.

The fix replaces the primaryKeys.stream() search with table.columns().stream() in SqlServerUtils.getSplitColumn(), and gates the "no primary key" error on chunkKeyColumn == null, matching the behavior of the Postgres and Oracle connectors.

Brief change log

  • Fixed getSplitColumn() to search all table columns instead of only primary key columns when chunkKeyColumn is set
  • Updated the "no primary key" guard to allow tables without a PK when a chunkKeyColumn is explicitly provided
  • Updated error message from "doesn't exist in the primary key" to "doesn't exist in the columns" to accurately reflect what was searched
  • Added SqlServerUtilsTest with six unit tests covering all cases: default PK, explicit PK column, non-PK column, no-PK table with chunk key, no-PK table without chunk key (should throw), and non-existent column (should throw)

Verifying this change

This change is covered by new unit tests in SqlServerUtilsTest:

  • testGetSplitColumnDefaultsToPrimaryKey() — verifies the default behavior (no chunkKeyColumn) still returns the first PK column
  • testGetSplitColumnWithPrimaryKeyColumnExplicitlySet() — verifies a PK column can still be explicitly set as the chunk key
  • testGetSplitColumnWithNonPrimaryKeyColumn() — the core regression test; verifies that a non-PK column (created_at) is accepted correctly
  • testGetSplitColumnWithNonPrimaryKeyColumnByName() — verifies another non-PK column (name) works correctly
  • testGetSplitColumnWithChunkKeyOnTableWithoutPrimaryKey() — verifies a table with no PK can use a chunk key column without throwing
  • testGetSplitColumnThrowsWhenNoPrimaryKeyAndNoChunkKeyColumn() — verifies the error is still thrown when neither a PK nor a chunk key column is available
  • testGetSplitColumnThrowsWhenChunkKeyColumnDoesNotExist() — verifies a non-existent column name throws with an accurate error message

Does this pull request potentially affect one of the following parts

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery (JobManager, Checkpointing, Kubernetes/Yarn, ZooKeeper): no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no — it fixes a bug in existing functionality
  • If yes, how is the feature documented? not applicable

Was generative AI tooling used to co-author this PR?

  • Yes — Claude Code was used as a pair-programming assistant. All code was written, understood, and verified by the author.
    Generated-by: Claude Opus 4.8

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant