fix(waterdata): numpy array / Series numeric params leaked their repr into the query #308

Merged
thodson-usgs merged 1 commit into DOI-USGS:main from thodson-usgs:fix/waterdata-numeric-array-params
Jun 1, 2026

@thodson-usgs
Collaborator

Problem

A numeric param in _NO_NORMALIZE_PARAMS (water_year, year, month, day, thresholds, …) passed as a numpy array or pandas Series fell into the args[k] = v passthrough in _get_args without being materialized to a list. Downstream, both the GET comma-join (",".join(...) if isinstance(v, (list, tuple))) and the chunker (_extract_axes) check for list/tuple, so an ndarray/Series was neither comma-joined nor chunked, and its string form leaked into the URL:

get_peaks(water_year=np.array([2020, 2021]))   # -> water_year=%5B2020+2021%5D   (str of the array!)
get_peaks(water_year=pd.Series([2020, 2021]))  # -> water_year=0    2020\n1 ...  (Series repr)
get_peaks(water_year=[2020, 2021])             # -> water_year=2020,2021         (already fine)
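
The failure mode can be reproduced without the library. The sketch below is a simplified stand-in, not the dataretrieval code: build_query is a hypothetical helper that applies the same list/tuple check described above, and the standard-library array.array stands in for an ndarray/Series (like them, it is iterable but is neither a list nor a tuple).

```python
from array import array
from urllib.parse import urlencode


def build_query(args):
    """Hypothetical stand-in for the GET comma-join: only list/tuple values are joined."""
    params = {
        k: ",".join(str(x) for x in v) if isinstance(v, (list, tuple)) else v
        for k, v in args.items()
    }
    # urlencode() calls str() on any value that is not already str/bytes
    return urlencode(params)


# A plain list passes the isinstance check and is comma-joined.
print(build_query({"water_year": [2020, 2021]}))  # water_year=2020%2C2021

# An array-like container fails the check, so its str() ends up in the URL.
print(build_query({"water_year": array("i", [2020, 2021])}))
```

The second call prints the percent-encoded string form of the container rather than a comma-separated list, which is the same kind of leak shown above for numpy and pandas inputs.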

Fix

Split the branch so _NO_NORMALIZE_PARAMS values keep their element types (no string-normalization — they're numbers) but a non-string iterable is still materialized to a list, so it comma-joins and chunks like a plain list.
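
A minimal sketch of the materialization step, under stated assumptions: materialize_numeric and to_query are hypothetical names for illustration, not the functions changed in this PR, and array.array again stands in for numpy/pandas containers because it also exposes .tolist(). The idea is to leave strings and scalars alone, prefer .tolist() where it exists, and fall back to list() for other iterables.

```python
from array import array
from urllib.parse import urlencode


def materialize_numeric(value):
    """Return strings and scalars unchanged; turn other iterables into a list.

    Element types are kept as-is (no string normalization).
    """
    if isinstance(value, (str, bytes)):
        return value
    if hasattr(value, "tolist"):
        # numpy arrays and pandas Series provide .tolist(), which returns
        # native Python scalars; array.array provides it too.
        return value.tolist()
    try:
        return list(value)  # generators, ranges, tuples, sets, ...
    except TypeError:
        return value  # not iterable: a plain scalar such as an int


def to_query(args):
    """Hypothetical stand-in for the downstream comma-join on list/tuple values."""
    params = {
        k: ",".join(str(x) for x in v) if isinstance(v, (list, tuple)) else v
        for k, v in args.items()
    }
    return urlencode(params)


for raw in (array("i", [2020, 2021]), (y for y in (2020, 2021)), [2020, 2021], 2020):
    print(to_query({"water_year": materialize_numeric(raw)}))
```

Once every non-string iterable is a list, the existing list/tuple checks downstream need no changes, which is why the fix can stay confined to one branch.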

Verification

numpy array  -> water_year=2020%2C2021
pandas Series-> water_year=2020%2C2021
plain list   -> water_year=2020%2C2021   (unchanged)
scalar int   -> water_year=2020          (unchanged)

Added a regression test (test_get_args_materializes_numpy_and_series_numeric_params); existing numeric/construct tests pass; ruff clean.

🤖 Generated with Claude Code

@thodson-usgs thodson-usgs force-pushed the fix/waterdata-numeric-array-params branch from 5237df4 to c715658 Compare June 1, 2026 03:34
…r()-ing them

A numeric (_NO_NORMALIZE_PARAMS) param — water_year, year, month, day,
thresholds, … — passed as a numpy array or pandas Series fell into the
`args[k] = v` passthrough in _get_args without being materialized to a list.
Downstream, the GET comma-join and the chunker both test `list`/`tuple`, so an
ndarray/Series was neither comma-joined nor chunked: e.g.
get_peaks(water_year=np.array([2020, 2021])) produced
`water_year=%5B2020+2021%5D` (the array's `str()`) instead of
`water_year=2020,2021`, which the API rejects with HTTP 400. Plain lists
already worked.

Split the branch so _NO_NORMALIZE_PARAMS values keep their element types (no
string-normalization) but a non-string iterable is still materialized to a
list of native Python scalars — `.tolist()` for numpy/pandas, `list()` for
generators and other iterables — so the values comma-join in the URL, chunk,
and stay JSON-serializable (no numpy reprs in args).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@thodson-usgs thodson-usgs force-pushed the fix/waterdata-numeric-array-params branch from c715658 to fc736e4 Compare June 1, 2026 11:55
@thodson-usgs thodson-usgs marked this pull request as ready for review June 1, 2026 12:18
@thodson-usgs thodson-usgs merged commit c107fc9 into DOI-USGS:main Jun 1, 2026
14 of 15 checks passed
@thodson-usgs thodson-usgs deleted the fix/waterdata-numeric-array-params branch June 1, 2026 12:19