Merged
33 commits
e57ce3e
Update background-auto-estimate plan after review cycle
Jun 4, 2026
6df933e
Promote background-auto-estimate ADR to accepted
Jun 4, 2026
1f7e689
Add pybaselines dependency
Jun 4, 2026
2ee2420
Add BackgroundEstimatorMethodEnum
Jun 4, 2026
09313fd
Add background curve estimator helper
Jun 4, 2026
e6ad77c
Add clear method to CollectionBase
Jun 4, 2026
8195dd3
Add auto_estimate to LineSegmentBackground
Jun 4, 2026
c0f0936
Reach Phase 1 review gate
Jun 4, 2026
3011b77
Guard auto_estimate against empty active data
Jun 4, 2026
0620114
Make n_points cap deterministic at zero noise
Jun 4, 2026
8b5493f
Validate auto_estimate numeric overrides
Jun 4, 2026
0cb131e
Update ADR to match Phase 1 helper contract
Jun 4, 2026
527ff41
Apply pixi run fix auto-fixes
Jun 4, 2026
4ef44bd
Use list.extend in RDP stack updates
Jun 4, 2026
5235ab6
Add unit tests for background estimator helper
Jun 4, 2026
be4e081
Add auto_estimate lifecycle tests
Jun 4, 2026
d72a0fd
Add CollectionBase.clear() invariant tests
Jun 4, 2026
5c2294f
Add functional tutorial-corpus test for auto_estimate
Jun 4, 2026
812757e
Add TOF tutorial-corpus case for auto_estimate
Jun 4, 2026
77782a5
Apply ruff format to auto_estimate tests
Jun 4, 2026
6fffe47
Document automatic background estimation in the user guide
Jun 4, 2026
9552163
Apply pixi run fix auto-fixes
Jun 4, 2026
5862327
Compare auto_estimate against tutorial background curve
Jun 4, 2026
64fa746
Assert adapter dispatch and clipping in auto_estimate tests
Jun 4, 2026
d5bc416
Simplify background estimation with auto_estimate method
AndrewSazonov Jun 4, 2026
656f4fa
Record narrower Phase 2 calibration outcome in docs
AndrewSazonov Jun 4, 2026
79ddb5d
Simplify background setting by using auto_estimate method
AndrewSazonov Jun 4, 2026
bfa5820
Document the empty-data exception to the overwrite contract
AndrewSazonov Jun 4, 2026
9bbd598
Simplify background estimation with auto_estimate method
AndrewSazonov Jun 4, 2026
d308082
Estimate background after excluding regions in ed-2 tutorial
AndrewSazonov Jun 4, 2026
2def749
Regenerate ed-6 notebook for auto_estimate conversion
AndrewSazonov Jun 4, 2026
d3c9cde
Regenerate ed-20 notebook for auto_estimate conversion
AndrewSazonov Jun 4, 2026
1caea81
Enhance table rendering: prevent cell wrapping for wide tables
AndrewSazonov Jun 4, 2026
@@ -1,6 +1,6 @@
# ADR: Automatic Line-Segment Background Estimation

**Status:** Proposed **Date:** 2026-06-01
**Status:** Accepted **Date:** 2026-06-01

## Group

@@ -299,19 +299,24 @@ The intended usage is a loop, and the API supports it directly:
background and clip heights to the original measured intensities
(§2).

**Every call overwrites and re-fixes.** `auto_estimate()` always clears
the collection and rebuilds it — there is no append mode — and the
rebuilt points are **fixed** (`free=False`) regardless of whether the
previous points had been freed during refinement. A second call is
therefore a fresh fixed seed, not a merge: calling it again overwrites
the points and re-fixes them even if they were free. This keeps the loop
predictable (each pass starts from a clean, fixed background) and
idempotent (same inputs → same points). Clearing everything — including
any hand-added points — is the deliberate "overwrite" contract;
preserving manual points is deferred. When the collection is non-empty,
the call logs a one-line notice that it is replacing the existing
points, so a user who hand-tuned a background is not surprised; the
first call, with nothing to replace, is silent.
**Every call overwrites and re-fixes.** Whenever it produces an
estimate, `auto_estimate()` clears the collection and rebuilds it —
there is no append mode — and the rebuilt points are **fixed**
(`free=False`) regardless of whether the previous points had been freed
during refinement. A second call is therefore a fresh fixed seed, not a
merge: calling it again overwrites the points and re-fixes them even if
they were free. This keeps the loop predictable (each pass starts from a
clean, fixed background) and idempotent (same inputs → same points).
Clearing everything — including any hand-added points — is the
deliberate "overwrite" contract; preserving manual points is deferred.
When the collection is non-empty, the call logs a one-line notice that
it is replacing the existing points, so a user who hand-tuned a
background is not surprised; the first call, with nothing to replace, is
silent. The one exception is degenerate input: when no active data
remain (every point excluded, or data not yet loaded), the call emits a
single warning and returns **without touching the existing points**, so
an accidental call on an unloaded experiment does not wipe a hand-tuned
background.
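The overwrite contract and its empty-data exception can be illustrated with a toy stand-in. This is a minimal sketch, not the project's implementation: `ToyLineSegmentBackground`, `Point`, the `excluded` argument, and the endpoints-only "estimate" are all hypothetical names and simplifications chosen for the example.

```python
import logging
import warnings
from dataclasses import dataclass, field

log = logging.getLogger(__name__)


@dataclass
class Point:
    x: float
    intensity: float
    free: bool = False  # generated points are always fixed


@dataclass
class ToyLineSegmentBackground:
    """Hypothetical stand-in for the real collection (assumption)."""

    points: list = field(default_factory=list)

    def auto_estimate(self, x, y, excluded=None):
        excluded = excluded or [False] * len(x)
        active = [(xi, yi) for xi, yi, ex in zip(x, y, excluded) if not ex]
        if not active:
            # Degenerate input: warn once and leave existing points untouched.
            warnings.warn('No active data; background left unchanged.', stacklevel=2)
            return
        if self.points:
            # Non-empty collection: one-line replace notice.
            log.info('Replacing %d existing background points.', len(self.points))
        self.points.clear()
        # Placeholder estimate (assumption): keep only the active endpoints.
        for xi, yi in dict.fromkeys((active[0], active[-1])):
            self.points.append(Point(xi, yi))
```

Calling `auto_estimate()` twice with the same inputs yields the same fixed points, even if a point was freed in between, while a call with every sample excluded leaves the collection as it was.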

**Always fixed; no `free` argument.** Generated points are always
created fixed (`intensity.free = False`) — there is no caller-selectable
@@ -329,18 +334,24 @@ active points only.
### 6. Where the code lives

A backend-agnostic estimator helper —
`estimate_background_curve(x, y, *, beam_mode, peaks=None, width=None, ...) -> (curve, anchors)`
`estimate_background_curve(x, y, *, method='arpls', peaks=None, width=None, ...) -> BackgroundEstimate`
— lives in a new small module in the background package (e.g.
`datablocks/experiment/categories/background/estimate.py`). It is pure
array-in/array-out (the optional `peaks` argument carries model peak
positions detected from the peak-only model array per §5 — not
array-in/array-out (the optional `peaks` argument is a boolean mask
aligned with `x` that forbids non-endpoint anchors on peak samples,
built by the adapter from the peak-only model array per §5 — not
reflection metadata), holds no model state, wraps `pybaselines` for
Stage 1, and keeps the §3 parameterization and Stage-2 thinning in-house
— so it stays unit-testable in isolation and pulls no domain logic into
`core/`. `LineSegmentBackground.auto_estimate()` is a thin adapter: read
the pattern (and model, if present), call the helper, clip, and
`create()` the points. Helpers are extracted as needed to stay under the
lint complexity thresholds
`core/`. It returns a small `BackgroundEstimate` result object (curve,
anchors, and the method/width/noise/tolerance/backend-params metadata
the adapter logs). The `beam_mode` argument from earlier drafts is
deferred with the per-beam-mode policy (see _Deferred Work_); omitting
it also keeps the helper within the project's argument-count guardrail.
`LineSegmentBackground.auto_estimate()` is a thin adapter: read the
pattern (and model, if present), call the helper, clip, and `create()`
the points. Helpers are extracted as needed to stay under the lint
complexity thresholds
([`lint-complexity-thresholds.md`](../accepted/lint-complexity-thresholds.md))
rather than raising them.
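As a rough illustration of the result object and the `peaks` mask semantics described above, the sketch below uses assumed field names (`curve`, `anchors`, `method`, `width`, `noise`, `tolerance`, `backend_params`) and a hypothetical helper, `allowed_anchor_indices`; the actual attribute names and internals in `estimate.py` may differ.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BackgroundEstimate:
    """Sketch of the result object; field names are assumptions."""

    curve: list            # estimated background, one value per x sample
    anchors: list          # indices into x chosen as control points
    method: str            # Stage-1 method tag, e.g. 'arpls'
    width: float
    noise: float
    tolerance: float
    backend_params: dict = field(default_factory=dict)


def allowed_anchor_indices(n, peaks=None):
    """Indices eligible as anchors (hypothetical helper).

    Endpoints are always allowed; interior samples are allowed only
    where the boolean ``peaks`` mask is False.
    """
    if peaks is None:
        return list(range(n))
    if len(peaks) != n:
        raise ValueError('peaks mask must be aligned with x')
    last = n - 1
    return [i for i in range(n) if i in (0, last) or not peaks[i]]
```

For a five-sample pattern with the mask `[True, False, True, False, True]`, the helper keeps both endpoints and drops only the interior peak sample at index 2.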

@@ -353,15 +364,17 @@ Work_ — to avoid an abstraction before its second concrete use.
The four design questions raised in review are resolved: noise-relative
Stage-2 thinning (§3), always-overwrite with a replace notice (§5), a
single Stage-1 method for now (§3), and a void method that logs a
one-line summary (§1). What remains is empirical calibration, done
against the tutorial corpus during implementation:

- The exact Stage-2 tolerance multiplier (`c · σ`, proposed `c ≈ 2`) and
the width percentile (proposed ~75th) need tuning against real
datasets.
- Whether the single Stage-1 method holds across the whole corpus
(CWL/TOF, neutron/X-ray) or a `beam_mode`/`radiation_probe` policy is
eventually needed (see §Deferred Work).
one-line summary (§1). Empirical calibration was carried out in Phase 2:

- The Stage-2 tolerance multiplier (`c · σ`, `c = 2`) and the width
percentile (~75th) are first-cut constants; they were validated — not
exhaustively swept — against the representative CWL (`ed-2`) and TOF
(`ed-13`) datasets plus the analytic unit cases, and produce sensible
backgrounds there. Re-tuning stays possible if a future dataset needs
it.
- The single Stage-1 method (`arpls`) holds for both validated beam
modes; no `beam_mode`/`radiation_probe` policy was required (it stays
in §Deferred Work should a future corpus show otherwise).
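The noise-relative tolerance can be pictured with a small thinning sketch. It assumes a Ramer-Douglas-Peucker-style split on vertical deviation with tolerance `c * sigma` and strictly increasing `x`; the function name `thin_curve` and these details are illustrative assumptions, not the helper's actual code.

```python
def thin_curve(x, curve, sigma, c=2.0):
    """Return sorted anchor indices for a piecewise-linear approximation.

    A segment is split at its worst sample whenever the vertical
    deviation from the straight line exceeds ``c * sigma``.
    Assumes ``x`` is strictly increasing.
    """
    n = len(x)
    if n < 3:
        return list(range(n))
    tol = c * sigma
    keep = {0, n - 1}
    stack = [(0, n - 1)]
    while stack:
        lo, hi = stack.pop()
        if hi - lo < 2:
            continue  # no interior samples left to test
        slope = (curve[hi] - curve[lo]) / (x[hi] - x[lo])
        worst, worst_dev = lo, 0.0
        for i in range(lo + 1, hi):
            dev = abs(curve[i] - (curve[lo] + slope * (x[i] - x[lo])))
            if dev > worst_dev:
                worst, worst_dev = i, dev
        if worst_dev > tol:
            keep.add(worst)
            stack.extend([(lo, worst), (worst, hi)])
    return sorted(keep)
```

With the strict `>` comparison, an exactly linear curve collapses to its two endpoints even at `sigma = 0`, so the anchor count stays deterministic for noise-free input.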

## Consequences

@@ -465,22 +478,22 @@ helper:
the single fallback warning rather than an exception or a garbage
background.

**Tutorial corpus as real-world reference.** The ~25 tutorial scripts in
`docs/docs/tutorials/*.py` already build real experiments with
well-defined backgrounds across both beam modes and both probes — CWL
(e.g. the sloping background in
[`ed-17.py`](../../../../docs/docs/tutorials/ed-17.py) and
[`ed-2.py`](../../../../docs/docs/tutorials/ed-2.py)) and TOF (e.g.
[`ed-13.py`](../../../../docs/docs/tutorials/ed-13.py),
[`ed-16.py`](../../../../docs/docs/tutorials/ed-16.py)). Their
hand-placed line-segment points are ground truth: stripping them and
**Tutorial corpus as real-world reference.** The tutorial scripts in
`docs/docs/tutorials/*.py` build real experiments with well-defined
backgrounds across both beam modes and both probes. Their hand-placed
line-segment points are a real-world reference: stripping them and
re-running `auto_estimate()` should reproduce a comparable background
curve within tolerance. This gives broad, real coverage across space
groups, beam modes, and probes at almost no authoring cost, and is the
reference set used to calibrate the default constants and confirm the
single Stage-1 method. These corpus checks run at the functional /
script level where the tutorial experiments are already loaded, not at
unit level.
curve. **Phase 2 outcome:** the functional regression validates two
representative datasets — CWL
[`ed-2.py`](../../../../docs/docs/tutorials/ed-2.py) and TOF
[`ed-13.py`](../../../../docs/docs/tutorials/ed-13.py) — comparing the
estimated curve against the hand-placed reference to within a fraction
of the measured signal scale; the single `arpls` default and the
first-cut constants hold for both. Sloping and curved backgrounds are
covered against exact analytic ground truth by the unit tests, not the
corpus. A broader per-tutorial sweep (e.g. `ed-17`, `ed-16`) was not
needed and stays available if a future dataset misbehaves. These checks
run at the functional / unit level.
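One way to read "within a fraction of the measured signal scale" is sketched below. The function name `curves_agree`, the use of the measured intensity range as the scale, and the default `fraction=0.1` are assumptions for illustration; the functional tests may define the comparison differently.

```python
def curves_agree(estimated, reference, measured, fraction=0.1):
    """Check that two background curves differ by at most a fraction of
    the measured signal range (hypothetical acceptance criterion)."""
    if not (len(estimated) == len(reference) == len(measured)):
        raise ValueError('all arrays must share the same x grid')
    if not measured:
        raise ValueError('no data to compare')
    scale = max(measured) - min(measured)
    worst = max(abs(e - r) for e, r in zip(estimated, reference))
    return worst <= fraction * scale
```

Scaling the tolerance by the measured range rather than by the background level keeps the check meaningful for both weak and strong patterns.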

The estimator module mirrors into
`tests/unit/easydiffraction/datablocks/experiment/categories/background/`
2 changes: 1 addition & 1 deletion docs/dev/adrs/index.md
@@ -35,7 +35,7 @@ folders.
| Documentation | Accepted | Plotting & Docs Performance for Interactive Figures | Self-hosts a lazy, shared figure runtime so docs pages load fast and progressively while staying interactive. | [`plotting-docs-performance.md`](accepted/plotting-docs-performance.md) |
| Documentation | Suggestion | Documentation CI and Build Verification | Proposes strict MkDocs builds, API-derived docs, snippet smoke tests, link checks, and prose/spelling checks. | [`documentation-ci-build.md`](suggestions/documentation-ci-build.md) |
| Experiment model | Accepted | Immutable Experiment Type | Makes experiment type axes creation-time state rather than mutable runtime state. | [`immutable-experiment-type.md`](accepted/immutable-experiment-type.md) |
| Experiment model | Suggestion | Automatic Line-Segment Background Estimation | Detects line-segment background control points from the measured pattern, peak-insensitive and editable. | [`background-auto-estimate.md`](suggestions/background-auto-estimate.md) |
| Experiment model | Accepted | Automatic Line-Segment Background Estimation | Detects line-segment background control points from the measured pattern, peak-insensitive and editable. | [`background-auto-estimate.md`](accepted/background-auto-estimate.md) |
| Factories | Accepted | Factory Contracts and Metadata | Standardizes factory construction, metadata, compatibility, and registration behavior. | [`factory-contracts.md`](accepted/factory-contracts.md) |
| Naming | Accepted | Factory Tag Naming | Defines canonical factory tag style and standard abbreviations. | [`factory-tag-naming.md`](accepted/factory-tag-naming.md) |
| Persistence | Accepted | Free-Flag CIF Encoding | Encodes fit free/fixed state through CIF uncertainty syntax instead of a separate free list. | [`free-flag-cif-encoding.md`](accepted/free-flag-cif-encoding.md) |
8 changes: 4 additions & 4 deletions docs/dev/package-structure/full.md
@@ -276,7 +276,10 @@
│ │ │ │ │ ├── 🏷️ class PolynomialTerm
│ │ │ │ │ └── 🏷️ class ChebyshevPolynomialBackground
│ │ │ │ ├── 📄 enums.py
│ │ │ │ │ └── 🏷️ class BackgroundTypeEnum
│ │ │ │ │ ├── 🏷️ class BackgroundTypeEnum
│ │ │ │ │ └── 🏷️ class BackgroundEstimatorMethodEnum
│ │ │ │ ├── 📄 estimate.py
│ │ │ │ │ └── 🏷️ class BackgroundEstimate
│ │ │ │ ├── 📄 factory.py
│ │ │ │ │ └── 🏷️ class BackgroundFactory
│ │ │ │ └── 📄 line_segment.py
@@ -611,8 +614,6 @@
│ │ │ │ └── 🏷️ class ProjectInfo
│ │ │ └── 📄 factory.py
│ │ │ └── 🏷️ class ProjectInfoFactory
│ │ ├── 📁 publication
│ │ ├── 📁 rendering
│ │ ├── 📁 rendering_plot
│ │ │ ├── 📄 __init__.py
│ │ │ ├── 📄 default.py
@@ -673,7 +674,6 @@
│ │ ├── 📁 html
│ │ │ └── 📁 vendor
│ │ └── 📁 tex
│ │ └── 📁 styles
│ ├── 📄 __init__.py
│ ├── 📄 data_context.py
│ │ └── 🏷️ class ReportDataContext
4 changes: 1 addition & 3 deletions docs/dev/package-structure/short.md
@@ -131,6 +131,7 @@
│ │ │ │ ├── 📄 base.py
│ │ │ │ ├── 📄 chebyshev.py
│ │ │ │ ├── 📄 enums.py
│ │ │ │ ├── 📄 estimate.py
│ │ │ │ ├── 📄 factory.py
│ │ │ │ └── 📄 line_segment.py
│ │ │ ├── 📁 calculator
@@ -289,8 +290,6 @@
│ │ │ ├── 📄 __init__.py
│ │ │ ├── 📄 default.py
│ │ │ └── 📄 factory.py
│ │ ├── 📁 publication
│ │ ├── 📁 rendering
│ │ ├── 📁 rendering_plot
│ │ │ ├── 📄 __init__.py
│ │ │ ├── 📄 default.py
@@ -330,7 +329,6 @@
│ │ ├── 📁 html
│ │ │ └── 📁 vendor
│ │ └── 📁 tex
│ │ └── 📁 styles
│ ├── 📄 __init__.py
│ ├── 📄 data_context.py
│ ├── 📄 enums.py