Merged
24 commits
5d62d60
Add space-group-database ADR suggestion
AndrewSazonov Jun 1, 2026
406fda5
Add space-group-database implementation plan
AndrewSazonov Jun 1, 2026
0eaa64f
Add cctbx-based space-group table extraction
AndrewSazonov Jun 1, 2026
e674fb9
Add multi-source cross-check and disagreement report
AndrewSazonov Jun 1, 2026
665e4dd
Generate initial space-group disagreement report
AndrewSazonov Jun 1, 2026
a5f7bc6
Localize space-group curation artifacts
AndrewSazonov Jun 2, 2026
1262626
Move crysview demo to subdirectory
AndrewSazonov Jun 2, 2026
21319b1
Add curated space-group overrides
AndrewSazonov Jun 2, 2026
8095e79
Generate complete space_groups.json.gz with recorded provenance
AndrewSazonov Jun 2, 2026
64c5cf1
Load space groups from JSON and drop restricted unpickler
AndrewSazonov Jun 2, 2026
3fbb123
Reach Phase 1 review gate
AndrewSazonov Jun 2, 2026
5c0911c
Cover public space-group coordinate aliases
AndrewSazonov Jun 2, 2026
a7f8bd1
Add space-group database ADR index row
AndrewSazonov Jun 2, 2026
d259410
Add Phase 2 space-group database tests and packaging check
AndrewSazonov Jun 2, 2026
d0b8d5a
Apply pixi run fix auto-fixes
AndrewSazonov Jun 2, 2026
535dbd6
Alias runtime coordinate codes in space-group database
AndrewSazonov Jun 2, 2026
fb132d3
Apply prettier formatting to space-group-database ADR
AndrewSazonov Jun 2, 2026
d077d44
Make packaging check inspect the wheel directly
AndrewSazonov Jun 2, 2026
39eab36
Unify Plotly tooltip styling and fix white Bragg hover
AndrewSazonov Jun 2, 2026
50cef0f
Guard optional IPython import in table backend base
AndrewSazonov Jun 2, 2026
0da38c1
Pin space-group record count in tests and wheel check
AndrewSazonov Jun 2, 2026
717e94d
Update Plotly hover test and fix overlong comment
AndrewSazonov Jun 2, 2026
8b01a47
Update tutorial links to latest docs
AndrewSazonov Jun 2, 2026
0f56b73
Add Wyckoff letter detection ADR
AndrewSazonov Jun 2, 2026
4 changes: 2 additions & 2 deletions docs/dev/adrs/accepted/crysview-structure-visualization.md
@@ -22,8 +22,8 @@ parameters a refinement is adjusting.

A working prototype establishes the target experience and the data it
needs. It lives at
-[`crysview-threejs-demo.html`](crysview-threejs-demo.html) and
-demonstrates, against a non-orthogonal unit cell:
+[`crysview-threejs-demo.html`](crysview-structure-visualization/crysview-threejs-demo.html)
+and demonstrates, against a non-orthogonal unit cell:

- atoms as spheres with element radius and colour;
- anisotropic ADP ellipsoids (semi-axis lengths plus orientation);
83 changes: 43 additions & 40 deletions docs/dev/adrs/index.md

Large diffs are not rendered by default.

532 changes: 532 additions & 0 deletions docs/dev/adrs/suggestions/background-auto-estimate.md

Large diffs are not rendered by default.

434 changes: 434 additions & 0 deletions docs/dev/adrs/suggestions/plotting-docs-performance.md

Large diffs are not rendered by default.

515 changes: 515 additions & 0 deletions docs/dev/adrs/suggestions/space-group-database.md

Large diffs are not rendered by default.

@@ -0,0 +1,11 @@
# Space-group database curated overrides.
#
# Initial minimal seed decision, 2026-06-02:
# Use the ADR source priority without manual row-level overrides for now:
# cryspy wyckoff.dat for Wyckoff-facing values, cctbx/sgtbx for
# setting-level metadata, and RASPA/Avogadro/SgInfo/gemmi as cross-checks.
#
# Flagged rows remain visible in tmp/space-groups/extracted-comparison/
# for future verification against International Tables Vol A, Bilbao
# Crystallographic Server, and ISODISTORT.
[]
502 changes: 502 additions & 0 deletions docs/dev/adrs/suggestions/wyckoff-letter-detection.md

Large diffs are not rendered by default.

1 change: 0 additions & 1 deletion docs/dev/package-structure/full.md
@@ -263,7 +263,6 @@
│ ├── 📄 __init__.py
│ ├── 📄 crystallography.py
│ └── 📄 space_groups.py
-│ └── 🏷️ class _RestrictedUnpickler
├── 📁 datablocks
│ ├── 📁 experiment
│ │ ├── 📁 categories
276 changes: 276 additions & 0 deletions docs/dev/plans/background-auto-estimate.md

Large diffs are not rendered by default.

275 changes: 275 additions & 0 deletions docs/dev/plans/space-group-database.md
@@ -0,0 +1,275 @@
# Plan: Complete Space-Group Reference Database

This plan follows [`AGENTS.md`](../../../AGENTS.md) and implements the
[`space-group-database`](../adrs/suggestions/space-group-database.md)
ADR.

**Deliberate exception to note for `/draft-impl-1`:** Phase 1 contains a
**maintainer-only curation gate** (P1.4). The disagreement report is
generated by the agent, but selecting authoritative values is a human
decision (the maintainer inspects sources + International Tables and
fills the overrides file). `/draft-impl-1` must **stop at P1.4 and
wait** for the maintainer; it resumes at P1.5 once the ADR companion
overrides file is populated (or confirmed empty because all machine
sources agreed).

**Deliberate exception to the normal checked-in tooling pattern:** the
space-group source bundle, extraction helpers, generator, and comparison
tables live under ignored `tmp/space-groups/`. They are one-time
curation artifacts, not branch deliverables. The branch commits the
final generated database, the ADR companion overrides file, and ADR
provenance; the local helper path and SHA-256 are recorded so a future
careful rebuild remains possible from the preserved local workspace.
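
The SHA-256 values mentioned above can be produced with the standard library alone. The following is a minimal sketch, not the project's actual provenance tooling: the helper name `sha256_of_file` and the temporary sample file are hypothetical and exist only to make the example runnable.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # Read fixed-size chunks so large inputs need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Hypothetical stand-in for a generator helper or source file.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = Path(tmp_dir) / "sample.bin"
    sample.write_bytes(b"abc")
    sample_digest = sha256_of_file(sample)

print(sample_digest)
```

In a real rebuild the same function would be pointed at the local helper, the gathered inputs, and the generated database, and the resulting digests copied into the ADR.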

## ADR

This plan owns the ADR
[`docs/dev/adrs/suggestions/space-group-database.md`](../adrs/suggestions/space-group-database.md)
(drafted via `/draft-adr`, review cycle closed). It is a
**prerequisite** for
[`wyckoff-letter-detection`](../adrs/suggestions/wyckoff-letter-detection.md):
this plan delivers the complete data; that feature delivers the
`''`→`None` consumer handling so the triclinic groups use it.

## Branch and PR

- Branch: **`space-group-database`** (flat slug off `develop`, no
`feature/` prefix). Do not push unless asked.
- PR targets **`develop`**.

## Decisions (settled in the ADR)

- **Format:** `space_groups.json.gz` (gzip-compressed JSON). Drop the
pickle and the `_RestrictedUnpickler`; the loader `json.load`s the
decompressed stream and rebuilds the in-memory dict.
- **In-memory shape unchanged:** `SPACE_GROUPS` stays a dict keyed by
`(IT_number, IT_coordinate_system_code)`; on disk it is a list of
setting records carrying those two canonical fields. Existing
consumers (`crystallography.py`, `calculators/cryspy.py`) are
untouched.
- **Source priority:** cryspy `wyckoff.dat` first for Wyckoff-facing
values (letters, multiplicities, site symmetries, representative
coordinate orbits); cctbx/sgtbx for setting-level metadata and
operation checks. cctbx is **generation-only**, installed into a
**throwaway environment** for the one-time build and never added to
the project's runtime dependencies (`cctbx` is named here for
`/draft-impl-1` pre-approval, but it is _not_ a pyproject runtime
dep).
- **Scope:** all 230 groups × all standard settings and public
coordinate-code aliases × full Wyckoff orbits, plus the
**symmetry-core** metadata `hall_symbol`, general-position `symop`
list, `generators`, `point_group`, `laue_class`, `centring` (further
fields deferred per the ADR).
- **Cross-check sources:** cryspy `data/cryspy/wyckoff.dat`, gemmi
(settings + ops + orbit-closure of a given representative), parsed
Avogadro/SgInfo setting data, and the full RASPA appendix CSV under
`data/raspa/`; **International Tables** is the maintainer's authority
for flagged cases (PDF — not machine-readable, so not automated).
- **Curation:** disagreements (≥2 machine sources differ) → local
grouped Markdown/CSV exports under
`tmp/space-groups/extracted-comparison/`; maintainer selections are
the checked-in ADR companion record (list of records, each with
rationale).
- **Build is one-time:** the exact software versions are recorded in the
ADR's _Build Provenance_ section; no recurring pipeline.
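
The format and in-memory-shape decisions can be illustrated with a short sketch. It assumes each on-disk record is a JSON object carrying `IT_number` and `IT_coordinate_system_code`; the function name `load_space_groups`, the extra `hall_symbol` values, and the in-memory sample are illustrative only and are not taken from the actual loader in `space_groups.py`.

```python
import gzip
import io
import json


def load_space_groups(stream):
    """Rebuild the tuple-keyed dict from a gzip-compressed JSON list."""
    with gzip.open(stream, mode="rt", encoding="utf-8") as handle:
        records = json.load(handle)
    # On disk: list of setting records. In memory: dict keyed by
    # (IT_number, IT_coordinate_system_code).
    return {
        (record["IT_number"], record["IT_coordinate_system_code"]): record
        for record in records
    }


# Tiny in-memory stand-in for space_groups.json.gz (illustrative records).
sample_records = [
    {"IT_number": 14, "IT_coordinate_system_code": "b1", "hall_symbol": "-P 2ybc"},
    {"IT_number": 62, "IT_coordinate_system_code": "abc", "hall_symbol": "-P 2ac 2n"},
]
buffer = io.BytesIO()
with gzip.open(buffer, mode="wt", encoding="utf-8") as handle:
    json.dump(sample_records, handle)
buffer.seek(0)

SPACE_GROUPS = load_space_groups(buffer)
print(SPACE_GROUPS[(62, "abc")]["hall_symbol"])
```

Because JSON has no tuple keys, the two canonical fields are stored inside each record and the tuple key is rebuilt at load time, which keeps existing dict-based consumers unchanged.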

## Open questions

- None blocking. The exact cctbx field availability (e.g. whether every
symmetry-core field is exposed directly or must be derived) is
resolved empirically while writing the generator (P1.1); record
anything surprising in the ADR.

## Concrete files likely to change

- `tmp/space-groups/helper-tools/generate_space_groups.py` — local
ignored generator (cctbx extraction + multi-source cross-check +
disagreement report + overrides consumption). Do not keep this helper
in the branch after implementation.
- `docs/dev/adrs/suggestions/space-group-database/space_groups_overrides.yaml`
— **new** curation overrides (maintainer-authored at P1.4). Keep the
selected values here, not inside this plan: the plan documents
workflow, while YAML is the structured generator input with a focused
review diff. If the ADR is accepted, move this companion file into the
accepted ADR companion folder with the ADR.
- `tmp/space-groups/extracted-comparison/` — local grouped Markdown/CSV
exports, including one CSV per compared field.
- `src/easydiffraction/crystallography/space_groups.json.gz` — **new**
generated database; **remove** `space_groups.pkl.gz`.
- `src/easydiffraction/crystallography/space_groups.py` — rewrite loader
(read JSON, reconstruct dict, drop `_RestrictedUnpickler` + `pickle`).
- `docs/dev/adrs/suggestions/space-group-database.md` — fill in _Build
Provenance_ with recorded versions.
- `tools/check_packaged_db.py` — **new** helper that inspects a built
wheel (independent of the package's dependency tree): it reads
`space_groups.json.gz` straight from the wheel and asserts the data
ships, the obsolete `.pkl.gz` is gone, and all 230 groups plus the
cryspy coordinate-code alias surface are present (used by the Phase 2
packaging regression).
- `pyproject.toml` — **only if** the Phase 2 packaging test shows the
`.json.gz` is not shipped (add a hatch `artifacts`/`force-include`
entry).
- Phase 2:
`tests/unit/easydiffraction/crystallography/test_space_groups.py` and
`test_space_groups_coverage.py` (JSON loader + presence +
query-surface + spot-check tests); a wheel import/load packaging test.

## Implementation steps (Phase 1)

Each `- [ ]` step is one atomic commit. Per §Commits, stage only the
files the step names, with explicit paths, and commit locally with the
step's `Commit:` message **before** moving to the next step. Mark `[x]`
in this file as part of the same commit.

- [x] **P1.1 — cctbx extraction to a complete setting table.** Write the
first part of
`tmp/space-groups/helper-tools/generate_space_groups.py`: in a
throwaway env with `cctbx` installed, enumerate all 230 groups ×
standard settings via sgtbx, emit the available symmetry-core
metadata and cctbx Wyckoff/orbit candidates into the in-memory
record list keyed by `(IT_number, IT_coordinate_system_code)`.
Keep coordinates/operators as strings (JSON-native). Do not wire a
routine pixi task. Commit:
`Add cctbx-based space-group table extraction`

- [x] **P1.2 — Multi-source cross-check + disagreement report.** Extend
the generator to merge cryspy `wyckoff.dat` Wyckoff-facing values
with cctbx setting metadata, then compare against gemmi
(`spacegroup_table()` settings/ops + orbit-closure), parsed
Avogadro/SgInfo data, and the RASPA CSV. Where ≥2 machine sources
disagree on a value, write a report record
`{case, per-source values, IT: <blank for maintainer>, Override: <blank for maintainer>}`
to the local comparison folder. Consume the ADR companion
overrides file if present. Command for the P1.3 run:

```bash
pixi exec --spec cctbx --spec gemmi --spec sympy --spec pyyaml \
python tmp/space-groups/helper-tools/generate_space_groups.py \
--write-comparison-folder tmp/space-groups/extracted-comparison \
--print-summary
```

Commit: `Add multi-source cross-check and disagreement report`

- [x] **P1.3 — First generation run + commit the report.** Run the
generator (cctbx temp-installed) with no overrides to produce the
initial local disagreement report. Do not commit the local
comparison artifacts or a database file yet. Commit:
`Generate initial space-group disagreement report`

- [x] **P1.3a — Localize one-time curation artifacts.** Move the
generator and comparison outputs fully into ignored
`tmp/space-groups/`; remove the previously tracked generator and
report artifacts from the branch. Update the ADR and plan to
record that the generator helper is local curation tooling and
that the final provenance records its SHA-256 instead of a commit
hash. Commit: `Localize space-group curation artifacts`

- [x] **P1.4 — MAINTAINER CURATION GATE (manual).** `/draft-impl-1`
**stops here**. The maintainer inspects the report, consults
International Tables for flagged cases, and **always** produces a
checked-in
`docs/dev/adrs/suggestions/space-group-database/space_groups_overrides.yaml`:
a list of curated records (each with rationale) when there were
disagreements, or — if the report was empty because all machine
sources agreed — the same file containing only an explanatory
header comment and an empty list. Either way there is a concrete
file to stage, so the per-step commit rule holds. The agent
resumes at P1.5 only after the maintainer confirms. Commit
(maintainer, or agent on resume):
`Add curated space-group overrides`

- [x] **P1.5 — Final generation + provenance.** Re-run the generator
consuming the overrides to emit
`src/easydiffraction/crystallography/space_groups.json.gz`. Fill
in the ADR's _Build Provenance_ section with the exact versions
(cctbx channel/version/build/install command, Python/platform,
gemmi/cryspy versions + `wyckoff.dat` SHA-256, gathered-input
origins+SHA-256, generator helper SHA-256 + command, output
SHA-256). Commit:
`Generate complete space_groups.json.gz with recorded provenance`

- [x] **P1.6 — Rewrite loader, drop pickle.** Rewrite
`src/easydiffraction/crystallography/space_groups.py` to read
`space_groups.json.gz` and rebuild the
`(IT_number, IT_coordinate_system_code)`-keyed `SPACE_GROUPS`
dict; remove `_RestrictedUnpickler` and the `pickle` import.
`git rm` the old `space_groups.pkl.gz`. Commit:
`Load space groups from JSON and drop restricted unpickler`

- [x] **P1.7 — Phase 1 review gate.** No code. Mark this `[x]`, commit
the checklist update alone, and hand off to `/review-impl-1`.
Commit: `Reach Phase 1 review gate`
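
The disagreement rule from P1.2 (flag a case when two or more machine sources report different values) might be sketched as follows. This is not the local generator's code: the function name `find_disagreements`, the case-key layout, and the sample values are assumptions, and the mismatch in the sample is deliberately invented rather than a real finding.

```python
def find_disagreements(values_by_source):
    """Return report records for cases where sources disagree.

    values_by_source maps a case key to {source_name: value}; a value of
    None means the source has no entry for that case.
    """
    report = []
    for case, per_source in sorted(values_by_source.items()):
        present = {name: value for name, value in per_source.items() if value is not None}
        if len(set(present.values())) > 1:
            report.append(
                {
                    "case": case,
                    "sources": present,
                    "IT": "",        # left blank for the maintainer
                    "Override": "",  # left blank for the maintainer
                }
            )
    return report


# Invented sample input: keys are (IT_number, code, field).
sample = {
    (62, "abc", "centring"): {"cryspy": "P", "cctbx": "P", "gemmi": "P"},
    # Deliberately invented mismatch to exercise the rule.
    (62, "abc", "demo_field"): {"cryspy": "x", "cctbx": "y", "gemmi": None},
}
report = find_disagreements(sample)
print(report)
```

Keeping values as strings, as P1.1 prescribes for coordinates and operators, makes them hashable, so a set comparison is enough to detect a mismatch.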

## Phase 2 — Verification

Add/update tests, then run the checks below. Stop after Phase 1 for
review before starting Phase 2.

Tests to add/update (in `tests/unit/easydiffraction/crystallography/`,
mirroring source):

- update `test_space_groups.py` / `test_space_groups_coverage.py` for
the JSON loader (no pickle);
- **presence**: all 230 groups + standard settings load (regression vs
the current 42-group / 18-setting gap);
- **query surface (parity with today, no new index)**: every
standard-setting `(IT_number, IT_coordinate_system_code)` entry loads
from the DB, every coordinate-system code currently exposed by
cryspy's `get_it_coordinate_system_codes_by_it_number` resolves in
`SPACE_GROUPS`, and for each group the _existing_ cryspy-backed
`get_it_number_by_name_hm_short` resolution still returns an IT number
that is present in the DB. This verifies "at least as queryable as
today" against the loaded dict and the unchanged H-M path; a
database-derived H-M index is **not** added here (it stays Deferred
Work in the ADR);
- **spot-checks vs International Tables** for P4, P3, P6, Pm-3, a
monoclinic with cell choices, and an origin-choice group;
- **packaging**: `tools/check_packaged_db.py` opens the built **wheel**
and reads `space_groups.json.gz` directly from it (no install, so the
check is independent of the package's dependency tree), asserting the
data ships, the obsolete `.pkl.gz` is absent, and all 230 groups plus
the cryspy coordinate-code alias surface are present. This catches
package-data omission for the renamed `.json.gz` regardless of
unrelated runtime-import issues.
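
The wheel-inspection idea behind the packaging check can be sketched with `zipfile`, since a wheel is a zip archive. This is a hedged illustration, not the contents of `tools/check_packaged_db.py`: the function name `check_wheel`, the record layout, and the fake wheel built for the demo are assumptions, and the coordinate-code alias check is omitted.

```python
import gzip
import json
import tempfile
import zipfile
from pathlib import Path

DB_NAME = "crystallography/space_groups.json.gz"
OLD_NAME = "crystallography/space_groups.pkl.gz"


def check_wheel(wheel_path):
    """Return a list of problems found in the wheel (empty means OK)."""
    problems = []
    with zipfile.ZipFile(wheel_path) as wheel:
        names = wheel.namelist()
        if any(name.endswith(OLD_NAME) for name in names):
            problems.append("obsolete .pkl.gz still shipped")
        matches = [name for name in names if name.endswith(DB_NAME)]
        if not matches:
            return problems + ["space_groups.json.gz missing from wheel"]
        # Read the database straight from the archive, without installing.
        records = json.loads(gzip.decompress(wheel.read(matches[0])))
    missing = sorted(set(range(1, 231)) - {r["IT_number"] for r in records})
    if missing:
        problems.append(f"missing IT numbers: {missing}")
    return problems


# Demo against a fake wheel holding one placeholder record per group.
with tempfile.TemporaryDirectory() as tmp_dir:
    fake_wheel = Path(tmp_dir) / "demo-0.0.0-py3-none-any.whl"
    fake_records = [
        {"IT_number": n, "IT_coordinate_system_code": "demo"} for n in range(1, 231)
    ]
    payload = gzip.compress(json.dumps(fake_records).encode("utf-8"))
    with zipfile.ZipFile(fake_wheel, "w") as wheel:
        wheel.writestr("easydiffraction/" + DB_NAME, payload)
    problems = check_wheel(fake_wheel)

print(problems)
```

Reading from the archive rather than importing the installed package is what keeps the check independent of the runtime dependency tree.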

Verification commands (zsh-safe log capture where output is needed):

```bash
pixi run fix
pixi run check > /tmp/easydiffraction-check.log 2>&1; check_exit_code=$?; tail -n 200 /tmp/easydiffraction-check.log; exit $check_exit_code
pixi run unit-tests > /tmp/easydiffraction-unit.log 2>&1; unit_tests_exit_code=$?; tail -n 200 /tmp/easydiffraction-unit.log; exit $unit_tests_exit_code
pixi run integration-tests > /tmp/easydiffraction-integration.log 2>&1; integration_tests_exit_code=$?; tail -n 200 /tmp/easydiffraction-integration.log; exit $integration_tests_exit_code
pixi run script-tests > /tmp/easydiffraction-script.log 2>&1; script_tests_exit_code=$?; tail -n 200 /tmp/easydiffraction-script.log; exit $script_tests_exit_code
```

Packaging regression — build the wheel and inspect it directly (no
install, so the check does not depend on the package's full runtime
dependency tree):

```bash
rm -rf dist
pixi run dist-build > /tmp/easydiffraction-build.log 2>&1; build_exit_code=$?; tail -n 8 /tmp/easydiffraction-build.log; [ "$build_exit_code" -eq 0 ] || exit "$build_exit_code"
python tools/check_packaged_db.py dist/*.whl; pkg_check_exit_code=$?; [ "$pkg_check_exit_code" -eq 0 ] || exit "$pkg_check_exit_code"
```

If this shows `.json.gz` is not shipped, add a hatch
`artifacts`/`force-include` entry in `pyproject.toml` and re-run.

## Suggested Pull Request

**Title:** Complete the bundled space-group database (all 230 groups)

**Description:** EasyDiffraction's bundled space-group table was missing
42 common space groups entirely — including P4, P3, P6, and Pm-3 — plus
many alternative monoclinic settings, so symmetry constraints and
Wyckoff information silently did nothing for structures in those groups.
This change rebuilds the database from curated cryspy and cctbx/sgtbx
source data, covering all 230 groups, their standard settings, and every
public coordinate-code alias with full symmetry information. The seed
data is cross-checked against several independent references and keeps
flagged rows visible for later International Tables verification. The
data now ships as transparent, inspectable JSON instead of an opaque
binary pickle. Existing projects load unchanged; structures in the
previously-missing groups now get correct symmetry handling.
2 changes: 1 addition & 1 deletion docs/docs/tutorials/ed-13.ipynb
@@ -2671,7 +2671,7 @@
"\n",
"If you'd like to keep exploring, the EasyDiffraction library offers\n",
"many additional tutorials and examples on the official documentation\n",
-"site: 👉 https://docs.easydiffraction.org/lib/tutorials/\n",
+"site: 👉 https://docs.easydiffraction.org/lib/latest/tutorials\n",
"\n",
"Besides the Python package, EasyDiffraction also comes with a\n",
"graphical user interface (GUI) that lets you perform similar analyses\n",
2 changes: 1 addition & 1 deletion docs/docs/tutorials/ed-13.py
@@ -1508,7 +1508,7 @@
#
# If you'd like to keep exploring, the EasyDiffraction library offers
# many additional tutorials and examples on the official documentation
-# site: 👉 https://docs.easydiffraction.org/lib/tutorials/
+# site: 👉 https://docs.easydiffraction.org/lib/latest/tutorials
#
# Besides the Python package, EasyDiffraction also comes with a
# graphical user interface (GUI) that lets you perform similar analyses
Binary file not shown.
Binary file not shown.