Skip to content

First pass at EFT-based spherical geometry accuracy (port from AccuSphGeom)#1513

Open
rajeeja wants to merge 12 commits into
mainfrom
rajeeja/accusphere
Open

First pass at EFT-based spherical geometry accuracy (port from AccuSphGeom)#1513
rajeeja wants to merge 12 commits into
mainfrom
rajeeja/accusphere

Conversation

@rajeeja
Copy link
Copy Markdown
Contributor

@rajeeja rajeeja commented May 22, 2026

Closes #1509

This PR adapts the compensated-arithmetic tier from AccuSphGeom for UXarray's Numba-based spherical geometry stack. It replaces several cancellation-prone cross-product and dot-product paths with compensated primitives and applies them to arc predicates, GCA-GCA intersections, GCA/constant-latitude intersections, point-in-face, and GCA face bounds.

What is included:

  • Compensated arithmetic primitives (two_sum, two_prod, compensated cross products, compensated dot products, sum-of-squares, and sqrt correction).
  • AccuSphGeom-derived baseline regression tests for near-tangent GCA-GCA cases, GCA/constant-latitude cases, and spherical point-in-polygon cases.
  • UXarray integration for existing geometry APIs, bounds construction, and point-in-face queries.
  • User/API documentation describing the numerical accuracy problem and the compensated arithmetic approach.

Scope:

  • This is not a full port of AccuSphGeom's complete robustness stack.
  • It does not include Shewchuk adaptive predicates, Simulation of Simplicity/global-ID handling, exact/geogram fallback, C++ SIMD mask behavior, exact AccuSphGeom point-in-polygon implementation, or exact AccuSphGeom wrapper semantics.
  • UXarray-specific result packaging, longitude wrapping, and API compatibility are preserved where needed.

@review-notebook-app
Copy link
Copy Markdown

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@rajeeja rajeeja requested a review from hongyuchen1030 May 22, 2026 13:24
Comment thread uxarray/grid/_eft.py Outdated
Copy link
Copy Markdown
Contributor

@hongyuchen1030 hongyuchen1030 May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have an existed implementation for these functions in

def _two_sum(a, b):
. Consider either merge them or only keep one of them

And I would suggest use other name instead of eft here, the term "EFT" specifically refers to "Error-Free" floating point operations, but the AccuCross and diff_of_productthemself is not completely Error-Free, is just more accurate than normal floating point cross (doubling the precision)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: merged this into uxarray/utils/computing.py and renamed the docs/API section to compensated arithmetic. I only call two_sum/two_prod EFT now; the cross-product helpers are described as compensated algorithms.

Comment thread uxarray/grid/bounds.py


@njit(cache=True)
def _generate_lat_lon_bounds_local(face_vertices, z_min, z_max, snap_tol_deg):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a better vectorization, consider keeping the current design in https://github.com/hongyuchen1030/AccuSphGeom/blob/56cbd6e30270ec5845a853ec72ba5b19c9128017/include/accusphgeom/algorithms/lat_lon_bounds.hpp#L107:

It uses lots of masking and avoid branching for computation steps.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: rewrote this path closer to the AccuSphGeom structure. The local bounds computation now uses mask-selection for extrema/snap decisions instead of branching through separate cases.

Copy link
Copy Markdown
Contributor

@hongyuchen1030 hongyuchen1030 Jun 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation for this entire function involves lots of new branching and operations that are not existed in the AccuSphGeom

Comment thread uxarray/grid/bounds.py


@njit(cache=True)
def _generate_lat_lon_bounds_pole(face_vertices, label, z_min, z_max, snap_tol_deg):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, consider keeping the structure here: https://github.com/hongyuchen1030/AccuSphGeom/blob/56cbd6e30270ec5845a853ec72ba5b19c9128017/include/accusphgeom/algorithms/lat_lon_bounds.hpp#L159

use masks instead of branching as much as possible for the vectorization purpose

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: same cleanup here. The polar bounds path now keeps the AccuSphGeom-style local/polar structure and uses mask-selection for the snap logic.

Comment thread uxarray/grid/bounds.py


@njit(cache=True)
def _face_location_info(face_vertices, polar_cap_z):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The branching here will probably slow down the vectorization, consider keeping the structure here:
https://github.com/hongyuchen1030/AccuSphGeom/blob/56cbd6e30270ec5845a853ec72ba5b19c9128017/include/accusphgeom/algorithms/lat_lon_bounds.hpp#L49

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: _face_location_info now follows the AccuSphGeom face-location flow. The per-edge extrema use mask-selection; only the final face classification branches remain.

Comment thread uxarray/grid/intersections.py Outdated
@njit(cache=True)
def _normalize_pair(x_hi, y_hi, z_hi, x_lo, y_lo, z_lo):
"""Normalize an (hi, lo) compensated vector, returning the unit vector and magnitude."""
x = x_hi + x_lo
Copy link
Copy Markdown
Contributor

@hongyuchen1030 hongyuchen1030 May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This _normalize_pair is not utilizing the EFT and it will have the same precision as the direct floating point precision. And the normalization is one of the big killer for the precision.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if the input are normalized, then both the gca_gca_intersection and the gca_constLat_intersection will return the normalized results already

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And just in case you want an "accurately calculated norm", it is const auto sum = numeric::sum_of_squares_c<T, 3>(v.hi, v.lo); const auto norm = numeric::acc_sqrt_re(sum.hi, sum.lo);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: removed _normalize_pair. The intersection code now keeps compensated normals/intersection vectors, and the baseline near-tangent cases pass.

Comment thread uxarray/grid/intersections.py Outdated
@@ -301,139 +314,198 @@ def _gca_gca_intersection_cartesian(gca_a_xyz, gca_b_xyz):

@njit(cache=True)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider keeping the entire structure from here : https://github.com/hongyuchen1030/AccuSphGeom/blob/e4e13215dd4771b7ef01b7edf81eaf58dd6e6995/include/accusphgeom/constructions/gca_gca_intersection.hpp#L31

These EFT-fused operations are extremely sensitive for each operations and order. The accuracy can only be guaranteed iff following the exact algorithm

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: rewrote GCA-GCA around the AccuSphGeom structure: accucross normals, accucross_pair for the intersection direction, then candidate filtering with on_minor_arc.

Comment thread uxarray/grid/intersections.py Outdated


@njit(cache=True)
def gca_const_lat_intersection(gca_cart, const_z):
Copy link
Copy Markdown
Contributor

@hongyuchen1030 hongyuchen1030 May 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The algorithm is almost very stable around floating point precision. Within the current Uxarray Error tolerance, you almost don't have to use any other safety pre-check other than if the input are valid (The relative error bound is 3*\sqrt{1-constz^2}*machine_epsilon as long as it's constZ is at least 10^15 from the equator) . And probably the only check needed Then only check you can make is if the const latitude is at the equator . Again, make sure to follow the entire algorithm structure from below. Such precision is only guaranteed
iff it follows the exact algorithm here
https://github.com/hongyuchen1030/AccuSphGeom/blob/e4e13215dd4771b7ef01b7edf81eaf58dd6e6995/include/accusphgeom/constructions/gca_constlat_intersection.hpp#L33

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: rewrote const-lat around the AccuSphGeom accux_constlat flow and filter invalid candidates after computing them.

Comment thread uxarray/grid/intersections.py Outdated
nx = nx_hi + nx_lo
ny = ny_hi + ny_lo
nz = nz_hi + nz_lo
denom = nx * nx + ny * ny
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Denome should be calculated from
const T denom = s2.hi + s2.lo;
where
const auto s2 = numeric::sum_of_squares_c<T, 2>( {normal.hi[0], normal.hi[1]}, {normal.lo[0], normal.lo[1]});

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: denom now comes from _sum_sq_c2(...) as s2_hi + s2_lo.

Comment thread uxarray/grid/intersections.py Outdated
@@ -301,139 +314,198 @@ def _gca_gca_intersection_cartesian(gca_a_xyz, gca_b_xyz):

@njit(cache=True)
def gca_gca_intersection(gca_a_xyz, gca_b_xyz):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider keeping the entire structure from here https://github.com/hongyuchen1030/AccuSphGeom/blob/e4e13215dd4771b7ef01b7edf81eaf58dd6e6995/include/accusphgeom/constructions/gca_gca_intersection.hpp#L31.

These algorithms are extremely sensitive to the operations and order, the accuracy is only guaranteed iff we follow the exact same algorithm described

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: same GCA-GCA rewrite as above, following the AccuSphGeom operation order much more closely. Added baseline tests for the near-tangent cases too.

Comment thread uxarray/grid/intersections.py Outdated
x2_at_const_z = np.isclose(
x2[2], const_z, rtol=ERROR_TOLERANCE, atol=ERROR_TOLERANCE
)
# 1. Endpoint coincidence with the latitude line.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case is still valid and calculable with the new algorithm. And we can improve the vectorization by only use mask-selection after the intersection is calculated to snap the intersection point with the endpoints

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: removed the endpoint early return. The code computes candidates first, then snaps near-endpoint results afterward.

Comment thread uxarray/grid/intersections.py Outdated
z_max = extreme_gca_z(gca_cart, extreme_type="max")

# Check if the constant latitude is within the GCA range
if not in_between(z_min, const_z, z_max):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if this early exist will counter-effect the branching it brings to the vectorization

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: removed the endpoint pre-exits. I kept only invalid denom / negative-discriminant exits.

Comment thread uxarray/grid/intersections.py Outdated
elif p2_intersects_gca:
res[0] = p2
# 4. Solve for the two candidate points on the latitude circle.
r2 = 1.0 - const_z * const_z
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not the new EFT-fused algorithmn

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: replaced this with the compensated s2/s3, two_prod, cdp4, and acc_sqrt_re flow from AccuSphGeom.

@rajeeja rajeeja marked this pull request as draft May 22, 2026 21:18
@rajeeja
Copy link
Copy Markdown
Contributor Author

rajeeja commented May 22, 2026

@hongyuchen1030 Thanks for the review. The point-in-polygon predicate and orient3d sign check look good, but I can see now that the intersection functions are basically the old code with a few error-free transformation calls sprinkled in rather than a real port. I'm planning to rewrite gca_gca_intersection and gca_const_lat_intersection from scratch following your C++ implementation exactly. I will be adding the missing pieces:

  • compensated_dot_product
  • sum_of_squares_c
  • acc_sqrt_re
  • The second accucross overload that takes the hi/lo pairs

@rajeeja rajeeja marked this pull request as ready for review May 28, 2026 19:35
@rajeeja rajeeja requested a review from hongyuchen1030 May 28, 2026 19:35
Comment thread uxarray/utils/computing.py Outdated
tiers — an EFT filter (what this module implements), Shewchuk adaptive
predicates for results that fall inside the filter threshold, and a geogram
exact-arithmetic fallback. This port implements only the EFT tier. For
non-degenerate inputs in double precision this is sufficient; callers that
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're technically still unable to claim "For non-degenerate inputs in double precision this is sufficient". We can claim "This is twice more accurate than the direct floating point implementation without computation overhead with vectorization and parralization"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't use any adaptive arithmetic, we cannot claim for robustness here. We can only claim our point-in-face use the new algorithm that are more accurate (if we use any EFT operations)

Comment thread uxarray/utils/computing.py Outdated
predicates for results that fall inside the filter threshold, and a geogram
exact-arithmetic fallback. This port implements only the EFT tier. For
non-degenerate inputs in double precision this is sufficient; callers that
need to handle geometrically degenerate inputs (coincident arcs, a query
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These edge cases should be included in the test cases already. These scenarios don't have more risks than the "normal input" here. (They probably have the same risk since the intersection point is a newly constructed point)

Comment thread uxarray/grid/bounds.py

# Parameter along the arc at which z is extremal (matches C++ get_face_location_info).
denom = (z1 + z2) * (d - 1.0)
a_raw = (z1 * d - z2) / denom if denom != 0.0 else -1.0
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The a = min(max(a_raw, 0.0), 1.0) should be able to prevent the divide by zero here. But we can keep it just in case

Comment thread uxarray/grid/bounds.py
z_max = z_max_candidate
if z_min_candidate < z_min:
z_min = z_min_candidate

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider utilizing the boolean operation from AccuSphHGeom to reduce branching

  const MaskType<T> north_pole_candidate_mask = (face_z_max >= polar_cap_z);
  const MaskType<T> south_pole_candidate_mask = (face_z_min <= -polar_cap_z);
  const MaskType<T> local_mask = !(north_pole_candidate_mask | south_pole_candidate_mask);

Comment thread uxarray/grid/bounds.py


@njit(cache=True)
def _lon_bounds_from_vertices(face_vertices):
Copy link
Copy Markdown
Contributor

@hongyuchen1030 hongyuchen1030 Jun 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably don't need this work around anymore with the current latlon bounds implementation



@njit(parallel=True, nogil=True, cache=True)
def constant_lat_intersections_no_extreme(lat, edge_node_z, n_edge):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need to seperate the extreme case here, and what does the "extreme" mean here

Comment thread uxarray/grid/intersections.py Outdated
n2x = n2x_hi + n2x_lo
n2y = n2y_hi + n2y_lo
n2z = n2z_hi + n2z_lo
if (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The degeneracy check like this will greatly impact the vectorization performance, any check like this should be isolated from the intersection computation kernel. And the AccuXGCA itself is able to handle extremely short arcs

Comment thread uxarray/grid/intersections.py Outdated
and math.isfinite(vn)
):
# Parallel (coplanar) arcs: check whether endpoints of one lie on the other.
if on_minor_arc(v0, w0, w1):
Copy link
Copy Markdown
Contributor

@hongyuchen1030 hongyuchen1030 Jun 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, all these if-else branch will impact the vectorization behavior, that's why the original try_gca_gca_intersection only use mask/boolean operations. The point is, we do not try to prevent the NaN or Divide by zero in the intersection kernel, these errors will be recorded in the "status" so an outside dispatch function will know how to handle each case.

Comment thread uxarray/grid/intersections.py Outdated
p1_intersects_gca = point_within_gca(p1, gca_cart[0], gca_cart[1])
p2_intersects_gca = point_within_gca(p2, gca_cart[0], gca_cart[1])
if planar_sq < 0.0:
return res
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isolate these if-else branch outside of the computation kernel

# deduplication in the caller works correctly. Matches Hongyu's suggestion
# of mask-selection to snap after computing rather than branching out early.
_snap_sq = 1e-14 # distance² ≈ (1e-7)² — well above algorithm error (~1e-15)
for xe in (x1, x2):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Snapping involving branching should be isolated outside of the computation kernels. The intersection computation kernels should not include any branching and always stay in the SIMD-packed vectorization friendly form

res[0, 1] = p2[1]
res[0, 2] = p2[2]

return res
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation in

include/accusphgeom/constructions/gca_constlat_intersection.hpp

was intentionally designed with three separate layers, and I think it is important to keep these layers conceptually and structurally separated.

  1. Core computation kernel: accux_constlat

    This is the numerical kernel. It contains the carefully designed AccuSphere algorithm for computing the great-circle-arc and constant-latitude intersection.

    This layer should stay isolated and intact. It is designed to be SIMD-packing friendly, branch-free in the hot path, and easy to reason about for accuracy, performance, and future optimization. This function should not be mixed with high-level filtering, status interpretation, or UXarray-specific logic.

  2. SIMD-friendly batch API: try_gca_constlat_intersection

    This is the API intended for heavy computation.

    It is still SIMD-packing friendly and is designed to compute the full matrix of possible intersection results from the input faces or edge lists. It does not immediately discard invalid intersections. Instead, it returns the computed candidate points together with a status flag indicating whether each result is valid.

    This is the layer we want to use for large-scale vectorized computation, because it keeps the computation uniform. Even invalid candidates are part of the output matrix, and validity is represented by status instead of control flow.

  3. Dispatcher / convenience API: gca_constlat_intersection

    This is the higher-level user-facing dispatcher.

    This layer is allowed to branch. It reads the status returned by try_gca_constlat_intersection, filters out invalid results, and returns only the valid intersection points. This is useful as a lightweight convenience API, especially when the caller wants clean geometry results instead of the full vectorized computation matrix.

The important point is that these three layers serve different purposes.

The core kernel should focus only on accurate numerical computation. The batch API should focus on uniform, SIMD-friendly execution. The dispatcher should handle branching, filtering, and convenience behavior.

Mixing these layers together is not ideal because it makes the heavy computation path harder to vectorize, harder to optimize, and harder to verify numerically. Once filtering, branching, and application-specific logic are pushed into the core kernel or SIMD batch layer, the implementation becomes less predictable and less suitable for large-scale computation.

For this kind of heavy numerical geometry kernel, the best practice is usually:

  • keep the low-level numerical kernel pure, isolated, and branch-minimized;
  • keep the batch/vectorized API uniform and status-based;
  • move branching, filtering, and user-facing convenience behavior to a separate dispatcher layer;
  • avoid mixing UXarray-specific data handling with the core computational algorithm.

So for UXarray integration, I think the preferred workflow should be:

  1. Use try_gca_constlat_intersection for the main large-scale computation.
  2. Preserve the full output matrix and status information during the vectorized computation stage.
  3. Apply filtering only afterward, either through gca_constlat_intersection or through UXarray-side post-processing logic.

That way, we preserve the original AccuSphere design: accurate computation first, SIMD-friendly batch execution second, and lightweight branching/filtering only at the outer layer.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood — the key point is the design separation, not the C++ specifics. We've now split both intersection functions into the three layers you described:

  • _accux_constlat / _accux_gca — pure numerical kernels, no branching, no validity filtering, compensated operations in the exact AccuSphGeom order
  • _try_gca_const_lat_intersection / _try_gca_gca_intersection — batch/status layer using integer mask arithmetic (pos_valid * (1 - neg_valid)) to select candidates without branching in the hot path; returns (point, status, pos, neg)
  • gca_const_lat_intersection / gca_gca_intersection — outer dispatcher that reads status, applies endpoint snapping, and packages UXarray's NaN-filled result format; all UXarray-specific branching lives here



@njit(cache=True)
def _point_in_polygon_sphere(q, polygon):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to implement the tier 3 SoS robustness here: Location detail::point_in_polygon_sphere_impl,https://github.com/hongyuchen1030/AccuSphGeom/blob/e4e13215dd4771b7ef01b7edf81eaf58dd6e6995/src/algorithms/point_in_polygon_sphere.cpp#L359, and https://github.com/hongyuchen1030/AccuSphGeom/blob/e4e13215dd4771b7ef01b7edf81eaf58dd6e6995/src/algorithms/point_in_polygon_sphere.cpp#L345-L357

It's better for us to use the SoS (Simulation of Simplicity) to handle the near-degenerate cases. The vertex id is the vertex index in the data-array, And we will assign the query point q and the end-point r a unique id.

@rajeeja
Copy link
Copy Markdown
Contributor Author

rajeeja commented Jun 4, 2026

@hongyuchen1030

This does not fully ports AccuSphGeom because we do not have:

  • Shewchuk adaptive predicates,
  • Simulation of Simplicity/global-ID handling,
  • exact/geogram fallback,
  • C++ SIMD mask behavior,
  • exact AccuSphGeom point-in-polygon implementation,
  • exact AccuSphGeom wrapper semantics.

@hongyuchen1030
Copy link
Copy Markdown
Contributor

@hongyuchen1030

This does not fully ports AccuSphGeom because we do not have:

  • Shewchuk adaptive predicates,
  • Simulation of Simplicity/global-ID handling,
  • exact/geogram fallback,
  • C++ SIMD mask behavior,
  • exact AccuSphGeom point-in-polygon implementation,
  • exact AccuSphGeom wrapper semantics.

I understand that UXarray is not directly porting the C++ SIMD mask behavior as-is. My point is that the mask-based workflow is not exclusive to C++.

The key idea is the vectorized computation design: we keep the heavy computation path uniform, compute all candidate results in a batch, and use a status or mask array to indicate which results are valid. In C++, that may correspond to SIMD masks. In Python/NumPy/Numba, the same design can be represented by boolean masks, status arrays, or validity flags.

So the important part is not the specific C++ SIMD implementation detail. The important part is preserving the separation between:

  1. the core numerical kernel,
  2. the vectorized/batch computation layer that returns full results plus status,
  3. the outer dispatcher/filtering layer that only keeps valid points.

You can refer to my earlier reply here for the intended vectorized implementation workflow:

#1513 (comment)

This design pattern is still relevant for Python. Even if UXarray does not use the exact C++ SIMD mask behavior, it should still preserve the same vectorized workflow using Python-side masks or status arrays, rather than mixing branching and filtering into the core computation path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Port new algorithms to python from AccuSphGeom repo

2 participants