Add fixed-trajectory system tests with cross-track error metrics #365

Open
pvkumara wants to merge 71 commits into develop from pkumaraTrajectoryTesting

@pvkumara commented Jun 9, 2026 (Collaborator)

- What features did you add and/or bugs did you address?
[Image: Screenshot from 2026-06-09 15-42-21]

- Which GitHub issue does this address?

This PR does not address any GitHub issue. Instead, it adds a new fixed-trajectory test suite that automatically measures path-tracking error in simulation.

- Additional description if not fully described in the GitHub issue

This PR adds automated fixed-trajectory evaluation tests for the autonomy stack and fixes trajectory-tracking bugs that caused path execution to fail. It also improves the test results workflow so maintainers get one readable summary file instead of many per-test logs.

  • Please add videos and images to demonstrate the feature. Please upload videos to somewhere persistent (e.g. YouTube or Vimeo) for archival purposes.

https://youtu.be/zaaZqLUzqZ8

How did you implement it?

  • Algorithm details, design decisions, engineering notes, and any other relevant information about the implementation should be included
[4 images]

How do you run and use it?

  • What commands and button presses do you use to manually launch the stack to use your new feature?

The workflow for running these tests is to run `airstack up` and then create a test based on your needs using the global CLI options. Some basic tests that I ran to validate this testing stack are included below, along with the global CLI options.

  • Write a detailed procedure with EXACT BASH COMMANDS so that another maintainer can replicate and understand the benefits of your feature, and reproduce the videos and images you added above.
[2 images]

Testing with PyTest

  • What pytests did you add to ensure the feature is reliable and robust? What metrics are used?
[Image]
  • What's the exact command to run the pytests that test your feature? i.e. airstack test -m ...
[Image]
  • What are the expected results of the tests? What should a maintainer look at to understand whether the test succeeded?

After running an `airstack test` command, a maintainer should see in the console that all tests passed. To inspect the metrics, go to the testing folder, find the folder for the test in question, and open its summary.txt file, which lists all the metrics output by that test.
[Image]

Documentation

  • Was mkdocs.yml updated? (y/n)

Yes, the mkdocs.yml was updated to make a trajectory testing page.

  • Do the docs have sufficient scope such that a newcomer can easily reproduce and use your feature?

Yes, there is now a docs page that explains the full pipeline for trajectory testing.

  • Is there sufficient visual media?

I believe that there is sufficient visual media from the YouTube Video above but I can generate more if needed.

Versioning

Yes, the versioning was changed to 0.19.0-alpha.4.

pvkumara5 and others added 30 commits April 27, 2026 05:15
New tests/test_fixed_trajectory.py evaluates drone performance on Circle,
Figure8, Racetrack, and Line trajectories: takeoff -> execute -> land with
cross-track error, path RMSE, execution time, and success metrics recorded
to metrics.json for baseline comparison.

- Python ideal-path generators mirror fixed_trajectory_task.cpp equations
- Cross-track error uses robot pose snapshot at dispatch to transform
  base_link ideal path to world frame for odom comparison
- 5m loose tolerance documents the known circle failure without stranding drone
- conftest.py gains --trajectory-types CLI option and generalised phase-order
  sorting/ID-rewriting for both autonomy test modules
- tests/README.md documents the new module, all 11 metrics, and run commands

Made-with: Cursor
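
As a rough illustration of the cross-track computation described in the commit message above, here is a minimal sketch. It is not the code in tests/test_fixed_trajectory.py. The circle parameterisation, the function names, and the metric keys are all assumptions made for illustration; the real generators mirror fixed_trajectory_task.cpp, which is not shown in this conversation.

```python
import math


def circle_path(radius, n=200):
    """Ideal closed circle in base_link, starting at the origin heading +x.

    Hypothetical parameterisation, chosen only for this sketch.
    """
    return [
        (radius * math.sin(t), radius * (1.0 - math.cos(t)))
        for t in (2.0 * math.pi * i / (n - 1) for i in range(n))
    ]


def to_world(path_xy, pose):
    """Transform base_link points to the world frame using the pose
    snapshot (x, y, yaw) taken when the trajectory was dispatched."""
    x0, y0, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return [(x0 + c * x - s * y, y0 + s * x + c * y) for x, y in path_xy]


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection so the closest point stays on the segment.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def cross_track_errors(odom_xy, path_xy):
    """Distance from each odometry sample to the nearest path segment."""
    segments = list(zip(path_xy, path_xy[1:]))
    return [
        min(point_segment_distance(p, a, b) for a, b in segments)
        for p in odom_xy
    ]


def summarize(errors):
    """Aggregate per-sample errors (key names are hypothetical)."""
    return {
        "max_cross_track_error_m": max(errors),
        "path_rmse_m": math.sqrt(sum(e * e for e in errors) / len(errors)),
    }
```

Taking the pose snapshot at dispatch matters because the ideal path is defined relative to the robot, while odometry is reported in the world frame. Without the transform, the two point sets would not be comparable.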
* Add link to PAT

* Change to new orchestrator instance workflow

* Add availability zone

* Bump version to 0.18.0-alpha.7
…esting

Co-authored-by: Cursor <cursoragent@cursor.com>

# Conflicts:
#	.agents/skills/configure-multi-robot/SKILL.md
#	.agents/skills/run-system-tests/SKILL.md
#	.env
#	AGENTS.md
#	docs/development/intermediate/testing/index.md
#	mkdocs.yml
#	robot/docker/docker-compose.yaml
#	robot/ros_ws/src/sensors/lidar_point_cloud_filter/README.md
#	robot/ros_ws/src/sensors/lidar_point_cloud_filter/scripts/validate_lidar_filter_clouds.py
#	robot/ros_ws/src/sensors/lidar_point_cloud_filter/setup.cfg
#	robot/ros_ws/src/sensors/lidar_point_cloud_filter/setup.py
#	simulation/isaac-sim/extensions/PegasusSimulator
#	tests/README.md
#	tests/conftest.py
#	tests/parse_metrics.py
#	tests/pytest.ini
#	tests/sensor_probes.py
#	tests/test_liveliness.py
```

**One log file per test execution**, plus separate `airstack_env.*.log` files for fixture narration (the `up`/`down` of each parametrize tuple). The fixture log file is named to track the rewritten test ID so it lands next to the triggering test.
There is **no `logs/` subdirectory**. Live output streams to the terminal during

@JohnYanxinLiu commented Jul 1, 2026 (Collaborator)

Was there a reason logs were taken out? I feel they would be useful. Feel free to push back tho, I could see it going either way.

Collaborator Author

Basically, I created a single summary.txt that takes the key metrics from every run and puts them in one clean file. When we ran with the logs, there were far too many of them and they were formatted inconsistently. Now the information from the logs is formatted into a single file that you can read easily without digging through a ton of logs.

Collaborator

Do you think we could start a new file called end_to_end_testing.md instead of the very specific "fixed_trajectory_testing"? This would match the rest of the structure and set a better precedent for proper CI/CD structure in the future.

Collaborator Author

Yeah that works for me John. We could do that for sure

Collaborator

And then we can have a specific fixed_trajectory testing subsection as well, but a lot of these things can build into stronger e2e tests later on I feel like.

Collaborator Author

Thanks for the e2e testing suggestion — I reworked the docs around it in this PR. Here's what I did and what I'm leaving as follow-up.

What changed (docs-only in this PR)

  • Renamed fixed_trajectory_testing.md → end_to_end_testing.md (via git mv, so history is preserved).
  • Reframed the page as the end-to-end testing home: the H1 is now End-to-End Testing with an intro defining e2e (the full flight chain — takeoff → action → land — exercising the whole stack rather than one module), and it lists the two e2e suites we have today: takeoff/hover/land (takeoff_hover_land) and the fixed-trajectory path-tracker benchmark (autonomy). The existing fixed-trajectory content now sits as a section under that page.
  • Updated every reference to match: the mkdocs.yml nav, the testing index.md link + heading, and tests/README.md.
  • Added a "Future work" note in the page itself so the intent isn't lost.

I kept this docs-only on purpose so the change stays focused and doesn't touch the test/CI interface in this PR.

Follow-up (separate PR)
The two flight suites are really the same 4-phase chain (px4_ready → takeoff → [hover | trajectory] → land) — they differ only in the middle phase and the swept parameter, and conftest.py already orders them with shared logic. So the follow-up is to unify the two marks into a single e2e mark (-m e2e instead of -m "takeoff_hover_land or autonomy"), and optionally factor the shared phases/helpers (px4_ready, takeoff, landing, odom + ground-truth capture) into a common base to remove the duplication.

I'm deferring that because it's cross-cutting: it touches pytest.ini, the conftest.py ordering, the CI workflow, the README/AGENTS mark tables, and the metrics baselines keyed on the current mark names — and renaming the marks changes the -m interface. Doing it here would bloat this PR and risk the baselines/CI, so I'd rather land it as its own change.
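
For reference, one non-breaking way the follow-up could start is to add `e2e` as an alias on top of the existing marks rather than renaming them. The sketch below is an assumption about how that might look in a conftest.py hook. It is not code from this PR, and the `e2e` mark does not exist yet. It uses only `get_closest_marker` and `add_marker`, both part of pytest's item API.

```python
# Existing marks that together make up the end-to-end flight suites.
E2E_SOURCE_MARKS = ("takeoff_hover_land", "autonomy")


def pytest_collection_modifyitems(config, items):
    """Tag every test that carries an existing flight mark with `e2e` too.

    Hypothetical: the old marks keep working, so baselines and CI jobs
    keyed on the current mark names are unaffected.
    """
    for item in items:
        if any(item.get_closest_marker(name) for name in E2E_SOURCE_MARKS):
            item.add_marker("e2e")
```

With something like this in place, `-m e2e` would be intended to select the same tests as `-m "takeoff_hover_land or autonomy"`. The new mark would also need to be registered under `markers` in pytest.ini. Depending on hook ordering relative to pytest's own `-m` filtering, the hook may need `@pytest.hookimpl(tryfirst=True)`.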


### Observed baseline (Circle, Isaac Sim, 10 headless runs)

Validated on branch `pkumaraTrajectoryTesting` — see `tests/results/2026-06-05_18-26-52/summary.txt`:

Collaborator

I don't think we really need to reference internal branches. These docs are meant to become somewhat public, and we may archive old branches. I think it's fine to just leave the suggested numbers. Also make sure to specify what hardware these were run on: AirStations? Cloud instances (through GitHub's CI/CD integration)? Your local machine?

Collaborator Author

I removed the mention of my internal branch and noted that the code was validated on an AirStation.


## Running tests (complete CLI reference)

### Prerequisites

Collaborator

Could these prereqs be moved to index.md, and then this doc just references index.md?

Collaborator Author

Yeah, this is actually a much better setup. I moved the prereqs into index.md and added a reference to the index.md prereq section in the end-to-end testing .md file.


---

## Path tracker bug fixes (this PR)

@JohnYanxinLiu commented Jul 1, 2026 (Collaborator)

I feel like this should go into the PR template instead of into the direct documentation.

Collaborator Author

yeah lol, I kept that there for myself and forgot to remove it. I removed that section from the file now.


## Manual stack usage (without pytest)

To fly a fixed trajectory interactively:

Collaborator

Isn't most of this stuff in the beginner docs for AirStack? Can we remove this fluff? Or is there something about this section that the other docs don't have?

Back here in the getting_started docs is probably a better place to be updating these docs:
(https://github.com/castacks/AirStack/blob/78f5772e8fc3a3bc28a9f5f0a1fbea4e4142975c/docs/getting_started/index.md)

Collaborator Author

I removed the initial boilerplate (the airstack up bring-up, the takeoff action block, the land action block, and the airstack down steps) and left only the fixed-trajectory task dispatch command. I think we should keep the fixed-trajectory task material in this document because it is separate from, and more complicated than, the getting-started material.

for the multi-drone Pegasus script. Details: **`tests/README.md`** → *Isaac Sim and
the sensors mark*.

### Fixed-trajectory path-tracker benchmark

@JohnYanxinLiu commented Jul 1, 2026 (Collaborator)

I guess once the fixed_trajectory.md file is renamed, this should be switched to e2e (end_to_end) benchmarking.

Collaborator Author

yep switched that with my new end to end testing file changes.

 <param name="virtual_tracking_ahead_time" value="0.5" />
 <param name="min_virtual_tracking_velocity" value="0.5" />
-<param name="sphere_radius" value="1.0" />
+<param name="sphere_radius" value="2.0" />

Collaborator

Just curious, what was the reason for this?

Collaborator Author

Yeah, I think this is an artifact of me trying to fix the drone stalling during simulation: I increased the radius to give the drone more lookahead so the pure pursuit path tracker would work better. It has no effect, because velocity_sphere_radius_multiplier=1.0 makes the radius velocity-proportional, so this value is just a fallback and isn't actually used. I'll change it back to 1.0 to keep it consistent, though.
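
To make that explanation concrete, here is a toy sketch of the kind of selection rule it implies. This is an assumption for illustration only, not the path tracker's actual code. The function name and the exact condition under which the fixed value is used are hypothetical, and only the two parameter names come from the thread.

```python
def effective_sphere_radius(speed, sphere_radius=1.0,
                            velocity_sphere_radius_multiplier=1.0):
    """Hypothetical lookahead-radius rule, NOT taken from the tracker source.

    Assumes a positive multiplier makes the radius proportional to speed,
    and that the fixed sphere_radius is only used as a fallback otherwise.
    """
    if velocity_sphere_radius_multiplier > 0.0:
        return velocity_sphere_radius_multiplier * speed
    return sphere_radius


# With the multiplier at 1.0, changing the fixed value makes no difference.
same = effective_sphere_radius(3.0, sphere_radius=1.0) == \
    effective_sphere_radius(3.0, sphere_radius=2.0)
```

Under that assumption, `same` is true for any speed, which matches the statement that the 1.0 to 2.0 change was inert.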

 <param name="virtual_tracking_ahead_time" value="0.5" />
 <param name="min_virtual_tracking_velocity" value="0.5" />
-<param name="sphere_radius" value="1.0" />
+<param name="sphere_radius" value="2.0" />

Collaborator

Same question as in local.launch.xml. What is the purpose of this?

Collaborator Author

This is the same issue as in local.launch.xml. I will change it back to 1.0 to make it consistent.

Comment thread tests/pytest.ini
liveliness: Container and process health (Docker, tmux, sentinel ROS 2 nodes)
sensors: Sim and robot sensor topic rates, LiDAR validation, sim RTF
takeoff_hover_land: End-to-end takeoff / hover / land action tests
autonomy: Fixed-pattern trajectory path-tracker benchmark (test_fixed_trajectory.py)

Collaborator

Following the e2e testing precedent above, it might be nice to begin putting things together. I feel takeoff_hover_land could be combined with your new autonomy testing to become a general e2e testing pipeline.

Collaborator Author

Yeah, I made a doc in this PR laying out how everything is going to be combined, but I think we should do the full overhaul in a later PR so this one doesn't get too big.

Comment thread tests/README.md

Collaborator

If we're combining other things into an e2e categorization, make sure to update this documentation

Collaborator Author

yeah once the next PR happens and we're good, I'm going to update the README documentation.

Comment thread tests/conftest.py
# `pytest tests/` and `airstack test -m unit` discover them without any
# sys.path manipulation here. Each proxy file sets up its own paths.
RUN_DIR = None
LOGS_DIR = None

Collaborator

Why are we removing this? Can't we leave it in as optional logs?

Collaborator Author

The logs still all go to the terminal, so you can still see them. The issue with the log files was that they dumped a lot of unstructured information for every test, so over many runs that adds up to a ton of files that are hard to wade through. The summary.txt file gives the information from the logs, structured and easy to read, all in one place.
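
For readers who want a picture of the idea, here is a minimal sketch of collapsing per-test metrics into one summary file. It is not the implementation in this PR. The directory layout (one sub-folder per test, each with a metrics.json), the function name, and the output format are assumptions made for illustration.

```python
import json
from pathlib import Path


def build_summary(run_dir):
    """Collect every <test>/metrics.json under run_dir into summary.txt.

    Hypothetical layout and format; the real summary.txt may differ.
    """
    run_dir = Path(run_dir)
    lines = []
    for metrics_file in sorted(run_dir.glob("*/metrics.json")):
        metrics = json.loads(metrics_file.read_text())
        # One header per test, followed by its metrics in a stable order.
        lines.append(f"[{metrics_file.parent.name}]")
        for key in sorted(metrics):
            lines.append(f"  {key}: {metrics[key]}")
    summary_path = run_dir / "summary.txt"
    summary_path.write_text("\n".join(lines) + "\n")
    return summary_path
```

Sorting both the test folders and the metric keys keeps the file stable between runs, which makes two summaries easy to diff.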

pvkumara5 and others added 7 commits July 2, 2026 11:40
…kes the fixed value inert

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…d-trajectory doc

Co-authored-by: Cursor <cursoragent@cursor.com>
…ption)

Co-authored-by: Cursor <cursoragent@cursor.com>
…rted

Co-authored-by: Cursor <cursoragent@cursor.com>
Rename fixed_trajectory_testing.md to end_to_end_testing.md (history preserved), add e2e intro and future-work note, fix stale test path to tests/system, and update mkdocs nav, testing index, and tests/README references.

Co-authored-by: Cursor <cursoragent@cursor.com>
6 participants