Add a small streaming runbook and groundtruth to test_data by magdalendobson · Pull Request #1127 · microsoft/DiskANN

magdalendobson · 2026-06-02T19:44:27Z

We currently have no way to benchmark streaming algorithms with the existing test data. This PR adds a streaming runbook and groundtruth for the 256-point slice of SIFT that already exists in test_data. It also updates the example dynamic index in diskann-benchmark to use these files and to run correctly. This helps existing and future dynamic benchmarks stay in sync with code changes and lets us run small tests.
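For readers unfamiliar with how groundtruth feeds a benchmark: results at a search step are usually scored by recall@k against the exact nearest neighbors. The sketch below is a minimal illustration and not DiskANN's recall code. `recall_at_k` and its arguments are hypothetical names, and it works on in-memory ID lists because the on-disk layout of the groundtruth files is not described in this PR.

```python
from typing import List, Sequence


def recall_at_k(
    results: Sequence[Sequence[int]],
    groundtruth: Sequence[Sequence[int]],
    k: int,
) -> float:
    """Average fraction of the true top-k neighbor IDs found per query.

    Assumption: every groundtruth row holds at least k IDs, so the
    denominator is simply k times the number of queries.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(results) != len(groundtruth):
        raise ValueError("results and groundtruth must cover the same queries")
    if not groundtruth:
        raise ValueError("no queries to score")

    hits = 0
    for found, truth in zip(results, groundtruth):
        # Order inside the top-k does not matter for recall, only membership.
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (k * len(groundtruth))


# Hypothetical data: two queries, k = 3.
example_results: List[List[int]] = [[1, 2, 3], [4, 5, 6]]
example_truth: List[List[int]] = [[1, 2, 9], [4, 8, 7]]
example_recall = recall_at_k(example_results, example_truth, k=3)
```

In this toy example the first query recovers two of three true neighbors and the second recovers one, so the score is 3 hits out of 6.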

Copilot

Pull request overview

This PR adds a small “streaming” runbook + corresponding groundtruth files to the existing test_data/disk_index_search dataset, and updates the diskann-benchmark dynamic graph-index example to use the in-repo test data instead of external Big ANN Benchmarks paths. This makes it possible to run small, self-contained dynamic/streaming benchmark runs that stay aligned with future code changes.

Changes:

Added a streaming runbook YAML under test_data/disk_index_search/.
Added per-step groundtruth artifacts under test_data/disk_index_search/example_runbook_gt/.
Updated diskann-benchmark/example/graph-index-dynamic.json to use the in-repo SIFT-small-256 slice + new runbook/GT directory.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

File	Description
test_data/disk_index_search/example_runbook.yaml	Adds the streaming runbook (Git LFS-tracked) for the small SIFT slice.
test_data/disk_index_search/example_runbook_gt/step2.gt10	Adds runbook step groundtruth (Git LFS-tracked).
test_data/disk_index_search/example_runbook_gt/step4.gt10	Adds runbook step groundtruth (Git LFS-tracked).
test_data/disk_index_search/example_runbook_gt/step6.gt10	Adds runbook step groundtruth (Git LFS-tracked).
test_data/disk_index_search/example_runbook_gt/step8.gt10	Adds runbook step groundtruth (Git LFS-tracked).
diskann-benchmark/example/graph-index-dynamic.json	Switches the dynamic example to `test_data/disk_index_search` and wires it to the new runbook + GT directory.
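The step-numbered groundtruth files listed above suggest that, in a streaming benchmark, the correct neighbors depend on which points are live at the moment a search runs. The sketch below illustrates that idea with a brute-force replay. The runbook itself is LFS-tracked and not shown in this PR, so the step schema used here (`operation`, `start`, `end`) is an assumption modeled loosely on Big ANN Benchmarks streaming runbooks, and `replay_runbook` is a hypothetical helper, not part of DiskANN.

```python
from typing import Dict, List, Sequence, Set

Vector = Sequence[float]


def squared_l2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def replay_runbook(
    steps: Sequence[dict],
    points: Sequence[Vector],
    queries: Sequence[Vector],
    k: int,
) -> Dict[int, List[List[int]]]:
    """Replay steps in order; return exact top-k IDs for each search step.

    Steps are numbered from 1. Assumed (hypothetical) schema:
    insert/delete steps carry a half-open ID range [start, end).
    """
    live: Set[int] = set()
    groundtruth: Dict[int, List[List[int]]] = {}
    for number, step in enumerate(steps, start=1):
        op = step["operation"]
        if op == "insert":
            live.update(range(step["start"], step["end"]))
        elif op == "delete":
            live.difference_update(range(step["start"], step["end"]))
        elif op == "search":
            # Brute force over live points only; ties broken by point ID.
            groundtruth[number] = [
                sorted(live, key=lambda i: (squared_l2(points[i], q), i))[:k]
                for q in queries
            ]
        else:
            raise ValueError(f"unknown operation in step {number}: {op!r}")
    return groundtruth


# Hypothetical toy data: four 1-D points and one query near the origin.
toy_points = [[0.0], [1.0], [2.0], [3.0]]
toy_queries = [[0.1]]
toy_steps = [
    {"operation": "insert", "start": 0, "end": 2},
    {"operation": "search"},
    {"operation": "insert", "start": 2, "end": 4},
    {"operation": "search"},
    {"operation": "delete", "start": 0, "end": 1},
    {"operation": "search"},
]
toy_groundtruth = replay_runbook(toy_steps, toy_points, toy_queries, k=2)
```

After point 0 is deleted in step 5, the nearest neighbors at step 6 change even though the query is the same, which is why a separate groundtruth per search step is needed.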

codecov-commenter · 2026-06-02T19:59:26Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.87%. Comparing base (68cc3c4) to head (2ec71c9).

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1127      +/-   ##
==========================================
- Coverage   88.87%   88.87%   -0.01%     
==========================================
  Files         485      485              
  Lines       92112    92112              
==========================================
- Hits        81868    81865       -3     
- Misses      10244    10247       +3

Flag	Coverage Δ
miri	`88.87% <ø> (-0.01%)`	⬇️
unittests	`88.52% <ø> (-0.01%)`	⬇️

see 1 file with indirect coverage changes

hildebrandmw · 2026-06-03T00:19:17Z

  "search_directories": [
-    "../big-ann-benchmarks/data/MSTuringANNS",
-    "../big-ann-benchmarks/neurips23/runbooks"
+    "test_data/disk_index_search"


Care to wire this up to the integration tests to prevent regression?
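One way to picture the regression check being asked for is a test that fails when a configured search directory stops resolving. The sketch below is in Python for brevity and is not the repository's test harness, which is not shown in this PR. `missing_search_directories` is a hypothetical helper, and it assumes `search_directories` is a top-level key of the parsed config, which the diff hunk above does not confirm.

```python
from pathlib import Path
from typing import Any, Dict, List


def missing_search_directories(config: Dict[str, Any], repo_root: Path) -> List[str]:
    """Return configured search directories that do not exist under repo_root.

    Assumption: "search_directories" is a top-level list of paths that are
    relative to the directory the benchmark is launched from.
    """
    return [
        entry
        for entry in config.get("search_directories", [])
        if not (repo_root / entry).is_dir()
    ]
```

A test could load the example JSON with `json.load`, call this helper with the repository root, and assert that the returned list is empty.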

Magdalen Manohar added 12 commits May 14, 2026 17:56

97b36ef finish up recall computation patch
07b3671 Merge branch 'main' of github.com:microsoft/DiskANN
eac3ffb Merge branch 'main' of github.com:microsoft/DiskANN
17780f8 fix conflict
43eb517 fix conflict
1d3a52b Merge branch 'main' of github.com:microsoft/DiskANN
17eac62 Merge branch 'main' of github.com:microsoft/DiskANN
0ab4baa Merge branch 'main' of github.com:microsoft/DiskANN
4ddca60 Merge branch 'main' of github.com:microsoft/DiskANN
446f362 update dynamic example to use small runbook
65e6109 add new runbook and gt
2ec71c9 remove tags files

magdalendobson marked this pull request as ready for review June 2, 2026 19:45

magdalendobson requested review from a team and Copilot June 2, 2026 19:45

Copilot started reviewing on behalf of magdalendobson June 2, 2026 19:45

Copilot AI reviewed Jun 2, 2026

JordanMaples approved these changes Jun 2, 2026

hildebrandmw reviewed Jun 3, 2026

magdalendobson wants to merge 12 commits into main from users/magdalen/add_streaming_runbook_and_gt


5 participants
