keys-only Range optimization#21791
Conversation
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: nwnt The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
converted to ready to run the CI tests |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files
... and 23 files with indirect coverage changes @@ Coverage Diff @@
## main #21791 +/- ##
=======================================
Coverage 70.20% 70.21%
=======================================
Files 426 427 +1
Lines 35241 35323 +82
=======================================
+ Hits 24741 24801 +60
- Misses 9107 9125 +18
- Partials 1393 1397 +4 Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
|
Please fix tests |
32daca3 to
bfc97ae
Compare
| Limit int64 | ||
| Rev int64 | ||
| CountOnly bool | ||
| InMemoryKeysOnly bool |
There was a problem hiding this comment.
What doesn "InMemory" mean in that context? Think "KeysOnly" should be good enough. Please make sure that mutually exclusive options like "KeysOnly" and "CountOnly".
There was a problem hiding this comment.
Renamed back to KeysOnly. I was thinking using the previous name might give the impression that it's the same value from the --keys-only flag, but in this case, it's actually a special case of it.
Should the mutual exclusivity logic be put in this PR or a separate one? I just had a look and the checking logic seems missing from the server, but anyway it's also orthogonal to this PR. But anyway let me what you think.
There was a problem hiding this comment.
Raised #21802 for the mutual exclusivity check.
There was a problem hiding this comment.
I just meant for code here. We should at maintain the same behavior. Do we have any tests covering this case?
bfc97ae to
568e776
Compare
568e776 to
449b8b0
Compare
| if len(keys) == 0 { | ||
| return &RangeResult{KVs: nil, Count: 0, Rev: curRev}, nil | ||
| } | ||
| cap := sliceCapWithLimit(int(ro.Limit), keys) |
There was a problem hiding this comment.
cap is a golang build-in function, let's try to use a different name, such as keyCount
There was a problem hiding this comment.
Done - I renamed them to cappedKeysCount and cappedEntriesCount.
| Limit: limit, | ||
| Rev: r.Revision, | ||
| CountOnly: r.CountOnly, | ||
| KeysOnly: r.KeysOnly && r.SortTarget != pb.RangeRequest_VALUE, |
There was a problem hiding this comment.
Two comments:
- rename
KeyOnlytoFastKeyOnly - Why add
r.SortTarget != pb.RangeRequest_VALUEhere? I think we need to reuse functionIsDefaultOrderinghere
There was a problem hiding this comment.
Renamed to FastKeysOnly.
We can't use IsDefaultOrdering because we can do all other ordering with the data stored in memory except in the case of ordering by value. That's why Range was modified to return the versions and modified revisions, and avoid retrieving them from bbolt. The only case we can't avoid bbolt is when --sort-by=value is specified.
There was a problem hiding this comment.
We can't use
IsDefaultOrderingbecause we can do all other ordering with the data stored in memory except in the case of ordering by value. That's whyRangewas modified to return the versions and modified revisions, and avoid retrieving them from bbolt. The only case we can't avoid bbolt is when--sort-by=valueis specified.
OK. In that case, please add a comment.
Also I feel it's safer to add a feature gate, e.g. FastKeyOnlyRead, at least there is a way for users to disable it just in case.
There was a problem hiding this comment.
I don't think the change is large enough or risky enough for feature gate. The we use the data already was present in index. We have good test coverage for range there is not much complication in code paths.
We just need the test for server preference between flags that are rejected on etcdctl side #21791 (comment)
There was a problem hiding this comment.
I don't think the change is large enough or risky enough for feature gate. The we use the data already was present in index.
I am OK not to add a feature gate. If the data (revisions stuff) isn't consistent with bbolt, then it's a critical bug.
| if limit <= 0 || limit > len(revpairs) { | ||
| limit = len(revpairs) | ||
| } | ||
| cap := sliceCapWithLimit(int(ro.Limit), revpairs) |
Signed-off-by: Nont <9658731+nwnt@users.noreply.github.com>
449b8b0 to
7624a8a
Compare
|
@nwnt: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
A draft for #20386.
Technically,
Rangewith--keys-onlycan avoid retrieving data from bbolt and instead return the keys from the in memory index b-tree. However, to support sorting and filtering by modified/created revisions or versions options, we can't return just the keys. Even though we still need to retrieve all the data when--sort-by=valueis specified, luckily the rest of the metadata required by the sorting and filtering are available in the index b-tree.This PR does the following:
keysOnlyto mvcc.RangeOption, which is set when--keys-onlyis specified but not--sort-by=value.treeIndex.Rangereturn the list of created revisions and versions in addition to keys and modified revisions.storeTxnCommon.rangeKeysuse the newly added option to call the newly adjustedtreeIndex.Rangeto create theRangeResult.kvswithCreateRevisionandVersion.Filtering and sorting would work the same way, with the data and metadata they need to do their jobs.