[FEATURE] Provide a possibility to store and load an empty graph #685
reta wants to merge 6 commits into
Signed-off-by: Andriy Redko <drreta@gmail.com>
@reta thanks for this PR. I've left a few comments. Additionally, I tried running a test by saving and loading an empty heap graph to an OnDiskGraphIndex, but this failed an assertion while loading the graph. The test in this PR only covers OnHeapGraphIndex::save and OnHeapGraphIndex::load.
I'll try and take a look at this later. In the meantime, if you manage to figure out the issue, please do update the PR.
Failure details
The error:
java.lang.AssertionError: Node ID -1419795886 out of bounds for layer 1
The sample program is below; please run it with assertions enabled.
    import java.io.IOException;
    import java.nio.file.Path;
    import java.util.List;
    import java.util.Map;

    import io.github.jbellis.jvector.disk.ReaderSupplierFactory;
    import io.github.jbellis.jvector.graph.GraphIndexBuilder;
    import io.github.jbellis.jvector.graph.ListRandomAccessVectorValues;
    import io.github.jbellis.jvector.graph.disk.GraphIndexWriter;
    import io.github.jbellis.jvector.graph.disk.GraphIndexWriterTypes;
    import io.github.jbellis.jvector.graph.disk.OnDiskGraphIndex;
    import io.github.jbellis.jvector.graph.disk.feature.FeatureId;
    import io.github.jbellis.jvector.graph.disk.feature.InlineVectors;
    import io.github.jbellis.jvector.graph.similarity.BuildScoreProvider;
    import io.github.jbellis.jvector.vector.VectorSimilarityFunction;

    public class Test {
        public static void main(String[] forwardArgs) throws IOException {
            var dim = 1024;
            var vsf = VectorSimilarityFunction.DOT_PRODUCT;
            var M = 32;
            var ef = 100;
            var nOv = 1.2f;
            var alpha = 1.2f;
            var hier = true;

            // An empty vector list, so the built graph contains no nodes.
            var ravv = new ListRandomAccessVectorValues(List.of(), dim);
            var bsp = BuildScoreProvider.randomAccessScoreProvider(ravv, vsf);
            var graphPath = Path.of("./local/tmp.jvgraph");

            // Build the empty on-heap graph and write it to disk.
            try (
                var builder = new GraphIndexBuilder(bsp, dim, M, ef, nOv, alpha, hier);
                var graph = builder.build(ravv);
                var writer = GraphIndexWriter.getBuilderFor(GraphIndexWriterTypes.RANDOM_ACCESS_PARALLEL, graph, graphPath)
                    .with(new InlineVectors(dim))
                    .build();
            ) {
                writer.write(Map.of(FeatureId.INLINE_VECTORS, i -> new InlineVectors.State(ravv.getVector(i))));
                System.out.println("OHG max level = " + graph.getMaxLevel());
            }

            // Load the graph back as an OnDiskGraphIndex and walk the level-0 nodes.
            try (
                var readerSupplier = ReaderSupplierFactory.open(graphPath);
                var graph = OnDiskGraphIndex.load(readerSupplier);
            ) {
                var count = 0;
                var niter = graph.getNodes(0);
                while (niter.hasNext()) {
                    count++;
                    System.out.println(">>> " + niter.next());
                }
                System.out.println(String.format("Counted %d nodes", count));
                graph.getDegree(0);
            }
        }
}
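The reproduction above only surfaces the `AssertionError` when the JVM is started with assertions enabled (the standard `-ea` flag). A minimal sketch for confirming that, using a hypothetical helper class `AssertionCheck` that is not part of the PR:

```java
final class AssertionCheck {
    // Returns true only when this class runs with assertions enabled.
    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment inside the assert is evaluated only when assertions are on.
        assert enabled = true;
        return enabled;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
    }
}
```

Running it as `java -ea AssertionCheck` and then without `-ea` shows whether the flag is reaching the JVM.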
Thanks @ashkrisk, sorry, I forgot to update on this one. I ran it against the latest changes with assertions enabled and saw no issues. Please let me know if I am missing something, thanks!
ashkrisk left a comment
@reta the new test for the empty OnDiskGraphIndex looks good. I think the earlier failure I was seeing had to do with improper file cleanup on my system. I sorted that out and it runs fine now.
My remaining comments are mostly nitpicks. There are a lot of lines consisting of multiple spaces followed by a newline (Bob has a tendency to preserve indentation for empty lines); it would be better if you could remove those as well.
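One way to clear such whitespace-only lines is a single multiline regex pass. The sketch below is an illustration only; `BlankLineCleaner` is a hypothetical helper, not something from this PR, and most editors or formatters can do the same job.

```java
import java.util.regex.Pattern;

final class BlankLineCleaner {
    // In multiline mode, ^ and $ match at line boundaries, so this pattern
    // matches only lines made up entirely of spaces or tabs.
    private static final Pattern WHITESPACE_ONLY_LINE = Pattern.compile("(?m)^[ \\t]+$");

    // Empties whitespace-only lines; lines that contain code are left untouched.
    static String clean(String source) {
        return WHITESPACE_ONLY_LINE.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        String before = "class A {\n    int x;\n    \n    int y;\n}\n";
        System.out.print(clean(before));
    }
}
```

The line terminators themselves are preserved, so the number of lines in the file does not change.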
    @Override
    public int maxDegree() {
        return 0;
Let's return a valid maxDegree instead of defaulting to zero. Similarly for maxDegrees, etc.
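A rough sketch of what this suggestion could look like, assuming the empty graph is handed the configured per-layer degrees when it is created. The `DegreeInfo` interface and `EmptyGraph` class are hypothetical stand-ins for illustration and are not jvector's actual types or signatures.

```java
import java.util.List;

// Hypothetical stand-in for the degree-related part of a graph interface.
interface DegreeInfo {
    int maxDegree();
    List<Integer> maxDegrees();
}

// Hypothetical empty graph that reports its configured degrees instead of zero.
final class EmptyGraph implements DegreeInfo {
    private final List<Integer> maxDegrees;

    EmptyGraph(List<Integer> configuredMaxDegrees) {
        if (configuredMaxDegrees.isEmpty()) {
            throw new IllegalArgumentException("at least one layer degree is required");
        }
        this.maxDegrees = List.copyOf(configuredMaxDegrees);
    }

    @Override
    public int maxDegree() {
        // Largest configured degree across all layers.
        return maxDegrees.stream().mapToInt(Integer::intValue).max().orElseThrow();
    }

    @Override
    public List<Integer> maxDegrees() {
        return maxDegrees;
    }
}
```

With this shape, a graph that holds no nodes still reports the degree limits it was configured with, so callers that size buffers from `maxDegree()` see the same value before and after the first insert.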
Thanks @ashkrisk, I really appreciate it. I think we should be all set now.
Provide a possibility to store and load an empty graph; please see #684 for motivation and background.
Closes #684