Conversation
There was a problem hiding this comment.
Pull request overview
Removes the redundant DebugProvider from diskann-providers by migrating remaining test/example usage to diskann::graph::test::provider (test provider), and updates supporting caching/index test utilities to match the new provider behaviors (notably metrics timing and start-point handling).
Changes:
- Delete
DebugProvidermodule and migrate caching example/tests todiskann::graph::test::provider. - Extend the test provider with cacheability hooks and an inplace-delete post-processor that filters deleted IDs.
- Generalize
check_grid_searchtest helper to support providers whoseContext: Default, and add explicit start-point expectations.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
diskann/src/graph/test/provider.rs |
Adds cache accessor hooks and a deleted-ID filtering post-processor to support inplace-delete + caching scenarios. |
diskann-providers/src/model/graph/provider/async_/mod.rs |
Removes the debug_provider module export. |
diskann-providers/src/model/graph/provider/async_/debug_provider.rs |
Deletes the old DebugProvider implementation and its tests. |
diskann-providers/src/model/graph/provider/async_/caching/provider.rs |
Enables cached accessors to participate in deleted-ID filtering by delegating AsDeletionCheck. |
diskann-providers/src/model/graph/provider/async_/caching/example.rs |
Ports caching example/tests from DebugProvider to the core test provider and updates expectations around metrics. |
diskann-providers/src/index/diskann_async.rs |
Updates test helpers (check_grid_search) to support non-DefaultContext providers and adds start-point expectation control. |
diskann-providers/Cargo.toml |
Enables diskann’s testing feature to access the core test provider. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
❌ Your patch status has failed because the patch coverage (26.78%) is below the target coverage (90.00%). You can increase the patch coverage or adjust the target coverage. Additional details and impacted files@@ Coverage Diff @@
## main #923 +/- ##
==========================================
+ Coverage 89.34% 89.37% +0.03%
==========================================
Files 444 448 +4
Lines 83986 84503 +517
==========================================
+ Hits 75036 75525 +489
- Misses 8950 8978 +28
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
hildebrandmw
left a comment
There was a problem hiding this comment.
Thanks @JordanMaples! It will be great to get rid of the debug provider, sorry the caching provider is such a mess!
I left a few small comments for improvements, but nothing major.
diskann-providers/src/model/graph/provider/async_/caching/provider.rs
Outdated
Show resolved
Hide resolved
diskann-providers/src/model/graph/provider/async_/caching/example.rs
Outdated
Show resolved
Hide resolved
| &index, | ||
| &vectors, | ||
| &[], | ||
| StartPointExpectation::Visible(start_id), |
There was a problem hiding this comment.
I do not understand why any grid test would care about the start point being visible. So why does the caching provider care? It feels like we should just remove all that stuff instead of making the test hardness more complicated but only for certain providers.
There was a problem hiding this comment.
It comes down to whether or not the provider decides to include or exclude the start points in the vector. this gives flexibility to the provider implementation
There was a problem hiding this comment.
The providers should never include the start point in search results, because it is an "invisible" vector that exists only for purposes of the graph algorithm; it may not even be a real point in the dataset. Why does bf-tree and bf-tree only want this returned?
There was a problem hiding this comment.
It's not universally true that a start point needs to be excluded from search results. In the disk index (lol - here we go again with the disk index being weird), the start point is actually a valid point within the dataset and can be returned. Some ideas that have been thrown around as well for filtering is to use multiple start points, each of which can be candidates for being returned from search.
And I think this particular issue is related to the semantics of the test provider, not bf-tree? Or am I missing something?
There was a problem hiding this comment.
The code is cfg'ed to just bf_tree feature.
The start point is supposed to be frozen, which I guess is why the disk provider gets to cheat here. But in reality it may be that the disk provider uses multiple start points or start points that aren't real points. Those should never be returned. I agree that whether the start point is returned or not is up to the provider. It may choose to not filter start points if it has chosen real vectors are start points and also those vectors are immutable. Otherwise, it will need to filter them.
In any case, I still do not understand why the debug provider needs special casing just for bf-tree. Can the test not be expressed in such a way that whether start points are filtered or not is irrelevant?
My suspicion is that the reason is that the old debug provider was only used by bf-tree, and had a janky unwanted behavior, and copilot has tried to inject this code to rationalize that instead of just correcting the bf-tree provider or whatever is undesirably using the start point knowledge.
There was a problem hiding this comment.
I want to understand the code you added. I'm not asking for scope creep, the new scope was added by you in your PR. I do not understand why this was added, what it is needed for, etc. It doesn't sound like you do either.
I will go spelunking through your branch and the history I guess and figure this out myself.
There was a problem hiding this comment.
No, I really don't know much about this repo yet or how much of anything here works under the hood. I've been looking at it for less than a week. Most of my design decisions were a product of pestering Mark, asking questions of copilot as to why things were or weren't working, and trying to get back to a non-broken state after removing the DebugProvider. Ultimately, the StartPointExpectation was a snipped provided to me by Mark after I asked him about some test behaviors I was looking into. So, I'm going to defer to him about this one.
There was a problem hiding this comment.
Hey Jack, here's what's happening. The check_grid_search test function is fairly opinionated on what it's looking for: searching for the all zeros vector should return the all zeros point and searching for the numeric value of the start point yields that corner of the grid space. This was written specifically for the in-mem providers which do by default filter start points. The DebugProvider had this behavior as well.
Replacing the DebugProvider with the test Provider changes this behavior for the CachingProvider (the test Provider does return start points). This puts us in a little bit of a bind on what to do with the tests in the caching provider. I thought the solution here of making the inclusion of the start point a decision for passed to check_grid_search was the quickest way to keep tests passing while still allowing most of the tests for the caching provider to work as-is, and it makes the implicit assumption of the start point ID that is made by the check a bit more explicit. That the enum is gated behind the bf_tree feature is a byproduct of the caching provider using a bf-tree, not the actual bf-tree provider.
Really, the tests for the caching provider are asking "have I preserved the behavior of the underlying provider" and unfortunately the change in behavior from switching to the test Provider is the underlying motivation for the enum.
There was a problem hiding this comment.
Ok, after removing StartPointExpectation and staring at this a while, there are a few bugs combining together.
- The diskann::graph::test::Provider does not filter start ids as it should. That causes caching tests to fail. Adding FilterStartIds as a post processor fixes all those issues.
- The diskann::graph::test:cases::grid_search stuff appears to fail because it expects to get 10 results with k=10. And it seems like the NPQ is configured incorrectly. We need the queue to always be K + num_start_points large because otherwise we may have start points consuming queue slots. That appears to be what happens. There are K results, one of them is the start point, which is filtered out, then the result set returned is only 9 long.
When I originally wrote the filtering code, the NPQ was correctly sized. So somewhere in the last year this has regressed. If we fix this, I think none of this extra start point stuff is needed, and we can simply fix the missing filter in the caching provider and remove StartPointExpectation completely and have the check_grid_search omit any checking of the start points (which won't be returned ever since the providers all filter them).
Separately, it seems we have a footgun in that it's very easy to forget to add start point filtering to your post processors; i'm not sure what to do about that.
There was a problem hiding this comment.
For next steps, we should, in another PR, fix the underlying bug with NPQ sizing. Then in this PR we can simply update the caching tests to do start point filtering and remove this StartPointExpectation and everything should work fine.
As part of the cleanup efforts in the diskann_providers, we are removing the largely redundant DebugProvider.
This change largely revolves around migrating the
diskann-providers/src/model/graph/provider/async_/caching/example.rsuse of DebugProvider to using the provider fromdiskann/src/graph/test/provider.rs(I'll refer to this as test_provider).There were some inconsistencies between the DebugProvider and test_provider::Provider implementations. Most notably, DebugProvider updates counts in its metrics pre-emptively whereas the test_provider does it lazily. You'll see examples of this below where the accessor needed to be dropped to update the providers counts.
There was also an inconsistency between the in-memory indexes and the test_provider where the in-memory ones require the starting point to be appended to the end of the vector list whereas the test_provider expect the starting point to be included in the config. Some extra handling had to be added to account for that in diskann_async.
Some of this is copilot generated code, some is copilot assisted. A lot of advice and help was provided by Mark.
This closes #852.