Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The verify command was not intuitive and could be renamed and/or relocated. #23

Open
gammazero opened this issue May 23, 2023 · 2 comments

Comments

@gammazero
Copy link
Contributor

It was unclear what verify did without reading a lot of text. If someone wanted to see that ingestion was working between a provider and indexer, it was not clear that this was found under "verify ingest". Need better naming or command structure.

@rvagg Any suggestions?

@rvagg
Copy link

rvagg commented May 23, 2023

I think the name might be OK, my feedback is mainly about not understanding what it's trying to do, which I still don't quite understand. Let me talk ramble through my confusion and attempt to clarify it using the information at hand, and maybe that'll help, it might just be a matter of adding a few clarifying words.

Verifies an indexer's ingestion of multihashes.

What am I testing here? Am I asking the indexer whether it has specific multihashes recorded? The command for dogfooding had me just supply an advertisement ID; but in that case why am I not asking "have you ingested this advertisement", so perhaps this verification is doing something else. Is it testing something internal to the indexer, is there some uncertainty in the relationship between an advertisement and the multihashes it contains that just asking it "have you got this advertisement" isn't enough? Then the fact that the command says it can read multihashes from a CAR, that would suggest that this is not actually related to advertisements but is really just "verify that you have these multihashes recorded for this peer ID"—if that's the case, then that seems pretty straightforward and it's not so much an "ingest" thing, just an assertion of an expectation. So maybe the word "ingest" (possibly combined with the dogfooding command just using a specific advertisement) is tripping me up here by suggesting that the ingest process needs verification because it's not reliable. Is it just a verify? All the complexity with the parameters are also possibly adding to the confusion. @davidd8 had feedback that might suggest that: "What are all these flags for --edl=1 --batch-size=25 --sampling-prob=0.125". It could just be that the dogfooding task threw us in the deep end with very advanced usage? I think I can figure out what these options are for and do, but I don't understand why I'd want to change edl; is that just for the case of "oh, you have the first chunk, that's enough for me" (i.e. the use-cases are either "all" or "something").

@gammazero
Copy link
Contributor Author

Maybe something like "check indexed" or "isindexed", and flags that make it clear that you need to tell the command where to get the data that you want to check is indexed. From an advertisement (and specify where), from CAR, etc. Then finally how you want to perform the check: check everything, check some things (sample), etc.

I think the flags like edl (entry-depth-limit) are really just advanced usage, and your guess is completely correct "oh, you have the first chunk..." It was included in the exercise to get the example command to complete in a short amount of time, but also to encourage the curious to look at the advanced options.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants