Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up and consolidate error types #159

Open
V0ldek opened this issue Jun 18, 2023 · 1 comment
Open

Clean up and consolidate error types #159

V0ldek opened this issue Jun 18, 2023 · 1 comment
Assignees
Labels
acceptance: go ahead Reviewed, implementation can start area: library Improvements to the library API quality type: reliability Tests, code quality
Milestone

Comments

@V0ldek
Copy link
Member

V0ldek commented Jun 18, 2023

Is your feature request related to a problem? Please describe.
Lately I've posted on r/rust to get folks' opinion on whether error kind enums should be non_exhaustive or not.
A lot of water was shed about best error practices under that post.

https://www.reddit.com/r/rust/comments/13p5ipy/should_error_enums_be_non_exhaustive/

No matter anyone's take on the issue, I think our current errors are miserable, and we should try to consolidate them.

Currently there are four different "top-level" error types in the crate.

There's also ArrayIndexError, which is a bit special and created specifically for the NonNegativeArrayIndex type.

This is endemic of a larger issue that we have, namely that the library public API surface is way too wide. We allow extending basically every core part of rsonpath, with the exception of the classifier. This is something that needs to be tackled for v1.0.0, since we sure as hell don't want to force ourselves to support it as-is until the next major version.

We should expect users of the library to be similar to users of other query-language libraries. I imagine the 99% use case to be that you have some query you want to run or have it supplied as a string, and some input you read from a file or from memory. Then you

  1. parse and compile the query;
  2. run it on the input to get one, specific type of result.

From that perspective, I would care only about two kinds of errors - error during step 1. or step 2. It makes sense to have both, since they represent two completely different scenarios with different ways of handling.

  1. Either the developer made a mistake writing the query, or some external user that sent the query; in either case, we should display a pretty parser result, and the only sensible error resolution is for the query writer to fix the error in the query string.
  2. Assuming we don't have a bug, something is wrong with the input, in that either reading from it failed, or it contains malformed JSON that stopped us from preceeding at some point. If reading failed, the lib user might want to rerun. If it's malformed, then this is an unresolvable error.

From a programmer's perspective when using the lib dealing with those errors, there are three cases that require potentially different codepaths that arise from the above deliberations:

  1. Query was wrong. Tell that to the external user, or fail if its an internal query.
  2. Input failed to read properly. I might've been reading from file or some socket, in which case I should probably retry.
  3. Input is invalid and the operation of running a query on it doesn't make sense. Tell that to the user if the document is user-supplied, or reevaluate my life choices of running rsonpath on invalid documents.

In none of those cases does the code writer care whether the error with the document is DepthBelowZero or DepthAboveLimit. There's nothing actionable they can do with either choice, why would they give a damn.

Describe the solution you'd like

My current vision is as follows:

  1. Make QueryError an opaque struct with a nice Display implementation. The user of the lib doesn't care if the error was an "internal nom error" or something else, they just want to tell the user what the error was and how they can fix it.
  2. Rollup CompilerError into EngineError and rename EngineError to RsonpathError. This is going to be the main error type for most of the public API. Make it an enum with two kinds:
  • InputError,
  • EngineError.
    The first will indicate an error with the input, the same as currently. The second one will be an opaque struct with a kind() function, returning one of two kinds: NotSupported and DocumentError, telling the user that either the document is incorrect or that whatever they're trying to do is not supported with current version of rsonpath. The kinds are non-exhaustive. Nothing else will be publicly exposed. The InputError type will only be raised publicly via RsonpathError, and as something you should return for a custom Input implementation.
  1. ArrayIndexError can be removed. It currently has two cases, one of which is obsolete. Error-handling here can be done the same ways as the stdlib handles similar types, like NonZeroUsize, so by returning an Option<Self> with None if the bound was exceeded.

This achieves a couple of things.

  1. Since different causes of a QueryError don't give any actionable decisions for the code that calls the parser, the only thing you can do with it is display it.
  2. As discussed in the first section, an error reading the input and error due to its structure are two different things, with different actions that can be taken by the consumer. The different kinds of InputError are also important, as an IoError can potentially be handled and retried, while AllocationSizeExceeded is presumably fatal. In the future we might want to have input types that have configurable heap size limits, and exceeding that could also be handled in different ways from an IoError.

The EngineError can give you more info on what exactly happened under the presumption that the consumer might want to know, but we're giving no stability guarantees on the enum. Also the realm of possible things will be limited to "the JSON is borked" and "we don't support this", nothing more specific. If we ever need more kinds we can put them there, for example "we exceeded the memory limit of the stack" or something like that. But crucially, we assume 99% of the time the users won't care, and if they do then they should be aware that it's basically implementation details that might be unstable.

The key property that we want to keep when going forward is that we should expose different error variants only if a programmer using the lib should do different things for different variants.

@V0ldek V0ldek added the type: feature New feature or request label Jun 18, 2023
@V0ldek V0ldek self-assigned this Jun 18, 2023
@github-actions github-actions bot added the acceptance: triage Waiting for owner's input label Jun 18, 2023
@github-actions
Copy link

Tagging @V0ldek for notifications

@V0ldek V0ldek added this to the v1.0.0 milestone Jun 18, 2023
@github-actions github-actions bot added acceptance: go ahead Reviewed, implementation can start and removed acceptance: triage Waiting for owner's input labels Jun 18, 2023
@V0ldek V0ldek added mod: engine type: reliability Tests, code quality area: library Improvements to the library API quality and removed type: feature New feature or request labels Jun 18, 2023
@V0ldek V0ldek moved this from Todo to Committed in Active rsonpath development Jun 18, 2023
@V0ldek V0ldek modified the milestones: v1.0.0, lib v1.0.0 Nov 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acceptance: go ahead Reviewed, implementation can start area: library Improvements to the library API quality type: reliability Tests, code quality
Projects
Development

No branches or pull requests

1 participant