-
Notifications
You must be signed in to change notification settings - Fork 12
Description
Is your feature request related to a problem? Please describe.
Lately I've posted on r/rust to get folks' opinion on whether error kind enums should be non_exhaustive or not.
A lot of water was shed about best error practices under that post.
https://www.reddit.com/r/rust/comments/13p5ipy/should_error_enums_be_non_exhaustive/
No matter anyone's take on the issue, I think our current errors are miserable, and we should try to consolidate them.
Currently there are four different "top-level" error types in the crate.
ParserErrorCompilerErrorInputErrorEngineError(of note,EngineError::InputErrorencapsulates anInputError)
There's also ArrayIndexError, which is a bit special and created specifically for the NonNegativeArrayIndex type.
This is endemic of a larger issue that we have, namely that the library public API surface is way too wide. We allow extending basically every core part of rsonpath, with the exception of the classifier. This is something that needs to be tackled for v1.0.0, since we sure as hell don't want to force ourselves to support it as-is until the next major version.
We should expect users of the library to be similar to users of other query-language libraries. I imagine the 99% use case to be that you have some query you want to run or have it supplied as a string, and some input you read from a file or from memory. Then you
- parse and compile the query;
- run it on the input to get one, specific type of result.
From that perspective, I would care only about two kinds of errors - error during step 1. or step 2. It makes sense to have both, since they represent two completely different scenarios with different ways of handling.
- Either the developer made a mistake writing the query, or some external user that sent the query; in either case, we should display a pretty parser result, and the only sensible error resolution is for the query writer to fix the error in the query string.
- Assuming we don't have a bug, something is wrong with the input, in that either reading from it failed, or it contains malformed JSON that stopped us from preceeding at some point. If reading failed, the lib user might want to rerun. If it's malformed, then this is an unresolvable error.
From a programmer's perspective when using the lib dealing with those errors, there are three cases that require potentially different codepaths that arise from the above deliberations:
- Query was wrong. Tell that to the external user, or fail if its an internal query.
- Input failed to read properly. I might've been reading from file or some socket, in which case I should probably retry.
- Input is invalid and the operation of running a query on it doesn't make sense. Tell that to the user if the document is user-supplied, or reevaluate my life choices of running rsonpath on invalid documents.
In none of those cases does the code writer care whether the error with the document is DepthBelowZero or DepthAboveLimit. There's nothing actionable they can do with either choice, why would they give a damn.
Describe the solution you'd like
My current vision is as follows:
- Make
QueryErroran opaque struct with a niceDisplayimplementation. The user of the lib doesn't care if the error was an "internal nom error" or something else, they just want to tell the user what the error was and how they can fix it. - Rollup
CompilerErrorintoEngineErrorand renameEngineErrortoRsonpathError. This is going to be the main error type for most of the public API. Make it an enum with two kinds:
InputError,EngineError.
The first will indicate an error with the input, the same as currently. The second one will be an opaque struct with akind()function, returning one of two kinds:NotSupportedandDocumentError, telling the user that either the document is incorrect or that whatever they're trying to do is not supported with current version ofrsonpath. The kinds are non-exhaustive. Nothing else will be publicly exposed. TheInputErrortype will only be raised publicly viaRsonpathError, and as something you should return for a customInputimplementation.
ArrayIndexErrorcan be removed. It currently has two cases, one of which is obsolete. Error-handling here can be done the same ways as the stdlib handles similar types, likeNonZeroUsize, so by returning anOption<Self>withNoneif the bound was exceeded.
This achieves a couple of things.
- Since different causes of a
QueryErrordon't give any actionable decisions for the code that calls the parser, the only thing you can do with it is display it. - As discussed in the first section, an error reading the input and error due to its structure are two different things, with different actions that can be taken by the consumer. The different kinds of
InputErrorare also important, as anIoErrorcan potentially be handled and retried, whileAllocationSizeExceededis presumably fatal. In the future we might want to have input types that have configurable heap size limits, and exceeding that could also be handled in different ways from anIoError.
The EngineError can give you more info on what exactly happened under the presumption that the consumer might want to know, but we're giving no stability guarantees on the enum. Also the realm of possible things will be limited to "the JSON is borked" and "we don't support this", nothing more specific. If we ever need more kinds we can put them there, for example "we exceeded the memory limit of the stack" or something like that. But crucially, we assume 99% of the time the users won't care, and if they do then they should be aware that it's basically implementation details that might be unstable.
The key property that we want to keep when going forward is that we should expose different error variants only if a programmer using the lib should do different things for different variants.
Metadata
Metadata
Assignees
Labels
Type
Projects
Status