Overview
Deserr is a crate for deserializing data, with the ability to return custom, type-specific errors upon failure. It was also designed with user-facing APIs in mind and thus provides better defaults than serde for this use case.
Unlike serde, deserr does not parse the data in its serialization format itself but offloads the work to other crates. Instead, it deserializes the already-parsed serialized data into the final type. For example:
// bytes of the serialized value
let s: &str = ".." ;
// parse serialized data using another crate, such as `serde_json`
let json: serde_json::Value = serde_json::from_str(s).unwrap();
// finally deserialize with deserr
let data = T::deserialize_from_value(json.into_value()).unwrap();
// `T` must implement `Deserr`.
Why would I use it
The main place where you should use deserr is on your user-facing API, especially if it's supposed to be read by a human. Since deserr gives you full control over your error types, you can improve the quality of your error messages. Here is a little preview of what you can do with deserr:
Let's say I sent this payload to update my Meilisearch settings:
{
"filterableAttributes": ["doggo.age", "catto.age"],
"sortableAttributes": ["uploaded_at"],
"typoTolerance": {
"minWordSizeForTypos": {
"oneTypo": 1000, "twoTypo": 80
},
"enabled": true
},
"displayedAttributes": ["*"],
"searchableAttributes": ["doggo.name", "catto.name"]
}
With serde
With serde, we don't have much customization; this is the typical kind of message we would get in return:
{
"message": "Json deserialize error: invalid value: integer `1000`, expected u8 at line 6 column 21",
"code": "bad_request",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#bad_request"
}
The message
Json deserialize error: invalid value: integer
1000
, expected u8 at line 6 column 21
- The message uses the word
u8
, which definitely won't help a user who doesn't know rust or is unfamiliar with types. - The location is provided in terms of lines and columns. While this is generally good, when most of your users read this message in their terminal, it doesn't actually help much.
The rest of the payload
Since serde returned this error, we cannot know what happened or on which field it happened. Thus, the best we
can do is generate a code bad_request
that is common for our whole API. We then use this code to generate
a link to our documentation to help our users. But such a generic link does not help our users because it
can be thrown by every single route of Meilisearch.
With deserr
{
"message": "Invalid value at `.typoTolerance.minWordSizeForTypos.oneTypo`: value: `1000` is too large to be deserialized, maximum value authorized is `255`",
"code": "invalid_settings_typo_tolerance",
"type": "invalid_request",
"link": "https://docs.meilisearch.com/errors#invalid-settings-typo-tolerance"
}
The message
Invalid value at
.typoTolerance.minWordSizeForTypos.oneTypo
: value:1000
is too large to be deserialized, maximum value authorized is255
- We get a more human-readable location;
.typoTolerance.minWordSizeForTypos.oneTypo
. It gives us the faulty field. - We also get a non-confusing and helpful message this time; it explicitly tells us that the maximum value authorized is
255
.
The rest of the payload
Since deserr called one of our functions in the process, we were able to use a custom error code + link to redirect our user to the documentation specific to this feature and this field.
More possibilities with deserr that were impossible with serde
Adding constraints on multiples fields
In Meilisearch, there is another constraint on this minWordSizeForTypos
, the twoTypo
field must be greater than
the oneType
field.
Serde doesn't provide any feature to do that. You could write your own implementation of Deserialize
for the
entire sub-object minWordSizeForTypos
, but that's generally hard and wouldn't even let you customize the
error type.
Thus, that's the kind of thing you're going to check by hand in your code later on. This is error-prone and
may bring inconsistencies between most of the deserialization error messages and your error message.
With deserr, we provide attributes that allow you to validate your structure once it's deserialized.
When a field is missing
It's possible to provide your own function when a field is missing.
pub fn missing_field<E: DeserializeError>(field: &str, location: ValuePointerRef) -> E {
todo!()
}
At Meilisearch, we use this function to specify a custom error code, but we keep the default error message which is pretty accurate.
When an unknown field is encountered
It's possible to provide your own function when a field is missing.
fn unknown_field<E: DeserializeError>(
field: &str,
accepted: &[&str],
location: ValuePointerRef,
) -> E {
todo!()
}
Here is a few ideas we have or would like to implement at Meilisearch;
- In the case of a resource you can
PUT
with some fields, but can'tPATCH
all its fields. We can throw a specialimmutable field x
error instead of anunknown field x
. - Detecting when you use the field name of an alternative; for example, we use
q
to make aquery
while some Meilisearch alternatives usequery
. We could help our users with adid you mean?
message that corrects the field to its proper name in Meilisearch. - Trying to guess what the user was trying to say by computing the levenstein distance
between what the user typed and what is accepted to provide a
did you mean?
message that attempts to correct typos.
When multiple errors are encountered
Deserr lets you accumulate multiple errors with its MergeWithError
trait while trying to deserialize the value into your type.
This is a good way to improve your user experience by reducing the number of interactions
a user needs to have to fix an invalid payload.
The main parts of deserr are:
Deserr<E>
is the main trait for deserialization, unlike Serde, it's very easy to deserialize this trait manually, see theimplements_deserr_manually.rs
file in our examples directory.IntoValue
andValue
describes the shape that the parsed serialized data must haveDeserializeError
is the trait that all deserialization errors must conform toMergeWithError<E>
describe how to combine multiple errors together. It allows deserr to return multiple deserialization errors at once.ValuePointerRef
andValuePointer
point to locations within the value. They are used to locate the origin of an error.deserialize<Ret, Val, E>
is the main function to use to deserialize a value.Ret
is the returned value or the structure you want to deserialize.Val
is the value type you want to deserialize from. Currently, only an implementation forserde_json::Value
is provided in this crate, but you could add your own! Feel free to look into ourserde_json
module.E
is the error type that should be used if an error happens during the deserialization.
- The
Deserr
derive proc macro
Example
Implementing deserialize for a custom type with a custom error
In the following example, we're going to deserialize a structure containing a bunch of fields and uses a custom error type that accumulates all the errors encountered while deserializing the structure.
#![allow(unused)] fn main() { use deserr::{deserialize, DeserializeError, Deserr, ErrorKind, errors::JsonError, Value, ValueKind, IntoValue, take_cf_content, MergeWithError, ValuePointerRef, ValuePointer}; use serde_json::json; use std::str::FromStr; use std::ops::ControlFlow; use std::fmt; use std::convert::Infallible; /// This is our custom error type. It'll accumulate multiple `JsonError`. #[derive(Debug)] struct MyError(Vec<JsonError>); impl DeserializeError for MyError { /// Create a new error with the custom message. /// /// Return `ControlFlow::Continue` to continue deserializing even though an error was encountered. /// We could return `ControlFlow::Break` as well to stop right here. fn error<V: IntoValue>(self_: Option<Self>, error: ErrorKind<V>, location: ValuePointerRef) -> ControlFlow<Self, Self> { /// The `take_cf_content` return the inner error in a `ControlFlow<E, E>`. let error = take_cf_content(JsonError::error(None, error, location)); let errors = if let Some(MyError(mut errors)) = self_ { errors.push(error); errors } else { vec![error] }; ControlFlow::Continue(MyError(errors)) } } /// We have to implements `MergeWithError` between our error type _aaand_ our error type. impl MergeWithError<MyError> for MyError { fn merge(self_: Option<Self>, mut other: MyError, _merge_location: ValuePointerRef) -> ControlFlow<Self, Self> { if let Some(MyError(mut errors)) = self_ { other.0.append(&mut errors); } ControlFlow::Continue(other) } } #[derive(Debug, Deserr, PartialEq, Eq)] #[deserr(deny_unknown_fields)] struct Search { #[deserr(default = String::new())] query: String, #[deserr(try_from(&String) = FromStr::from_str -> IndexUidError)] index: IndexUid, #[deserr(from(String) = From::from)] field: Wildcard, #[deserr(default)] filter: Option<serde_json::Value>, // Even though this field is an `Option` it IS mandatory. limit: Option<usize>, #[deserr(default)] offset: usize, } /// An `IndexUid` can only be composed of ascii characters. #[derive(Debug, PartialEq, Eq)] struct IndexUid(String); /// If we encounter a non-ascii character this is the error type we're going to throw. struct IndexUidError(char); impl FromStr for IndexUid { type Err = IndexUidError; fn from_str(s: &str) -> Result<Self, Self::Err> { if let Some(c) = s.chars().find(|c| !c.is_ascii()) { Err(IndexUidError(c)) } else { Ok(Self(s.to_string())) } } } impl fmt::Display for IndexUidError { fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result { write!( f, "Encountered invalid character: `{}`, only ascii characters are accepted in the index", self.0 ) } } /// We need to define how the `IndexUidError` error is going to be merged with our /// custom error type. impl MergeWithError<IndexUidError> for MyError { fn merge(self_: Option<Self>, other: IndexUidError, merge_location: ValuePointerRef) -> ControlFlow<Self, Self> { // To be consistent with the other error and automatically get the position of the error we re-use the `JsonError` // type and simply define ourself as an `Unexpected` error. let error = take_cf_content(JsonError::error::<Infallible>(None, ErrorKind::Unexpected { msg: other.to_string() }, merge_location)); let errors = if let Some(MyError(mut errors)) = self_ { errors.push(error); errors } else { vec![error] }; ControlFlow::Continue(MyError(errors)) } } /// A `Wildcard` can either contains a normal value or be a unit wildcard. #[derive(Deserr, Debug, PartialEq, Eq)] #[deserr(from(String) = From::from)] enum Wildcard { Wildcard, Value(String), } impl From<String> for Wildcard { fn from(s: String) -> Self { if s == "*" { Wildcard::Wildcard } else { Wildcard::Value(s) } } } // Here is an example of a typical payload we could deserialize: let data = deserialize::<Search, _, MyError>( json!({ "index": "mieli", "field": "doggo", "filter": ["id = 1", ["catto = jorts"]], "limit": null }), ).unwrap(); assert_eq!(data, Search { query: String::new(), index: IndexUid(String::from("mieli")), field: Wildcard::Value(String::from("doggo")), filter: Some(json!(["id = 1", ["catto = jorts"]])), limit: None, offset: 0, }); // And here is what happens when everything goes wrong at the same time: let error = deserialize::<Search, _, MyError>( json!({ "query": 12, "index": "mieli 🍯", "field": true, "offset": "🔢" }), ).unwrap_err(); // We're going to stringify all the error so it's easier to read assert_eq!(error.0.into_iter().map(|error| error.to_string()).collect::<Vec<String>>().join("\n"), "\ Invalid value type at `.query`: expected a string, but found a positive integer: `12` Invalid value type at `.offset`: expected a positive integer, but found a string: `\"🔢\"` Invalid value at `.index`: Encountered invalid character: `🍯`, only ascii characters are accepted in the index Invalid value type at `.field`: expected a string, but found a boolean: `true` Missing field `limit`\ "); }