feat(serde_with): make BytesOrString adjustable #793
sivizius wants to merge 1 commit into jonasbb:master
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
    @@            Coverage Diff             @@
    ##           master     #793      +/-   ##
    ==========================================
    + Coverage   67.34%   67.59%   +0.25%
    ==========================================
      Files          40       40
      Lines        2468     2509      +41
    ==========================================
    + Hits         1662     1696      +34
    - Misses        806      813       +7
1fc3306 to 554d127
554d127 to e030aa0
jonasbb left a comment
Thanks for the PR. This looks really good already :) The addition looks reasonable to me.
    fn test_bytes_or_string_as_bytes() {
        #[serde_as]
        #[derive(Debug, Serialize, Deserialize, PartialEq)]
        struct S(#[serde_as(as = "BytesOrString")] Vec<u8>);
Can you please add a test for an explicit PreferBytes too?
    /// When serializing a value of a type,
    /// that allows multiple types during deserialization,
    /// prefer a specific type.
    pub trait TypePreference: SerializeAs<[u8]> {}
That's neat to just reuse the SerializeAs trait for the behavior.
Does the trait need to be fully public? We can't make it private, since it appears in public bounds, but we can seal all implementations, similar to serde_with/serde_with/src/base64.rs, lines 160 to 168 in bc20634.
I would like to keep the serialization and deserialization behavior matching if possible. For that a fully public trait feels unnecessary. Having ASCII and unicode strings supported feels sufficient for now. If it turns out that more freedom is necessary, it can be unsealed later.
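For readers unfamiliar with sealing, here is a minimal sketch of the pattern, not the code from this PR or from base64.rs. The `sealed` module, the `NAME` constant, and `preference_name` are hypothetical names used only for illustration; in the PR the trait's supertrait is `SerializeAs<[u8]>`, which is left out here so the example compiles without dependencies.

```rust
// Hypothetical sketch of a sealed trait; only the trait and marker
// struct names come from the PR, everything else is illustrative.
mod sealed {
    // Public trait inside a private module: it may appear in public
    // bounds, but other crates cannot name it and therefore cannot
    // implement it for their own types.
    pub trait Sealed {}
}

/// Stand-in for the PR's `TypePreference` (the real trait also
/// requires `SerializeAs<[u8]>`, omitted here as an assumption).
pub trait TypePreference: sealed::Sealed {
    /// Illustrative constant, not part of the PR.
    const NAME: &'static str;
}

pub struct PreferBytes;
pub struct PreferString;
pub struct PreferAsciiString;

// Only this crate can write these impls, so the set of
// preferences is closed until the trait is unsealed.
impl sealed::Sealed for PreferBytes {}
impl sealed::Sealed for PreferString {}
impl sealed::Sealed for PreferAsciiString {}

impl TypePreference for PreferBytes {
    const NAME: &'static str = "bytes";
}
impl TypePreference for PreferString {
    const NAME: &'static str = "string";
}
impl TypePreference for PreferAsciiString {
    const NAME: &'static str = "ascii-string";
}

/// Generic code can still use the trait in bounds.
pub fn preference_name<P: TypePreference>() -> &'static str {
    P::NAME
}
```

Unsealing later only requires dropping the `sealed::Sealed` supertrait, which is a backwards-compatible change for existing users.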
Given this PR is #794 really necessary? The |
It was more like an obvious addition to this PR. However, making this trait work for |
Oh yeah, that is changing the behavior. In JSON it is not visible, as there is no efficient way to store binary data. There are some tests for |
It’s not breaking for |
    {
        match core::str::from_utf8(source) {
            Ok(text) if text.is_ascii() => serializer.serialize_str(text),
            _ => serializer.serialize_bytes(source),
As discussed, these serialize_bytes should just be serialize.
    - _ => serializer.serialize_bytes(source),
    + _ => serializer.serialize(source),
    where
        S: Serializer,
    {
        serializer.serialize_bytes(source)
    - serializer.serialize_bytes(source)
    + serializer.serialize(source)
    {
        match core::str::from_utf8(source) {
            Ok(text) => serializer.serialize_str(text),
            _ => serializer.serialize_bytes(source),
    - _ => serializer.serialize_bytes(source),
    + _ => serializer.serialize(source),
The suggested changes ( |
At the moment, any `Vec<u8>` annotated with `#[serde_as(as = "BytesOrString")]` is always serialised as an array of integers in the range 0–255. However, these bytes are often valid strings, and the internal representation as `Vec<u8>` is just an implementation detail.

This PR adds a generic type parameter `PREFERENCE`, which must implement the new trait `TypePreference`, a subtrait of `SerializeAs<[u8]>`. The parameter defaults to the new marker struct `PreferBytes`, which serialises `Vec<u8>` as an array like before. Two additional marker structs are provided: `PreferString` first tries to convert `&[u8]` to `&str` and falls back to serialising as an array only if the bytes are not a valid string; `PreferAsciiString` additionally checks that all characters of the string are ASCII and otherwise falls back to the array as well.
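The decision each preference makes can be sketched without serde at all. The following is a dependency-free illustration, not the PR's implementation: the `Repr` enum and the three `prefer_*` functions are hypothetical stand-ins for the `SerializeAs<[u8]>` impls of the marker structs, returning which representation would be chosen instead of driving a serializer.

```rust
/// Hypothetical stand-in for "what gets handed to the serializer".
#[derive(Debug, PartialEq)]
pub enum Repr<'a> {
    /// The bytes are emitted as a string.
    Str(&'a str),
    /// The bytes are emitted as a byte sequence / array.
    Bytes(&'a [u8]),
}

/// Mirrors `PreferBytes`: never attempts a string conversion.
pub fn prefer_bytes(source: &[u8]) -> Repr<'_> {
    Repr::Bytes(source)
}

/// Mirrors `PreferString`: any valid UTF-8 becomes a string.
pub fn prefer_string(source: &[u8]) -> Repr<'_> {
    match core::str::from_utf8(source) {
        Ok(text) => Repr::Str(text),
        Err(_) => Repr::Bytes(source),
    }
}

/// Mirrors `PreferAsciiString`: only pure-ASCII text becomes a string.
pub fn prefer_ascii_string(source: &[u8]) -> Repr<'_> {
    match core::str::from_utf8(source) {
        Ok(text) if text.is_ascii() => Repr::Str(text),
        _ => Repr::Bytes(source),
    }
}
```

Under these assumptions, `b"hello"` is a string for both string preferences, the UTF-8 bytes of "héllo" are a string only for `prefer_string`, and an invalid sequence such as `[0xff, 0xfe]` stays bytes for all three.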