【Question title】: How can we write a generic function for checking Serde serialization and deserialization?
【Posted】: 2017-10-01 16:12:09
【Question】:

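In a project involving custom Serde (1.0) serialization and deserialization methods, I have relied on this test routine to check whether serializing an object and deserializing it back yields an equivalent object.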

// let o: T = ...;
let buf: Vec<u8> = to_vec(&o).unwrap();
let o2: T = from_slice(&buf).unwrap();
assert_eq!(o, o2);

Doing this inline works well. My next step towards reusability was to write a function, check_serde, for this purpose.

pub fn check_serde<T>(o: T)
where
    T: Debug + PartialEq<T> + Serialize + DeserializeOwned,
{
    let buf: Vec<u8> = to_vec(&o).unwrap();
    let o2: T = from_slice(&buf).unwrap();
    assert_eq!(o, o2);
}

This works for owned types, but not for types with lifetime bounds (Playground):

check_serde(5);
check_serde(vec![1, 2, 5]);
check_serde("five".to_string());
check_serde("wait"); // [E0279]

The error:

error[E0279]: the requirement `for<'de> 'de : ` is not satisfied (`expected bound lifetime parameter 'de, found concrete lifetime`)
  --> src/main.rs:24:5
   |
24 |     check_serde("wait"); // [E0279]
   |     ^^^^^^^^^^^
   |
   = note: required because of the requirements on the impl of `for<'de> serde::Deserialize<'de>` for `&str`
   = note: required because of the requirements on the impl of `serde::de::DeserializeOwned` for `&str`
   = note: required by `check_serde`

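Since I want the function to work in these cases (including structs with string slices), I tried a new version with an explicit lifetime for the deserialized object: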

pub fn check_serde<'a, T>(o: &'a T)
where
    T: Debug + PartialEq<T> + Serialize + Deserialize<'a>,
{
    let buf: Vec<u8> = to_vec(o).unwrap();
    let o2: T = from_slice(&buf).unwrap();
    assert_eq!(o, &o2);
}

check_serde(&5);
check_serde(&vec![1, 2, 5]);
check_serde(&"five".to_string());
check_serde(&"wait"); // [E0405]

This implementation leads to another problem and does not compile (Playground).

error[E0597]: `buf` does not live long enough
  --> src/main.rs:14:29
   |
14 |     let o2: T = from_slice(&buf).unwrap();
   |                             ^^^ does not live long enough
15 |     assert_eq!(o, &o2);
16 | }
   | - borrowed value only lives until here
   |
note: borrowed value must be valid for the lifetime 'a as defined on the function body at 10:1...
  --> src/main.rs:10:1
   |
10 | / pub fn check_serde<'a, T>(o: &'a T)
11 | |     where T: Debug + PartialEq<T> + Serialize + Deserialize<'a>
12 | | {
13 | |     let buf: Vec<u8> = to_vec(o).unwrap();
14 | |     let o2: T = from_slice(&buf).unwrap();
15 | |     assert_eq!(o, &o2);
16 | | }
   | |_^

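I had expected this: this version implies that the serialized content (and therefore the deserialized object) lives as long as the input object, which is not true. The buffer is only meant to live as long as the function's scope.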
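
The lifetime relationship can be reproduced without Serde. The sketch below uses a hypothetical `Decode<'de>` trait as a stand-in for Serde's `Deserialize<'de>` (the trait and the `decode_local` helper are assumptions made for illustration, not Serde APIs). It shows why a function that owns its buffer can only return types that are decodable from a buffer of any lifetime, which is the role `DeserializeOwned` plays in Serde.

```rust
// Hypothetical stand-in for Serde's `Deserialize<'de>`; not part of Serde.
trait Decode<'de>: Sized {
    fn decode(buf: &'de [u8]) -> Self;
}

// Owned output: valid for a buffer of any lifetime.
impl<'de> Decode<'de> for String {
    fn decode(buf: &'de [u8]) -> Self {
        String::from_utf8_lossy(buf).into_owned()
    }
}

// Borrowed output: it points into the buffer, so it cannot outlive it.
impl<'de: 'a, 'a> Decode<'de> for &'a str {
    fn decode(buf: &'de [u8]) -> Self {
        std::str::from_utf8(buf).unwrap()
    }
}

// The buffer is local to this function, so `T` must be decodable
// from a buffer of *any* lifetime (the `for<'de>` bound).
fn decode_local<T: for<'de> Decode<'de>>(bytes: &[u8]) -> T {
    let buf: Vec<u8> = bytes.to_vec(); // dropped when the function returns
    T::decode(buf.as_slice())
}

fn main() {
    // An owned type satisfies the higher-ranked bound.
    let owned: String = decode_local(b"wait");
    assert_eq!(owned, "wait");

    // A borrowed value is fine while the buffer it borrows from is alive.
    let buf = b"wait".to_vec();
    let borrowed: &str = Decode::decode(buf.as_slice());
    assert_eq!(borrowed, "wait");

    // Rejected by the compiler: `&str` is not `for<'de> Decode<'de>`.
    // let bad: &str = decode_local(b"wait");
}
```

The commented-out line corresponds to the first attempt in the question: a borrowed type cannot be produced from a buffer that the function itself owns.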

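My third attempt builds an owned version of the original input, thus sidestepping the problem of a deserialized object with different lifetime bounds. The ToOwned trait appears to suit this use case.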

pub fn check_serde<'a, T: ?Sized>(o: &'a T)
where
    T: Debug + ToOwned + PartialEq<<T as ToOwned>::Owned> + Serialize,
    <T as ToOwned>::Owned: Debug + DeserializeOwned,
{
    let buf: Vec<u8> = to_vec(&o).unwrap();
    let o2: T::Owned = from_slice(&buf).unwrap();
    assert_eq!(o, &o2);
}

This makes the function work for plain string slices, but not for composite objects containing them (Playground):

check_serde(&5);
check_serde(&vec![1, 2, 5]);
check_serde(&"five".to_string());
check_serde("wait");
check_serde(&("There's more!", 36)); // [E0279]

Once again, we run into the same kind of error as in the first version:

error[E0279]: the requirement `for<'de> 'de : ` is not satisfied (`expected bound lifetime parameter 'de, found concrete lifetime`)
  --> src/main.rs:25:5
   |
25 |     check_serde(&("There's more!", 36)); // [E0279]
   |     ^^^^^^^^^^^
   |
   = note: required because of the requirements on the impl of `for<'de> serde::Deserialize<'de>` for `&str`
   = note: required because of the requirements on the impl of `for<'de> serde::Deserialize<'de>` for `(&str, {integer})`
   = note: required because of the requirements on the impl of `serde::de::DeserializeOwned` for `(&str, {integer})`
   = note: required by `check_serde`

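Admittedly, I'm at a loss. How can we build a generic function that, using Serde, serializes an object and deserializes it back into a new object? In particular, can this function be written in Rust (stable or nightly), and if so, what adjustments is my implementation missing?

【Discussion】:

    Tags: serialization rust lifetime serde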


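    【Answer 1】:

    Unfortunately, what you need is a feature that is not yet implemented in Rust: generic associated types.

    Let's look at a different variant of check_serde: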

    pub fn check_serde<T>(o: T)
    where
        for<'a> T: Debug + PartialEq<T> + Serialize + Deserialize<'a>,
    {
        let buf: Vec<u8> = to_vec(&o).unwrap();
        let o2: T = from_slice(&buf).unwrap();
        assert_eq!(o, o2);
    }
    
    fn main() {
        check_serde("wait"); // [E0279]
    }
    

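    The problem here is that o2 cannot be of type T: o2 refers to buf, which is a local variable, but a type parameter cannot be inferred to be a type constrained by a lifetime that is restricted to the function's body. We would like T to be something like &str without a specific lifetime attached to it.

    With generic associated types, this could be solved with something like the following (obviously I can't test it, since the feature isn't implemented yet):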

    trait SerdeFamily {
        type Member<'a>: Debug + for<'b> PartialEq<Self::Member<'b>> + Serialize + Deserialize<'a>;
    }
    
    struct I32Family;
    struct StrFamily;
    
    impl SerdeFamily for I32Family {
        type Member<'a> = i32; // ignoring a parameter is allowed
    }
    
    impl SerdeFamily for StrFamily {
        type Member<'a> = &'a str;
    }
    
    pub fn check_serde<'a, Family>(o: Family::Member<'a>)
    where
        Family: SerdeFamily,
    {
        let buf: Vec<u8> = to_vec(&o).unwrap();
        // `o2` is of type `Family::Member<'b>`
        // with a lifetime 'b different from 'a
        let o2: Family::Member<'_> = from_slice(&buf).unwrap();
        assert_eq!(o, o2);
    }
    
    fn main() {
        check_serde::<I32Family>(5);
        check_serde::<StrFamily>("wait");
    }
    

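    【Comments】:

    • It may be worth revisiting this answer, since enough of GATs should be implemented on nightly for this by now. Unfortunately, even with some tweaks, the given code does not work. This is the best I could do so far, and it fails to compile.
    【Answer 2】:

    The answer from Francis Gagné shows that we cannot do this efficiently without generic associated types. Establishing deep ownership of the deserialized object is a possible workaround, which I describe here.

    The third attempt is very close to a flexible solution, but it falls short because of how std::borrow::ToOwned works. The trait is not suitable for retrieving a deeply owned version of an object. For instance, using the implementation of ToOwned for &str gives you another string slice.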

    let a: &str = "hello";
    let b: String = (&a).to_owned(); // expected String, got &str
    

    Likewise, the Owned type for a struct containing string slices cannot be a struct containing Strings. In code:

    #[derive(Debug, PartialEq, Serialize, Deserialize)]
    struct Foo<'a>(&'a str, i32);
    
    #[derive(Debug, PartialEq, Serialize, Deserialize)]
    struct FooOwned(String, i32);
    

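    We cannot implement ToOwned for Foo to provide FooOwned because:

    • If we derive Clone, the implementation of ToOwned for T: Clone only applies with Owned = Self.
    • Even with a custom implementation of ToOwned, the trait requires that the owned type can be borrowed as the original type (due to the constraint Owned: Borrow<Self>). That is, we would have to be able to retrieve a &Foo(&str, i32) out of a FooOwned, but their internal structures differ, so this is not attainable.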
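
The first point can be checked with the standard library alone. The following is a minimal sketch (the helper names `shallow_copy` and `owned_str` are made up for illustration) showing that `to_owned` on a `Clone` struct returns the same borrowing type, whereas `str` is a case where `ToOwned` really produces a different, owned type.

```rust
// Deriving `Clone` brings in the blanket `impl<T: Clone> ToOwned for T`,
// whose `Owned` type is always `Self`.
#[derive(Debug, Clone, PartialEq)]
struct Foo<'a>(&'a str, i32);

// `to_owned` on `Foo<'a>` yields another `Foo<'a>` that still borrows the string.
fn shallow_copy<'a>(foo: &Foo<'a>) -> Foo<'a> {
    foo.to_owned()
}

// For `str`, `ToOwned::Owned` is `String`, so the data is actually copied.
fn owned_str(s: &str) -> String {
    s.to_owned()
}

fn main() {
    let text = String::from("There's more!");
    let foo = Foo(&text, 36);

    let copy = shallow_copy(&foo);
    assert_eq!(copy, foo);
    // The copy still points at the same string data as `text`.
    assert_eq!(copy.0.as_ptr(), text.as_ptr());

    let owned = owned_str("hello");
    assert_eq!(owned, String::from("hello"));
}
```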

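    This means that, in order to follow the third approach, we need a different trait. Let's define a new trait, ToDeeplyOwned, which turns an object into a fully owned one, with no slices or references involved.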

    pub trait ToDeeplyOwned {
        type Owned;
        fn to_deeply_owned(&self) -> Self::Owned;
    }
    

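    The intent here is to produce a deep copy of anything. There doesn't seem to be an easy catch-all implementation, but some tricks are possible. First, we can implement it for all reference types &T where T: ToDeeplyOwned.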

    impl<'a, T: ?Sized + ToDeeplyOwned> ToDeeplyOwned for &'a T {
        type Owned = T::Owned;
        fn to_deeply_owned(&self) -> Self::Owned {
            (**self).to_deeply_owned()
        }
    }
    

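    At this point, we have to implement it selectively for the non-reference types where we know it is fine to do so. I wrote a macro to make this process less verbose; it uses to_owned() internally.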

    macro_rules! impl_deeply_owned {
        ($t: ty, $t2: ty) => { // turn $t into $t2
            impl ToDeeplyOwned for $t {
                type Owned = $t2;
                fn to_deeply_owned(&self) -> Self::Owned {
                    self.to_owned()
                }
            }
        };
        ($t: ty) => { // turn $t into itself, self-contained type
            impl ToDeeplyOwned for $t {
                type Owned = $t;
                fn to_deeply_owned(&self) -> Self::Owned {
                    self.to_owned()
                }
            }
        };
    }
    

    For the examples in the question to work, we need at least these:

    impl_deeply_owned!(i32);
    impl_deeply_owned!(String);
    impl_deeply_owned!(Vec<i32>);
    impl_deeply_owned!(str, String);
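
As a standalone illustration of how the trait composes, here is a sketch that repeats the trait and the reference implementation from above, writes the `str` and `i32` cases by hand instead of through the macro, and adds a tuple implementation. The tuple implementation is not part of this answer; it is a hypothetical extension, shown only to suggest how a composite such as `("There's more!", 36)` from the question could be deeply owned.

```rust
pub trait ToDeeplyOwned {
    type Owned;
    fn to_deeply_owned(&self) -> Self::Owned;
}

// References delegate to the type they point to.
impl<'a, T: ?Sized + ToDeeplyOwned> ToDeeplyOwned for &'a T {
    type Owned = T::Owned;
    fn to_deeply_owned(&self) -> Self::Owned {
        (**self).to_deeply_owned()
    }
}

// Hand-written equivalents of `impl_deeply_owned!(str, String)`
// and `impl_deeply_owned!(i32)`.
impl ToDeeplyOwned for str {
    type Owned = String;
    fn to_deeply_owned(&self) -> String {
        self.to_owned()
    }
}

impl ToDeeplyOwned for i32 {
    type Owned = i32;
    fn to_deeply_owned(&self) -> i32 {
        *self
    }
}

// Hypothetical extension (an assumption, not from the answer):
// a pair is deeply owned by deeply owning each component.
impl<A: ToDeeplyOwned, B: ToDeeplyOwned> ToDeeplyOwned for (A, B) {
    type Owned = (A::Owned, B::Owned);
    fn to_deeply_owned(&self) -> Self::Owned {
        (self.0.to_deeply_owned(), self.1.to_deeply_owned())
    }
}

fn main() {
    // `&&str` goes through the reference impl down to `str`.
    let nested: &&str = &"wait";
    let s: String = nested.to_deeply_owned();
    assert_eq!(s, String::from("wait"));

    // `(&str, i32)` becomes `(String, i32)`.
    let pair: (&str, i32) = ("There's more!", 36);
    let owned: (String, i32) = pair.to_deeply_owned();
    assert_eq!(owned, (String::from("There's more!"), 36));
}
```

Using such a tuple with check_serde would additionally require a `PartialEq<(String, i32)>` implementation for `(&str, i32)`, which cannot be added outside the standard library because of the orphan rule; a wrapper type like Foo avoids that.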
    

    Once we implement the necessary traits on Foo/FooOwned and adapt check_serde to use the new trait, the code compiles and runs successfully (Playground):

    #[derive(Debug, PartialEq, Serialize)]
    struct Foo<'a>(&'a str, i32);
    
    #[derive(Debug, PartialEq, Clone, Deserialize)]
    struct FooOwned(String, i32);
    
    impl<'a> ToDeeplyOwned for Foo<'a> {
        type Owned = FooOwned;
    
        fn to_deeply_owned(&self) -> FooOwned {
            FooOwned(self.0.to_string(), self.1)
        }
    }
    
    impl<'a> PartialEq<FooOwned> for Foo<'a> {
        fn eq(&self, o: &FooOwned) -> bool {
            self.0 == o.0 && self.1 == o.1
        }
    }
    
    pub fn check_serde<'a, T: ?Sized>(o: &'a T)
    where
        T: Debug + ToDeeplyOwned + PartialEq<<T as ToDeeplyOwned>::Owned> + Serialize,
        <T as ToDeeplyOwned>::Owned: Debug + DeserializeOwned,
    {
        let buf: Vec<u8> = to_vec(&o).unwrap();
        let o2: T::Owned = from_slice(&buf).unwrap();
        assert_eq!(o, &o2);
    }
    
    // all of these are ok
    check_serde(&5);
    check_serde(&vec![1, 2, 5]);
    check_serde(&"five".to_string());
    check_serde("wait");
    check_serde(&"wait");
    check_serde(&Foo("There's more!", 36));
    

      【Answer 3】:

      Update (04.09.2021):

      The latest nightly has some fixes to GATs, which basically allow the original example:

      #![feature(generic_associated_types)]
      
      use serde::{Deserialize, Serialize};
      use serde_json::{from_slice, to_vec};
      use std::fmt::Debug;
      
      trait SerdeFamily {
          type Member<'a>:
              Debug +
              for<'b> PartialEq<Self::Member<'b>> +
              Serialize +
              Deserialize<'a>;
      }
      
      struct I32Family;
      struct StrFamily;
      
      impl SerdeFamily for I32Family {
          type Member<'a> = i32;
      }
      
      impl SerdeFamily for StrFamily {
          type Member<'a> = &'a str;
      }
      
      fn check_serde<F: SerdeFamily>(o: F::Member<'_>) {
          let buf: Vec<u8> = to_vec(&o).unwrap();
          let o2: F::Member<'_> = from_slice(&buf).unwrap();
          assert_eq!(o, o2);
      }
      
      fn main() {
          check_serde::<I32Family>(5);
          check_serde::<StrFamily>("wait");
      }
      

      The example above now compiles: playground


      As of now, it is possible to implement this on Rust nightly (using an explicit variance workaround):

      #![feature(generic_associated_types)]
      
      use serde::{Deserialize, Serialize};
      use serde_json::{from_slice, to_vec};
      use std::fmt::Debug;
      
      trait SerdeFamily {
          type Member<'a>: Debug + PartialEq + Serialize + Deserialize<'a>;
          
          // https://internals.rust-lang.org/t/variance-of-lifetime-arguments-in-gats/14769/19
          fn upcast_gat<'short, 'long: 'short>(long: Self::Member<'long>) -> Self::Member<'short>;
      }
      
      struct I32Family;
      struct StrFamily;
      
      impl SerdeFamily for I32Family {
          type Member<'a> = i32; // we can ignore parameters
      
          fn upcast_gat<'short, 'long: 'short>(long: Self::Member<'long>) -> Self::Member<'short> {
              long
          }
      }
      
      impl SerdeFamily for StrFamily {
          type Member<'a> = &'a str;
      
          fn upcast_gat<'short, 'long: 'short>(long: Self::Member<'long>) -> Self::Member<'short> {
              long
          }
      }
      
      fn check_serde<F: SerdeFamily>(o: F::Member<'_>) {
          let buf: Vec<u8> = to_vec(&o).unwrap();
          let o2: F::Member<'_> = from_slice(&buf).unwrap();
          assert_eq!(F::upcast_gat(o), o2);
      }
      
      fn main() {
          check_serde::<I32Family>(5);
          check_serde::<StrFamily>("wait");
      }
      

      Playground

        【Answer 4】:

        A simple (but slightly awkward) solution: provide buf from outside the function.

        pub fn check_serde<'a, T>(o: &'a T, buf: &'a mut Vec<u8>)
        where
            T: Debug + PartialEq<T> + Serialize + Deserialize<'a>,
        {
            *buf = to_vec(o).unwrap();
            let o2: T = from_slice(buf).unwrap();
            assert_eq!(o, &o2);
        }
        

        buf can be reused with a Cursor:

        pub fn check_serde_with_cursor<'a, T>(o: &'a T, buf: &'a mut Vec<u8>)
        where
            T: Debug + PartialEq<T> + Serialize + Deserialize<'a>,
        {
            buf.clear();
            let mut cursor = Cursor::new(buf);
            to_writer(&mut cursor, o).unwrap();
            let o2: T = from_slice(cursor.into_inner()).unwrap();
            assert_eq!(o, &o2);
        }
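
To show what the call site looks like, here is a Serde-free sketch of the same pattern. The `Encode` and `Decode<'de>` traits and the `check_roundtrip` function are hypothetical stand-ins for `Serialize`, `Deserialize<'de>` and `check_serde`, introduced only so the example runs with the standard library. The point is that the caller owns the buffer, so a borrowed value such as `&str` can be decoded out of it.

```rust
use std::fmt::Debug;

// Hypothetical stand-ins for Serde's `Serialize` and `Deserialize<'de>`.
trait Encode {
    fn encode(&self) -> Vec<u8>;
}
trait Decode<'de>: Sized {
    fn decode(buf: &'de [u8]) -> Self;
}

impl<'a> Encode for &'a str {
    fn encode(&self) -> Vec<u8> {
        self.as_bytes().to_vec()
    }
}
impl<'de: 'a, 'a> Decode<'de> for &'a str {
    fn decode(buf: &'de [u8]) -> Self {
        std::str::from_utf8(buf).unwrap()
    }
}

impl Encode for String {
    fn encode(&self) -> Vec<u8> {
        self.as_bytes().to_vec()
    }
}
impl<'de> Decode<'de> for String {
    fn decode(buf: &'de [u8]) -> Self {
        String::from_utf8(buf.to_vec()).unwrap()
    }
}

// Same shape as `check_serde` above: the buffer is borrowed for `'a`,
// so the decoded value may borrow from it.
fn check_roundtrip<'a, T>(o: &'a T, buf: &'a mut Vec<u8>)
where
    T: Debug + PartialEq + Encode + Decode<'a>,
{
    *buf = o.encode();
    let o2: T = T::decode(buf.as_slice());
    assert_eq!(o, &o2);
}

fn main() {
    // The caller owns the buffer and can reuse it across calls.
    let mut buf = Vec::new();
    check_roundtrip(&"wait", &mut buf);
    check_roundtrip(&String::from("five"), &mut buf);
}
```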
        
