How to build a model using multiple features in TensorFlow Federated?

【Question title】: How to build a model using multiple features in Tensorflow Federated?
【Posted】: 2020-07-14 10:52:40
【Question】:

I ran into a problem with the code below while trying to create an OrderedDict for multiple feature inputs (features a-g) and a single label h.


def preprocess(dataset):

  def batch_format_fn(element):

    return collections.OrderedDict(
        x=collections.OrderedDict(
            a=tf.TensorSpec(shape=[None,], dtype=tf.int32),
            b=tf.TensorSpec(shape=[None,], dtype=tf.int32),
            c=tf.TensorSpec(shape=[None,], dtype=tf.int32),
            d=tf.TensorSpec(shape=[None,], dtype=tf.int32),
            e=tf.TensorSpec(shape=[None,], dtype=tf.int32),
            f=tf.TensorSpec(shape=[None,], dtype=tf.int32),
            g=tf.TensorSpec(shape=[None,], dtype=tf.int32)),
        y=tf.TensorSpec(shape=[None,], dtype=tf.int32))
  return dataset.map(batch_format_fn).prefetch(PREFETCH_BUFFER)

preprocessed_sample_dataset = preprocess(example_dataset)

def create_keras_model():
    model = Sequential([
    feature_layer,
    Dense(64, activation='relu'),
    Dense(64, activation='relu'),
    Dense(3, activation='softmax') #classification 3 outputs
    ])
    return model

def model_fn():

  keras_model = create_keras_model()
  return tff.learning.from_keras_model(
      keras_model,
      input_spec=preprocessed_sample_dataset.element_spec,
      loss=losses.SparseCategoricalCrossentropy(),
      metrics=[metrics.SparseCategoricalAccuracy()])

When input_spec=preprocessed_sample_dataset.element_spec is executed, the following error is shown:

TypeError: Unsupported return value from function passed to Dataset.map(): OrderedDict([('x', OrderedDict([('a', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), ('b', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), ('c', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), ('d', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), ('e', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), ('f', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), ('g', TensorSpec(shape=(None,), dtype=tf.int32, name=None))])), ('y', TensorSpec(shape=(None,), dtype=tf.int32, name=None))]).

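I have read this alternative solution, but it is not clear to me how to apply it in my case. So how do I correctly set up an ordered dict for multiple features in TFF?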

The current example_dataset.element_spec is as follows:

OrderedDict([
('a', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('b', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('c', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('d', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('e', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('f', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('g', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('y', TensorSpec(shape=(None,), dtype=tf.int32, name=None))])

I would like the element_spec to become this:

OrderedDict([('x', OrderedDict([
('a', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('b', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('c', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('d', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('e', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('f', TensorSpec(shape=(None,), dtype=tf.int32, name=None)), 
('g', TensorSpec(shape=(None,), dtype=tf.int32, name=None))])), 
('y', TensorSpec(shape=(None,), dtype=tf.int32, name=None))])

How can I use batch_format_fn to make the element_spec take the latter form?

【Comments】:

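Looking at batch_format_fn, the return value appears to have nothing to do with the argument element; I suspect you would get the same error if you simply iterated over the dataset. Perhaps the returned OrderedDict should be a reshaping of element?
@KeithRush How would the returned OrderedDict reshape the element?
I can give an expanded answer below, but could you update the question with the element_spec attribute of example_dataset? I believe we need that to know how to reshape it.
@KeithRush I have updated the question with the element_spec.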

Tags: python tensorflow tensorflow-federated

【Solution 1】:

batch_format_fn currently returns a structure of tensor types (tf.TensorSpec objects); tf.data.Dataset.map expects the function to return a structure of actual tensors.
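To make the distinction concrete, here is a minimal sketch using plain tf.data, with no TFF involved. The toy dataset, its values, and the helper names returns_specs and returns_tensors are made up for illustration. It assumes that a function returning TensorSpec objects is rejected by Dataset.map, as the error in the question shows, while a function returning tensors taken from element is accepted.

```python
import collections

import tensorflow as tf

# Hypothetical toy dataset: one feature 'a' and one label 'y'.
ds = tf.data.Dataset.from_tensor_slices(
    collections.OrderedDict(a=[1, 2, 3], y=[0, 1, 0])).batch(2)


def returns_specs(element):
  # Mirrors the question: ignores `element` and returns type descriptions.
  return collections.OrderedDict(
      x=tf.TensorSpec(shape=[None], dtype=tf.int32),
      y=tf.TensorSpec(shape=[None], dtype=tf.int32))


def returns_tensors(element):
  # Returns actual tensors taken from `element`.
  return collections.OrderedDict(x=element['a'], y=element['y'])


try:
  ds.map(returns_specs)
  specs_error = None
except (TypeError, ValueError) as err:
  # The question reports a TypeError; ValueError is caught only as a
  # precaution in case another TensorFlow version differs.
  specs_error = err

fixed = ds.map(returns_tensors)
print('error from returning specs:', type(specs_error).__name__)
print('element_spec from returning tensors:', fixed.element_spec)
```

The specs are something the dataset reports about itself after mapping; the mapped function never has to construct them.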

We should update batch_format_fn to reformat its element argument and return the result. Let's try something like:

def batch_format_fn(element):
  feature_dict = collections.OrderedDict(
      a=element['a'],
      b=element['b'],
      c=element['c'],
      d=element['d'],
      e=element['e'],
      f=element['f'],
      g=element['g'],
  )
  return collections.OrderedDict(x=feature_dict, y=element['y'])

and leave everything else the same.
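As a quick check of this approach, the sketch below builds a small stand-in for example_dataset, applies a reformatting function equivalent to the one above, and inspects the resulting element_spec. The sample values, the batch size, and the FEATURE_NAMES constant are assumptions for illustration; only the key names a-g and y come from the question.

```python
import collections

import tensorflow as tf

FEATURE_NAMES = ['a', 'b', 'c', 'd', 'e', 'f', 'g']

# Hypothetical stand-in for the asker's example_dataset: flat int32 columns.
raw = collections.OrderedDict((name, [1, 2, 3, 4]) for name in FEATURE_NAMES)
raw['y'] = [0, 1, 2, 0]
example_dataset = tf.data.Dataset.from_tensor_slices(raw).batch(2)


def batch_format_fn(element):
  # Nest the feature tensors under 'x' and keep the label under 'y'.
  features = collections.OrderedDict(
      (name, element[name]) for name in FEATURE_NAMES)
  return collections.OrderedDict(x=features, y=element['y'])


preprocessed = example_dataset.map(batch_format_fn)
spec = preprocessed.element_spec
print(spec)
```

The printed spec should have the nested x / y layout that the question asks for, so it can be passed as input_spec in model_fn.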

【Comments】:

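I have updated my question again. The current element_spec is missing the ('x', OrderedDict([ level. I would like the current element_spec to match the input_spec so that it works. In fact, I tried using batch_format_fn(self), but the same error still occurs. How can I get around it? I need to include .map(batch_format_fn).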