What does 'with strategy.scope():' or 'with tf.distribute.experimental.TPUStrategy(tpu).scope():' do to the creation of a NN? (Answers)

【Question title】: What does 'with strategy.scope():' or 'with tf.distribute.experimental.TPUStrategy(tpu).scope():' do to the creation of a NN?
【Posted】: 2020-12-18 14:21:34
【Question description】:

In the code here: https://www.kaggle.com/ryanholbrook/detecting-the-higgs-boson-with-tpus

Before the model is compiled, it is built with the following code:

with strategy.scope():
    # Wide Network
    wide = keras.experimental.LinearModel()

    # Deep Network
    inputs = keras.Input(shape=[28])
    x = dense_block(UNITS, ACTIVATION, DROPOUT)(inputs)
    x = dense_block(UNITS, ACTIVATION, DROPOUT)(x)
    x = dense_block(UNITS, ACTIVATION, DROPOUT)(x)
    x = dense_block(UNITS, ACTIVATION, DROPOUT)(x)
    x = dense_block(UNITS, ACTIVATION, DROPOUT)(x)
    outputs = layers.Dense(1)(x)
    deep = keras.Model(inputs=inputs, outputs=outputs)
    
    # Wide and Deep Network
    wide_and_deep = keras.experimental.WideDeepModel(
        linear_model=wide,
        dnn_model=deep,
        activation='sigmoid',
    )

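I don't understand what with strategy.scope() does here, or whether it affects the model in any way. What exactly does it do?

In the future, how can I figure out what something like this does? Which resources should I look at to work it out?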

【Question comments】:

Tags: tensorflow tensorflow2.0 tpu

【Solution 1】:

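Distribution strategies were introduced in TF2 to help distribute training across multiple GPUs, multiple machines, or TPUs with minimal code changes. I'd recommend this guide to distributed training for starters.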

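Creating a model under TPUStrategy specifically places the model on the TPU in a replicated manner (the same weights on each core), and keeps the replica weights in sync by adding the appropriate collective communications (all-reducing the gradients). For more information, check the API doc on TPUStrategy as well as this intro to TPUs in TF2 colab notebook.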
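
To make the "replicated weights kept in sync by all-reducing the gradients" idea concrete, here is a toy simulation in plain Python. It is not TensorFlow code and not how TPUStrategy is implemented internally; the names Replica, all_reduce_mean and train_step are made up for illustration, and the model is a one-parameter linear fit. It only sketches the pattern: identical copies of the weights, per-replica gradients on separate shards of the batch, an averaging step, and the same update applied everywhere.

```python
# Toy simulation of synchronous data-parallel training (NOT TensorFlow code).
# Replica, all_reduce_mean and train_step are hypothetical names for illustration.

class Replica:
    """One copy of a 1-parameter linear model y = w * x."""

    def __init__(self, w):
        self.w = w

    def gradient(self, xs, ys):
        # d/dw of the mean squared error over this replica's shard of the batch
        n = len(xs)
        return sum(2 * (self.w * x - y) * x for x, y in zip(xs, ys)) / n


def all_reduce_mean(values):
    """Stand-in for the collective op: every replica receives the same mean."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def train_step(replicas, shards, lr=0.01):
    # Each replica computes a gradient on its own shard of the global batch
    local_grads = [r.gradient(xs, ys) for r, (xs, ys) in zip(replicas, shards)]
    # Gradients are averaged across replicas, so all replicas see one value
    synced = all_reduce_mean(local_grads)
    # Every replica applies the identical update, so the weights stay in sync
    for r, g in zip(replicas, synced):
        r.w -= lr * g


# Building the model "under the scope" is modelled here as making identical copies
replicas = [Replica(w=0.0) for _ in range(4)]

# Data for y = 3x, split into one shard per replica
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [3.0 * x for x in xs]
shards = [(xs[i::4], ys[i::4]) for i in range(4)]

for _ in range(200):
    train_step(replicas, shards)

weights = [r.w for r in replicas]
print(weights)  # all four entries are identical and close to 3.0
```

In real code the strategy object does this bookkeeping for you, which is why the variables have to be created inside the scope: that is how the strategy knows which variables to replicate and keep synchronized.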

【Comments】: