【Question title】: Problem with "Regression with Probabilistic Layers in TensorFlow Probability"
【Posted】: 2020-06-10 10:15:16
【Question description】:

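I'm having trouble with tfp.layers.DistributionLambda. I'm new to TF and struggling to make the tensors flow. Can someone provide some insight into how the parameters of the output distribution are set?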

Context:

The TFP team wrote a tutorial, "Regression with Probabilistic Layers in TensorFlow Probability", which builds the following model:

# Build model.
model = tfk.Sequential([
  tf.keras.layers.Dense(1 + 1),
  tfp.layers.DistributionLambda(
      lambda t: tfd.Normal(loc=t[..., :1],
                           scale=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))),
])

My question:

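It uses tfp.layers.DistributionLambda to output a normal distribution, but it isn't clear to me how the parameters of tfd.Normal (mean/loc and standard deviation/scale) are being set, so I'm unable to change the Normal to a Gamma distribution. I tried the following, but it did not work (the predicted distribution parameters are nan).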

def dist_output_layer (t, softplus_scale=0.05):
    """Create distribution with variable mean and variance
    """
    mean = t[..., :1]
    std_dev = 1e-3 + tf.math.softplus(softplus_scale * mean)

    alpha = (mean/std_dev)**2
    beta = alpha/mean

    return tfd.Gamma(concentration = alpha, 
                     rate = beta
                    )

# Build model.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(20,activation="relu"), # "By using a deeper neural network and introducing nonlinear activation functions, however, we can learn more complicated functional dependencies!"
    tf.keras.layers.Dense(1 + 1), #two neurons here b/c the output layer's distribution's mean and std. deviation
    tfp.layers.DistributionLambda(dist_output_layer)
])

Thanks a lot.

【Question comments】:

    Tags: python tensorflow tensorflow-probability


    【Solution 1】:

    To be honest, there is a lot to be said about the code snippet you pasted from Medium.

    Still, I hope you find my comments below somewhat useful.

    # Build model.
    model = tfk.Sequential([
    
        # The first layer is a Dense layer with 2 units, one for each of the parameters that will
        # be learnt (see next layer). Its implied output shape is (batch_size, 2).
        # Note that this Dense layer has no activation function, because we want unconstrained real
        # values that will be used to parameterize the Normal distribution in the following
        # layer.
        tf.keras.layers.Dense(1 + 1),
    
        # The following layer is a DistributionLambda that encapsulates a Normal distribution. The
        # DistributionLambda takes a function in its constructor, and this function should take the output
        # tensor from the previous layer as its input (that is, the Dense layer described above).
        # The goal is to learn the 2 parameters of the distribution: loc (the mean) and scale (the standard
        # deviation). For this, a lambda construct is used. The ellipsis (the 3 dots) in the loc and scale
        # arguments stands for all leading dimensions, here the batch dimension, so t[..., :1] is the
        # first unit and t[..., 1:] is the second. Also note that scale (the standard deviation)
        # must be positive. The softplus function is used to make sure that the learnt scale stays
        # positive.
        tfp.layers.DistributionLambda(
          lambda t: tfd.Normal(loc=t[..., :1],
                           scale=1e-3 + tf.math.softplus(0.05 * t[..., 1:]))),
    ]) 
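To connect this back to the Gamma attempt in the question, here is a minimal sketch in plain Python (standard library only, no TensorFlow) of the parameter arithmetic for a single output row. The helper names `softplus`, `gamma_params_naive`, and `gamma_params_constrained` are made up for illustration. It assumes the nan comes from the raw mean being unconstrained: a negative raw mean gives a negative rate, which is not a valid Gamma parameter. That is one plausible cause, not a confirmed diagnosis.

```python
import math


def softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def gamma_params_naive(raw_mean, softplus_scale=0.05):
    """Mirror the question's arithmetic for one raw Dense output."""
    mean = raw_mean  # unconstrained, so it can be negative
    std_dev = 1e-3 + softplus(softplus_scale * mean)  # reuses the mean unit
    alpha = (mean / std_dev) ** 2
    beta = alpha / mean  # negative whenever raw_mean < 0
    return alpha, beta


def gamma_params_constrained(raw_mean, raw_std, softplus_scale=0.05):
    """Hypothetical alternative: force mean and std_dev to be positive first."""
    mean = 1e-3 + softplus(raw_mean)                     # first unit, made positive
    std_dev = 1e-3 + softplus(softplus_scale * raw_std)  # second unit, made positive
    alpha = (mean / std_dev) ** 2   # concentration = mean^2 / variance
    beta = mean / std_dev ** 2      # rate = mean / variance
    return alpha, beta


print(gamma_params_naive(-2.0))             # the rate comes out negative
print(gamma_params_constrained(-2.0, 0.5))  # both parameters are positive
```

Inside the layer, the same arithmetic would presumably apply tf.math.softplus to t[..., :1] and t[..., 1:] and pass the results as concentration and rate to tfd.Gamma. Note also that a Gamma distribution only has support on positive values, so the training targets would need to be positive as well.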
    

    【Comments】:

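    • Great explanation! Any idea where the "0.05" inside the softplus comes from? I'm having a hard time understanding it.
    • @LucasMiranda Sorry, I didn't see your comment. In any case, Winthrop Harvey offers some clues in their answer. You can also refer to this GitHub page, github.com/tensorflow/probability/issues/703, where it is discussed as well.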
    【Solution 2】:

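    On the question of adding .05: this is a small offset that avoids some gradient problems that can arise without it. Essentially, it states up front that we are confident the true variability is no smaller than epsilon (here .05), so we make sure the std dev can never be smaller by adding it.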

    https://github.com/tensorflow/probability/issues/751

    Money quote:

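    "If in practice, for a given task, infinitesimal scales end up being a problem, a workaround we commonly use is softplus-and-shift, e.g. scale = epsilon + tf.math.softplus(unconstrained_scale), where epsilon is some tiny value like 1e-5 that we are a priori confident is much smaller than the true scale."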

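    Edit: what is actually added is 1e-3, for the reason I described above. As for the multiplication... it is probably again just scaling or gradient tuning, or perhaps a way to make the scale parameter start out at a certain size.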
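As a rough numerical illustration of the softplus-and-shift idea, the sketch below evaluates scale = epsilon + softplus(multiplier * raw) in plain Python for a few raw values. The function name `scale_from_raw` and the sample inputs are assumptions chosen for illustration. The defaults 1e-3 and 0.05 are taken from the tutorial code quoted in the question.

```python
import math


def softplus(x):
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def scale_from_raw(raw, epsilon=1e-3, multiplier=0.05):
    """Softplus-and-shift: the result can never fall below epsilon."""
    return epsilon + softplus(multiplier * raw)


# Compare how the scale responds to the raw network output
# with the 0.05 multiplier and without it (multiplier=1.0).
for raw in (-1000.0, -10.0, 0.0, 10.0, 1000.0):
    damped = scale_from_raw(raw)
    undamped = scale_from_raw(raw, multiplier=1.0)
    print(f"raw={raw:8.1f}  scale(0.05)={damped:.6f}  scale(1.0)={undamped:.6f}")
```

Under these assumptions, the shift by epsilon sets a floor on the scale, while the multiplier only changes how quickly the scale moves as the raw output changes. At a raw output of 0 the scale is epsilon + ln 2 regardless of the multiplier.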

    【Comments】:
