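【Question Title】: TensorFlow estimator accuracy and loss are zero
【Posted】: 2019-04-11 17:29:22
【Question Description】:

My model evaluates to an accuracy and a loss of 0.
The global step should be 1625, but it is 1.
Accuracy and loss should not both be 0, because the two values contradict each other (a loss of 0 would imply perfect predictions).

My input function, Keras estimator, and train_and_evaluate are as follows: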

def make_input_fn(addrs, labels, batch_size, mode):
    # Build a dataset of (file path, label) pairs.
    filename_dataset = tf.data.Dataset.from_tensor_slices((addrs, labels))

    # Load and batch the images; `process` is the image-loading function (not shown in the post).
    dataset = filename_dataset.apply(
        tf.contrib.data.map_and_batch(
            lambda addrs, labels: tuple(
                tf.py_func(process, [addrs, labels], [tf.uint8, labels.dtype])),
            batch_size,
            num_parallel_batches=2,
            drop_remainder=False))

    if mode == tf.estimator.ModeKeys.TRAIN:
        num_epochs = None  # indefinitely
        dataset = dataset.apply(
            tf.contrib.data.shuffle_and_repeat(buffer_size=10000))
    else:
        num_epochs = 1
        dataset = dataset.repeat(num_epochs)

    dataset = dataset.prefetch(buffer_size=batch_size)
    images, labels = dataset.make_one_shot_iterator().get_next()
    images.set_shape([None, 512, 512, 3])
    labels.set_shape([None, 1])
    return images, labels

def keras_estimator(model_dir, config):
    # Xception backbone pretrained on ImageNet, without its classification head.
    base_model = Xception(weights='imagenet', include_top=False,
                          input_shape=(512, 512, 3), classes=5)
    x = base_model.output
    x = GlobalAveragePooling2D()(x)

    x = Dense(1024, activation='relu')(x)
    x = Dropout(0.2)(x)
    x = Dense(256, activation='relu')(x)
    x = Dropout(0.2)(x)

    # 5-way softmax classification head.
    predictions = Dense(5, activation='softmax')(x)

    model = Model(inputs=base_model.input, outputs=predictions)

    # Freeze the backbone so that only the new dense layers are trained.
    for layer in base_model.layers:
        layer.trainable = False
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
                  metrics=['acc'])

    estimator = tf.keras.estimator.model_to_estimator(
        keras_model=model, model_dir=model_dir, config=config)
    return estimator

def train_and_evaluate(model_dir):
    t_batch_size = 512
    e_batch_size = 64
    num_epochs = 25
    import pandas as pd
    df = pd.read_csv('/content/trainLabels.csv')
    from random import shuffle
    addrs = ['/content/train/train/' + str(df.iloc[i]['image']) + '.jpeg'
             for i in range(len(df))]
    labels = df['level'].values.tolist()
    # Shuffle paths and labels together, then split 90/10 into train and validation.
    c = list(zip(addrs, labels))
    shuffle(c)
    addrs1, labels1 = zip(*c)
    train_addrs = addrs1[0 : int(0.9 * len(addrs))]
    train_labels = labels1[0 : int(0.9 * len(labels))]
    val_addrs = addrs1[int(0.9 * len(addrs)) :]
    val_labels = labels1[int(0.9 * len(addrs)) :]
    train_addrs = list(train_addrs)
    train_labels = list(train_labels)
    val_addrs = list(val_addrs)
    val_labels = list(val_labels)

    run_config = tf.estimator.RunConfig(save_checkpoints_secs=300)

    estimator = keras_estimator(model_dir, run_config)

    # Full batches per epoch times the number of epochs.
    t_max_steps = (len(train_addrs) // t_batch_size) * num_epochs

    train_spec = tf.estimator.TrainSpec(
        input_fn=lambda: make_input_fn(
            train_addrs, train_labels, t_batch_size,
            mode=tf.estimator.ModeKeys.TRAIN),
        max_steps=t_max_steps)

    eval_spec = tf.estimator.EvalSpec(
        input_fn=lambda: make_input_fn(
            val_addrs, val_labels, e_batch_size,
            mode=tf.estimator.ModeKeys.EVAL),
        steps=None,
        start_delay_secs=10,
        throttle_secs=300)

    tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
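
The step arithmetic in train_and_evaluate can be checked in isolation. The sketch below uses a hypothetical helper, `expected_max_steps` (not part of the original post), that mirrors the `t_max_steps` expression. With `num_epochs = 25`, the expected 1625 steps correspond to 65 full batches per epoch. The same expression evaluates to 0 whenever fewer than `t_batch_size` training files are available, which may be worth ruling out.

```python
def expected_max_steps(num_examples, batch_size, num_epochs):
    """Mirror of t_max_steps: full batches per epoch times number of epochs."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Integer division: a trailing partial batch is not counted.
    steps_per_epoch = num_examples // batch_size
    return steps_per_epoch * num_epochs


# 65 full batches of 512 over 25 epochs gives the 1625 steps the question expects.
print(expected_max_steps(65 * 512, 512, 25))
# Fewer training files than one batch gives max_steps == 0.
print(expected_max_steps(100, 512, 25))
```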

Here is the log file:

INFO:tensorflow:Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 300.
WARNING:tensorflow:From :9: map_and_batch (from tensorflow.contrib.data.python.ops.batching) is deprecated and will be removed in a future version.
Instructions for updating: Use tf.data.experimental.map_and_batch(...).
WARNING:tensorflow:From :12: shuffle_and_repeat (from tensorflow.contrib.data.python.ops.shuffle_ops) is deprecated and will be removed in a future version.
Instructions for updating: Use tf.data.experimental.shuffle_and_repeat(...).
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Warm-starting with WarmStartSettings: WarmStartSettings(ckpt_to_initialize_from='/content/training/keras/keras_model.ckpt', vars_to_warm_start='.*', var_name_to_vocab_info={}, var_name_to_prev_var_name={})
INFO:tensorflow:Warm-starting from: ('/content/training/keras/keras_model.ckpt',)
INFO:tensorflow:Warm-starting variable: dense/kernel; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense/bias; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_1/kernel; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_1/bias; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_2/kernel; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: dense_2/bias; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/iterations; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/lr; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/beta_1; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/beta_2; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: Adam/decay; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_1; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_2; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_3; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_4; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_5; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_6; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_7; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_8; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_9; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_10; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_11; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_12; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_13; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_14; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_15; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_16; prev_var_name: Unchanged
INFO:tensorflow:Warm-starting variable: training/Adam/Variable_17; prev_var_name: Unchanged
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /content/training/model.ckpt.
INFO:tensorflow:Saving checkpoints for 1 into /content/training/model.ckpt.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-11-05-13:21:17
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /content/training/model.ckpt-1
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Finished evaluation at 2018-11-05-13:22:08
INFO:tensorflow:Saving dict for global step 1: acc = 0.0, global_step = 1, loss = 0.0
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 1: /content/training/model.ckpt-1
INFO:tensorflow:Loss for final step: None.

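【Comments】:

  • Welcome to SO. Have you tried debugging to narrow down the possible problem? Right now this is close to "my code is broken, please fix it", which is unlikely to attract answers. Try to work out what you think is causing the accuracy and loss problem, then edit the question to ask something specific about the relevant part of your code.
  • I can't figure out how to debug my code. Everything seems fine.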
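
One way to start narrowing the problem down is to validate the inputs before they reach the estimator. As a minimal sketch, the hypothetical helper below (not from the original post) checks that every label is an integer class index in `[0, num_classes)`, which is what `sparse_categorical_crossentropy` with a 5-unit softmax output expects. Whether the `level` column satisfies this is an assumption to verify, not something the post confirms.

```python
from collections import Counter


def check_sparse_labels(labels, num_classes):
    """Return per-class counts; raise if a label is not a valid class index."""
    bad = [y for y in labels
           if not (isinstance(y, int) and 0 <= y < num_classes)]
    if bad:
        raise ValueError("invalid label found, e.g. %r" % (bad[0],))
    return Counter(labels)
```

In the question's code this could be called as `check_sparse_labels(train_labels, 5)`; the returned counts also show how the classes are distributed.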

Tags: python tensorflow deep-learning tensorflow-datasets tensorflow-estimator


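【Solution 1】:

I ran into this problem before. It happened because I had specified the wrong directory for the dataset, so TensorFlow ended up with no input data. I hope this helps.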
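
A quick way to rule out this cause before training is to check that the file paths passed to the input function actually exist. The following is a minimal sketch using only the standard library; `find_missing_files` and `check_dataset_paths` are hypothetical helpers, not part of TensorFlow or the original post.

```python
import os


def find_missing_files(paths):
    """Return the subset of `paths` that does not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


def check_dataset_paths(paths):
    """Raise early if the list is empty or any file is missing."""
    if not paths:
        raise ValueError("no input files were given")
    missing = find_missing_files(paths)
    if missing:
        raise FileNotFoundError(
            "%d of %d files not found, e.g. %s"
            % (len(missing), len(paths), missing[0]))
    return len(paths)
```

In the question's code this could be called as `check_dataset_paths(train_addrs)` and `check_dataset_paths(val_addrs)` before the `TrainSpec` and `EvalSpec` are built.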

