【Question title】: CNTK Sequence model error: Different minibatch layouts detected
【Posted】: 2017-05-15 18:31:13
【Question】:

I am trying to train a model with CNTK that takes two input sequences and outputs a two-dimensional scalar label. I have defined the model like this:

def create_seq_model(num_tokens):
    with C.default_options(init=C.glorot_uniform()):
        i1 = sequence.input(shape=num_tokens, is_sparse=True, name='i1')
        i2 = sequence.input(shape=num_tokens, is_sparse=True, name='i2')
        s1 = Sequential([Embedding(300), Fold(GRU(64))])(i1)
        s2 = Sequential([Embedding(300), Fold(GRU(64))])(i2)
        combined = splice(s1, s2)
        model = Sequential([Dense(64, activation=sigmoid),
                        Dropout(0.1, seed=42),
                        Dense(2, activation=softmax)])
        return model(combined)

I have converted my data to CTF format. When I try to train with the following snippet (a slight modification of the example here), I get an error:

def train(reader, model, max_epochs=16):
    criterion = create_criterion_function(model)

    criterion.replace_placeholders({criterion.placeholders[0]: C.input(2, name='labels')})

    epoch_size = 500000
    minibatch_size=128

    lr_per_sample = [0.003]*4+[0.0015]*24+[0.0003]
    lr_per_minibatch= [x*minibatch_size for x in lr_per_sample]
    lr_schedule = learning_rate_schedule(lr_per_minibatch, UnitType.minibatch, epoch_size)

    momentum_as_time_constant = momentum_as_time_constant_schedule(700)

    learner = fsadagrad(criterion.parameters,
                   lr=lr_schedule, momentum=momentum_as_time_constant,
                   gradient_clipping_threshold_per_sample=15,
                   gradient_clipping_with_truncation=True)

    progress_printer = ProgressPrinter(freq=1000, first=10, tag='Training', num_epochs=max_epochs)

    trainer = Trainer(model, criterion, learner, progress_printer)

    log_number_of_parameters(model)

    t = 0
    for epoch in range(max_epochs):
        epoch_end = (epoch+1) * epoch_size
        while(t < epoch_end):
            data = reader.next_minibatch(minibatch_size, input_map={
                criterion.arguments[0]: reader.streams.i1,
                criterion.arguments[1]: reader.streams.i2,
                criterion.arguments[2]: reader.streams.labels
            })
            trainer.train_minibatch(data)
            t += data[criterion.arguments[1]].num_samples 
        trainer.summarize_training_progress()

The error is:

Different minibatch layouts detected (difference in sequence lengths or count or start flags) in data specified for the Function's arguments 'Input('i2', [#, *], [132033])' vs. 'Input('i1', [#, *], [132033])', though these arguments have the same dynamic axes '[*, #]'

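I noticed that if I select only examples where both input sequences have the same length, the training function works. Unfortunately, those examples make up a very small part of the data. What is the correct mechanism for handling sequences of different lengths? Do I need to pad the inputs (similar to Keras's pad_sequences())?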

【Comments】:

    Tags: python deep-learning cntk


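    【Solution 1】:

    The two sequences i1 and i2 are unintentionally being treated as having the same length. This is because the sequence_axis parameter of sequence.input(...) defaults to default_dynamic_axis(), so both inputs share one sequence axis. One way to fix this is to tell CNTK that the two sequences do not have the same length by giving each one its own unique sequence axis, like this: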

    i1_axis = C.Axis.new_unique_dynamic_axis('1')
    i2_axis = C.Axis.new_unique_dynamic_axis('2')
    i1 = sequence.input(shape=num_tokens, is_sparse=True, sequence_axis=i1_axis, name='i1')
    i2 = sequence.input(shape=num_tokens, is_sparse=True, sequence_axis=i2_axis, name='i2')
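
    To make the error message more concrete, here is a plain-Python sketch (not CNTK code) of what sharing one sequence axis implies. The helper names layout and can_share_sequence_axis and the toy token ids are hypothetical, made up for illustration; they only mimic the idea that inputs on the same dynamic axis must have matching per-sequence lengths in every minibatch.

```python
# Plain-Python illustration only; none of these helpers are CNTK APIs.

def layout(minibatch):
    """Return the per-sequence lengths of a minibatch (its 'layout')."""
    return [len(seq) for seq in minibatch]


def can_share_sequence_axis(batch_a, batch_b):
    """Two inputs can sit on one sequence axis only if their layouts match."""
    return layout(batch_a) == layout(batch_b)


# Hypothetical token ids for a minibatch of two samples.
# Within each sample, the i1 and i2 sequences differ in length.
i1_batch = [[4, 8, 15], [16, 23]]
i2_batch = [[42, 7], [3, 1, 4, 1, 5]]

print(layout(i1_batch))
print(layout(i2_batch))
print(can_share_sequence_axis(i1_batch, i2_batch))
```

    With the default shared axis, a mismatch like this one is what triggers "Different minibatch layouts detected". Giving i1 and i2 separate axes removes the requirement that their lengths agree, so no padding should be needed for this purpose.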
    

    【Comments】:
