【Question title】:How to fix: 'Error occurred in generator: subscript out of bounds'
【Posted】:2019-06-05 12:29:11
【Question】:

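I have been playing around with neural networks in Keras. While trying to apply a recurrent neural network I came across a code blueprint, but when implementing the code and adapting it to my needs, I always get this error: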

Error occurred in generator: subscript out of bounds
Error in py_call_impl(callable, dots$args, dots$keywords) : 
  StopIteration: 

Detailed traceback: 
  File "/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/keras/engine/training.py", line 1418, in fit_generator
    initial_epoch=initial_epoch)
  File "/anaconda3/envs/r-tensorflow/lib/python3.6/site-packages/keras/engine/training_generator.py", line 181, in fit_generator
    generator_output = next(output_generator)
  File "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/reticulate/python/rpytools/generator.py", line 23, in __next__
    return self.next()
  File "/Library/Frameworks/R.framework/Versions/3.5/Resources/library/reticulate/python/rpytools/generator.py", line 40, in next
    raise StopIteration()

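The data frame I am using is a time series of just one variable. I suspect the generator is the culprit, but I am not 100% sure.

Thank you very much for your help.

I have tried different versions of the fit_generator() call, but each one throws the same error.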

generator <- function(data, lookback, delay, min_index, max_index, shuffle = FALSE, batch_size = 128, step = 2) {
  if (is.null(max_index)) max_index <- nrow(data) - delay - 1
  i <- min_index + lookback
  function() {
    if (shuffle) {
      rows <- sample(c((min_index+lookback):max_index), size = batch_size)
    } else {
      if (i + batch_size >= max_index)
        i <<- min_index + lookback
      rows <- c(i:min(i+batch_size, max_index))
      i <<- i + length(rows)
    }
    samples <- array(0, dim = c(length(rows),
                                lookback / step,
                                dim(data)[[-1]]))
    targets <- array(0, dim = c(length(rows)))
    for (j in 1:length(rows)) {
      indices <- seq(rows[[j]] - lookback+1, rows[[j]],
                     length.out = dim(samples)[[2]])
      samples[j,,] <- data[indices,]
      targets[[j]] <- data[rows[[j]] + delay,2]
    }
    list(samples, targets)
  }
}

lookback <- 30
step <- 2
delay <- 365
batch_size <- 128 

train_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = 1,
  max_index = nrow(data),
  shuffle = TRUE,
  step = step,
  batch_size = batch_size
)
val_gen = generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = floor(nrow(lightning_ts_red)*0.6)+1,
  max_index = floor(nrow(lightning_ts_red)*0.8),
  step = step,
  batch_size = batch_size
) 
test_gen <- generator(
  data,
  lookback = lookback,
  delay = delay,
  min_index = floor(nrow(lightning_ts_red)*0.8)+1,
  max_index = NULL,
  step = step,
  batch_size = batch_size
)

test_steps <- (nrow(lightning_ts_red) - floor(nrow(lightning_ts_red)*0.8)+1 - lookback) / batch_size


val_steps <- (floor(nrow(lightning_ts_red)*0.8) - floor(nrow(data)*0.6)+1 - lookback) / batch_size
history <- model %>% fit_generator(
  train_gen,
  steps_per_epoch = 500,
  epochs = 20,
  validation_data = val_gen,
  validation_steps = val_steps,
  verbose = 1, view_metrics = "auto")

【Discussion】:

    Tags: r keras time-series


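    【Solution 1】:

    You are passing the total number of rows as max_index for train_gen. It would be better to split the data and explicitly give train_gen its own row range.

    I did the following and it worked for me. My time series dataset has 8614 rows: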

    train_gen <- generator(
    data, 
    lookback= lookback,
    delay = delay,
    min_index = 1,
    max_index= 5500, # treat the first 5500 rows as training data
    shuffle = TRUE,
    step = step,
    batch_size = batch_size
    )
    
    val_gen <- generator(
    data, 
    lookback= lookback,
    delay = delay,
    min_index = 5501, 
    max_index= 7000,
    step = step,
    batch_size = batch_size
    )
    
    test_gen <- generator(
    data, 
    lookback= lookback,
    delay = delay,
    min_index = 7001,
    max_index= NULL,
    step = step,
    batch_size = batch_size
    )
    
    ## How many steps to draw from val_gen in order to see the entire validation set
    val_step <- (7000 - 5501 - lookback) / batch_size
    test_step <- (nrow(data) - 7001 - lookback) / batch_size
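Why the row range matters can be seen with a rough sanity check. The generator reads the target at `rows[[j]] + delay`, so the largest row it can touch is `max_index + delay`. The sketch below assumes `data` is a matrix, where a row index past the end raises "subscript out of bounds" (a data frame would return NA instead). The helper name `max_row_read` is hypothetical and not part of the original code; the row count and delay are the values quoted in this thread.

```r
# Hypothetical helper (not from the original post): the largest row index
# the generator can read, given that targets are taken at row + delay.
max_row_read <- function(n_rows, max_index, delay) {
  # Mirror the generator's own default for a NULL max_index.
  if (is.null(max_index)) max_index <- n_rows - delay - 1
  max_index + delay
}

n_rows <- 8614  # row count quoted in this answer
delay  <- 365   # delay used in the question

# Question's setup: max_index = nrow(data) reaches past the end of the data.
question_ok <- max_row_read(n_rows, max_index = n_rows, delay = delay) <= n_rows

# This answer's setup: max_index = 5500 stays inside the data.
answer_ok <- max_row_read(n_rows, max_index = 5500, delay = delay) <= n_rows

# max_index = NULL lets the generator subtract the delay itself.
null_ok <- max_row_read(n_rows, max_index = NULL, delay = delay) <= n_rows

print(c(question = question_ok, answer = answer_ok, null = null_ok))
```

If the check fails for a generator, lowering its `max_index` to at most `nrow(data) - delay` keeps every target row inside the data.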
    
    

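    【Discussion】:

    • Thanks for your answer. However, it did not solve the problem; I still get the same error. The indices I gave are simply expressed in terms of the data length. If I'm not mistaken, 'max_index = floor(nrow(lightning_ts_red)*0.6)' gives me 60% of the data as the training portion of the set. I tried plugging in your suggestion, but I still get the subscript out of bounds error.
    • Try computing val_step and test_step first, based on your nrow(data) and batch_size. If val_step is less than 1, you will get the subscript out of bounds error! val_step = (max_index - min_index - lookback) / batch_size
    • I don't quite follow your comment. So far val_step and test_step are both 17, or do I need to define them inside each generator function? If I do it outside of them, neither max_index nor min_index is defined. Thanks in advance ;)
    【Solution 2】: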

    Look at this line in the generator function:

    targets[[j]] <- data[rows[[j]] + delay,2]
    

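    The second argument, 2, selects the column of the data you want to predict. In the original example (from Chollet and Allaire), the temperature in Celsius is in the second column ('T (degC)'), and that is what they are trying to predict.

    If you use univariate data there is only one column, so the generator function throws a 'subscript out of bounds' error.

    Change it to 1 (as shown below) and it should work.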

    targets[[j]] <- data[rows[[j]] + delay,1]
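The column problem can be reproduced outside the generator. This is a minimal sketch, assuming the series is stored as a one-column matrix; the toy values and the `target_col` variable are made up for illustration and do not come from the original post.

```r
# Toy single-column matrix standing in for a univariate series (made-up values).
data <- matrix(as.numeric(1:10), ncol = 1)

# Asking for column 2 fails, because only column 1 exists.
msg <- tryCatch(
  data[5, 2],
  error = function(e) conditionMessage(e)
)
print(msg)

# Column 1 is the only valid target column here.
value <- data[5, 1]
print(value)

# Hypothetical guard: check the target column before building the generator.
target_col <- 1
stopifnot(target_col <= ncol(data))
```

Passing the target column as a parameter of the generator, instead of hard-coding it, would let the same function serve both the multivariate example and a univariate series.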
    

    【Discussion】:
