model.fit(...) and "Failed to convert a NumPy array to a Tensor"

【Question title】: model.fit(...) and "Failed to convert a NumPy array to a Tensor"
【Posted】: 2020-01-22 09:18:39
【Question description】:

I am using TensorFlow 2.0 for text classification.

The data is structured more or less like this:

First approach:

x: List[List[int]] # list of sentences consisting of a list of word IDs for each word in the sentence
y: List[int] # binary truth indicator

However, when calling model.fit(...), I get the following error message:

Failed to find data adapter that can handle input: (<class 'list'> containing values of types {'(<class \'list\'> containing values of types {"<class \'int\'>"})', "(<class 'list'> containing values of types set())"}), <class 'numpy.ndarray'>

This happens even though I am not using a set anywhere.

Second approach:

I tried using NumPy arrays for the inner lists, like this:

x: List[np.ndarray[np.int32]]
y: np.ndarray[np.int32]

But I got the following error:

Input arrays should have the same number of samples as target arrays. Found 32 input samples and 479 target samples.

Third approach:

This prompted me to change the data structure to:

x: np.ndarray[np.ndarray[np.int32]]
y: np.ndarray[np.int32]

This resulted in the following error:

Failed to convert a NumPy array to a Tensor (Unsupported object type numpy.ndarray).

Fourth approach:

Trying

x: np.ndarray[List[int]]
y: np.ndarray[int]

resulted in a similar error message:

Failed to convert a NumPy array to a Tensor (Unsupported object type list).

TL;DR

So the question is: what is going on? Why doesn't model.fit(...) accept these arguments?

Please see my answer below.

【Question comments】:

Tags: python numpy tensorflow tensorflow2.0

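【Solution 1】:

I am documenting this mess because the root problem has nothing to do with the error messages.

The root problem is that the input data (x) needs to be padded.

Sentences naturally have different lengths. TensorFlow's model.fit(...) does not like that, because it has to convert x into a single rectangular tensor. To make it play nicely, I needed to pad the sentences so that every sentence in the list contains the same number of words. (I simply zero-padded them.)

If you pad the input, both the third approach and the fourth approach should work.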
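
The zero-padding described above can be sketched roughly as follows. This is a minimal illustration rather than the exact code from the original project: the helper name pad_sentences, the pad_id parameter, and the toy data are all hypothetical, and it assumes word ID 0 is free to use as the padding value.

```python
import numpy as np


def pad_sentences(sentences, pad_id=0):
    """Right-pad each list of word IDs with pad_id up to the longest length.

    Hypothetical helper for illustration; returns a 2-D int32 array of
    shape (num_sentences, max_len).
    """
    max_len = max((len(s) for s in sentences), default=0)
    padded = np.full((len(sentences), max_len), pad_id, dtype=np.int32)
    for row, sentence in enumerate(sentences):
        # Copy the real word IDs; the rest of the row keeps pad_id.
        padded[row, :len(sentence)] = sentence
    return padded


# Toy data (assumed): three sentences of different lengths.
x_raw = [[4, 7, 1], [9], [3, 5]]
y = np.array([1, 0, 1], dtype=np.int32)

x = pad_sentences(x_raw)
print(x.shape, x.dtype)  # (3, 3) int32
```

With x rectangular and of a numeric dtype, x and y have the same number of samples and can be handed to model.fit(x, y, ...). Keras also provides tf.keras.preprocessing.sequence.pad_sequences for the same purpose; note that it pads at the front of each sequence by default, whereas the sketch above pads at the end.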

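【Comments】:

This was the answer to my very similar problem.
How do I set up the padding?
@user5520049: There are probably many ways, but for 3-dimensional input I create the batch space with np.zeros((max_dim1, max_dim2, max_dim3), dtype=np.float32) and then iterate over the input, assigning results[sample_i:sample_i + sample_x_len, sample_y:sample_y + sample_y_len, 0:sample_feature_len] = sample. I hope this helps.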
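
One possible reading of the comment above, as a small sketch: allocate a zero-filled array big enough for the largest sample, then copy each sample into its corner. The function name pad_batch and the layout (one sample per index along the first axis) are assumptions made for illustration; the commenter's own indexing scheme may have differed.

```python
import numpy as np


def pad_batch(samples):
    """Copy variable-sized 2-D samples into one zero-filled 3-D float32 array.

    Hypothetical helper; the result has shape
    (num_samples, max_rows, max_cols).
    """
    max_rows = max(s.shape[0] for s in samples)
    max_cols = max(s.shape[1] for s in samples)
    batch = np.zeros((len(samples), max_rows, max_cols), dtype=np.float32)
    for i, sample in enumerate(samples):
        rows, cols = sample.shape
        # Fill the top-left corner; everything else stays zero.
        batch[i, :rows, :cols] = sample
    return batch


# Toy data (assumed): two samples with different shapes.
samples = [np.ones((2, 3)), np.ones((4, 1))]
batch = pad_batch(samples)
print(batch.shape)  # (2, 4, 3)
```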