Kaggle ASL Dataset: ValueError: If using all scalar values, you must pass an index

【Question title】: Kaggle ASL Dataset: ValueError: If using all scalar values, you must pass an index
【Posted】: 2021-03-19 13:42:21
【Question】:

I am working with the Kaggle ASL Dataset, and during preprocessing I am trying to scale the values of each pixel.

I ran the following in Google Colab:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler
train = pd.read_csv("sign-language-mnist/sign_mnist_train.csv")
scaler = MinMaxScaler()
new_df = train.apply(lambda x: scaler.fit_transform(x.values.reshape(1,-1)),axis=0)

When I run this code, I get the following error:

/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
   7550             kwds=kwds,
   7551         )
-> 7552         return op.get_result()
   7553 
   7554     def applymap(self, func) -> "DataFrame":

/usr/local/lib/python3.6/dist-packages/pandas/core/apply.py in get_result(self)
    178             return self.apply_raw()
    179 
--> 180         return self.apply_standard()
    181 
    182     def apply_empty_result(self):

/usr/local/lib/python3.6/dist-packages/pandas/core/apply.py in apply_standard(self)
    272 
    273         # wrap results
--> 274         return self.wrap_results(results, res_index)
    275 
    276     def apply_series_generator(self) -> Tuple[ResType, "Index"]:

/usr/local/lib/python3.6/dist-packages/pandas/core/apply.py in wrap_results(self, results, res_index)
    313         # see if we can infer the results
    314         if len(results) > 0 and 0 in results and is_sequence(results[0]):
--> 315             return self.wrap_results_for_axis(results, res_index)
    316 
    317         # dict of scalars

/usr/local/lib/python3.6/dist-packages/pandas/core/apply.py in wrap_results_for_axis(self, results, res_index)
    369 
    370         try:
--> 371             result = self.obj._constructor(data=results)
    372         except ValueError as err:
    373             if "arrays must all be same length" in str(err):

/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    466 
    467         elif isinstance(data, dict):
--> 468             mgr = init_dict(data, index, columns, dtype=dtype)
    469         elif isinstance(data, ma.MaskedArray):
    470             import numpy.ma.mrecords as mrecords

/usr/local/lib/python3.6/dist-packages/pandas/core/internals/construction.py in init_dict(data, index, columns, dtype)
    281             arr if not is_datetime64tz_dtype(arr) else arr.copy() for arr in arrays
    282         ]
--> 283     return arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    284 
    285 

/usr/local/lib/python3.6/dist-packages/pandas/core/internals/construction.py in arrays_to_mgr(arrays, arr_names, index, columns, dtype, verify_integrity)
     76         # figure out the index, if necessary
     77         if index is None:
---> 78             index = extract_index(arrays)
     79         else:
     80             index = ensure_index(index)

/usr/local/lib/python3.6/dist-packages/pandas/core/internals/construction.py in extract_index(data)
    385 
    386         if not indexes and not raw_lengths:
--> 387             raise ValueError("If using all scalar values, you must pass an index")
    388 
    389         if have_series:

ValueError: If using all scalar values, you must pass an index

However, the following code works fine:

new_df = pd.DataFrame(scaler.fit_transform(train), columns=train.columns)

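So the question is: what went wrong? Can anyone answer this? Or can someone explain what I need to know to figure out why the first version raises that strange error?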

Thanks in advance.

【Comments】:

Tags: python pandas numpy scikit-learn

【Solution 1】:

You can try:

train.iloc[:,1:] = scaler.fit_transform(train.iloc[:,1:])

This skips the first column, which holds the labels; you do not want to scale the label values anyway.
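
As for why the apply version fails, a likely explanation (inferred from the traceback, not confirmed by the asker) is the reshape: `x.values.reshape(1,-1)` turns each column into a single sample with many features, so `fit_transform` returns a 2-D array of shape (1, n). pandas then cannot reassemble those per-column results into a DataFrame; in the pandas version shown in the traceback, `extract_index` finds no 1-D results to derive an index from, and other versions may word the error differently. With only one sample per fit, the scaled values would also all collapse to the range minimum. The sketch below uses a small made-up frame as a stand-in for `sign_mnist_train.csv` (the column names and values are assumptions) and shows the shape difference plus an apply call that returns 1-D arrays:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for sign_mnist_train.csv: a label column plus two pixel columns.
train = pd.DataFrame({
    "label": [0, 1, 2],
    "pixel1": [0, 128, 255],
    "pixel2": [10, 20, 30],
})
scaler = MinMaxScaler()

col = train["pixel1"]

# reshape(1, -1): one sample with three features, so the result is 2-D with shape (1, 3).
row_shaped = scaler.fit_transform(col.values.reshape(1, -1))

# reshape(-1, 1): three samples of one feature; ravel() flattens the result to shape (3,).
col_shaped = scaler.fit_transform(col.values.reshape(-1, 1)).ravel()

# Column-wise apply that returns 1-D arrays, which pandas can reassemble into a DataFrame.
pixels = train.iloc[:, 1:]
scaled = pixels.apply(
    lambda x: scaler.fit_transform(x.values.reshape(-1, 1)).ravel(), axis=0
)

# Reference result: scale the whole pixel block in one call, as in the answer above.
expected = MinMaxScaler().fit_transform(pixels)

print(row_shaped.shape, col_shaped.shape)
print(np.allclose(scaled.values, expected))
```

Since `MinMaxScaler` already scales each column independently, the single `fit_transform` call on the pixel block is the simpler choice; the apply form is only useful for seeing what went wrong.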

【Comments】: