【Posted】: 2021-12-09 12:18:33
【Question】:
After applying MinMax scaling I have this numpy array:
dftransformed = scaler.transform(df1)
dftransformed
array([[0.70186067, 0.63422294, 0.60840393, ..., 0.57706373, 0.67144751,
0.57292072],
[0.70976009, 0.75551699, 0.55909346, ..., 0.73020882, 0.71565513,
0.76358491],
[0.54763595, 0.58429507, 0.66676546, ..., 0.53096619, 0.587302 ,
0.66410096],
...,
[0.58223568, 0.20418524, 0.34276947, ..., 0.59893092, 0.38758242,
0.12860918],
[0.11992947, 0.19754072, 0.19837881, ..., 0. , 0. ,
0. ],
[0.01558628, 0.03226724, 0.09110852, ..., 0.01744946, 0. ,
0. ]])
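This operation erased the index of my last dataframe. This is my last dataframe:
Now I am trying to convert this numpy array into a pandas dataframe and add the index back to it. This is what I tried: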
dftransformedtoDF = pd.DataFrame( dftransformed = dftransformed[1:,1:], index = df1[1:,0], columns = df1[0,1:])
# dftransformed is the numpy array, df1 is my dataframe
But I got this output:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-71-6331ce1c5564> in <module>
----> 1 dftransformedtoDF = pd.DataFrame( dftransformed = dftransformed[1:,1:], index = df1[1:,0], columns = df1[0,1:])
~\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2925 if self.columns.nlevels > 1:
2926 return self._getitem_multilevel(key)
-> 2927 indexer = self.columns.get_loc(key)
2928 if is_integer(indexer):
2929 indexer = [indexer]
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
2655 'backfill or nearest lookups')
2656 try:
-> 2657 return self._engine.get_loc(key)
2658 except KeyError:
2659 return self._engine.get_loc(self._maybe_cast_indexer(key))
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
TypeError: '(slice(1, None, None), 0)' is an invalid key
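I think my problem may be that the index and the columns are both taken from a dataframe, and pandas has trouble with that. What other options do I have? Should I convert the index and columns to numpy and then pass them to the DataFrame constructor?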
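A minimal sketch of what goes wrong in the traceback above, assuming a small made-up frame in place of df1 (its labels and values are hypothetical). Plain `df1[...]` looks its argument up as a column key, so the NumPy-style tuple `(slice(1, None), 0)` is rejected; positional selection goes through `.iloc`, and the labels themselves are available as `df1.index` and `df1.columns`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df1: labelled rows and columns.
df1 = pd.DataFrame(
    np.arange(12, dtype=float).reshape(4, 3),
    index=["r0", "r1", "r2", "r3"],
    columns=["a", "b", "c"],
)

# NumPy-style indexing with df1[...] fails: the tuple is treated as a
# column key. The exact exception type varies across pandas versions.
try:
    df1[1:, 0]
    failed = False
except Exception:
    failed = True

# Positional equivalent: rows from position 1 onward, first column.
first_col_tail = df1.iloc[1:, 0]

# The row and column labels are ordinary attributes of the frame.
row_labels = df1.index
col_labels = df1.columns
```

Because `df1.index` and `df1.columns` already hold the labels, no slicing of df1 is needed to recover them.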
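【Comments】:
- df1[1:,0] is valid indexing for a 2D numpy array. A dataframe is also 2D, but it needs some kind of loc expression to select rows and columns.
- Are you sure you don't just want dftransformedtoDF = pd.DataFrame(dftransformed, index = df1.index, columns = df1.columns)?
- @Riley's solution worked for me.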
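The suggestion from the comments can be sketched end to end as follows. The input frame is hypothetical, and the scaling step is a hand-written NumPy stand-in for `scaler.transform(df1)` so that the example runs without scikit-learn. Like the scaler, it returns a bare array with no labels.

```python
import numpy as np
import pandas as pd

# Hypothetical input frame; the labels are made up for illustration.
df1 = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0], "y": [10.0, 30.0, 20.0]},
    index=["a", "b", "c"],
)

# Stand-in for scaler.transform(df1): column-wise min-max scaling that
# produces a plain ndarray, so the row and column labels are lost.
values = df1.to_numpy()
col_min = values.min(axis=0)
col_max = values.max(axis=0)
dftransformed = (values - col_min) / (col_max - col_min)

# Re-attach the original labels. The array must have the same shape
# as df1 for the index and columns to line up.
dftransformedtoDF = pd.DataFrame(
    dftransformed, index=df1.index, columns=df1.columns
)
```

With a fitted scaler, the last statement should apply unchanged to the array returned by `scaler.transform(df1)`, since the transform keeps the row and column order of its input.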
Tags: python pandas dataframe numpy numpy-ndarray