【Question title】: pandas - Kernel Restarting: The kernel for .ipynb appears to have died. It will restart automatically
【Posted】: 2021-06-24 20:57:34
【Question】:

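Update

I am running a Docker container for jupyter-notebook, but when I run a pandas-based cell, after a few seconds the system returns:

Kernel Restarting: The kernel for .ipynb appears to have died. It will restart automatically.

The only option offered is to restart the kernel.

This is the code block that triggers the message: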

import pandas as pd


def remove_typos(string):
    
    string=str(string)
    string=str(string).replace('≤', '')
    string=str(string).replace('+', '')
    
    # if "%" detected then convert to numeric format
    if "%" in string: 
        string=string.replace('%', '')
        string=float(string)/100
        
    else:
        pass
        
    return string


data = {k: v.replace([r'\+', '≤'], '', regex=True) for k, v in data.items()}
data = {k: v.applymap(remove_typos) for k, v in data.items()}

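What have I tried?

I tried running pip install pandas in the container CLI, which returned the following message:

I tried giving the container more local memory:

I tried updating conda from the Anaconda Prompt and reinstalling all packages: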

# conda config --set quiet True
# conda update --force conda

#conda install pandas

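In all cases, the result was the same.

Additional notes:

Total CPU utilization reaches 100%
The function is applied to more than 10,000 cells

Is there any other way to solve this problem?

Sample data

The original DataFrames have the same format but are much larger.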

data = {'dataframe_1':pd.DataFrame({'col1': ['John', 'Ashley'], 'col2': ['+10', '-1']}), 'dataframe_2':pd.DataFrame({'col3': ['Italy', 'Brazil', 'Japan'], 'col4': ['Milan', 'Rio do Jaineiro', 'Tokio'], 'percentage':['+95%', '≤0%', '80%+']})}

Session info

{'commit_hash': '2486838d9',
 'commit_source': 'installation',
 'default_encoding': 'UTF-8',
 'ipython_path': '/usr/local/lib/python3.6/site-packages/IPython',
 'ipython_version': '7.16.1',
 'os_name': 'posix',
 'platform': 'Linux-5.10.25-linuxkit-x86_64-with-debian-10.9',
 'sys_executable': '/usr/local/bin/python',
 'sys_platform': 'linux',
 'sys_version': '3.6.13 (default, May 12 2021, 16:40:31) \n[GCC 8.3.0]'}

【Comments】:

Tags: python pandas linux docker jupyter-notebook

【Solution 1】:

The problem is related to the number of iterations, which needs to be reduced.
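To make the point about iteration counts concrete, here is a minimal sketch (not part of the original answer) that counts how often a Python function is called when it is mapped over every cell versus applied once per row. The frame, the `calls` dictionary, and the `per_cell` / `per_row` helpers are hypothetical and only serve the illustration.

```python
import pandas as pd

# Hypothetical frame: 1,000 rows x 10 columns of strings (not the asker's data).
df = pd.DataFrame({f"col{i}": ["80%"] * 1000 for i in range(10)})

calls = {"per_cell": 0, "per_row": 0}

def per_cell(value):
    # Called once for every single cell.
    calls["per_cell"] += 1
    return value

def per_row(row):
    # Called with a whole row (a Series) at a time.
    calls["per_row"] += 1
    return row

# Element-wise: map the function over each column, cell by cell.
for name in df.columns:
    df[name].map(per_cell)

# Row-wise: DataFrame.apply with axis=1 passes one row per call.
df.apply(per_row, axis=1)

print(calls)
```

A lower call count does not by itself guarantee less work, because each row-wise call also has to build a Series for that row, so it is worth timing both variants on the real data.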

First, rename the function to convert_to_percentage(), then iterate over each key and value to replace the characters:


############# convert_to_percentage(value) #################

# value :: a cell value; strings that contain "%" represent a percentage.

def convert_to_percentage(value):

    # convert to str first so that non-string cells do not break the "%" test
    text = str(value)

    # if "%" detected then convert to numeric format
    if "%" in text:
        return float(text.replace('%', '')) / 100

    # anything else is returned unchanged
    return value

############################################################
#                                                          #
# removing typos from each string                          #
#                                                          #
############################################################

# for all job title reports

data = {k: v.replace([r'\+', '≤'], '', regex=True) for k, v in data.items()}

print('Successfully removed typos!')

Second, replace for key, value in data.items() with for key in data:

############################################################
#                                                          #
# conversion of specific columns to percentages (%)        #
#                                                          #
############################################################

for key in data:
    # map the conversion over every cell of each column and store the result back
    data[key] = data[key].apply(lambda col: col.map(convert_to_percentage))
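As an alternative to calling a Python function for each cell, the same cleanup can be sketched with pandas string methods that work on a whole column at once. This is not the answer author's method; `clean_frame` is a hypothetical helper, the `data` dictionary is the sample from the question, and the sketch assumes every column holds values that can safely be cast to strings.

```python
import pandas as pd

# Sample data copied from the question.
data = {
    'dataframe_1': pd.DataFrame({'col1': ['John', 'Ashley'],
                                 'col2': ['+10', '-1']}),
    'dataframe_2': pd.DataFrame({'col3': ['Italy', 'Brazil', 'Japan'],
                                 'col4': ['Milan', 'Rio do Jaineiro', 'Tokio'],
                                 'percentage': ['+95%', '≤0%', '80%+']}),
}

def clean_frame(df):
    """Strip '+' and '≤' from every column, then turn 'NN%' cells into fractions."""
    out = df.copy()
    for col in out.columns:
        # Remove the stray characters for the whole column in one call.
        s = out[col].astype(str).str.replace(r"[+≤]", "", regex=True)
        has_pct = s.str.contains("%", regex=False)
        if not has_pct.any():
            out[col] = s
            continue
        # Convert percentage strings to fractions; unparsable cells become NaN.
        fractions = pd.to_numeric(
            s.str.replace("%", "", regex=False), errors="coerce"
        ) / 100
        # Keep cells without "%" as cleaned strings when a column is mixed.
        out[col] = fractions if has_pct.all() else s.where(~has_pct, fractions)
    return out

cleaned = {name: clean_frame(frame) for name, frame in data.items()}
print(cleaned['dataframe_2'])
```

Whether this avoids the kernel crash depends on the real data size and the memory available to the container, so it is best treated as something to try rather than a confirmed fix.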

【Comments】: