[Posted]: 2018-03-06 17:22:54
[Question]:
I am trying to convert the Wisconsin breast cancer dataset from a list into a DataFrame with columns.
These are the column names:
 #  Attribute                    Domain
--  ---------------------------  -------------------------------
 1. Sample code number           id number
 2. Clump Thickness              1 - 10
 3. Uniformity of Cell Size      1 - 10
 4. Uniformity of Cell Shape     1 - 10
 5. Marginal Adhesion            1 - 10
 6. Single Epithelial Cell Size  1 - 10
 7. Bare Nuclei                  1 - 10
 8. Bland Chromatin              1 - 10
 9. Normal Nucleoli              1 - 10
10. Mitoses                      1 - 10
11. Class:                       (2 for benign, 4 for malignant)
This is how I imported the dataset into Python:
import requests
link = "http://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.data"
f = requests.get(link)
print(f.text)
and I see the data as comma-separated lines:
1000025,5,1,1,1,2,1,3,1,1,2
1002945,5,4,4,5,7,10,3,2,1,2
1015425,3,1,1,1,2,2,3,1,1,2
1016277,6,8,8,1,3,4,3,7,1,2
1017023,4,1,1,3,2,1,3,1,1,2
I need to split the comma-separated values into columns and give the columns names.
I tried this, but it did not work:
import requests
import pandas as pd
import io
urlData = requests.get(f.text).content
rawData = pd.read_csv(io.StringIO(urlData.decode('utf-8')))
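A likely reason the attempt above fails: requests.get(f.text) passes the downloaded file contents as if they were a URL, while f.text already holds the data. A minimal sketch of parsing that text directly follows. The snake_case column names are my own spelling of the attribute list, and the short sample string stands in for f.text so the snippet runs without a network connection.

```python
import io

import pandas as pd

# Column names derived from the attribute list in the question.
# The snake_case spelling is an assumption, not part of the dataset.
COLUMNS = [
    "sample_code_number",
    "clump_thickness",
    "uniformity_cell_size",
    "uniformity_cell_shape",
    "marginal_adhesion",
    "single_epithelial_cell_size",
    "bare_nuclei",
    "bland_chromatin",
    "normal_nucleoli",
    "mitoses",
    "class",
]


def text_to_frame(text):
    """Parse header-less, comma-separated text into a DataFrame with named columns."""
    # header=None tells pandas the first line is data, not column names.
    return pd.read_csv(io.StringIO(text), header=None, names=COLUMNS)


# The first rows printed in the question, used here in place of f.text.
sample = """1000025,5,1,1,1,2,1,3,1,1,2
1002945,5,4,4,5,7,10,3,2,1,2
1015425,3,1,1,1,2,2,3,1,1,2
"""

df = text_to_frame(sample)
print(df)
```

With the real response, text_to_frame(f.text) should work the same way, since f.text is already a decoded string and needs no second request or .decode() call.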
[Comments]:
-
Possible duplicate: link
-
Just pd.read_csv(link, header=None) - pretty simple :)
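Building on that comment, here is a hedged sketch that also names the columns while reading. The helper name load_breast_cancer, the snake_case column names, and the extra diagnosis column are illustrative choices of mine. Treating "?" as a missing value is an assumption based on how the dataset documentation describes missing entries, so drop na_values if your copy has none. The second sample row is made up purely to show the "?" handling.

```python
import io

import pandas as pd

# Names follow the attribute list in the question (spelling is an assumption).
NAMES = [
    "sample_code_number", "clump_thickness", "uniformity_cell_size",
    "uniformity_cell_shape", "marginal_adhesion", "single_epithelial_cell_size",
    "bare_nuclei", "bland_chromatin", "normal_nucleoli", "mitoses", "class",
]


def load_breast_cancer(source):
    """Read the header-less file from a URL, a path, or a file-like object."""
    # na_values="?" turns "?" entries into NaN (assumed missing-value marker).
    df = pd.read_csv(source, header=None, names=NAMES, na_values="?")
    # Translate the class codes from the attribute list into readable labels.
    df["diagnosis"] = df["class"].map({2: "benign", 4: "malignant"})
    return df


# Offline demonstration: first row is from the question, second is hypothetical.
demo = io.StringIO(
    "1000025,5,1,1,1,2,1,3,1,1,2\n"
    "9999999,8,4,5,1,2,?,7,3,1,4\n"
)
demo_df = load_breast_cancer(demo)
print(demo_df[["sample_code_number", "bare_nuclei", "class", "diagnosis"]])
```

Because pd.read_csv accepts URLs directly, load_breast_cancer(link) with the link from the question should return the full dataset without going through requests at all.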