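Assuming your raw data is a list of lists of tuples, you can convert each inner list of tuples to a dict and then pass the result to pd.DataFrame() to create the DataFrame.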
import pandas as pd
data = [[('Date', '2015 02 10 '), ('Age', ' 23'), ('Sex', 'Female'), ('Height(cm)', '209'), ('Weight(kg)', '42.0')],
[('Date', '2015 04 21 '), ('Age', ' 31'), ('Sex', 'Male' ), ('Height(cm)', '301'), ('Weight(kg)', '69.0')],
[('Date', '2015 04 21 '), ('Age', ' 11'), ('Sex', 'Male' ), ('Height(cm)', '150'), ('Weight(kg)', '69.0')]]
df = pd.DataFrame(map(dict, data))
print(df)
          Date  Age     Sex Height(cm) Weight(kg)
0  2015 02 10    23  Female        209       42.0
1  2015 04 21    31    Male        301       69.0
2  2015 04 21    11    Male        150       69.0
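To make the conversion step concrete, here is a minimal sketch on a single shortened record (the `row` and `as_dict` names are only illustrative). It shows that `dict()` turns a list of `(key, value)` pairs into a mapping, and that a list of such mappings becomes one DataFrame row per dict.

```python
import pandas as pd

# One record as a list of (key, value) tuples (shortened, illustrative sample)
row = [('Date', '2015 02 10 '), ('Age', ' 23'), ('Sex', 'Female')]

# dict() accepts any iterable of 2-item pairs and builds a mapping from it
as_dict = dict(row)

# A list of dicts becomes one row per dict, with the keys as column names
df = pd.DataFrame([as_dict])
print(df.columns.tolist())
```

Note that the values are still the original strings at this point, including any surrounding whitespace.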
Your data appears to contain extra whitespace, which you can remove with:
df.columns = df.columns.str.strip()
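# Assumes every cell is a string; on pandas >= 2.1, applymap is deprecated in favor of DataFrame.map
df = df.applymap(lambda x: x.strip())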
You can also strip the extra whitespace while creating the DataFrame:
df = pd.DataFrame([{k.strip(): v.strip() for k, v in d.items()} for d in map(dict, data)])
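The stripping approaches above call `.strip()` on every key and value, so they assume everything is a string. If some values might already be numbers, a small guard helps. The following is only a sketch: `strip_if_str` is a hypothetical helper, and the sample record (with an integer height and a padded `' Age'` key) is made up for illustration.

```python
import pandas as pd

def strip_if_str(value):
    """Strip surrounding whitespace from strings; return anything else unchanged."""
    return value.strip() if isinstance(value, str) else value

# Illustrative record: note the padded ' Age' key and the integer height
data = [[('Date', '2015 02 10 '), (' Age', ' 23'), ('Height(cm)', 209)]]

# Clean keys and values while building the per-row dicts
df = pd.DataFrame(
    [{strip_if_str(k): strip_if_str(v) for k, v in row} for row in data]
)
print(df)
```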
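If a key such as Date may be duplicated within a record, you can collect the values of the duplicate keys into a list, then explode the columns so that each list becomes multiple rows.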
import collections
data= [[('Date', '2015 02 10 '), ('Date', '2015 02 10 '), ('Age', ' 23'), ('Sex', 'Female'), ('Height(cm)', '209'), ('Weight(kg)', '42.0')],
[('Date', '2015 04 21 '), ('Age', ' 31'), ('Sex', 'Male' ), ('Height(cm)', '301'), ('Weight(kg)', '69.0')],
[('Date', '2015 04 21 '), ('Age', ' 11'), ('Sex', 'Male' ), ('Height(cm)', '150'), ('Weight(kg)', '69.0')]]
data_ = []
for list_of_tuple in data:
    # Group the values of each key into a list so duplicate keys are not overwritten
    c = collections.defaultdict(list)
    for a, b in list_of_tuple:
        c[a].append(b)
    data_.append(c.items())
df = pd.DataFrame(map(dict, data_))
df = df.apply(pd.Series.explode)
print(df)
          Date  Age     Sex Height(cm) Weight(kg)
0  2015 02 10    23  Female        209       42.0
0  2015 02 10    23  Female        209       42.0
1  2015 04 21    31    Male        301       69.0
2  2015 04 21    11    Male        150       69.0
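The reason for grouping values into lists is that a plain `dict()` silently keeps only the last value of a repeated key. The sketch below illustrates this on a made-up record with two different dates. As a variation (not the approach shown above), it explodes only the `Date` column with `DataFrame.explode` and unwraps the remaining single-item lists explicitly.

```python
import collections
import pandas as pd

# Illustrative record in which 'Date' appears twice with different values
record = [('Date', '2015 02 10'), ('Date', '2015 02 11'), ('Age', '23')]

# A plain dict keeps only the last value seen for a repeated key
plain = dict(record)

# Grouping into lists keeps every value
grouped = collections.defaultdict(list)
for key, value in record:
    grouped[key].append(value)

df = pd.DataFrame([dict(grouped)])

# Explode only the column that may repeat: one output row per list element
df = df.explode('Date')

# The other columns still hold single-item lists, so unwrap them
df['Age'] = df['Age'].map(lambda values: values[0])
print(df)
```

Exploding a single named column keeps the row index repeated for the duplicated record, the same way the duplicated index 0 appears in the output above.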