【Question title】: How to avoid automatic data type conversion in a pandas DataFrame when converting JSON to CSV in Python?
【Posted】: 2020-04-20 08:42:57
【Question】:

I am trying to convert a JSON file to CSV in Python using pandas.

JSON file data:

[{
    "source": "https://www.na-kd.com/en/sweaters/cardigans/button-up-ribbed-cropped-cardigan-pink",
    "class_ids": "3_33",
    "id_matrix": "0_0_0_1_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_0_1_0_0_0_0_0_0_0_0_0_0_0_0",
    "tags": "cardigan_neckline",
    "front": "https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_01g.jpg",
    "back": "https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_02a.jpg",
    "left": "https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_03b.jpg",
    "right": "https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_04c.jpg",
    "zoomedin": "https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_05g.jpg",
    "otherurl": "https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_05g.jpg"
}, {...}, {...}]

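This is my code for converting it to a CSV file. pandas automatically changes the data types of the "id_matrix" and "class_ids" columns, but I want both columns to stay strings.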

import pandas as pd

raw_data = pd.read_json('/home/mobin/PycharmProjects/na-kd/Jsons/mapped_improvedcheck.json')
raw_data.to_csv("csv_file/samplecheck.csv")

result = raw_data.dtypes
print(result)
print(raw_data['id_matrix'][:10])

Output of this code:

source        object
class_ids      int64
id_matrix    float64
tags          object
front         object
back          object
left          object
right         object
zoomedin      object
otherurl      object
dtype: object
0    1.000000e+43
1    1.000000e+43
2    1.000000e+43
3    1.000000e+43
4    1.000000e+43
5    1.000000e+43
6    1.000000e+43
7    1.000000e+43
8    1.000000e+43
9    1.000000e+43

【Comments】:

    Tags: python json pandas csv dataframe


    【Solution 1】:

    You can use the astype method of pandas.DataFrame:

    import pandas as pd
    
    raw_data = pd.read_json('/home/mobin/PycharmProjects/na-kd/Jsons/mapped_improvedcheck.json')
    raw_data2 = raw_data.astype('object')
    raw_data2.to_csv('csv_file/samplecheck.csv')
    
    result = raw_data2.dtypes
    print(result)
    print(raw_data2['id_matrix'][:10])
    

    Update: this is what I get in the samplecheck.csv file:

    ,back,class_ids,front,id_matrix,left,otherurl,right,source,tags,zoomedin
    0,https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_02a.jpg,333,https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_01g.jpg,1e+42,https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_03b.jpg,https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_05g.jpg,https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_04c.jpg,https://www.na-kd.com/en/sweaters/cardigans/button-up-ribbed-cropped-cardigan-pink,cardigan_neckline,https://www.na-kd.com/globalassets/nakd_button_up_ribbed_cropped_cardigan_1018-004495-0211_05g.jpg
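The 333 and 1e+42 in the row above suggest that the underscores were read as digit separators. Python's own number parsing accepts "_" between digits (PEP 515), so "3_33" is a valid spelling of 333. The minimal sketch below shows only that Python behaviour. That the default dtype inference in read_json ends up on this parsing path is an assumption, not something confirmed in this thread, and the shortened id_matrix value is a hypothetical stand-in.

```python
# Python treats "_" between digits as a separator when parsing numbers.
class_ids_as_float = float("3_33")
class_ids_as_int = int("3_33")

# Hypothetical, shortened id_matrix-style value: the digits 000100.
short_matrix_as_float = float("0_0_0_1_0_0")

print(class_ids_as_float)     # 333.0
print(class_ids_as_int)       # 333
print(short_matrix_as_float)  # 100.0
```

If that is what happens, the original strings are already gone by the time astype is called, so casting to object afterwards only relabels the converted numbers.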
    

    【Comments】:

    • This code converts the floats to object, but it does not give me the output I need, and the same problem occurs when writing the CSV. Output of your code: source object class_ids object id_matrix object tags object front object back object left object right object zoomedin object otherurl object dtype: object 0 1e+43 1 1e+43 2 1e+43 3 1e+43 4 1e+43 5 1e+43 6 1e+43 7 1e+43 8 1e+43 9 1e+43 Name: id_matrix, dtype: object
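As the comment above reports, casting after the fact does not help, because the values are converted while the file is being read. One option that may work instead is to control dtype inference in read_json itself through its dtype parameter. The sketch below is not taken from the thread: the inline sample is a hypothetical, shortened stand-in for the question's JSON file, and to_csv is called without a path so that it returns the CSV text instead of writing a file.

```python
import io

import pandas as pd

# Hypothetical, shortened stand-in for one record of the question's JSON file.
sample_json = (
    '[{"class_ids": "3_33", '
    '"id_matrix": "0_0_0_1_0_0", '
    '"tags": "cardigan_neckline"}]'
)

# Option 1: dtype=False turns off dtype inference for every column.
raw_data = pd.read_json(io.StringIO(sample_json), dtype=False)

# Option 2: force only the affected columns to str, leave the rest on defaults.
raw_data_by_column = pd.read_json(
    io.StringIO(sample_json),
    dtype={"class_ids": str, "id_matrix": str},
)

# With no path argument, to_csv returns the CSV content as a string.
csv_text = raw_data.to_csv(index=False)

print(raw_data["class_ids"].iloc[0])
print(raw_data_by_column["id_matrix"].iloc[0])
print(csv_text)
```

For the real file, the same dtype argument would go into the existing read_json call, followed by to_csv with the output path as before.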