【Question Title】:How to make a scipy array from custom data format?
【Posted】:2014-01-18 17:30:39
【Question】:

Warning: Python newbie...

I have text that looks like this, returned from a database query:

2000;"SCHOOLS OF MEDICINE";416765.0
2000;"SCHOOLS OF ARTS AND SCIENCES";36000.0
2000;"SCHOOLS OF MEDICINE";2000.0
2000;"SCHOOLS OF MEDICINE";179728.0
2000;"OTHER DOMESTIC HIGHER EDUCATION";244547.0
2000;"SCHOOLS OF MEDICINE";107325.0
2000;"OTHER DOMESTIC HIGHER EDUCATION";61609.0
2000;"SCHOOLS OF MEDICINE";93600.0
2000;"SCHOOLS OF EARTH SCIENCES/NATURAL RESOURCES";64865.0
2000;"SCHOOLS OF MEDICINE";50000.0
...

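I'd like to make a chart that shows, for each division, the mean award amount across all years, with error bars.

However, I'm not sure how to get this data into a scipy array so I can make the chart. I tried the following: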

data = asarray(2000;"SCHOOLS OF MEDICINE";416765.0
    2000;"SCHOOLS OF ARTS AND SCIENCES";36000.0
    2000;"SCHOOLS OF MEDICINE";2000.0
    2000;"SCHOOLS OF MEDICINE";179728.0
    2000;"OTHER DOMESTIC HIGHER EDUCATION";244547.0
    2000;"SCHOOLS OF MEDICINE";107325.0
    2000;"OTHER DOMESTIC HIGHER EDUCATION";61609.0
    2000;"SCHOOLS OF MEDICINE";93600.0
    2000;"SCHOOLS OF EARTH SCIENCES/NATURAL RESOURCES";64865.0
    2000;"SCHOOLS OF MEDICINE";50000.0)

I also tried data = sp.array(). Both give the following error:

    data = sp.asarray(2000;"SCHOOLS OF MEDICINE";416765.0
                          ^
SyntaxError: invalid syntax

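So it seems to me that the array() and asarray() methods don't like semicolon-delimited data.

Any suggestions on how to do this would be great. If possible, I'd rather not save the data to a file first.

Thanks!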

【Comments】:

    Tags: python arrays numpy scipy ipython


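    【Solution 1】:

    Consider using pandas for data like this. The raw rows are not valid Python syntax, so they have to be passed in as a string and parsed: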

    from io import StringIO  # Python 3; Python 2 used "from StringIO import StringIO"

    import matplotlib.pyplot as plt
    import pandas as pd

    # Semicolon-separated rows held in a string, so no file is needed.
    # Named raw_data to avoid shadowing the built-in input().
    raw_data = """2000;"SCHOOLS OF MEDICINE";416765.0
    2000;"SCHOOLS OF ARTS AND SCIENCES";36000.0
    2000;"SCHOOLS OF MEDICINE";2000.0
    2000;"SCHOOLS OF MEDICINE";179728.0
    2001;"SCHOOLS OF MEDICINE";1234.0
    2001;"SCHOOLS OF ARTS AND SCIENCES";100.0
    2002;"SCHOOLS OF MEDICINE";9999.0
    2002;"SCHOOLS OF MEDICINE";8436.0"""

    # StringIO wraps the string in a file-like object that read_csv can parse.
    df = pd.read_csv(StringIO(raw_data), sep=';', header=None, names=['year', 'division', 'award'])
    print(df)
    # Total award amount per year.
    yeartotals = df.groupby(['year'])[['award']].sum()
    print(yeartotals)
    yeartotals.plot()
    plt.show()
    

    I'm not sure exactly what you want to plot, but pandas integrates quite nicely with matplotlib for plotting.
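
The question asks for the mean award per division with error bars. Below is a minimal sketch of one way to get there with pandas, using the ten sample rows from the question. It assumes the error bars should show the standard error of the mean, which the question does not specify; the names `raw`, `stats`, and `plot_mean_awards` are illustrative only. Divisions with a single row have no defined standard error, so their error bar is set to zero here.

```python
from io import StringIO

import pandas as pd

# Sample rows copied from the question; in practice this string would
# come straight from the database query result.
raw = """2000;"SCHOOLS OF MEDICINE";416765.0
2000;"SCHOOLS OF ARTS AND SCIENCES";36000.0
2000;"SCHOOLS OF MEDICINE";2000.0
2000;"SCHOOLS OF MEDICINE";179728.0
2000;"OTHER DOMESTIC HIGHER EDUCATION";244547.0
2000;"SCHOOLS OF MEDICINE";107325.0
2000;"OTHER DOMESTIC HIGHER EDUCATION";61609.0
2000;"SCHOOLS OF MEDICINE";93600.0
2000;"SCHOOLS OF EARTH SCIENCES/NATURAL RESOURCES";64865.0
2000;"SCHOOLS OF MEDICINE";50000.0"""

df = pd.read_csv(StringIO(raw), sep=";", header=None,
                 names=["year", "division", "award"])

# Per-division mean, standard error of the mean, and row count.
# Groups are sorted alphabetically by division name.
stats = df.groupby("division")["award"].agg(["mean", "sem", "count"])
print(stats)

# Plain NumPy arrays, in case they are needed downstream.
means = stats["mean"].to_numpy()
errors = stats["sem"].fillna(0.0).to_numpy()  # NaN (single-row groups) -> 0


def plot_mean_awards(stats):
    """Draw a horizontal bar chart of mean award per division with error bars."""
    # Imported here so the summary table can be computed without matplotlib.
    import matplotlib.pyplot as plt

    ax = stats["mean"].plot(kind="barh", xerr=stats["sem"].fillna(0.0))
    ax.set_xlabel("mean award")
    plt.tight_layout()
    plt.show()
```

Calling `plot_mean_awards(stats)` opens the chart. A standard deviation could be used instead by replacing `"sem"` with `"std"` in the `agg` call and in the function.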
