【Question title】: Plot a pandas multi-index time series of year and week of the year
【Posted】: 2021-09-07 20:11:20
【Question description】:

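I have a multi-index pandas DataFrame whose first index level is the year and whose second is the week of the year (1-53; week 53 occurs only every few years), as shown below. How can I plot it as one continuous time series from week 1 of 1923 to week 53 of 2020? The other posts I found plot a separate curve for each value of the first index level.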

year  weekofyear
1923  1             1474.895230
      2              551.502235
      3              701.706289
      4             1299.355552
      5              615.746137
                       ...     
2020  24             804.406049
      25             741.519411
      26             690.395742
      27              92.746479
      53             687.228790

Here is a longer TSV sample:

year    weekofyear  Hohokus_gage
1923    1   1474.8952296960001
1923    2   551.5022346624
1923    3   701.7062887296
1923    4   1299.3555520512002
1923    5   615.7461373056
1923    6   471.4235672832001
1923    7   412.60872683520006
1923    8   287.74029634560003
1923    9   889.9137781632
1923    10  1555.8787407744
1923    11  3254.2703641728
1923    12  3719.8124474112
1923    13  1543.6633508352002
1923    14  1327.4057067263998
1923    15  742.4242551936
1923    16  443.3734126079999
1923    17  942.3947127168
1923    18  842.4094839552001
1923    19  551.954656512
1923    20  549.692547264
1923    21  472.32841098240004
1923    22  209.92373821440003
1923    23  423.4668512256
1923    24  178.2542087424
1923    25  95.91343211520001
1923    26  93.65132286720001
1923    27  101.34249431040001
1923    28  64.2439026432
1923    29  48.4091379072
1923    30  48.40913790719999
1923    31  43.8849194112
1923    32  51.1236690048
1923    33  40.2655446144
1923    34  37.098591667200004
1923    35  47.50429420800001
1923    36  59.719684147200006
1923    37  57.00515304960001
1923    38  95.00858841600001
1923    39  94.5561665664
1923    40  50.67124715520001
1923    41  56.55273120000001
1923    42  57.00515304960001
1923    43  1145.9845450368
1923    44  303.5750610816001
1923    45  221.68670630399996
1923    46  162.4194440064
1923    47  260.59498536960007
1923    48  821.145657024
1923    49  1577.5949895551998
1923    50  765.4977695232002
1923    51  466.44692693760004
1923    52  827.9319847679999
1924    1   937.8704942208001
1924    2   1147.7942324352
1924    3   1825.0697412864004
1924    4   917.0590891392
1924    5   690.3957424896
1924    6   754.6396451328002
1924    7   477.30505132800005
1924    8   493.59223791359994
1924    9   404.91755539199994
1924    10  921.1308857856002
1924    11  1021.1161145472003
1924    12  998.4950220671999
1924    13  1707.4400603904005
1924    14  2891.4280407936003
1924    15  2825.8268726016
1924    16  1787.0663059200006
1924    17  1232.8495401599998
1924    18  945.1092438144001
1924    19  1869.4070825472002
1924    20  2651.6444605056
1924    21  1091.2415012351998
1924    22  853.7200301952003
1924    23  653.749572672
1924    24  466.44692693760004
1924    25  346.10271494400007
1924    26  280.0491249024
1924    27  151.10889776640005
1924    28  422.1095856768
1924    29  204.4946760192
1924    30  95.4610102656
1924    31  109.4860876032
1924    32  113.55788424959998
1924    33  232.09240884480002
1924    34  165.58639695359997
1924    35  214.44795671039998
1924    36  74.19718333440001
1924    37  230.28272144640002
1924    38  92.746479168
1924    39  103.1521817088
1924    40  1199.8227451392002
1924    41  205.39951971840003
1924    42  125.7732741888
1924    43  100.43765061119998
1924    44  106.77155650560002
1924    45  137.0838204288
1924    46  150.20405406720002
1924    47  406.72724279040006
1924    48  319.86224766720005
1924    49  347.4599804928
1924    50  557.8361405568002
1924    51  380.9391973631999
1924    52  302.67021738240004
2021    1   1255.0182107904
2021    2   1712.8691225856
2021    3   1165.4386845696001
2021    4   971.8021329408001
2021    5   1026.5451767424001
2021    6   937.8704942208001
2021    7   1464.9419490048003
2021    8   1740.9192772607998
2021    9   1986.1319197440002
2021    10  1444.1305439232
2021    11  1469.9185893504
2021    12  1759.0161512448
2021    13  1498.4211658752001
2021    14  1211.1332913792
2021    15  1927.7695011456003
2021    16  1257.2803200383999
2021    17  1105.2665785728
2021    18  1245.5173519488
2021    19  1066.8107213568
2021    20  794.4527678976001
2021    21  1899.2669246208
2021    22  1340.0735185151998
2021    23  908.9154958464001
2021    24  804.4060485888001
2021    25  741.5194114944001
2021    26  690.3957424896001
2021    27  92.746479168
2021    53  687.2287895424

【Question discussion】:

    Tags: python pandas


    【Solution 1】:

    You can do this in two ways:

    df.plot()
    

    plt.plot(df.reset_index()['Hohokus_gage'])
    

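    Edit: to show xticks, you can do the following:

    x = df.reset_index()  # assumed: x is df with 'year' and 'weekofyear' as columns
    tick_frequency = list(range(0, x.shape[0], 10))  # set this appropriately
    plt.xticks(tick_frequency, x['year'].iloc[tick_frequency], rotation=45)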
    

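    【Discussion】:

    • I tried that, but then I lose the dates. Do you know how I can keep them?
    • @BernardoTrindade How do you want the plot to look? Should the x-axis show the year or the week values?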
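One way to keep the dates, as asked in the comment above, is to turn each (year, weekofyear) pair into a timestamp and plot against that. The following is a minimal sketch, not the answerer's method. It assumes week 1 starts on 1 January and each later week starts 7 days after the previous one, which may not match how weekofyear was originally computed (ISO weeks, for instance, are defined differently). The helper name `week_index_to_dates` and the tiny sample frame are made up for illustration.

```python
import pandas as pd


def week_index_to_dates(df):
    """Return a copy of df indexed by an approximate start date of each week.

    Assumption (not from the original post): week N of a year starts
    (N - 1) * 7 days after 1 January of that year.
    """
    years = df.index.get_level_values("year")
    weeks = df.index.get_level_values("weekofyear")
    # 1 January of each year, shifted forward by whole weeks.
    dates = pd.to_datetime(years.astype(str), format="%Y") + pd.to_timedelta(
        (weeks - 1) * 7, unit="D"
    )
    out = df.copy()
    out.index = pd.DatetimeIndex(dates, name="date")
    return out


# Tiny illustrative sample with the same layout as the question's data.
sample = pd.DataFrame(
    {
        "year": [1923, 1923, 1924, 2021],
        "weekofyear": [1, 2, 1, 53],
        "Hohokus_gage": [1474.895230, 551.502235, 937.870494, 687.228790],
    }
).set_index(["year", "weekofyear"])

ts = week_index_to_dates(sample)
```

With a DatetimeIndex in place, `ts["Hohokus_gage"].plot()` should draw one continuous line and let matplotlib label the x-axis with dates, so zooming in to a single year keeps meaningful tick labels.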
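    【Solution 2】:
    • A plotly solution; a similar approach can be applied to matplotlib
    • x is defined as a linearly increasing number
    • The xaxis is updated so that the ticks correspond to the actual data
    • hover_data is also modified so that it shows the multi-index details

    Solution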

    import io
    import plotly.express as px
    import numpy as np
    import pandas as pd
    
    df = pd.read_csv(
        io.StringIO(
            """year    weekofyear  Hohokus_gage
    1923    1   1474.8952296960001
    1923    2   551.5022346624
    1923    3   701.7062887296
    1923    4   1299.3555520512002
    1923    5   615.7461373056
    1923    6   471.4235672832001
    1923    7   412.60872683520006
    1923    8   287.74029634560003
    1923    9   889.9137781632
    1923    10  1555.8787407744
    1923    11  3254.2703641728
    1923    12  3719.8124474112
    1923    13  1543.6633508352002
    1923    14  1327.4057067263998
    1923    15  742.4242551936
    1923    16  443.3734126079999
    1923    17  942.3947127168
    1923    18  842.4094839552001
    1923    19  551.954656512
    1923    20  549.692547264
    1923    21  472.32841098240004
    1923    22  209.92373821440003
    1923    23  423.4668512256
    1923    24  178.2542087424
    1923    25  95.91343211520001
    1923    26  93.65132286720001
    1923    27  101.34249431040001
    1923    28  64.2439026432
    1923    29  48.4091379072
    1923    30  48.40913790719999
    1923    31  43.8849194112
    1923    32  51.1236690048
    1923    33  40.2655446144
    1923    34  37.098591667200004
    1923    35  47.50429420800001
    1923    36  59.719684147200006
    1923    37  57.00515304960001
    1923    38  95.00858841600001
    1923    39  94.5561665664
    1923    40  50.67124715520001
    1923    41  56.55273120000001
    1923    42  57.00515304960001
    1923    43  1145.9845450368
    1923    44  303.5750610816001
    1923    45  221.68670630399996
    1923    46  162.4194440064
    1923    47  260.59498536960007
    1923    48  821.145657024
    1923    49  1577.5949895551998
    1923    50  765.4977695232002
    1923    51  466.44692693760004
    1923    52  827.9319847679999
    1924    1   937.8704942208001
    1924    2   1147.7942324352
    1924    3   1825.0697412864004
    1924    4   917.0590891392
    1924    5   690.3957424896
    1924    6   754.6396451328002
    1924    7   477.30505132800005
    1924    8   493.59223791359994
    1924    9   404.91755539199994
    1924    10  921.1308857856002
    1924    11  1021.1161145472003
    1924    12  998.4950220671999
    1924    13  1707.4400603904005
    1924    14  2891.4280407936003
    1924    15  2825.8268726016
    1924    16  1787.0663059200006
    1924    17  1232.8495401599998
    1924    18  945.1092438144001
    1924    19  1869.4070825472002
    1924    20  2651.6444605056
    1924    21  1091.2415012351998
    1924    22  853.7200301952003
    1924    23  653.749572672
    1924    24  466.44692693760004
    1924    25  346.10271494400007
    1924    26  280.0491249024
    1924    27  151.10889776640005
    1924    28  422.1095856768
    1924    29  204.4946760192
    1924    30  95.4610102656
    1924    31  109.4860876032
    1924    32  113.55788424959998
    1924    33  232.09240884480002
    1924    34  165.58639695359997
    1924    35  214.44795671039998
    1924    36  74.19718333440001
    1924    37  230.28272144640002
    1924    38  92.746479168
    1924    39  103.1521817088
    1924    40  1199.8227451392002
    1924    41  205.39951971840003
    1924    42  125.7732741888
    1924    43  100.43765061119998
    1924    44  106.77155650560002
    1924    45  137.0838204288
    1924    46  150.20405406720002
    1924    47  406.72724279040006
    1924    48  319.86224766720005
    1924    49  347.4599804928
    1924    50  557.8361405568002
    1924    51  380.9391973631999
    1924    52  302.67021738240004
    2021    1   1255.0182107904
    2021    2   1712.8691225856
    2021    3   1165.4386845696001
    2021    4   971.8021329408001
    2021    5   1026.5451767424001
    2021    6   937.8704942208001
    2021    7   1464.9419490048003
    2021    8   1740.9192772607998
    2021    9   1986.1319197440002
    2021    10  1444.1305439232
    2021    11  1469.9185893504
    2021    12  1759.0161512448
    2021    13  1498.4211658752001
    2021    14  1211.1332913792
    2021    15  1927.7695011456003
    2021    16  1257.2803200383999
    2021    17  1105.2665785728
    2021    18  1245.5173519488
    2021    19  1066.8107213568
    2021    20  794.4527678976001
    2021    21  1899.2669246208
    2021    22  1340.0735185151998
    2021    23  908.9154958464001
    2021    24  804.4060485888001
    2021    25  741.5194114944001
    2021    26  690.3957424896001
    2021    27  92.746479168
    2021    53  687.2287895424"""
        ),
        sep=r"\s+",  # raw string avoids an invalid-escape warning for \s
    )
    
    # make it multi-index
    df = df.set_index(["year", "weekofyear"])
    
    # on axis show every 5th weekofyear
    px.line(
        df.assign(week=range(1, len(df) + 1)),
        x="week",
        y="Hohokus_gage",
        hover_data={
            "year": df.index.get_level_values("year").values,
            "weekofyear": df.index.get_level_values("weekofyear").values,
        },
    ).update_layout(
        xaxis={
            "tickmode": "array",
            "tickvals": list(range(1, len(df) + 1, 5)),
            "ticktext": df.index.get_level_values("weekofyear").values[::5],
        }
    ).show()
    

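    【Discussion】:

    • This is great. I did not mention in my post (I should have) that I want to show a hundred time series in one chart, with the possibility of zooming in to a single year, so the matplotlib approach may be better. Thanks though!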
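Since the comment above leans towards matplotlib, the same idea (a running x position, with ticks relabelled from the multi-index) might be sketched there as follows. This is an illustrative sketch, not code from the answer: the helper name `year_tick_positions`, the choice to put one tick at the first row of each year, and the small sample frame are all assumptions.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line to get a window
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def year_tick_positions(index):
    """Return (positions, labels): the first row of each year and that year."""
    years = index.get_level_values("year")
    # True at every row whose year has not appeared earlier in the index.
    first_rows = ~years.duplicated()
    positions = np.flatnonzero(first_rows)
    labels = [str(y) for y in years[positions]]
    return positions, labels


# Small illustrative sample with the same layout as the question's data.
sample = pd.DataFrame(
    {
        "year": [1923, 1923, 1923, 1924, 1924, 2021],
        "weekofyear": [1, 2, 3, 1, 2, 53],
        "Hohokus_gage": [1474.9, 551.5, 701.7, 937.9, 1147.8, 687.2],
    }
).set_index(["year", "weekofyear"])

# Running position of each row, so all years form one continuous line.
x = np.arange(len(sample))
positions, labels = year_tick_positions(sample.index)

fig, ax = plt.subplots()
ax.plot(x, sample["Hohokus_gage"].to_numpy())
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation=45)
ax.set_xlabel("year")
ax.set_ylabel("Hohokus_gage")
```

For a record spanning a century, one tick per year is likely too dense; slicing both lists, for example `positions[::10]` and `labels[::10]`, thins them out. Several series can be drawn on the same axes by repeating `ax.plot` for each column.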