【Question title】:Only last row of json list is populating in Pandas Dataframe - Draftkings
【Posted】:2021-08-08 06:42:42
【Question】:

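I am trying to load data from the Draftkings Sportsbook into a pandas DataFrame. Unfortunately, my attempt only populates the last row of the list into the DataFrame. Any suggestions on how to get this working?

Here is what I have so far: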

import requests
import pandas as pd
import numpy as np
from pandas import json_normalize
from functools import reduce


def parse_data(jsonData):
    results_df = pd.DataFrame()
    for alpha in jsonData['eventGroup']['offerCategories']:
    
        alpha_df = json_normalize(alpha).drop('offerSubcategoryDescriptors',axis=1)
     
        for theta in alpha['offerSubcategoryDescriptors']:
            theta_df = json_normalize(theta)
            theta_df.columns = [str(col) + '_offerssub' for col in theta_df.columns]
   
    temp_df = reduce(lambda left,right: pd.merge(left,right, left_index=True, right_index=True), [alpha_df,theta_df])
    results_df = results_df.append(temp_df, sort=True).reset_index(drop=True)
    return results_df

jsonData_dk_nba = requests.get('https://gaming-us-in.draftkings.com//sites/US-IN-SB/api/v2/eventgroup/103/full?includePromotions=true&format=json').json()

nba = parse_data(jsonData_dk_nba)
nbapanda=pd.DataFrame(nba)

【Question comments】:

Tags: python json pandas dataframe python-requests


【Solution 1】:

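This is a heavily nested response. I'm not sure whether you can narrow it down or whether you really want all of it, but flatten_json does a good job of extracting all the data. You just need to import the module.

If you want specific offer categories (there appear to be 8), you could create a DataFrame for each one. Below is the whole set in a single DataFrame.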

from flatten_json import flatten
dic_flattened = (flatten(d, '.') for d in jsonData_dk_nba['eventGroup']['offerCategories'])
df = pd.DataFrame(dic_flattened)
dft = df.T

df outputs this (over 16,000 columns and 8 rows):

       offerCategoryId          name  ...  offerSubcategoryDescriptors.4.offerSubcategory.offers.0.0.outcomes.30.participant offerSubcategoryDescriptors.4.offerSubcategory.offers.0.0.outcomes.30.hidden
    0               62    Game Lines  ...                                                                                NaN                                                                          NaN
    1               66  Series Lines  ...                                                                                NaN                                                                          NaN
    2               65  Player Props  ...                                                                                NaN                                                                          NaN
    3               60        Halves  ...                                                                                NaN                                                                          NaN
    4               61      Quarters  ...                                                                                NaN                                                                          NaN
    5               59    Game Props  ...                                                                                NaN                                                                          NaN
    6               69  Team Futures  ...                                                                                NaN                                                                          NaN
    7              576        Awards  ...                                                                     Gregg Popovich                                                                         True
    
    [8 rows x 16282 columns]
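For the per-category option mentioned above, one possible sketch is to split the flattened frame by its `name` column and drop the columns a given category never fills. The small `df` here is a made-up stand-in for the real flattened frame, and the value "Example subcategory" is invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the flattened frame `df` (one row per offer category).
df = pd.DataFrame([
    {"offerCategoryId": 62, "name": "Game Lines",
     "offerSubcategoryDescriptors.0.name": "Game"},
    {"offerCategoryId": 576, "name": "Awards",
     "offerSubcategoryDescriptors.0.name": "MVP",
     "offerSubcategoryDescriptors.4.name": "Example subcategory"},
])

# One frame per category, keeping only the columns that category actually fills.
per_category = {
    name: group.dropna(axis=1, how="all").reset_index(drop=True)
    for name, group in df.groupby("name")
}

print(per_category["Game Lines"].columns.tolist())
```

On the real data this should shrink each frame considerably, since most of the 16,000+ columns are NaN for any single category.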

Sometimes transposing it makes slicing a little easier. dft gives you this:

                                                                                                   0              1                 2         3            4                    5                    6               7
offerCategoryId                                                                                   62             66                65        60           61                   59                   69             576
name                                                                                      Game Lines   Series Lines      Player Props    Halves     Quarters           Game Props         Team Futures          Awards
offerSubcategoryDescriptors.0.subcategoryId                                                      262            979              2198       233          239                 4097                 1422            4896
offerSubcategoryDescriptors.0.name                                                              Game  Series Winner  First Field Goal  1st Half  1st Quarter  First Team to Score  Championship Winner             MVP
offerSubcategoryDescriptors.0.offerSubcategory.name                                             Game  Series Winner  First Field Goal  1st Half  1st Quarter  First Team to Score  Championship Winner             MVP
...                                                                                              ...            ...               ...       ...          ...                  ...                  ...             ...
offerSubcategoryDescriptors.4.offerSubcategory.offers.0.0.outcomes.30.oddsDecimal                NaN            NaN               NaN       NaN          NaN                  NaN                  NaN           251.0
offerSubcategoryDescriptors.4.offerSubcategory.offers.0.0.outcomes.30.oddsDecimalDisplay         NaN            NaN               NaN       NaN          NaN                  NaN                  NaN           251.0
offerSubcategoryDescriptors.4.offerSubcategory.offers.0.0.outcomes.30.oddsFractional             NaN            NaN               NaN       NaN          NaN                  NaN                  NaN           250/1
offerSubcategoryDescriptors.4.offerSubcategory.offers.0.0.outcomes.30.participant                NaN            NaN               NaN       NaN          NaN                  NaN                  NaN  Gregg Popovich
offerSubcategoryDescriptors.4.offerSubcategory.offers.0.0.outcomes.30.hidden                     NaN            NaN               NaN       NaN          NaN                  NaN                  NaN            True
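Separately from flattening, the symptom in the question itself looks like an indentation issue: the merge and append run after both loops have finished, so only the last `alpha_df` and `theta_df` survive. A minimal sketch of the loop with the combining step moved inside the inner loop follows. The `sample` dict is made-up data standing in for the live response, and `pd.concat` is used because `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd


def parse_data(json_data):
    """Build one row per (offer category, subcategory descriptor) pair."""
    rows = []
    for category in json_data["eventGroup"]["offerCategories"]:
        # Category-level fields, without the nested descriptor list.
        category_fields = {
            key: value for key, value in category.items()
            if key != "offerSubcategoryDescriptors"
        }
        category_df = pd.json_normalize(category_fields)
        for descriptor in category["offerSubcategoryDescriptors"]:
            descriptor_df = pd.json_normalize(descriptor).add_suffix("_offerssub")
            # Combine inside the inner loop so every pair is kept,
            # not just the last one.
            rows.append(category_df.join(descriptor_df))
    return pd.concat(rows, ignore_index=True, sort=True)


# Made-up sample with the same outer shape as the response.
sample = {"eventGroup": {"offerCategories": [
    {"offerCategoryId": 62, "name": "Game Lines",
     "offerSubcategoryDescriptors": [
         {"subcategoryId": 262, "name": "Game"},
         {"subcategoryId": 263, "name": "Alternate"}]},
    {"offerCategoryId": 66, "name": "Series Lines",
     "offerSubcategoryDescriptors": [
         {"subcategoryId": 979, "name": "Series Winner"}]},
]}}

result = parse_data(sample)
print(result)
```

Nested lists further down (such as the offers) stay as list-valued cells in this version, so it only fixes the missing rows and does not flatten everything.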

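【Comments】:

  • Thanks! This does provide all the data. How would you suggest I format the data so that all the game lines go into their own DataFrame, with each team populated on a separate row? They are currently all on the same row.
  • You need to find the dictionary level that works for json_normalize or flatten_json as a whole. You may have to process it in Python into a suitable list and then pass that to pandas. I only spent 10 minutes looking at it again, and it isn't clear to me what you want because I don't really know the data itself. Maybe Draftkings has a schema, or try the library I posted in my comments on the original question.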
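
As a rough illustration of processing the response in Python before handing it to pandas, the sketch below walks down to the outcome level and emits one record per outcome. The nesting (`offerSubcategory` → `offers` as a list of lists → `outcomes`) is inferred from the flattened column names shown in the answer, not from any published schema. The helper name `outcomes_frame` and the sample values ("Team A", "Team B", the odds) are made up.

```python
import pandas as pd


def outcomes_frame(json_data, category_name):
    """Return one row per outcome for the named offer category (hypothetical helper)."""
    records = []
    for category in json_data["eventGroup"]["offerCategories"]:
        if category.get("name") != category_name:
            continue
        for descriptor in category.get("offerSubcategoryDescriptors", []):
            subcategory = descriptor.get("offerSubcategory", {})
            # `offers` appears to be a list of lists of offer dicts.
            for offer_group in subcategory.get("offers", []):
                for offer in offer_group:
                    for outcome in offer.get("outcomes", []):
                        records.append(
                            {"subcategory": descriptor.get("name"), **outcome}
                        )
    return pd.DataFrame(records)


# Made-up sample following the inferred nesting.
sample = {"eventGroup": {"offerCategories": [
    {"name": "Game Lines", "offerSubcategoryDescriptors": [
        {"name": "Game", "offerSubcategory": {"name": "Game", "offers": [[
            {"outcomes": [
                {"participant": "Team A", "oddsDecimal": 1.91},
                {"participant": "Team B", "oddsDecimal": 1.91},
            ]},
        ]]}},
    ]},
    {"name": "Awards", "offerSubcategoryDescriptors": [{"name": "MVP"}]},
]}}

game_lines = outcomes_frame(sample, "Game Lines")
print(game_lines)
```

If the real field names differ from the inferred ones, the `.get(...)` calls simply yield an empty frame instead of raising, which makes it easy to adjust the keys step by step.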