【Posted】: 2020-07-18 10:33:41
【Question】:
The following code tries to use the Instagram scraper (https://github.com/realsirjoe/instagram-scraper) to extract 10 Instagram comments from 10 Instagram posts. The error I get is a TypeError ('NoneType' object is not subscriptable).
from igramscraper.instagram import Instagram
from time import sleep
import pandas as pd
import requests

instagram = Instagram()
pepsi = instagram.get_account('pepsi')
pepsi.media = instagram.get_medias("pepsi", 10)
p = [pp.__dict__ for pp in pepsi.media]
df = pd.DataFrame(p)
df['link']

comments_df = pd.DataFrame(columns=["link", "post_id", "comment"])
x1 = 10

def get_media_id(url):
    req = requests.get('https://api.instagram.com/oembed/?url={}'.format(url))
    media_id = req.json()['media_id']
    return media_id

def get_posts(link):
    id = get_media_id(link)
    comment0 = instagram.get_media_comments_by_id(id, 10)
    cdf = pd.DataFrame(columns=["link", "post_id", "comment"])
    for comment in comment0['comments']:
        cdf = cdf.append({"link": link, "post_id": id, "comment": comment.text}, ignore_index=True)
    return cdf

for index, row in df.iterrows():
    comments_df = comments_df.append(get_posts(row["link"]), ignore_index=True)

comments_df
The error I get is as follows:
TypeError Traceback (most recent call last)
<ipython-input-27-8e72850a113a> in <module>
56
57 for index, row in df.iterrows():
---> 58 comments_df=comments_df.append(get_posts(row["link"]), ignore_index = True)
59
60
<ipython-input-27-8e72850a113a> in get_posts(link)
41 id=get_media_id(link)
42
---> 43 comment0 = instagram.get_media_comments_by_id(id, 10)
44
45 cdf=pd.DataFrame(columns = ["link","post_id", "comment"])
TypeError: 'NoneType' object is not subscriptable
【Comments】:
- What have you tried so far? The error means that no data is available.
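As the comment suggests, this TypeError typically means that some call returned None and the result was then indexed. Below is a minimal sketch of one way to keep the loop running when a post yields no data. The names `collect_comments`, `fake_get_media_id`, and `fake_get_comments` are hypothetical and introduced only for illustration; the two stubs merely simulate the question's `get_media_id()` and `instagram.get_media_comments_by_id()` so that the example runs without network access.

```python
from types import SimpleNamespace


def collect_comments(links, get_media_id, get_comments, count=10):
    """Return (rows, skipped): one row per comment, plus the posts that failed.

    get_media_id and get_comments are passed in as callables; they are
    assumed to behave like the question's get_media_id() and
    instagram.get_media_comments_by_id().
    """
    rows, skipped = [], []
    for link in links:
        try:
            media_id = get_media_id(link)
            result = get_comments(media_id, count)
        except (TypeError, KeyError, ValueError) as exc:
            # A None or malformed response was indexed somewhere on the way.
            skipped.append((link, repr(exc)))
            continue
        if not result or not result.get("comments"):
            skipped.append((link, "no comments returned"))
            continue
        for comment in result["comments"]:
            rows.append({"link": link, "post_id": media_id, "comment": comment.text})
    return rows, skipped


# Stubs (assumptions, not the real library) that imitate a good and a bad post.
def fake_get_media_id(link):
    return {"post-a": "1", "post-b": "2"}[link]


def fake_get_comments(media_id, count):
    if media_id == "2":
        response = None               # simulate an empty reply
        return response["comments"]   # raises "'NoneType' object is not subscriptable"
    return {"comments": [SimpleNamespace(text="nice")]}


rows, skipped = collect_comments(["post-a", "post-b"], fake_get_media_id, fake_get_comments)
print(rows)
print(skipped)
```

In the original script the two real functions would be passed in place of the stubs, and `pd.DataFrame(rows)` would build the final table in one step instead of calling `append` inside the loop (`DataFrame.append` was removed in pandas 2.0). The `skipped` list records which links produced no data, which is a starting point for finding out why.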
Tags: python python-3.x pandas web-scraping python-requests