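【Posted】: 2021-07-21 12:31:07
【Question】:
I wrote a script that extracts the comments of a YouTube video, given its video ID, and stores them in a file. When the video has fewer than 10-15 comments there is no problem and the script runs fine, but when there are more it enters an infinite loop, and I don't know why.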
from googleapiclient.discovery import build
import os

api_key = '...'

def video_comments(video_id):
    # empty file for storing comments
    outputFile = open("comments_"+video_id+".txt", "w", encoding='utf-8')
    # empty list to store the collected data
    commentsDict = []
    # empty list for storing replies
    replies = []
    # creating youtube resource object
    youtube = build('youtube', 'v3',
                    developerKey=api_key)
    # retrieve youtube video results
    video_response=youtube.commentThreads().list(
        part='snippet,replies',
        videoId=video_id
    ).execute()
    # iterate video response
    while video_response:
        # extracting required info
        # from each result object
        for item in video_response['items']:
            # Extracting comments
            comment = item['snippet']['topLevelComment']['snippet']['textDisplay']
            commentEntrie = {"comment": comment, 'replies': []}
            # counting number of replies to the comment
            replycount = item['snippet']['totalReplyCount']
            # if there are replies
            if replycount>0:
                # iterate through all replies
                for reply in item['replies']['comments']:
                    # Extract reply
                    reply = reply['snippet']['textDisplay']
                    # Store reply in list
                    replies.append(reply)
                    commentEntrie['replies'].append(reply)
            # print comment with list of replies
            print(comment, replies, end = '\n\n')
            outputFile.write("%s" % comment)
            outputFile.write("%s\n" % replies)
            commentsDict.append(commentEntrie)
            # empty reply list
            replies = []
        # Again repeat
        if 'nextPageToken' in video_response:
            video_response = youtube.commentThreads().list(
                part = 'snippet,replies',
                videoId = video_id
            ).execute()
        else:
            break
    outputFile.close()
    print(commentsDict)

# Enter video id
video_id = "aDHYbM9OqUc"
# Call function
video_comments(video_id)
I can give two video IDs: LVgKlfw4DHc works fine, but aDHYbM9OqUc ends in an infinite loop.
Any ideas?
[Edit] I think nextPageToken is always present in the response, so the loop runs forever.
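The observation in the edit points at a likely cause: the follow-up request in the loop is identical to the first one, so it probably fetches the first page again, and that page always carries a nextPageToken. Below is a minimal sketch of pagination that passes the token back through the `pageToken` parameter of `commentThreads().list`. The helper name `iter_comment_threads` is hypothetical, and `maxResults=100` is only an assumption to reduce the number of requests; it is not part of the original script.

```python
def iter_comment_threads(youtube, video_id):
    """Yield every comment thread item of a video, one page at a time.

    `youtube` is assumed to be the resource object returned by
    googleapiclient.discovery.build('youtube', 'v3', developerKey=...).
    """
    page_token = None
    while True:
        params = {
            "part": "snippet,replies",
            "videoId": video_id,
            "maxResults": 100,  # assumption: larger pages, fewer requests
        }
        if page_token:
            # Ask for the page that the previous response pointed to.
            # Omitting this repeats the first page on every iteration.
            params["pageToken"] = page_token

        response = youtube.commentThreads().list(**params).execute()
        yield from response.get("items", [])

        # The last page has no nextPageToken, which ends the loop.
        page_token = response.get("nextPageToken")
        if not page_token:
            break
```

With this helper, the body of the `for item in video_response['items']` loop could run unchanged over `iter_comment_threads(youtube, video_id)`, and the outer `while` loop would no longer be needed.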
【Discussion】:
Tags: python loops youtube-api youtube-data-api