Martinaoh

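For scraping danmu (bullet comments), if all we need is the data visible on the web page, Selenium is enough. Live-streaming platforms often also expose a third-party API for fetching data (danmu, the sender's name, gifts, and so on; this requires the client to send a login request to the danmu server, keep sending heartbeat messages, etc.). Here we only grab the danmu text and save it to a txt file. Code first, then the result.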
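For reference, the API route mentioned above (login request plus heartbeat) might start from something like the sketch below. It only builds the bytes of a message and does not open any connection. Everything specific in it is an assumption based on my recollection of Douyu's older third-party danmu protocol and is not confirmed by this post: the `key@=value/` serialization with `@A`/`@S` escapes, the 12-byte little-endian header, the type code `689`, and the `loginreq`/`mrkl` message types. The helper names `stt_serialize` and `encode_packet` are made up for illustration.

```python
import struct

# Assumed client-to-server message type code (from memory of the protocol doc)
CLIENT_MSG_TYPE = 689


def stt_escape(value):
    """Escape '@' and '/' so they cannot be confused with separators."""
    # '@' must be replaced first, because the '/' escape itself contains '@'
    return str(value).replace("@", "@A").replace("/", "@S")


def stt_serialize(fields):
    """Serialize a dict as 'key@=value/' pairs (assumed wire format)."""
    return "".join(f"{stt_escape(k)}@={stt_escape(v)}/" for k, v in fields.items())


def encode_packet(body):
    """Wrap a serialized body in the assumed 12-byte header."""
    payload = body.encode("utf-8") + b"\x00"  # body is NUL-terminated
    # The length counts everything after the first length field:
    # second length (4) + type/encrypt/reserved (4) + payload
    length = len(payload) + 8
    header = struct.pack("<IIHBB", length, length, CLIENT_MSG_TYPE, 0, 0)
    return header + payload


# Hypothetical messages: a login request for a room, and a heartbeat
login_packet = encode_packet(stt_serialize({"type": "loginreq", "roomid": 123456}))
heartbeat_packet = encode_packet(stt_serialize({"type": "mrkl"}))
```

A real client would send the login packet over a socket once and then resend the heartbeat on a timer. Check the platform's current documentation before relying on any of these field names.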


The Selenium code is as follows:

import time

from selenium import webdriver
from selenium.common.exceptions import StaleElementReferenceException
from selenium.webdriver.common.by import By

chrome_options = webdriver.ChromeOptions()
# Uncomment to run Chrome in headless (no UI) mode
# chrome_options.add_argument('--headless')
# chrome_options.add_argument('--disable-gpu')
# Do not load images, which speeds up the page
prefs = {"profile.managed_default_content_settings.images": 2}
chrome_options.add_experimental_option("prefs", prefs)
# Selenium 4 expects `options=`; the old `chrome_options=` keyword was removed
browser = webdriver.Chrome(options=chrome_options)
url = 'https://www.douyu.com/'

# Class name as seen on the page when this was written; it may change
DANMU_XPATH = './/div[@class=" danmu-6e95c1"]/div/div'


def getDanmu(homeId):
    homeHref = url + str(homeId)
    browser.get(homeHref)

    while True:
        # Poll the danmu container every 2 seconds
        time.sleep(2)
        try:
            for i in browser.find_elements(By.XPATH, DANMU_XPATH):
                text = i.text
                if len(text) > 0:
                    try:
                        print(text)
                    except UnicodeEncodeError:
                        # Some consoles cannot display every character
                        pass
                    saveDanmu(text)
        except StaleElementReferenceException:
            # A danmu left the page while we were reading it; retry on the next poll
            continue


def saveDanmu(danmu):
    # Append each danmu as one UTF-8 line
    with open('danmu.txt', 'a', encoding='utf-8') as f:
        f.write(danmu + '\n')


if __name__ == '__main__':
    num = input('Enter the room ID to query: ')
    getDanmu(num)
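One thing to watch for: the loop above re-reads every danmu still visible on the page at each poll, so the same line can end up in `danmu.txt` more than once. A minimal sketch of one way to filter repeats is shown below. The class name `RecentFilter` and its `capacity` parameter are my own, not part of the original script. It also assumes that identical text means the same danmu, which will drop genuine repeats (for example, two viewers sending the same message).

```python
from collections import deque


class RecentFilter:
    """Remember the last `capacity` texts and report whether a text is new."""

    def __init__(self, capacity=500):
        self._order = deque()   # insertion order, oldest on the left
        self._seen = set()      # fast membership test
        self._capacity = capacity

    def is_new(self, text):
        if text in self._seen:
            return False
        self._seen.add(text)
        self._order.append(text)
        # Forget the oldest entry once the window is full
        if len(self._order) > self._capacity:
            self._seen.discard(self._order.popleft())
        return True


recent = RecentFilter(capacity=500)
```

Inside the polling loop, the save call could then be guarded with `if recent.is_new(text): saveDanmu(text)`.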

 

 

Here I scraped the stream room of 大司马 (Dasima). Wuhu, takeoff~

 
