Python Telegram bot that gets stats from Scrapy

【Question title】: Python Telegram bot that gets stats from Scrapy
【Posted】: 2020-08-01 11:32:52
【Question description】:

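I want to write a Telegram bot that serves Scrapy stats on request. My attempt mostly works; the only problem is that force-closing the spider (obviously) doesn't stop the bot.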

So I have two questions:

Is my general approach right?
Is it possible to shut down the bot when the spider is force-closed?

Here is the relevant class:

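from scrapy import signals
from telegram.ext import CommandHandler, Updater

import telegram_credentials  # local module holding the bot token (not shown in the question)


class TelegramBot(object):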
    telegram_token = telegram_credentials.token

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler)

    def __init__(self, crawler):
        self.crawler = crawler

        cs = crawler.signals
        cs.connect(self._spider_closed, signal=signals.spider_closed)

        # Set up and start the bot.
        # Create the Updater and pass it your bot's token.
        # Make sure to set use_context=True to use the new context based callbacks
        # Post version 12 this will no longer be necessary
        self.updater = Updater(self.telegram_token, use_context=True)

        # Get the dispatcher to register handlers
        dp = self.updater.dispatcher

        # on different commands - answer in Telegram
        dp.add_handler(CommandHandler("stats", self.stats))

        # Start the Bot
        self.updater.start_polling()

    def _spider_closed(self, spider, reason):
        # Stop the Bot
        self.updater.stop()

    def stats(self, update, context):
        # Send a message with the stats
        msg = (
            "Spider "
            + self.crawler.spider.name
            + " stats: "
            + str(self.crawler.stats.get_stats())
        )

        update.message.reply_text(msg)
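
The `stats` handler above sends the raw `str()` of the stats dict. As a minimal sketch of a more readable reply, the hypothetical helper `format_stats` below (not part of the original code) prints one key per line and truncates the text to Telegram's 4096-character limit for a single message. It assumes the stats are a plain dict, as returned by `crawler.stats.get_stats()`.

```python
TELEGRAM_MAX_MESSAGE_LENGTH = 4096  # Telegram's limit for one text message


def format_stats(spider_name, stats, limit=TELEGRAM_MAX_MESSAGE_LENGTH):
    """Render a stats dict as one 'key: value' line per entry.

    Hypothetical helper for illustration; the text is truncated so that it
    still fits into a single Telegram message.
    """
    lines = [f"Spider {spider_name} stats:"]
    # Sort by the string form of the key so the output order is stable.
    for key in sorted(stats, key=str):
        lines.append(f"{key}: {stats[key]}")
    text = "\n".join(lines)
    if len(text) > limit:
        text = text[: limit - 3] + "..."
    return text
```

Inside the handler this could be used as `update.message.reply_text(format_stats(self.crawler.spider.name, self.crawler.stats.get_stats()))`.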

You can find my complete code, applied to the quotes spider from the Scrapy tutorial, at https://github.com/jtommi/scrapy_telegram-bot_example/blob/master/tutorial/tutorial/telegram-bot.py

My code is a combination of:

The latency extension from the book "Learning Scrapy": https://github.com/scalingexcellence/scrapybook/blob/master/ch09/properties/properties/latencies.py
The echobot example from the python-telegram-bot library: https://github.com/python-telegram-bot/python-telegram-bot/blob/master/examples/echobot.py
The official Scrapy documentation on stats collection: https://docs.scrapy.org/en/latest/topics/stats.html

【Question discussion】:

Tags: python-3.x scrapy twisted python-telegram-bot

【Solution 1】:

Calling updater.stop() will definitely stop the bot. From the python-telegram-bot documentation:

 """Stops the polling/webhook thread, the dispatcher and the job queue."""

Check whether updater.stop() is actually called after the spider closes. The bot may not stop immediately, but it will stop eventually.
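
One way to run that check is to log around the call. The sketch below is only an illustration: `stop_updater` is a hypothetical helper that `_spider_closed` could delegate to, and `RecordingUpdater` is a stand-in for `telegram.ext.Updater` so the snippet runs without a bot token.

```python
import logging

logger = logging.getLogger(__name__)


def stop_updater(updater, reason):
    """Stop the bot and log before and after, so the shutdown shows up in the log."""
    logger.info("Spider closed (%s); stopping Telegram updater", reason)
    updater.stop()
    logger.info("Telegram updater stopped")


class RecordingUpdater:
    """Hypothetical stand-in for telegram.ext.Updater, used only for this demo."""

    def __init__(self):
        self.stop_calls = 0

    def stop(self):
        self.stop_calls += 1


demo_updater = RecordingUpdater()
stop_updater(demo_updater, "finished")
```

If the first log line never appears after a forced shutdown, the `spider_closed` handler was not reached at all, which points at the crawler shutdown rather than at the bot.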

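【Discussion】:

I think the problem here isn't the bot but Scrapy. When _spider_closed is called, the bot does stop. But when I press Ctrl-C twice, the spider doesn't shut down cleanly, so the bot isn't stopped. The question is whether there is a way to run code when I press Ctrl-C twice.
You can use KeyboardInterrupt. When you press Ctrl-C, a KeyboardInterrupt exception is raised; you can catch it in your code and stop the spider gracefully.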
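
The pattern suggested in the last comment can be sketched in plain Python as follows. This is a sketch under assumptions, not a confirmed fix: `run_with_cleanup` and `FakeUpdater` are hypothetical names, and Scrapy installs its own Ctrl-C handling, so whether a KeyboardInterrupt actually reaches your code in a given setup needs to be verified.

```python
class FakeUpdater:
    """Hypothetical stand-in for telegram.ext.Updater (start_polling/stop only)."""

    def __init__(self):
        self.running = False

    def start_polling(self):
        self.running = True

    def stop(self):
        self.running = False


def run_with_cleanup(work, updater):
    """Run work() and always stop the updater, even if Ctrl-C interrupts it."""
    updater.start_polling()
    try:
        work()
    except KeyboardInterrupt:
        # Ctrl-C surfaces here only if no other SIGINT handler consumed it.
        return "interrupted"
    finally:
        # Runs on normal completion, on KeyboardInterrupt and on other errors.
        updater.stop()
    return "finished"


def interrupted_work():
    # Simulates the user pressing Ctrl-C while the work is running.
    raise KeyboardInterrupt


demo_updater = FakeUpdater()
demo_result = run_with_cleanup(interrupted_work, demo_updater)
```

The `finally` clause is the important part: the cleanup runs on every exit path, so the bot is stopped whether the work finishes or is interrupted.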