1. Spider (the scheduling framework that drives the whole crawler)
2. Downloader (page downloading)
3. PageProcessor (link extraction and page analysis)
4. Scheduler (URL management)
5. Pipeline (offline analysis and persistence)
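
How these five components hand work to one another can be sketched in plain Java. This is a minimal, self-contained illustration under stated assumptions, not the framework's actual API: the class `MiniCrawler`, its nested interfaces, the `ProcessedPage` holder, the `demo()` helper, and the in-memory pages `/a`, `/b`, `/c` are all hypothetical names introduced here, and the Downloader reads from a map instead of the network.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified stand-ins for the five components (assumption, not a real library API).
class MiniCrawler {

    /** Downloader: turns a URL into page content, or null on failure. */
    interface Downloader { String download(String url); }

    /** PageProcessor: extracts a result and follow-up links from one page. */
    interface PageProcessor { ProcessedPage process(String url, String html); }

    /** Pipeline: consumes extracted results, e.g. to persist them. */
    interface Pipeline { void persist(String url, String result); }

    /** Output of a PageProcessor: the extracted result plus newly found links. */
    static final class ProcessedPage {
        final String result;
        final List<String> links;
        ProcessedPage(String result, List<String> links) {
            this.result = result;
            this.links = links;
        }
    }

    /** Scheduler: holds pending URLs and drops any URL it has already seen. */
    static final class Scheduler {
        private final Queue<String> pending = new ArrayDeque<>();
        private final Set<String> seen = new HashSet<>();
        void push(String url) { if (seen.add(url)) pending.add(url); }
        String poll() { return pending.poll(); }
    }

    /** Spider: drives the loop Scheduler -> Downloader -> PageProcessor -> Pipeline. */
    static void run(String startUrl, Downloader downloader,
                    PageProcessor processor, Pipeline pipeline) {
        Scheduler scheduler = new Scheduler();
        scheduler.push(startUrl);
        String url;
        while ((url = scheduler.poll()) != null) {
            String html = downloader.download(url);
            if (html == null) continue;           // skip pages that failed to download
            ProcessedPage page = processor.process(url, html);
            page.links.forEach(scheduler::push);  // newly found links go back to the Scheduler
            pipeline.persist(url, page.result);   // extracted result goes to the Pipeline
        }
    }

    /** Wires the components together over an in-memory "site" and returns what was stored. */
    static List<String> demo() {
        Map<String, String> site = Map.of(
            "/a", "<title>A</title><a href=\"/b\">b</a><a href=\"/c\">c</a>",
            "/b", "<title>B</title><a href=\"/a\">a</a>",
            "/c", "<title>C</title>");
        Pattern link = Pattern.compile("href=\"([^\"]+)\"");
        Pattern title = Pattern.compile("<title>(.*?)</title>");
        List<String> stored = new ArrayList<>();
        run("/a",
            site::get,                            // Downloader backed by the map
            (url, html) -> {                      // PageProcessor: regex-based extraction
                List<String> links = new ArrayList<>();
                Matcher m = link.matcher(html);
                while (m.find()) links.add(m.group(1));
                Matcher t = title.matcher(html);
                return new ProcessedPage(t.find() ? t.group(1) : "", links);
            },
            (url, result) -> stored.add(url + " -> " + result)); // Pipeline: collect in memory
        return stored;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this sketch the Spider is just the `run` loop, and the other four components are passed in, so each one can be swapped independently (for example, a Pipeline that writes to a file instead of a list). The Scheduler's `seen` set is what keeps the back-link from `/b` to `/a` from causing an endless crawl.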