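【Posted】:2020-07-19 11:10:57
【Problem description】:
I am trying to build a web crawler by following YouTube tutorials and similar resources.
I have run into a problem: Python reports that one of my classes takes no arguments (my other classes do take arguments and have more or less the same structure).
The class (crawler.py). I am using __init__, in which I accept 3 arguments.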
class Crawler:
    # class variables are shared among all crawler instances
    project_name = ''
    home_url = ''
    site_domain = ''
    # use set to speed up read/write process
    file_queue = ''
    queue = set()
    # use set to speed up read/write process
    file_crawled = ''
    crawled = set()

    def __init__(self, project_name, home_url, site_domain):
        Crawler.project_name = project_name
        Crawler.home_url = home_url
        Crawler.site_domain = site_domain
        Crawler.file_queue = Crawler.project_name + '/links_on_queue.txt'
        Crawler.file_crawled = Crawler.project_name + '/links_crawled.txt'
        self.starter(Crawler.site_domain)
        self.crawl_page('first_crawler', Crawler.home_url)
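The comment about class variables being shared can be illustrated with a minimal sketch. The class names `SharedCrawler` and `PerInstanceCrawler` are hypothetical and exist only for this illustration; they are not part of the original project.

```python
class SharedCrawler:
    # class attribute: a single set shared by every instance
    queue = set()

    def __init__(self, project_name):
        # assigning through the class rebinds the value for all instances
        SharedCrawler.project_name = project_name


class PerInstanceCrawler:
    def __init__(self, project_name):
        # instance attributes: each object keeps its own values
        self.project_name = project_name
        self.queue = set()


a = SharedCrawler('first')
b = SharedCrawler('second')   # overwrites the name seen through `a` as well
c = PerInstanceCrawler('first')
d = PerInstanceCrawler('second')

print(a.project_name, b.project_name)
print(c.project_name, d.project_name)
```

Sharing state through the class is a reasonable choice when several worker threads are meant to use one queue, but it also means a second `Crawler(...)` call silently replaces the settings of the first.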
Where it is called, on the last line (main.py):
import threading
from queue import Queue
# import from files
from crawler import Crawler
from domain_finder import *
from general_crawler_functions import *
# like group members working on different parts, the program
# creates multiple threads that work simultaneously
PROJECT_NAME = 'Demoblaze'
HOME_URL = 'https://www.demoblaze.com/'
DOMAIN_NAME = get_domain(HOME_URL)
FILE_QUEUE = PROJECT_NAME + '/links_on_queue.txt'
FILE_CRAWLED = PROJECT_NAME + '/links_crawled.txt'
THREAD_COUNT = 4
# queue of threads
queue = Queue()
Crawler(PROJECT_NAME, HOME_URL, DOMAIN_NAME)
The error says that this class takes no arguments, but 3 were given.
In case it helps, I am using PyCharm Community on Windows.
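One possible way to get this kind of error, offered as a sketch rather than a diagnosis of the actual file, is an `__init__` that ends up outside the class body because of indentation. The class then falls back to the default constructor, which accepts no extra arguments. The string `'demoblaze.com'` is a placeholder, since the real return value of `get_domain` is not shown.

```python
class Crawler:
    project_name = ''


# Not indented under the class, so this is just a module-level function
# that happens to be called __init__; Crawler never sees it.
def __init__(self, project_name, home_url, site_domain):
    Crawler.project_name = project_name


# True only if the class body itself defines __init__
has_own_init = '__init__' in vars(Crawler)
print('Crawler defines __init__:', has_own_init)

error_message = ''
try:
    Crawler('Demoblaze', 'https://www.demoblaze.com/', 'demoblaze.com')
except TypeError as exc:
    error_message = str(exc)
    print('TypeError:', error_message)
```

A misspelled method name such as `_init_` (single underscores) would have the same effect, because Python only treats the exact name `__init__` as the initializer.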
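【Comments】:
-
Crawler.project_name ==> self.project_name =, and the same for the others
-
Please indent the class Crawler here like it is in your actual code
-
Is the indentation in your code the same as it is here? It does not look right.
-
The indentation only looks bad here; that happened when I pasted the code.
-
Are you sure
Crawler is not being overwritten/shadowed by one (or both) of your star imports? from general_crawler_functions import * in particular sounds like something that could contain it.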
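The shadowing suggested in the last comment can be checked with a small helper. This is a sketch under assumptions: `describe_class` is a hypothetical helper, and the second class definition below only stands in for what a later `from some_module import *` would do if that module also exported a name `Crawler`.

```python
def describe_class(cls):
    """Report where a class was defined and whether it has its own __init__."""
    return {
        'module': cls.__module__,
        'defines_init': '__init__' in vars(cls),
    }


class Crawler:
    def __init__(self, project_name, home_url, site_domain):
        self.project_name = project_name


intended = describe_class(Crawler)


# Stand-in for a star import that rebinds the name `Crawler`
# to a different class without a matching __init__.
class Crawler:
    pass


shadowed = describe_class(Crawler)

print('intended:', intended['defines_init'])
print('shadowed:', shadowed['defines_init'])
```

In main.py, printing `Crawler.__module__` just before the failing call should show `crawler` if the intended class is still bound to that name. Replacing the star imports with explicit ones, for example `from domain_finder import get_domain`, avoids this kind of collision.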
Tags: python python-3.x web-crawler