【Question Title】: Python web crawling error: transporting the data to DB
【Posted】: 2021-08-14 06:39:41
【Question Description】:

Here's the error

import requests
import time
from bs4 import BeautifulSoup
from DB.model import CrawlingBook
from datetime import datetime

url = "https://www.aladin.co.kr/shop/common/wbest.aspx?BestType=Bestseller&BranchType=1&CID=0&cnt=1000&SortOrder=1&page="

# Walk through bestseller pages 1 to 19
for i in range(1, 20):
    pageUrl = url + str(i)
    response = requests.get(pageUrl)
    html = response.text

    parsedHtml = BeautifulSoup(html, 'html.parser')
    tableList = parsedHtml.select('#Myform .ss_book_box')

    for book in tableList:
        imgUrl = book.select('table')[0].select('img')[0].get('src')
        title = book.select('.ss_book_list')[0].select('ul .bo3')[0].text

        # The author is one <li> further down when an '.ss_ht1' element is present
        authorIndex = 1
        if book.select('.ss_book_list')[0].select('ul .ss_ht1'):
            authorIndex = 2
        author = book.select('.ss_book_list')[0].select('ul li')[authorIndex].select('a')[0].text

        now = datetime.now()

        # Fill the model object with the scraped fields
        crawlingBook = CrawlingBook()
        crawlingBook.title = title
        crawlingBook.author_name = author
        crawlingBook.img_url = imgUrl
        crawlingBook.create_at = str(now)

    # Korean message: "page crawling complete..."
    print(i, '페이지 크롤링 완료...')
    time.sleep(1)

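I want to crawl book information (title, author_name, img_url) and record when each row was created. But I'm stuck on transferring the data into my database (MySQL). Any help would be appreciated.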

【Comments】:

    Tags: python database web-crawler


    【Solution 1】:

    According to the error, the pymysql library is not installed.
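A quick way to confirm this diagnosis is to ask Python whether the module can be found in the current environment. The sketch below uses only the standard library; the helper name `is_installed` is made up for illustration, and it assumes you run it with the same interpreter that runs the crawler.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located by the import system."""
    return importlib.util.find_spec(module_name) is not None


# Prints False in an environment where the driver is missing.
print("pymysql installed:", is_installed("pymysql"))
```

If this prints `False` even after installing, the package was probably installed into a different interpreter or virtual environment than the one running the script.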

    You can install it like this:

    pip install PyMySQL
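Once the driver is available, the scraped fields still have to be written and committed. The question does not show the `CrawlingBook` model, so the following is only a rough sketch that bypasses it: it uses the standard-library `sqlite3` module as a stand-in for MySQL, and the table name `crawling_book` and the helper `save_book` are hypothetical. Both `sqlite3` and PyMySQL follow the Python DB-API, but PyMySQL expects `%s` placeholders where `sqlite3` uses `?`.

```python
import sqlite3
from datetime import datetime


def save_book(conn, title, author_name, img_url):
    """Insert one scraped book and commit so the row is actually persisted."""
    conn.execute(
        "INSERT INTO crawling_book (title, author_name, img_url, create_at) "
        "VALUES (?, ?, ?, ?)",
        (title, author_name, img_url, str(datetime.now())),
    )
    conn.commit()


# In-memory database used purely for demonstration (assumption, not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE crawling_book ("
    "id INTEGER PRIMARY KEY, title TEXT, author_name TEXT, "
    "img_url TEXT, create_at TEXT)"
)

save_book(conn, "Example title", "Example author", "https://example.com/cover.jpg")
rows = conn.execute(
    "SELECT title, author_name, img_url FROM crawling_book"
).fetchall()
print(rows)
```

In the crawler, a call like `save_book(...)` would sit inside the `for book in tableList:` loop. If the model comes from an ORM, the equivalent step is usually adding the object to a session and committing, which the posted code does not appear to do.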
    

    【Discussion】:
