Is there a simple way to read lines from a text file into this Beautiful Soup Python script? Answers

【Question title】: Is there a simple way to read lines from a text file into this Beautiful Soup Python script?
【Posted】: 2020-08-09 06:54:46
【Question description】:

How can I read the lines of a .txt file into this script, instead of having to list the URLs in the script itself? Thanks.

from bs4 import BeautifulSoup
import requests

url = "http://www.url1.com"

response = requests.get(url)

data = response.text

soup = BeautifulSoup(data, 'html.parser')

categories = soup.find_all("a", {"class":'navlabellink nvoffset nnormal'})

for category in categories:
    print(url + "," + category.text)

My text file contains one URL per line (newline-delimited):

http://www.url1.com
http://www.url2.com
http://www.url3.com
http://www.url4.com
http://www.url5.com
http://www.url6.com
http://www.url7.com
http://www.url8.com
http://www.url9.com

Tags: python beautifulsoup python-requests text-files readline

【Solution 1】:

To read the URLs from a.txt, you can use the following script:

import requests
from bs4 import BeautifulSoup


with open('a.txt', 'r') as f_in:
    # Strip the trailing newline from each line and skip blank lines
    for url in map(str.strip, f_in):
        if not url:
            continue

        response = requests.get(url)
        data = response.text
        soup = BeautifulSoup(data, 'html.parser')
        categories = soup.find_all("a", {"class":'navlabellink nvoffset nnormal'})

        # Print each category next to the URL it was found on
        for category in categories:
            print(url + "," + category.text)

【Solution 2】:

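# Read every line of the file into a list; the file is closed automatically
with open('text.file', 'r') as file1:
    lines = file1.readlines()

# Strip the newline character from each line and print it with its index
for count, line in enumerate(lines):
    print("Line{}: {}".format(count, line.strip()))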

Then simply use each stripped line as the value of the url variable in your script.
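
As a minimal sketch of that substitution, the snippet below reads the file into a list of clean URLs and loops over it. The helper name `read_urls` and the throwaway demo file are assumptions made for illustration; they are not part of the original answer. The body of the final loop is where the `requests.get(url)` call and the Beautiful Soup parsing from the question would go.

```python
import os
import tempfile


def read_urls(path):
    """Return the stripped, non-empty lines of the file at `path` (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


# Demo with a throwaway file so the sketch runs on its own
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "text.file")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("http://www.url1.com\nhttp://www.url2.com\n\n")
    urls = read_urls(demo_path)

for url in urls:
    # In the question's script, requests.get(url) and the parsing would go here
    print(url)
```

Skipping empty lines matters because a trailing blank line in the file would otherwise be passed to `requests.get` as an empty URL.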

【Solution 3】:

For the sake of this example, assume your file is named urls.txt. In Python, opening a file and reading its contents is very easy.

with open('urls.txt', 'r') as f:
    urls = f.read().splitlines()
#Your list of URLs is now in the urls list!

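The 'r' after 'urls.txt' simply tells Python to open the file in read mode. If you don't need to modify the file, it is best to open it read-only. f.read() returns the entire contents of the file as one string, including the newline characters (\n); splitlines() then splits that string at the line boundaries, drops the newline characters, and gives you a list.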
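
To make the difference concrete, here is a small sketch comparing `readlines()` with `read().splitlines()`. It uses `io.StringIO` as a stand-in for the real file so that it runs without urls.txt being present; the sample text and the variable names are assumptions for illustration only.

```python
import io

# In-memory stand-in for urls.txt, including one blank line
text = "http://www.url1.com\nhttp://www.url2.com\n\nhttp://www.url3.com\n"

# readlines() keeps the trailing "\n" on every line
with_newlines = io.StringIO(text).readlines()

# read().splitlines() drops the newline characters
without_newlines = io.StringIO(text).read().splitlines()

# Blank lines survive as empty strings, so filter them out before requesting
urls = [u for u in without_newlines if u]

print(with_newlines[0] == "http://www.url1.com\n")  # True
print(urls)
```

Note that `splitlines()` does not remove blank lines by itself, which is why the last step filters out empty strings.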
