【Question Title】: Multiprocessing Pool usage with requests
【Posted】: 2021-08-13 15:35:03
【Question】:

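Good day,

I am developing a directory scanner and trying to make it as fast as possible. I have been looking into multiprocessing, but I am not convinced I am using it correctly.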

from multiprocessing import Pool
import requests
import sys

def dir_scanner(wordlist, dest_address, file_ext):
    print(f"Scanning Target: {dest_address} looking for files ending in {file_ext}")
    # read the wordlist, one candidate name per line
    with open(wordlist) as wordlist_file:
        dir_list = wordlist_file.read().splitlines()

    # empty list for discovered dirs
    discovered_dirs = []

    # make requests for each potential dir location 
    for dir_item in dir_list:
        req_url = f"http://{dest_address}/{dir_item}.{file_ext}"
        req_dir = requests.get(req_url)
        print(req_url)
        if req_dir.status_code==404:
            pass

        else:
            print("Directory discovered:", req_url)
            discovered_dirs.append(req_url)

    with open("discovered_dirs.txt","w") as f:
        for directory in discovered_dirs:
            # write each discovered URL, not the last URL that was requested
            print(directory,file=f)

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        dir_scanner(sys.argv[1],sys.argv[2],sys.argv[3])

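Is the example above the correct way to use Pool? Ultimately, I am trying to speed up the requests made to the target.

Update: perhaps not the most elegant solution, but: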

from multiprocessing import Pool
import requests
import sys

# USAGE EXAMPLE: python3 dir_scanner.py <wordlist> <target address> <file extension>

discovered_dirs = []
# read in the wordlist, one candidate name per line
with open(sys.argv[1]) as wordlist_file:
    dir_list = wordlist_file.read().splitlines()

def make_request(dir_item):
    # build a GET request URL from a single wordlist entry
    req_url = f"http://{sys.argv[2]}/{dir_item}.{sys.argv[3]}"
    return req_url, requests.get(req_url)
# map make_request over the wordlist so the pool's worker processes share the requests
with Pool(processes=4) as pool:
    for req_url, req_dir in pool.map(make_request, dir_list):
        # if the request resp is a 404 move on
        if req_dir.status_code == 404:
            pass
        # if not a 404 resp then add it to the list
        else:
            print("Directory discovered:", req_url)
            discovered_dirs.append(req_url)
    # write every discovered URL to a new file (overwrites any existing file)
    with open("discovered_dirs.txt","w") as f:
        for directory in discovered_dirs:
            print(directory,file=f)

【Comments】:

    Tags: python python-requests python-multiprocessing


    【Solution 1】:

    Right now you are creating a pool but never using it. You can use pool.map to distribute the requests across multiple processes:

    ...
    def make_request(dir_item):
        req_url = f"http://{dest_address}/{dir_item}.{file_ext}"
        return req_url, requests.get(req_url)
    
    with Pool(processes=4) as pool:
        for req_url, req_dir in pool.map(make_request, dir_list):
            print(req_url)
            if req_dir.status_code == 404:
                pass
    
            else:
                print("Directory discovered:", req_url)
                discovered_dirs.append(req_url)
    ...
    

    In the example above, the function make_request is executed in the child processes.
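
One way to check this for yourself is sketched below. The worker returns the PID of whichever process ran it, so the output shows where each item was handled. The name `tag_with_pid` and the sample wordlist entries are made up for illustration, and no network requests are made.

```python
import os
from multiprocessing import Pool


def tag_with_pid(dir_item):
    """Return the item together with the PID of the process that handled it."""
    return dir_item, os.getpid()


if __name__ == "__main__":
    # Hypothetical wordlist entries, used only to give the pool some work.
    words = ["admin", "login", "backup", "config"]
    print("parent pid:", os.getpid())
    with Pool(processes=4) as pool:
        # pool.map returns the results in the same order as the input list.
        for item, pid in pool.map(tag_with_pid, words):
            print(f"{item!r} handled by pid {pid}")
```

The PIDs printed next to the items should differ from the parent PID, since the function runs in the pool's workers. With so little work per item, a single worker may end up handling several or all of the entries.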

    The Python documentation gives many examples.
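
Putting the pieces together, a fuller version might look like the sketch below. It is not the original poster's script: the helper names (`build_url`, `probe`, `keep_discovered`), the 5-second timeout, and the choice to treat failed requests as "not found" are all assumptions made for illustration. It differs from the snippets above in three ways: the pool is created under an `if __name__ == "__main__":` guard, which is needed on platforms where worker processes re-import the main module; the worker returns only the status code rather than the whole response object, so less data is sent back to the parent; and `functools.partial` passes the target and extension to the worker instead of reading `sys.argv` inside it.

```python
import sys
from functools import partial
from multiprocessing import Pool


def build_url(dir_item, dest_address, file_ext):
    """Build the candidate URL for one wordlist entry."""
    return f"http://{dest_address}/{dir_item}.{file_ext}"


def probe(dir_item, dest_address, file_ext, timeout=5):
    """Request one candidate URL; return (url, status_code), or (url, None) on error."""
    import requests  # third-party; imported here so only the worker depends on it

    url = build_url(dir_item, dest_address, file_ext)
    try:
        return url, requests.get(url, timeout=timeout).status_code
    except requests.RequestException:
        return url, None


def keep_discovered(results):
    """Keep URLs whose request completed with anything other than a 404."""
    return [url for url, status in results if status is not None and status != 404]


if __name__ == "__main__" and len(sys.argv) == 4:
    wordlist, dest_address, file_ext = sys.argv[1:4]
    with open(wordlist) as f:
        dir_list = f.read().splitlines()

    # Bind the fixed arguments so the pool only has to pass each wordlist entry.
    worker = partial(probe, dest_address=dest_address, file_ext=file_ext)
    with Pool(processes=4) as pool:
        # imap_unordered yields results as they finish, not in input order.
        found = keep_discovered(pool.imap_unordered(worker, dir_list))

    with open("discovered_dirs.txt", "w") as f:
        for url in found:
            print(url, file=f)
```

It would be run the same way as the script in the question, for example `python3 dir_scanner.py <wordlist> <target address> <file extension>`.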

    【Discussion】:
