【Question Title】: Python Webscrape (using BeautifulSoup) question
【Posted】: 2022-01-05 05:02:47
【Question Description】:

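I am trying to scrape the website https://www.edgeprop.sg/condo-apartment/aquarius-by-the-park to get the land size (sqm) from the overview table. The result should be 40,608.

However, I cannot get the result I want. Here is my code: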

# [Python] test web scrape on edgeprop
import requests
from bs4 import BeautifulSoup

query_string = 'https://www.edgeprop.sg/condo-apartment/aquarius-by-the-park'
resp = requests.get(query_string)
soup = BeautifulSoup(resp.content, 'html.parser')
print("URL is: ", query_string)

# Look for the overview-table entries by tag and class.
# find_all() returns a (possibly empty) list and does not raise IndexError,
# so no try/except is needed around it.
landsize = soup.find_all("h4", class_="detail-title__text")
print("Landsize is: ", landsize)

【Question Discussion】:

    Tags: python web-scraping beautifulsoup python-requests


    【Solution 1】:

    Try this:

    import json
    import requests
    from bs4 import BeautifulSoup
    
    query_string='https://www.edgeprop.sg/condo-apartment/aquarius-by-the-park'  
    
    resp = requests.get(query_string) 
      
    soup = BeautifulSoup(resp.content,'html.parser')
    
    # get data with all info
    data = soup.find("script", id="__NEXT_DATA__").text
    
    # convert string to python dict
    json_data = json.loads(data)
    
    # get land_size from dict
    print(json_data["props"]["pageProps"]["projectInfo"]["data"]["land_size"])
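
The chained lookup on the last line raises a KeyError as soon as one key is missing, for example if the site changes its page structure. A minimal sketch of a more defensive lookup follows. The helper name `get_nested` and the `sample` payload are made up for illustration; only the key path comes from the answer above, and the real payload is much larger.

```python
from typing import Any, Iterable


def get_nested(data: Any, keys: Iterable[str], default: Any = None) -> Any:
    """Follow `keys` through nested dicts; return `default` if any step is missing."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical, trimmed-down payload that mirrors the key path used in the answer.
sample = {"props": {"pageProps": {"projectInfo": {"data": {"land_size": 40608}}}}}

LAND_SIZE_PATH = ["props", "pageProps", "projectInfo", "data", "land_size"]

land_size = get_nested(sample, LAND_SIZE_PATH)
missing = get_nested(sample, ["props", "pageProps", "noSuchKey", "land_size"])
print(land_size, missing)
```

In the answer's script, `get_nested(json_data, LAND_SIZE_PATH)` could stand in for the chained indexing, so a missing key yields `None` instead of an exception.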
    

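    【Discussion】:

    • Thank you so much! Honestly, I don't understand how this works yet, but I'll try to figure it out. Thanks again!
    • @Chrislee If you want a better understanding of what is going on, try saving the response to a file and looking inside it.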
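
One way to follow that suggestion, sketched under assumptions: pretty-print the parsed JSON to a file for reading, and list the key paths that mention a search term. The helper names (`save_pretty`, `iter_key_paths`, `find_paths`), the file name, and the `sample` payload are hypothetical; in practice `json_data` from the answer would take the place of `sample`.

```python
import json
from pathlib import Path
from typing import Any, Iterator, List, Tuple


def save_pretty(data: Any, path: Path) -> None:
    """Write `data` as indented JSON so it is easy to read in an editor."""
    path.write_text(json.dumps(data, indent=2, ensure_ascii=False), encoding="utf-8")


def iter_key_paths(obj: Any, prefix: Tuple = ()) -> Iterator[Tuple]:
    """Yield the path (dict keys and list indexes) to every leaf value."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from iter_key_paths(value, prefix + (key,))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            yield from iter_key_paths(value, prefix + (index,))
    else:
        yield prefix


def find_paths(obj: Any, needle: str) -> List[Tuple]:
    """Return every leaf path in which some component contains `needle`."""
    return [
        path
        for path in iter_key_paths(obj)
        if any(needle in str(part) for part in path)
    ]


# Hypothetical, trimmed-down payload; the real page data is much larger.
sample = {"props": {"pageProps": {"projectInfo": {"data": {"land_size": 40608}}}}}

# save_pretty(sample, Path("next_data.json"))  # hypothetical file name
for path in find_paths(sample, "land"):
    print(" -> ".join(str(part) for part in path))
```

Searching for a fragment such as "land" is how a key path like the one in the answer can be discovered without reading the whole file by hand.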