【Question title】: Loading web scraping results into Pandas DataFrame
【Posted】: 2018-03-17 01:03:01
【Question】:

I have the following code:

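import urllib.request
import bs4 as bs

# Download the listing page and parse the HTML
sauce = urllib.request.urlopen('https://www.iproperty.com.my/sale/selangor/all-commercial/?q=UOA%20Business%20Park').read()
soup = bs.BeautifulSoup(sauce,'html.parser')

# Each find_all call returns a list of matching tags
price = soup.find_all('ul',class_='listing-primary-price jMWEse')

BUA = soup.find_all('li',class_='attributes-price-per-unit-size-item builtUp-attr fsbnan')


for data in price:
    Price =  data.text
    print(Price)

# Use a separate name inside the loop so the BUA result list is not overwritten
for data in BUA:
    built_up =  data.text
    print(built_up)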

Printing Price and BUA gives me the following results:

Price:
RM 1,067,490
RM 2,246,160
RM 929,160
RM 1,321,000
RM 103,840,000

BUA:
Built-up : 1,227 sq. ft.Built-up : 1,227 sq. ft.
Built-up : 2,292 sq. ft.Built-up : 2,292 sq. ft.
Built-up : 1,044 sq. ft.Built-up : 1,044 sq. ft.
Built-up : 1,335 sq. ft.Built-up : 1,335 sq. ft.
Built-up : 118,000 sq. ft.Built-up : 118,000 sq. ft.

My question is: how can I load Price and BUA into a Pandas DataFrame? I want to join them and print the final result, for example:

    Price:              BUA:        
0   RM 1,067,490        Built-up : 1,227 sq. ft.Built-up : 1,227 sq. ft.
1   RM 2,246,160        Built-up : 2,292 sq. ft.Built-up : 2,292 sq. ft.
2   RM 929,160          Built-up : 1,044 sq. ft.Built-up : 1,044 sq. ft.
3   RM 1,321,000        Built-up : 1,335 sq. ft.Built-up : 1,335 sq. ft.
4   RM 103,840,000      Built-up : 118,000 sq. ft.Built-up : 118,000 sq. ft.

Another reason I want them in a Pandas DataFrame is that I need to do some calculations in Excel later.

【Comments】:

    Tags: python pandas dataframe beautifulsoup


    【Solution 1】:

    I believe you need:

    import pandas as pd

    # Collect the text of every matched tag into plain lists
    a = [data.text for data in price]
    b = [data.text for data in BUA]
    
    df = pd.DataFrame({'price':a, 'BUA':b}, columns=['price','BUA'])
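Since the question mentions doing calculations later, here is a minimal sketch of how the scraped strings might be turned into numbers once they are in the DataFrame. The sample strings are copied from the output shown in the question; the column names `price_rm`, `bua_sqft` and `rm_per_sqft` are made up for illustration, and the regular expressions assume the text always looks like the samples.

```python
import pandas as pd

# Sample strings copied from the printed output in the question.
prices = ["RM 1,067,490", "RM 2,246,160", "RM 929,160",
          "RM 1,321,000", "RM 103,840,000"]
built_up = [
    "Built-up : 1,227 sq. ft.Built-up : 1,227 sq. ft.",
    "Built-up : 2,292 sq. ft.Built-up : 2,292 sq. ft.",
    "Built-up : 1,044 sq. ft.Built-up : 1,044 sq. ft.",
    "Built-up : 1,335 sq. ft.Built-up : 1,335 sq. ft.",
    "Built-up : 118,000 sq. ft.Built-up : 118,000 sq. ft.",
]

df = pd.DataFrame({"price": prices, "BUA": built_up}, columns=["price", "BUA"])

# Drop every non-digit character ("RM", spaces, commas), then convert to int.
df["price_rm"] = df["price"].str.replace(r"\D", "", regex=True).astype(int)

# Take the first number that precedes "sq" (the text is duplicated in the
# scraped string, so only the first occurrence is used) and remove commas.
df["bua_sqft"] = (
    df["BUA"]
    .str.extract(r"([\d,]+)\s*sq", expand=False)
    .str.replace(",", "", regex=False)
    .astype(int)
)

# Example calculation: price per square foot (hypothetical column name).
df["rm_per_sqft"] = df["price_rm"] / df["bua_sqft"]
print(df[["price_rm", "bua_sqft", "rm_per_sqft"]])
```

From here the frame can be written out with `df.to_excel(...)`, which needs an Excel writer package such as openpyxl to be installed.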
    

    【Comments】:

    • Works great!! Thanks!
    【Solution 2】:
      # Start from an empty frame and add one column per result list
      df = pd.DataFrame()
      df['price'] = [data.text for data in price]
      df['bua'] = [data.text for data in BUA]
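Both solutions assume the two `find_all` results have the same length. If a listing lacks a built-up value, the lists differ in length and the DataFrame construction fails. The sketch below, using shortened made-up lists, shows the failure and one possible workaround: wrapping each list in a `pd.Series` so the shorter column is padded with `NaN`.

```python
import pandas as pd

# Hypothetical case: three prices but only two built-up values were scraped.
a = ["RM 1,067,490", "RM 2,246,160", "RM 929,160"]
b = ["Built-up : 1,227 sq. ft.", "Built-up : 2,292 sq. ft."]

# Plain lists of unequal length are rejected by the DataFrame constructor.
try:
    pd.DataFrame({"price": a, "BUA": b})
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc
print("unequal lists:", mismatch_error)

# Series are aligned on their index, so the shorter one is padded with NaN.
df = pd.DataFrame({"price": pd.Series(a), "BUA": pd.Series(b)})
print(df)
```

Note that padding only aligns by position: if the missing value belongs to a listing in the middle, the remaining rows end up paired with the wrong price. Extracting both fields from each listing's container element would avoid that, but it depends on the page's HTML structure, which is not shown in the question.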
    

    【Comments】:
