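【Posted】: 2020-09-13 08:03:17
【Question】:
Strange! I am using Beautiful Soup 4 and Pandas with Python 3.8, and I am trying to scrape data from a web page using requests. My code prints out all the properties on the first page, but it does not write them all to the CSV file; only the last item is written. What am I missing here? The code is below.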
listofProperties = []
for property in properties:
    pdFrame = {}
    pdFrame["House Price"] = property.find("a", class_="text-price").text.replace("\n", "").replace("Offers over", "").replace(" ", "")
    pdFrame["House Address"] = property.find("a", class_="listing-results-address").text
    pdFrame["Number of Beds"] = property.find("span", class_="num-beds")["title"]
    pdFrame["Number of Baths"] = property.find("span", class_="num-baths")["title"]
    pdFrame["Number of Reception Rooms"] = property.find("span", class_="num-reception")["title"]
    pdFrame["Sold By"] = property.find("p", class_="listing-results-marketed").find("span").text
    propertyArea = random.randint(300, 2200)
    propertyArea = str(propertyArea)
    pdFrame["House Size"] = propertyArea + " sq. ft"
    pdFrame["Agent Phone"] = property.find("span", class_="agent_phone").text.replace(" **", "").replace("\n", "")
listofProperties.append(pdFrame)
print(len(listofProperties))
print(listofProperties)
df = pandas.DataFrame(listofProperties)
df.to_csv("scrapedProperties.csv", index=False)
Any help is appreciated. Thanks!
【Comments】:
-
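pdFrame = {} should be outside the loop. You are initializing it to empty on every iteration and ending up with only the last item's data. Or you can put the append inside the loop.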
-
Move listofProperties.append(pdFrame) into the for loop.
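A minimal sketch of the fix the comments point to, assuming the append was sitting outside the for loop. The `listings` data and the `collect_rows` helper are hypothetical stand-ins for the scraped BeautifulSoup tags, and the standard-library `csv` module is used here in place of pandas so the example runs without extra dependencies. The dict is still created inside the loop, so that each row is a separate object rather than one shared dict appended repeatedly.

```python
import csv
import io

# Hypothetical stand-in for the scraped listings; in the question these
# values come from property.find(...) calls on BeautifulSoup tags.
listings = [
    {"price": "£250,000", "address": "1 Example Road"},
    {"price": "£310,000", "address": "2 Example Road"},
    {"price": "£199,950", "address": "3 Example Road"},
]


def collect_rows(listings):
    """Build one dict per listing and append it inside the loop."""
    rows = []
    for listing in listings:
        row = {}  # fresh dict for this listing
        row["House Price"] = listing["price"]
        row["House Address"] = listing["address"]
        rows.append(row)  # indented: runs once per listing, not once in total
    return rows


rows = collect_rows(listings)

# Write the rows to an in-memory CSV to confirm every listing is present.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["House Price", "House Address"])
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

print(len(rows))
print(csv_text)
```

With pandas, passing the same list to pandas.DataFrame(rows) and calling to_csv("scrapedProperties.csv", index=False) should likewise produce one CSV row per listing once the append is inside the loop.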
Tags: python python-3.x pandas dataframe beautifulsoup