【Posted】: 2017-07-19 16:51:13
【Question】:
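I am trying to extract data from a website, but the content of the element is hidden. When I use "View Source", the heading text is not shown:
<h4 data-bind="text: Name"></h4>
But when I inspect the element, the text is visible:
<h4 data-bind="text: Name">STM1F-1S-HC</h4>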
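The difference between the two snippets above is consistent with the text being filled in by JavaScript after the page loads (the `data-bind` attribute looks like a Knockout.js binding, though that is an assumption about this site). A minimal sketch, using only the standard library, of what a parser sees in each case; the class and function names here are made up for illustration:

```python
from html.parser import HTMLParser


class H4TextCollector(HTMLParser):
    """Collects the text inside every <h4> that carries a data-bind attribute."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "h4" and "data-bind" in dict(attrs):
            self._inside = True
            self.names.append("")  # start a new, initially empty, entry

    def handle_data(self, data):
        if self._inside:
            self.names[-1] += data

    def handle_endtag(self, tag):
        if tag == "h4":
            self._inside = False


def h4_names(html):
    """Return the text of each data-bound <h4> found in the given HTML string."""
    parser = H4TextCollector()
    parser.feed(html)
    parser.close()
    return parser.names


# What "View Source" (and a plain HTTP fetch) returns: the element is still empty.
served_html = '<h4 data-bind="text: Name"></h4>'
# What the inspector shows once the page's scripts have run.
rendered_html = '<h4 data-bind="text: Name">STM1F-1S-HC</h4>'

print(h4_names(served_html))    # ['']
print(h4_names(rendered_html))  # ['STM1F-1S-HC']
```

If this is what is happening, a scraper that only downloads the HTML will find the `<h4>` but get an empty string from it, no matter which parser it uses.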
The code I used is:
import urllib.request
from bs4 import BeautifulSoup

def getlink(link):
    try:
        f = urllib.request.urlopen(link)
        # No parser is named here, which triggers the UserWarning in the output.
        soup0 = BeautifulSoup(f)
    except Exception as e:
        print(e)
        return  # nothing to parse if the request failed
    for row2 in soup0.findAll("h4", {"data-bind": "text: Name"}):
        Name = row2.text
        print(Name)

# code to find all links to the products for further processing.
# r1 and productcount are not defined in this snippet.
i = 1
for row in r1.findAll('a', {"class": "col-xs-12 col-sm-6"}):
    link = 'https://www.truemfg.com/USA-Foodservice/' + row['href']
    print(link)
    getlink(link)
print(productcount)
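The loop above builds each product URL by string concatenation. As a side note, a small sketch of the same step with `urllib.parse.urljoin`, which also copes with hrefs that start with a slash. The helper name `product_url` is hypothetical, and the second href form is an assumption; the output in the question only shows relative hrefs of the first form:

```python
from urllib.parse import urljoin

BASE_URL = "https://www.truemfg.com/USA-Foodservice/"


def product_url(href):
    """Resolve an href taken from the listing page against the base URL."""
    return urljoin(BASE_URL, href)


# Relative href, as implied by the printed links in the question.
print(product_url("Products/Traditional-Reach-Ins"))
# Root-relative href (assumed form), which plain concatenation would mangle.
print(product_url("/USA-Foodservice/Products/Traditional-Reach-Ins"))
```

Both calls resolve to the same absolute URL, so the loop does not depend on how the site happens to write its links.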
The output is:
https://www.truemfg.com/USA-Foodservice/Products/Traditional-Reach-Ins
C:\Users\Santosh\Anaconda3\lib\site-packages\bs4\__init__.py:181: UserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("lxml"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.
The code that caused this warning is on line 193 of the file C:\Users\Santosh\Anaconda3\lib\runpy.py. To get rid of this warning, change code that looks like this:
BeautifulSoup([your markup])
to this:
BeautifulSoup([your markup], "lxml")
markup_type=markup_type))
https://www.truemfg.com/USA-Foodservice/Products/Specification-Series
https://www.truemfg.com/USA-Foodservice/Products/Food-Prep-Tables
https://www.truemfg.com/USA-Foodservice/Products/Undercounters
https://www.truemfg.com/USA-Foodservice/Products/Worktops
https://www.truemfg.com/USA-Foodservice/Products/Chef-Bases
https://www.truemfg.com/USA-Foodservice/Products/Milk-Coolers
https://www.truemfg.com/USA-Foodservice/Products/Glass-Door-Merchandisers
https://www.truemfg.com/USA-Foodservice/Products/Air-Curtains
https://www.truemfg.com/USA-Foodservice/Products/Display-Cases
https://www.truemfg.com/USA-Foodservice/Products/Underbar-Refrigeration
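The UserWarning in the output above is separate from the missing names: it only asks for the parser to be named explicitly. A minimal sketch, assuming `bs4` is installed and using Python's built-in `html.parser` instead of the `lxml` parser the warning mentions; the markup string is the rendered element from the question, not a live download:

```python
from bs4 import BeautifulSoup

# Rendered markup copied from the question, used here in place of a live page.
markup = '<h4 data-bind="text: Name">STM1F-1S-HC</h4>'

# Naming the parser avoids the warning and keeps behaviour the same across systems.
soup = BeautifulSoup(markup, "html.parser")

names = [h4.get_text() for h4 in soup.find_all("h4", {"data-bind": "text: Name"})]
print(names)  # ['STM1F-1S-HC']
```

Naming the parser removes the warning, but it would not by itself make the names appear if the downloaded HTML contains only the empty `<h4>`.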
As you can see, no names are printed.
Can anyone suggest a way to print the names?
Thanks, Santosh
【Discussion】:
Tags: python html data-binding web-scraping data-extraction