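【Posted】: 2017-05-07 00:52:36
【Question】:
I'm writing a Python program to scrape data from here. I've scraped other sites successfully before, but this one is proving to be a challenge. I'm using Beautiful Soup and mechanize. I need to be able to enter a zip code into the text box so that the page produces the results I want.
Here is the snippet that contains the input text box: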
<div id="ContentPlaceHolder1_C001_pnlFindACenter" onkeypress="javascript:return WebForm_FireDefaultButton(event, 'ContentPlaceHolder1_C001_btnSearchClient')">
<div style="width: 400px; float: left; padding-top: 5px;">
<label for="ContentPlaceHolder1_C001_tbUserAddress" style="font-family: Arial; font-size: 13.3333px; font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; line-height: normal; text-decoration: none; text-transform: none; color: rgb(0, 0, 0); cursor: auto; display: inline-block; position: relative; z-index: 100; margin-right: -121px; left: 2px; top: 0px; opacity: 1;">Address, City or Zip:</label><input name="ctl00$ContentPlaceHolder1$C001$tbUserAddress" type="text" id="ContentPlaceHolder1_C001_tbUserAddress" class="textInField" style="width: 240px; background-image: url("data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABAAAAASCAYAAABSO15qAAAAAXNSR0IArs4c6QAAAPhJREFUOBHlU70KgzAQPlMhEvoQTg6OPoOjT+JWOnRqkUKHgqWP4OQbOPokTk6OTkVULNSLVc62oJmbIdzd95NcuGjX2/3YVI/Ts+t0WLE2ut5xsQ0O+90F6UxFjAI8qNcEGONia08e6MNONYwCS7EQAizLmtGUDEzTBNd1fxsYhjEBnHPQNG3KKTYV34F8ec/zwHEciOMYyrIE3/ehKAqIoggo9inGXKmFXwbyBkmSQJqmUNe15IRhCG3byphitm1/eUzDM4qR0TTNjEixGdAnSi3keS5vSk2UDKqqgizLqB4YzvassiKhGtZ/jDMtLOnHz7TE+yf8BaDZXA509yeBAAAAAElFTkSuQmCC"); background-repeat: no-repeat; background-attachment: scroll; background-size: 16px 18px; background-position: 98% 50%; cursor: auto;" data-hasqtip="21" oldtitle="Address, City or Zip:" title="" autocomplete="off" aria-describedby="qtip-21">
<div id="divDistance" style="display: inline;">
within
<select name="ctl00$ContentPlaceHolder1$C001$ddlRadius" id="ContentPlaceHolder1_C001_ddlRadius">
<option value="5">5</option>
<option value="10">10</option>
<option selected="selected" value="25">25</option>
<option value="50">50</option>
<option value="100">100</option>
</select>
miles
</div>
</div>
<div style="width: 160px; float: left;">
<input type="submit" name="ctl00$ContentPlaceHolder1$C001$btnSearchClient" value="Search" onclick="GeocodeLocation();return false;WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("ctl00$ContentPlaceHolder1$C001$btnSearchClient", "", false, "", "find-a-center", false, false))" id="ContentPlaceHolder1_C001_btnSearchClient" class="btnCenter">
</div>
<div style="clear: both;">
</div>
<div>
<span onchange="" style="font-size:12px;display: inline;" data-hasqtip="22" oldtitle="<b>AASM SleepTM</b> is an innovative telemedicine system that brings your sleep doctor to you. Featuring a secure, web-based video platform, AASM SleepTM allows you to meet with your sleep doctor from a distance. These live video visits will save you time and money. AASM SleepTM also syncs with Fitbit sleep data and has an integrated sleep diary, enabling you and your doctor to monitor your sleep." title="" aria-describedby="qtip-22"><input id="ContentPlaceHolder1_C001_chkSleepTM" type="checkbox" name="ctl00$ContentPlaceHolder1$C001$chkSleepTM"><label for="ContentPlaceHolder1_C001_chkSleepTM">Only show AASM SleepTM capable sleep centers in my state</label></span>
<a href="https://sleeptm.com/" style="font-size: 10px; margin-left: 10px; display: inline;" target="_blank" data-hasqtip="23" oldtitle="<b>AASM SleepTM</b> is an innovative telemedicine system that brings your sleep doctor to you. Featuring a secure, web-based video platform, AASM SleepTM allows you to meet with your sleep doctor from a distance. These live video visits will save you time and money. AASM SleepTM also syncs with Fitbit sleep data and has an integrated sleep diary, enabling you and your doctor to monitor your sleep." title="" aria-describedby="qtip-23">What is AASM SleepTM?</a>
</div>
</div>
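For reference, the controls in this snippet that carry a `name` attribute can be listed programmatically. The following is a minimal sketch, assuming Python 3 and using the standard library's `html.parser` instead of Beautiful Soup so that it runs without extra installs. `FormFieldCollector`, `collect_fields`, and `SAMPLE_HTML` are hypothetical names introduced for illustration, and `SAMPLE_HTML` is an abbreviated copy of the markup above, not the live page.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect the name and default value of every <input> and <select>."""

    def __init__(self):
        super().__init__()
        self.fields = {}          # control name -> default value
        self._select_name = None  # name of the <select> being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value", "")
        elif tag == "select" and attrs.get("name"):
            self._select_name = attrs["name"]
            self.fields[self._select_name] = ""
        elif tag == "option" and self._select_name and "selected" in attrs:
            # The selected <option> supplies the default for its <select>.
            self.fields[self._select_name] = attrs.get("value", "")

    def handle_endtag(self, tag):
        if tag == "select":
            self._select_name = None


# Abbreviated copy of the snippet above (style attributes dropped).
SAMPLE_HTML = """
<input name="ctl00$ContentPlaceHolder1$C001$tbUserAddress" type="text" id="ContentPlaceHolder1_C001_tbUserAddress">
<select name="ctl00$ContentPlaceHolder1$C001$ddlRadius" id="ContentPlaceHolder1_C001_ddlRadius">
<option value="5">5</option>
<option selected="selected" value="25">25</option>
<option value="100">100</option>
</select>
<input type="submit" name="ctl00$ContentPlaceHolder1$C001$btnSearchClient" value="Search">
"""


def collect_fields(html):
    """Return a dict mapping each control's name to its default value."""
    parser = FormFieldCollector()
    parser.feed(html)
    return parser.fields


if __name__ == "__main__":
    for name, value in collect_fields(SAMPLE_HTML).items():
        print(name, "=", repr(value))
```

The point of the sketch is that `ctl00$ContentPlaceHolder1$C001$tbUserAddress` is the `name` of an input, not of a form and not a CSS class. The radius drop-down is a separate control with its own name.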
These are my attempts so far:
url = 'http://www.sleepeducation.org/find-a-facility'
MILES = '100'
CODE = '33060'
Attempt 1
first = urllib2.Request(url,
data=urllib.urlencode({'value': CODE}),
headers={'User-Agent' : 'Google Chrome', 'Cookie': 'name = ctl00$ContentPlaceHolder1$C001$tbUserAddress'})
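In a form POST, the key of each pair is the input's `name` and the value is what the user typed, so the field name belongs in the request body rather than in a `Cookie` header. The following is a minimal sketch of such a body, assuming Python 3 (`urllib.parse` and `urllib.request` in place of `urllib` and `urllib2`). `build_search_request` and its `hidden_fields` parameter are hypothetical. ASP.NET WebForms pages generally also expect hidden fields such as `__VIEWSTATE` to be posted back, and whether this page does is not shown in the snippet. The sketch only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request

URL = "http://www.sleepeducation.org/find-a-facility"
ADDRESS_FIELD = "ctl00$ContentPlaceHolder1$C001$tbUserAddress"
RADIUS_FIELD = "ctl00$ContentPlaceHolder1$C001$ddlRadius"


def build_search_request(code, miles, hidden_fields=None):
    """Build (but do not send) a POST request keyed by the inputs' names.

    hidden_fields is a hypothetical dict for values such as __VIEWSTATE
    that would first have to be read from the page itself.
    """
    params = dict(hidden_fields or {})
    params[ADDRESS_FIELD] = code
    params[RADIUS_FIELD] = miles
    body = urlencode(params).encode("ascii")  # "$" is percent-encoded as %24
    return Request(URL, data=body, headers={"User-Agent": "Mozilla/5.0"})


if __name__ == "__main__":
    request = build_search_request("33060", "100")
    print(request.get_method())
    print(request.data.decode("ascii"))
```

Note that the Search button's `onclick` in the snippet starts with `GeocodeLocation();return false;`, which suggests the search is driven by JavaScript in the browser. If so, a plain POST like this may not return the results at all, and a browser-driving approach would be needed.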
Attempt 2
post_params = {
'ctl00$ContentPlaceHolder1$C001$tbUserAddress': CODE
}
first = urllib.urlencode(post_params)
driver = webdriver.Chrome()
driver.get(url)
sbox = driver.find_element_by_class_name("ctl00$ContentPlaceHolder1$C001$tbUserAddress")
sbox.send_keys(CODE)
driver.find_element_by_class_name("ctl00$ContentPlaceHolder1$C001$btnSearchClient").click()
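The Selenium calls above pass the inputs' `name` values to a class-name locator, while the snippet shows that the text box's class is `textInField` and the button's is `btnCenter`. The following is a minimal sketch that locates both elements by `id` instead. `search_by_zip`, `FakeDriver`, and `FakeElement` are hypothetical names. The fakes exist only so the sketch runs without a browser, and a real run would pass a Selenium driver after `driver.get(url)`. The string `"id"` is the value of Selenium's `By.ID` constant. Selenium 4 removed the `find_element_by_*` helpers in favour of `find_element(by, value)`.

```python
ADDRESS_ID = "ContentPlaceHolder1_C001_tbUserAddress"
SEARCH_ID = "ContentPlaceHolder1_C001_btnSearchClient"


def search_by_zip(driver, code):
    """Type `code` into the address box and click Search.

    `driver` is any object with Selenium's find_element(by, value) method.
    """
    box = driver.find_element("id", ADDRESS_ID)
    box.clear()
    box.send_keys(code)
    driver.find_element("id", SEARCH_ID).click()


class FakeElement:
    """Stand-in for a Selenium WebElement; records what was done to it."""

    def __init__(self):
        self.typed = ""
        self.clicked = False

    def clear(self):
        self.typed = ""

    def send_keys(self, text):
        self.typed += text

    def click(self):
        self.clicked = True


class FakeDriver:
    """Stand-in for a Selenium driver; hands out one element per locator."""

    def __init__(self):
        self.elements = {}

    def find_element(self, by, value):
        return self.elements.setdefault((by, value), FakeElement())


if __name__ == "__main__":
    driver = FakeDriver()
    search_by_zip(driver, "33060")
    print(driver.elements[("id", ADDRESS_ID)].typed)
```

With a real browser, the results are presumably rendered after the click, so the page source would need to be read only once they have loaded.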
Attempt 3
br = mechanize.Browser()
br.open(url)
br.select_form(name='ctl00$ContentPlaceHolder1$C001$tbUserAddress')
br['value'] = CODE
br.submit()
http = urllib2.urlopen(br.response())
soup = BeautifulSoup(http, "html5lib")
Error = "no form matching name 'ctl00$ContentPlaceHolder1$C001$tbUserAddress'"
Attempt 4
soup.find('input', {'name': 'ctl00$ContentPlaceHolder1$C001$tbUserAddress'})['value'] = CODE
soup.find('input', {'name': 'ctl00$ContentPlaceHolder1$C001$btnSearchClient'}).click()
【Comments】:
Tags: python html beautifulsoup mechanize