【Posted】: 2015-08-10 14:53:16
【Question】:
The JavaScript file looks like this:
states_arr['Chittoor']= new Array( "Kurnool (Abbas Nagar)# 9247001529 # H. No. 80-11/111, ; Beside ICICI Bank ATM, ; Near Krishna Nagar Railway Gate, ; Abbas Nagar, Kurnool.","Kurnool # 9247001530 # H. No. 46/694, Near Annapurna Hotel, Opp. Govt Hospital, Budawarpet, Kurnool. " );
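I want to extract the address from every array in the js file, that is, the text that starts after the second "#" sign: "H. No. 80-11/111, ; Beside ICICI Bank ATM, ; Near Krishna Nagar Railway Gate, ; Abbas Nagar, Kurnool." and "H. No. 46/694, Near Annapurna Hotel, Opp. Govt Hospital, Budawarpet, Kurnool."
The complete JavaScript file is available at: http://www.heteropharmacy.com/jScript/myScript.js
I am using BeautifulSoup, and this is my non-working code: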
soup = BeautifulSoup(html_doc)
script = soup.find_all("script")
pattern = re.compile(r" (?<=[0-9]\s#\s).+")
while pattern.search(script):
    line1 = pattern.search(script)
    print line1
The file needs to be converted to JSON format.
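One possible approach, offered as a minimal sketch rather than a definitive solution: treat the .js file as plain text and pull the entries out with regular expressions, then serialize with `json`. It assumes every assignment has the shape shown above (`states_arr['KEY']= new Array( "...", "..." );`), that no entry contains a double quote, and that each entry has the form name # phone # address. The helper name `extract_addresses` and the variable `js_source` are hypothetical; here `js_source` holds only the sample line from the question, and in practice it would hold the downloaded text of myScript.js.

```python
import json
import re

# Sample line copied from the question; the real input would be the full text of myScript.js.
js_source = '''states_arr['Chittoor']= new Array( "Kurnool (Abbas Nagar)# 9247001529 # H. No. 80-11/111, ; Beside ICICI Bank ATM, ; Near Krishna Nagar Railway Gate, ; Abbas Nagar, Kurnool.","Kurnool # 9247001530 # H. No. 46/694, Near Annapurna Hotel, Opp. Govt Hospital, Budawarpet, Kurnool. " );'''

# One match per assignment: the key in single quotes, then the Array(...) body.
ASSIGNMENT = re.compile(r"states_arr\['([^']+)'\]\s*=\s*new Array\((.*?)\);", re.S)
# Each double-quoted entry inside the Array body (assumes no embedded double quotes).
ENTRY = re.compile(r'"([^"]*)"')


def extract_addresses(source):
    """Map each states_arr key to the list of addresses found after the second '#'."""
    result = {}
    for key, body in ASSIGNMENT.findall(source):
        addresses = []
        for entry in ENTRY.findall(body):
            parts = entry.split("#", 2)  # name, phone, address
            if len(parts) == 3:
                addresses.append(parts[2].strip())
        result[key] = addresses
    return result


addresses = extract_addresses(js_source)
print(json.dumps(addresses, indent=2))
```

Splitting on "#" with a limit of 2 keeps any later "#" characters inside the address intact, which a lookbehind such as `(?<=[0-9]\s#\s)` would not guarantee. Entries with fewer than two "#" signs are skipped silently; whether that is the right behaviour depends on the real data.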
【Discussion】:
- Did the answer below help?
Tags: javascript python json beautifulsoup