【Question title】: Different behaviour between double-quoted and single-quoted strings when using urllib in Python
【Posted】: 2015-02-28 07:20:35
【Question】:

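I'm new to Python, and I learned that there is no difference between single-quoted and double-quoted strings. But I found some behaviour that differs.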

from bs4 import BeautifulSoup
import urllib.request

url1 = "http://www.backpackers.com.tw/forum/forumdisplay.php?f=310"
url2 = 'http://www.backpackers.com.tw/forum/forumdisplay.php?f=310'

If I run:

response = urllib.request.urlopen(url1)

Result: the script finishes with no errors.

If I run:

response = urllib.request.urlopen(url2)

Result: an error

C:\Users\user1\Desktop\scrape>python backpacker_tw.py
Traceback (most recent call last):
  File "C:\Python34\lib\urllib\request.py", line 1189, in do_open
    h.request(req.get_method(), req.selector, req.data, headers)
  File "C:\Python34\lib\http\client.py", line 1090, in request
    self._send_request(method, url, body, headers)
  File "C:\Python34\lib\http\client.py", line 1128, in _send_request
    self.endheaders(body)
  File "C:\Python34\lib\http\client.py", line 1086, in endheaders
    self._send_output(message_body)
  File "C:\Python34\lib\http\client.py", line 924, in _send_output
    self.send(msg)
  File "C:\Python34\lib\http\client.py", line 859, in send
    self.connect()
  File "C:\Python34\lib\http\client.py", line 836, in connect
    self.timeout, self.source_address)
  File "C:\Python34\lib\socket.py", line 509, in create_connection
    raise err
  File "C:\Python34\lib\socket.py", line 500, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "backpacker_tw.py", line 7, in <module>
    response = urllib.request.urlopen(url2)
  File "C:\Python34\lib\urllib\request.py", line 153, in urlopen
    return opener.open(url, data, timeout)
  File "C:\Python34\lib\urllib\request.py", line 455, in open
    response = self._open(req, data)
  File "C:\Python34\lib\urllib\request.py", line 473, in _open
    '_open', req)
  File "C:\Python34\lib\urllib\request.py", line 433, in _call_chain
    result = func(*args)
  File "C:\Python34\lib\urllib\request.py", line 1215, in http_open
    return self.do_open(http.client.HTTPConnection, req)
  File "C:\Python34\lib\urllib\request.py", line 1192, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10061] No connection could be made because the target machine actively refused it>

Is this a bug, or am I missing something?

C:\Users\user1\Desktop\scrape>python -V
Python 3.4.1

【Comments】:

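  • Both work for me. Did you retry? It may simply have failed the second time because of a connection or server glitch.
  • If I believe en.wikipedia.org/wiki/Percent-encoding, the apostrophe is a valid URI character, but the latest RFC says it is a reserved character and should be encoded as %27 (when encoded that way, it actually works in my browser).
  • @MohitBhasi: Sure, an apostrophe in a URL should be percent-encoded, but there is no apostrophe in that URL. FWIW, when you pass a string literal to a function, the function receives only the characters of the string, not the delimiters, regardless of whether you use ' or " or triple quotes, or whether the string was created dynamically by some str method, etc.
  • Maybe the site doesn't like being accessed by a script. Try changing the User-Agent.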
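
Two of the comments above can be checked without touching the network. The sketch below compares the two literals from the question and builds (but does not send) a request carrying a custom User-Agent. The header value is a made-up placeholder, not something the site is known to require.

```python
import urllib.request

url1 = "http://www.backpackers.com.tw/forum/forumdisplay.php?f=310"
url2 = 'http://www.backpackers.com.tw/forum/forumdisplay.php?f=310'

# The quote characters only delimit the literal in source code;
# both literals produce the same str value.
print(url1 == url2)  # True

# Build a Request object with a browser-like User-Agent.
# Nothing is sent until the request is passed to urlopen().
# "Mozilla/5.0 (example)" is a hypothetical value for illustration.
req = urllib.request.Request(url2, headers={"User-Agent": "Mozilla/5.0 (example)"})

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Because `urlopen` receives the same string in both cases, any difference between the two runs has to come from outside the string itself, such as the network or the server.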

Tags: python beautifulsoup urllib python-3.4


【Solution 1】:

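To the docs! PEP 8, the be-all and end-all of Python code formatting, states: "In Python, single-quoted strings and double-quoted strings are the same." That was written by Python's creator, and I take his word for it.

Looking at your stack trace, I see the error "No connection could be made because the target machine actively refused it", so perhaps the server was having problems at that moment?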
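
If the failure was a transient server or connection problem, as suggested above, retrying the call is one way to confirm it. Here is a minimal sketch, assuming a hypothetical helper named `fetch_with_retry`; its `attempts`, `delay`, and `opener` parameters are illustration choices, not part of urllib.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, attempts=3, delay=1.0, opener=urllib.request.urlopen):
    """Call opener(url), retrying when it raises URLError.

    Hypothetical helper for illustration; not part of the standard library.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return opener(url)
        except urllib.error.URLError as err:
            # A refused connection surfaces here, wrapped in URLError.
            last_error = err
            print(f"attempt {attempt} failed: {err.reason}")
            if attempt < attempts:
                time.sleep(delay)  # wait before trying again
    # Every attempt failed: re-raise the last error seen.
    raise last_error
```

Calling `fetch_with_retry(url2)` with the URL from the question would try up to three times. If a later attempt succeeds, the quoting style was never the cause.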

【Discussion】:
