Note
The urllib2 module has been split across several modules in Python 3 named urllib.request and urllib.error. The 2to3 tool will automatically adapt imports when converting your sources to Python 3.
The urllib2 module defines functions and classes which help in opening URLs (mostly HTTP) in a complex world — basic and digest authentication, redirections, cookies and more.
The urllib2 module defines the following functions:
urllib2.urlopen(url[, data[, timeout[, cafile[, capath[, cadefault[, context]]]]]])
Open the URL url, which can be either a string or a Request object.
data may be a string specifying additional data to send to the server, or None if no such data is needed. Currently HTTP requests are the only ones that use data; the HTTP request will be a POST instead of a GET when the data parameter is provided. data should be a buffer in the standard application/x-www-form-urlencoded format. The urllib.urlencode() function takes a mapping or sequence of 2-tuples and returns a string in this format. The urllib2 module sends HTTP/1.1 requests with a Connection: close header included.
The optional timeout parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting will be used). This only works for HTTP, HTTPS and FTP connections.
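As a minimal sketch of the data parameter described above, the following builds a form-encoded body and wraps it in a Request without sending anything. The helper name build_post_request and the example URL are hypothetical, and the import fallback to urllib.request and urllib.parse is an assumption added only so the snippet also runs on Python 3.

```python
# Sketch only: build a POST request body without touching the network.
try:
    import urllib2                      # Python 2
    from urllib import urlencode
except ImportError:                     # assumption: fall back to the Python 3 names
    import urllib.request as urllib2
    from urllib.parse import urlencode


def build_post_request(url, fields):
    """Return a Request whose body is `fields` in form-urlencoded format.

    `fields` is a mapping or a sequence of 2-tuples, as urlencode() expects.
    (Hypothetical helper, not part of urllib2.)
    """
    body = urlencode(fields)
    if not isinstance(body, bytes):
        # Python 3 returns text; the request body must be bytes there.
        body = body.encode("ascii")
    # Supplying data is what turns the request into a POST.
    return urllib2.Request(url, body)


request = build_post_request("http://example.com/form",
                             [("name", "spam"), ("count", "2")])
```

Passing the resulting object to urllib2.urlopen(request, timeout=10) would send it; the timeout value here is only an illustration.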
If context is specified, it must be a ssl.SSLContext instance describing the various SSL options. See HTTPSConnection for more details.
The optional cafile and capath parameters specify a set of trusted CA certificates for HTTPS requests. cafile should point to a single file containing a bundle of CA certificates, whereas capath should point to a directory of hashed certificate files. More information can be found in ssl.SSLContext.load_verify_locations().
The cadefault parameter is ignored.
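One way to obtain a suitable context is ssl.create_default_context(), which accepts the same cafile and capath arguments. The sketch below assumes an interpreter where that function is available (Python 2.7.9 or later); the helper name make_https_context is hypothetical.

```python
# Sketch only: create an SSLContext to pass as urlopen's context argument.
import ssl


def make_https_context(cafile=None, capath=None):
    """Return a certificate-verifying SSLContext.

    With both arguments left as None, the system's default CA
    certificates are loaded. (Hypothetical helper.)
    """
    return ssl.create_default_context(cafile=cafile, capath=capath)


context = make_https_context()
# It would then be used as:
#   urllib2.urlopen("https://example.com/", context=context)
```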
This function returns a file-like object with three additional methods:
- geturl(): return the URL of the resource retrieved, commonly used to determine if a redirect was followed
- info(): return the meta-information of the page, such as headers, in the form of a mimetools.Message instance (see Quick Reference to HTTP Headers)
- getcode(): return the HTTP status code of the response.
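The three methods listed above can be combined into a small summary of a response. This is an illustrative sketch: summarize_response is a hypothetical helper, and comparing the final URL with the requested one is only a rough heuristic for detecting a redirect.

```python
# Sketch only: collect the extra information urlopen's return value exposes.
def summarize_response(response, requested_url):
    """Summarize a response object returned by urllib2.urlopen().

    (Hypothetical helper.) A differing final URL usually, but not
    always, means a redirect was followed.
    """
    final_url = response.geturl()
    return {
        "final_url": final_url,
        "redirected": final_url != requested_url,
        "status": response.getcode(),
        "content_type": response.info().get("Content-Type"),
    }

# Typical use, assuming network access:
#   response = urllib2.urlopen("http://example.com/")
#   print(summarize_response(response, "http://example.com/"))
```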
Raises URLError on errors.
Note that None may be returned if no handler handles the request (though the default installed global OpenerDirector uses UnknownHandler to ensure this never happens).
In addition, if proxy settings are detected (for example, when a *_proxy environment variable like was added.
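The URLError behaviour can be sketched as follows. The helper name fetch_or_reason is hypothetical, and the import fallback to urllib.request is an assumption so the snippet also runs on Python 3. A URL with an unrecognised scheme is used because the default opener's UnknownHandler rejects it with URLError without any network access.

```python
# Sketch only: turn URLError into a (response, reason) pair.
try:
    import urllib2                      # Python 2
except ImportError:                     # assumption: fall back to the Python 3 name
    import urllib.request as urllib2


def fetch_or_reason(url, timeout=10):
    """Return (response, None) on success or (None, reason) on URLError.

    (Hypothetical helper.) HTTPError is a subclass of URLError,
    so HTTP error statuses are reported through the same path.
    """
    try:
        response = urllib2.urlopen(url, timeout=timeout)
    except urllib2.URLError as exc:
        return None, str(exc.reason)
    return response, None


# No handler knows this scheme, so UnknownHandler raises URLError.
response, reason = fetch_or_reason("nosuchscheme://example")
```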