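What is urllib?
urllib is Python's built-in HTTP request library. It consists of four modules:
urllib.request: the request module
urllib.error: the exception-handling module
urllib.parse: the URL-parsing module
urllib.robotparser: the robots.txt-parsing module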
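As a rough illustration of what the helper modules do, the sketch below touches three of them without making any network request. The query fields and the robots.txt rules are hypothetical values chosen for the example, and example.com is a placeholder host.

```python
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode, urlparse
from urllib.robotparser import RobotFileParser

# urllib.parse: split a URL into components and build a query string.
parts = urlparse('http://www.cnblogs.com/0bug/?page=2')
query = urlencode({'page': 2, 'keyword': 'urllib'})  # hypothetical fields

# urllib.robotparser: evaluate robots.txt rules.
# The rules here are hypothetical and parsed offline instead of downloaded.
robots = RobotFileParser()
robots.parse(['User-agent: *', 'Disallow: /private/'])
can_fetch_public = robots.can_fetch('*', 'http://example.com/index.html')
can_fetch_private = robots.can_fetch('*', 'http://example.com/private/page.html')

# urllib.error: HTTPError is a subclass of URLError,
# so an except clause for HTTPError should come before one for URLError.
http_error_is_url_error = issubclass(HTTPError, URLError)

print(parts.netloc, parts.path, parts.query)
print(query)
print(can_fetch_public, can_fetch_private, http_error_is_url_error)
```

In real use, RobotFileParser would normally be pointed at a site's robots.txt with set_url() and read() rather than fed literal lines.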
Changes from Python 2
Python 2's urllib2 was folded into the urllib package in Python 3; urlopen() now lives in urllib.request.
Python 2
import urllib2
response = urllib2.urlopen('http://www.cnblogs.com/0bug')
Python 3
import urllib.request
response = urllib.request.urlopen('http://www.cnblogs.com/0bug/')
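If a script has to run under both interpreters, one common pattern is to try the Python 3 import first and fall back to the Python 2 name. This is only a minimal sketch of that pattern, not something the original post prescribes.

```python
try:
    # Python 3: urlopen lives in urllib.request
    from urllib.request import urlopen
except ImportError:
    # Python 2 fallback: urlopen lives in urllib2
    from urllib2 import urlopen

# Shows which module the name was actually imported from.
print(urlopen.__module__)
```

After this block the rest of the script can call urlopen() without caring which version is running.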
urlopen()
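Without the data argument, the request is sent as GET; with data, it is sent as POST.
import urllib.request
# No data argument, so this is a GET request
response = urllib.request.urlopen('http://www.cnblogs.com/0bug')
# read() returns bytes; decode them into a str
html = response.read().decode('utf-8')
print(html)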
Output (truncated; the full HTML source of the page is printed):
<!DOCTYPE html>
<html lang="zh-cn">
<head>
<meta charset="utf-8"/>
<title>0bug - 博客园</title>
...
</head>
<body>
...
</body>
</html>
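The GET/POST rule for urlopen() can be checked without sending anything by building Request objects and asking which method they would use. This is a minimal sketch: the form field name word and its value are hypothetical, and nothing is actually posted to the blog.

```python
import urllib.parse
import urllib.request

url = 'http://www.cnblogs.com/0bug'

# No data: the request would be sent as GET.
get_request = urllib.request.Request(url)

# data must be bytes, so URL-encode the form fields and then encode to bytes.
# The field name 'word' and its value are hypothetical.
form = urllib.parse.urlencode({'word': 'hello'}).encode('utf-8')

# With data: the request would be sent as POST.
post_request = urllib.request.Request(url, data=form)

print(get_request.get_method())
print(post_request.get_method())
```

Passing either Request object to urllib.request.urlopen(), or calling urlopen(url, data=form) directly, would send the request with the method shown.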