首页 > 编程 > Python > 正文

Python爬虫PyQuery库基本用法入门教程

2020-02-15 22:41:05
字体:
来源:转载
供稿:网友

本文实例讲述了Python爬虫PyQuery库基本用法。分享给大家供大家参考,具体如下:

PyQuery库也是一个非常强大又灵活的网页解析库,如果你有前端开发经验的,都应该接触过jQuery,那么PyQuery就是你非常绝佳的选择,PyQuery 是 Python 仿照 jQuery 的严格实现。语法与 jQuery 几乎完全相同,所以不用再去费心去记一些奇怪的方法了。

官网地址:http://pyquery.readthedocs.io/en/latest/

jQuery参考文档: http://jquery.cuishifeng.cn/

1、字符串的初始化

from pyquery import PyQuery as pqhtml = '''<div>  <ul>     <li class="item-0">first item</li>     <li class="item-1"><a href="link2.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >second item</a></li>     <li class="item-0 active"><a href="link3.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" ><span class="bold">third item</span></a></li>     <li class="item-1 active"><a href="link4.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >fourth item</a></li>     <li class="item-0"><a href="link5.html" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" rel="external nofollow" >fifth item</a></li>   </ul></div>'''doc = pq(html)print(doc)print(type(doc))print(doc('li'))

<div>
    <ul>
         <li class="item-0">first item</li>

发表评论 共有条评论
用户名: 密码:
验证码: 匿名发表