宙飒天下
1. URL analysis tool (purl)
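A URL is the address of a specific piece of content on the web. It can be broken down into roughly the following parts:
- scheme: the protocol, such as http or https
- host: the host name, such as www.baidu.com
- path: the location of the content on that host
- query_params: the values passed to that path as parameters in the URL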
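To make the four parts concrete, here is a minimal sketch that splits a URL using only Python's standard library `urllib.parse`, not purl. The helper name `split_url` and the shortened example URL are assumptions made for illustration; they do not come from purl or from the original post.

```python
from urllib.parse import urlsplit, parse_qs


def split_url(url):
    """Return the scheme, host, path, and query parameters of a URL."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "path": parts.path,
        # parse_qs maps each key to a list of percent-decoded values
        "query_params": parse_qs(parts.query),
    }


# Hypothetical, shortened version of a Baidu search URL
parts = split_url("https://www.baidu.com/s?ie=utf-8&wd=newbee")
print(parts["scheme"])        # https
print(parts["host"])          # www.baidu.com
print(parts["path"])          # /s
print(parts["query_params"])  # {'ie': ['utf-8'], 'wd': ['newbee']}
```

Each query parameter maps to a list because the same key may appear more than once in a URL.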
 
purl is a simple, easy-to-use tool for taking URLs apart. With it you can conveniently read each part of a URL.
Installation:

    pip install purl

Usage:

    >>> from purl import URL
    >>> url = "https://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=0&rsv_idx=1&tn=baidu&wd=newbee&rsv_pq=c87beda3000c6192&rsv_t=69e1zt6mVNzkBbdJDmwMn6Q%2FanmE7t9awTdR05ddZ2dL0Sxnf9BuLA3TB3g&rsv_enter=1&rsv_sug3=8&rsv_sug1=10&rsv_sug7=100"
    >>> from_str = URL(url)
    >>> from_str.scheme()
    'https'
    >>> from_str.host()
    'www.baidu.com'
    >>> from_str.query()
    'ie=utf-8&f=8&rsv_bp=0&rsv_idx=1&tn=baidu&wd=newbee&rsv_pq=c87beda3000c6192&rsv_t=69e1zt6mVNzkBbdJDmwMn6Q%2FanmE7t9awTdR05ddZ2dL0Sxnf9BuLA3TB3g&rsv_enter=1&rsv_sug3=8&rsv_sug1=10&rsv_sug7=100'

		
		
		
		
