Scraping Web Videos with Python
2021/11/25 20:11:03
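This article shows how to scrape videos from a web page with Python, using pearvideo.com as the example site. It may be a useful reference if you are working on a similar problem.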
# coding=utf-8
import random
from multiprocessing.dummy import Pool  # thread pool exposing the multiprocessing API

import requests
from lxml import etree

# @starttime: 2021/11/25 10:21
# @endtime: 2021/11/25 15:20

header = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/95.0.4638.69 Safari/537.36 Edg/95.0.1020.53'
}


def get_vedio_data(dic):
    # Download one video and save it as <name>.mp4 in the current directory.
    vedio = requests.get(url=dic['url'], headers=header).content
    with open(dic['name'] + '.mp4', 'wb') as fp:
        fp.write(vedio)
    print(dic['name'], 'downloaded successfully')


if __name__ == '__main__':
    url = 'https://www.pearvideo.com/'
    respon1 = requests.get(url=url, headers=header)
    page_1 = respon1.text.encode('utf-8')
    # Writing this did not go very smoothly.
    page_1_xpa = etree.HTML(page_1)
    # Trust your first idea and retry it 3 to 4 times before changing approach.
    page_1_list = page_1_xpa.xpath('//div[@class="vervideo-bd"]')
    urls = []
    for li in page_1_list:
        str1 = ''.join(li.xpath('./a//@href'))
        # Address of the video's detail page
        vedio_adress_1 = 'https://www.pearvideo.com/' + str1
        header2 = {
            'User-Agent': header['User-Agent'],
            # The status endpoint is requested with the detail page as referer.
            'referer': vedio_adress_1
        }
        # Renamed from `id` to avoid shadowing the built-in.
        cont_id = str1.split('_')[-1]
        # After repeatedly analysing the page, I found that its request sends two parameters:
        # https://www.pearvideo.com/videoStatus.jsp?contId=1581126&mrd=0.09969085979493786
        params = {
            'contId': cont_id,
            'mrd': str(random.random())
        }
        vedio_name = ''.join(li.xpath('./a/div[2]/text()'))
        vedio_page = requests.get(url='https://www.pearvideo.com/videoStatus.jsp',
                                  params=params, headers=header2).json()
        url1 = vedio_page['videoInfo']['videos']['srcUrl']
        # Replace the leading token of the file name with 'cont-<id>' to get the download URL.
        key = 'cont-' + cont_id
        video_down_url = url1.replace(url1.split('/')[-1].split('-')[0], key)
        urls.append({
            'name': vedio_name,
            'url': video_down_url
        })

    # Download with 5 worker threads; map() blocks until every download has finished.
    pool = Pool(5)
    pool.map(get_vedio_data, urls)
    pool.close()
    pool.join()

Dedicated to everyone still struggling to scrape Pear Video (pearvideo.com). I'm moving on ahead.
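
The least obvious step in the script is the URL rewrite: the `srcUrl` returned by `videoStatus.jsp` is not used directly. Instead, the leading token of its file name is replaced with `cont-<id>`. The following is a minimal sketch of just that step, isolated so it can be tried without any network access. The function names `extract_cont_id` and `build_download_url` and the sample URL are hypothetical and chosen for illustration only; the real `srcUrl` format is whatever the site returns and may differ. Unlike the `str.replace` call in the script, this version rewrites only the file name, so a token that also happened to appear earlier in the path would be left alone.

```python
def extract_cont_id(href: str) -> str:
    """Return the numeric id from a relative link such as 'video_1581126'."""
    return href.split('_')[-1]


def build_download_url(src_url: str, cont_id: str) -> str:
    """Swap the leading token of the file name for 'cont-<id>'.

    Assumes the file name has the form '<token>-<rest>', which is what the
    script above relies on as well.
    """
    head, _, filename = src_url.rpartition('/')
    _token, sep, rest = filename.partition('-')
    return f"{head}/cont-{cont_id}{sep}{rest}"


if __name__ == '__main__':
    # Made-up sample URL, not a real response from the site.
    sample = 'https://video.example.com/mp4/20211125/1637800000000-15800000_adpkg-ad_hd.mp4'
    print(build_download_url(sample, extract_cont_id('video_1581126')))
```

Keeping the rewrite in a small pure function makes it easy to check against a URL copied from the browser's network panel before running the full download loop.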