Avoiding crawler detection with delayed submissions and proxies

xiaoxiao2021-02-28  39

Delaying submissions:

import urllib.request
import urllib.parse
import json
import time

while True:
    content = input('Enter the text to translate (enter "q!" to quit): ')
    # Leave the loop when the user types the quit command from the prompt
    if content == 'q!':
        break

    url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'

    # Send a browser-like User-Agent instead of urllib's default one
    head = {}
    head['User-Agent'] = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36'

    # Form fields submitted with the POST request
    data = {}
    data['i'] = content
    data['from'] = 'AUTO'
    data['to'] = 'AUTO'
    data['smartresult'] = 'dict'
    data['client'] = 'fanyideskweb'
    data['doctype'] = 'json'
    data['version'] = '2.1'
    data['keyfrom'] = 'fanyi.web'
    data['action'] = 'FY_BY_REALTIME'
    data['typoResult'] = 'false'
    data = urllib.parse.urlencode(data).encode('utf-8')

    req = urllib.request.Request(url, data, head)
    response = urllib.request.urlopen(req)
    html = response.read().decode('utf-8')

    target = json.loads(html)
    print("Translation: %s" % (target['translateResult'][0][0]['tgt']))

    # Pause 5 seconds before accepting the next submission
    time.sleep(5)

The code above sleeps for 5 seconds after each request, so it submits at most once every 5 seconds.
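As a variation on the fixed sleep, the delay can be wrapped in a small helper that only waits for whatever part of the interval has not already passed (for example, while the user was typing). This is a minimal sketch, not part of the original post: the `Throttle` class and its `clock` and `sleep` parameters are hypothetical names introduced here for illustration.

```python
import time


class Throttle:
    """Keep successive calls to wait() at least min_interval seconds apart.

    Hypothetical helper for illustration; clock and sleep are injectable
    so the behaviour can be checked without really sleeping.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        """Sleep for the remaining part of the interval and return that delay."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

To use it in the loop above, one would create `throttle = Throttle(5)` before the loop and call `throttle.wait()` just before `urllib.request.urlopen(req)`, in place of the trailing `time.sleep(5)`.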

 

Using a proxy:

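1) Build a dictionary of the form {'type': 'proxy ip:port'}, where the type is the protocol, for example 'http'

2) Create a custom opener: pass the dictionary to urllib.request.ProxyHandler, then pass the resulting handler to urllib.request.build_opener

3) a. Install the opener, so that every later urllib.request.urlopen call goes through it

        urllib.request.install_opener(opener)

    b. Or call the opener directly, for a single request

    opener.open(url)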
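The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the function names (`build_proxy_opener`, `fetch_with_installed_opener`, `fetch_with_opener`) are hypothetical, and the proxy address in the usage note is a placeholder from a documentation-only range, not a working proxy.

```python
import urllib.request


def build_proxy_opener(proxy_address, scheme="http", user_agent=None):
    """Steps 1 and 2: build an opener that routes requests through a proxy."""
    # Step 1: dictionary mapping the protocol type to "ip:port"
    proxies = {scheme: proxy_address}

    # Step 2: wrap the dictionary in a ProxyHandler and build a custom opener
    proxy_support = urllib.request.ProxyHandler(proxies)
    opener = urllib.request.build_opener(proxy_support)

    # Optionally send a browser-like User-Agent with every request
    if user_agent:
        opener.addheaders = [("User-Agent", user_agent)]
    return opener


def fetch_with_installed_opener(opener, url):
    """Step 3a: install the opener globally, then use plain urlopen."""
    urllib.request.install_opener(opener)
    with urllib.request.urlopen(url) as response:
        return response.read()


def fetch_with_opener(opener, url):
    """Step 3b: call the opener directly without installing it."""
    with opener.open(url) as response:
        return response.read()
```

A call might look like `fetch_with_opener(build_proxy_opener("203.0.113.10:8080"), "http://example.com")`, with the placeholder replaced by a real proxy. Installing the opener (3a) affects every later `urlopen` call in the process, while calling it directly (3b) keeps the proxy limited to the requests that need it.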
