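While scraping the latest 2017 data, I found that some region names come out garbled. The page source on the National Bureau of Statistics (国家统计局) site declares its encoding as gb2312, but the content is actually GBK, so the encoding has to be set manually:

```python
import time

import requests
# Assumption: UserAgent comes from the fake_useragent package (it provides `.random`).
from fake_useragent import UserAgent


def getUrl(url, num_retries=5):
    ua = UserAgent()
    headers = {'User-Agent': ua.random}
    try:
        response = requests.get(url, headers=headers)
        # The page declares gb2312 but is really GBK, so override the encoding.
        response.encoding = "GBK"
        data = response.text
        print(url)
        return data
    except Exception as e:
        if num_retries > 0:
            time.sleep(10)
            print(url)
            print("requests fail, retry!")
            return getUrl(url, num_retries - 1)  # recursive call
        else:
            print("retry fail!")
            print("error: %s" % e + " " + url)
            return  # returns None, so the caller will fail afterwards
```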
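For anyone who wants to see the mismatch without hitting the site, here is a minimal offline sketch. It assumes that "镕" is a character covered by GBK but not by GB2312 (the sample character is my own pick, not taken from the scraped data), and it uses only Python's built-in codecs:

```python
# Hypothetical sample: "镕" is assumed to exist in GBK but not in GB2312.
# Any character with that property would behave the same way.
sample = "镕"

# Bytes as a GBK-encoded page would send them over the wire.
raw = sample.encode("gbk")

# Decoding with the declared charset (gb2312) cannot recover the character;
# errors="replace" substitutes placeholders instead of raising.
as_declared = raw.decode("gb2312", errors="replace")

# Decoding with the actual charset (GBK) round-trips cleanly.
as_actual = raw.decode("gbk")

print(repr(as_declared))
print(repr(as_actual))
```

GBK is a superset of GB2312, so forcing GBK should not break pages that really are GB2312.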
Thanks! The code has been updated!