Commit

Fix the HTML parser API
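After a redirect, the address gains several extra path levels (/.../),
so the favicon path cannot simply be appended to it. The URL is now split into segments and rejoined.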
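
The split-and-rejoin step can be sketched roughly as follows. This is an illustration under assumptions, not the repository's code: `site_origin` and `site_origin_stdlib` are hypothetical helper names, the example URL is made up, and the sketch assumes the stored URL starts with a scheme such as `https://`. The second helper does the same thing with the standard library's `urlsplit`, as a cross-check.

```python
from urllib.parse import urlsplit


def site_origin(url: str) -> str:
    """Hypothetical helper: keep only scheme and host, as the diff does inline."""
    # 'https://example.com/a/b' splits into ['https:', '', 'example.com', 'a', 'b'],
    # so index 0 is the scheme (with its colon) and index 2 is the host.
    parts = url.split('/')
    return f"{parts[0]}//{parts[2]}"


def site_origin_stdlib(url: str) -> str:
    """Same result using urllib.parse.urlsplit instead of manual splitting."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


# Made-up URL standing in for an address that picked up a path after a redirect.
redirected = "https://example.com/zh/home/"
print(site_origin(redirected) + "/favicon.ico")         # https://example.com/favicon.ico
print(site_origin_stdlib(redirected) + "/favicon.ico")  # https://example.com/favicon.ico
```

Appending `/favicon.ico` to the redirected URL directly would instead give `https://example.com/zh/home//favicon.ico`, which is the failure this commit addresses.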
HowieHz committed Feb 8, 2024
1 parent 328368b commit c8feb86
Showing 2 changed files with 6 additions and 7 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -18,7 +18,7 @@ The script is used to automatically get the favicon

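 On the first run, links.txt is generated in the directory the script is run from; list in it the websites whose favicons you want to fetch
 (a line that begins with # is ignored by the program)
-The following forms are all allowed:
+The following forms are all allowed: (if the site redirects on entry, enter the domain after the redirect)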

```txt
https://howiehz.top
11 changes: 5 additions & 6 deletions src/api/html_parser_api.py
@@ -52,15 +52,14 @@ def get_binary_img(self, href_value: str, file_type: str):
             # print(f"#gbi1 url={href_value}")
             r: requests.Response = requests.get(url=f'{href_value}', headers=headers)
         elif href_value.startswith('/'):  # handle /xxx.xxx
-            # print(f"#gbi2 url={self.url}{href_value}")
-            r: requests.Response = requests.get(url=f'{self.url}{href_value}', headers=headers)
+            # print(f"#gbi2 url={self.url.split('/')[0]}//{self.url.split('/')[2]}{href_value}")
+            r: requests.Response = requests.get(url=f"{self.url.split('/')[0]}//{self.url.split('/')[2]}{href_value}", headers=headers)
         elif href_value.startswith('https:') or href_value.startswith('http:'):  # handle https://xxx.xxx or http://xxx.xxx
             # print(f"#gbi3 url={href_value}")
             r: requests.Response = requests.get(url=f'{href_value}', headers=headers)
-        else:
-            href_value = f'/{href_value}'  # handle xxx.xxx
-            # print(f"#gbi4 url={self.url}{href_value}")
-            r: requests.Response = requests.get(url=f'{self.url}{href_value}', headers=headers)
+        else:  # handle xxx.xxx
+            # print(f"#gbi4 url={self.url.split('/')[0]}//{self.url.split('/')[2]}/{href_value}")
+            r: requests.Response = requests.get(url=f"{self.url.split('/')[0]}//{self.url.split('/')[2]}/{href_value}", headers=headers)
         self.ret.append((r.content, file_type, 'binary'))
         return

