Refine the user agent check #112
Comments
Excluding the word "bot" in addition to the current filter works quite well on another system where I implemented a similar statistics processor on my own. (Shame on me, I never contributed back 🙈)
This is a very basic regex blacklist which should detect most of them:
I have enhanced the bot detection by using the gist given by @Zodiac1978.
…ent-check Ticket #112 - Enhanced bot detection.
There is a bot from a French search engine which uses "Linux" in its user agent:
https://regex101.com/r/xZALC3/1/
Exabot - https://www.keycdn.com/blog/web-crawlers
Because of this problem, and the problem with mobile user agents mentioned above, we should also blacklist the most widely used bots.
I found a great resource that lists bots together with regex patterns, which can be used to detect and blacklist many of them:
https://github.com/monperrus/crawler-user-agents
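To make the proposal concrete, here is a minimal sketch of a blacklist check, written in Python for illustration only (it is not the plugin's code). The pattern list `BOT_PATTERNS` and the function name `is_bot` are hypothetical; a real implementation would probably take its patterns from a maintained list such as the one linked above. Treating an empty user agent as a bot is also an assumption.

```python
import re

# Hypothetical, hand-picked patterns for illustration only.
# A real list would come from a maintained source.
BOT_PATTERNS = [
    r"bot",      # generic keyword: Googlebot, bingbot, Exabot, ...
    r"crawl",
    r"spider",
    r"slurp",
]

# Combine all patterns into one case-insensitive alternation.
BOT_REGEX = re.compile("|".join(BOT_PATTERNS), re.IGNORECASE)


def is_bot(user_agent: str) -> bool:
    """Return True if the user agent is empty or matches a blacklisted pattern."""
    if not user_agent:
        # Assumption: requests without a user agent are not real visitors.
        return True
    return BOT_REGEX.search(user_agent) is not None


# A crawler that would pass a naive "contains Linux" style check is still caught.
print(is_bot("Mozilla/5.0 (compatible; Exabot/3.0; +http://www.exabot.com/go/robot)"))
# A regular desktop browser on Linux is not flagged.
print(is_bot("Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"))
```

A plain substring such as "bot" is deliberately broad, so it can also flag a legitimate user agent that happens to contain those letters. More specific per-crawler patterns reduce that risk at the cost of having to keep the list up to date.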