Feature request: Test compatibility with "Blackhole for Bad Bots" #284

Open
Zodiac1978 opened this issue Jul 22, 2023 · 4 comments

Comments

@Zodiac1978
Member

Zodiac1978 commented Jul 22, 2023

Jeff Starr has this plugin: https://wordpress.org/plugins/blackhole-bad-bots/

It adds a hidden link that is also disallowed via robots.txt. If a bot follows that link anyway, the plugin treats it as a bad bot and blocks its IP address.

The problem is that it is not compatible with every caching plugin, because it relies on hooks that are not fired when only the cached HTML is served.

See https://wordpress.org/plugins/blackhole-bad-bots/#installation for the problem description
and https://plugin-planet.com/blackhole-pro-cache-plugins/ for the list of compatible plugins.

Let's test it and hopefully we can make Cachify compatible.

@stklcode
Contributor

If you configure Cachify to generate static HTML files that are served by your webserver or some caching CDN directly, I don't see any elegant way to achieve this from the plugin's perspective.

With cached content served by the plugin itself, this should be possible.
All of the listed compatible plugins have to be configured so that the cached content is sent to the output only after Blackhole Bad Bots has taken action.

Currently, Cachify hooks into template_redirect, which is pretty early.

We should

  • check in which phase Blackhole Bad Bots does its stuff
  • depending on the answer, see whether we might introduce something like a "late init" switch to optionally use a later phase

For the majority of use cases the earlier the better, as it reduces latency and computational overhead, so if we need to do things later for compatibility, I'd prefer a switch for that.

@Zodiac1978
Member Author

> check in which phase Blackhole Bad Bots does its stuff

The most recent version is 3.6: https://plugins.trac.wordpress.org/browser/blackhole-bad-bots/tags/3.6
(No GitHub repo.)

@Zodiac1978
Member Author

> check in which phase Blackhole Bad Bots does its stuff

On the linked page I saw this explanation:

> With page caching, the required init hook may not be fired, which means that plugins like Blackhole for Bad Bots are not able to check the request to see if it should be blocked.
https://plugins.trac.wordpress.org/browser/blackhole-bad-bots/tags/3.6/blackhole.php#L77

Maybe we could trigger blackhole_scanner ourselves if we detect the plugin instead of changing the hook?

Or we could look at those other caching plugins mentioned and how they achieve compatibility.
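
A rough sketch of the first idea, in case it helps the discussion. The helper name cachify_sketch_run_blackhole_scanner is made up for illustration and is not part of Cachify. The only things taken from the plugin are the function name blackhole_scanner and the fact that it is hooked on init; calling it without arguments is an assumption I have not verified against the 3.6 source. did_action() is WordPress core. Whether a manual call like this is enough to make the block work is untested.

```php
<?php
/**
 * Hypothetical helper (not part of Cachify): give Blackhole for Bad Bots
 * a chance to inspect the request before cached output is sent.
 *
 * @return bool True if the scanner was triggered by this call.
 */
function cachify_sketch_run_blackhole_scanner(): bool {
	// Plugin not active: nothing to do.
	if ( ! function_exists( 'blackhole_scanner' ) ) {
		return false;
	}

	// If WordPress has already fired `init`, the scanner was hooked there
	// and has had its chance to run, so do not run it a second time.
	if ( function_exists( 'did_action' ) && did_action( 'init' ) > 0 ) {
		return false;
	}

	// Assumption: the scanner can be called without arguments.
	blackhole_scanner();

	return true;
}
```

The idea would be to call this right before the cached page is printed. It does nothing for the static-HTML setup, because there PHP never runs at all.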

@Zodiac1978
Member Author

> Or we could look at those other caching plugins mentioned and how they achieve compatibility.

After reading some more, it seems those plugins either have a late init option or allow full page caching to be disabled entirely.

Two plugins can be "fixed" by adding an MU plugin containing only this code:

function blackhole_verify_nonce($verify) { return true; }
add_filter('blackhole_verify_nonce', 'blackhole_verify_nonce');

This looks like a hard-coded true for the nonce check, i.e. the filter always reports the nonce as valid.
