Hi everyone,
I recently encountered a huge spike in WU consumption on my Bubble app. After I reached out to Bubble support, they traced it back to Facebook crawler bots (`facebookexternalhit`) making excessive requests to pages with Facebook tracking parameters (e.g., `fbclid`). In one day there were over 800K requests, the majority coming from Facebook-related IPs and bots.
Bubble support suggested trying to block the bots via `robots.txt` using:

```
User-agent: facebookexternalhit
Disallow: /
```
However, based on external forums and discussions, it seems Facebook's bots do not honor `robots.txt` directives, making this approach ineffective.
I wanted to ask if anyone in the Bubble community has:
- Successfully mitigated Facebook bot crawling on their Bubble app.
- Found any Bubble-friendly solutions that prevent these bots from consuming excessive WU, or at least slow them down.
The forum posts shared by Bubble support all discuss code-based solutions (rate-limiting, headers, etc.), which aren’t directly applicable in Bubble’s no-code environment.
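For anyone landing here from a coded stack (and to illustrate what I mean by "not directly applicable"), below is a minimal sketch of the user-agent rate limiting those threads describe, written as a Flask hook. This is an assumption-laden example, not something you can run inside Bubble: it presumes you control the web server, and the window and threshold numbers are placeholders I made up, not recommendations.

```python
# Minimal sketch of user-agent rate limiting for facebookexternalhit.
# Not Bubble-compatible; assumes you control the server. The route and
# all threshold values below are hypothetical examples.
import time

from flask import Flask, abort, request

app = Flask(__name__)

WINDOW_SECONDS = 60        # length of each counting window (placeholder)
MAX_HITS_PER_WINDOW = 30   # crawler requests allowed per window (placeholder)

# Crude in-memory state: start time and hit count for the current window.
_window_start = 0.0
_hit_count = 0


@app.before_request
def throttle_facebook_crawler():
    """Return 429 once the Facebook crawler exceeds the rate limit."""
    global _window_start, _hit_count
    ua = request.headers.get("User-Agent", "")
    if "facebookexternalhit" not in ua:
        return  # normal visitors are unaffected
    now = time.time()
    if now - _window_start > WINDOW_SECONDS:
        # Start a fresh counting window.
        _window_start, _hit_count = now, 0
    _hit_count += 1
    if _hit_count > MAX_HITS_PER_WINDOW:
        # 429 Too Many Requests asks well-behaved clients to back off.
        abort(429)


@app.route("/listing/<slug>")
def listing(slug):
    # Stand-in for the pages the crawler is hammering in my case.
    return f"listing page for {slug}"
```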
Here are some articles that discuss this for coded websites:
- facebook - excessive traffic from facebookexternalhit bot - Stack Overflow - this thread confirms that problems like this do indeed seem to come from a Facebook crawler/scraper bot. One of the responses mentions a tag you may be able to add to rate-limit the crawler and slow it down (see the note after this list).
- facebook - Does user agent `facebookexternalhit` represent a user click? - Stack Overflow - this thread offers more general context on what this user agent represents.
- woocommerce - facebookexternalhit/1.1 bot Excessive Requests, Need to Slow Down - Stack Overflow - this thread has several responses offering different ways to potentially slow the bot down.
- php - Facebook crawler is hitting my server hard and ignoring directives. Accessing same resources multiple times - Stack Overflow - this thread goes into more technical discussion of the crawler and offers some code solutions as well.
- facebookexternalhit/1.1 thousands of requests | WordPress.org - this thread mentions some more straightforward code-side fixes for reducing the volume of hits from this crawler.
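For what it's worth, the rate-limit tag mentioned in the first thread appears to be Facebook's `og:ttl` Open Graph property (e.g. `<meta property="og:ttl" content="2592000" />`), which is supposed to tell the crawler how many seconds to wait before re-scraping a page. In theory that could be added to a Bubble page's header, but I haven't verified that the crawler honors it at this volume, and each unique `fbclid` URL would presumably still get a first scrape regardless.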
Any insights, workarounds, or suggestions to handle this would be incredibly helpful!
Thank you so much in advance for your help!
Edit: I have the Facebook plugin, which I only use for OAuth logins. I also have the Meta Pixel script in my universal header. The crawler traffic is mostly (if not entirely) hitting my /listing/[slug] pages.