Workload Spike – how to figure out where the traffic comes from?

I am still developing the app, and the live version is not on its final domain yet but still under .bubbleapps.io. So far I had workload spikes whenever I worked on the app and imported and transformed data. That was no surprise, because I was the one creating the workload.

This morning I got a new workload spike notification, and the statistics show that it comes from the Live version, from page loads and fetching data: 18’000 pages loaded and 65’000 searches over the past 6 hours. The Workflow log for that time is empty. So it’s really just “traffic”.

Now I am wondering: Where does this traffic come from? Is it real people, search engine bots (thank you but please come back later once we’re on the final domain) or something else?

I somehow don’t manage to figure out where to see this in the Bubble Logs… can anyone give me a hint? Thank you!

The best place is an analytics tool like Google Analytics, Plausible, or Umami. It will tell you where users come from and where they go…

Bubble logs don’t give you a traffic source, so as @georgecollier already pointed out, you’ll need to look externally. You’ll need some time to collect data though.

If this is gobbling up all your workload, you can also add some lines to a robots.txt file. Something like:

User-agent: *
Disallow: /

or for specific AI bots:

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

To keep pages out of search engine indexes, add this to the page header:

<meta name="robots" content="noindex, nofollow">

This doesn’t actually block anything though, it’s a message to “well-behaved” bots to voluntarily not crawl your pages.

An easy fix is to apply username/password protection to your app. This blocks access for everyone, including bots, server-side.
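
If you want to sanity-check rules like the ones above before relying on them, Python's standard-library `urllib.robotparser` can evaluate a robots.txt text for a given user agent. This is only a minimal sketch: the `example.bubbleapps.io` URL and the `is_allowed` helper are made-up placeholders, and the result only tells you what a well-behaved bot should do, not what bots actually do.

```python
from urllib.robotparser import RobotFileParser

# Two rule sets in the style of the snippets above.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

BLOCK_AI_BOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""

# Placeholder URL (assumption), replace with your own app's address.
APP_URL = "https://example.bubbleapps.io/"


def is_allowed(robots_txt: str, user_agent: str, url: str = APP_URL) -> bool:
    """Return True if a rule-abiding crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "ClaudeBot", "Googlebot"):
        print(agent, "all-blocked:", is_allowed(BLOCK_ALL, agent),
              "ai-only:", is_allowed(BLOCK_AI_BOTS, agent))
```

With the bot-specific rules, a crawler that is not listed (Googlebot in this sketch) is still allowed, which is why the general `User-agent: *` rule is the one that cuts all rule-abiding crawler traffic.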

Thank you @petter – indeed right after writing my post I added the general robots.txt Disallow line, and as expected the requests have literally instantly decreased.

To gain more insight, I will add the Bubble Live URL to my GA tracking, as already suggested by @georgecollier. But I am already more relaxed, as the curve is now definitely at a lower level than in the early morning.

And – positive side-effect – it kicked my ass to move ahead with the launch :sweat_smile:

Haha, that’s great!

It would be interesting to see some numbers when you have them: bots consuming workload is a real-world problem for a lot of devs.

The Google Analytics Plugin by bubble.io has 270k installs, but it didn’t add any code to either the Dev or the Live page… is this possible? (my colleague would say “it does not work”)

Now with the GA4 + Tag Manager Plugin I see some movement in the Google Analytics Live dashboard and can start tracking for the future, but I won’t be able to look back and compare the before and after…

Hi Petter, after some time I now have some numbers. In the past month, there has been a steady load of 2000-4000 WU consumed per day (mainly for page loads and fetching data), while Google Analytics counts no substantial visitor traffic. While the Bubble dashboard counted 289,359 Page Loads, GA only shows an Event Count of 4.6K. That is a difference of more than 60x :thinking:

There was a Workload spike in the last 24 hours (15’000 WU consumed within a few hours, on a Saturday evening) without a substantial increase in the numbers seen on GA.

To be very honest, this situation is leaving me a bit clueless: first, how can I get more insight into the real traffic on the site that is consuming the WU? That would be a good basis for, secondly, keeping this traffic within reasonable limits through specific allow/disallow actions :thinking:

(btw I’m using the GA4 + Tag Manager plugin)

Page loads don’t just trigger on the first page load. They also trigger when you “navigate to current page”.

The workflow trigger will trigger, but I do not believe it consumes WUs.

The event itself doesn’t cost any WU since page events are client-side. Accompanying actions being triggered do though.

That’s quite a difference! I’m honestly not an authority on GA these days, but I do know that with GDPR, cookie settings, ad blockers etc., the numbers are not as reliable as they used to be. A discrepancy is to be expected, but yes, 60x is pretty big.

As @ihsanzainal84 pointed out, there are also a lot of unknowns in Bubble (such as a page load not necessarily being counted just once for each user).

You could try setting up a page load event (make sure it doesn’t fire again on “Go to page” navigations within the same page) and actually store a record of every visitor. That’ll give you at least a baseline of how many users, real or bot, are actually loading a page to “see” it once. You’ll be spending a bit of workload to learn this, of course, but it might be worth it. You can also do this for some key workflows (or other pages) to learn more about what’s actually being triggered.

Am I getting you right that the workload consumption is not necessarily the problem, but that you are confused by the number of users loading the page in the first place?

Thank you @petter. Well, the WU consumption is an issue if it’s eating up my monthly included WU without substantial “real” (customer) traffic. If I saw hundreds of visitors on the various pages, I would be happy to see WU being spent on that. But if GA shows me only a few visitors while Bubble Analytics counts thousands of page loads consuming WU, I’m not really happy. So because this difference is a black box, I am also concerned about the WU consumption…

Yeah, totally get that. My comment on WU was poorly phrased, so what I meant was that you could spend some extra WU “manually counting” page loads (perhaps trying to save some extra information like IP) with a workflow. I.e. create a log point for each page load (and make sure you don’t count “Go to page” within the same page).

That’ll give you a true baseline of how many times the page actually loads, with no room for inaccuracy.
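
Once such a log exists, a quick way to see who is behind the page loads is to export it and count the entries per IP. Here is a rough sketch in Python, assuming a hypothetical CSV export with `ip` and `page` columns. The column names, the `top_sources` helper and the sample rows are invented for illustration, so your own data type will look different.

```python
import csv
import io
from collections import Counter

# Invented sample export (assumption): one row per logged page load.
# The IPs are from address ranges reserved for documentation.
SAMPLE_EXPORT = """\
ip,page
203.0.113.7,index
203.0.113.7,pricing
198.51.100.23,index
203.0.113.7,index
"""


def top_sources(csv_text: str, n: int = 3) -> list[tuple[str, int]]:
    """Count logged page loads per IP and return the n busiest IPs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row["ip"] for row in reader)
    return counts.most_common(n)


if __name__ == "__main__":
    for ip, loads in top_sources(SAMPLE_EXPORT):
        print(f"{ip}: {loads} page loads")
```

If a handful of addresses account for most of the logged page loads, that points towards crawlers or scripts rather than many individual visitors.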