Get HTML from a Bubble app’s page

Hi,
When I try to use axios.get() to grab the HTML of any page of my Bubble app, I get a JS object describing the Bubble page (assets, scripts…) and not the HTML page that should be served.
If I pass any other external address (e.g. www.google.com) to axios.get(), I get the HTML of the requested address.
Does anybody know if there is a specific parameter to tell Bubble to serve the HTML page when the call comes from the app itself, inside a plugin?

Thank you for your help.

I think the way Bubble’s JS framework works is as follows:

  1. You request a page
  2. Bubble delivers that initial page with a link to another URL
  3. That url downloads the JS to build your app
  4. It evals that JS
  5. Tons of DOM updates later, stuff gets rendered on the screen

So your best bets are:

  • grab that JS payload and reverse engineer it. It pretty much has your entire app
    or
  • use something like Puppeteer to navigate to that page and then do the stuff / scraping you need
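
For the first option, a rough sketch of how you might locate the JS payload URLs once you have the initial HTML as a string. The function name extractScriptUrls, the sample markup, and the example.com addresses are made up for illustration; a regex is only good enough for a quick look, not for robust parsing.

```javascript
// Collect the src attribute of every <script src="..."> tag in an HTML string.
// Inline scripts (no src attribute) are skipped.
function extractScriptUrls(html, baseUrl) {
  const urls = [];
  const scriptTag = /<script\b[^>]*\bsrc\s*=\s*["']([^"']+)["'][^>]*>/gi;
  for (const match of html.matchAll(scriptTag)) {
    // Resolve relative and protocol-relative paths against the page URL.
    urls.push(new URL(match[1], baseUrl).href);
  }
  return urls;
}

// Hypothetical bootstrap HTML, not copied from a real Bubble page.
const sampleHtml = `<!doctype html>
<html><head>
<script type="text/javascript">window.example = 1;</script>
<script src="/package/page_js/abc/page.js"></script>
<script src="//cdn.example.com/lib.js"></script>
</head><body></body></html>`;

console.log(extractScriptUrls(sampleHtml, "https://myapp.example.com/index"));
```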

Thank you for your suggestion. Unfortunately, the object returned by the call does not give any information about another page to load, and there is no 301 or 302 redirect. I guess that Bubble sees the incoming connection as coming from itself and does not serve HTML with JS to be executed, only a JSON description of the page.

Puppeteer requires an instance of Chrome on the server running Node.js, which is not possible in the Bubble environment.

I’ve never used axios.get(), but assuming it’s simply fetching a resource at a specific URL, then you’re definitely seeing the raw HTML (which contains a lot of JS inside <script> tags). What might be confusing you is that the lion’s share of a Bubble page is generated dynamically via JS (and there are some hefty JS assets/files downloaded along with the page).

You can confirm this using Chrome dev tools to inspect the page. What you see on the Elements tab represents the rendered page - i.e. it’s the in-memory DOM tree that was constructed on the fly. To see what actually came “across the wire”, you must view the HTML (click the page name) on the Sources tab. There you should see what axios.get() returns. So basically, the initial HTML that’s sent across the connection serves to “bootstrap” the process of “building” the page.

And of course, you can see all the assets needed to build and render a Bubble page on the Network tab of the browser dev tools.

Not sure what it is you’re trying to accomplish, but if you want to access or manipulate the DOM from a plugin, then you can just do so directly via client-side JS.

Unfortunately, I’m not confused. Doing an axios.get() on the Bubble app’s website URL (or using fetch or whatever) from the plugin, on the Bubble server side, returns a JSON object and not HTML with lots of scripts…
To be explicit, this is what I get:
{“page_name”:“index”,“assets”:{“css”:[“/package/run_css/afb33401688525745e976a87ed6/test-9999/test/index/xfalse/xfalse/run.css”],“js”:[“/package/global_js/c5520a87a513deef6da2bf8e0742cfa817354/test-99999/test/index/xnull/xfalse/xfalse/fr_fr/xfalse/xfalse/global.js”,“/package/page_js/180ac8748124e718a8dece2078942a01e98c93/test-99999/test/index/xfalse/fr_fr/page.js”]},“metadata”:{“title”:“Alter Edition”,“description”:“Alter Edition”,“favicon”:“//91e5bcf6562d5ecacb839.cdn.bubble.io/f1711545141x3545682800/icon.svg"},“headers”:{“custom_app_header”:"\n”,“custom_body_script”:“”,“seo_headers”:[“”,“”,“”,“”,“”,“”,“”,“”,“”,“”,“”],“basic_headers”:[“”,“”],“plugin_app_headers”:[{“plugin_id”:“1568299250417x684448291308175400”,“header”:“\n”}],“plugin_page_headers”:},“errors”:}
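
If it helps anyone poking at this kind of response, here is a small sketch that turns the relative asset paths of such a descriptor into absolute URLs. The helper name resolveAssetUrls, the placeholder paths, and the example.com base URL are all hypothetical; only the assets.css / assets.js shape comes from the output above, and nothing in this thread confirms that these files can be fetched from a server-side action.

```javascript
// Turn the relative asset paths in a page descriptor into absolute URLs.
// Missing "assets", "css" or "js" entries are treated as empty lists.
function resolveAssetUrls(descriptor, appBaseUrl) {
  const assets = descriptor.assets || {};
  const toAbsolute = (path) => new URL(path, appBaseUrl).href;
  return {
    css: (assets.css || []).map(toAbsolute),
    js: (assets.js || []).map(toAbsolute),
  };
}

// Trimmed-down, hypothetical descriptor with placeholder paths.
const descriptor = {
  page_name: "index",
  assets: {
    css: ["/package/run_css/aaa/run.css"],
    js: ["/package/global_js/bbb/global.js", "/package/page_js/ccc/page.js"],
  },
};

console.log(resolveAssetUrls(descriptor, "https://myapp.example.com"));
```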

Hmm, I see…

Well, as I said, I’m not familiar with axios.get() so you might want to double check the docs on it. Perhaps it’s actually evaluating the page JS.

If what you want is the page HTML, though, would something like the following work?

document.getElementsByTagName('html')[0].innerHTML

EDIT

Ok, I just noticed you’re making the call server-side. That wasn’t clear to me (probably b/c I haven’t used axios), so my apologies. What I can tell you, though, is that when I use curl to access a Bubble page, the actual HTML is returned, which suggests that the call you’re making might be evaluating (or attempting to evaluate) the page JS.

Not another page, but another URL.

I thought that’s how it worked. When I was spelunking that URL to create a utility that rebuilds the DB of any Bubble app, I could have sworn there was data about pages, reusables, elements, etc…

[Image: bubble-db-diagram]


Yes, I make the call from the server side. If I call any external URL (e.g. www.google.com) I get the HTML of the Google page, but strangely, if I call a page of a Bubble app I get the JSON object… It is as if Bubble checks the calling IP, sees that the call is coming from itself, and returns JSON instead of the HTML. I was hoping for a query parameter that forces it to send HTML (even with scripts to execute), but at least HTML.

wooooow, I’m surprised that we can retrieve the structure of a database from any bubble application without any authentication. In terms of security, bubble.io really isn’t that serious. Even if we don’t access the data, it still reveals information about how it works and any data it may contain. Not a pretty sight.

 
FWIW, the following works for me. It uses Bubble’s built-in request call and is nearly verbatim from the docs.

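async function(properties, context) {
    // Fetch the page with Bubble's built-in request call instead of axios.
    const response = await context.v3.request({
        url: properties.url
    });
    // response.body holds the HTML as it came across the wire.
    return { result: response.body };
}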

 
The above results in the following being returned from the SSA, which is exactly what I see in the Sources tab of dev tools.

<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<title>Contact Us</title>
<script type="text/javascript">
    window.bubble_session_uid = '1712720274496x766141404782930000';
</script>

<!-- bunch of other HTML code -->

<img style="display: none;" src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" />
</body>
</html>
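
If you want the action to fail loudly when something other than markup comes back (for instance a JSON page description like the one posted earlier), a guard along these lines might help. This is only a sketch: fetchPageHtml and fakeRequest are made-up names, the request function is passed in as a parameter so the code can run outside Bubble, and the only thing assumed about the response is the body property used above.

```javascript
// Fetch a page through an injected request function and make sure markup came back.
// `request` is expected to behave like context.v3.request: it takes { url }
// and resolves to an object with a `body` property.
async function fetchPageHtml(request, url) {
  const response = await request({ url });
  const body = String(response.body);
  // A JSON page description starts with "{", real markup starts with "<".
  if (!body.trimStart().startsWith("<")) {
    throw new Error(`Expected HTML from ${url} but got: ${body.slice(0, 60)}`);
  }
  return body;
}

// Stub standing in for Bubble's request call (hypothetical response).
const fakeRequest = async () => ({
  body: "<!doctype html><html><head><title>Contact Us</title></head></html>",
});

fetchPageHtml(fakeRequest, "https://myapp.example.com/contact")
  .then((html) => console.log(html.length));
```

Inside the server-side action, the call would presumably look like fetchPageHtml((opts) => context.v3.request(opts), properties.url), with the returned string placed in the result field.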


Great, it’s working! Thank you very much for your help and for pointing me in the right direction. Just use the context request call (no need for axios). So simple; I missed that point when reading the docs… :upside_down_face: