Looking for recommendations on how to handle 80,000 row data that changes often

Hi everyone,

Would love some feedback or suggestions on the best way to do this. We have an Excel file today which basically is a list of addresses and then it shows what Internet services are available there from our company. It’s around 80,000 rows and is updated generally monthly.

I’m trying to integrate it somehow with Bubble especially with the mapping capability, to have it show the addresses on a map in our Bubble application for example, as well as searching it and so on. The challenge though is that the Excel updates monthly as I mentioned, so I need an easy way to just delete all records and reupload the new data.

I’ve had lots of issues in the past with Bubble and importing Excel or CSV as it really starts to fall apart when you have more than 500 records or so, which even Bubble support had told me can be an issue, so 80,000 I think is impossible to upload via Bubble’s import feature.

Is it possible to have the data be elsewhere, like in a 3rd party database, or even an Excel or Google Sheet, but then have Bubble utilize the data from the external database but similar to if it was actually in Bubble? The main thing is I’m not looking to just query it to return one or two results, as that would be easy. I’m looking to more view a map in Bubble where I might pick a province and then want it to display 20,000 map points on it, or even display them in a repeating group and so on.

Otherwise what’s the standard way people would use to import 80,000 records into Bubble if not using the built in importer?

Or am I just asking for trouble here and would it be better to just upload it all to a custom Google Map and just embed that in Bubble?

I was just going to suggest considering a Google Sheet approach. You could use the API to pull the data directly from the sheet. Of course, you’d have to carefully consider any quotas, etc. and the associated costs. Microsoft’s cloud version of Excel probably has something similar as well. And there are also ways to integrate via 3rd parties, such as Zapier.

Perhaps others with more experience with larger data sets can comment on whether it’d be best to pull it into Bubble and how best to go about it.

(Also, be aware of the potential costs associated with Google Maps API. You’ll likely want to optimize for the number of HTTP requests. 20,000 is a lot of markers to display at once on a map.)

That makes sense. Is Bubble just not designed to be used as a database? We use it as a CRM and switched from Salesforce but used Dynamics and others before that where it’s super common to use the built in import and export tools. I could import these 80,000 in just a couple clicks in under 2 minutes on Salesforce but seems like the Bubble importer is really limited. That’s why I was more wondering if people generally use some kind of external database and just use Bubble as a front end or not. I’m thinking the easiest is probably continuing to use it as a custom Google Map and just embed that into our application, or just provide a link to it etc.

Quite the contrary! Bubble has a very powerful DB. The only reason it might make sense to pull directly from GSheet is if the data are static, have no relationship to other data in the Bubble DB, and are updated frequently.

I’d definitely try to automate everything though. It makes no sense to me to routinely click a button (even if it’s just one click) if the whole process can be automated.

Again, others might have some insights to offer. First, search the forums. I’m certain I’ve seen other posts about importing large data sets.

Having read this more closely, if the source of the data is your company, then why not import it into Bubble one time and then use your Bubble app to keep it up to date. I’m not sure I understand the value of routine imports. Why not use Bubble itself as the authoritative data source?

That’s the issue I have though, how do you import 80,000 records into Bubble? Using the built in importer does not work, as it “validates the data” record by record, but after around 300 it grinds to a halt and slows right down. This is what support told me:

  • In general, we do not suggest that users use the CSV data import function to add large volumes of data to their app, as the operation does time out after a certain period of time: in the case of your records, you are experiencing the timeout after ~300 entries, which is about normal for our system. Instead, you can use our Data API to set up a bulk upload workflow, which will sift through your data and add it to your database without the time-out that you’re experiencing while uploading the CSV.*

But when I go to that Data API it says you can only do 1000 at a time and only every 4 minutes so doesn’t seem like that would work.

The reason I need to monthly delete all records and upload the file again is because the file is basically a list of 80,000 or so business addresses in Canada, and then shows what services our partner company offers there in terms of internet and what not. So each month it can change, as one address this month might not offer any services, where next month it has been updated to now be service ready and so on. It doesn’t show which have been updated really, so updating the records would be cumbersome but I could probably work around it. With Salesforce we would just each month delete all records, and then run the import again with the latest file to always keep the data fresh.

I’m not the best person to address that, as I’ve not [yet] encountered the need. However, it does look like it’s come up before.

Perhaps something like an automated GSheet approach might be worth considering after all.

Do you mind my asking where that “file” (the original data) comes from? IOW, how does it get into the spreadsheet? Is it provided to you in CSV format? Is it accessible via an API?

Thanks, the original Excel is an export from our partner companies Salesforce.com. It can’t be accessed via API, it’s basically just a report that is exported to Excel or CSV format. We are basically consultants for another company but are not employed by them, so we are provided the file when we need it but we can’t integrate with their Salesforce or what not since we are 2 different unrelated companies just in a partnership on a project.

I just want to add that we can keep using Google Maps today to handle this but I was more just wondering in general how people handle large amounts of data with Bubble since the importer tool just seems so basic is all. I understand 80,000 is a lot of rows, but we have files with 1000 that always fail which to me should be a non issue.

1 Like

Well, I can tell you that Salesforce has a very capable and robust API. I’ve done some integrations involving Salesforce, Google Apps Script, and WordPress. It might be worth exploring the possibility of integrating with your partner company’s Salesforce data via API. Salesforce is BIG on security, but there are ways to set up “trusted” apps within Salesforce. In other words, it’s certainly “possible” to create an automated integration between Bubble and Salesforce, but it would likely require some setup and dev on the Salesforce side. Perhaps you’ve already looked into it.

I hear you about the import issue. That’s pretty frustrating. Here’s a post by someone who imported more than 3 million records, but that was a one-time deal.

EDIT: FWIW, it would be pretty straightforward to automate the import of data into a Google Sheet using Google Apps Script (GAS).

1 Like

Hi @lmoreau

After you import 80,000 under Bulk or API into Bubble, you can use an external server to compare files and update relevant real information, which will reduce Bubble effort by 95% or more according to this case. If the import time is to be very fast for info processing, it will actually have to be done on an external server such as Amazon Windows Server in the cloud with a tool like WinAutomation.

As you found out, importing data with Bubble isn’t very fast yet. I don’t import more than 10 contacts at a time and it must be done fast enough under the 5 minutes time out Bubble delay, and you loop over again. That’s the way I solved the challenge. In fact, importing 100 contacts take few seconds, and traitment take minutes.