Update database from a file

I have a few thousand rows in a table of client items, with each row consisting of 20+ columns. I get update CSV files from my clients that should trigger the following actions:
1- If the item does not exist, I am supposed to insert a new row.
2- If the item is found, I am supposed to update the existing record.
Note: update files from clients have a couple of different formats. One is a list of all columns (which is how I can create a new row for a new item). The other only updates a flag (available/no longer available).

I realize there is a bulk upload and modify on the App Data page. However, I need to allow my clients to log in, go to a page, click on a browse button, select and upload a CSV file, click on “Process”, and effectively do what I can do using the App Data tools. I can create two pages to handle the difference in file formats.

I have searched all over and have not found any info on this (other than App Data tools).

Any suggestions would be greatly appreciated as this is a critical aspect of what I need to provide for our clients.

There is a free plugin to split large files. I have not worked on that piece of my system for a while, but in short this is what I do. Select the file, split it into multiple parts, and upload. Import into a staging area in the DB for review and approval using backend workflows. Once approved, update the main tables using backend workflows.

@cameronL there’s a plugin called T1 CSV Uploader which will allow end users to upload CSV files directly into the database. I believe the max number of rows supported in a CSV is 5,000. Bubble’s limitation for uploading as CSV is actually 100 rows, so this plugin will split one CSV file into many files (up to 50), each with 100 rows. You can check out a demo and video walkthrough of the plugin here: https://csvcreator.bubbleapps.io/csv_uploader
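
For anyone wondering what that splitting amounts to mechanically, here is a rough Python sketch. This is not the plugin’s actual code; the function name and the `.partNN` file naming are just for illustration:

```python
# Illustrative only: split one CSV into files of at most 100 data rows each,
# repeating the header row in every chunk.
import csv

def split_csv(path, rows_per_file=100):
    with open(path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)          # keep the header row for every chunk
        chunk, index = [], 0
        for row in reader:
            chunk.append(row)
            if len(chunk) == rows_per_file:
                write_chunk(path, index, header, chunk)
                chunk, index = [], index + 1
        if chunk:                      # remaining rows after the last full chunk
            write_chunk(path, index, header, chunk)

def write_chunk(path, index, header, rows):
    with open(f"{path}.part{index:02d}.csv", "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(header)
        writer.writerows(rows)
```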

I have a client project with a similar requirement and this is what I’ve done:

  • Add a new field to the database thing that the end user is adding/updating. For example, a boolean field called “Duplicate Check Pass”.

  • First upload the end user’s file and create a new database record from each row in the file regardless of whether or not it already exists. Each of these new database records should have a value of False in the “Duplicate Check Pass” field.

  • Once the batch of 100 items has been uploaded into the database, iterate through each item in the list to see if it already exists (of course you need some kind of unique identifier for this, such as a product ID). If the item does not exist elsewhere, then change “Duplicate Check Pass” to True. If the item already exists, then leave the “Duplicate Check Pass” field as False and use this item to update the fields on the original.

  • Schedule an API workflow at regular intervals to delete items with “Duplicate Check Pass” = False. Rather than deleting a list of items, I recommend first creating the list, then iterating through each item and deleting them one at a time (unless it’s a very small list). A rough sketch of this whole flow follows the list.
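
If it helps to see the flow outside of Bubble terms, here is a sketch of the logic only. The field names (product_id, duplicate_check_pass) and the dict-based storage are assumptions for illustration; in Bubble this is all done with workflows, not code:

```python
# Illustrative only: "uploaded" is the batch of staging records just created,
# each starting with duplicate_check_pass = False; "existing_by_id" is the
# current data keyed by the unique identifier (product_id here).

def reconcile_batch(uploaded, existing_by_id):
    for record in uploaded:
        original = existing_by_id.get(record["product_id"])
        if original is None:
            # Brand new item: mark the check as passed so it survives cleanup.
            record["duplicate_check_pass"] = True
        else:
            # Item already exists: copy the uploaded values onto the original
            # and leave the staging record flagged False for the cleanup job.
            for field, value in record.items():
                if field != "duplicate_check_pass":
                    original[field] = value

def cleanup(uploaded):
    # Scheduled job: delete flagged records one at a time, not as one list.
    for record in [r for r in uploaded if not r["duplicate_check_pass"]]:
        uploaded.remove(record)
```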

Just a couple of things to note with this method. First, any searches you’re using throughout the app would have to be modified to filter out results where “Duplicate Check Pass” = False. Second, if you have any database triggers on the data type that the end user is creating/updating, those may need to be changed. In my experience the app will easily crash if you have database triggers that are kicked off by bulk uploads.

You also mentioned a second kind of file which the end user would upload to update availability. For this I’d probably create a new data type, let’s say “Product Availability”, with 2 fields: “Product ID” and “Available?”. The end user can upload a CSV file with these 2 columns (same process as above), then you can iterate through each item, find the corresponding thing (in this example, a Product), and update its availability according to the “Available?” field.
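
Again just as an illustration (the row and field names below are assumptions), the second file boils down to a simple lookup-and-update:

```python
def apply_availability(availability_rows, products_by_id):
    # Each row carries a product_id and an available flag from the CSV;
    # find the matching product and update its availability.
    for row in availability_rows:
        product = products_by_id.get(row["product_id"])
        if product is not None:
            product["available"] = row["available"]
```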

I know it’s a bit of a messy process but it’s the best I got. Maybe someone else can come forward with a cleaner solution.

Thank you for your response (Alex and Dehacked79). I was surprised to run into this situation, where there are a number of topics on how to upload files, mostly without resolution and auto-closed after 2 months. Your suggestion makes sense, and under different circumstances your solution would certainly work. My particular application, being done for a client, will involve millions of rows that need to be updated on an almost daily basis, primarily through files containing thousands of rows (which is why it is not done using an API). Effectively, what I need is the App Data “Modify” method available as a plugin, so it could be made available to certain end users, with proper safeguards to process certain files and not others based on roles. Having to create thousands of duplicates in extra tables, build back-end processes to handle them, modify screens to ignore the extra/outdated info, and run clean-up jobs may be practical for a single file upload or smaller sets of data, but it would be impractical for a table with millions of rows (tens of millions) and 5 different types of files, each with the potential for thousands of rows and generated every few days.

It appears that, as much as I am extremely impressed by the level of polish, the set of features, and the ease of use and deployment offered by Bubble, it really is not geared towards a business application such as the one my client needs, unless and until Bubble makes the upload/modify/bulk features available as a plugin.

My search for easier db update from file in Bubble continues.

My suggestion would be to go the route of a SQL connector and Azure Data Factory, using a headless SQL DB and perhaps blob storage, and do all the heavy lifting within the Azure environment. One of the last severe shortcomings for production systems in Bubble is the way it handles mass DB transactions, how inelegantly it handles failures (capacity timeouts) in these processes, and the fact that extracting large amounts of data is nearly impossible. I once built a solution that would replicate the Bubble DB to SQL on the fly so I could generate CSV files, which I would then push back to Bubble so an end user could get the data… Given the cost of Bubble it’s understandable, but it would be nice to add something like dedicated DB DTUs (capacity / backend workflows) with guaranteed timeout immunity (improved throttling), even at extra cost, so you don’t have to move to a 3rd party to get things done.

Agreed. I am thinking along the same lines. So we will create old-fashioned programs outside of Bubble, scheduled using a cron job, to grab files from a directory, apply the updates to the database, and remove processed files. This method makes more sense anyway, as opposed to having a user sit and wait while thousands of rows are being updated, considering there is no need for user interaction beyond selecting a file to process. The app is otherwise done and the only thing left to do is load testing, after which we will move things to production.
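
Just to make the idea concrete, here is a minimal sketch of what such a loader could look like. The drop directory, table, and column names are placeholders I made up, and SQLite stands in for whatever database the real system targets:

```python
# Cron-driven loader sketch: pick up CSV files from a directory, upsert the
# rows into the database, and delete each file once it has been processed.
import csv
import sqlite3
from pathlib import Path

INBOX = Path("/data/client_uploads")      # hypothetical directory the files land in

def process_file(conn, csv_path):
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # Insert new items; update existing ones when product_id already exists.
            conn.execute(
                """INSERT INTO items (product_id, name, available)
                   VALUES (:product_id, :name, :available)
                   ON CONFLICT(product_id) DO UPDATE SET
                       name = excluded.name,
                       available = excluded.available""",
                {"product_id": row["product_id"],
                 "name": row["name"],
                 "available": row["available"]},
            )
    conn.commit()

def main():
    conn = sqlite3.connect("/data/items.db")
    conn.execute("CREATE TABLE IF NOT EXISTS items "
                 "(product_id TEXT PRIMARY KEY, name TEXT, available TEXT)")
    for csv_path in sorted(INBOX.glob("*.csv")):
        process_file(conn, csv_path)
        csv_path.unlink()                  # remove the file once processed
    conn.close()

if __name__ == "__main__":
    main()
```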
Since you obviously have had a lot more experience with Bubble, do you see any issues with using Bubble to look up and interact with records in a table with 40M rows?

If the 40M records sit in SQL there is no issue, IMO. The biggest table I have in Bubble is about 350k and that works fine. Keep in mind the SQL connector also has a limit on the number of records it can return. 1000?

That would not be an issue, as we will never pull all the records for any reason. There is a repeating group that only shows results matching what has been entered in the search fields after a search button is clicked. Anything more than 100 records is supposed to generate an error telling the user to narrow the search criteria. Thank you very much for the info, BTW.
