Upload .csv of data that references another table's unique_id

Any idea how to upload a .csv within the Data tab when the dataset to be uploaded contains a field that references another table's unique_id?

I want to load a .csv of data into the events table. This table includes a field referencing the user table (e.g. it holds the user's unique_id). When I go to upload this dataset, I'm not given the option to specify that this field is the unique_id.

@neerja, is there a solution for the above? We spent a number of hours manually populating this row by row, and that's not a sustainable solution. It's particularly frustrating for us because this stemmed from the “result of” bug we submitted, so it's as if we got burned twice for something out of our control.

Thanks,
Scott

These days I'm preparing for a big data migration. I do this in two steps: the first step is importing all the records (and getting back Bubble's unique_ids, that is, the 'primary keys'), and the second step is writing these unique_ids into the reference fields. Since I have lots of tables, I'm doing this with Talend via Bubble's Data API. The REST actions are POSTs for the first stage and PATCHes for the second. The conversion from old IDs to unique_ids happens in Talend.

So, I guess you have to decide how and where to do this 'old ID' --> 'unique_id' conversion. If you set the value of a 'foreign key' field in Bubble, and the unique_id value is valid and exists in the related table, then the reference works.
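For illustration, a minimal sketch of that first stage in a small script might look like the following. The app URL, API token, type name, field names, and file names are placeholders, and the exact shape of the create response should be checked against Bubble's Data API docs:

```python
# Minimal sketch of the first stage: POST each old record and remember the
# unique_id that Bubble assigns to it. App URL, token, type and field names
# below are placeholders - adjust them to your own app.
import csv
import requests

BASE = "https://yourapp.bubbleapps.io/api/1.1/obj"   # hypothetical app URL
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}

old_to_new = {}  # <old_id, bubble_unique_id> pairs

with open("users.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        payload = {"name": row["name"], "email": row["email"]}  # example fields
        resp = requests.post(f"{BASE}/user", json=payload, headers=HEADERS)
        resp.raise_for_status()
        # Bubble returns the new thing's id in the create response;
        # check the exact response shape for your app and API version.
        old_to_new[row["old_id"]] = resp.json()["id"]

# Persist the conversion table for the second stage.
with open("user_id_map.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["old_id", "unique_id"])
    writer.writerows(old_to_new.items())
```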

Another possibility is to use different 'primary keys' rather than Bubble's default ones, but this probably means that you have to take care of key generation for the new records.

1 Like

Thanks for sharing @eftomi. Seems like building out ETLs in Talend would have a non-trivial cost to set up and maintain. Were you able to confirm that there's no way to load this directly through Bubble (without changing primary keys) before you went this route?

…just hoping there is an easy solution.

The trouble is that I didn't find any way to set the values of unique_ids (i.e. primary keys) for imported records myself. Bubble creates a new unique_id value for each record (=thing) that you import, whether via CSV import or the API, and it doesn't allow unique_ids to be created by you and imported together with the data. (I'm talking here about primary keys, not about foreign keys.)

If you have a small number of types (=tables), the conversion of keys can be done relatively easily. The first step would be the data import, together with the existing primary and foreign keys; Bubble creates the unique_ids. The second step would be processing the old foreign key values, matching them against the old primary keys to find the corresponding unique_ids, which are then written as the new foreign keys. This could be done in a workflow I suppose, but I decided to go for ETL because I have a huge number of tables and foreign keys, so setting up the workflows in Bubble would be quite time-consuming.

Just one important thing, to be clear about the import of foreign keys in Bubble: let's say that you have two data types, a Person and a Car, and that the Person type has a field called PersonsCar of type Car, so they are related. You can import the value of the PersonsCar field if this value is a valid, existing Bubble unique_id from the Car table. This works nicely.
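As a hypothetical sketch of that point, a create call for a Person could pass an existing Car's unique_id directly as the value of PersonsCar (the app URL, token, and the id value below are made up):

```python
# Hypothetical sketch: creating a Person whose PersonsCar field references
# an existing Car. The value of the reference field is just the Car's unique_id.
import requests

BASE = "https://yourapp.bubbleapps.io/api/1.1/obj"   # placeholder app URL
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}

car_unique_id = "1608312029450x123456789012345678"   # made-up example id

resp = requests.post(
    f"{BASE}/person",
    json={"Name": "Alice", "PersonsCar": car_unique_id},  # reference = unique_id
    headers=HEADERS,
)
resp.raise_for_status()
```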

Thank you for the detailed explanation. We spend an obnoxious amount of time on similar exercises, and I want to be sure I understand exactly what you do.

Is this correct?

  1. You import a giant .csv. This creates new records, each with a new primary key; the foreign key doesn’t populate, as we know.
  2. You then download the table so that you can get the new primary keys, and you merge this back into your original .csv. You’re using Talend to do the merge (a sketch of this step follows the list).
  3. You then use the Data API to modify a list of records and POST the foreign keys into the database, using the primary key to look up the correct record.
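For step 2 above, a minimal sketch of the merge, assuming the old id was imported as an ordinary text field and that Bubble's export carries a "unique id" column (file and column names here are illustrative):

```python
# Sketch of the merge in step 2: join the exported Bubble table (which now
# carries unique ids) back onto the original CSV on a shared business key.
# File names and the "old_id" / "unique id" column names are assumptions;
# check the column headers in your own export.
import csv

with open("events_export.csv", newline="", encoding="utf-8") as f:
    new_ids = {row["old_id"]: row["unique id"] for row in csv.DictReader(f)}

with open("events.csv", newline="", encoding="utf-8") as src, \
     open("events_with_ids.csv", "w", newline="", encoding="utf-8") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames + ["unique_id"])
    writer.writeheader()
    for row in reader:
        row["unique_id"] = new_ids.get(row["old_id"], "")  # blank if no match
        writer.writerow(row)
```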

A few questions:

  1. If this is correct, do you have a way to get around the POST limit (1,000 records or the 4-minute timeout)?
  2. Did you ask the Bubble team about relaxing the referential integrity to enable the .csv to load with the FK directly?
  3. Finally, we spend a ton of time checking values in the database. Have you found a way to write SQL against your Bubble database to do “unit-test”-like checks on the data across tables?

The answers are:

  1. I put the data from a particular table into Bubble with a Data API POST (please see the reference here).
  2. The response from the server is JSON with the new Bubble unique_id, which I write into a conversion table together with the old ID from the old database, so I have <old_id, bubble_unique_id> pairs.
  3. After all tables are transferred into Bubble, I use a Data API PATCH (reference) to write the Bubble unique_ids as foreign keys into the Bubble tables. I prepare the fields in Bubble beforehand, of course; their type is the Data Type that the field refers to.

So, I’m not using CSV import, but the restrictions about unique_ids are the same.
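For illustration, the second stage could look roughly like this in a small script rather than in Talend. The app URL, token, type names (event, user), field names, and file names are placeholders, and the endpoint shape follows the usual Bubble Data API pattern:

```python
# Sketch of the second stage: for each imported event, turn the old user id
# into the corresponding Bubble unique_id and PATCH it into the reference
# field. Type and field names are placeholders for your own schema.
import csv
import requests

BASE = "https://yourapp.bubbleapps.io/api/1.1/obj"   # placeholder app URL
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}

# <old_id, bubble_unique_id> pairs collected during the first stage
with open("user_id_map.csv", newline="", encoding="utf-8") as f:
    user_map = {row["old_id"]: row["unique_id"] for row in csv.DictReader(f)}

# events_with_ids.csv holds each event's own Bubble unique_id plus the
# old user id that should become the reference.
with open("events_with_ids.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        body = {"user": user_map[row["old_user_id"]]}   # FK = user's unique_id
        resp = requests.patch(f"{BASE}/event/{row['unique_id']}",
                              json=body, headers=HEADERS)
        resp.raise_for_status()
```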

If you have a relatively small number of tables and relationships, you can do the conversion within Bubble and avoid the hassle of ETL. I tried this with a small example and it works. If you decide to go this route, the approach would be:

  • import the data via CSV, together with the old foreign keys
    (let’s say that you have two related tables, Person and Department - you just import them)
  • for each old foreign key in a given table, create a new field in the same table which will hold the new Bubble unique_id, its type being the referenced Data Type
    (let’s say that Person has a foreign key field DepartmentID - you add an additional field PersonsDepartment of type Department to the Data Type Person)
  • create an API Workflow which you can call with the Bulk button next to the Import & Export buttons (please see the reference here and there). This button runs a chosen workflow once for each thing in the opened view
    (if we follow the Person & Department case, you work on the Person table)
  • when you create this API Workflow, it should take a parameter of the Data Type corresponding to the table that you are processing
    (in the above example, the table Person, which holds the foreign key, is the one to change, so the parameter could be called PersonAsInputParameterToWorkflow of type Person)
  • in the workflow, set the first action to Make changes to … with Thing to change: PersonAsInputParameterToWorkflow, and set PersonsDepartment = Search for Departments:first item. The search should be set up as Type: Department, DepartmentID = PersonAsInputParameterToWorkflow’s DepartmentID.

So, for each record in the Person table, the old Department foreign key is looked up in the related Department table, and you get the corresponding Department, which is saved into PersonsDepartment. It is simple, in fact.

After that, you can remove the old foreign key fields.

About the limits: I don't have time constraints; my migration is running just now and it will take a day or so, but the process is robust in Talend and the calls are asynchronous. I assume that the generation of unique_ids sits in Bubble's deepest core, so relaxing it is something that could create lots of problems.

I hope this helps.

3 Likes

Thanks. The time constraints are on the Bubble Data API. My understanding from the docs is that we can POST up to 1,000 records per request and it needs to complete in under 4 minutes. How did you get around that?

"The maximum number of items that can be created via a single bulk request is currently 1000. There is also a limit of 4 minutes for the request to complete; if it takes longer than 4 minutes, items that have not yet been created will be marked as errors in the response. "

I’m sorry I wasn’t clear - if the process halts, I can wait, and I have no need for higher throughput. If I had to migrate really quickly, then it would be a problem.

I suppose you can ask the Bubble team to raise these levels for your particular purpose.

You’re getting around the limit by POSTing many small packets of data until it finishes, correct?

Yes, but it still happens that it stalls. When it does, the REST request just waits. I set the timeout to a couple of minutes, and then it continues until the next stall.
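A rough sketch of that batching idea, assuming the standard bulk endpoint (/api/1.1/obj/&lt;type&gt;/bulk with newline-delimited JSON) and made-up batch sizes and timeouts; the app URL and token are placeholders:

```python
# Sketch of working under the bulk limits: send the data in small batches
# (well under the 1,000-item cap), with a request timeout and a simple retry
# when a batch stalls. Endpoint shape, batch size and timeouts are assumptions.
import json
import time
import requests

BASE = "https://yourapp.bubbleapps.io/api/1.1/obj"   # placeholder app URL
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN",
           "Content-Type": "text/plain"}
BATCH_SIZE = 200          # keep batches small so each finishes quickly
TIMEOUT = 120             # seconds to wait before giving up on a stalled call

def post_bulk(records):
    """POST one batch to the bulk endpoint as newline-delimited JSON."""
    body = "\n".join(json.dumps(r) for r in records)
    return requests.post(f"{BASE}/event/bulk", data=body,
                         headers=HEADERS, timeout=TIMEOUT)

def migrate(records):
    for start in range(0, len(records), BATCH_SIZE):
        batch = records[start:start + BATCH_SIZE]
        for attempt in range(3):                      # retry a stalled batch
            try:
                resp = post_bulk(batch)
                resp.raise_for_status()
                # Note: re-sending a batch can duplicate items that were already
                # created; inspect the per-line results in the response before
                # retrying in a real migration.
                break
            except requests.RequestException:
                time.sleep(30 * (attempt + 1))        # back off, then continue
        else:
            raise RuntimeError(f"Batch starting at {start} kept failing")
```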

1 Like

Thanks. Very helpful post.

1 Like

This topic was automatically closed after 70 days. New replies are no longer allowed.