In my last paragraph I describe duplicating Question data on Upload and then duplicating Question and Upload data on Review.

In my gut, with the proper relational database design I don’t think I should have to repeat data (though I’m struggling to figure out how as I make clear below).

And ed727 raises a good point when he says that duplicating data can lead to issues when a record is edited. If a Question Thing were modified, I’d have to trigger a cascading series of edits of all affected Uploads and Reviews. That sounds like it could be a difficult system to maintain.

Have you had success maintaining a database structured in this manner? Is it not too difficult in practice?

Is there are reason you don’t do both? If you were to use an Author field on Article data type and have a list of Articles on the Author data type wouldn’t that give you the best of both worlds?

Were your database structured in that manner it seems to me you’d have both the speed and scale options available and you could choose whichever was more appropriate for each individual database query.

I realize this violate your principle of normalization but with normalization I seem to be running into the multiple search issue you address:

In my example I have a Question, Upload (think “answer” data type), and a Review data types.

One question has many Uploads. One Upload has many Reviews.

Let’s say Current User has submitted two Uploads in response to the Question in which we’re interested. And each of those two Uploads have received five Reviews. So one Question, two Uploads, 10 Reviews.

For that one Question, I want to show the Current User’s Upload with the highest average Score (Score is a field on Review).

For the Current User and the relevant Question, let’s say Upload 1 has Scores of (3, 3, 4, 5, and 5 for an average of 4) and Upload 2 has Scores of (4, 4, 4.5, 5 and 5 for an average of 4.5). Upload 2 is the higher rating and that is the Upload whose score I want to show.

With a “normalized” database structure, I can’t figure out a way to do this without doing a ton of searches and/or running into dead ends.

First I do a search for Uploads where Question is the relevant question and Created By = Current User.

Then I get lost. I need to find all of the Reviews associated with the list of Uploads. Uploads doesn’t have field for Review (only Reviews has a field for Uploads in a “normalized” database) so I don’t know how to get the list of Reviews for this User / Question combo.

Even if I could get the list of relevant Reviews described in step 3, then I’d have to perform math on those Reviews. Each Upload has five Reviews. I’d need to average the Review’s rating field at the Upload level and return the Upload with the highest average rating.

Maybe I’m approaching this wrong and shouldn’t start with a search for Uploads where Creator = Current User.

Since Review has a field for Upload but Upload doesn’t have a field for Reviews maybe I need to start with Reviews to get around this problem. But this seems to create it’s own unique problems.

First I search for Reviews. Then I filter those Reviews so I only return Uploads by Current User and Parent Group’s Question.

I’m not sure why my “Upload = Search for Uploads(More…)” query isn’t complete (it shows in red). Shouldn’t this just return a list of Reviews that meet the Upload criteria?

Then, even if I got my Upload query fixed and returned a list of Reviews that met the Question and User criteria, how do I group this list of Reviews by Upload, calculate the average Score at the Upload level, and return the Upload with the highest average rating?

Didn’t quote your whole reply but lots of great info there. Thank you, Keith.

One thing about Things I’m still not sure about is related to the quoted text.

Bubble describes Things: “A thing, in Bubble, is an entry in the database. For instance, a specific car created by a user in our example.”

Let’s imagine a Car data type with fields Year, Make, Model.

The database entry: “2005, Honda, Accord” is clearly a Thing.

In an earlier post you stated “they mean a Thing (a custom data type). A field can of course be of type Thing.”

When you say “A field can of course be of type Thing” do you mean the field itself (e.g. Make), is a Thing in addition to the field data (e.g. Honda)?

Same thing with data types. Is the data type “Car” a Thing? Or are only the database records associated with Car (e.g. 2005, Honda, Accord) Things?

When you say “A Thing (capital t) is a custom data type created by the Bubble programmer user” it makes me think that both the Car data type, the Year field, and database entries such as “2005, Honda, Accord” are all Things.

But when you say “A Thing can only be created, modified, or deleted by the Bubble programmer user by explicitly calling the Create/Make Changes/Delete a Thing actions” it makes me believe that Car is not a Thing since it’s created via the Data tab, not the Create/Make Changes/Delete a Thing actions.

Rather than saying “A Thing (capital t) is a custom data type created by the Bubble programmer user” would it be more accurate to say “A Thing (capital t) is database entry associated with (a) a custom data type created by the Bubble programmer user or (b) the built-in User data type.”?

You’ve already been very generous with your time but could I trouble you to expound upon why I shouldn’t think in terms of rows?

I think we’re all familiar with Excel. What’s the problem with thinking about a Data Type as a an Excel tab, a Field as a column, and a Thing as a Row?

If you look at the screenshot of your Turtle example, this is how Bubble displays the data.

Turtles is the Data Type and is shown on it’s own App Data tab like an Excel tab. Fields are displayed as columns and the database entry is displayed as a row just as they would be presented in Excel.