How do you check whether a field on a thing has duplicate values, and display the duplicates?

For example, an “ITEM” has a field called “TRANSACTION-NUM”. TRANSACTION-NUM should be unique for each “ITEM” (this is separate from the unique ID).

How can you display a list of the “ITEM”s whose “TRANSACTION-NUM” is duplicated?

Hopefully that’s clear enough.

Basically, I just want to check whether I have any duplicates in my database.

Something like this:
[Screenshot: Screen Shot 2020-03-14 at 4.03.18 AM]

But why wouldn’t you just prevent dupes from happening in the first place?

Thanks for replying!

I can’t find “Match” in the type of content when defining a repeating group. Is this a built-in thing in Bubble, like Grouping?

Or is Match just a new type that you created?

But why wouldn’t you just prevent dupes from happening in the first place?

The data already exists, and I only recently decided to enforce uniqueness. I already have a trigger that checks pre-emptively whether the value is unique before the data is accepted.

That’s why I thought I’d create a workflow that goes through the old data and shows me the duplicates. I had it all figured out in my head in theory, but when I started building the workflow I couldn’t find the functions I wanted.

It’s fairly easy to find duplicates of one thing if you already specified what to compare it with, but I can’t figure out a way to find duplicates of dynamic values.
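
For readers who think better in code, here is a minimal sketch of the check being asked for, written in plain Python rather than Bubble. The sample records and the `transaction_num` field name are made up for illustration; the idea is simply to count every value first, then keep the records whose value was seen more than once.

```python
from collections import Counter

# Hypothetical sample records; the field names are placeholders.
items = [
    {"id": 1, "transaction_num": "T-100"},
    {"id": 2, "transaction_num": "T-101"},
    {"id": 3, "transaction_num": "T-100"},
    {"id": 4, "transaction_num": "T-102"},
    {"id": 5, "transaction_num": "T-101"},
]

def find_duplicates(records, field):
    """Return the records whose value for `field` occurs more than once."""
    counts = Counter(r[field] for r in records)
    return [r for r in records if counts[r[field]] > 1]

duplicates = find_duplicates(items, "transaction_num")
for record in duplicates:
    print(record["id"], record["transaction_num"])
```

Nothing has to be specified up front about which value to compare against, which is the "dynamic values" part of the question.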

In SerPounce’s response, Match is a data type that was created to facilitate the example.

Just a thought… but if this is just a one-off check you want to perform on existing data, you might consider dumping the data to a CSV and using Excel to check for duplicates… it couldn’t be easier to do in Excel.

Again, just a thought.

Best…
Mike
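
If Excel is not at hand, the same one-off check on an exported CSV could be sketched in Python along these lines. The column names (`unique id`, `TRANSACTION-NUM`) and the sample rows are assumptions for illustration; a real export would be read with `open(path, newline="")` instead of the in-memory stand-in used here.

```python
import csv
import io
from collections import defaultdict

# Stand-in for an exported file; column names and rows are hypothetical.
sample_csv = io.StringIO(
    "unique id,TRANSACTION-NUM\n"
    "a1,T-100\n"
    "a2,T-101\n"
    "a3,T-100\n"
)

def duplicate_rows(csv_file, column):
    """Group rows by `column` and keep only the groups with more than one row."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[row[column]].append(row)
    return {value: rows for value, rows in groups.items() if len(rows) > 1}

dupes = duplicate_rows(sample_csv, "TRANSACTION-NUM")
for value, rows in dupes.items():
    print(value, [row["unique id"] for row in rows])
```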

That’s actually a good idea, since I just need to find the duplicates, assuming it won’t ever happen again with the new system. Then I’ll bulk upload and modify them afterwards. Thanks.

Create a new number field on your data type. Run an API workflow to count your records and see whether there are any duplicates. Then filter on anything greater than 1 (or delete it).

Hi @gilles I’m not able to really figure out how you would create the API to count the records. I’m assuming you’d have duplicates count with the numbers increasing to two or three if there were two or three entries of the same thing.

Could you share a screen shot of an example set up?

I just don’t really get how to set it up to put the higher values on duplicates. I’ve only been able to get everything to be number 1, so I am not able to filter based on anything greater than 1.

Hi @boston85719, I’ll put together an example with screenshots tomorrow. It ended up being a simple workflow: “Do a search for thing:count”. Then display in the RG anything > 1.

That would be much appreciated, because I am not able to create an R.G. with a data source set up like @SerPounce’s example.

And I can’t seem to figure out how to do a workflow that is “do a search for thing:count” and then display in the RG anything > 1, like you mentioned.

I believe the problem is that the :count function returns a number, but the data type of my R.G. is blog posts, which causes the error.

My other problem is the :unique elements function in the R.G. doesn’t work and it just returns the full list of all blog posts.

I’m hoping there is a simple way to find and delete the duplicates, but I have yet to get something put together.

You can search for the list of items, use the minus list function with the same list of items, and then put the :unique function on the end of it, all as the data source of a repeating group. You should be left with only the duplicates at that point.
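
One way to read this idea outside Bubble is as a multiset subtraction: take the full list, remove one copy of each distinct value, and whatever remains are the extra copies. The sketch below shows that reading in plain Python with made-up sample names. It is an interpretation of the idea, not a statement about how Bubble’s list operators behave.

```python
from collections import Counter

# Hypothetical sample values for the field being checked.
names = ["masks", "TP", "eggs", "masks", "TP", "masks"]

full = Counter(names)               # masks: 3, TP: 2, eggs: 1
one_of_each = Counter(set(names))   # one copy of each distinct value
extras = full - one_of_each         # Counter subtraction drops counts <= 0

# The distinct values that had at least one extra copy.
duplicated_names = sorted(extras)
print(duplicated_names)
```

Note that subtracting the full list from itself (`full - full`) would leave nothing at all, so what gets subtracted matters.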

That’s a clever way to do it. This could be applied to other stuff too, not just for finding duplicates. Thank you for this idea.

Glad I could help.

Hi @boston85719,

Here’s what I do to find duplicates.

Step 1: Create a field on your data type called “counter” or “count”

Step 2: Create your backend workflow (API) counter:


All your duplicate / counter calculations are done in the backend, with no duplicate calculations or logic in your RG. Make sure you search specifically for the name / thing / item that you want counted for duplicates.

Step 3: Create a button to start a backend workflow to count your items

Step 4: Have your RG only display “when count > 1”

Here are the results. You’ll see that eggs = 1 and is not displayed, while any item > 1 is shown in the RG. I normally then have another backend workflow that deletes records > 1 so that there’s only 1 remaining. To summarize: perform the count as a backend workflow, then have your RG (or duplicate-deleting workflow) work off the number it calculates. There are probably a few other ways of skinning this bear, but this is what I currently use.
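
As a rough analogue of the steps above in plain Python (not Bubble), the sketch below stores a count on each record and then filters on it. The sample records are made up, `count` stands in for the counter field from Step 1, and `update_counts` is a hypothetical stand-in for the backend workflow.

```python
# Hypothetical records; "count" plays the role of the counter field from Step 1.
records = [
    {"name": "eggs", "count": 0},
    {"name": "masks", "count": 0},
    {"name": "TP", "count": 0},
    {"name": "masks", "count": 0},
    {"name": "TP", "count": 0},
]

def update_counts(rows):
    """Stand-in for the backend workflow: store how many rows share each name."""
    for row in rows:
        row["count"] = sum(1 for other in rows if other["name"] == row["name"])

update_counts(records)

# Stand-in for the RG condition "when count > 1".
shown = [row for row in records if row["count"] > 1]
for row in shown:
    print(row["name"], row["count"])
```

The point of the design is that the display step only reads a stored number; all the counting happens beforehand.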

Hi @laflash8, I like where this one is going. I tried to mimic it with my example, but couldn’t get it to work. Can you show me where I’m going wrong?

Thx,
G

Can you show me the result of that input?

I guess I’m taking the list and just deleting the same list, thus no results.

I’m assuming you ensured that there is indeed duplicate data present inside QTY-Tests? If you enabled any parameters you’d need to ensure both lists have the same parameters.

This is my data:

You can see duplicates with masks and TP. Should I have a more detailed filter in my search on their category?

I didn’t set any parameters in either search box.

How about this:

[Screenshot: Screen Shot 2020-03-17 at 1.38.56 PM]

Thanks for the detailed instructions. I had actually gotten something working that is different from this and the other suggestions on this thread.

I started with an R.G. whose data source was a search for a thing.

Then I did :group by to create groupings by the name of the thing.

Then I did :filtered by to get all those whose count was greater than 1.

After that I did :sorted by name, which was more or less just for my own visualization, to see an ordered list.

So what I have in that R.G. is a list of groups of things. Each group is created only when there is more than 1 of the thing with that group’s name.
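
Outside Bubble, the same group, filter, and sort chain could be sketched like this in plain Python. The sample data and the `duplicate_groups` helper are hypothetical; they are only meant to show the shape of the result, which is one entry per duplicated name.

```python
from collections import defaultdict

# Hypothetical sample things.
things = [
    {"id": 1, "name": "TP"},
    {"id": 2, "name": "eggs"},
    {"id": 3, "name": "masks"},
    {"id": 4, "name": "TP"},
    {"id": 5, "name": "masks"},
    {"id": 6, "name": "masks"},
]

def duplicate_groups(rows, key):
    """Group rows by `key`, keep groups with more than one member, sort by key."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    kept = [(value, members) for value, members in groups.items() if len(members) > 1]
    return sorted(kept, key=lambda pair: pair[0])

groups = duplicate_groups(things, "name")
for name, members in groups:
    print(name, len(members), [m["id"] for m in members])
```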

[Screenshot: Screen Shot 2020-03-19 at 11.23.57 AM]

Then I created another RG to get the individual things as a list by group. What this basically did was say I had 50 groups of things, representing the fact that I had a minimum of 50 things with duplicates, which could be 2 or more duplicate entries (some of my things had 4 of the same in the DB, some only had 2).

I created a custom state to track the index number of the “group” RG (in my screenshot RG2) and it would display in my new RG all the items in the DB by that name…so if I had 4 things in DB with the same name, this new RG would show only those four items.

The item number is the custom state that tracks the index of the first RG, which I have annotated with a red outline box. The black boxes simply cover the data type that I am searching for, so as not to confuse others; that way they can focus on the general setup instead of the “thing” (data type) being searched.

After that I have my workflow set up like so:

This is incrementally increasing the custom state number to track the index number…basically each time it increments the list of duplicates in the RG changes so that the second step can load them.

In this screen shot the items until #3 could be changed to reflect the “last item” if you are expecting to have many duplicates of the same “thing”

All of this was set up simply to place a number in the DB on the duplicate records, which I would later delete using a backend workflow. That workflow is pretty straightforward and I have since deleted it, so there is no screenshot to show that setup… although I believe there are already some examples of this on this thread. I set mine up to be recursive.
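
A rough Python analogue of this mark-then-delete idea is sketched below. The numbering scheme is an assumption, since the exact number written to each record is not spelled out here: each record gets its position among the records sharing its name, and anything numbered above 1 is treated as a duplicate to delete. The `dup_number` field and both helpers are hypothetical names.

```python
from collections import defaultdict

# Hypothetical sample rows.
rows = [
    {"id": 1, "name": "masks"},
    {"id": 2, "name": "TP"},
    {"id": 3, "name": "masks"},
    {"id": 4, "name": "eggs"},
    {"id": 5, "name": "masks"},
]

def number_within_groups(records, key):
    """Give each record its position (1, 2, 3, ...) among records sharing `key`."""
    seen = defaultdict(int)
    for record in records:
        seen[record[key]] += 1
        record["dup_number"] = seen[record[key]]

def drop_marked(records):
    """Stand-in for the delete workflow: keep only the first record of each group."""
    return [record for record in records if record["dup_number"] == 1]

number_within_groups(rows, "name")
remaining = drop_marked(rows)
print([record["id"] for record in remaining])
```

This version walks the list once instead of recursing, which is a simplification made for the sketch.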

In the end my way seems to be a very convoluted way about it and the next time I attempt this I will try to follow your example as it seems less complex.

Good to see how many different ways people are solving the same issue.
