Hi, can you elaborate more on how to implement this? I’m kind of lost on the IF’s
Sure, here’s how I handle the process (there are various ways to achieve the same result but this is, in my opinion, the simplest way to do it without using any plugins).
Basically the process is this…
In the browser you run a workflow to create a new ‘MemberMaster’, and set the partsCode field with a random string (using generate random string).
That will generate and set a random string for the ‘MemberMaster’, but it may or may not be unique.
Then you need to run a recursive workflow, to check if the code is unique. If it is then it ends. If it’s not, then it will generate another random string, then check it again, and so on.
Without using plugins you can’t run recursive workflows in the browser so it has to be done on the back end.
So first create a backend workflow, call it whatever you like (something like ‘Verify partsCode’), and set a parameter to accept the datatype ‘MemberMaster’, and give it a descriptive name, such as ‘Member to Check’.
Put an ‘only when condition’ on the workflow itself so that it will only run if the number of MemberMasters with the partsCode you’re checking is more than 1 (i.e. it’s NOT unique).
only when: search for MemberMasters (partsCode=Member to Check's partsCode): count>1
That means this workflow will only run if the partsCode for the MemberMaster being passed into it is not unique (if it is unique the workflow won’t run).
Then, in this backend workflow create 2 actions:
Make changes to ‘Member to Check’ (that will modify the MemberMaster that’s been passed into the workflow): field = partsCode, value=generate random string.
That will set a new random string on the MemberMaster.
Schedule API workflow = Verify partsCode (schedule the same workflow to run again)
This will run the workflow again, checking the new partsCode. If it's unique, the workflow won't run and the process ends. If it's not, it will generate a new code and schedule itself again, repeating until a unique partsCode is found, at which point it stops (in reality it would be extremely unlikely to get a duplicate even twice in a row).
Then, in your original browser workflow, create a new MemberMaster, with all the fields you need, including the partsCode field set with ‘generate random string’.
Then, in that same workflow, at the end, schedule the backend workflow (schedule API workflow: Verify partsCode) to run immediately (current date/time) and pass the newly created MemberMaster (the result of step 1) into the ‘Member to Check’ parameter.
The browser workflow will create the entry and set the initial partsCode, then send it to the backend for verification.
If the partsCode is unique nothing further will happen.
If it’s not, the backend workflow will run, create a new partsCode, run itself again to check its uniqueness, and continue until a unique code is set.
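Outside Bubble, the create-then-verify loop above can be sketched in a few lines (a hypothetical illustration: `existing_codes` stands in for a Search over MemberMasters, and the 6-character uppercase-plus-digits alphabet is my assumption):

```python
import secrets
import string

# Hypothetical stand-in for the app's database: the set of partsCodes already in use.
existing_codes = {"A1B2C3", "XY99ZZ"}

def generate_random_string(length=6):
    """Mimics Bubble's 'generate random string' action."""
    alphabet = string.ascii_uppercase + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

def verify_parts_code(code):
    """Mimics the recursive 'Verify partsCode' backend workflow:
    keep regenerating until the code is not already taken."""
    while code in existing_codes:        # the 'only when: count > 1' condition
        code = generate_random_string()  # 'Make changes: partsCode = random string'
    existing_codes.add(code)             # a unique code was found; keep it
    return code
```

The `while` loop plays the role of the workflow rescheduling itself; each pass is one run of the backend workflow.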
Thanks for this @adamhholmes.
Hi @jared.gibb , I don't know if I got you right, but what I did was create a loop of 2 custom events that call each other back and forth until the "Only when" condition is met, then the workflow is terminated. But I am getting this error.
This method actually worked somehow. I was just wondering why I get these warnings in debug mode. Is this method not allowed? The only downside I see on live is the loading time, as the workflow loops until it generates a non-duplicate string and then terminates. Please enlighten me.
I HATE recursive workflows…too flaky and dangerous. Why don’t you just keep a list of all the numbers used in the db and check if the one just generated already exists? I’m known to create tables with just one “List of texts” field…those only allow unique values so that’s an easy way to check without searches.
Does the ID need to be random? Why not just create 000001, then 000002, and so forth? With the other methods mentioned you will get ever-increasing checks, e.g. your code might need to create and check a value several times before finding one that isn't taken, which is highly inefficient.
Yes, it has to be random, to prevent multiple users from getting the same digits if they register at the same time.
I'm looking forward to Bubble adding a "calculate non-repeating random string" option to the calculate formula action. This would help lots of apps for sure.
Randomizing with "only" 6 digits will always carry a probability of duplicates. Even hard-to-break 32-digit crypto strings have a risk of duplication, albeit a really low one.
I run a SaaS app for a company that uses running digits as mentioned above without duplicates, so I know for a fact that the method works. The other methods mentioned will create unnecessary and ever-increasing calls to the db to avoid duplicates as the number of available digits decreases.
Randomizing stuff is actually really hard, and to be fair to Bubble, they already did create a "non-repeating random string": it's the unique id on every object.
In general the original poster, with choosing 4000 numbers “randomly” from 1000000, is running hard up against the Birthday Problem. Which results in a very non-trivial probability of identifier collisions.
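To see just how non-trivial, the standard birthday approximation can be computed directly (a sketch; the 4000-from-1,000,000 figures come from the post above):

```python
import math

def birthday_collision_probability(n, pool):
    """Approximate probability of at least one collision when drawing
    n values uniformly at random from a pool of `pool` possibilities."""
    return 1 - math.exp(-n * (n - 1) / (2 * pool))

# 4000 six-digit codes drawn from 1,000,000 possibilities:
p = birthday_collision_probability(4000, 1_000_000)
print(f"{p:.4f}")  # ≈ 0.9997, i.e. a collision is almost certain
```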
The solution you are hinting at is the fairly common technique of generating a hash or digest, to a specified number of digits, seeded by the high water mark of the created record count, which is strictly increasing and thus unique itself. If you are just trying to generate a unique number and do not need any security guarantees, nearly any non-cryptographically-secure hash function will suffice.
This is a very common strategy in traditional enterprise RDBMS, and also underpins PHP's (in)famous constant-time look-up associative arrays (although I think PHP's specific implementation is a bit slower, with look-up time growing as the double logarithm of the number of stored items). Other options would be to hash another field that you want to constrain to unique for the user, such as their email address or social media handle; or hash chaining, where the input for the next hash function is the output of the previous hash function. In this last case you only need a hash function that generates a permutation with a single orbit (this is just how a random number generator works).
Hashes and digests are considered performant, constant-time methods for generating "nearly" unique strings: they don't take longer the more records you have. The "nearly" caveat is due to the occurrence of hash collisions, the deliberately difficult computation of which underpins blockchains.
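As a sketch of the counter-hashing idea (the helper name, digit width, and choice of MD5 are my assumptions; any fast non-cryptographic hash would do the same job):

```python
import hashlib

def code_from_counter(counter, digits=8):
    """Derive a fixed-width numeric code from a strictly increasing record
    counter by hashing it. Because the counter never repeats, collisions can
    only come from truncating the hash, not from the input itself.
    (MD5 is used purely for speed; no security guarantee is needed here.)"""
    h = hashlib.md5(str(counter).encode()).hexdigest()
    return str(int(h, 16) % 10**digits).zfill(digits)
```

The resulting codes look random to a human but are fully deterministic, so the same record count always yields the same code.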
In the meantime the best advice a seasoned mathematician can give you is to use more than 6 digits. That number of digits doesn’t contain enough information to uniquely cover thousands of records with an “approximately” non-deterministic algorithm.
My personal "ok-line" on this is to extract the UNIX timestamp and append 6-8 random chars as the random string. For two users to overlap, they would have to sign up at the exact same millisecond and pick the same straw out of 10 million. Other than that, all other scenarios of overlap are impossible, and the juxtaposed strings still retain a somewhat random appearance.
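A minimal sketch of that timestamp-plus-suffix scheme (assuming a digits-only 7-character suffix so the "1 in 10 million" straw count matches; the function name is hypothetical):

```python
import secrets
import string
import time

def timestamped_id(suffix_len=7):
    """Millisecond UNIX timestamp followed by a random digit suffix.
    Two IDs collide only if both are minted in the same millisecond
    AND draw the same 1-in-10,000,000 suffix."""
    suffix = "".join(secrets.choice(string.digits) for _ in range(suffix_len))
    return f"{int(time.time() * 1000)}{suffix}"
```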
If that’s not sufficient safety for your app’s usecase, you’ll need to stand up a dbo sequence object on a hosted DB somewhere and SQL-conn in to pull the next-in-sequence. A DB sequence object enforces uniqueness, as two simultaneous requests will still get handled linearly.
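For illustration, here is roughly how a database-enforced sequence behaves, sketched with SQLite (an assumption on my part: SQLite has no standalone sequence objects, so an AUTOINCREMENT key stands in; on SQL Server you would use a real CREATE SEQUENCE instead):

```python
import sqlite3

# An AUTOINCREMENT primary key behaves like a sequence object: the engine
# hands out each next value exactly once, serializing concurrent requests,
# so uniqueness is enforced by the database itself rather than by the app.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seq (id INTEGER PRIMARY KEY AUTOINCREMENT)")

def next_in_sequence():
    """Pull the next-in-sequence value, as the SQL connection would."""
    cur = conn.execute("INSERT INTO seq DEFAULT VALUES")
    conn.commit()
    return cur.lastrowid

print(next_in_sequence())  # 1
print(next_in_sequence())  # 2
```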
What @aaronsheldon says is completely correct and delightfully well-documented with references.
In application development, we often need (or desire) to create some unique-ish identifier. But as Aaron points out, a six-digit number will not suffice.
If you want to generate a monotonically-increasing serial number, you can in fact do this with Bubble, but you need to follow a very specific workflow, as outlined in this classic thread. If you'd like to just skip to the solution, read this specific summary reply.
But of course, in between, I have a great video explicating the problem and solution.
(Also, at this point, I like to point out that all Things in your Bubble database already have a Unique ID and so I feel this sort of thrashing is stupid and pointless, but some people just don’t like the appearance of that Unique ID. Whaddyagonnado?)
NOW, we of course do not have to limit ourselves to numbers for our unique ID string. For example, Bubble's unique ID for Things is two strings of numeric characters joined by an "x". Hence it is a string (what Bubble calls a text), not a numeric value.
By increasing the number of symbols in the alphabet that we use to construct our unique ID, we can reduce the number of symbols we need to create more-or-less-universally-unique random strings without risk of collision.
Because this is such a common thing, my List Shifter plugin actually includes an implementation of the NanoID library (this shouldn’t be a surprise because, of course, my plugins are full of awesome surprises), which is great (and more efficient than UUID) at generating universally-unique values.
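A NanoID-style generator is easy to sketch (an approximation, not the plugin's actual code; NanoID's real default alphabet contains the same 64 URL-safe symbols, though in a different order):

```python
import secrets

# The 64-symbol URL-safe character set (A-Z, a-z, 0-9, underscore, hyphen).
URL_SAFE = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789_-")

def nano_id(length=21):
    """NanoID-style ID: `length` characters drawn uniformly at random from a
    64-symbol alphabet, giving 64**length possible IDs at the default length."""
    return "".join(secrets.choice(URL_SAFE) for _ in range(length))
```

`secrets` draws from the OS's cryptographically secure randomness source, which is what makes the collision figures below meaningful.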
If you’d like to experiment with the collision risk inherent in different unique ID schemes, the NanoID folks have a really great one here:
Using this tool, you can see how collision-prone a six-digit, randomly generated numeric string is:
What this tells us is that, if we generate 6-digit numeric strings at a (very slow) rate of ONLY 10 per second, it will take just 14 seconds to reach a 1% chance of a collision!
If we were to change our alphabet to the set of all URL-safe characters (that is, we go from 10 symbols to 64 symbols), the same computation would still only take about 1 hour before there is a 1% chance of a collision:
Now, of course, we can see that this is still hardly “unique” in computing terms.
But what if we go to the default length that Nano ID uses (21 characters) with the full URL-safe alphabet? Here’s what we find:
In this case, it will take FOUR BILLION YEARS (approximately the age of the planet we live on) before there is a 1% risk of a collision (again generating 10 of these values per second).
If we look at even higher rates, we still find very good (unique for all intents and purposes) performance:
if we compute 100,000 of these things per hour, it will still take us ONE BILLION YEARS to reach a 1% chance of collision!
That will probably do, eh?
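Those figures follow from the birthday bound, which you can check yourself (a sketch; the formula inverts p ≈ 1 − exp(−n²/2N) to find the number of draws n that reaches a given risk):

```python
import math

def seconds_to_collision_risk(alphabet_size, length, per_second, risk=0.01):
    """How long until the birthday bound gives `risk` probability of at least
    one collision, generating `per_second` random IDs per second."""
    pool = alphabet_size ** length
    draws = math.sqrt(2 * pool * math.log(1 / (1 - risk)))
    return draws / per_second

print(seconds_to_collision_risk(10, 6, 10))         # ≈ 14 seconds
print(seconds_to_collision_risk(64, 6, 10) / 3600)  # ≈ 1 hour
years = seconds_to_collision_risk(64, 21, 10) / (3600 * 24 * 365)
print(f"{years:.2e}")                               # ≈ 4 billion years
```

All three of the quoted numbers (14 seconds, ~1 hour, ~4 billion years) drop out of the same formula; only the alphabet size and length change.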
With List Shifter, you can use the “PROCESS List” action to generate Nano ID-computed “unique IDs” with a length of your choosing (I don’t let you pick the alphabet, which always uses the default).
Here's a little example: when you click the "Generate IDs" button on this page, it generates ONE MILLION NanoIDs and then tells you if there are any duplicates (if there are no duplicates, the Number of Unique values will be 1000000). And, BTW: if you do find duplicates at a length of 21 symbols, you should run right out and buy yourself a Powerball ticket, as it's your fucking lucky day.
Here’s the example page:
Postscript: To see how long this computation takes, open the console (hit F12 in most browsers) and you’ll see that Debug Buddy is logging the time it takes.
(On my pretty damn fast machine, it takes ~22 seconds to compute 1 million NanoIDs with a length of 100 symbols. It computes 1 million NanoIDs with a length of 21 symbols [the default] in about 8-10 seconds.)
Which is to say: Generating a single NanoID – of whatever length – takes no appreciable time at all.
BTW2: If you experiment on my page with short length values, you'll start to see dupes appear; in fact, once the total available number of unique values (64 to the power of the length) falls below 1 million, the pigeonhole principle assures you of having them.
Statistics is hard.
Found this thread searching for a complementary requirement: verifying that a user input isn't repeated (e.g. an SSN).
I came up with the following workaround.
Using the API connector, create a self-calling workflow endpoint, in which a search is done for all existing things that use the "unique" field.
(The colours refer to a screenshot of the setup:) green is the value received in the endpoint, verified as unique against the values saved in the red field of the thing shown in black.
It has been working so far, but only on a table of 500 entries, so I'm not sure how scalable this might be.
In my Back End Utilities I have a server-side function called "Sample Dictionary" which I use for generating cryptographically strong random strings. It accepts a string of characters representing the dictionary to sample from, and a number specifying how many samples to draw from the dictionary, with replacement. The returned value is the concatenation of the sampled characters.
My use case is populating slugs with a random 8-character base-32 string, sampled from "0123456789ABCDEFGHJKLMNPQRTUVWXY". The letters I, O, S, and Z are omitted to disambiguate them from 1, 0, 5, and 2 when read by humans. This provides a little over a trillion unique combinations, allowing me to make over a billion draws before I even come close to a 0.001% chance of hitting a duplicate. If you expect to need more than a billion unique draws, I recommend using a longer string or a larger base, like base-64.
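In ordinary code, such a sample-with-replacement function is only a few lines (a sketch of the idea, not the actual Back End Utilities implementation):

```python
import secrets

# 32 symbols; I, O, S, Z omitted to avoid confusion with 1, 0, 5, 2.
BASE32 = "0123456789ABCDEFGHJKLMNPQRTUVWXY"

def sample_dictionary(dictionary, samples):
    """Draw `samples` characters from `dictionary` with replacement, using a
    cryptographically strong source, and concatenate them."""
    return "".join(secrets.choice(dictionary) for _ in range(samples))

slug = sample_dictionary(BASE32, 8)  # 32**8 ≈ 1.1 trillion combinations
```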