Regex Expression for Duplicate Occurrences

Been playing around with this and have come close…

What is a regex that will check for any duplicates in a string and stop when it finds the first occurrence of any upper case letter combination, any lower case combination, any special character combination, or any number combination? The occurrence can be in consecutive or non-consecutive positions in the string, starting at the first position.

I’ve tried and a number of other variations including lookahead

but in this case, it only gets the characters that are next to each other. I’m not sure how you would get the other characters that are spread out in the string.

/(.+)(?=.*\1)/gm

Thanks - initially it failed for me. But I think it was something in my code. Still trying to figure it out.

Update: Still getting duplicates. Example: zUiU7GJ)

Go ahead and fix it :slight_smile:
I actually didn’t follow what you want to get at the end… A “first occurrence of any uppercase letter combination” in regex terms means just the first duplicated uppercase letter. Add all the other combinations and you get “find any duplicated char”.
So I just made your pattern work with a positive lookahead to catch distant duplicated chars, but I don’t understand your final goal.

No duplicate characters (special characters, numbers, upper case, lower case) of any kind - anywhere in the string. Regardless of whether they are next to each other, have characters in between them, are multiple characters like 3 in a row or 3 in the full string, are at the beginning and the end, etc. That’s it. Thanks. (PS - I’ve been trying to work it out, amongst other things, for days - but haven’t been able to come up with the right code to make it work).

@s.arndt , (?:([\s\S])(?=.*\1)) should work. It does not ignore case, so if you have ‘AAaa’ in the source, the result will be ‘Aa’. If you want to ignore case, maybe it is an option to use :uppercase or :lowercase before the find & replace?

Another option is to use Javascript: Regex Question (case), see mebeinken’s example.
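To make the case behaviour concrete, here is a small sketch of that pattern in plain JavaScript. The variable names are made up for illustration, and the lookahead uses [\s\S]* instead of .* on the assumption that duplicates separated by a line break should count too.

```javascript
// Case-sensitive: "A" and "a" are treated as different characters.
const dupCaseSensitive = /([\s\S])(?=[\s\S]*\1)/;
// Same pattern with the i flag: the backreference \1 also ignores case.
const dupIgnoreCase = /([\s\S])(?=[\s\S]*\1)/i;

console.log(dupCaseSensitive.test("AaBb"));  // false
console.log(dupIgnoreCase.test("AaBb"));     // true
console.log(dupCaseSensitive.test("A\nbA")); // true

// With .* in the lookahead, the line break hides the second "A",
// because . does not match a line break by default.
const dupSameLineOnly = /([\s\S])(?=.*\1)/;
console.log(dupSameLineOnly.test("A\nbA"));  // false
```

So the i flag is an alternative to converting the text with :uppercase or :lowercase first, if the check runs in JavaScript rather than in a Bubble expression.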

I’m gonna become that guy, but can’t keep my hands to myself…
And I still didn’t get whether you get your issue solved.
So my questions are:

  • What does “No duplicates” mean? Do you want to detect duplicates, or do you want to remove duplicates and get a string without them?
  • 3 in a row or not - it doesn’t matter. If you have a sequence of 3 chars duplicated, then each one of them is duplicated, so it doesn’t matter, and you don’t need to mention it, unless my understanding is wrong.
  • What do you mean when saying “…check for any duplicates in a string, and stop when it finds the first occurrence”? Are you trying to extract that first occurrence or what?
  • How do you use this regex, in javascript? in Bubble expressions?

I’m using the regex to determine if a string of characters has any duplicate characters in it, at any positions in the string. I don’t care if the string has two characters that are the same or 50 characters the same, or if the string has multiple sets of duplicate characters or only one set. And case sensitivity is needed (a small a is not a duplicate of a large A).

What I do care about is making sure that I test for any type of character and that a string with duplicates gets accurately detected.

It would also be great if the regex could handle international character sets.

The regex doesn’t need to delete the duplicate(s) - only detect. I will handle that based on the conditional result of the test.

I have coded an iterative process to check, but because of the number of strings that may be checked, the process consumes all of the stack memory, whereas the regex approach doesn’t.

I’m doing this conditional test in a JavaScript workflow action using the Toolbox’s Run javascript.

You don’t need regex:

console.log("Has duplicates:", (s => Array.from(s).some((c, i) => s.includes(c, i + 1)))("zUiU7GJ")); //Has duplicates: true
console.log("Has duplicates:", (s => Array.from(s).some((c, i) => s.includes(c, i + 1)))("zUiS7GJ")); //Has duplicates: false

or the same but more readable:

function hasDuplicates (s) {
    return Array.from(s).some((c, i) => s.includes(c, i + 1));
}

console.log("Has duplicates:", hasDuplicates("zUiU7GJ")); //Has duplicates: true
console.log("Has duplicates:", hasDuplicates("zUiS7GJ")); //Has duplicates: false
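One caveat for international text: Array.from(s) splits by code point, while s.includes(c, i + 1) counts positions in UTF-16 code units, so the two indexes drift apart once the string contains an emoji or another character outside the Basic Multilingual Plane. For example, hasDuplicates("\u{1F600}a") comes out true even though nothing repeats. A minimal sketch of a single-pass alternative, assuming "same character" means "same code point" (the function name is made up for illustration):

```javascript
// Hypothetical helper: detect a repeated character by Unicode code point.
function hasDuplicateCodePoints(s) {
    const seen = new Set();
    // for...of walks a string one code point at a time, so an emoji
    // counts as one character rather than two UTF-16 code units.
    for (const ch of s) {
        if (seen.has(ch)) {
            return true; // stop at the first repeat
        }
        seen.add(ch);
    }
    return false;
}

console.log(hasDuplicateCodePoints("zUiU7GJ"));            // true
console.log(hasDuplicateCodePoints("zUiS7GJ"));            // false
console.log(hasDuplicateCodePoints("\u{1F600}a"));         // false
console.log(hasDuplicateCodePoints("\u{1F600}x\u{1F600}")); // true
```

It stays case-sensitive, and it does not normalize text, so an accented letter written as one code point and the same letter written as base letter plus combining mark are still seen as different.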

but if you’re insisting:

console.log((/(.+)(?=.*\1)/gm).test("zUiU7GJ")); //true
console.log((/(.+)(?=.*\1)/gm).test("zUiS7GJ")); //false

though this looks shorter, the first option is faster.

P.S. you don’t need /g actually…
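The g flag can even get in the way of a detect-only check: a regex object with g keeps its lastIndex between test() calls, so reusing one object can give different answers for the same string. The two lines above are not affected, since each uses a fresh regex literal. A small sketch with illustrative variable names:

```javascript
// One shared regex object with the g flag.
const withG = /(.+)(?=.*\1)/g;
const firstCall = withG.test("zUiU7GJ");  // true, matches the first "U"
const indexAfterFirst = withG.lastIndex;  // 2, just past that match
const secondCall = withG.test("zUiU7GJ"); // false, search resumed at index 2

// Same pattern without g: every call starts from the beginning.
const withoutG = /(.+)(?=.*\1)/;
const plainFirst = withoutG.test("zUiU7GJ");  // true
const plainSecond = withoutG.test("zUiU7GJ"); // true

console.log(firstCall, indexAfterFirst, secondCall, plainFirst, plainSecond);
```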

Yes, that is possible too using Bubble:

The regex pattern: ([\s\S])(?=.*\1)

Or combine it with @vladimir.pak’s suggestion to exclude line breaks, in that case the pattern is (.+)(?=.*\1).
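On the international character question: without the u flag, a JavaScript regex works on UTF-16 code units, so two different emoji that share a leading surrogate look like a duplicate. A rough sketch of the difference in plain JavaScript (variable names are illustrative, and the lookahead uses [\s\S]* here so line breaks do not matter; whether Bubble's own regex operators accept flags is not something this sketch assumes):

```javascript
// Without u: compares UTF-16 code units.
const byCodeUnit = /([\s\S])(?=[\s\S]*\1)/;
// With u: compares whole Unicode code points.
const byCodePoint = /([\s\S])(?=[\s\S]*\1)/u;

// Two different emoji that share the same leading surrogate.
const twoEmoji = "\u{1F600}\u{1F601}";

console.log(byCodeUnit.test(twoEmoji));                 // true (false positive)
console.log(byCodePoint.test(twoEmoji));                // false
console.log(byCodePoint.test("\u{1F600}x\u{1F600}"));   // true
console.log(byCodePoint.test("zUiS7GJ"));               // false
```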

okay tried the .some method. works great. definitely faster with less overhead. thanks

:+1:
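This one should scale better on long strings, though: it sorts the characters once and compares neighbours instead of rescanning the rest of the string for every character: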

console.log("Has duplicates:", Array.from("zUiU7GJ").sort().some((c, i, a) => c === a[i + 1])); //Has duplicates: true

Thanks - appreciate the follow-up. I changed my code to use this one and it works fine.
