Dynamic regex filter from a list of items

Hi everyone,

we have a database which includes all of our products (~20k), each having an SKU field (the SKU is basically our internal ID).

Now, we want to match specific other database entries based on that SKU. To do that, I’ve created a dynamic filtering system relying on regex. A user can add as many different regex formulas as they like, and for each of them they can add as many numbers, that get checked after the regex has been applied to the SKU, as they like.
Here is an example (the xxxxxx-yyyyy strings are just the display names of the regex templates to show us more nicely what is being matched for – the y character):


xxxxxx-yyyyy uses the regex “(?<=^[a-zA-Z0-9]{6}-)[a-zA-Z0-9]{5}(?!.*-)”
yyyyyy-xxxxx uses the regex “^[a-zA-Z0-9]{6}(?=-)”
xxxxxx-yyxxx uses the regex “(?<=^[a-zA-Z0-9]{6}-).{2}”

So technically, a user could add unlimited regexes (it’s not limited to these three), each with unlimited matched numbers.

What should happen now is that each regex gets applied to each SKU in the Database and then, each of the numbers inside the corresponding regex fields gets checked if it matches. All of this should be like a dynamic OR condition, so just one of them has to match.

The huge challenge is that they’re dynamic.

Does anyone have an idea for an approach that might work?
Any ideas are highly appreciated.

I could technically remove the dynamic aspect from the regexes, that there are set regexes only, and only the actual input numbers it should match for, for each are dynamic.

This would work as most products’ sku follows the same system, and the exceptions usually follow one of ~5 other systems. This would mean that I could add a field for each regex to the actual product and whenever a new product gets added, it applies each of the regex functions and adds the output to the dedicated field.

That way, we would get rid of the challenge of looping through two different dynamic nested searches.

Then, depending on what regex(es) the user decides to filter on, it would just have to check each of the input values against the selected regex field of each product. Still very database heavy IMO, but not as bad as the original approach.

Let’s say I simplify it this way, what would be the most efficient approach?

Update:
If I make each group for each of the 7 static regexes completely static (not putting them inside another rg which displays one row for each regex), I could do something like this, which doesn’t even require the filter function and should therefore be pretty efficient: