URL Validation of Inputs and stripping of folder structures

This post is to help with validating a URL typed into an input, and with using regex to strip the unwanted parts off that URL.

What I was trying to do:

I wanted to check that the value typed into an input was a valid URL (http or https only - nothing crazy).

Then, if it was valid, strip the URL back to its host, keeping the subdomain and the protocol (http or https) but with NO folder structure.
From this:
https://blog.badgerfarmers.com/badger-pens/
http://www.badgerfarmers.com/badger-pens/
To this:
https://blog.badgerfarmers.com
http://www.badgerfarmers.com

And if either of these is entered:

ftp://badgerfarmers.com
badgerfarmers.com/x
it says: nope, that’s not valid - try again with an http or https protocol please.

This post got me part of the way there, so props to @alan.thomas111997

Create a custom state of type text - I called mine ‘is URL’.

Then create a workflow that runs when the value of the input where the user types the URL is changed.

You will need two workflows: one for when the URL is valid and one for when it's not valid.

Valid:


Not Valid:


The magic regex you will need to use in both workflows:

(http|https)://([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])

(Make special note of the first image in each of the valid and not valid workflows.)
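For anyone who wants to try the validation regex outside the editor, here is a minimal Python sketch. It assumes the dot between domain labels is meant as a literal dot (`\.`) and that the whole input has to match the pattern; the name `is_valid_url` is made up for illustration and is not part of the original workflow.

```python
import re

# Validation regex from the post, with the dot between domain labels escaped.
URL_PATTERN = re.compile(
    r"(http|https)://([\w_-]+(?:(?:\.[\w_-]+)+))"
    r"([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])"
)


def is_valid_url(value: str) -> bool:
    """Return True only when the entire input matches the pattern (assumption)."""
    return URL_PATTERN.fullmatch(value.strip()) is not None


for candidate in [
    "https://blog.badgerfarmers.com/badger-pens/",
    "http://www.badgerfarmers.com/badger-pens/",
    "ftp://badgerfarmers.com",
    "badgerfarmers.com/x",
]:
    print(candidate, "->", "valid" if is_valid_url(candidate) else "not valid")
```

The first two inputs pass and the last two are rejected, which matches the behaviour described at the top of the post.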

Each workflow sets the custom state accordingly, and you can then use that custom state in conditionals etc. to decide what is shown or hidden, depending on your use case.

The other workflow also runs when the input's value is changed. It uses regex to strip the input URL back and stores the stripped-back version in a new custom state.

Magic Regex here: ^(?:https?://)?(?:[^@/\n]+@)?([^:/\n]+)

Be mindful with this one: run the validation regex first and only then show the result of this one. I found that if you enter ftp:// etc. it removes the rest of the URL (ftp://badgerfarmers.com comes back as just ftp).
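A rough Python sketch of the "validate first, then strip" order described above. The function name `strip_to_host` is hypothetical, the validation pattern assumes the domain dot is escaped (`\.`), and taking the full match of the stripping regex as the result is my assumption about how the workflow uses it.

```python
import re
from typing import Optional

# Validation regex (dot escaped) and the stripping regex from the post.
VALID_URL = re.compile(
    r"(http|https)://([\w_-]+(?:(?:\.[\w_-]+)+))"
    r"([\w.,@?^=%&:/~+#-]*[\w@?^=%&/~+#-])"
)
STRIP_TO_HOST = re.compile(r"^(?:https?://)?(?:[^@/\n]+@)?([^:/\n]+)")


def strip_to_host(value: str) -> Optional[str]:
    """Return protocol + host for a valid URL, or None when validation fails."""
    if VALID_URL.fullmatch(value) is None:
        return None  # rejected before the stripping regex ever runs
    match = STRIP_TO_HOST.match(value)
    # group(0) is the whole match: protocol plus host, no folder structure.
    return match.group(0) if match else None


print(strip_to_host("https://blog.badgerfarmers.com/badger-pens/"))
print(strip_to_host("http://www.badgerfarmers.com/badger-pens/"))
print(strip_to_host("ftp://badgerfarmers.com"))

# Without the validation guard, the stripping regex alone keeps only "ftp".
print(STRIP_TO_HOST.match("ftp://badgerfarmers.com").group(0))
```

The guard is the whole point: the stripping regex treats the protocol as optional, so on its own it will happily return a fragment for input that should have been rejected.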

I am no regex expert… so if you have a better regex example please share but this does appear to work.

Good luck


I thought I would add to this for those that need to strip:

http:// AND https:// protocols AND www. from a url.

Take the input's value (or whatever holds the URL), then do a find and replace with regex using:

http(s)?(:)?(//)?|(//)?(www\.)?

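Be mindful of capitals: the pattern is case-sensitive, so HTTP:// or WWW. will not be stripped unless you make the match case-insensitive or lowercase the value first.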
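Here is a small Python sketch of that find and replace, with the capitals issue handled by a case-insensitive flag. The function name `strip_protocol_and_www` is made up for illustration, and the dot after www is escaped here, which is my assumption about the intent.

```python
import re

# Find-and-replace pattern from the post; IGNORECASE covers HTTP:// and WWW.
STRIP_PREFIX = re.compile(r"http(s)?(:)?(//)?|(//)?(www\.)?", re.IGNORECASE)


def strip_protocol_and_www(value: str) -> str:
    """Remove every match of the pattern by replacing it with nothing."""
    return STRIP_PREFIX.sub("", value)


print(strip_protocol_and_www("https://www.badgerfarmers.com/badger-pens/"))
print(strip_protocol_and_www("HTTP://WWW.BadgerFarmers.com"))
```

One thing to watch: the pattern is not anchored to the start of the string, so it also removes http or www. wherever they appear later on. For example, https://httpbin.org comes out as bin.org.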