Regex pattern that excludes whitespaces

Hello, I’m using this regex pattern to extract this text successfully:
Regex Pattern: tailored_bullet_point_1(.+?)explanation_1

Text: tailored_bullet_point_1": "Collaborated seamlessly across teams, ensuring detailed integration of academic and social-emotional aspects into comprehensive learning plans, showcasing @strong communication@ and commitment to personalized experiences for stakeholders.",\n "explanation_1\

There are 4 whitespace characters after the \n at the end of the text, right before "explanation_1. The number of whitespace characters can vary, but they’re always in the same place (between the line break and the "explanation_1).

I’ve been experimenting and haven’t been able to figure out how to adapt the regex pattern to exclude whitespaces after the \n and before the "explanation_1\

Any suggestions on what to try are greatly appreciated!

Include other white space chars, if they might be present in your texts.

Thank you, but my goal is to EXCLUDE the whitespace.

I’m trying variations of tailored_bullet_point_1(.+?)\s*explanation_1
where \s* : Matches zero or more whitespace characters (spaces, tabs, newlines). When used after a capturing group, it excludes any whitespace immediately after the content captured by that group from being included in the match.
So I’m not sure why it’s not working.
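One possible reason, sketched below under assumptions: in the posted text a double quote sits between the spaces and explanation_1, so \s* placed directly before explanation_1 has nothing to consume and the lazy group has to grow until it covers the spaces and the quote. The sample string is a hypothetical shortened version of the posted text, and it assumes the \n arrives as a literal backslash followed by n.

```typescript
// Hypothetical shortened sample; "\\n" is a literal backslash + n here,
// which is an assumption about how the API text arrives.
const sample = 'tailored_bullet_point_1": "Some bullet text.",\\n    "explanation_1';

// Original attempt: \s* is directly before explanation_1, but the text has
// a double quote between the spaces and that word, so \s* matches nothing
// and the lazy group swallows the spaces and the quote.
const attempt = sample.match(/tailored_bullet_point_1(.+?)\s*explanation_1/);
const attemptGroup = attempt ? attempt[1] : null;

// Variant with the quote spelled out, so \s* can consume the spaces
// and the group stops right before them.
const variant = sample.match(/tailored_bullet_point_1(.+?)\s*"explanation_1/);
const variantGroup = variant ? variant[1] : null;

console.log(JSON.stringify(attemptGroup));
console.log(JSON.stringify(variantGroup));
```

With this sample, the first group ends with the four spaces and the quote, while the second ends right after the literal \n.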

It works well; consider explaining your setup in more detail.

Hello Vladimir, thanks for your willingness to help out. I think this image explains it better. I have the regex code you graciously provided applied, and I’m still getting the whitespace that I want to exclude (see green underline):

The whitespace comes after the \n. I’ve tried multiple ways to account for it in the patterns. In tools like regexr, these patterns capture the group but do not exclude the whitespace between the capture group, which always ends with “.”, and “explanation_1”.

Here’s the original text as returned by the API
tailored_bullet_point_1": "Collaborated effectively across teams to enhance stakeholder experiences through seamless integration of academic, social-emotional, and logistical aspects, showcasing exceptional @communication@ skills and attention to personalized learning plans.",\n "explanation_1

Here are the patterns that I think should work but do not:

Here’s the ChatGPT interpretation of the code, which is exactly what I’m looking for; again, it captures the group successfully in regexr:
In this modified pattern:

[\s\S]*? matches any whitespace or non-whitespace character (including line breaks) zero or more times, lazily.
This ensures that any whitespace characters, including line breaks, between the end of the capture group and “explanation_1” are included in the match, and will be effectively eliminated.

Extract with Regex extracts the portion that matches the whole expression, not the group inside it, so you need to use positive look-ahead and look-behind assertions.
And AFAIK (I could be wrong here), the newline modifier is enabled by default, so you might need to transform the original text by escaping newlines, then extract and un-escape the result afterwards.
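To illustrate the whole-match versus group distinction, here is a minimal sketch in plain TypeScript. It does not call Bubble itself; it only assumes that the operator returns what a JavaScript engine reports as the full match (index 0). The sample string is hypothetical and uses a literal backslash + n.

```typescript
// Hypothetical shortened sample; "\\n" is a literal backslash + n (an assumption).
const apiText = 'tailored_bullet_point_1": "Some bullet text.",\\n    "explanation_1';

// With a plain capture group, index 0 is the whole match (markers included)
// and index 1 is the group only.
const grouped = apiText.match(/tailored_bullet_point_1(.+?)\s*"explanation_1/);
const wholeMatch = grouped ? grouped[0] : null;
const groupOnly = grouped ? grouped[1] : null;

// With look-behind and look-ahead, the markers are checked but not consumed,
// so the whole match is already just the wanted portion.
const lookaround = apiText.match(/(?<=tailored_bullet_point_1)(.+?)(?=\s*"explanation_1)/);
const lookaroundMatch = lookaround ? lookaround[0] : null;

console.log(JSON.stringify(wholeMatch));
console.log(JSON.stringify(groupOnly));
console.log(JSON.stringify(lookaroundMatch));
```

Here the whole match of the first pattern is the entire sample, while the whole match of the look-around pattern equals the group of the first pattern.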

This one works within Extract with Regex on your text, but I still didn’t get whether I should decode \n and replace it with a new line or whether it’s part of the text as it is. I chose the former.
Note that if there’s a line break within the value text Collaborated ... plans., the pattern will fail. I do remember there’s some trick like the one I described in my post above, but I don’t remember the cause… It doesn’t seem to be about the newline modifier, but I forgot what it is. Check the Bubble docs for this operator, or maybe there are some related posts here in the forum.

P.S.: got it, use ((.|\s)+?) instead of simply (.+?), as the dotAll flag is not enabled by default, so you finally get: (?<=^\s*tailored_bullet_point_1)((.|\s)+?)(?=\s*\"explanation_1\s*$)
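The difference can be checked with a small sketch, assuming a JavaScript regex engine and a hypothetical sample in which the \n has been decoded to a real line break and the value itself also contains one. The [\s\S] form mentioned earlier in the thread is included as a common equivalent of (.|\s).

```typescript
// Hypothetical sample with real line breaks, one of them inside the value.
const multiline = 'tailored_bullet_point_1": "First line,\nsecond line.",\n    "explanation_1';

// Without the dotAll flag, "." does not match a line break, so the lazy
// group cannot get past the first line and there is no match at all.
const dotOnly = multiline.match(/(?<=^\s*tailored_bullet_point_1)(.+?)(?=\s*"explanation_1\s*$)/);

// (.|\s) also accepts line breaks. The \" escape from the post is optional
// in a regex literal, so it is written as a plain quote here.
const anyChar = multiline.match(/(?<=^\s*tailored_bullet_point_1)((.|\s)+?)(?=\s*"explanation_1\s*$)/);
const anyCharMatch = anyChar ? anyChar[0] : null;

// [\s\S] is a common equivalent way to match any character.
const classBased = multiline.match(/(?<=^\s*tailored_bullet_point_1)([\s\S]+?)(?=\s*"explanation_1\s*$)/);
const classBasedMatch = classBased ? classBased[0] : null;

console.log(dotOnly);
console.log(JSON.stringify(anyCharMatch));
console.log(JSON.stringify(classBasedMatch));
```

Note that with this pattern the extracted portion still begins with the `": "` that follows the key and ends with `",`, while the line break and spaces before "explanation_1 are left out.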

Vladimir! Thank you very much. This was extremely helpful to me and I learned a lot. You’re a champion! Thank you!

No problem, you’re welcome.