Regex and applications

Regex is an abbreviation for regular expression. A regex is a pattern of characters used to search for a specific text or a portion of text.

The purpose of a regex is usually to find text matches and/or make replacements in that text. When working with a large text document or an extensive log file, regex allows you to quickly search for the pattern you need.
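As a minimal sketch of these two uses, the Python snippet below searches a short log for a literal word and then replaces it. The log text is invented for illustration, and Python's built-in re module is just one of many regex implementations.

```python
import re

# Hypothetical log text, invented for this illustration.
log = "INFO start\nERROR disk full\nINFO retry\nERROR timeout\n"

# Search: collect every occurrence of the literal text "ERROR".
matches = re.findall(r"ERROR", log)
print(len(matches))  # 2

# Replace: build a new string with each match swapped for "WARN".
cleaned = re.sub(r"ERROR", "WARN", log)
print(cleaned)
```

The original string is left untouched; re.sub returns a new one.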

The first thing we need to understand is that every letter, number and symbol is a character. A string is a sequence of characters, which may form words, keys or any other text.

In regex, different symbols are used to describe the pattern we are looking for.

For example:

 \d  matches any single digit from 0 to 9.
 \D  matches any single character that is not a digit.

One of the best tools I've come across for testing regex patterns is https://regexr.com/

Trying the example above:

In the image on the left, the regex matched the first digit it found. In the middle, it matched three digits in a row. And on the right, it matched the first character that was not a digit.
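The same three cases can be sketched in Python. The sample text here is an assumption made for illustration, not the text used on regexr.

```python
import re

text = "abc 123 def"  # sample text, assumed for illustration

first_digit = re.search(r"\d", text)       # first single digit found
three_digits = re.search(r"\d\d\d", text)  # three digits in a row
first_non_digit = re.search(r"\D", text)   # first character that is not a digit

print(first_digit.group())      # 1
print(three_digits.group())     # 123
print(first_non_digit.group())  # a
```

re.search returns only the first match; re.findall would return all of them.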

The tool lets you test in real time, highlighting the text that matches the regex entered as input.

Regex has several text identification tools. The important thing is to understand their syntax. There are often several ways to match the same set of characters, and it is up to each person to find the one that is easiest for them.

Since many keyboard symbols have a special meaning in regex, an 'escape character' (the backslash, \) is needed to search for such a symbol literally.
If our expression looks for a literal dot (.), we need to use the following syntax: \.
Otherwise, the unescaped dot (.) is used in regex to match 'any character' (by default, any character except a line break).
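A short Python sketch of the difference between an escaped and an unescaped dot. The sample text is made up for this example.

```python
import re

text = "file.txt filextxt"  # sample text, assumed for illustration

# Unescaped dot: matches any character, so the "x" in "filextxt" also counts.
unescaped = re.findall(r"file.txt", text)

# Escaped dot: matches only a literal "." character.
escaped = re.findall(r"file\.txt", text)

print(unescaped)  # ['file.txt', 'filextxt']
print(escaped)    # ['file.txt']
```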

Another important example is inclusive and exclusive character sets. These are written using square brackets [ ].

In this example we notice that only the characters abcfjusz are selected, regardless of the order in which they are listed between the square brackets.
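A minimal Python sketch of such a set, using the characters abcfjusz. The sample text is an assumption, and the second pattern lists the same characters in a different order to show that order does not matter.

```python
import re

text = "just a buzz of facts"  # sample text, assumed for illustration

# Each match is ONE character that belongs to the set.
selected = re.findall(r"[abcfjusz]", text)

# Same characters, listed in reverse order: the result is identical.
reordered = re.findall(r"[zsujfcba]", text)

print("".join(selected))      # jusabuzzffacs
print(selected == reordered)  # True
```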

Likewise, the caret symbol (^) placed right after the opening bracket negates a set. Taking the previous example into consideration:

Negating the above set allows us to select all characters except those listed between the square brackets.
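A small Python sketch of the negated set, again with an assumed sample text. Note that spaces are also matched, since they are not part of the set.

```python
import re

text = "just a buzz of facts"  # sample text, assumed for illustration

# [^...] matches any single character that is NOT in the set.
others = re.findall(r"[^abcfjusz]", text)

print(others)  # ['t', ' ', ' ', ' ', 'o', ' ', 't']
```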

Below is a list of the syntax along with a link to a page that provides regex exercises for practice.
