Advanced Regular Expressions | Trend Micro Service Central

Counting and Grouping

Element	Meaning	Example
.	The dot or period character represents any character (except the new line character).	`do.` matches: doe, dog, don, dos, dot `d..r` matches: deer, door
*	The asterisk character means zero or more instances of the preceding element.	`do*` matches: d, do, doo, dooo, doooo
+	The plus sign character means one or more instances of the preceding element.	`do+` matches: do, doo, dooo, doooo but not d
?	The question mark character means zero or one instances of the preceding element.	`do?` matches: d or do but not doo, dooo
( )	Parenthesis characters group whatever is between them to be considered as a single entity.	`d(eer)+` matches: deer or deereer or deereereer The + sign is applied to the substring within parentheses, so the regular expression looks for d followed by one or more of the grouping eer.
[ ]	Square bracket characters indicate a set or a range of characters.	`d[aeiouy]+` matches: da, de, di, do, du, dy, daa, dae, dai. The + sign is applied to the set within brackets, so the regular expression looks for d followed by one or more of any of the characters in the set [aeiouy]. `d[A-Z]` matches: dA, dB, dC, and so on up to dZ. The set in square brackets represents the range of all upper-case letters between A and Z.
[^ ]	Caret characters within square brackets logically negate the set or range specified, meaning the regular expression will match any character that is not in the set or range.	`d[^aeiouy]` matches: db, dc or dd, d9, d#--d followed by any single character except a vowel
{ }	Curly brace characters set a specific number of occurrences of the preceding element. A single value inside the braces means that only that many occurrences will match. A pair of numbers separated by a comma represents a range of valid counts of the preceding element. A single number followed by a comma means there is no upper bound.	`da{3}` matches: daaa--d followed by 3 and only 3 occurrences of a. `da{2,4}` matches: daa, daaa, and daaaa (but not daaaaa)--d followed by 2, 3, or 4 occurrences of a. `da{4,}` matches: daaaa, daaaaa, daaaaaa--d followed by 4 or more occurrences of a.
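
The counting and grouping elements above can be tried out with a short script. This is a minimal sketch that assumes Python's built-in `re` module, whose syntax for these elements agrees with the table; the product's own engine may differ in details. The `check` helper and the `cases` table are illustrative names, not part of the product.

```python
import re

# Hypothetical test table: pattern -> (strings that match in full,
# strings that do not). The samples are taken from the table above.
cases = {
    r"do.":        (["doe", "dog"], ["do"]),
    r"do*":        (["d", "do", "dooo"], ["da"]),
    r"do+":        (["do", "doo"], ["d"]),
    r"do?":        (["d", "do"], ["doo"]),
    r"d(eer)+":    (["deer", "deereer"], ["d", "dee"]),
    r"d[aeiouy]+": (["da", "dae"], ["d", "db"]),
    r"d[^aeiouy]": (["db", "d9", "d#"], ["da"]),
    r"da{2,4}":    (["daa", "daaaa"], ["da", "daaaaa"]),
    r"da{4,}":     (["daaaa", "daaaaaa"], ["daaa"]),
}

def check(pattern, text):
    """Return True when the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None

for pattern, (good, bad) in cases.items():
    assert all(check(pattern, s) for s in good), pattern
    assert not any(check(pattern, s) for s in bad), pattern
```

`re.fullmatch` is used so that each sample must match from start to end; a substring search would also report a hit for longer strings that merely contain a match.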

Shorthand Classes

Element	Meaning	Example
\d	Any digit character; functionally equivalent to [0-9] or [[:digit:]]	`\d+` matches: 1, 12, 123, but not 1b7--one or more digit characters.
\D	Any non-digit character; functionally equivalent to [^0-9] or [^[:digit:]]	`\D+` matches: a, ab, ab&, but not 1--one or more of any character but 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9.
\w	Any "word" character--that is, any alphanumeric character; functionally equivalent to [A-Za-z0-9] or [[:alnum:]]	`\w+` matches: a, ab, a1, but not !&--one or more upper- or lower-case letters or digits, but not punctuation or other special characters.
\W	Any non-alphanumeric character; functionally equivalent to [^A-Za-z0-9] or [^[:alnum:]]	`\W+` matches: *, &, but not ace or a1--one or more of any character but upper- or lower-case letters and digits.
\s	Any white space character; space, new line, tab, non-breaking space, and others; functionally equivalent to [[:space:]]	`vegetable\s` matches: "vegetable" followed by any white space character. So the phrase "I like a vegetable in my soup" would trigger the regular expression, but "I like vegetables in my soup" would not.
\S	Any non-white space character; anything other than a space, new line, tab, non-breaking space, and others; functionally equivalent to [^[:space:]]	`vegetable\S` matches: "vegetable" followed by any non-white space character. So the phrase "I like vegetables in my soup" would trigger the regular expression, but "I like a vegetable in my soup" would not.
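
The shorthand classes can be checked the same way. The sketch below assumes Python's `re` module with the `re.ASCII` flag, so that the classes stay within the ASCII ranges listed in the table. One difference to be aware of: in Python, `\w` also matches the underscore, which the table does not list. The helper names `is_all` and `has` are illustrative.

```python
import re

def is_all(shorthand, text):
    """True when every character of `text` belongs to the shorthand class."""
    return re.fullmatch(shorthand + "+", text, flags=re.ASCII) is not None

def has(pattern, text):
    """True when `pattern` is found anywhere inside `text`."""
    return re.search(pattern, text, flags=re.ASCII) is not None

# Digit and word classes, using the samples from the table.
print(is_all(r"\d", "123"), is_all(r"\d", "1b7"))
print(is_all(r"\w", "a1"), is_all(r"\w", "!&"))

# White space after "vegetable": only the singular form is followed by a space.
singular = "I like a vegetable in my soup"
plural = "I like vegetables in my soup"
print(has(r"vegetable\s", singular), has(r"vegetable\s", plural))
print(has(r"vegetable\S", plural), has(r"vegetable\S", singular))
```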

Character Classes

Element	Meaning	Example
[:alpha:]	Any alphabetic character	`.REG. [[:alpha:]]` matches: abc, def, xxx, but not 123, @#$.
[:digit:]	Any digit character; functionally equivalent to \d	`.REG. [[:digit:]]` matches: 1, 12, 123
[:alnum:]	Any "word" character--that is, any alphanumeric character; functionally equivalent to \w	`.REG. [[:alnum:]]` matches: abc, 123, but not ~!@.
[:space:]	Any white space character; space, new line, tab, non-breaking space; functionally equivalent to \s	`.REG. (vegetable)[[:space:]]` matches: "vegetable" followed by any white space character. So the phrase "I like a vegetable in my soup" would trigger the regular expression, but "I like vegetables in my soup" would not.
[:graph:]	Any character except space, control characters, or other similar characters	`.REG. [[:graph:]]` matches: 123, abc, xxx, ><", but not space or control characters.
[:print:]	Any character matched by [:graph:], plus the space character	`.REG. [[:print:]]` matches: 123, abc, xxx, ><", and space characters.
[:cntrl:]	Any control character (for example, CTRL + C, CTRL + X)	`.REG. [[:cntrl:]]` matches: 0x03, 0x08, but not abc, 123, !@#.
[:blank:]	Space and tab characters	`.REG. [[:blank:]]` matches: space and tab characters, but not 123, abc, !@#
[:punct:]	Punctuation characters	`.REG. [[:punct:]]` matches: ; : ? ! ~ @ # $ % & * ' ", but not 123, abc
[:lower:]	Any lowercase alphabetic character. Note: Case-sensitive matching must be enabled; otherwise this class functions as [:alnum:].	`.REG. [[:lower:]]` matches: abc, Def, sTress, Do, but not ABC, DEF, STRESS, DO, 123, !@#.
[:upper:]	Any uppercase alphabetic character. Note: Case-sensitive matching must be enabled; otherwise this class functions as [:alnum:].	`.REG. [[:upper:]]` matches: ABC, DEF, STRESS, DO, but not abc, Def, Stress, Do, 123, !@#.
[:xdigit:]	Digits allowed in a hexadecimal number (0-9a-fA-F)	`.REG. [[:xdigit:]]` matches: 0a, 7E, 0f
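
Not every regular expression engine accepts the bracketed class names above. Python's built-in `re` module, for instance, does not support them, so the following sketch rewrites each `[:name:]` into an explicit ASCII range before compiling. The `POSIX_ASCII` table and the `expand_posix` helper are hypothetical names introduced for illustration, the ranges cover ASCII only, and the `.REG.` prefix from the table is omitted because it is not part of the expression itself.

```python
import re

# Hypothetical lookup table: ASCII-only stand-ins for the named classes.
POSIX_ASCII = {
    "alpha":  "A-Za-z",
    "digit":  "0-9",
    "alnum":  "A-Za-z0-9",
    "upper":  "A-Z",
    "lower":  "a-z",
    "xdigit": "0-9A-Fa-f",
    "blank":  r" \t",
    "space":  r" \t\n\r\f\v",
    "cntrl":  r"\x00-\x1f\x7f",
    "graph":  r"\x21-\x7e",
    "print":  r"\x20-\x7e",
    "punct":  r"\x21-\x2f\x3a-\x40\x5b-\x60\x7b-\x7e",
}

def expand_posix(pattern):
    """Replace each [:name:] in `pattern` with its ASCII range.

    The result is meant to sit inside a bracket expression, so
    "[[:alpha:]]" becomes "[A-Za-z]". Unknown names raise KeyError.
    """
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_ASCII[m.group(1)], pattern)

hex_pattern = re.compile(expand_posix("[[:xdigit:]]+"))
veg_pattern = re.compile(expand_posix("(vegetable)[[:space:]]"))

print(expand_posix("[[:alpha:]]"))
print(bool(hex_pattern.fullmatch("7E")))
print(bool(veg_pattern.search("I like a vegetable in my soup")))
```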

Pattern Anchor Regular Expressions

Element	Meaning	Example
^	Indicates the beginning of a string	`^(notwithstanding)` matches: Any block of text that begins with "notwithstanding". So the phrase "notwithstanding the fact that I like vegetables in my soup" would trigger the regular expression, but "The fact that I like vegetables in my soup notwithstanding" would not.
$	Indicates the end of a string	`(notwithstanding)$` matches: Any block of text that ends with "notwithstanding". So the phrase "notwithstanding the fact that I like vegetables in my soup" would not trigger the regular expression, but "The fact that I like vegetables in my soup notwithstanding" would.
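
The two anchors can be compared side by side on the sample phrases from the table. This is a small sketch assuming Python's `re` module; the variable names are illustrative.

```python
import re

starts_with = re.compile(r"^(notwithstanding)")
ends_with = re.compile(r"(notwithstanding)$")

leading = "notwithstanding the fact that I like vegetables in my soup"
trailing = "The fact that I like vegetables in my soup notwithstanding"

# ^ only matches at the start of the text, $ only at the end.
print(bool(starts_with.search(leading)), bool(starts_with.search(trailing)))
print(bool(ends_with.search(leading)), bool(ends_with.search(trailing)))
```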

Escape Sequences and Literal Strings

Element	Meaning	Example
\	Indicates that the character that follows, which would otherwise have a special meaning in a regular expression (for example, +), is matched literally	`.REG. C\/C\+\+` matches: C/C++ `.REG. \*` matches: * `.REG. \?` matches: ?
\t	Indicates a tab character (ASCII 0x09 character)	`(stress)\t` matches: Any block of text that contains the substring "stress" immediately followed by a tab.
\n	Indicates a new line character (ASCII 0x0A character). Note: Different platforms represent a new line differently. On Windows, a new line is a pair of characters, a carriage return followed by a line feed. On UNIX and Linux, a new line is just a line feed, and on classic Mac OS (before OS X) a new line is just a carriage return.	`(stress)\n\n` matches: Any block of text that contains the substring "stress" followed immediately by two new line characters.
\r	Indicates a carriage return character (ASCII 0x0D character)	`(stress)\r` matches: Any block of text that contains the substring "stress" followed immediately by one carriage return.
\xhh	Indicates an ASCII character with the given hexadecimal code (where hh represents any two-digit hex value)	`\x7E(\w){6}` matches: Any block of text containing a "word" of exactly six alphanumeric characters preceded by a ~ (tilde) character. Additional examples that will trigger a match: ~ab12cd and ~Pa3499.
\b	Indicates a backspace character	`(stress)\b` matches: Any block of text that contains the substring "stress" followed immediately by one backspace (ASCII 0x08) character.
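
The escape sequences above can be exercised as follows. The sketch assumes Python's `re` module, which agrees with the table on `\`, `\t`, `\n`, `\r`, and `\xhh`. It differs on `\b`: outside a character set, Python treats `\b` as a word boundary, so the backspace example is written with `\x08` instead. All variable names are illustrative.

```python
import re

# A backslash makes the next special character literal.
literal = re.compile(r"C\/C\+\+")
print(bool(literal.fullmatch("C/C++")))

# re.escape builds such a pattern from plain text.
same = re.compile(re.escape("C/C++"))
print(bool(same.fullmatch("C/C++")))

# \x7E is the tilde; (\w){6} then requires six word characters.
tilde_word = re.compile(r"\x7E(\w){6}")
print(bool(tilde_word.search("~ab12cd")), bool(tilde_word.search("~Pa3499")))

# Tab, new line, and carriage return directly after "stress".
tab = re.compile(r"(stress)\t")
two_newlines = re.compile(r"(stress)\n\n")
carriage_return = re.compile(r"(stress)\r")

# Backspace (ASCII 0x08), written as \x08 because \b means
# "word boundary" in Python outside a character set.
backspace = re.compile(r"(stress)\x08")
print(bool(backspace.search("stress\x08")))
```

On text with Windows line endings, `(stress)\n\n` does not match, because each line break there is a carriage return followed by a line feed. Under that assumption, a pattern such as `(stress)(\r?\n){2}` covers both conventions.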