Content Filtering keywords support regular expression declarations. See the following
tables for more in-depth examples of regular expressions.
A number of websites and tutorials on regular expressions are available online. One
such site is the PerlDoc site, which can be found at:
Counting and Grouping
Element
|
Meaning
|
Example
|
.
|
The dot or period character represents any character (except the new line
character).
|
do. matches:
doe, dog, don, dos, dot
d..r matches:
deer, door
|
*
|
The asterisk character means zero or more instances of the preceding
element.
|
do* matches:
d, do, doo, dooo, doooo
|
+
|
The plus sign character means one or more instances of the preceding
element.
|
do+ matches:
do, doo, dooo, doooo but not d
|
?
|
The question mark character means zero or one instances of the preceding
element.
|
do? matches:
d or do but not doo, dooo
|
( )
|
Parentheses group whatever is between them so that it is treated as
a single entity.
|
d(eer)+ matches:
deer or deereer or deereereer
The + sign is applied to the substring within parentheses, so the regular
expression looks for d followed by one or more of the grouping eer.
|
[ ]
|
Square bracket characters indicate a set or a range of
characters.
|
d[aeiouy]+ matches:
da, de, di, do, du, dy, daa, dae, dai
The + sign is applied to the set within brackets, so the regular expression
looks for d followed by one or more of any of the characters in the set [aeiouy].
d[A-Z] matches:
dA, dB, dC, and so on up to dZ.
The set in square brackets represents the range of all upper-case letters between
A and Z.
|
[^ ]
|
Caret characters within square brackets logically negate the set or range
specified, meaning the regular expression will match any character that is not in
the set or range.
|
d[^aeiouy] matches:
db, dc, dd, d9, d#--d followed by any single character other than a, e, i, o, u, or y
|
{ }
|
Curly brace characters set a specific number of occurrences of the
preceding element. A single value inside the braces means that only that many
occurrences will match. A pair of numbers separated by a comma represents an
inclusive range of valid counts of the preceding element. A single number followed
by a comma means there is no upper bound.
|
da{3} matches:
daaa--d followed by 3 and only 3 occurrences of a
da{2,4} matches:
daa, daaa, and daaaa (but not daaaaa)--d followed by 2, 3, or 4
occurrences of a
da{4,} matches:
daaaa, daaaaa, daaaaaa--d followed by 4 or more occurrences of a.
|
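The counting and grouping elements above can be tried out with a short script. The following is a minimal sketch that uses Python's re module as a stand-in; the Content Filtering engine itself may behave differently in some details. The helper name check and the sample strings are assumptions made for illustration, and re.fullmatch is used so that the entire sample string must match the pattern.

```python
import re

# Each pattern from the table, paired with strings that should match in full
# and strings that should not. The sample strings are illustrative only.
cases = {
    r"do.": (["doe", "dog", "dot"], ["do", "door"]),
    r"do*": (["d", "do", "dooo"], ["dog"]),
    r"do+": (["do", "doo"], ["d"]),
    r"do?": (["d", "do"], ["doo"]),
    r"d(eer)+": (["deer", "deereer"], ["d", "dee"]),
    r"d[aeiouy]+": (["da", "dae", "dy"], ["d", "db"]),
    r"d[^aeiouy]": (["db", "d9", "d#"], ["da", "dy"]),
    r"da{3}": (["daaa"], ["daa", "daaaa"]),
    r"da{2,4}": (["daa", "daaa", "daaaa"], ["da", "daaaaa"]),
    r"da{4,}": (["daaaa", "daaaaaa"], ["daaa"]),
}


def check(pattern, text):
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# For every pattern, confirm that all "good" strings match and no "bad" one does.
results = {
    pattern: all(check(pattern, s) for s in good)
    and not any(check(pattern, s) for s in bad)
    for pattern, (good, bad) in cases.items()
}

for pattern, ok in results.items():
    print(pattern, "behaves as described" if ok else "differs from the table")
```

A keyword filter normally searches within a larger block of text rather than comparing whole strings, so an unanchored search would also find do* inside a word such as dog.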
Shorthand Classes
Element
|
Meaning
|
Example
|
\d
|
Any digit character; functionally equivalent to [0-9] or
[[:digit:]]
|
\d+ matches:
1, 12, 123, but not 1b7 as a whole--one or more of any digit characters.
|
\D
|
Any non-digit character; functionally equivalent to [^0-9] or
[^[:digit:]]
|
\D+ matches:
a, ab, ab&, but not 1--one or more of any character but 0, 1, 2, 3, 4, 5, 6,
7, 8, or 9.
|
\w
|
Any "word" character--that is, any alphanumeric character; functionally
equivalent to [A-Za-z0-9] or [[:alnum:]]
|
\w+ matches:
a, ab, a1, but not !&--one or more upper- or lower-case letters or digits,
but not punctuation or other special characters.
|
\W
|
Any non-alphanumeric character; functionally equivalent to [^A-Za-z0-9]
or [^[:alnum:]]
|
\W+ matches:
*, &, but not ace or a1--one or more of any character but upper- or
lower-case letters and digits.
|
\s
|
Any white space character; space, new line, tab, non-breaking space, and
others; functionally equivalent to [[:space:]]
|
vegetable\s matches:
"vegetable" followed by any white space character
So the phrase "I like a vegetable in my soup" would trigger the regular
expression, but "I like vegetables in my soup" would not.
|
\S
|
Any non-white space character; anything other than a space, new line,
tab, non-breaking space, and others; functionally equivalent to
[^[:space:]]
|
vegetable\S matches:
"vegetable" followed by any non-white space character
So the phrase "I like vegetables in my soup" would trigger the regular
expression, but "I like a vegetable in my soup" would not.
|
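The shorthand classes can be exercised in the same way. The sketch below again uses Python's re module as a stand-in, with the re.ASCII flag so that \d, \w, and \s keep their ASCII definitions. The helper name found is an assumption for illustration. Note that Python's \w also matches the underscore character, which the table above does not list; whether the Content Filtering engine does the same is not stated here.

```python
import re


def found(pattern, text):
    """Return True when the pattern occurs anywhere in the text."""
    # re.ASCII limits \d, \w and \s to ASCII digits, word characters and spaces.
    return re.search(pattern, text, flags=re.ASCII) is not None


# \d+ against whole strings: only all-digit strings match in full.
digits_only = re.fullmatch(r"\d+", "123", flags=re.ASCII) is not None
mixed = re.fullmatch(r"\d+", "1b7", flags=re.ASCII) is not None

# The vegetable examples from the \s and \S rows.
singular = "I like a vegetable in my soup"
plural = "I like vegetables in my soup"
space_after_singular = found(r"vegetable\s", singular)
space_after_plural = found(r"vegetable\s", plural)
nonspace_after_plural = found(r"vegetable\S", plural)
nonspace_after_singular = found(r"vegetable\S", singular)

# Python-specific detail: the underscore counts as a word character.
underscore_is_word = found(r"\w", "_")

print(digits_only, mixed)
print(space_after_singular, space_after_plural)
print(nonspace_after_plural, nonspace_after_singular)
print(underscore_is_word)
```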
Character Classes
Element
|
Meaning
|
Example
|
||
[:alpha:]
|
Any alphabetic characters
|
.REG. [[:alpha:]] matches:
abc, def, xxx, but not 123, @#$.
|
||
[:digit:]
|
Any digit character; functionally equivalent to \d
|
.REG. [[:digit:]] matches:
1, 12, 123
|
||
[:alnum:]
|
Any "word" character--that is, any alphanumeric character; functionally
equivalent to \w
|
.REG. [[:alnum:]] matches:
abc, 123, but not ~!@.
|
||
[:space:]
|
Any white space character; space, new line, tab, non-breaking space; functionally
equivalent to \s
|
.REG. (vegetable)[[:space:]] matches:
"vegetable" followed by any white space character
So the phrase "I like a vegetable in my soup" would trigger the regular
expression, but "I like vegetables in my soup" would not.
|
||
[:graph:]
|
Any characters except space, control characters, or other similar characters
|
.REG. [[:graph:]] matches:
123, abc, xxx, ><", but not space or control characters.
|
||
[:print:]
|
Any character matched by [:graph:], plus the space character
|
.REG. [[:print:]] matches:
123, abc, xxx, ><", and space characters.
|
||
[:cntrl:]
|
Any control character (for example, CTRL + C, CTRL + X)
|
.REG. [[:cntrl:]] matches:
0x03, 0x08, but not abc, 123, !@#.
|
||
[:blank:]
|
Space and tab characters
|
.REG. [[:blank:]] matches:
space and tab characters, but not 123, abc, !@#
|
||
[:punct:]
|
Punctuation characters
|
.REG. [[:punct:]] matches:
; : ? ! ~ @ # $ % & * ' " , but not 123, abc
|
||
[:lower:]
|
Any lowercase alphabetic character
|
.REG. [[:lower:]] matches:
abc, Def, sTress, Do, but not ABC, DEF, STRESS, DO, 123, !@#.
|
||
[:upper:]
|
Any uppercase alphabetic character
|
.REG. [[:upper:]] matches:
ABC, Def, sTress, Do, but not abc, def, stress, do, 123, !@#.
|
||
[:xdigit:]
|
Digits allowed in a hexadecimal number (0-9a-fA-F)
|
.REG. [[:xdigit:]] matches:
0a, 7E, 0f
|
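Python's re module does not understand the [:name:] notation, so the sketch below translates each class into an explicit ASCID-free, plain ASCII range before matching. The mapping POSIX_ASCII and the helper expand_posix are hypothetical names introduced for illustration, and the ranges assume the ASCII ("C" locale) definitions of each class. The table above also counts the non-breaking space as white space, which this ASCII mapping does not cover.

```python
import re

# Assumed ASCII ("C" locale) contents of each POSIX character class.
POSIX_ASCII = {
    "alpha": "A-Za-z",
    "digit": "0-9",
    "alnum": "A-Za-z0-9",
    "space": r" \t\n\r\f\v",
    "graph": r"\x21-\x7E",
    "print": r"\x20-\x7E",
    "cntrl": r"\x00-\x1F\x7F",
    "blank": r" \t",
    "punct": r"!-/:-@\[-`{-~",
    "lower": "a-z",
    "upper": "A-Z",
    "xdigit": "0-9A-Fa-f",
}


def expand_posix(pattern):
    """Replace every [:name:] token with its ASCII range.

    The token must appear inside a bracket expression, as in [[:alpha:]],
    so that the expanded text is still a valid character set.
    """
    return re.sub(r"\[:(\w+):\]", lambda m: POSIX_ASCII[m.group(1)], pattern)


alpha = expand_posix("[[:alpha:]]")                 # becomes [A-Za-z]
veg_space = expand_posix("(vegetable)[[:space:]]")  # vegetable + white space

print(alpha)
print(re.search(veg_space, "I like a vegetable in my soup") is not None)
print(re.search(veg_space, "I like vegetables in my soup") is not None)
```

The .REG. prefix shown in the table examples belongs to the product's keyword syntax and is left out of the Python patterns.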
Pattern Anchor Regular Expressions
Element
|
Meaning
|
Example
|
^
|
Indicates the beginning of a string
|
^(notwithstanding) matches:
Any block of text that begins with "notwithstanding"
So the phrase "notwithstanding the fact that I like vegetables in my soup" would
trigger the regular expression, but "The fact that I like vegetables in my soup
notwithstanding" would not.
|
$
|
Indicates the end of a string
|
(notwithstanding)$ matches:
Any block of text that ends with "notwithstanding"
So the phrase "notwithstanding the fact that I like vegetables in my soup" would
not trigger the regular expression, but "The fact that I like vegetables in my
soup notwithstanding" would.
|
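The two anchor examples can be checked against the sample phrases from the table. This is a minimal sketch using Python's re module as a stand-in for the Content Filtering engine; the variable names are assumptions for illustration.

```python
import re

leading = "notwithstanding the fact that I like vegetables in my soup"
trailing = "The fact that I like vegetables in my soup notwithstanding"

# ^ ties the match to the start of the text, $ to the end.
starts_with = re.compile(r"^(notwithstanding)")
ends_with = re.compile(r"(notwithstanding)$")

print(starts_with.search(leading) is not None)   # word is at the start
print(starts_with.search(trailing) is not None)  # word is at the end only
print(ends_with.search(trailing) is not None)    # word is at the end
print(ends_with.search(leading) is not None)     # word is at the start only
```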
Escape Sequences and Literal Strings
Element
|
Meaning
|
Example
|
||
\
|
Indicates that the character that follows, which otherwise has a special meaning in
a regular expression (for example, +), is to be matched literally
|
.REG. C\/C\+\+ matches:
C/C++
.REG. \* matches:
*
.REG. \? matches:
?
|
||
\t
|
Indicates a tab character (ASCII 0x09 character)
|
(stress)\t matches:
Any block of text that contains the substring "stress" immediately followed by a
tab.
|
||
\n
|
Indicates a new line character (ASCII 0x0A character)
|
(stress)\n\n matches:
Any block of text that contains the substring "stress" followed immediately by
two new line characters.
|
||
\r
|
Indicates a carriage return character (ASCII 0x0D character)
|
(stress)\r matches:
Any block of text that contains the substring "stress" followed immediately by
one carriage return.
|
||
\xhh
|
Indicates an ASCII character with the given hexadecimal code (where hh represents
any two-digit hex value)
|
\x7E(\w){6} matches:
Any block of text containing a "word" of exactly six alphanumeric characters
preceded with a ~ (tilde) character.
Additional examples that will trigger a match: ~ab12cd and ~Pa3499.
|
||
\b
|
Indicates a backspace character
|
(stress)\b matches:
Any block of text that contains the substring "stress" followed immediately by
one backspace (ASCII 0x08) character
|
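The escape sequences above can be tried with the sketch below, which uses Python's re module as a stand-in for the Content Filtering engine. One difference is worth noting: in Python, \b outside a character set means a word boundary rather than a backspace, so the backspace character is written here as [\b]. The sample strings and variable names are assumptions for illustration.

```python
import re

# Escaped metacharacters are matched literally.
literal_cpp = re.search(r"C\/C\+\+", "Written in C/C++") is not None
literal_star = re.fullmatch(r"\*", "*") is not None
literal_question = re.fullmatch(r"\?", "?") is not None

# re.escape builds an escaped pattern from plain text automatically.
escaped_ok = re.fullmatch(re.escape("C/C++"), "C/C++") is not None

# Tab, new line and carriage return escapes.
tab_after = re.search(r"(stress)\t", "stress\tlevel") is not None
two_newlines = re.search(r"(stress)\n\n", "stress\n\nnext") is not None
carriage_return = re.search(r"(stress)\r", "stress\r\n") is not None

# \x7E is the tilde character; (\w){6} then requires six word characters.
tilde_word = re.search(r"\x7E(\w){6}", "login ~ab12cd") is not None

# Python-specific: [\b] is a backspace (0x08), bare \b is a word boundary.
backspace = re.search(r"(stress)[\b]", "stress\x08") is not None
word_boundary = re.search(r"(stress)\b", "stress test") is not None

print(literal_cpp, literal_star, literal_question, escaped_ok)
print(tab_after, two_newlines, carriage_return)
print(tilde_word, backspace, word_boundary)
```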