Criteria for a Keyword List
Rule
|
|
Any keyword
|
A file must contain at least one keyword in the keyword list.
|
All keywords
|
A file must contain all the keywords in the keyword list.
|
All keywords within <x> characters
|
A file must contain all the keywords in the keyword list. In
addition, each keyword pair must be within <x> characters
of each other.
For example, suppose your three keywords are WEB, DISK, and
USB, and the number of characters you specified is 20.
If Deep Discovery Email Inspector detects the keywords in the
order DISK, WEB, and USB, the number of characters from the
“D” (in DISK) to the “W” (in WEB), from the “W” to the “U”
(in USB), and from the “D” to the “U” must each be 20 or
fewer.
The following data matches the criteria:
DISK####WEB#########USB
The following data does not match the criteria:
DISK####WEB############USB (23 characters between “D” and
“U”)
DISK*******************WEB****USB (23 characters between “D”
and “W”)
When deciding on the number of characters, remember that a
small number, such as 10, will usually result in a shorter
scanning time but will only cover a relatively small area.
This may reduce the likelihood of detecting sensitive data,
especially in large files. As the number increases, the area
covered also increases but the scanning time might be
longer.
|
Combined score for keywords exceeds threshold
|
A file must contain one or more keywords in the keyword list.
If only one keyword is detected, its score must be higher than
the threshold. If several keywords are detected, their combined
score must be higher than the threshold.
Assign each keyword a score of 1 to 10. A highly confidential word
or phrase, such as "salary increase" for the Human Resources
department, should have a relatively high score. Words or phrases
that, by themselves, do not carry much weight can have lower scores.
Consider the scores that you assigned to the keywords when
configuring the threshold. For example, if you have five keywords
and three of those keywords are high priority, the threshold can be
set lower than the combined score of the three high-priority
keywords. This means that the detection of these three keywords
alone is enough to treat the file as sensitive.
|
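The four criteria above can be illustrated with a minimal Python sketch. It is not the product's implementation. All function names are hypothetical, and the sketch assumes case-sensitive substring matching, distances measured between the first characters of the keywords (as in the DISK/WEB/USB example), each keyword counted once toward the score however often it appears, and a strict "higher than" comparison against the threshold.

```python
from itertools import product


def find_positions(text: str, keyword: str) -> list[int]:
    """Return the start index of every occurrence of keyword in text."""
    positions = []
    start = text.find(keyword)
    while start != -1:
        positions.append(start)
        start = text.find(keyword, start + 1)
    return positions


def any_keyword(text: str, keywords: list[str]) -> bool:
    """'Any keyword': at least one keyword appears in the text."""
    return any(k in text for k in keywords)


def all_keywords(text: str, keywords: list[str]) -> bool:
    """'All keywords': every keyword appears in the text."""
    return all(k in text for k in keywords)


def all_within(text: str, keywords: list[str], max_chars: int) -> bool:
    """'All keywords within <x> characters'.

    Every keyword must appear, and there must be one occurrence of each
    such that every pair of start positions is at most max_chars apart.
    That holds exactly when the largest and smallest chosen positions
    differ by at most max_chars. Trying every combination is fine for a
    sketch but would be slow on large inputs.
    """
    occurrences = [find_positions(text, k) for k in keywords]
    if not all(occurrences):
        return False
    return any(
        max(choice) - min(choice) <= max_chars
        for choice in product(*occurrences)
    )


def combined_score_exceeds(text: str, scores: dict[str, int], threshold: int) -> bool:
    """'Combined score for keywords exceeds threshold' (strict comparison assumed)."""
    total = sum(score for keyword, score in scores.items() if keyword in text)
    return total > threshold


# The three sample strings from the table, built with repetition counts
# so that the padding lengths are explicit.
keywords = ["WEB", "DISK", "USB"]
match = "DISK" + "#" * 4 + "WEB" + "#" * 9 + "USB"          # D to U: 20
too_far_du = "DISK" + "#" * 4 + "WEB" + "#" * 12 + "USB"    # D to U: 23
too_far_dw = "DISK" + "*" * 19 + "WEB" + "*" * 4 + "USB"    # D to W: 23

print(all_within(match, keywords, 20))       # True
print(all_within(too_far_du, keywords, 20))  # False
print(all_within(too_far_dw, keywords, 20))  # False
```

Under these assumptions, a threshold equal to the combined score of the detected keywords does not trigger a match, which is why the threshold in the scoring example is best set below the combined score of the high-priority keywords.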