Boolean Expressions
Agent Ransack's Boolean expression engine supports Web-style search expressions using the AND, OR, NOT, NEAR, REGEX, and LIKE operators. Agent Ransack can be configured in the Options Tab to match the expression across the whole file (default) or on a line-by-line basis.
Line by Line example
The expression work AND document searches for lines that include the words work and document. Since Agent Ransack implicitly assumes an AND, the expression can alternatively be written as work document.
The expression work OR document searches for lines that include either 'work' or 'document'.
The expression work NOT document searches for lines that include 'work' but not 'document'.
Whole file example
The expression work AND document searches for files that include the words work and document. The words can occur on the same line or on different lines throughout the file.
The expression work NOT document searches for files that include work but not document anywhere in the file.
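The difference between the two modes can be sketched in a few lines of Python. This is only an illustration of the semantics described above, not Agent Ransack's implementation: the function names are hypothetical, and plain case-sensitive substring tests are assumed in place of the program's real term matching.

```python
# Illustrative only: plain case-sensitive substring tests stand in for
# Agent Ransack's real term matching, which is an assumption of this sketch.

def contains_all(text, terms):
    """Return True if every term occurs somewhere in text."""
    return all(term in text for term in terms)

def search_line_by_line(text, terms):
    """Line-by-line mode: every term must occur on the same line."""
    return [line for line in text.splitlines() if contains_all(line, terms)]

def search_whole_file(text, terms):
    """Whole-file mode: the terms may occur on different lines."""
    return contains_all(text, terms)

sample = "I left work early\nThe document is ready\n"
line_hits = search_line_by_line(sample, ["work", "document"])
file_hit = search_whole_file(sample, ["work", "document"])
print(line_hits)  # []
print(file_hit)   # True
```

Here work AND document finds no matching line, because the two words never share a line, but the file as a whole still matches.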
Note: the operators AND, OR, and NOT must be written in capital letters; otherwise they are treated as search terms.
Quotes can be used to search for literal phrases, e.g.
"work document" searches for the exact phrase work document.
Brackets can be used to specify phrase grouping, e.g.
The expression work AND (document OR letter) searches for lines that include work and either document or letter.
LIKE Operator
If you are unsure of the spelling of a search term, or the term may be misspelled in the searched text, the LIKE operator can be used to specify an approximate search term. For example,
LIKE necessary
will find necessary but also slight variations such as neccessary. The scale of the approximation can be changed in the Configuration settings.
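The idea of an approximate match with an adjustable tolerance can be sketched as follows. This is not the algorithm Agent Ransack uses, which this page does not describe. The sketch assumes a similarity ratio from Python's standard difflib module, and the is_like function and its threshold parameter are hypothetical stand-ins for the Configuration setting.

```python
from difflib import SequenceMatcher

def is_like(term, candidate, threshold=0.8):
    """Return True if candidate is 'close enough' to term.

    threshold is a hypothetical tolerance between 0 and 1; a lower value
    accepts looser matches. The measure used here (difflib's ratio) is an
    assumption of this sketch, not Agent Ransack's actual measure.
    """
    ratio = SequenceMatcher(None, term.lower(), candidate.lower()).ratio()
    return ratio >= threshold

print(is_like("necessary", "neccessary"))  # True
print(is_like("necessary", "document"))    # False
```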
NEAR Operator
To specify that two search terms should be near to each other in the search text use the NEAR operator. For example,
work NEAR document
will only match the two terms if they are within a certain number of characters of each other (the default maximum character distance is specified in the Configuration settings). The maximum distance can be specified as part of the expression, e.g.
work NEAR:20 document
would change the maximum distance to 20 characters.
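A proximity test of this kind can be sketched in Python. How Agent Ransack measures the distance is not stated on this page, so the sketch assumes it is the number of characters between the end of one term and the start of the other. The near function and its max_distance parameter are hypothetical.

```python
import re

def find_spans(text, term):
    """Return (start, end) positions of every literal occurrence of term."""
    return [(m.start(), m.end()) for m in re.finditer(re.escape(term), text)]

def near(text, first, second, max_distance=20):
    """Return True if some occurrence of first and some occurrence of second
    are separated by at most max_distance characters, in either order."""
    for a_start, a_end in find_spans(text, first):
        for b_start, b_end in find_spans(text, second):
            # Gap between the two occurrences, whichever one comes first.
            gap = max(b_start - a_end, a_start - b_end)
            if gap <= max_distance:
                return True
    return False

print(near("work on the document", "work", "document"))        # True
print(near("work" + " " * 50 + "document", "work", "document"))  # False
```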
REGEX Operator
To specify that a term is a regular expression use the REGEX operator. For example,
work AND REGEX "\d{5,6}"
will match any document that contains both the term work and a match for the regex \d{5,6} (i.e. a number with 5-6 digits). To specify that terms should always be treated as regular expressions, i.e. without the need to use the REGEX operator, use the Boolean RegEx expression type (see below).
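The combination of a plain term with a regular expression can be tried out in Python. The sketch below uses Python's re module, whose syntax may differ in details from Agent Ransack's regex engine, and the matches function is hypothetical.

```python
import re

FIVE_OR_SIX_DIGITS = re.compile(r"\d{5,6}")

def matches(text):
    """Whole-file style test for: work AND REGEX "\\d{5,6}"."""
    return "work" in text and FIVE_OR_SIX_DIGITS.search(text) is not None

print(matches("work order 12345"))  # True
print(matches("work order 1234"))   # False
print(matches("order 12345"))       # False
```

In Python the pattern is unanchored, so it also matches inside a longer run of digits such as 1234567. If only standalone 5-6 digit numbers are wanted, a pattern with word boundaries such as \b\d{5,6}\b is one option, assuming the engine in use supports \b.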
LINES Operator
The LINES operator limits the lines that are searched for the following expression. For example,
LINES:3-5 (tower AND london)
searches only lines 3, 4, and 5 for the expression tower AND london.
LINES:10+ (tower AND london)
searches all lines from line 10 and higher.
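The two range forms can be illustrated with a small Python helper. The select_lines function is hypothetical and assumes, as the examples above suggest, that line numbers start at 1 and that both ends of a range are included.

```python
def select_lines(text, first, last=None):
    """Return lines first..last (1-based, inclusive).

    last=None models the open-ended form, e.g. LINES:10+.
    """
    lines = text.splitlines()
    return lines[first - 1:last]

sample = "\n".join(f"line {n}" for n in range(1, 13))
print(select_lines(sample, 3, 5))  # ['line 3', 'line 4', 'line 5']
print(select_lines(sample, 10))    # ['line 10', 'line 11', 'line 12']
```

The expression that follows the LINES operator would then be evaluated only against the selected lines.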
FILELIST Operator
The FILELIST operator loads the specified file as a File List. For example,
work AND FILELIST "C:\TermList.txt"
Note: Since the use of the FILELIST operator is explicit the functionality will work regardless of the File Lists settings.
Boolean Sub Expressions
Boolean expressions are composed of sub expressions. The sub expression type will depend on the Expression Type chosen.
1 The wildcard setting is specified in the Options tab (the default is to allow wildcards).
Example: Boolean RegEx
By using the Boolean RegEx expression type, regular expression searches can be combined using the operators AND, OR, and NOT. The regular expressions are evaluated on each line, but the behaviour of the Boolean combination of those regex results, i.e. line by line or across the whole file, is defined by the Boolean Expression settings in the Options Tab.
Line by Line example
The expression [0-9]+ AND document searches for lines that include both a number and the word document.
The expression "[a-z]+@[a-z]+" NOT "\.(com|net)" searches for lines with email-like text but not including .com or .net. Note the use of quotes to show the regular expression grouping (otherwise the brackets would have been treated as a Boolean grouping).
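The line-by-line NOT example can be reproduced in Python to check what it accepts and rejects. This sketch uses Python's re module as a stand-in for Agent Ransack's regex engine, and the email_lines_without_com_net function is hypothetical.

```python
import re

EMAIL_LIKE = re.compile(r"[a-z]+@[a-z]+")
COM_OR_NET = re.compile(r"\.(com|net)")

def email_lines_without_com_net(text):
    """Keep lines that match the first regex and do not match the second."""
    return [line for line in text.splitlines()
            if EMAIL_LIKE.search(line) and not COM_OR_NET.search(line)]

sample = "alice@example.org\nbob@example.com\nno address here\n"
print(email_lines_without_com_net(sample))  # ['alice@example.org']
```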
Whole file example
The expression "([0-9]+\.){3}[0-9]+" AND error searches for files with an IP address and the word error somewhere in the file but not necessarily on the same line.
Quotes are used to identify parts of the expression that are regular expressions.
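The whole-file behaviour can be sketched the same way. Each sub expression is tested per line and the Boolean AND is then applied across the whole file. As before, Python's re module stands in for Agent Ransack's regex engine, the file_matches function is hypothetical, and treating the unquoted term error as a regular expression follows from the Boolean RegEx expression type described above.

```python
import re

IP_LIKE = re.compile(r"([0-9]+\.){3}[0-9]+")
ERROR = re.compile(r"error")

def file_matches(text):
    """Whole-file mode: each regex must match on at least one line,
    but the two matches do not have to be on the same line."""
    lines = text.splitlines()
    has_ip = any(IP_LIKE.search(line) for line in lines)
    has_error = any(ERROR.search(line) for line in lines)
    return has_ip and has_error

print(file_matches("host 192.168.0.1 up\nfatal error\n"))  # True
print(file_matches("host 192.168.0.1 up\n"))                # False
```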
Note: Due to the complex nature of the Boolean RegEx expression type, searches using it are usually slower than with the other expression types. Boolean RegEx is therefore recommended only when its specific capabilities are required.