
Regular expressions are derived from the UNIX utility grep and enable powerful text searches to be carried out using the special characters ^, $, ., *, +, ?, [ ], [^], [-], \, ( ) and |. These characters have the following meanings:
| ^ | A circumflex at the beginning of an expression matches the start of a line. For instance ^while will find all lines starting with while. |
| $ | A dollar at the end of an expression matches the end of a line. For instance tomorrow$ will find all lines ending with tomorrow. |
| * | An asterisk after a character will match zero or more occurrences of that character. For instance to* will match t, to, and too. |
| + | A plus sign after a character will match one or more occurrences of that character. For instance to+ will match to and too. |
| ? | A question mark after a character will match zero or one occurrence of that character. For instance to? will match t and to. |
| . | A period matches any character. For instance p.n will match pan, pen, pin and pun. |
| | | The vertical line character matches either expression it separates. For example pan|pen will match pan and pen. |
| ( ) | Characters can be grouped within parentheses. This allows certain expressions to act on more than one character. For instance find(ing)?s will match finds and findings. |
| [ ] | Characters in square brackets will match any one of the enclosed characters. For instance p[aei]n will match pan, pen and pin but not pun. |
| [^] | A circumflex at the start of an expression within brackets will match any character except one of the enclosed characters. For instance p[^aei]n will match pun but not pan, pen or pin. |
| [-] | A hyphen within brackets indicates a range of characters. For instance p[a-h]n will match pan and pen but not pin or pun. |
| \ | A backslash before any of the above special characters treats that character literally. For instance \. will be treated as a period rather than as any character. |
| \w | Matches any word character. Word characters are the characters a-z, A-Z, 0-9, _ and any other character recognised by your system such as é and ä. For instance resum\w will match resume and resumé. |
| \W | Matches any non-word character. |
| \s | Matches any white space character including line endings. For instance text\ssearch will match text search even if it spans two lines. |
| \S | Matches any character that is not a white space character. |
| \d | Matches any digit character. For instance \d\d\d will match 999 and 101. |
| \D | Matches any character that is not a digit character. |
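The examples in the table above can be checked with a short script. This is a minimal sketch that assumes Python's `re` module as the engine; the search tool described here may use a different engine whose behaviour differs in detail. The names `EXAMPLES` and `matches_whole` and the sample text are illustrative only.

```python
import re

# Each entry maps a pattern from the table to
# (words it should match in full, words it should not).
EXAMPLES = {
    r"to*": (["t", "to", "too"], ["o"]),
    r"to+": (["to", "too"], ["t"]),
    r"to?": (["t", "to"], ["too"]),
    r"p.n": (["pan", "pen", "pin", "pun"], ["pn"]),
    r"pan|pen": (["pan", "pen"], ["pin"]),
    r"find(ing)?s": (["finds", "findings"], ["finding"]),
    r"p[aei]n": (["pan", "pen", "pin"], ["pun"]),
    r"p[^aei]n": (["pun"], ["pan", "pen", "pin"]),
    r"p[a-h]n": (["pan", "pen"], ["pin", "pun"]),
    r"3\.14": (["3.14"], ["3x14"]),  # \. is a literal period
    r"resum\w": (["resume", "resum\u00e9"], ["resum"]),
    r"text\ssearch": (["text search", "text\nsearch"], ["textsearch"]),
    r"\d\d\d": (["999", "101"], ["12a"]),
}


def matches_whole(pattern, word):
    """Return True if the pattern matches the entire word."""
    return re.fullmatch(pattern, word) is not None


for pattern, (should_match, should_not_match) in EXAMPLES.items():
    for word in should_match:
        print(pattern, repr(word), matches_whole(pattern, word))
    for word in should_not_match:
        print(pattern, repr(word), matches_whole(pattern, word))

# ^ and $ anchor to line boundaries. In Python this needs re.MULTILINE
# when the text being searched contains several lines.
sample = "while x:\n  meanwhile\nsee you tomorrow\ntomorrow never"
lines_starting = re.findall(r"^while.*$", sample, re.MULTILINE)
lines_ending = re.findall(r"^.*tomorrow$", sample, re.MULTILINE)
print(lines_starting)  # ['while x:']
print(lines_ending)    # ['see you tomorrow']
```

`re.fullmatch` is used so that each word is tested as a whole; a plain search would also report a match when the pattern occurs anywhere inside a longer word.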
Within square brackets the special characters $, ., * and + are treated literally, while ^ is only treated as a special character if it immediately follows a [.
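The bracket rule can be illustrated in the same way. Again this is a sketch that assumes Python's `re` module, which follows the same rule; the variable names and sample strings are made up for the example.

```python
import re

# Inside brackets, $ . * and + stand for themselves.
literal_hits = re.findall(r"[$.*+]", "a$b.c*d+e")
print(literal_hits)  # ['$', '.', '*', '+']

# ^ negates the set only when it immediately follows the [.
negated_hits = re.findall(r"[^ab]", "abc^")
print(negated_hits)  # ['c', '^']

# Anywhere else inside the brackets, ^ is an ordinary character.
caret_hits = re.findall(r"[a^b]", "abc^")
print(caret_hits)  # ['a', 'b', '^']
```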
Further Examples