7 September 2006
Regular Expression Basics
Posted by Mikhail Esteves under: Miscellaneous; Tips .
Regular expressions, sometimes referred to as regex, grep, or pattern matching, can be a very powerful tool and a tremendous time-saver with a broad range of application. As an extended form of find-and-replace, you can use a regular expression to do things such as perform client-side validation of email addresses and phone numbers, search multiple documents for strings and patterns you wish to change or remove, or extract a list of links from source code. Regex is supported by most languages and tools, but because there can be varying implementations, this article will cover basic principles that are commonly used.
If you’ve seen a regular expression before and thought it looked like alien space-algebra, it does, but have no fear – you’ll be fluent in alien space-algebra in no time! To make the most of the power of regex, you need to be familiar with a few classifications of characters. Literals are normal text characters and can include whitespace (tabs, spaces, newlines, etc.). Unless modified by a metacharacter, a literal will match itself on a one-for-one basis. Metacharacters’ power lies in how they are arranged and interpreted as wildcards. Metacharacters can be escaped with a backslash (\) to find instances of themselves, for instance, if you need to find a caret (^) or a backslash, as well as used in nested groups or other combinations.