Meta characters in Regex

Below are the meta characters in Regex, i.e. characters that have a special meaning. To search for one of these characters literally, escape it with a backslash \ .

10 Meta characters
[ ]
\
^
$
.
|
?
*
+
( )
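
As a small illustration of escaping, here is a sketch using Python's built-in re module. The sample strings are made up for this example. It shows that an unescaped dot matches any character, while «\.» matches only a literal dot, and that re.escape can add the backslashes for you.

```python
import re

text = "Total: 3.50 (approx.)"  # made-up sample text

# Unescaped, "." is a meta character and matches any character.
unescaped_matches_other = re.search(r"3.50", "3x50") is not None

# Escaped with a backslash, "\." matches only a literal dot.
literal_dot = re.compile(r"3\.50")
matches_price = literal_dot.search(text) is not None
matches_other = literal_dot.search("3x50") is not None

# re.escape puts a backslash in front of the meta characters in a string.
escaped = re.escape("(approx.)")
found = re.search(escaped, text).group(0)

print(unescaped_matches_other, matches_price, matches_other, found)
```

re.escape is handy when the text to search for comes from user input and may contain any of the characters listed above.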

Non printable characters

You can use special character sequences to match non printable characters in a regular expression.

\t   Tab character     ASCII 0x09
\r   Carriage return   ASCII 0x0D
\n   New line feed     ASCII 0x0A
\v   Vertical tab      ASCII 0x0B
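
The tokens above can be tried out with a short sketch like the following, again assuming Python's re module. The sample line is invented for the example.

```python
import re

line = "name\tvalue\r\n"  # made-up line: two tab-separated fields, CRLF ending

# Split the fields on the tab character.
fields = re.split(r"\t", line.rstrip("\r\n"))

# Detect a carriage return followed by a line feed.
has_crlf = re.search(r"\r\n", line) is not None

# \v matches a vertical tab (0x0B).
has_vtab = re.search(r"\v", "a\x0bb") is not None

print(fields, has_crlf, has_vtab)
```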

In the Latin-1 character set, the copyright symbol is character 0xA9. So to search for the copyright symbol, you can use «\xA9». Another way to search for a tab is to use «\x09». Note that the leading zero is required.
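
A minimal sketch of the hexadecimal escapes, assuming Python's re module (the sample text is made up):

```python
import re

text = "\u00a9 2024 Example Corp\tAll rights reserved"  # made-up sample

# \xA9 matches the copyright symbol (character 0xA9).
copyright_match = re.search(r"\xA9", text)

# \x09 matches a tab, just like \t.
tab_match = re.search(r"\x09", text)

print(copyright_match.group(0), repr(tab_match.group(0)))
```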

Most regex flavors also support the tokens «\cA» through «\cZ» to insert ASCII control characters. The letter after the backslash is always a lowercase c. The second letter is an uppercase letter A through Z, to indicate Control+A through Control+Z. These are equivalent to «\x01» through «\x1A» (26 decimal). E.g. «\cM» matches a carriage return, just like «\r» and «\x0D». In XML Schema regular expressions, «\c» is a shorthand character class that matches any character allowed in an XML name.
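
To my knowledge, Python's re module is one of the flavors that does not accept the «\cA» through «\cZ» tokens, so the sketch below only illustrates the mapping from Control+letter to its character code. The helper name control_char is hypothetical and not part of any library.

```python
import re

def control_char(letter: str) -> str:
    """Return the ASCII control character for Control+<letter>, A through Z."""
    if len(letter) != 1 or not "A" <= letter <= "Z":
        raise ValueError("expected a single uppercase letter A-Z")
    # Control+A is 0x01, Control+B is 0x02, ... Control+Z is 0x1A.
    return chr(ord(letter) - ord("A") + 1)

cr = control_char("M")

# Control+M is the carriage return, so both \r and \x0D match it.
matches_r = re.fullmatch(r"\r", cr) is not None
matches_hex = re.fullmatch(r"\x0D", cr) is not None

print(hex(ord(cr)), matches_r, matches_hex)
```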

If your regular expression engine supports Unicode, use «\uFFFF» rather than «\xFF» to insert a Unicode character. The euro currency sign occupies code point 0x20AC. If you cannot type it on your keyboard, you can insert it into a regular expression with «\u20AC».
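
For instance, Python 3's re module supports Unicode in string patterns, so a sketch along these lines should find the euro sign. The sample price string is made up.

```python
import re

price = "Price: \u20ac25"  # made-up sample; \u20ac is the euro sign

# \u20AC in the pattern matches the euro sign; (\d+) captures the amount.
match = re.search(r"\u20AC(\d+)", price)
amount = match.group(1)

print(match.group(0), amount)
```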
