Dot meta-character in regex

  • The dot meta-character matches any single character, regardless of what that character is.
  • Note: by default, the dot does not match the newline character.
  • The dot is not a meta-character inside a character set [ ]; there it matches a literal dot.
  • The dot is a very powerful regex meta-character and it lets you be lazy, i.e. you put a dot and almost any character matches at that position.
  • It is advisable to use the dot meta-character sparingly: because it matches almost anything, it can cause the regex to match characters that we don’t want to match.
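
The points above can be checked with a small sketch. It assumes Python's built-in re module; the sample strings are made up for illustration, and re.DOTALL is Python's own opt-in flag for letting the dot match a newline (other regex engines name this option differently).

```python
import re

# Hypothetical sample text: four "c?t" candidates, the last one split by a newline.
text = "cat cot c-t c\nt"

# By default the dot matches any single character except a newline.
default_matches = re.findall(r"c.t", text)

# Inside a character set the dot is a literal full stop, not a wildcard.
literal_matches = re.findall(r"c[.]t", "cat c.t")

# With re.DOTALL the dot matches a newline as well.
dotall_matches = re.findall(r"c.t", text, flags=re.DOTALL)

print(default_matches)      # ['cat', 'cot', 'c-t']
print(literal_matches)      # ['c.t']
print(len(dotall_matches))  # 4
```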

Example : If you want to match the date format 03/11/2035, you can write a regex like this “\d\d.\d\d.\d\d\d\d”, but it also matches 03-11-2015 and 03.11.2015.
We can tighten the regex by using a character set like this “\d\d[/]\d\d[/]\d\d\d\d”

In the example above we have no control over which characters the . can match in the date input, so an unrestricted dot meta-character is risky to use in a regex.
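
As a rough illustration of the date example, the sketch below runs both patterns over a few candidate strings using Python's re module. The sample list, including the deliberately malformed 03a11b2015, is an assumption added here to show how permissive the dot is.

```python
import re

dot_pattern = re.compile(r"\d\d.\d\d.\d\d\d\d")      # separator can be any character
set_pattern = re.compile(r"\d\d[/]\d\d[/]\d\d\d\d")  # separator must be a slash

# Hypothetical inputs; the last one is not a date at all.
samples = ["03/11/2035", "03-11-2015", "03.11.2015", "03a11b2015"]

# Map each sample to (matched by dot pattern, matched by character-set pattern).
results = {
    s: (bool(dot_pattern.fullmatch(s)), bool(set_pattern.fullmatch(s)))
    for s in samples
}

for sample, (dot_ok, set_ok) in results.items():
    print(f"{sample}  dot={dot_ok}  set={set_ok}")
```

The dot pattern accepts all four strings, while the character-set pattern accepts only 03/11/2035.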

Use Character Sets Instead of the Dot

  • The dot meta-character is most often used when we know a special character can appear inside the search text. Instead, you can either blacklist or whitelist the characters with a character set.
  • In the date example above, we can use a character set to whitelist the separator, or a negated character set to blacklist the characters that must not appear as the separator.

Q & A

Input : Hello world this is a date 03/11/2035 which has to be found using Regex
But it should not match 03\11\2035 or 03 11 2035 or 03-11-2035
Search : 03/11/2035
Regex Using Character set : \d\d[/]\d\d[/]\d\d\d\d
Regex Using Negated Character set : \d\d[^a-zA-Z0-9\\\-. ]\d\d[^a-zA-Z0-9\\\-. ]\d\d\d\d
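
A minimal way to check both answers against the input, assuming Python's re module. Note that the negated set in this sketch lists the backslash among the excluded characters, since 03\11\2035 must be rejected; the variable names and the extra 03_11_2035 probe are illustrative assumptions.

```python
import re

text = "Hello world this is a date 03/11/2035 which has to be found using Regex"
rejected = [r"03\11\2035", "03 11 2035", "03-11-2035"]

# Whitelist: only a slash is allowed as the separator.
char_set = re.compile(r"\d\d[/]\d\d[/]\d\d\d\d")

# Blacklist: the separator may be anything except letters, digits,
# backslash, hyphen, dot, or space.
negated_set = re.compile(r"\d\d[^a-zA-Z0-9\\\-. ]\d\d[^a-zA-Z0-9\\\-. ]\d\d\d\d")

found = [p.search(text).group() for p in (char_set, negated_set)]
leaks = [bool(p.search(r)) for p in (char_set, negated_set) for r in rejected]

print(found)       # ['03/11/2035', '03/11/2035']
print(any(leaks))  # False

# The blacklist still lets through separators nobody thought to exclude.
underscore_ok = bool(negated_set.search("03_11_2035"))
print(underscore_ok)  # True
```

Both patterns find the date and reject the three unwanted forms, but the blacklist version still accepts any separator that was not explicitly excluded, so the whitelist form is usually the tighter choice when only one separator is valid.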
