. any character (except newline) · \d digit [0-9] · \D non-digit · \w word char [a-zA-Z0-9_] · \W non-word · \s whitespace · \S non-whitespace · \b word boundary · \B non-boundary
* 0 or more · + 1 or more · ? 0 or 1 · {3} exactly 3 · {2,5} 2 to 5 · {3,} 3 or more · Add ? after any quantifier for lazy (non-greedy) matching: .*? matches as few characters as possible.
^ start of string · $ end of string · \b word boundary · \A absolute start · \Z absolute end
(abc) capture group · (?:abc) non-capturing group · (?P<name>abc) named group · \1 backreference to group 1 · (a|b) alternation
Lookarounds assert that something exists before or after your match without including it in the match result. They are zero-width — they check but don't consume characters.
(?=abc) — positive lookahead: matches if "abc" follows. Example: \d+(?= dollars) matches "100" in "100 dollars" but not the word "dollars".
(?!abc) — negative lookahead: matches if "abc" does NOT follow. Example: \d+(?! dollars) matches numbers NOT followed by "dollars".
(?<=abc) — positive lookbehind: matches if "abc" precedes. Example: (?<=\$)\d+ matches "100" in "$100" but not in "100 items".
(? — negative lookbehind: matches if "abc" does NOT precede.
^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[!@#$%]).{8,}$Named groups make complex patterns readable and maintainable:
(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})
This matches dates like "2026-04-28" and captures year, month, and day as named variables you can reference in code.
[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}
https?:\/\/[\w.-]+\.[a-z]{2,}[\/\w.?&=-]*
\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}
Matches: (555) 123-4567, 555-123-4567, 5551234567, 555.123.4567
\b(?:\d{1,3}\.){3}\d{1,3}\b
(?:0[1-9]|1[0-2])\/(?:0[1-9]|[12]\d|3[01])\/\d{4}
\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])
(?:1[0-2]|0?[1-9]):[0-5]\d\s?(?:AM|PM|am|pm)
#(?:[0-9a-fA-F]{3}){1,2}\b
<([a-z][a-z0-9]*)\b[^>]*>(.*?)<\/\1>
\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b
\b\d{5}(?:-\d{4})?\b
\b\d{3}-\d{2}-\d{4}\b
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[!@#$%^&*]).{12,}$
<[^>]*>
https?:\/\/(?:www\.)?([^\/?#]+)
^\s+|\s+$
\b(\w+)\s+\1\b
"([^"]*(?:""[^"]*)*)"|([^,]+)
\*\*([^*]+)\*\*
(?:youtube\.com\/watch\?v=|youtu\.be\/)([a-zA-Z0-9_-]{11})
g — global (find all matches, not just first) · i — case-insensitive · m — multiline (^ and $ match line boundaries) · s — dotall (. matches newlines too) · u — unicode support · x — extended (allows whitespace and comments in pattern)
[a-z] is faster than . because the engine doesn't need to try every character^http is far faster than just http because the engine only checks the start(a+)+ can cause exponential time on non-matching strings(?:...) when you don't need the captured value — slightly faster.*? instead of greedy .* when you want the shortest match