Covers: syntax, flags, character classes, quantifiers, groups, assertions, Unicode, APIs, performance, pitfalls, debugging, common patterns, and interview questions.
A RegExp is a pattern describing a set of strings. Two creation forms:
const r1 = /abc/i; // literal form, flags after closing slash
const r2 = new RegExp("a\\d+", "g"); // constructor form; note double-escaping in string
/…/ is parsed at script-compile time; RegExp() builds at runtime.
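The two creation forms can be compared directly. The following is a minimal sketch; the `unit` value and the `sized` pattern are hypothetical and only illustrate building a pattern at runtime.

```javascript
// Literal form: the pattern is fixed in the source text.
const literal = /a\d+/g;

// Constructor form: the pattern is a string, so each backslash is doubled.
const built = new RegExp("a\\d+", "g");

console.log(literal.source === built.source); // true
console.log(literal.flags === built.flags);   // true

// The constructor fits cases where part of the pattern is only known at runtime.
const unit = "px"; // hypothetical runtime value
const sized = new RegExp("^\\d+" + unit + "$");
console.log(sized.test("12px")); // true
console.log(sized.test("12em")); // false
```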
g - global: find all matches; affects lastIndex for exec() and test() behavior.
i - ignore case.
m - multiline: ^ and $ match at line boundaries.
s - dotAll: . matches newline.
u - unicode: enables full Unicode semantics (code points, \u{…}, \p{…}).
y - sticky: match starting exactly at lastIndex; does not search ahead.
d - indices (ES2022): match/exec return indices of captures via .indices property.
Examples:
/^\w+/m // multiline start-of-line word
/./s // dotAll: dot matches \n
/\p{Letter}/u // unicode property escape
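A short sketch of how several flags change matching in practice. The sample strings are made up for illustration, and the d flag assumes an ES2022-capable runtime.

```javascript
const text = "alpha\nbeta";

// m: ^ matches after each line break, so both line-initial words are found.
console.log(text.match(/^\w+/gm)); // ["alpha", "beta"]

// s: without it, . stops at the newline.
console.log(/alpha.beta/.test(text));  // false
console.log(/alpha.beta/s.test(text)); // true

// y: the match must begin exactly at lastIndex.
const sticky = /beta/y;
sticky.lastIndex = 0;
console.log(sticky.test(text)); // false ("beta" does not start at index 0)
sticky.lastIndex = 6;
console.log(sticky.test(text)); // true

// d: each capture's [start, end) offsets appear under .indices.
const withIndices = /(\d+)-(\d+)/d.exec("range 10-20");
console.log(withIndices.indices[1]); // [6, 8]
console.log(withIndices.indices[2]); // [9, 11]
```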
/\d+/ - no double escaping in a literal pattern.
new RegExp("\\d+", "g") - backslashes must be escaped because of JS string parsing.
[abc] - any of a, b, or c.
[A-Za-z0-9] - ranges.
[^a-z] - negation: any character not in a-z.
Predefined shorthands:
\d → digit [0-9] (ASCII only, even with the u flag)
\D → non-digit
\w → word char [A-Za-z0-9_] (ASCII only; to match letters from other scripts, use \p{Letter} with the u flag)
\W → non-word
\s → whitespace (space, tab, newline, etc.)
\S → non-whitespace
. matches any character except line terminators by default; with s it matches them too.
The u flag is required for Unicode code point features like \u{1F600} and \p{...}.
Property escapes: \p{Script=Hiragana}, \p{Letter}, \p{Emoji}. Use \P{...} for negation.
Example: match letters from all scripts:
/\p{Letter}+/u
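To make the difference between the ASCII shorthands and property escapes concrete, here is a small sketch. The sample string is an arbitrary illustration, written with escapes so the code points are unambiguous.

```javascript
// "naïve café 東京" followed by U+0663 (Arabic-Indic digit three)
const sample = "na\u00EFve caf\u00E9 \u6771\u4EAC \u0663";

// \w is ASCII-only, so accented and non-Latin letters split words or drop out.
console.log(sample.match(/\w+/g)); // ["na", "ve", "caf"]

// \p{Letter} with the u flag covers letters from every script.
console.log(sample.match(/\p{Letter}+/gu)); // ["naïve", "café", "東京"]

// \d stays [0-9] even with u; \p{Nd} matches decimal digits from any script.
console.log(/\d/u.test("\u0663"));     // false
console.log(/\p{Nd}/u.test("\u0663")); // true
```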
Greedy:
a* - 0 or more (greedy)
a+ - 1 or more
a? - 0 or 1
a{n} - exactly n
a{n,} - n or more
a{n,m} - between n and m
Lazy: add ? after a quantifier: a+?, a*?, a{2,4}?
Possessive quantifiers are not supported (a++ is invalid), and JS has no atomic groups; a lookahead plus backreference, (?=(a+))\1, emulates one.
( … ) - capturing group: stores the match in a numbered group, and in .groups if named.
(?: … ) - groups without capturing (useful for alternation or quantifier scoping).
(?<name> … ) - named capture; access via .groups.name.
\1, \2, or \k<name> (named backreference) - match the same text an earlier group captured.
Example:
const re = /^(?<area>\d{3})-(\d{3})-(\d{4})$/;
const m = re.exec("123-456-7890");
m.groups.area // "123"
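The quantifier and group behavior described above can be sketched as follows. The sample strings are illustrative, and the `quoted` and `atomic` patterns are hypothetical examples rather than canonical forms.

```javascript
const html = "<b>one</b><b>two</b>";

// Greedy: .* runs to the last possible closing tag.
console.log(html.match(/<b>.*<\/b>/)[0]);  // "<b>one</b><b>two</b>"
// Lazy: .*? stops at the first closing tag.
console.log(html.match(/<b>.*?<\/b>/)[0]); // "<b>one</b>"

// Named group plus \k<name> backreference: the same quote must close the string.
const quoted = /(?<q>["'])(?<body>.*?)\k<q>/;
console.log(quoted.exec(`say 'hi "there"'`).groups.body); // hi "there"

// Atomic-group emulation: the lookahead captures, the backreference consumes.
// Once the lookahead has matched, the engine does not backtrack into it.
const atomic = /^(?=(a+))\1a/;
console.log(/^a+a/.test("aaa")); // true (a+ gives one "a" back)
console.log(atomic.test("aaa")); // false (the captured run is all-or-nothing)
```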
Lookahead:
(?=… ) - assert following text matches.
(?!… ) - assert following text does not match.
Lookbehind (ES2018+):
(?<=… ) - assert preceding text matches.
(?<!… ) - assert preceding text does not match.
Examples:
/\d+(?=%)/ // digits followed by percent (but percent not consumed)
/(?<=\$)\d+/ // digits preceded by a dollar sign
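A few runnable cases for the four assertion types, as a sketch with made-up input strings. The thousands-separator pattern is a commonly used idiom, included here only as an illustration of zero-width matching.

```javascript
// Lookahead: digits followed by "%", with the sign left out of the match.
console.log("up 15% and 20 units".match(/\d+(?=%)/g)); // ["15"]

// Lookbehind: digits preceded by "$".
console.log("cost $42, qty 7".match(/(?<=\$)\d+/g)); // ["42"]

// Negative lookbehind: digits not preceded by "$" or another digit.
console.log("cost $42, qty 7".match(/(?<![$\d])\d+/g)); // ["7"]

// Combined assertions: insert thousands separators at zero-width positions.
const grouped = "1234567".replace(/\B(?=(\d{3})+(?!\d))/g, ",");
console.log(grouped); // "1,234,567"
```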
^ - start of input (or line with m).
$ - end of input (or line with m).
\b - word boundary (transition between \w and \W).
\B - not word boundary.
\A and \z are not supported in JS; use ^ and $ without the m flag.
RegExp.prototype.test(str) → boolean; with g/y it advances lastIndex.
RegExp.prototype.exec(str) → match array with captures, or null; repeated exec calls iterate when g/y is set and update lastIndex.
String methods using regex:
str.match(re) - if re has g, returns an array of the matched substrings (no captures); otherwise returns the same result as exec.
str.matchAll(re) - returns an iterator of full match objects (with groups); re must have the g flag, otherwise a TypeError is thrown.
str.replace(re, replacement|fn) - the replacement string can use $1, $<name>, $&, $`, $', or a function (match, p1, p2, …, offset, input, groups) => ….
str.search(re) - returns index of first match or -1.
str.split(re) - splits string by pattern; capturing groups appear in result.
Example: use function replacement to transform:
"a1b2".replace(/\d/g, d => String(Number(d) * 2)); // "a2b4"
lastIndex, g, y, and exec behavior
When a regex has g or y, it maintains lastIndex. exec() starts matching at lastIndex.
g searches from lastIndex forward for the next match; y requires the match to start exactly at lastIndex.
Reset lastIndex = 0 when reusing a regex across different input strings, or use a non-global regex for single checks.
Pitfall:
const re = /a/g;
re.test("a"); // true, lastIndex becomes 1
re.test("a"); // false: the search starts at lastIndex 1, past the end; lastIndex then resets to 0
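One place where lastIndex is useful rather than a trap is a tokenizer built on the sticky flag. The sketch below assumes a hypothetical token grammar (integers and a few operators); `tokenize` is an illustrative name, not a standard API.

```javascript
// The sticky flag (y) forces each token to start exactly where the previous
// one ended, so unexpected input is detected instead of being skipped.
function tokenize(input) {
  const token = /\s*(\d+|[-+*\/()])/y; // hypothetical token grammar
  const out = [];
  token.lastIndex = 0;
  while (token.lastIndex < input.length) {
    const start = token.lastIndex;
    const m = token.exec(input);
    if (m === null) {
      throw new SyntaxError(`Unexpected input at index ${start}`);
    }
    out.push(m[1]);
  }
  return out;
}

console.log(tokenize("12 + (3*4)")); // ["12", "+", "(", "3", "*", "4", ")"]

// Resetting lastIndex makes a shared g regex safe to reuse on new input.
const hasA = /a/g;
console.log(hasA.test("a")); // true
hasA.lastIndex = 0;
console.log(hasA.test("a")); // true again
```

Creating the regex inside the function is a deliberate choice in this sketch: each call gets its own lastIndex, so no state leaks between calls.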
Characters with special regex meaning must be escaped to match literally: . ^ $ * + ? ( ) [ ] \ { } | and, inside a literal, /
/\// matches /.
Use new RegExp("\\.") to match a literal dot.
To escape arbitrary text:
function escapeRE(s) { return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"); }
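A sketch of the escape helper in use when a pattern is built from user-supplied text. `countOccurrences` is a hypothetical helper introduced only for this example.

```javascript
function escapeRE(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: count literal occurrences of a user-supplied term.
function countOccurrences(haystack, term) {
  const re = new RegExp(escapeRE(term), "g");
  return [...haystack.matchAll(re)].length;
}

console.log(countOccurrences("1+1=2, 1+1+1=3", "1+1")); // 2

// Without escaping, + is read as a quantifier and the meaning changes.
console.log(new RegExp("1+1").test("11"));           // true
console.log(new RegExp(escapeRE("1+1")).test("11")); // false
```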
Catastrophic backtracking: nested quantifiers such as ^(a+)+$ on long non-matching input cause exponential-time behavior.
Avoid by:
Rewriting ambiguous alternation: /^(a|aa)+b/ on a long run of a's with no b backtracks heavily.
Using anchors (^, $) when possible.
Avoiding .* when you can use more precise classes (e.g., [^"]* instead of .* inside quoted parsing).
Using non-capturing groups (?:...) when capture is not required.
Combining \d/\w with u only if Unicode semantics are required.
Email (simple, pragmatic)
/^[^\s@]+@[^\s@]+\.[^\s@]+$/
URL (very simple)
/^(https?:\/\/)?([\w.-]+)\.([a-z]{2,})(\/\S*)?$/i
Extract all words (Unicode-aware)
const words = [...text.matchAll(/\p{L}[\p{L}\p{N}_']*/gu)];
Capture duplicate adjacent words (case-insensitive)
/\b(\w+)\s+\1\b/i
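As a usage sketch for the duplicate-word pattern, the g flag is added here so every repeat in the (made-up) input is handled in one pass.

```javascript
const dup = /\b(\w+)\s+\1\b/gi;

// Collapse each adjacent repeat to a single copy of the word.
const fixed = "This is is a test test.".replace(dup, "$1");
console.log(fixed); // "This is a test."

// The \b anchors keep partial overlaps such as "this island" from matching.
console.log(/\b(\w+)\s+\1\b/i.test("this island")); // false
```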
Validate hex color
/^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/
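The hex pattern's capture group can also drive normalization. This is a sketch; `normalizeHex` is a hypothetical helper, and expanding 3-digit colors by doubling each digit follows the usual CSS shorthand convention.

```javascript
const HEX = /^#([A-Fa-f0-9]{6}|[A-Fa-f0-9]{3})$/;

// Hypothetical helper: return a lowercase 6-digit color, or null if invalid.
function normalizeHex(color) {
  const m = HEX.exec(color);
  if (m === null) return null;
  // Expand shorthand by doubling each digit: "FA0" becomes "FFAA00".
  const digits = m[1].length === 3 ? m[1].replace(/./g, "$&$&") : m[1];
  return "#" + digits.toLowerCase();
}

console.log(normalizeHex("#FA0"));    // "#ffaa00"
console.log(normalizeHex("#1a2B3c")); // "#1a2b3c"
console.log(normalizeHex("#12345"));  // null
```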
Split on commas not inside quotes
const parts = str.split(/,(?=(?:[^"]*"[^"]*")*[^"]*$)/);
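A quick check of the split pattern on sample input. The lookahead only accepts a comma when the rest of the line contains an even number of quotes, so the sketch assumes quotes are balanced.

```javascript
const line = 'a,"b,c",d';
const fields = line.split(/,(?=(?:[^"]*"[^"]*")*[^"]*$)/);
console.log(fields); // ['a', '"b,c"', 'd']

// The surrounding quotes are kept; strip them in a second step if needed.
const bare = fields.map(f => f.replace(/^"|"$/g, ""));
console.log(bare); // ['a', 'b,c', 'd']
```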
Remove HTML tags (simple)
str.replace(/<[^>]*>/g, "");
Use console.log with RegExp.prototype.exec for captures:
const m = /(\d+)-(\w+)/.exec("12-abc");
console.log(m[1], m[2]);
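For patterns that match more than once, a small inspection helper can make debugging easier. `describeMatches` is a hypothetical name; the sketch copies the regex so the caller's lastIndex is not disturbed.

```javascript
// Hypothetical helper: list every match with its position and captures.
function describeMatches(re, str) {
  // Copy the regex with g added so the caller's lastIndex is left untouched.
  const flags = re.flags.includes("g") ? re.flags : re.flags + "g";
  const copy = new RegExp(re.source, flags);
  return [...str.matchAll(copy)].map(m => ({
    match: m[0],
    index: m.index,
    captures: m.slice(1),
    groups: m.groups ? { ...m.groups } : undefined,
  }));
}

console.table(describeMatches(/(\d+)-(\w+)/, "12-abc, 7-x"));
```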
Unescaped user input in the RegExp() constructor → security risk (ReDoS) and broken patterns.
Forgetting to reset .lastIndex.
Assuming str.match(re) returns the same shape for g vs non-g regex.
Expecting . to match newline without s.
Using .length for Unicode user-visible characters: use grapheme segmentation (Intl.Segmenter), or [...str] to iterate code points (still not grapheme clusters).
Intl.Segmenter for grapheme cluster-aware iteration (user-visible characters).
Match indices (d flag): get start/end indices for captures.
\p{…} for script/category matching.
Create:
/pattern/flags
new RegExp("pattern", "flags")
Test: re.test(str)
First match: re.exec(str) or str.match(re) (non-global)
All matches: str.matchAll(/.../g) or loop while (m = re.exec(str))
Replace with a function: str.replace(re, (m, g1, g2, offset, input, groups) => ...)
Escape literal text: s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")
Inside replacement string:
$& - matched substring
$1, $2, … - captures
$<name> - named capture
$` - portion before match
$' - portion after match
$$ - literal $
Example:
"John".replace(/(J)(ohn)/, "$2, $1"); // "ohn, J"
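Explain the difference between the g and y flags.
How does lastIndex work and what pitfalls does it cause with test() and exec()?
When would you use a non-capturing group (?: … )?
How do you match Unicode letters or user-visible characters? (u + \p{Letter} or Intl.Segmenter.)
Why is new RegExp(".") different from /./s sometimes? (String escaping & flags.)
How do replacement strings and functions work in .replace()?
Phone with optional country code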
/^(?:\+?(\d{1,3}))?[-. (]?(\d{3})[-. )]?(\d{3})[-. ]?(\d{4})$/
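(?: … )? - optional non-capturing wrapper; the country code digits inside it are captured as group 1.
(\d{3}) - area code captured.
CSV field split respecting quotes
/(?:^|,)(?:"([^"]*(?:""[^"]*)*)"|([^",]*))/g
A doubled quote ("") inside a quoted field is treated as an escaped quote.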