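RegExp Assertions

Introduction #

In JavaScript regular expressions, ES5 supports only lookahead and negative lookahead assertions. It does not support lookbehind or negative lookbehind assertions.

Lookahead Assertions #

Basics #

A lookahead assertion matches x only when it is immediately followed by y. It is written as /x(?=y)/.

The part inside the lookahead parentheses, such as (?=%), is not included in the returned match.

Example #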

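Match only digits that are immediately followed by a percent sign. For contrast, regex2 uses the negative form; it returns "10" because \d+ gives back one digit so that the next character is 0 rather than %.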

((log) => {
    const regex1 = /\d+(?=%)/;
    const regex2 = /\d+(?!%)/;
    const string = '100% of US presidents have been male';

    log(regex1.exec(string)); // ["100", index: 0, input: "100% of US presidents have been male", groups: undefined]
    log(regex2.exec(string)); // ["10", index: 0, input: "100% of US presidents have been male", groups: undefined]
})(console.log)
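
As a further illustration, the sketch below collects every percentage value in a string by adding the g flag to the same lookahead pattern. The function name extractPercentages and the sample text are made up for this example.

```javascript
// Hypothetical helper: returns all numbers that are immediately followed by '%'.
function extractPercentages(text) {
    // The lookahead (?=%) checks for '%' without consuming it,
    // so each match contains only the digits.
    const matches = text.match(/\d+(?=%)/g);
    return matches === null ? [] : matches.map(Number);
}

console.log(extractPercentages('CPU 85%, memory 42%, 3 cores')); // [85, 42]
```

The trailing 3 is skipped because it is followed by a space, not a percent sign.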

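Negative Lookahead Assertions #

Basics #

A negative lookahead assertion matches x only when it is not immediately followed by y. It is written as /x(?!y)/.

The part inside the negative lookahead parentheses, such as (?!%), is not included in the returned match.

Example #

Match only digits that are not immediately followed by a percent sign.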

((log) => {
    const regex1 = /\d+(?=%)/;
    const regex2 = /\d+(?!%)/;
    const string = 'that’s all 44 of them';

    log(regex1.exec(string)); // null
    log(regex2.exec(string)); // ["44", index: 11, input: "that’s all 44 of them", groups: undefined]
})(console.log)
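
One pitfall shows up in the '100%' case from the lookahead example, where /\d+(?!%)/ returns "10". A possible way around it, sketched below, is to also forbid a following digit so that \d+ cannot stop in the middle of a number. The helper name numbersWithoutPercent is an assumption for illustration.

```javascript
// Hypothetical helper: numbers that are NOT percentages.
function numbersWithoutPercent(text) {
    // (?![\d%]) rejects a match that is followed by another digit or by '%',
    // which prevents partial matches such as "10" inside "100%".
    return text.match(/\d+(?![\d%])/g) || [];
}

console.log(numbersWithoutPercent('100% of 44 items')); // ["44"]
```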

Lookbehind Assertions #

Basics #

ES2018 introduced lookbehind assertions. V8 implemented them experimentally in version 4.9 (behind a flag), and Chrome 62 supports them by default.

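A lookbehind assertion is the mirror image of a lookahead: x matches only when it is immediately preceded by y. It is written as /(?<=y)x/.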
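
Because older engines reject the lookbehind syntax outright, it can be useful to test for support before relying on it. The following is a minimal sketch of one way to do that; the function name supportsLookbehind is hypothetical.

```javascript
// Hypothetical helper: reports whether the current engine can parse a lookbehind.
function supportsLookbehind() {
    try {
        // Build the pattern from a string: a regex literal with unsupported
        // syntax would fail when the script is parsed, before try/catch runs.
        new RegExp('(?<=a)b');
        return true;
    } catch (err) {
        return false; // engines without lookbehind throw a SyntaxError here
    }
}

console.log(supportsLookbehind());
```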

Examples #

Matching dollar amounts #

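Match only digits that come right after a dollar sign. For contrast, regex2 uses the negative form; it returns "00" because the 1 is preceded by $, so the match can only start at the next digit.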

((log) => {
    const regex1 = /(?<=\$)\d+/;
    const regex2 = /(?<!\$)\d+/;
    const string = 'Benjamin Franklin is on the $100 bill';

    log(regex1.exec(string)); // ["100", index: 29, input: "Benjamin Franklin is on the $100 bill", groups: undefined]
    log(regex2.exec(string)); // ["00", index: 30, input: "Benjamin Franklin is on the $100 bill", groups: undefined]
})(console.log)
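
The same lookbehind can be combined with the g flag to pull out every dollar amount in a string. The sketch below sums them; the helper name sumDollarAmounts, the optional decimal part, and the sample sentence are assumptions added for illustration.

```javascript
// Hypothetical helper: sums every amount written as $<number> in the text.
function sumDollarAmounts(text) {
    // (?<=\$) requires a '$' right before the digits but keeps it out of the match.
    const amounts = text.match(/(?<=\$)\d+(?:\.\d+)?/g) || [];
    return amounts.reduce((total, amount) => total + Number(amount), 0);
}

console.log(sumDollarAmounts('$10.50 for lunch, $4 for coffee, 3 apples')); // 14.5
```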

String replacement #

Use a lookbehind assertion in a string replacement.

((log) => {
    const RE_DOLLAR_PREFIX_1 = /(?<=\$)foo/g;
    const RE_DOLLAR_PREFIX_2 = /(?<!\$)foo/g;
    const REPLACE_STR = 'bar';
    const string = '$foo %foo foo';

    // Only a foo that follows a dollar sign is replaced
    log(string.replace(RE_DOLLAR_PREFIX_1, REPLACE_STR)); // $bar %foo foo
    // Only a foo that does not follow a dollar sign is replaced
    log(string.replace(RE_DOLLAR_PREFIX_2, REPLACE_STR)); // $foo %bar bar
})(console.log)
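
For comparison, on engines without lookbehind the positive case is commonly approximated by capturing the prefix and writing it back in the replacement. The sketch below shows both variants producing the same output for this input; the function names are hypothetical.

```javascript
// Lookbehind version: the '$' is checked but never becomes part of the match.
function replaceWithLookbehind(text) {
    return text.replace(/(?<=\$)foo/g, 'bar');
}

// Capture-group version: the '$' is matched, then restored via $1.
function replaceWithCapture(text) {
    return text.replace(/(\$)foo/g, '$1bar');
}

console.log(replaceWithLookbehind('$foo %foo foo')); // $bar %foo foo
console.log(replaceWithCapture('$foo %foo foo'));    // $bar %foo foo
```

The capture-group version consumes the prefix, so the two approaches are not interchangeable in every case, for example when matches would need to overlap.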

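Tips #

Group matching order #

The lookbehind part of /(?<=y)x/ is matched backwards: starting from the position where x begins, the engine walks to the left to match y.

This right-to-left order is the opposite of every other regular expression operation and leads to some unexpected behavior.

First, capture groups inside a lookbehind give different results from the same groups in an ordinary pattern.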

((log) => {
    const regex1 = /(?<=(\d+)(\d+))$/;
    const regex2 = /^(\d+)(\d+)$/;
    const string = '1053';

    log(regex1.exec(string)); // ["", "1", "053", index: 4, input: "1053", groups: undefined]
    log(regex2.exec(string)); // ["1053", "105", "3", index: 0, input: "1053", groups: undefined]
})(console.log)

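The code above captures two groups.

Without the lookbehind, matching runs from left to right: the first group goes first and greedily takes as much as it can, leaving only one character for the second group. The captures are 105 and 3.

Inside the lookbehind, matching runs from right to left: the second group goes first and greedily takes as much as it can, leaving only one character for the first group. The captures are 1 and 053.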
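
To make the right-to-left order more tangible, the sketch below changes the second group to a lazy quantifier. Since that group is evaluated first inside the lookbehind, it now stops after one digit and the first group takes the rest. The helper name captureDigits is hypothetical.

```javascript
// Hypothetical helper: returns the two captured groups, or null if no match.
function captureDigits(regex, text) {
    const match = regex.exec(text);
    return match ? [match[1], match[2]] : null;
}

// Second group greedy: it runs first (right to left) and grabs "053".
console.log(captureDigits(/(?<=(\d+)(\d+))$/, '1053'));  // ["1", "053"]
// Second group lazy: it runs first but stops at "3", leaving "105" for group 1.
console.log(captureDigits(/(?<=(\d+)(\d+?))$/, '1053')); // ["105", "3"]
```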

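Backreference placement #

Backreferences inside a lookbehind also follow the reverse of the usual order: the backreference must be placed before the group it refers to.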

((log) => {
    const regex1 = /(?<=(o)d\1)r/;
    const regex2 = /(?<=\1d(o))r/;
    const string = 'hodor';

    log(regex1.exec(string)); // null
    log(regex2.exec(string)); // ["r", "o", index: 4, input: "hodor", groups: undefined]
})(console.log)

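In the code above, if the backreference (\1) comes after its group inside the lookbehind, there is no match. It must come before the group.

The reason is that the engine scans the string from left to right to find a candidate position, then evaluates the lookbehind from right to left. The group on the right is therefore captured first, and only then can the backreference to its left be resolved.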
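
The same idea can be generalized from the literal o to any lowercase letter. The following sketch keeps the backreference on the left of its group, as required inside a lookbehind; the helper name findMirroredR and the extra sample words are assumptions for illustration.

```javascript
// Hypothetical helper: finds an 'r' preceded by "<letter>d<same letter>"
// and returns that letter, or null if there is no such 'r'.
function findMirroredR(text) {
    // Read the lookbehind right to left: ([a-z]) captures the letter next to 'r',
    // then 'd' must match, then \1 must repeat the captured letter.
    const match = /(?<=\1d([a-z]))r/.exec(text);
    return match ? match[1] : null;
}

console.log(findMirroredR('hodor')); // "o"
console.log(findMirroredR('hadar')); // "a"
console.log(findMirroredR('hodar')); // null
```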

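Negative Lookbehind Assertions #

Basics #

A negative lookbehind assertion is the mirror image of a negative lookahead: x matches only when it is not immediately preceded by y. It is written as /(?<!y)x/.

Example #

Match only digits that do not come right after a dollar sign.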

((log) => {
    const regex1 = /(?<=\$)\d+/;
    const regex2 = /(?<!\$)\d+/;
    const string = 'it’s is worth about €90';

    log(regex1.exec(string)); // null
    log(regex2.exec(string)); // ["90", index: 21, input: "it’s is worth about €90", groups: undefined]
})(console.log)
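
As the $100 case in the lookbehind example shows, /(?<!\$)\d+/ can still match the tail of a dollar amount ("00"). One possible refinement, sketched below, also rejects a preceding digit so that a match cannot start in the middle of a number. The helper name nonDollarNumbers is hypothetical.

```javascript
// Hypothetical helper: numbers that are not dollar amounts.
function nonDollarNumbers(text) {
    // (?<![$\d]) rejects a start position that follows '$' or another digit,
    // which prevents partial matches such as "00" inside "$100".
    return text.match(/(?<![$\d])\d+/g) || [];
}

console.log(nonDollarNumbers('$100 or €90')); // ["90"]
```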
