Negative lookbehind regular expression match

Jul. 16th, 2012 | 01:07 am

In the following negative lookbehind regexp,

  20120716-01:01:55 mengwong@cny2:~% perl -ple 's/(?<!a)b/a/g'

abb correctly becomes aba


But bbb wrongly becomes aaa.



20120716-01:02:17 mengwong@cny2:~% perl -v                  
This is perl 5, version 12, subversion 3 (v5.12.3) built for darwin-thread-multi-2level
(with 2 registered patches, see perl -V for more detail)

Comments {5}

(no subject)

from: anonymous
date: Jul. 15th, 2012 08:17 pm (UTC)

Doesn't seem wrong to me; you have an input string and an output string and the second input string doesn't contain any "a"s. Or is Perl supposed to replace and reevaluate after every match?


(no subject)

from: Jeremy Tavan
date: Jul. 15th, 2012 10:36 pm (UTC)

Why is it wrong for bbb to become aaa? Your regexp translates to "convert every b that is not preceded by an a into an a", and so bbb should indeed become aaa.



from: anonymous
date: Jul. 16th, 2012 01:38 am (UTC)

Why is that wrong? It's a zero-width negative lookbehind assertion, and the first b does not have an a before it. Or is it because, after the first substitution, the string is now abb, and because of the 'g' at the end the second b should not match, since the string has been changed by the first valid substitution? In that case it should be abb, right?



Re: zero-width

from: mengwong
date: Jul. 16th, 2012 05:10 am (UTC)

I agree with this position.


Lookbehind matches b with anything before it as long as it's not a

from: Andre Jay Marcelo-Tanner
date: Jul. 16th, 2012 02:03 am (UTC)

It converts bbb to aaa correctly, because at each step it checks that the previous character is not a.

If that check passes, it goes ahead and looks for b; if b matches, it replaces it with a.

In the first example, the leading 'ab' was left unchanged because the regex failed on both characters: first when 'b' did not match the first character 'a', and second when 'a' was the previous character.

If you just want e.g. abbbbb to become ababab,

then you might try s/(?<=(ab))b/a/g
