Negative lookbehind regular expression match

Jul. 16th, 2012 | 01:07 am

In the following negative lookbehind regexp,

  20120716-01:01:55 mengwong@cny2:~% perl -ple 's/(?<!a)b/a/g'

abb correctly becomes aba

  abb
  aba

But bbb wrongly becomes aaa.

  bbb
  aaa

Discuss.

20120716-01:02:17 mengwong@cny2:~% perl -v
This is perl 5, version 12, subversion 3 (v5.12.3) built for darwin-thread-multi-2level
(with 2 registered patches, see perl -V for more detail)

Comments {5}

(no subject)

from: anonymous
date: Jul. 15th, 2012 08:17 pm (UTC)

Doesn't seem wrong to me; you have an input string and an output string and the second input string doesn't contain any "a"s. Or is Perl supposed to replace and reevaluate after every match?

(no subject)

from: Jeremy Tavan
date: Jul. 15th, 2012 10:36 pm (UTC)

Why is it wrong for bbb to become aaa? Your regexp translates to "convert every b that is not preceded by an a into an a", and so bbb should indeed become aaa.

zero-width

from: anonymous
date: Jul. 16th, 2012 01:38 am (UTC)

why is that wrong? it's a zero-width negative look-behind assertion and the first b does not have an a before it... ? or is it because after the first substitution the string is now abb and because of the 'g' at the end, the second b should not match because the string has been changed by the first valid substitution. in that case it should be abb, right?

Re: zero-width

from: mengwong
date: Jul. 16th, 2012 05:10 am (UTC)

i agree with this position.

lookbehind matches b with anything before it as long as it's not a

from: Andre Jay Marcelo-Tanner
date: Jul. 16th, 2012 02:03 am (UTC)

it converts bbb to aaa correctly
because at each step it checks the previous character for !a

if that check passes then it goes ahead and looks for b, and if b matches then it replaces it with a.

in the first example 'ab' was not replaced with 'aa' because the regex failed on both chars: first when 'b' did not match the first char 'a', and second when 'a' was the previous char.

if you just want, e.g., abbbbb to become ababab

then you might try s/(?<=(ab))b/a/g