Re: BSD awk bug ?

From: Ngie Cooper (yaneurabeya)
Date: Tue, 15 Aug 2017 23:28:50 -0700
> On Aug 15, 2017, at 20:15, KIRIYAMA Kazuhiko <kiri_at_kx.openedu.org> wrote:
> 
> At Wed, 16 Aug 2017 10:36:36 +0900,
> Tomoya Tabuchi wrote:
>> 
>> On Wed, Aug 16, 2017 at 10:14:46AM +0900, KIRIYAMA Kazuhiko wrote:
>>> admin_at_tbedfpc:~/tmp % ll
>>> total 12
>>> -rw-r--r--  1 admin  admin  235 Aug 16 10:01 regex-1.sh
>>> -rw-r--r--  1 admin  admin  236 Aug 16 10:01 regex-2.sh
>>> -rw-r--r--  1 admin  admin  260 Aug 16 10:01 regex.sh
>>> admin_at_tbedfpc:~/tmp % cat regex.sh
>>> #!/bin/sh
>>> 
>>> data='1 2 3 4 5 6
>>> 1 2 3 4 5
>>> 1 2 3 4 5 6
>>> 1 2 3 4 5 6
>>> 1 2 3 4
>>> 1 2 3'
>>> 
>>> IFS=$'\n'
>>> for datum in $data; do
>>>    if echo "$datum" | egrep -q  '^([^[:space:]]+[[:space:]]+){5}'; then
>>>        echo "$datum"
>>>    else
>>>        echo "Not 6 components! : \"$datum\""
>>>    fi
>>> done
>>> admin_at_tbedfpc:~/tmp % sh ./regex.sh
>>> 1 2 3 4 5 6
>>> Not 6 components! : "1 2 3 4 5"
>>> 1 2 3 4 5 6
>>> 1 2 3 4 5 6
>>> Not 6 components! : "1 2 3 4"
>>> Not 6 components! : "1 2 3"
>>> admin_at_tbedfpc:~/tmp % cat regex-1.sh
>>> #!/bin/sh
>>> 
>>> _f_awk='
>>> {
>>>        if ($0 ~ /^([^[:space:]]+[[:space:]]+){5}/) {
>>>                print $0
>>>        } else {
>>>                print "Not 6 components! : \"" $0 "\""
>>>        }
>>> }'
>>> 
>>> data='1 2 3 4 5 6
>>> 1 2 3 4 5
>>> 1 2 3 4 5 6
>>> 1 2 3 4 5 6
>>> 1 2 3 4
>>> 1 2 3'
>>> 
>>> echo "$data" | awk "$_f_awk"
>>> admin_at_tbedfpc:~/tmp % sh ./regex-1.sh
>>> Not 6 components! : "1 2 3 4 5 6"
>>> Not 6 components! : "1 2 3 4 5"
>>> Not 6 components! : "1 2 3 4 5 6"
>>> Not 6 components! : "1 2 3 4 5 6"
>>> Not 6 components! : "1 2 3 4"
>>> Not 6 components! : "1 2 3"
>>> admin_at_tbedfpc:~/tmp % cat regex-2.sh
>>> #!/bin/sh
>>> 
>>> _f_awk='
>>> {
>>>        if ($0 ~ /^([^[:space:]]+[[:space:]]+){5}/) {
>>>                print $0
>>>        } else {
>>>                print "Not 6 components! : \"" $0 "\""
>>>        }
>>> }'
>>> 
>>> data='1 2 3 4 5 6
>>> 1 2 3 4 5
>>> 1 2 3 4 5 6
>>> 1 2 3 4 5 6
>>> 1 2 3 4
>>> 1 2 3'
>>> 
>>> echo "$data" | gawk "$_f_awk"
>>> admin_at_tbedfpc:~/tmp % sh ./regex-2.sh
>>> 1 2 3 4 5 6
>>> Not 6 components! : "1 2 3 4 5"
>>> 1 2 3 4 5 6
>>> 1 2 3 4 5 6
>>> Not 6 components! : "1 2 3 4"
>>> Not 6 components! : "1 2 3"
>>> admin_at_tbedfpc:~/tmp % uname -a
>>> FreeBSD tbedfpc 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r321597: Thu Jul 27 12:30:57 UTC 2017     root_at_tbedfc:/usr/obj/usr/src/sys/GENERIC  amd64
>>> admin_at_tbedfpc:~/tmp % pkg info -aI|grep gawk
>>> gawk-4.1.4_1                   GNU version of Awk
>>> admin_at_tbedfpc:~/tmp %
>>> 
>>> 
>>> Is this a bug in BSD awk (/usr/bin/awk)?
>> 
>> Hello Kiriyama-san,
>> 
>> The man page awk(1) says that {m,n} matching is not supported. The "{5}"
>> part matches the literal sequence of characters it's made out of, I suppose.
> 
> Oops. I missed the "STANDARDS" section. Thanks for pointing that out.
> 
> # But since it says up front "awk supports extended regular
> # expressions (EREs).  See re_format(7) for more information
> # on regular expressions.", I'd like it to coincide with the
> # re_format(7) spec.

Hello Kiriyama-san,
	I asked this same question a while back and was told that the {n} form didn’t work with nawk. I’ll have to dig up the exact post if it’s somewhere public.
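
	One possible workaround, as a minimal sketch rather than a verified fix: since the {5} interval is taken literally by /usr/bin/awk here, the test can be rewritten to count fields with NF instead of using a regex. The function name check_components is hypothetical and only for illustration. Note that this is not exactly equivalent to the original pattern: NF ignores leading whitespace, and it does not require whitespace after the fifth field.

```shell
#!/bin/sh
# Sketch only: count whitespace-separated fields with NF instead of
# relying on the {5} interval expression.
# check_components is a hypothetical helper name, not from the thread.
check_components() {
    awk '{
        if (NF >= 6) {
            print $0
        } else {
            print "Not 6 components! : \"" $0 "\""
        }
    }'
}

data='1 2 3 4 5 6
1 2 3 4 5
1 2 3 4 5 6
1 2 3 4 5 6
1 2 3 4
1 2 3'

printf '%s\n' "$data" | check_components
```

	On the sample data from regex.sh this should flag the same three short lines that the egrep and gawk versions flag.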
Cheers,
-Ngie

Received on Wed Aug 16 2017 - 04:28:53 UTC
