bash using awk and amp to write between the passing variable

blumenwesen
Posts: 37
Joined: Sun Apr 10, 2022 10:02 pm

bash using awk and amp to write between the passing variable

Post by blumenwesen »

Hello, in example one I match with the general classes [[:punct:]] and [[:alpha:]] and use the & (ampersand) in the replacement. How can I write characters between the parts of the matched text that & stands for, so that I get the result of example two?
Can the parts of the match be addressed separately, for example with something like "0[&]{1}1[&]{2}"?

Code: Select all

echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&1"); print $0}' # a0:b10:c1d error
echo "a:b:cd" | awk '{gsub(/:b:c/, "0:1c0:1c"); print $0}' # a0:1c0:1cd correct
misko_2083
Posts: 196
Joined: Wed Dec 09, 2020 11:59 pm
Has thanked: 10 times
Been thanked: 20 times

Re: bash using awk and amp to write between the passing variable

Post by misko_2083 »

Matching a punctuation character followed by an alphabetic character:

Code: Select all

echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&1"); print $0}' # a0:b10:c1d error

Matching punctuation characters, which are: [][!"#$%&'()*+,./:;<=>?@\\^_`{|}~-]

Code: Select all

echo "a:b:cd" | awk '{gsub(/[[:punct:]]/, "0&1"); print $0}' # a0:1b0:1cd
echo "a@b+cd" | awk '{gsub(/[[:punct:]]/, "0&1"); print $0}'  # a0@1b0+1cd correct?

Matching ':'

Code: Select all

echo "a:b:cd" | awk '{gsub(/:/, "0&1"); print $0}' # a0:1b0:1cd

Do you want to replace ":" with "&"?

Code: Select all

echo "a:b:cd" | awk '{gsub(/:/, "0\\&1"); print $0}' # a0&1b0&1cd
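
To put the behaviours of & in a gsub() replacement string side by side, here is a small sketch. The variable names whole, twice and literal are only for illustration. A bare & inserts the whole match, it may appear more than once, and \\& inserts a literal ampersand.

```shell
# A bare & stands for the complete matched text (here ":b" and ":c").
whole=$(echo "a:b:cd" | awk '{gsub(/:[a-z]/, "0&1"); print}')
# & can be used several times; each use is again the complete match.
twice=$(echo "a:b:cd" | awk '{gsub(/:[a-z]/, "<&|&>"); print}')
# \\& in the awk string is a literal ampersand, not the match.
literal=$(echo "a:b:cd" | awk '{gsub(/:/, "0\\&1"); print}')

echo "$whole"    # a0:b10:c1d
echo "$twice"    # a<:b|:b><:c|:c>d
echo "$literal"  # a0&1b0&1cd
```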


Burunduk
Posts: 257
Joined: Thu Jun 16, 2022 6:16 pm
Has thanked: 7 times
Been thanked: 127 times

Re: bash using awk and amp to write between the passing variable

Post by Burunduk »

blumenwesen wrote: Sun Jul 17, 2022 3:09 pm

Code: Select all

echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&1"); print $0}' # a0:b10:c1d error
echo "a:b:cd" | awk '{gsub(/:b:c/, "0:1c0:1c"); print $0}' # a0:1c0:1cd correct

Is it supposed to be "a0:1b0:1cd correct"?

This matches a [[:punct:]] character right before a word character, [[:alnum:]_] (\< in awk and \b in sed are GNU extensions):

Code: Select all

echo "a:b:cd" | awk '{gsub(/[[:punct:]]\</,"0&1");print}' # a0:1b0:1cd
echo "a:b:cd" | sed 's/[[:punct:]]\b/0&1/g'               # a0:1b0:1cd

If it needs to be [[:alpha:]] or if you do want to replace the [[:alpha:]] with 'c' as in your example, then:

Code: Select all

echo "a:b:cd" | sed -r 's/([[:punct:]])([[:alpha:]])/0\11\2/g'                    # a0:1b0:1cd
echo "a:b:cd" | awk '{print gensub(/([[:punct:]])([[:alpha:]])/,"0\\11\\2","g")}' # a0:1b0:1cd

echo "a:b:cd" | sed -r 's/([[:punct:]])[[:alpha:]]/0\11c/g'                       # a0:1c0:1cd
echo "a:b:cd" | awk '{print gensub(/([[:punct:]])[[:alpha:]]/,"0\\11c","g")}'     # a0:1c0:1cd

Backreferences \1 and \2 are used instead of &. It's not possible to take only a part of &; it always represents the whole match.
Awk's gsub() doesn't support backreferences, which is why the awk examples use gensub(), a GNU awk extension.
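
If gensub() is not available, the same result can probably be had in plain awk with match(), which sets RSTART and RLENGTH, and substr(). This is a minimal sketch, assuming every match is exactly one punctuation character plus one letter. The function name wrap_punct is hypothetical and not from this thread.

```shell
# Hypothetical helper: writes 0 before and 1 after each punctuation
# character that is followed by a letter, without using gensub().
wrap_punct() {
  awk '{
    out = ""
    rest = $0
    # match() returns 0 when nothing matches, which ends the loop
    while (match(rest, /[[:punct:]][[:alpha:]]/)) {
      p = substr(rest, RSTART, 1)        # the punctuation character
      l = substr(rest, RSTART + 1, 1)    # the letter after it
      out = out substr(rest, 1, RSTART - 1) "0" p "1" l
      rest = substr(rest, RSTART + RLENGTH)
    }
    print out rest
  }'
}

out=$(echo "a:b:cd" | wrap_punct)
echo "$out"   # a0:1b0:1cd
```

The two substr() calls play the role of \1 and \2, so the parts of the match can be placed anywhere in the output.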

blumenwesen
Posts: 37
Joined: Sun Apr 10, 2022 10:02 pm

Re: bash using awk and amp to write between the passing variable

Post by blumenwesen »

I'm sorry, I made a mistake. I meant that the code should use gsub with parameters such as "()[]{}&$_" that split the passed variable, so that these characters can be written in between: 0 + ":" + 1 + "b" + 0 + ":" + 1 + "c".

# example

Code: Select all

echo "a:b:cd" | awk -F "" '{print $1"0"$2"1"$3"0"$4"1"$5}'
echo "a:b:cd" | awk '{gsub(/:b:c/, "0:1b0:1c"); print $0}'
echo "a:b:cd" | awk '{split($0, a, ""); print a[1]"0"a[2]"1"a[3]"0"a[4]"1"a[5]}'

# example gsub passing parameters

Code: Select all

echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&{1}1&{2}"); print $0}'
echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&{1}1&{2}0&{3}1&{4}"); print $0}'

[EDIT]

Thanks, that was the perfect example.

Code: Select all

echo "a:b:cd" | awk '{a=gensub(/([[:punct:]])([[:alpha:]])/, "0\\11\\2", "g"); print substr(a, 1, length(a)-1)}'
blumenwesen
Posts: 37
Joined: Sun Apr 10, 2022 10:02 pm

Re: bash using awk and amp to write between the passing variable

Post by blumenwesen »

How do I have to change the script if I only want to replace the spaces between digits with "+"? The desired result is:
results="02:43 (03:36)945+7 :0+1+07:37 - 08:59 (01:21)"

Code: Select all

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{a=gensub(/([[:digit:]]) ([[:digit:]])/, "+\\1+1\\1", "g"); print a}'
MochiMoppel
Posts: 1246
Joined: Mon Jun 15, 2020 6:25 am
Location: Japan
Has thanked: 22 times
Been thanked: 446 times

Re: bash using awk and amp to write between the passing variable

Post by MochiMoppel »

blumenwesen wrote: Wed Jul 20, 2022 4:48 pm

how do I have to change the script if I only want to replace the space between the numbers?

In principle this should do it

Code: Select all

# echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{a=gensub(/([[:digit:]]) ([[:digit:]])/, "\\1+\\2", "g"); print a}'
02:43 (03:36)945+7 :0+1 07:37 - 08:59 (01:21)

However, the problem lies in this part of the source string:
02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)
In "0 1 07" the substrings "0 1" and "1 0" both match, but they overlap (they share the "1"), and awk replaces only the first. I don't know if awk can repeat the pattern matching until all matches are replaced.
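
The overlap can be made visible with the return value of gsub(), which is the number of substitutions made. In this small sketch each match is replaced with itself via &, so the text stays unchanged and only the count matters. The variable name count is made up.

```shell
# "0 1 0" contains two overlapping candidates: "0 1" and "1 0".
# After "0 1" is consumed, the search resumes at " 0", which cannot match.
count=$(echo "0 1 0" | awk '{n = gsub(/[0-9] [0-9]/, "&"); print n}')
echo "$count"   # 1
```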

Possible (and simpler) with sed:

Code: Select all

# echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | sed -r ':x; s/([[:digit:]]) ([[:digit:]])/\1+\2/g ;t x'
02:43 (03:36)945+7 :0+1+07:37 - 08:59 (01:21)

BTW: I don't see the point in using [[:digit:]]. I find [0-9] easier to type and easier to read.

Burunduk
Posts: 257
Joined: Thu Jun 16, 2022 6:16 pm
Has thanked: 7 times
Been thanked: 127 times

Re: bash using awk and amp to write between the passing variable

Post by Burunduk »

You need something like this:

Code: Select all

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | perl -pe 's/(?<=[0-9]) (?=[0-9])/+/g'

Neither awk nor sed supports lookaround assertions. The sed loop is probably the simplest solution.
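
For completeness, the lookaround can be imitated in awk without gensub() by consuming only up to the space and resuming the search at the right-hand digit. This is a sketch using match() and substr(). The function name plus_between_digits is hypothetical.

```shell
# Hypothetical helper: replaces a single space between two digits with "+",
# including overlapping cases such as "0 1 07", in one pass over the line.
plus_between_digits() {
  awk '{
    out = ""
    rest = $0
    while (match(rest, /[0-9] [0-9]/)) {
      # keep everything up to and including the left digit, then add "+"
      out = out substr(rest, 1, RSTART) "+"
      # skip only the space, so the right digit can start the next match
      rest = substr(rest, RSTART + 2)
    }
    print out rest
  }'
}

out=$(echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | plus_between_digits)
echo "$out"   # 02:43 (03:36)945+7 :0+1+07:37 - 08:59 (01:21)
```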

blumenwesen
Posts: 37
Joined: Sun Apr 10, 2022 10:02 pm

Re: bash using awk and amp to write between the passing variable

Post by blumenwesen »

ok thank you

MochiMoppel
Posts: 1246
Joined: Mon Jun 15, 2020 6:25 am
Location: Japan
Has thanked: 22 times
Been thanked: 446 times

Re: bash using awk and amp to write between the passing variable

Post by MochiMoppel »

If it has to be awk, then this should work:

Code: Select all

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{ while(/[0-9] [0-9]/){a=gensub(/(.*[0-9]) ([0-9].*)/,"\\1+\\2", "g");$0=a} print}'

At least let's put this into a readable format:

Code: Select all

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{
  while (/[0-9] [0-9]/)
  {
    a = gensub(/(.*[0-9]) ([0-9].*)/, "\\1+\\2", "g")
    $0 = a
  }
  print  # prints $0, so lines without any match are still printed
}'



[EDIT] Another way would be to forget about awk, sed and perl and do it with pure bash. Faster and doesn't even require regex:

Code: Select all

t='02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)'
while [[ $t = *[0-9]\ [0-9]* ]];do a=${t%[0-9] [0-9]*}; b=${t#"$a"}; b=${b/ /+}; t=$a$b; done
echo "$t"
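
The same idea also seems to work with POSIX parameter expansion only, so it is not tied to bash. This is a sketch with the hypothetical function name join_digits. The "$a" is quoted so that characters such as [ or * in the text are taken literally and not as a pattern.

```shell
# Hypothetical helper: joins digits separated by one space with "+",
# working from the last occurrence towards the first.
join_digits() {
  t=$1
  while :; do
    case $t in
      *[0-9]\ [0-9]*) ;;   # still a "digit space digit" left
      *) break ;;
    esac
    a=${t%[0-9]\ [0-9]*}   # text before the last "digit space digit"
    b=${t#"$a"}            # the remainder, starting with that digit
    t=$a${b%% *}+${b#* }   # swap the first space of the remainder for "+"
  done
  printf '%s\n' "$t"
}

out=$(join_digits '02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)')
echo "$out"   # 02:43 (03:36)945+7 :0+1+07:37 - 08:59 (01:21)
```

Note that without local (which POSIX sh does not define) the variables t, a and b are global.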

Last edited by MochiMoppel on Sat Jul 23, 2022 1:08 am, edited 1 time in total.
Burunduk
Posts: 257
Joined: Thu Jun 16, 2022 6:16 pm
Has thanked: 7 times
Been thanked: 127 times

Re: bash using awk and amp to write between the passing variable

Post by Burunduk »

MochiMoppel wrote: Thu Jul 21, 2022 8:25 am

Code: Select all

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{ while(/[0-9] [0-9]/){a=gensub(/(.*[0-9]) ([0-9].*)/,"\\1+\\2", "g");$0=a} print}'

I'm trying to understand what those greedy .* are for. The pattern matches the whole string every time, replacing only 1 space per iteration. Your sed example works without them. Another question: is it possible to construct a string that would require more than 2 iterations?

Code: Select all

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{while(/[0-9] [0-9]/){$0=gensub(/([0-9]) ([0-9])/, "\\1+\\2", "g")} print}'
echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{for(i=2;i;i--){$0=gensub(/([0-9]) ([0-9])/, "\\1+\\2", "g")} print}'
MochiMoppel
Posts: 1246
Joined: Mon Jun 15, 2020 6:25 am
Location: Japan
Has thanked: 22 times
Been thanked: 446 times

Re: bash using awk and amp to write between the passing variable

Post by MochiMoppel »

Burunduk wrote: Thu Jul 21, 2022 4:17 pm

I'm trying to understand what those greedy .* are for. The pattern matches the whole string every time, replacing only 1 space per iteration. Your sed example works without them.

...and so would awk. I thought that awk needed them and didn't test :oops:.

Another question, is it possible to construct a string that would require more than 2 iterations?

You mean for blumenwesen's pattern? I dare to say: No, not possible.
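
That two passes are enough even for a long chain of single digits can be checked quickly. This is a sketch with a made-up test string, assuming a sed that accepts -E. It counts how many global passes are needed until nothing matches any more.

```shell
s="1 2 3 4 5 6 7"   # made-up worst case: every space sits between digits
passes=0
while printf '%s\n' "$s" | grep -q '[0-9] [0-9]'; do
  # one global pass; overlapping matches are left for the next pass
  s=$(printf '%s\n' "$s" | sed -E 's/([0-9]) ([0-9])/\1+\2/g')
  passes=$((passes + 1))
done
echo "$passes passes: $s"   # 2 passes: 1+2+3+4+5+6+7
```

A match skipped in the first pass was skipped because its left digit had just been consumed by the previous match. The search then resumes right behind that digit, so no two skipped matches can overlap each other and the second pass catches them all.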
