bash using awk and amp to write between the passing variable

Posted: Sun Jul 17, 2022 3:09 pm
by blumenwesen

Hello, I would like to know how, in awk, the first example below (which uses the general classes punct and alpha) can use the & character so that text is written between the parts of the matched string, giving the result of the second example.
Can the match passed by & be addressed in parts, for example with "0[&]{1}1[&]{2}"?

Code:

echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&1"); print $0}' # a0:b10:c1d error
echo "a:b:cd" | awk '{gsub(/:b:c/, "0:1c0:1c"); print $0}' # a0:1c0:1cd correct

Re: bash using awk and amp to write between the passing variable

Posted: Sun Jul 17, 2022 6:33 pm
by misko_2083

Matching a punctuation character followed by an alphabetic character

Code:

echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&1"); print $0}' # a0:b10:c1d error

Matching punctuation characters, which in the C locale are: !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~

Code:

echo "a:b:cd" | awk '{gsub(/[[:punct:]]/, "0&1"); print $0}' # a0:1b0:1cd
echo "a@b+cd" | awk '{gsub(/[[:punct:]]/, "0&1"); print $0}'  # a0@1b0+1cd correct?
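
As a quick check of which characters the class covers, here is a small sketch. wrap_punct is a made-up helper name, not something from the posts above. It assumes an awk that understands POSIX character classes and a locale where "-" and "_" count as punctuation (true for the C locale):

```shell
# wrap_punct: hypothetical helper, wraps every punctuation character in 0...1
wrap_punct() {
  awk '{ gsub(/[[:punct:]]/, "0&1"); print }'
}

printf '%s\n' 'a-b_c' | wrap_punct   # a0-1b0_1c
```

So a hyphen or an underscore gets wrapped just like ":" does, which may or may not be wanted.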

Matching ':'

Code:

echo "a:b:cd" | awk '{gsub(/:/, "0&1"); print $0}' # a0:1b0:1cd

Do you want to replace ":" with "&"?

Code:

echo "a:b:cd" | awk '{gsub(/:/, "0\\&1"); print $0}' # a0&1b0&1cd

Re: bash using awk and amp to write between the passing variable

Posted: Mon Jul 18, 2022 12:03 am
by Burunduk
blumenwesen wrote: Sun Jul 17, 2022 3:09 pm

Code:

echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&1"); print $0}' # a0:b10:c1d error
echo "a:b:cd" | awk '{gsub(/:b:c/, "0:1c0:1c"); print $0}' # a0:1c0:1cd correct

Is the second result supposed to be "a0:1b0:1cd"?

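This will match a [[:punct:]] character that comes directly before a word character ([[:alnum:]_]). Note that \< is a gawk extension and \b a GNU sed extension: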

Code:

echo "a:b:cd" | awk '{gsub(/[[:punct:]]\</,"0&1");print}' # a0:1b0:1cd
echo "a:b:cd" | sed 's/[[:punct:]]\b/0&1/g'               # a0:1b0:1cd

If it needs to be [[:alpha:]] or if you do want to replace the [[:alpha:]] with 'c' as in your example, then:

Code:

echo "a:b:cd" | sed -r 's/([[:punct:]])([[:alpha:]])/0\11\2/g'                    # a0:1b0:1cd
echo "a:b:cd" | awk '{print gensub(/([[:punct:]])([[:alpha:]])/,"0\\11\\2","g")}' # a0:1b0:1cd

echo "a:b:cd" | sed -r 's/([[:punct:]])[[:alpha:]]/0\11c/g'                       # a0:1c0:1cd
echo "a:b:cd" | awk '{print gensub(/([[:punct:]])[[:alpha:]]/,"0\\11c","g")}'     # a0:1c0:1cd

Backreferences \1 and \2 are used instead of &. It's not possible to take only a part of &; it always represents the whole match.
Awk's gsub() doesn't support backreferences, which is why gensub() (a gawk extension) is used here.
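
For an awk without gensub(), the two parts of the match can be picked out by hand with match() and substr(). This is only a sketch for the pattern discussed here. wrap_pairs is a made-up name, and the code assumes an awk that understands POSIX character classes:

```shell
# wrap_pairs: hypothetical helper, emulates "0\\11\\2" without backreferences
wrap_pairs() {
  awk '{
    out = ""
    rest = $0
    # match() sets RSTART and RLENGTH for the leftmost match in rest
    while (match(rest, /[[:punct:]][[:alpha:]]/)) {
      p = substr(rest, RSTART, 1)        # punctuation character, plays the role of \1
      c = substr(rest, RSTART + 1, 1)    # letter, plays the role of \2
      out = out substr(rest, 1, RSTART - 1) "0" p "1" c
      rest = substr(rest, RSTART + RLENGTH)   # continue after the match
    }
    print out rest
  }'
}

echo "a:b:cd" | wrap_pairs   # a0:1b0:1cd
```

It is longer than the gensub() one-liner, so it only makes sense where gawk is not available.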


Re: bash using awk and amp to write between the passing variable

Posted: Mon Jul 18, 2022 11:39 am
by blumenwesen

Sorry, I made a mistake. I meant that the code should use gsub together with some parameter syntax built from characters such as "()[]{}&$_" to split the match and write between its characters: 0 + ":" + 1 + "b" + 0 + ":" + 1 + "c".

# example

Code:

echo "a:b:cd" | awk -F "" '{print $1"0"$2"1"$3"0"$4"1"$5}'
echo "a:b:cd" | awk '{gsub(/:b:c/, "0:1b0:1c"); print $0}'
echo "a:b:cd" | awk '{split($0, a, ""); print a[1]"0"a[2]"1"a[3]"0"a[4]"1"a[5]}'

# example gsub passing parameters

Code:

echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&{1}1&{2}"); print $0}'
echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&{1}1&{2}0&{3}1&{4}"); print $0}'

[EDIT]

Thanks, that was the perfect example.

Code:

echo "a:b:cd" | awk '{a=gensub(/([[:punct:]])([[:alpha:]])/, "0\\11\\2", "g"); print substr(a, 1, length(a)-1)}'

Re: bash using awk and amp to write between the passing variable

Posted: Wed Jul 20, 2022 4:48 pm
by blumenwesen

How do I have to change the script if I only want to replace the space between the numbers?
The result should be: 02:43 (03:36)945+7 :0+1+07:37 - 08:59 (01:21)

Code:

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{a=gensub(/([[:digit:]]) ([[:digit:]])/, "+\\1+1\\1", "g"); print a}'

Re: bash using awk and amp to write between the passing variable

Posted: Thu Jul 21, 2022 4:07 am
by MochiMoppel
blumenwesen wrote: Wed Jul 20, 2022 4:48 pm

how do I have to change the script if I only want to replace the space between the numbers?

In principle this should do it

Code:

# echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{a=gensub(/([[:digit:]]) ([[:digit:]])/, "\\1+\\2", "g"); print a}'
02:43 (03:36)945+7 :0+1 07:37 - 08:59 (01:21)

However, the problem in the source string is the sequence "0 1 07":
02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)
The substrings "0 1" and "1 0" both match, but they overlap (they share the "1"), so a single global pass replaces only the first. I don't know whether awk can repeat the pattern matching until all matches are replaced.

Possible (and simpler) with sed:

Code:

# echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | sed -r ':x; s/([[:digit:]]) ([[:digit:]])/\1+\2/g ;t x'
02:43 (03:36)945+7 :0+1+07:37 - 08:59 (01:21)
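
In case the one-line form gives trouble: as far as I know, a label followed by ";" is a GNU sed convenience, and other sed implementations may read the rest of the line as part of the label name. A sketch that passes each command as its own -e expression avoids this. join_digits_sed is a made-up name, and -E is used as the more widely supported spelling of -r:

```shell
# join_digits_sed: hypothetical helper name
# :x   defines the label x
# s    replaces every non-overlapping "digit space digit" in the current pass
# t x  jumps back to x if the s command changed anything, so the overlapping
#      leftovers get another pass; the loop ends when nothing is replaced
join_digits_sed() {
  sed -E \
    -e ':x' \
    -e 's/([0-9]) ([0-9])/\1+\2/g' \
    -e 't x'
}

printf '%s\n' '1 2 3 4' | join_digits_sed   # 1+2+3+4
```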

BTW: I don't see the point of using [[:digit:]]. I find [0-9] easier to type and easier to read.


Re: bash using awk and amp to write between the passing variable

Posted: Thu Jul 21, 2022 5:55 am
by Burunduk

You need something like this:

Code:

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | perl -pe 's/(?<=[0-9]) (?=[0-9])/+/g'

Neither awk nor sed supports lookaround assertions. The sed loop is probably the simplest solution.


Re: bash using awk and amp to write between the passing variable

Posted: Thu Jul 21, 2022 7:42 am
by blumenwesen

OK, thank you.


Re: bash using awk and amp to write between the passing variable

Posted: Thu Jul 21, 2022 8:25 am
by MochiMoppel

If it has to be awk, then this should work:

Code:

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{ while(/[0-9] [0-9]/){a=gensub(/(.*[0-9]) ([0-9].*)/,"\\1+\\2", "g");$0=a} print a}'

At least let's put this into a readable format:

Code:

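echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{
  # repeat until no "digit space digit" is left in the line
  while (/[0-9] [0-9]/)
  {
    # the greedy .* makes each pass replace only the last remaining space
    $0 = gensub(/(.*[0-9]) ([0-9].*)/, "\\1+\\2", "g")
  }
  print   # prints $0, so lines without any match are passed through unchanged
}'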
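
For an awk that has no gensub() (it is a gawk extension), the same kind of loop can be written with match() and substr(). A sketch, with join_digits_awk as a made-up name. It replaces one space per iteration, always the leftmost one:

```shell
# join_digits_awk: hypothetical helper, uses only POSIX awk functions
join_digits_awk() {
  awk '{
    # RSTART points at the first digit of the leftmost "digit space digit"
    while (match($0, /[0-9] [0-9]/))
      $0 = substr($0, 1, RSTART) "+" substr($0, RSTART + 2)
    print
  }'
}

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | join_digits_awk
# 02:43 (03:36)945+7 :0+1+07:37 - 08:59 (01:21)
```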



[EDIT] Another way would be to forget about awk, sed and perl and do it with pure bash. Faster and doesn't even require regex:

Code:

t='02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)'
while [[ $t = *[0-9]\ [0-9]* ]]; do a=${t%[0-9] [0-9]*}; b=${t#"$a"}; b=${b/ /+}; t=$a$b; done
echo "$t"


Re: bash using awk and amp to write between the passing variable

Posted: Thu Jul 21, 2022 4:17 pm
by Burunduk
MochiMoppel wrote: Thu Jul 21, 2022 8:25 am

Code:

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{ while(/[0-9] [0-9]/){a=gensub(/(.*[0-9]) ([0-9].*)/,"\\1+\\2", "g");$0=a} print a}'

I'm trying to understand what those greedy .* are for. The pattern matches the whole string every time, replacing only one space per iteration. Your sed example works without them. Another question: is it possible to construct a string that would require more than 2 iterations?

Code:

echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{while(/[0-9] [0-9]/){$0=gensub(/([0-9]) ([0-9])/, "\\1+\\2", "g")} print}'
echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{for(i=2;i;i--){$0=gensub(/([0-9]) ([0-9])/, "\\1+\\2", "g")} print}'

Re: bash using awk and amp to write between the passing variable

Posted: Fri Jul 22, 2022 2:34 am
by MochiMoppel
Burunduk wrote: Thu Jul 21, 2022 4:17 pm

I'm trying to understand what those greedy .* are for. The pattern matches the whole string every time, replacing only one space per iteration. Your sed example works without them.

...and so would awk. I thought that awk needed them and didn't test.

Another question, is it possible to construct a string that would require more than 2 iterations?

You mean for blumenwesen's pattern? I dare to say: No, not possible.
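
One way to see why: in a global pass, a "digit space digit" candidate is skipped only when its first digit was already used as the second digit of the match just before it. Two skipped candidates can never sit next to each other, so a second pass catches all of them. A worst-case sketch with alternating digits and spaces (two_passes is a made-up name, and sed -E stands in for the awk loop):

```shell
# two_passes: hypothetical helper, runs the same global substitution exactly twice
two_passes() {
  sed -E \
    -e 's/([0-9]) ([0-9])/\1+\2/g' \
    -e 's/([0-9]) ([0-9])/\1+\2/g'
}

# pass 1 gives 1+2 3+4 5+6 7, pass 2 fills in the skipped spaces
printf '%s\n' '1 2 3 4 5 6 7' | two_passes   # 1+2+3+4+5+6+7
```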