bash using awk and amp to write between the passing variable
Posted: Sun Jul 17, 2022 3:09 pm
by blumenwesen
Hello, I would like to know whether awk can, as in example one with the general character classes punct and alpha, split the matched text that & passes to the replacement, so that something can be written between its parts, as in the result of example two.
Can the text passed by & be addressed in parts, for example with "0[&]{1}1[&]{2}"?
Code: Select all
echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&1"); print $0}' # a0:b10:c1d error
echo "a:b:cd" | awk '{gsub(/:b:c/, "0:1c0:1c"); print $0}' # a0:1c0:1cd correct
Re: bash using awk and amp to write between the passing variable
Posted: Sun Jul 17, 2022 6:33 pm
by misko_2083
Matching a punctuation character followed by an alphabetic character
Code: Select all
echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&1"); print $0}' # a0:b10:c1d error
Matching punctuation characters, which in the C locale are: [][!"#$%&'()*+,./:;<=>?@\\^_`{|}~-]
Code: Select all
echo "a:b:cd" | awk '{gsub(/[[:punct:]]/, "0&1"); print $0}' # a0:1b0:1cd
echo "a@b+cd" | awk '{gsub(/[[:punct:]]/, "0&1"); print $0}' # a0@1b0+1cd correct?
Matching ':'
Code: Select all
echo "a:b:cd" | awk '{gsub(/:/, "0&1"); print $0}' # a0:1b0:1cd
Do you want to replace ":" with "&"?
Code: Select all
echo "a:b:cd" | awk '{gsub(/:/, "0\\&1"); print $0}' # a0&1b0&1cd
Re: bash using awk and amp to write between the passing variable
Posted: Mon Jul 18, 2022 12:03 am
by Burunduk
blumenwesen wrote: Sun Jul 17, 2022 3:09 pm
Code: Select all
echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&1"); print $0}' # a0:b10:c1d error
echo "a:b:cd" | awk '{gsub(/:b:c/, "0:1c0:1c"); print $0}' # a0:1c0:1cd correct
Is it supposed to be "a0:1b0:1cd correct"?
This will match [[:punct:]] before [[:alnum:]_] (\< in awk and \b in sed are GNU extensions):
Code: Select all
echo "a:b:cd" | awk '{gsub(/[[:punct:]]\</,"0&1");print}' # a0:1b0:1cd
echo "a:b:cd" | sed 's/[[:punct:]]\b/0&1/g' # a0:1b0:1cd
If it needs to be [[:alpha:]] or if you do want to replace the [[:alpha:]] with 'c' as in your example, then:
Code: Select all
echo "a:b:cd" | sed -r 's/([[:punct:]])([[:alpha:]])/0\11\2/g' # a0:1b0:1cd
echo "a:b:cd" | awk '{print gensub(/([[:punct:]])([[:alpha:]])/,"0\\11\\2","g")}' # a0:1b0:1cd
echo "a:b:cd" | sed -r 's/([[:punct:]])[[:alpha:]]/0\11c/g' # a0:1c0:1cd
echo "a:b:cd" | awk '{print gensub(/([[:punct:]])[[:alpha:]]/,"0\\11c","g")}' # a0:1c0:1cd
Backreferences \1 and \2 are used instead of &. It's not possible to take only a part of &; it always represents the whole match.
Awk's gsub() doesn't support backreferences, which is why the awk examples use gensub(), a gawk extension.
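If gensub() is not available (it is specific to gawk), the same effect can probably be approximated in plain awk with match(), which sets RSTART and RLENGTH, and substr(). This is only a sketch: the function name split_match is made up for illustration, and it assumes that each of the two parts is exactly one character long, as in the pattern above.

```shell
# Hypothetical helper: emulate gensub(/([[:punct:]])([[:alpha:]])/, "0\\11\\2", "g")
# without backreferences, by walking through the string match by match.
split_match() {
    printf '%s\n' "$1" | awk '{
        out = ""
        rest = $0
        # match() sets RSTART and RLENGTH for the leftmost match in rest
        while (match(rest, /[[:punct:]][[:alpha:]]/)) {
            p = substr(rest, RSTART, 1)       # the punctuation character
            l = substr(rest, RSTART + 1, 1)   # the letter that follows it
            out = out substr(rest, 1, RSTART - 1) "0" p "1" l
            rest = substr(rest, RSTART + RLENGTH)
        }
        print out rest
    }'
}

split_match "a:b:cd"    # prints a0:1b0:1cd
```

Very old awk implementations may not understand [[:punct:]] and [[:alpha:]]; in that case an explicit bracket expression would be needed instead.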
Re: bash using awk and amp to write between the passing variable
Posted: Mon Jul 18, 2022 11:39 am
by blumenwesen
Sorry, I made a mistake. I meant that the code should use gsub and pass parameters such as "()[]{}&$_" to split the match and write between its characters: 0 + ":" + 1 + "b" + 0 + ":" + 1 + "c".
# example
Code: Select all
echo "a:b:cd" | awk -F "" '{print $1"0"$2"1"$3"0"$4"1"$5}'
echo "a:b:cd" | awk '{gsub(/:b:c/, "0:1b0:1c"); print $0}'
echo "a:b:cd" | awk '{split($0, a, ""); print a[1]"0"a[2]"1"a[3]"0"a[4]"1"a[5]}'
# example gsub passing parameters
Code: Select all
echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&{1}1&{2}"); print $0}'
echo "a:b:cd" | awk '{gsub(/[[:punct:]][[:alpha:]]/, "0&{1}1&{2}0&{3}1&{4}"); print $0}'
[EDIT]
Thanks, that was the perfect example.
Code: Select all
echo "a:b:cd" | awk '{a=gensub(/([[:punct:]])([[:alpha:]])/, "0\\11\\2", "g"); print substr(a, 1, length(a)-1)}'
Re: bash using awk and amp to write between the passing variable
Posted: Wed Jul 20, 2022 4:48 pm
by blumenwesen
How do I have to change the script if I only want to replace the spaces between the digits?
Desired result: "02:43 (03:36)945+7 :0+1+07:37 - 08:59 (01:21)"
Code: Select all
echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{a=gensub(/([[:digit:]]) ([[:digit:]])/, "+\\1+1\\1", "g"); print a}'
Re: bash using awk and amp to write between the passing variable
Posted: Thu Jul 21, 2022 4:07 am
by MochiMoppel
blumenwesen wrote: Wed Jul 20, 2022 4:48 pm
how do I have to change the script if I only want to replace the space between the numbers?
In principle this should do it
Code: Select all
# echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{a=gensub(/([[:digit:]]) ([[:digit:]])/, "\\1+\\2", "g"); print a}'
02:43 (03:36)945+7 :0+1 07:37 - 08:59 (01:21)
However, the problem in the source string is here:
02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)
The substrings "0 1" and "1 0" both match, but they overlap, and awk replaces only the first. I don't know whether awk can repeat pattern matching until all matches are replaced.
Possible (and simpler) with sed:
Code: Select all
# echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | sed -r ':x; s/([[:digit:]]) ([[:digit:]])/\1+\2/g ;t x'
02:43 (03:36)945+7 :0+1+07:37 - 08:59 (01:21)
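Why the loop is needed is easier to see on a shorter input. The sketch below runs the same substitution once and then repeatedly on "1 2 3"; the function names one_pass and until_stable are made up for illustration, and the label is passed with separate -e options so that it should also work with non-GNU sed.

```shell
# One global pass: matches may not overlap, so the digit consumed as the
# right-hand side of one match cannot start the next match.
one_pass() {
    printf '%s\n' "$1" | sed -E 's/([0-9]) ([0-9])/\1+\2/g'
}

# Repeat the substitution until it no longer changes anything:
# "t again" jumps back to the label whenever the s command replaced something.
until_stable() {
    printf '%s\n' "$1" | sed -E -e ':again' -e 's/([0-9]) ([0-9])/\1+\2/g' -e 't again'
}

one_pass "1 2 3"        # prints 1+2 3
until_stable "1 2 3"    # prints 1+2+3
```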
BTW: I don't see the point of using [[:digit:]]. I find [0-9] easier to type and easier to read.
Re: bash using awk and amp to write between the passing variable
Posted: Thu Jul 21, 2022 5:55 am
by Burunduk
You need something like this:
Code: Select all
echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | perl -pe 's/(?<=[0-9]) (?=[0-9])/+/g'
Neither awk nor sed supports lookaround assertions. The sed loop is probably the simplest solution.
Re: bash using awk and amp to write between the passing variable
Posted: Thu Jul 21, 2022 7:42 am
by blumenwesen
Re: bash using awk and amp to write between the passing variable
Posted: Thu Jul 21, 2022 8:25 am
by MochiMoppel
If it has to be awk, then this should work:
Code: Select all
echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{ while(/[0-9] [0-9]/){a=gensub(/(.*[0-9]) ([0-9].*)/,"\\1+\\2", "g");$0=a} print a}'
At least let's put this into a readable format:
Code: Select all
echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{
    while (/[0-9] [0-9]/)
    {
        a = gensub(/(.*[0-9]) ([0-9].*)/, "\\1+\\2", "g")
        $0 = a
    }
    print a
}'
[EDIT] Another way would be to forget about awk, sed and perl and do it with pure bash. Faster and doesn't even require regex:
Code: Select all
t='02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)'
while [[ $t = *[0-9]\ [0-9]* ]];do a=${t%[0-9] [0-9]*}; b=${t#*$a}; b=${b/ /+}; t=$a$b; done
echo "$t"
Re: bash using awk and amp to write between the passing variable
Posted: Thu Jul 21, 2022 4:17 pm
by Burunduk
MochiMoppel wrote: Thu Jul 21, 2022 8:25 am
Code: Select all
echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{ while(/[0-9] [0-9]/){a=gensub(/(.*[0-9]) ([0-9].*)/,"\\1+\\2", "g");$0=a} print a}'
I'm trying to understand what those greedy .* are for. The pattern matches the whole string every time, replacing only 1 space per iteration. Your sed example works without them. Another question: is it possible to construct a string that would require more than 2 iterations?
Code: Select all
echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{while(/[0-9] [0-9]/){$0=gensub(/([0-9]) ([0-9])/, "\\1+\\2", "g")} print}'
echo "02:43 (03:36)945 7 :0 1 07:37 - 08:59 (01:21)" | awk '{for(i=2;i;i--){$0=gensub(/([0-9]) ([0-9])/, "\\1+\\2", "g")} print}'
Re: bash using awk and amp to write between the passing variable
Posted: Fri Jul 22, 2022 2:34 am
by MochiMoppel
Burunduk wrote: Thu Jul 21, 2022 4:17 pm
I'm trying to understand what those greedy .* are for. The pattern matches the whole string every time, replacing only 1 space per iteration. Your sed example works without them.
...and so would awk. I thought that awk needed them and didn't test.
Another question, is it possible to construct a string that would require more than 2 iterations?
You mean for blumenwesen's pattern? I dare to say: No, not possible.
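A quick way to check this claim on sample inputs (an experiment, not a proof) is to count how many global passes actually change the string. The helper name count_passes is hypothetical; the pattern is the one from the thread, run through sed -E so that it does not depend on gawk.

```shell
# Hypothetical helper: apply the global substitution repeatedly and report
# how many passes changed the string, followed by the final result.
count_passes() {
    s=$1
    n=0
    while :; do
        new=$(printf '%s\n' "$s" | sed -E 's/([0-9]) ([0-9])/\1+\2/g')
        # Stop as soon as a pass leaves the string unchanged.
        [ "$new" = "$s" ] && break
        s=$new
        n=$((n + 1))
    done
    printf '%s %s\n' "$n" "$s"
}

count_passes "1 2 3 4 5 6 7"    # prints 2 1+2+3+4+5+6+7
count_passes "12 34 56"         # prints 1 12+34+56
```

The reasoning seems to be that a space survives the first pass only if the digit on its left was the right-hand end of a replaced match. Two such leftover spaces can never share a digit, so the second pass catches all of them.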