Abiword and Geany character coding issues (SOLVED)

Posted: Thu Feb 04, 2021 7:31 pm
by amethyst

Okay, I think it's a character encoding issue. Abiword and Geany display gibberish symbols when opening text documents. For instance, the apostrophe is replaced by gibberish. I do not have this issue when using Leafpad or another word processor like WordPad or Atlantis running under Wine. What should I change so that Abiword and Geany display text correctly? (I'm using their default settings and haven't changed anything.)

I seem to have this problem when documents have been converted to txt, like epub to text for example. So it seems as if these two applications try to "complicate" matters. It's very annoying.


Re: Abiword and Geany character coding issues

Posted: Thu Feb 04, 2021 8:08 pm
by backi

Hi !
Abiword is known for giving problems... quite buggy.
Maybe better to use LibreOffice, as an sfs (squashfs) module:
viewtopic.php?f=96&t=404

Or as an AppImage :
viewtopic.php?f=96&t=1630
https://www.libreoffice.org/download/appimage/


Re: Abiword and Geany character coding issues

Posted: Thu Feb 04, 2021 8:34 pm
by zigbert

In Geany you can check the encoding: Menu > File > Properties.
If the encoding doesn't match your locale setting, it will cause issues. Run 'locale' in a terminal...

Geany has an option to reload the file, but that probably won't help you since it won't convert the characters.
Maybe try iconv. This will convert a file from ISO-8859-1 to UTF-8:

Code:

# iconv -f ISO-8859-1 -t UTF-8//TRANSLIT /path/infile -o /path/outfile.utf
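
To see what such a conversion does before running it on real books, here is a minimal sketch on a throwaway file. The temporary directory and the sample word are made up for illustration, and it assumes a glibc-style iconv plus od and mktemp are available. It writes the result with a shell redirect instead of -o, which has the same effect here.

```shell
#!/bin/sh
# Sketch: convert a tiny ISO-8859-1 sample to UTF-8 and inspect the bytes.
# The temp directory and file names exist only for this demo.
tmpdir=$(mktemp -d)

# \351 (octal) is byte 0xE9, which is "é" in ISO-8859-1.
printf 'caf\351\n' > "$tmpdir/infile"

# Same conversion as above, minus //TRANSLIT (not needed for this sample).
iconv -f ISO-8859-1 -t UTF-8 "$tmpdir/infile" > "$tmpdir/outfile.utf"

# Dump the converted file as hex; "é" should now be the two bytes c3 a9.
converted_hex=$(od -An -tx1 "$tmpdir/outfile.utf" | tr -d ' \n')
echo "$converted_hex"

rm -rf "$tmpdir"
```

If the source files turn out not to be ISO-8859-1, only the -f value needs to change; iconv -l lists the encoding names the installed iconv accepts.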

Re: Abiword and Geany character coding issues

Posted: Sat Apr 24, 2021 10:29 am
by amethyst

For Geany this seems to work for me (after years of it driving me crazy and of trying to get rid of its auto-detected blocks of nonsensical gibberish):
From the Edit menu > Preferences > Files > set the default encoding for new files to Western (WINDOWS-1252) > tick the box for "Use fixed encoding when opening non-Unicode files" > set the default encoding for existing non-Unicode files to Western (WINDOWS-1252) > apply the changes. This seems to work for hundreds of books in different formats which have been converted to text. This is for Geany; Leafpad (as simple as it is) seems to display text correctly. Now, to get Abiword to get it right too (this is a daunting task).


Re: Abiword and Geany character coding issues

Posted: Sat Apr 24, 2021 12:00 pm
by MochiMoppel
amethyst wrote: Thu Feb 04, 2021 7:31 pm

I have problems with Abiword and Geany displaying some gibberish symbols when opening text documents.

Care to post an (abbreviated) sample document?


Re: Abiword and Geany character coding issues

Posted: Sat Apr 24, 2021 2:15 pm
by amethyst
MochiMoppel wrote: Sat Apr 24, 2021 12:00 pm
amethyst wrote: Thu Feb 04, 2021 7:31 pm

I have problems with Abiword and Geany displaying some gibberish symbols when opening text documents.

Care to post an (abbreviated) sample document?

Mostly apostrophes but also many other things......


Re: Abiword and Geany character coding issues

Posted: Sat Apr 24, 2021 2:27 pm
by MochiMoppel

Please post the document. It is impossible to determine the problem with your source document just with a screenshot.
BTW these are not apostrophes, these are 'single quotation marks' - 2 different characters


Re: Abiword and Geany character coding issues

Posted: Sat Apr 24, 2021 2:33 pm
by amethyst

ALL my documents look like this. All are books converted to text from ebook formats. I'm not going to bother to post more text from more text documents that just looks the same and have the same problem. Waste of time. It's not a source problem because other word processors display the same text correctly. Abiword is just crap with its automatic encoding it seems. Rubbish application.


Re: Abiword and Geany character coding issues

Posted: Sat Apr 24, 2021 2:45 pm
by MochiMoppel

I'm not asking you to post "more text". I'm asking you to post the text file that causes the problem.


Re: Abiword and Geany character coding issues

Posted: Sat Apr 24, 2021 2:48 pm
by amethyst
MochiMoppel wrote: Sat Apr 24, 2021 2:45 pm

I'm not asking you to post "more text". I'm asking you to post the text file that causes the problem.

See above post, it's not a source problem.


Re: Abiword and Geany character coding issues

Posted: Sat Apr 24, 2021 4:02 pm
by MochiMoppel
amethyst wrote: Thu Feb 04, 2021 7:31 pm

What should I change so that Abiword and Geany display text correctly

I wonder why you asked this if you are obviously not interested in an answer.
Here is mine anyway: Abiword displays text of WINDOWS-1252 encoded txt files (like yours) correctly when you "Open file as type" ...no, not "Automatically detected", but rather "Encoded Text" => "Western European, Window Code Page 1252".
Now you can argue that this is crap, but that's beyond your question.


Re: Abiword and Geany character coding issues

Posted: Sat Apr 24, 2021 4:26 pm
by amethyst
MochiMoppel wrote: Sat Apr 24, 2021 4:02 pm
amethyst wrote: Thu Feb 04, 2021 7:31 pm

What should I change so that Abiword and Geany display text correctly

I wonder why you asked this if you are obviously not interested in an answer.
Here is mine anyway: Abiword displays text of WINDOWS-1252 encoded txt files (like yours) correctly when you "Open file as type" ...no, not "Automatically detected", but rather "Encoded Text" => "Western European, Window Code Page 1252".
Now you can argue that this is crap, but that's beyond your question.

Why do you presume I'm not interested in an answer, and why are you referring to my opening post when we were interacting later on in the thread? You wanted some source text, which is not applicable to this situation, as pointed out to you (because it displays correctly in all other word processors I've tried). Your suggestion would be useful if one could set that encoding as the default open option, like one can do with Geany as mentioned. Where are the options to set a default open char set in Abiword? I can't see any in my Abiword version.


Re: Abiword and Geany character coding issues

Posted: Sat Apr 24, 2021 8:30 pm
by April

If you post a screenshot then what you see can be determined by others. Good so far ?
If you post the "file" that causes the trouble then what you cannot see can be determined by others .

Then a sensible answer can be offered not workarounds .


Re: Abiword and Geany character coding issues

Posted: Sat Apr 24, 2021 11:03 pm
by rockedge

maybe something like this?

Code:

#!/bin/sh
LANG=de_DE.ISO-8859-1
#LANG=en_US.UTF-8
export $LANG
#export LC_ALL="en_US.UTF-8"
abiword

Re: Abiword and Geany character coding issues

Posted: Sun Apr 25, 2021 4:16 am
by MochiMoppel
amethyst wrote: Sat Apr 24, 2021 4:26 pm

Why do you presume I'm not interested in an answer and why are you referring to my opening post when we were interacting later on in the thread?

I referred to your opening post because you seem to have forgotten what you asked for and ignored all attempts by zigbert and me to help you. Since you couldn't "bother" to simply post one of the files in question, I took the trouble to create a demo text file myself because I don't have WINDOWS-1252 encoded files. I don't even know if this is the encoding of your files because you refused to send a file for checking and, on the other hand, don't tell us what encoding is used. I know, it's "not applicable" to you, but it's essential to answer your question.

So let's make it short: I'm pretty sure that Abiword can display your files and I told you how to do this. Did you try and did it work? Yes or no?

Once we have this solved we can move on to your next question and discuss why Abiword does things differently.


Re: Abiword and Geany character coding issues

Posted: Sun Apr 25, 2021 5:01 am
by amethyst

Deleted


Re: Abiword and Geany character coding issues

Posted: Sun Apr 25, 2021 6:18 am
by amethyst
amethyst wrote: Sun Apr 25, 2021 5:01 am
MochiMoppel wrote: Sun Apr 25, 2021 4:16 am
amethyst wrote: Sat Apr 24, 2021 4:26 pm

Why do you presume I'm not interested in an answer and why are you referring to my opening post when we were interacting later on in the thread?

I referred to your opening post because you seem to have forgotten what you asked for and ignored all attempts by zigbert and me to help you. Since you couldn't "bother" to simply post one of the files in question, I took the trouble to create a demo text file myself because I don't have WINDOWS-1252 encoded files. I don't even know if this is the encoding of your files because you refused to send a file for checking and, on the other hand, don't tell us what encoding is used. I know, it's "not applicable" to you, but it's essential to answer your question.

So let's make it short: I'm pretty sure that Abiword can display your files and I told you how to do this. Did you try and did it work? Yes or no?

Once we have this solved we can move on to your next question and discuss why Abiword does things differently.

It worked, but it is basically useless as mentioned because one can't set the default opening action like with Geany (well, one probably can, but I'll figure that out by myself, like I did when finding out myself that the Windows char set happens to work). A bit of googling suggests that the Abiword system profile can be manipulated for default settings, so I'll try to figure out how I can get that to work. I'll also explore what rockedge suggested as an interim workaround.

So, how many times must I tell and explain to you that this is not related to a specific source file? I have hundreds of files giving this problem with Abiword and Geany BUT the same hundreds of files display "correctly" (I'll come back to this later) with at least 5 other word processors/editors I've tried. Even Leafpad, as mentioned, does not have this issue.

Also, none of the source files I've checked are actually Windows-1252 encoded. I know this because Notepad (Windows application available with WINE) only offers a few char sets that can be used and the Windows char set is not one of them (yet the files display correctly). I've checked some with the file -i command and the result is charset=unknown-8bit. Right-click for properties gives "Non-ISO extended ASCII English text". Also, a quick google does not suggest that ebooks use this specific Windows char set as mentioned. This Windows char set luckily just works accidentally as an interim workaround for crappy applications like Abiword and Geany, which have problems with automatically detecting/employing suitable char sets, resulting in blocks of nonsensical gibberish.

The apostrophe issue is a strange one. When typing the ' with a small font size, it may look wrong, but with an enlarged font size it actually looks like the apostrophe we learnt at school (it does with the word processor I'm using anyway). So it may/must actually be the apostrophe character. However, this seems font related. Some fonts display the proper apostrophe character (when you enlarge the size enough) and others don't.

PS: As this specific topic seems to have irritated some hot heads, this post will be my last on this specific topic.
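
The unknown-8bit result from file -i can be narrowed down a little by hand. The following is a rough sketch, not a real detector: guess_8bit is a hypothetical helper name, its three labels are this sketch's own, and it assumes iconv, od, tr and grep are available. It relies on one known difference between the two encodings discussed in this thread: bytes 0x80 to 0x9f are control codes in ISO-8859-1 but printable punctuation (curly quotes, dashes, ellipsis) in WINDOWS-1252, so a text file that contains them is unlikely to be true ISO-8859-1.

```shell
#!/bin/sh
# guess_8bit FILE -> prints one of: utf-8, cp1252-like, latin1-like
# Hypothetical helper; the labels are made up for this sketch.
guess_8bit() {
    # Valid UTF-8 (plain ASCII included) survives a UTF-8 -> UTF-8 pass.
    if iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1; then
        echo "utf-8"
        return
    fi

    # Count bytes in the range 0x80-0x9f by listing one hex byte per line.
    c1_count=$(od -An -v -tx1 "$1" | tr -s ' ' '\n' | grep -c '^[89][0-9a-f]$')

    if [ "$c1_count" -gt 0 ]; then
        echo "cp1252-like"    # has bytes that only WINDOWS-1252 treats as text
    else
        echo "latin1-like"    # consistent with ISO-8859-1 (and with WINDOWS-1252)
    fi
}
```

Calling guess_8bit on one of the converted books gives a hint only. A file reported as latin1-like decodes the same way under both encodings, which is one reason a WINDOWS-1252 default can work for a mixed collection.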


Re: Abiword and Geany character coding issues

Posted: Sun Apr 25, 2021 8:46 am
by April

Head in sand stubborn.


Re: Abiword and Geany character coding issues

Posted: Sun Apr 25, 2021 9:07 am
by amethyst
April wrote: Sun Apr 25, 2021 8:46 am

Head in sand stubborn.

Reported for shit stirring and on ignore you go. I like this new feature. :thumbup:


Re: Abiword and Geany character coding issues

Posted: Sun Apr 25, 2021 10:06 am
by JASpup

I repeatedly see Abiword use warnings.

It's also a very prevalent word processor... and fast as Hades.

I don't use it enough for failures, but I have noticed font rendering problems. Have a Libre document with a bunch of fonts in the system and Abi shows the same generic font for the whole document - stuff like that.

Abi would be goto if it weren't buggy.

As a Win convert I've been perplexed by Wordpad's lack of a spell checker the majority of my life now. They were just trying to sell Word. Word peaked in the 90s, so give the people a spell checker! They're in our browsers for goodness sake!

I'm worried about Gnumeric. Hopefully it's as bug-free as its alternatives.

Geany's color-coding is glorious if we're looking at code. I'm in Leafpad most of the time.


Re: Abiword and Geany character coding issues

Posted: Sun Apr 25, 2021 11:32 am
by amethyst

Fixed by adding default settings to system profile. Topic marked as SOLVED (by myself). :D :thumbup:


Re: Abiword and Geany character coding issues (SOLVED)

Posted: Sun Apr 25, 2021 12:36 pm
by rockedge

My tip was basically to change the system profile during the usage of abiword.

Could you show how the settings were set?


Re: Abiword and Geany character coding issues (SOLVED)

Posted: Mon Apr 26, 2021 5:50 am
by MochiMoppel
rockedge wrote: Sun Apr 25, 2021 12:36 pm

My tip was basically to change the system profile during the usage of abiword.

Your tip was to use
LANG=de_DE.ISO-8859-1
export $LANG
abiword

Does this work for you? When I run this I get an error
export: `de_DE.ISO-8859-1': not a valid identifier
(process:17388): Gtk-WARNING **: Locale not supported by C library.
Using the fallback 'C' locale.

When I run the command locale -m I can see that ISO-8859-1 is in the list, as is ANSI_X3.4-1968, which would be more appropriate since amethyst's files are undoubtedly WINDOWS-1252 encoded and not in any ISO charset, but my question is rather how to run the command without error. There may be a locale.conf file needed, but I don't have one.
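
For reference, the "not a valid identifier" message comes from the shell, not from Abiword: export $LANG expands to export de_DE.ISO-8859-1, and export expects a variable name rather than a value. A minimal sketch of a corrected launcher follows. run_with_lang is a hypothetical helper name, and the demonstration line starts sh instead of abiword so that it runs on any system. The Gtk warning is a separate matter: it appears when the named locale is not installed, and locale -a (not locale -m, which lists character maps) shows the locales that are.

```shell
#!/bin/sh
# Hypothetical helper: run a command with LANG set for that command only.
# Usage: run_with_lang LOCALE COMMAND [ARGS...]
run_with_lang() {
    lang=$1
    shift
    LANG="$lang" "$@"
}

# Demonstration: the child process sees the new value.
run_with_lang de_DE.ISO-8859-1 sh -c 'printf "%s\n" "$LANG"'

# Inside a script, the equivalent fix is to export the name, not the value:
#   LANG=de_DE.ISO-8859-1
#   export LANG
#   abiword
# With the helper (assumes abiword is installed and the locale is listed by `locale -a`):
#   run_with_lang de_DE.ISO-8859-1 abiword
```

As an aside, ANSI_X3.4-1968 is the charmap name for plain 7-bit ASCII; glibc lists the Windows code page 1252 charmap as CP1252.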


Re: Abiword and Geany character coding issues (SOLVED)

Posted: Mon Apr 26, 2021 6:35 am
by amethyst

amethyst's files are undoubtedly WINDOWS-1252 encoded

Undoubtedly WRONG. You are also wrong with assuming all these files are non-ISO. Many of them are actually ISO-8859-1.


Re: Abiword and Geany character coding issues (SOLVED)

Posted: Mon Apr 26, 2021 8:05 am
by MochiMoppel
amethyst wrote: Mon Apr 26, 2021 6:35 am

You are also wrong with assuming all these files are non-ISO. Many of them are actually ISO-8859-1.

Your words: "ALL my documents look like this".
Are you telling us that your ISO-8859-1 files also show "non-sensical gibberish"? This would really be interesting and unexpected. Readable screenshot please - if this is not too much trouble. I don't dare to ask you for a sample file.

The only clue I have from you is your screenshot, and this image, when enlarged, shows that the "non-sensical gibberish" makes a lot of sense and indeed reveals the encoding used: WINDOWS-1252 (aka ANSI as Microsoft likes to call it and as it may be listed in your MS Notepad).
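
One way to check a claim like this without a screenshot is to decode the same bytes under both encodings and compare. The sketch below is an illustration only: the sample word is invented, and it assumes an iconv that knows the names WINDOWS-1252 and ISO-8859-1 (glibc's does). Byte 0x92 is the right single quotation mark in WINDOWS-1252, while ISO-8859-1 assigns that byte to a control code, so only the first decoding yields a visible quote.

```shell
#!/bin/sh
# Same input bytes, decoded under two different source encodings.
# \222 (octal) is byte 0x92.
sample() { printf 'don\222t'; }

as_cp1252=$(sample | iconv -f WINDOWS-1252 -t UTF-8 | od -An -tx1 | tr -d ' \n')
as_latin1=$(sample | iconv -f ISO-8859-1 -t UTF-8 | od -An -tx1 | tr -d ' \n')

# 0x92 becomes e2 80 99 (U+2019, right single quotation mark)
echo "WINDOWS-1252: $as_cp1252"
# 0x92 becomes c2 92 (U+0092, a non-printing control code)
echo "ISO-8859-1:   $as_latin1"
```

This also suggests why converting such files with -f ISO-8859-1 would leave the quotes broken: the conversion succeeds, but the quote bytes turn into invisible control characters rather than punctuation.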


Re: Abiword and Geany character coding issues (SOLVED)

Posted: Mon Apr 26, 2021 8:28 am
by amethyst

Are you telling us that your ISO-8859-1 files also show "non-sensical gibberish

YES!!!.. Anyways, all my files (whatever charset) seem to display correctly with the Windows code so I'm not going to bother to just focus on ISO-8859-1.
The picture already posted is one of those.

aka ANSI as Microsoft likes to call it and as it may be listed in your MS Notepad

Windows-1252 = Extended ASCII
For educational purposes: http://www.differencebetween.net/techno ... and-ascii/

makes a lot of sense and indeed reveals the encoding used: WINDOWS-1252 (aka ANSI as Microsoft likes to call it and as it may be listed in your MS Notepad)

.
WRONG, already very clearly explained to you in previous posts. You seem to ignore or don't understand what I wrote. You are assuming things and, quite frankly, are right off the mark; in fact you are now wasting my time. Cheers.


Re: Abiword and Geany character coding issues (SOLVED)

Posted: Mon Apr 26, 2021 9:41 am
by MochiMoppel
amethyst wrote: Mon Apr 26, 2021 8:28 am

YES!!!.. Anyways, all my files (whatever charset) seem to display correctly with the Windows code so I'm not going to bother to just focus on ISO-8859-1.

What YES!!!? They show gibberish in Abiword? I didn't ask when they show correctly, I asked how you manage to show them with gibberish. Abiword should display them correctly by default.

Picture already posted are one of those.

That was not an ISO-8859-1 file.


Re: Abiword and Geany character coding issues (SOLVED)

Posted: Mon Apr 26, 2021 9:43 am
by amethyst
MochiMoppel wrote: Mon Apr 26, 2021 9:41 am
amethyst wrote: Mon Apr 26, 2021 8:28 am

YES!!!.. Anyways, all my files (whatever charset) seem to display correctly with the Windows code so I'm not going to bother to just focus on ISO-8859-1.

What YES!!!? They show gibberish in Abiword? I didn't ask when they show correctly, I asked how you manage to show them with gibberish. Abiword should display them correctly by default.

Picture already posted are one of those.

Was not a ISO-8859-1 file.

Oh, FFS. Cheers.


Re: Abiword and Geany character coding issues (SOLVED)

Posted: Mon Apr 26, 2021 7:23 pm
by greengeek
MochiMoppel wrote: Mon Apr 26, 2021 5:50 am

Your tip was to use
LANG=de_DE.ISO-8859-1
export $LANG
abiword

Does this work for you? When I run this I get an error
export: `de_DE.ISO-8859-1': not a valid identifier
(process:17388): Gtk-WARNING **: Locale not supported by C library.
Using the fallback 'C' locale.

Similar here on Tahr 606:

[Attachment: abiword_start_test.jpg]

Re: Abiword and Geany character coding issues (SOLVED)

Posted: Tue Apr 27, 2021 7:58 am
by MochiMoppel
greengeek wrote: Mon Apr 26, 2021 7:23 pm

Similar here on Tahr 606:

That already makes two of us :)