Simple, hackable offline speech to text
- mikewalsh
- Moderator
- Posts: 6163
- Joined: Tue Dec 03, 2019 1:40 pm
- Location: King's Lynn, UK
- Has thanked: 795 times
- Been thanked: 1983 times
Re: Simple, hackable offline speech to text
@Jasper :-
Hm! "Sounds" interesting....
I may very well try this one out myself. Text-to-speech seems quite common - I have at least 3 Windows TTS apps running happily under WINE, including what used at one time to be the 'industry leader' in this field, TextAloud! (before Dragon Naturally Speaking came along and took over market dominance) - but the reverse is far less so. I can think of all sorts of uses for a genuine, functionally usable 'dictation' app...
(I never use Abiword myself, but I assume this will output to whatever your default word processor happens to be, yes?)
Thanks for the research, BTW. More power to your search engines!
Mike.
Re: Simple, hackable offline speech to text
@mikewalsh
Please do give it a try. The downloaded files are relatively small (e.g. 50 MB), and you need at least 300 MB of RAM to run the application.
One thing I forgot to mention: make sure you have a working microphone and, if required, adjust its level.
Yes, any word/text processing application should work.
I remember TextAloud; I used it many years ago to convert PDFs to audio books. I had to use "copy & paste" a lot to create chapters. It worked well and I thought it was a great program. I never really used Dragon Naturally Speaking, as it required you to read out prescribed text to ensure that it understood you.
I think a dictation application would be useful: they are common on mobile devices today and ideal for students or just for recording voice notes.
Let me know your results; I would be interested.
- mikewalsh
Re: Simple, hackable offline speech to text
@Jasper :-
O-kay. Had an issue with this last night before turning in, so I've returned to it today. I'm getting this Python error:-
Code:
root# cd ~./nerd-dictation
bash: cd: ~./nerd-dictation: No such file or directory
root# cd ~/nerd-dictation
root# ./nerd-dictation begin --vosk-model-dir=./model &
[1] 24943
root# Connection failure: Connection refused
pa_context_connect() failed: Connection refused
Traceback (most recent call last):
  File "./nerd-dictation", line 1974, in <module>
    main()
  File "./nerd-dictation", line 1970, in main
    args.func(args)
  File "./nerd-dictation", line 1835, in <lambda>
    func=lambda args: main_begin(
  File "./nerd-dictation", line 1437, in main_begin
    found_any = text_from_vosk_pipe(
  File "./nerd-dictation", line 957, in text_from_vosk_pipe
    import vosk # type: ignore
  File "/usr/local/lib/python3.8/dist-packages/vosk/__init__.py", line 12, in <module>
    from .vosk_cffi import ffi as _ffi
  File "/usr/local/lib/python3.8/dist-packages/vosk/vosk_cffi.py", line 2, in <module>
    import _cffi_backend
ModuleNotFoundError: No module named '_cffi_backend'
No module named "_cffi_backend"..? Where should this be? I can sorta find my way around Python directories, but is this an extra module that needs adding from the repos? Python is an absolute bastard to trouble-shoot, as you're probably aware..!
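One way to answer the "where should this be?" question is to ask Python itself. The sketch below is only an illustration (the helper name `locate_module` is made up for this thread, not part of nerd-dictation or vosk). It uses the standard-library `importlib.util.find_spec` to report where a module would be loaded from, or `None` if the interpreter cannot see it at all.

```python
import importlib.util


def locate_module(name):
    """Return the file a top-level module would load from, or None if it is not importable."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        # find_spec can raise for broken or half-initialised modules; treat that as "not found".
        return None
    if spec is None:
        return None
    # Namespace packages have no origin; fall back to a readable label.
    return spec.origin or "(no file: namespace package)"


if __name__ == "__main__":
    # Module names taken from the traceback above.
    for module_name in ("_cffi_backend", "cffi", "vosk"):
        print(f"{module_name}: {locate_module(module_name)}")
```

Run it with the same interpreter that nerd-dictation uses (here, `python3`). A result of `None` for `_cffi_backend` would match the `ModuleNotFoundError` in the traceback.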
EDIT:- 'Kay. Installed the _cffi_backend module from the repos. All I'm getting now is this:-
Code:
root# cd ~/nerd-dictation
root# ./nerd-dictation begin --vosk-model-dir=./model &
[1] 24943
root# Connection failure: Connection refused
pa_context_connect() failed: Connection refused
Not even any traceback, so.....I'm stumped. Any ideas? I'm aware that you've updated a ton of stuff in your FP64 9.5.....ANY of which could be affecting this.
Mike.
Re: Simple, hackable offline speech to text
@mikewalsh
Try this first:
Code:
pip install cffi
then move on to
Code:
pip3 install vosk
................ just running through it again!!
Re: Simple, hackable offline speech to text
@mikewalsh
Thanks for the feedback ........ I am embarrassed to say that I was working on a laptop at the time, which was running UpupJammy64, not FP95.
I have corrected my initial post and updated the details.
- mikewalsh
Re: Simple, hackable offline speech to text
Aaahh......
Actually, I doubt this would have worked anyway. We desktop guys have one major issue here that you laptop guys don't have.
Laptops all come with a built-in microphone, which is seen by any OS as a global default for the entire system. With a desktop, there IS no built-in microphone. You have to plug in your own microphone, or use webcam microphone(s), or employ a headset .....and you have to find a way to specify which one you want to use. And in Puppy, AFAIK (I'm willing to be corrected here!) there is no method for setting a microphone for global use across the system. Everything has to be specified on a per-app basis. Especially when you're an ALSA guy like me.....you can keep PulseAudio/Pipewire as far as I'm concerned, because they're just adding further unnecessary complexity.
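For anyone on an ALSA-only desktop who needs to work out which capture device to name, here is a rough sketch. It assumes the `arecord` tool from alsa-utils is installed and that `arecord -l` prints its usual English "card N: NAME [DESCRIPTION], device M: ..." lines. The function names are invented for this example.

```python
import re
import shutil
import subprocess

# Matches lines such as: "card 1: C920 [HD Pro Webcam C920], device 0: USB Audio [USB Audio]"
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+): ")


def parse_capture_devices(arecord_output):
    """Turn the text printed by 'arecord -l' into a list of capture devices."""
    devices = []
    for line in arecord_output.splitlines():
        match = CARD_LINE.match(line)
        if match:
            card, short_name, description, device = match.groups()
            devices.append(
                {
                    "alsa_name": f"hw:{card},{device}",  # the form ALSA tools accept
                    "card": short_name,
                    "description": description,
                }
            )
    return devices


def list_capture_devices():
    """Run 'arecord -l' if it is available; return an empty list otherwise."""
    if shutil.which("arecord") is None:
        return []
    result = subprocess.run(["arecord", "-l"], capture_output=True, text=True, check=False)
    return parse_capture_devices(result.stdout)


if __name__ == "__main__":
    for device in list_capture_devices():
        print(device)
```

The `hw:X,Y` value is the name that ALSA tools such as `arecord -D` take, so it is a reasonable starting point when an application asks for a specific microphone.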
Never mind..!
Mike.
- cobaka
- Posts: 572
- Joined: Thu Jul 16, 2020 6:04 am
- Location: Central Coast, NSW - au
- Has thanked: 94 times
- Been thanked: 63 times
Re: Simple, hackable offline speech to text
Hello all
I'm at the beginning of the process to get/run Jasper's speech-to-text application.
Jasper wrote:
You will need to load up the DevX.SFS first (tested only in FP95) as you will need to use Python and Git.
I'm using FossaPup 96. I don't know a lot about DevX (or DevX.sfs). At first Jasper said he did this in Fossa95 (above), but later wrote that he had worked in Jammy-Pup.
Well, I'm in Fossa96. I assumed I could get *.sfs files from the menu: Menu -> Setup -> SFS-Load -> <click> didn't work
Soooo .... I think I must get DevX.sfs from another place, but where? (see below)
After that: place DevX file in my 'home' directory. For me, I think this should be: /mnt/home/SYSTEM - ie the folder where puppy_fossapup64_9.6.sfs is found. Yes/No? <--<<
Question: Where do I find/download DevX.sfs for Fossa96?
Maybe here?
https://www.mediafire.com/file/j0v9gye5 ... 5.sfs/file <--<< This is "95", not "96". Is that important?
Thanks everyone!
собака --> это Русский --> a dog
"c" -- say "s" - as in "see" or "scent" or "sob".
- mikeslr
- Posts: 2965
- Joined: Mon Jul 13, 2020 11:08 pm
- Has thanked: 178 times
- Been thanked: 922 times
Re: Simple, hackable offline speech to text
devx.sfs for F96, from the OP of the F96 thread, https://www.forum.puppylinux.com/viewto ... 882#p85882 > https://rockedge.org/kernels/data/ISO/F ... 64_9.6.sfs
The OP of Puppy threads often provides such a link.
Further info, limitations: https://github.com/ideasman42/nerd-dictation, requires Python 3.6 (or newer).
- cobaka
Re: Simple, hackable offline speech to text
Hi @mikeslr & @Jasper
I found the DevX file and (embarrassment, embarrassment) when I went to save it I discovered it was already on my HDD.
Yes, in a folder /mnt/sda2/software/Puppy_Linux_masters/Fossa64/
What an unusual place to keep the DevX file. Strange but true!
Loaded DevX. Confirmed by message from Python: ->
python <-- me
Python 3.8.10 (default, Jun 22 2022, 20:18:18) <--<< Python
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
After running pip install cffi a few times (got a syntax error message each time) I read that "pip" does not work in the Python shell, but I should be in the bash shell.
(I'm a novice here). Still trying to get "pip install cffi" to run w/out error. <-- present stage of progress.
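To spell out the difference: `pip install cffi` is a shell command, so it belongs at the bash prompt (`#` or `$`), not at the Python `>>>` prompt. If you would rather stay in Python, the usual approach is to run pip as a module of the same interpreter. A minimal sketch, assuming pip is installed for that interpreter; `pip_command` and `run_pip` are made-up helper names:

```python
import subprocess
import sys


def pip_command(*pip_args):
    """Build the argument list for running pip with the current interpreter."""
    return [sys.executable, "-m", "pip", *pip_args]


def run_pip(*pip_args):
    """Run pip and return its exit code (non-zero if pip is missing or fails)."""
    completed = subprocess.run(pip_command(*pip_args), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Harmless check first; run_pip("install", "cffi") would be the equivalent
    # of typing "pip install cffi" at the bash prompt.
    print("pip exit code:", run_pip("--version"))
```

Going through `sys.executable -m pip` also makes sure the package lands in the Python that is actually running, which matters when more than one Python is installed.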
Help very welcome while I read everything I can find using Duck-Duck search engine .....
I hope I can get/run speech to text.
cobaka
- cobaka
Re: Simple, hackable offline speech to text
Continuing from the previous post (Tue, Mar 5th at 8:58)
I'm trying to install/run speech to text - following Jasper's success.
I'm still at the beginning of the process - but I'm confident I will get "there".
I have discovered I don't have "pip" or "pip3" on my PC.
I read the following command-line should install pip: sudo apt install python3-pip
The result: sudo: apt: command not found
I need another method to install "pip" or "pip3".
Help appreciated.
Cobaka.
My rig is: Fossa96CE running on a 2012/13 Giga-something 64-bit CPU.
I have DevX loaded from an "official" website.
Python 3.8.10 runs when I type "python" in bash/terminal/CLI.
I'm pretty much a novice at this game.
Re: Simple, hackable offline speech to text
I attempted to run nerd-dictation in BW64 and got the same error as mentioned earlier. No success with FP95 which had pulseaudio. Similar in a Debian Dog Virtual machine using QEMU.
However, I set up a Jammypup64 virtual machine using QEMU, and it worked first time. The audio passthrough (using the switch -device AC97) seemed to be effective. Now I need to get the audio setup optimised, as my first try was littered with output errors. I have since installed a frugal version of JP64 and will try it on that setup, where it will have better speed and memory availability.
The instructions do not say that the output will be wherever the cursor is, once nerd-dictation is started. So if you start the program from the terminal, that is where the text will go unless you quickly move the cursor to a blank text document before talking. Stopping the application is also an issue, so I wrote a bash script to open up another terminal and end the process.
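The start/stop handling described above could be wrapped roughly as follows. This is a sketch, not the bash script mentioned in the post. It assumes you run it from inside the nerd-dictation directory with the model in `./model`, as in the commands earlier in the thread, and it relies on the `end` subcommand described in the nerd-dictation README. The function names and the five-second delay are invented for illustration.

```python
import subprocess
import time

NERD_DICTATION = "./nerd-dictation"  # script path as used earlier in the thread
MODEL_DIR = "./model"                # model directory as used earlier in the thread


def begin_command(model_dir=MODEL_DIR):
    """Argument list that starts dictation."""
    return [NERD_DICTATION, "begin", f"--vosk-model-dir={model_dir}"]


def end_command():
    """Argument list that asks a running dictation session to stop."""
    return [NERD_DICTATION, "end"]


def dictate(delay_seconds=5):
    """Give the user time to focus a text document, then dictate until Enter is pressed."""
    print(f"Click into your text document now; dictation starts in {delay_seconds} seconds.")
    time.sleep(delay_seconds)
    recorder = subprocess.Popen(begin_command())
    input("Dictating... come back to this terminal and press Enter to stop. ")
    subprocess.run(end_command(), check=False)
    recorder.wait()
```

Calling `dictate()` from a terminal gives you a few seconds to move the cursor before anything is typed. Anything spoken after you switch back to the terminal will still be typed there, so stop talking before you return to press Enter.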
- cobaka
Re: Simple, hackable offline speech to text
Hello @Jasper
My puppy seems to lack "pip"
This is what 'bash/terminal' told me:
pip install cffi
bash: pip: command not found
python3 -m pip install --upgrade pip
/usr/bin/python3: No module named pip
python3 -m pip install
/usr/bin/python3: No module named pip
find / -iname "pip" <== found nothing. Well, maybe 'pip' has some trailing characters beginning with dash "-"
find / -iname "pip-" <== Anything? No ...
find / -iname "pip" | more <=== I won't print the result of this search. I found pages of "pipes" and a few other files, but not one "pip*"
Did 'find' search EVERY storage device? I'm confident it searched /mnt/sda2.
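A note on the searches above: `find -iname` compares the pattern against the whole file name, so "pip" matches only an entry called exactly pip (in any letter case), while a quoted wildcard pattern such as "pip*" also matches pip3, pipes and so on. Also, `find /` only descends into filesystems that are mounted at the time. The sketch below imitates that matching rule in Python so the difference is easy to see; the function names and the sample file names are made up.

```python
import fnmatch
import os


def iname_matches(name, pattern):
    """Mimic find's -iname test: case-insensitive shell-style match on the whole name."""
    return fnmatch.fnmatchcase(name.lower(), pattern.lower())


def find_iname(root, pattern):
    """Walk the tree under root and yield paths whose last component matches pattern."""
    for directory, subdirs, files in os.walk(root):
        for name in subdirs + files:
            if iname_matches(name, pattern):
                yield os.path.join(directory, name)


if __name__ == "__main__":
    # Illustrative names only, not taken from a real system.
    sample = ["pip", "PIP", "pip3", "pip-24.2.dist-info", "pipes.py"]
    for pattern in ("pip", "pip-*", "pip*"):
        print(pattern, "->", [n for n in sample if iname_matches(n, pattern)])
```

For example, `list(find_iname("/usr/local", "pip*"))` would list every file or directory under /usr/local whose name starts with "pip".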
I'm currently searching the web for a way to get/install pip3
o-o-o time passes - then:
I found this command called 'curl' and tried it as you see below.
# curl https://bootstrap.pypa.io/pip/2.7/get-pip.py --output get-pip.py
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1863k 100 1863k 0 0 2317k 0 --:--:-- --:--:-- --:--:-- 2314k
o-o-o
a file called "get-pip.py" is in the "real" root directory. The directory called "/" not "root".
o-o-o to be continued .... o-o-o
cobaka.
- cobaka
Re: Simple, hackable offline speech to text
@Jasper
A summary of this posting: The software is in place, but I haven't plugged in/used a microphone.
The rest of this thread describes getting/installing the software. Very routine.
I'll write about using the microphone in a new message.
o-o-o
Following on from previous posting.
I needed to get pip - and I believe I have it. Looky!
$ which pip
/usr/local/bin/pip
$ pip --version
pip 20.3.4 from /usr/local/lib/python3.8/dist-packages/pip (python 3.8)
o-o-o OK. That suggests I got and installed pip.
Now I continue using your (Jasper's) dialog from the original posting.
Thank you for your patience. I'm a novice in this part of the woods ....
o-o-o however - I am now following every step given in your original posting and everything (until now) looks good.
I am in the directory 'nerd-dictation' and ls reveals this:
$ ls
changelog.rst hacking.rst _misc package readme.rst readme-ydotool.rst
examples LICENSE nerd-dictation pyproject.toml readme-sox.rst tests
$
o-o-o-K sorry to be verbose. It looks like everything worked (once I got pip working - thanks to you)
I am now at the point (in your instructions) where you wrote:
Once you have completed the above, you are ready to test it yourself.
My next task is to plug a microphone into the desktop and see what happens.
Wish me luck.
cobaka
PS I ran into a minor problem in the command wget. Copying and pasting gave an error message.
I gave up pasting and retyped the line, in the terminal and bingo - it worked.
- cobaka
Re: Simple, hackable offline speech to text
Hello @Jasper and @mikewalsh
The state of the game at the moment.
I thought I had the software installed correctly.
I found a microphone and using gWaveEdit (from the menu) saw the VU meter move to center scale when I spoke etc.
The mic is connecting to Puppy in some fundamental way.
At this point I found that the directory 'nerd-dictation' is in the directory "/" - i.e. the REAL root of the directory tree.
In your setup the directory 'nerd-dictation' is in the directory /root. This is the directory for a USER called 'root'
Looky here:
Code:
# echo $USER
root
Well - when I am running from the directory 'nerd-dictation' and run the program I get an error. You can see this under my signature.
Help/observations welcome!
cobaka
The error:
Code:
# pwd
/
# cd /nerd-dictation/
# ./nerd-dictation begin --vosk-model-dir=./model &
[1] 11879
# Traceback (most recent call last):
  File "./nerd-dictation", line 1974, in <module>
    main()
  File "./nerd-dictation", line 1970, in main
    args.func(args)
  File "./nerd-dictation", line 1835, in <lambda>
    func=lambda args: main_begin(
  File "./nerd-dictation", line 1437, in main_begin
    found_any = text_from_vosk_pipe(
  File "./nerd-dictation", line 957, in text_from_vosk_pipe
    import vosk # type: ignore
  File "/usr/local/lib/python3.8/dist-packages/vosk/__init__.py", line 12, in <module>
    from .vosk_cffi import ffi as _ffi
  File "/usr/local/lib/python3.8/dist-packages/vosk/vosk_cffi.py", line 2, in <module>
    import _cffi_backend
ModuleNotFoundError: No module named '_cffi_backend'
write() failed: Broken pipe
^C
[1]+ Exit 1 ./nerd-dictation begin --vosk-model-dir=./model
#
What about cffi?
Well - cffi exists. I know only that 'cffi' is the C Foreign Function Interface for Python. Nothing more.
Code:
$ find / -iname "cffi"
/usr/lib/python3/dist-packages/cffi
(the end)
- cobaka
Re: Simple, hackable offline speech to text
Hello @Jasper, @mikewalsh & @mikeslr
Mike - you had a 'go' at running this - but seem to have fallen away in the last week or so.
I notice you re-located discussion to other topics too.
I'm keen to re-activate this topic; I would like (very much) to introduce speech-to-text on Fossa 96CE (and other pups too).
I believe I'm only a few steps away from running speech to text on my desktop.
(See below for detail).
cobaka
- cobaka
Re: Simple, hackable offline speech to text
Hello all.
I want to get nerd dictation running (after an absence of some months).
I am running FossaPup 64 9.6 CE, distro date -> March 2023 (if that helps).
Earlier in this thread I see advice about getting pip and cffi.
On my system I have pip-24.2.
Here it is: # find / -iname 'pip'
/usr/local/lib/python3.8/dist-packages/pip
I have cffi
# pip install cffi
Requirement already satisfied: cffi in /usr/lib/python3/dist-packages (1.14.0)
# find / -iname 'cffi_*'
/usr/lib/python3/dist-packages/cffi/cffi_opcode.py
When I run the command: ./nerd-dictation begin --vosk-model-dir=./model &
I get a series of error messages. Tracing these through the last error message is:
File "/usr/local/lib/python3.8/dist-packages/vosk/vosk_cffi.py", line 2, in <module>
import _cffi_backend
ModuleNotFoundError: No module named '_cffi_backend'
In the file vosk_cffi.py the first six lines are:
Code:
# auto-generated file
import _cffi_backend
# Note (cobaka): the error message reports "ModuleNotFoundError: No module named '_cffi_backend'"
ffi = _cffi_backend.FFI('vosk.vosk_cffi',
_version = 0x2601,
_types = b'\x00\x00\x04\x0D\x00\x00\x65\x03\x00\x00\x00\x0F\x00\x00\x1C\x0D\x00\x00\x60\x03\x
Code (above) reads "import _cffi_backend" (This is line 2 mentioned in the error report)
The problem may be the underscore ("_") before "cffi" in line 2.
The cffi file in my file system has no leading underscore.
Does anyone, especially @Jasper, know what's going on?
I think I'm almost there. Help appreciated.
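On the underscore question: `_cffi_backend` is not a file inside the `cffi` directory. It is a separate compiled extension module that has to sit somewhere on Python's search path. On CPython 3.8 for 64-bit Linux the file is typically named something like `_cffi_backend.cpython-38-x86_64-linux-gnu.so`. A search for 'cffi_*' will not match it because of the leading underscore. The sketch below looks through the interpreter's own search path for such a file; `find_extension_files` is a made-up helper name.

```python
import sys
from pathlib import Path


def find_extension_files(module_name, search_paths=None):
    """List files or directories on the search path that could provide module_name."""
    paths = sys.path if search_paths is None else search_paths
    hits = []
    for entry in paths:
        directory = Path(entry or ".")  # an empty entry means the current directory
        if not directory.is_dir():
            continue  # skip zip archives and paths that do not exist
        for candidate in sorted(directory.glob(module_name + "*")):
            # Accept "name", "name.py", "name.cpython-38-...so", but not "name_other".
            if candidate.name == module_name or candidate.name.startswith(module_name + "."):
                hits.append(candidate)
    return hits


if __name__ == "__main__":
    found = find_extension_files("_cffi_backend")
    if found:
        for path in found:
            print("found:", path)
    else:
        print("_cffi_backend is not on this interpreter's search path")
```

If nothing is found, the `cffi` directory is present but its compiled backend is not, which would explain why `pip` reports cffi as "already satisfied" while `import _cffi_backend` still fails.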
cobaka
Re: Simple, hackable offline speech to text
@cobaka
Please note, my initial instructions were for Jammypup64.
I tested this on a laptop which has an inbuilt microphone.
Maybe this will help?
Update PIP
Code:
python -m pip install --upgrade pip
then remove the existing cffi package
Code:
pip uninstall cffi
finally, download and install the cffi package again
Code:
pip install --upgrade --force-reinstall cffi
- cobaka
Re: Simple, hackable offline speech to text
@Jasper
I understand your point - you use JammyPup and I'm trying to install nerd-dictation on FossaPup.
I want to get nerd-dictation working under FossaPup (eventually) because you recommend it but also there is a good review of nerd-dictation here:
https://www.youtube.com/watch?v=Cw1SESc8sdA
As a first step, I will install nerd-dictation with JammyPup on a spare machine I have here.
After I get it running with your configuration I'll return to FossaPup.
This piece of software appears most useful. I suggest every Puppian with a 64-bit machine will find it useful.
Especially this Puppian.
I may install other Puppies too and see if it will run. The review on YouTube suggests it's very useful.
Thanks for your help, Jasper.
Cobaka.
PS: For anyone who can comment about the file-name discrepancy in my preceding posting (leading underscore vs no underscore) - then please speak up. I suspect this is the problem, but I don't see the solution (yet). I think that posting is self-contained - i.e. you will understand the problem after reading just one posting.
- cobaka
Re: Simple, hackable offline speech to text
Here is a review of nerd-dictation on YouTube by arthurPizza.
The link:
https://www.youtube.com/watch?v=Cw1SESc8sdA
It's the same link I gave earlier in this thread.
I added it to make this posting self-contained.
Here is a comment by @DigitalMetal about Arthur's review:
<I commented earlier, but I don't see my comment now.
Thanks for this video. I've played with mozilla Deep Speak in the past, but it always seemed kind of bulky and slowly me. although i never did try setting a to use the gpu rather than a cpu. this [nerd-dictation] seems like a simpler solution. in fact i already have it up and running and i'm writing this comment using my voice, which is why nothing seems to be capitalized. [comment: I (cobaka) added the capitals to 'deep speak' to clarify DigitalMetal's meaning somewhat.]this will definitely be useful for me, and i already have it set up where the keyboard shortcut to turn it on and off. >
and another comment by @arnaudmosse6894:
<This tool is great, but without capitalize it is not usable. Punctiation can be done by hand or in the config file (replace period by .), but capital at the first letter of sentence cannot. >
My observation: Any written communication should be proof-read. The task of capitalisation while proof-reading is a minor inconvenience. I write reports and critiques. When I do this I remember the advice of John Steinbeck, who said something like: Editing becomes difficult after the fourth revision. Steinbeck is right. That's hard advice, but true. I will capitalize at the same time I edit. For me - no problemo.
Here is another observation by @now-you-know-it:
<Yes you are good! The problem here is us "fairly newbies" do not have a clue [about] what you just said and do. Our heads are still a bit to flat. Would it be possible to make a video how to step by step. Show what to type and install the commands etc. Yes a lot of work but for newbies this would be a good learning experience.>
Finally, there are still some bugs. @harshavardhanaradhyahu2870 got the same bug as I got.
<I got an error like this "failed to create a model">
No need to say more.
My challenge is to get this working with FossaPup.