new tts just dropped (just kidding)
Played 115 times
Uploaded by: michaelos
Upload date: 2/17/2025
Description:
michaelOs tts... Well, this is a proof of concept. It was created with the help of my voice and Earcons and Speech Rules. The only caveat here, and a huge drawback, is that you cannot make it pronounce things in English normally. What you can do instead is write the text phonetically, so that the synthesizer reads it phonetically and pronounces it that way.

As this is a proof of concept, it can only speak Spanish, and even then, some things have to be changed to get the pronunciation right. The speech might be unintelligible. For example, que and qui. Those should be pronounced "kay" and "kee", but if you input them as they are, they will be pronounced something like "que" and "qui". If you have an English speech synthesizer, you'll notice how they would be pronounced. The same happens with gue and gui, those special combinations of letters which change the pronunciation of the g. The same happens with c, which is why you will not hear the proof-of-concept synthesizer pronounce audacity as "ah oo dah see tee"; instead you will hear it pronounced like "a oo dah kee tee", so there's that.

There's another drawback: it sometimes won't shut up with Control. That's a problem, yes it is, but it only applies to certain situations.

And here are the steps for making it:

1: Create a portable copy of NVDA with Earcons and Speech Rules. If you have the add-on already, you can create a portable copy from your current NVDA installation with Earcons and Speech Rules.

2: Record your voice. At least 26 clips, one for each letter, vowels included, from a to z. A, e, i, o and u should be pronounced similar to ah, a, ee, o and oo. The other letters can be pronounced phonetically, the way you'd normally pronounce them. I haven't yet discovered a way to make numbers other than prerecording my own, so for now, that's how it might be.

3: Make a new folder inside user config\addons\phoneticPunctuation\sounds\. It should be named character. In here, export all the audio files you created of your prerecorded voice uttering the vowels and characters, each one named after its letter. Make sure to edit and cut. If you've recorded one whole audio of you uttering them, from a to z, then you should cut and split it into different tracks. Or you can go the longer route of making separate tracks. Or, if you don't have something that allows for multitrack editing, you can record one, export, record another, export. It depends on your workflow and editor preference.

4: Once that is done, replace the letters. Go to NVDA settings, Earcons and Speech Rules. In here, press the button to add a rule. Your first letter will be an a. Put that into the pattern box, go to the category combo box, and press c to get to characters. Immediately press Tab to move to the next combo box, which selects the wav file, and select the file of yourself uttering the vowel a. Do this with the subsequent letters, replacing patterns with the audio as you go, until you get to z. From there, go back to a, but this time in caps, and go to z again.

5: Extras. You can also make a silence track, export that, and use it as punctuation. In my case I just manually inserted silences.

This is a very, very rudimentary way of doing this stuff. If you want higher quality, you could likely make a long audio of your voice and somehow make a Piper TTS version of your voice, but there you're left on your own because I've never done it, so yeah. I don't know any other alternatives, especially local ones. Local voice cloning likely requires a good GPU. The voice cloning thing could be done with RVC, I believe, though I don't exactly know, or with ElevenLabs if you're into that and prefer online voice cloning.
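The respelling described above (que, qui, gue, gui and c before e or i) could in principle be automated before the text reaches the synthesizer. What follows is only a rough Python sketch under stated assumptions, not part of the add-on or of the original setup: the names `respell` and `clip_names` are made up for illustration, the `a.wav`-style file naming is a guess at "each one named after its letter", c before e or i is mapped to s to match the "ah oo dah see tee" target, and the g-before-e/i to j rule is ordinary Spanish spelling rather than something stated in the post.

```python
import re

# Hypothetical rule table: letter groups whose sound changes before e or i,
# mapped to a single letter that the per-letter clips can pronounce.
_RULES = {"qu": "k", "gu": "g", "g": "j", "c": "s"}

# One pass, left to right. "gu" is tried before plain "g", so "gue" becomes
# "ge" (hard g) and is not then turned into "je".
_PATTERN = re.compile(r"(qu|gu|g|c)(?=[eiéí])")


def respell(text):
    """Rewrite Spanish spelling so that each letter stands for one sound."""
    text = text.lower()  # assumption: capitals use the same clips as lowercase
    return _PATTERN.sub(lambda m: _RULES[m.group(1)], text)


def clip_names(text):
    """List the (assumed) per-letter file names, skipping non-letters."""
    return [ch + ".wav" for ch in respell(text) if ch.isalpha()]


if __name__ == "__main__":
    print(respell("audacity"))   # audasity
    print(respell("guitarra"))   # gitarra
    print(clip_names("que"))     # ['k.wav', 'e.wav']
```

Combinations such as güe and güi are left untouched here, and the sketch does nothing about stress or numbers, so it only covers the cases mentioned in the description.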
Comments
jim_pickens
so, when does the nvda addon drop? Sounds like it'd be a good eloquence replacement, certainly sounds better than eloquence anyway.
MichaelOs
I like the irony in one of the comments. Well, addressing that: the synthesizer is bad because it's a proof of concept, and a proof of concept is just the beginning. For now I cannot make it speak English in any way other than phonetically, as English pronounced the way it's written is not easy to understand. Second, I don't have a GPU to run RVC. Third, I don't even know how to use and run RVC, at least for now. If it sounds horrible, it's because I replaced all the letters of the alphabet, including accented letters and ñ, with recorded phonemes of my voice. It's only meant to be an experiment. A proof of concept. I used Earcons and Speech Rules, also known as Phonetic Punctuation. If I could code in Python and make a sort of synthesizer that has folders, each one with different characters, then I could create multiple voices, but for now, I can only change characters to other spliced prerecorded voices with another portable copy of NVDA. anyway, thx. Mwahaha
danestange
use chatgpt and make this thing awesome!! This is a very cool concept and with work it could be great!
danestange
if you release the code or speech samples with a readme, maybe I can play with it and make a version and share it with you.
MichaelOs
perhaps I might release the speech samples, because that's what I used. I'd have to check the files. And about ChatGPT... perhaps I might try to see if it will generate some Python script for a synthesizer, but I don't exactly trust this stuff, as I don't know what I'd be getting into. The only thing I can understand to a degree is HTML, and even then, I get confused around things like borders and button labels. In the description there's a step by step for making your own "speech synthesizer" with the Earcons and Speech Rules NVDA add-on. The step by step is similar to making entries for a dictionary. Earcons and Speech Rules apparently supports the creation of folders, so this is how I got this working: putting the characters folder into the sounds folder of the portable copy I made. Perhaps I might be able to upload the NVDA portable with the whole thing, but I don't promise it will be good.
MichaelOs
I don't promise I might do all of that, as I am often forgetful and sometimes distracted by stuff, though.
patricus
also, what language are the samples?
MichaelOs
it's more like phonetic. One can make it speak any language that has the vowels included. It cannot speak English perfectly, of course not, but at least it can, provided you spell things the right way so they are pronounced the right way. Perhaps with more phonemes I might be able to make it speak some other languages that require extra phonemes. It can speak English with a very horrible accent, and Spanish, though the new version is harder to understand due to me decreasing the vowel lengths
patricus
I meant, that didn't sound like an English tts, the phonemes sounded like some non-English stuff.
MichaelOs
it isn't meant for English, as English is not phonetically spelled. To make it speak a very horrible English I had to spell the whole thing phonetically, else it wouldn't. It says things like they're spelled, and it was originally meant for Spanish, as it's pretty phonetic except for a few things, so yeah
patricus
rofl, the same as Polish, really phonetic, Polish is phonetic 99% of the time.
MichaelOs
right. Vaguely reminds me of gregor except gregor sounded even worse, at least in my opinion, though this one seems to use similar techniques as gregor to produce rudimentary tts.
MichaelOs
transcription of the audio (quotes means there's a grammatical error and also it occasionally means false statements):
speech mode talk blank this is an ew synthesizer blank espaci... blank espacio (space) this is a new synthesizer but it only supports spanish for now show hidden icons button audacity audacity blank let's read a text in spanish to show you the power of this synthesizer show hidden... documents documents file explorer books AI gener... old spanish documents AI generated local content bo... Unti... Auda... I have to modify the document to "read it correctly" so apologies, and to read in english you have to "literally" write it phonetically. Warning the following document is in spanish, so only proceed if you understand spanish. Take note the spanish here is a horrible attempt at using the vosotros forms and I "made it" when I was a little kid so excuse me for "the very cringy things" without further ado here we go

translation of spanish text:
how to make industrial potato chips
hi folks. Today I'll teach y'all how to make industrial potato chips, so buckle your belts, that this recipe will leave your jaws dropped. first, you peel the shells from the potatoes, "dredging them in flour and egg" after you're done with them so that when they're fried they look so amazing and crunchy that you sh*t yourself. y'all gotta fry them in a frying pan while "stirring the flour and the egg mixture" you have to use a machine to leave them tatters thin with the final touch "dredge with a bit of salt and stir well" oh, how cool it came out! finish y'all's industrial potato chips. share em with your family bruh, see how they ended out? they're cool, right? my mouth is watering!

Original spanish text:
cómo hacer patatas fritas de fábrica
hola chavales, hoy, os enseñaré como hacer patatas fritas de fábrica. Así que abróchense los cinturones, que esta receta los dejará con la boca abierta primero, os sacaros las cascaras de las patatas y poneros huevo y harina para que tengan un buen aspecto chulo y crugiente que te cagas. os tenéis que freírlas en una sartén y os revolveros el huevo con la harina para que queden tan chulas como las de fábrica, tenéis que usar una máquina para dejar finas las patacas con el toque final echaros sal y os revolveros muy bien. ¡o que chulo quedó! terminaros buestras patatas fritas de fabrica comparteros con tú familia ¡os tío cómo ha quedado! molan mucho y se me hace agua la boca
Athlon
yeah I would never have figured that out from the synth
patricus