At the
"starter" end there is perhaps the simple "Voice-to-code,
Code-to-voice" protocol of speech recognition to keyer …
Dear LF Group,
To the best of my rather limited knowledge, there seem
to be two approaches. One is to encode the speech waveform in an efficient way
that allows transmission at a relatively low data rate, whilst still allowing a
reasonable facsimile of the speaker’s voice to be reconstructed at the
receiving end. An example of this would be the “CELP” encoding used
for GSM mobile phones. I think the problem for LF would be that the minimum
practical bandwidth would still be fairly large – I recall G4JNT did some
work on digital speech for HF in a SSB bandwidth – perhaps he could
comment.
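
As a very rough illustration of why the bandwidth works out so large, here is a
few-line Python sketch comparing some published codec bit rates with a narrow LF
channel; the 1 bit/s-per-Hz efficiency and the 300 Hz channel width are just my
assumptions for a simple narrow-band mode, not measured figures:

    # Rough comparison of waveform-codec bit rates against a narrow LF channel.
    # Codec rates are nominal published figures; the efficiency and channel
    # width below are assumptions for illustration only.
    codec_rates_bps = {
        "GSM full-rate (RPE-LTP)": 13000,
        "GSM half-rate (VSELP)": 5600,
        "FS-1016 CELP": 4800,
        "MELP vocoder": 2400,
    }
    EFFICIENCY_BPS_PER_HZ = 1.0   # assumed modulation efficiency
    LF_CHANNEL_HZ = 300           # assumed usable channel width at LF

    for name, rate in codec_rates_bps.items():
        needed_hz = rate / EFFICIENCY_BPS_PER_HZ
        verdict = "fits" if needed_hz <= LF_CHANNEL_HZ else "too wide"
        print(f"{name}: ~{needed_hz:.0f} Hz needed vs ~{LF_CHANNEL_HZ} Hz available ({verdict})")

Even the 2.4 kbit/s vocoder would want a couple of kHz of spectrum on that basis,
which is far wider than anything practical at LF.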
The other way would be not to worry about
exactly reproducing the speaker’s voice, accepting that whoever the
speaker was, they would always sound like Prof. Stephen Hawking at the
receiver, but as G4GVW says, transmitting just the meaning of the spoken words.
One way to do this would be for speech recognition software at the
transmitter to generate text that could be transmitted at maybe several tens
to a few hundred bits/s, and then fed into a speech synthesizer at the receiver. I
think the problem with this is that speech recognition algorithms don’t
seem to be very accurate – when used with word processors, etc. the user
has to speak carefully, extensively “train” the software and make
many corrections when the software identifies the wrong word. This is not so
bad when the speaker reviews the text afterwards, but disastrous for a “real
time” conversation in a difficult environment! I think that a possible
way round this would be to break the speaker’s voice down into phonetic
information rather than text – but perhaps someone knows better?
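
To put some back-of-envelope numbers on that (all the figures below are purely
illustrative assumptions), the data rates for text and for phoneme transmission
work out something like this:

    # Back-of-envelope data rates for sending the "meaning" rather than the waveform.
    # Every figure here is an assumption chosen for illustration, not a measurement.
    WORDS_PER_MINUTE = 120    # assumed conversational speaking rate
    CHARS_PER_WORD = 6        # average word length including a trailing space
    BITS_PER_CHAR = 8         # plain uncompressed 8-bit text

    text_bps = WORDS_PER_MINUTE / 60 * CHARS_PER_WORD * BITS_PER_CHAR
    print(f"Text:     roughly {text_bps:.0f} bit/s")    # ~96 bit/s

    PHONEMES_PER_SECOND = 12  # assumed rate for conversational speech
    BITS_PER_PHONEME = 6      # the ~44 English phonemes fit comfortably in 6 bits
    phoneme_bps = PHONEMES_PER_SECOND * BITS_PER_PHONEME
    print(f"Phonemes: roughly {phoneme_bps:.0f} bit/s") # ~72 bit/s

Either way you end up in the tens-to-low-hundreds of bit/s region, and the
phoneme route at least avoids forcing the recogniser to commit to a particular
word.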
Cheers, Jim Moritz
73 de M0BMU