Post on 22-Dec-2015
Digital Voice Communication on HF
Ford Amateur Radio League
March 12, 2015
David Treharne, N8HKU
Overview
• J2E emissions (part of telephony). Need to have protocols that work, 2 or more, and that are not considered encryption or a cipher.
• Amateurs have been working on it since 2000, or even before that.
• There are a lot of proprietary algorithms that do not work together and are not open for experimentation.
• Can Codec 2 work?
• Where will it be useful? (Rag chewing, emergency communications, contests?)
Why digital?
• Analog has done quite well for SSB and FM.
• The commercial world uses a lot of digital signals for both voice and data.
• Digital can improve signal to noise ratio by eliminating the channel noise.
• The Challenge: How to get a digitally coded voice signal to fit within our 2,700 Hz SSB bandwidth.
Conventional compression
• Normal digital sampling
  – The sampling rate must be at least 2x the highest frequency in the signal being digitized
    • 3000 Hz of voice needs 6,000 samples per second, more reasonably 8,000 samples per second.
  – Compression algorithms:
    • Use patterns to reduce the size
    • Use lossy compression techniques.
  – If you lose some of the compressed data, then you lose the complete signal.
From: David Rowe Presentations on Codec2
• The following slides are from two Codec2 presentations by David Rowe, VK5DGR.
• 2010 codec2_tapr_2010_v0.2
• 2012 Ica_2012_codec2
Codec 2
● Make a digital encoding that models speech, not just compresses it.
● Similar to commercial versions, but made simpler, and designed for voice work only.
● open source speech codec
● low bit rate (2400 bit/s down to 1400 bit/s)
● applications include digital speech for HF and VHF radio
● fills gap in open source speech codecs beneath 5000 bit/s
● work in progress
● samples & source: rowetel.com/codec2.html
This is Not a DSP talk
● Pitch Estimation
● Linear Prediction
● Line Spectrum Pairs
● Voicing Estimation
● Vector Quantisation
● Source-Filter Model of Speech production
● Inverse Discrete Fourier Transform
● Overlap-add Synthesis
Codec2 @ 1400 bit/s
● The benefits of a compressed voice signal:
● Send 45 phone calls in a standard 64 kbit/s phone channel
● 175 bytes/s
● 30 second voice mail in 5250 bytes
● 30 minute podcast in 308 Kbytes (normal podcast: 20,000 Kbytes)
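The numbers on this slide follow directly from the 1400 bit/s rate. A minimal sketch of the arithmetic, assuming 8 bits per byte and 1 Kbyte = 1024 bytes (the variable names are made up for illustration):

```python
# Arithmetic behind the "Codec2 @ 1400 bit/s" figures.
# Assumptions: 8 bits per byte, 1 Kbyte = 1024 bytes.
CODEC_BIT_RATE = 1400          # bit/s, from the slide
PHONE_CHANNEL_BIT_RATE = 64000 # bit/s, standard phone channel

# Whole calls that fit side by side in one 64 kbit/s channel.
calls_per_channel = PHONE_CHANNEL_BIT_RATE // CODEC_BIT_RATE

# Storage needed per second of speech.
bytes_per_second = CODEC_BIT_RATE / 8

# 30 second voice mail and 30 minute podcast.
voicemail_bytes = bytes_per_second * 30
podcast_kbytes = bytes_per_second * 30 * 60 / 1024

print(calls_per_channel, bytes_per_second, voicemail_bytes, round(podcast_kbytes))
```

The four printed values correspond to the 45 calls, 175 bytes/s, 5250 bytes and 308 Kbytes quoted above.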
Main Application – Voice over Digital Radio
● RF spectrum is extremely limited, with noise and bit errors
● Traditional analog speech systems (FM, SSB) are really efficient in power and bandwidth use
● But there is interest in using digital techniques .. if the right codec is available
● Compressed speech requires less bit/s over channel
● This means less bandwidth
● Less transmitter power, saving battery and improving speech quality
Power Efficiency: Tighter Compression = Better Speech
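[Figure: power per bit compared with the noise level at two bit rates. Labels: 1 Watt; 2 at 0.5 Watt; Noise. Bit rates shown: 1400 bit/s and 2800 bit/s.]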
Digital Voice Radio System
[Figure: block diagram]
Transmit: mic -> A/D -> codec2 enc -> FEC enc -> mod -> HF/VHF radio
Receive: HF/VHF radio -> demod -> FEC dec -> codec2 dec -> D/A -> spkr
Voice is not like data
● If you get a single error in a data packet the entire packet is useless
● If you get a few errors in a voice packet it probably sounds OK (10% errors are OK)
● If you discard or lose a voice packet every now and again it's probably OK (unlike data, where all data must be perfect)
● In speech packets some bits are more important than others (protect the most important data only)
● These factors can be used to build a better voice system (less power, less spectrum, more robust)
Codec 2 Author - David Rowe
● Ham Radio operator, VK5DGR, first licensed in 1981 at age 13 (first computer in 1982)
● 20 years experience in speech coding
● Built some of the first real time speech codecs in the late 1980's on early DSP chips
● Now works full time on open software/open hardware for developing world communications
● (Wants efficient communications in parts of Africa not covered by phones or even by cell phone towers. Uses a type of mesh network to transmit efficient voice over long distances with little power or equipment.)
Proprietary Codecs
● come in hardware or licensed software form
● difficult to distribute
● they cannot be modified
● understanding how they work is discouraged
● modification is actually illegal under the license
● D-Star uses the AMBE coding system. We cannot modify it; we can only purchase and use the hardware chip that performs this function.
Speech Coding
● Take speech samples (e.g. 16 bit samples at 8 kHz sampling rate)
● Compress to 1400 to 2400 bit/s
● What can we throw away?
● Retain intelligible speech
● Retain natural speech
● Use a model of speech, send model parameters, more efficient than coding waveform (Not just using compression, but looking at how the human voice works. These models are not good for music, noises, etc.)
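To get a feel for how much has to be thrown away, here is a rough sketch comparing the raw sample stream with the codec output. It uses only the figures on the slide (16 bit samples, 8 kHz, 1400 to 2400 bit/s); the function name is made up for illustration.

```python
# Raw PCM bit rate versus the codec bit rates quoted on the slide.
SAMPLE_BITS = 16       # bits per speech sample
SAMPLE_RATE_HZ = 8000  # samples per second

raw_bit_rate = SAMPLE_BITS * SAMPLE_RATE_HZ  # bit/s before any coding

def compression_ratio(codec_bit_rate):
    """How many times smaller the coded stream is than raw PCM."""
    return raw_bit_rate / codec_bit_rate

for rate in (2400, 1400):
    print(f"{rate} bit/s -> about {compression_ratio(rate):.0f}:1")
```

The raw stream is 128,000 bit/s, so the codec has to shrink it by a factor of roughly 50 to 90. That is why a speech model is used instead of general-purpose compression.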
Model Parameter
● example of a model parameter is pitch
● for humans in the range 50 to 500 Hz (100 Hz for males, 500 Hz for children)
● can be accurately represented with 7 bits
● updated every 20 ms
● so 7/0.02 = 350 bit/s to represent pitch
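As an illustration of how 7 bits can cover the 50 to 500 Hz range, here is a hypothetical quantiser that spaces its 128 levels evenly on a logarithmic frequency scale. This is an assumption made for the sketch and is not taken from the Codec 2 source; only the range, the 7 bits and the 20 ms update come from the slide.

```python
import math

# Hypothetical 7-bit pitch quantiser (log-spaced levels); not Codec 2's own code.
F_MIN_HZ = 50.0
F_MAX_HZ = 500.0
PITCH_BITS = 7
FRAME_S = 0.020
LEVELS = 2 ** PITCH_BITS  # 128 possible index values

def quantise_pitch(f_hz):
    """Map a pitch in Hz to an integer index 0..127."""
    f = min(max(f_hz, F_MIN_HZ), F_MAX_HZ)  # clamp to the supported range
    frac = math.log(f / F_MIN_HZ) / math.log(F_MAX_HZ / F_MIN_HZ)
    return round(frac * (LEVELS - 1))

def dequantise_pitch(index):
    """Map an index 0..127 back to a pitch in Hz."""
    frac = index / (LEVELS - 1)
    return F_MIN_HZ * (F_MAX_HZ / F_MIN_HZ) ** frac

pitch_bit_rate = PITCH_BITS / FRAME_S  # bits per second spent on pitch

index = quantise_pitch(230.0)
print(index, round(dequantise_pitch(index), 1), round(pitch_bit_rate))
```

With this spacing each step is a little under 2% in frequency, so the decoded pitch is within about 1% of the original.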
Sinusoidal Speech Coding
[Figure: time-domain waveform of a 40 ms speech sample (female speaker). Axes: Time (samples) vs. Amplitude (16 bit samples). Pitch period: 35 samples, or 4.4 ms at 8 kHz sample rate (230 Hz).]
Notice the heavy repetition in the signal; this gives us the pitch for this 40 ms sample.
Sinusoidal Speech Coding
[Figure: spectrum of the speech sample. Axes: Frequency (Hz) vs. Amplitude (dB). Pitch 230 Hz or 4.3 ms, with peaks at the harmonics of 230 Hz.]
Sinusoidal Speech Model
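[Figure: speech modelled as L sine waves, each with its own amplitude, phase and frequency (Amplitude 1, Phase 1, Frequency 1; Amplitude 2, Phase 2, Frequency 2; ... Amplitude L, Phase L, Frequency L). Panels show a male speaker (80 Hz nominal pitch) approximated with 1, 3, 10, 25 and 50 sine waves.]
50 sine wave harmonics, just changing the phase and amplitude of each, produce a very close signal.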
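The sinusoidal model can be sketched in a few lines: a frame of speech is the sum of L harmonics of the pitch, each with its own amplitude and phase. The amplitudes and phases below are made-up placeholder values, not measured speech, and the function name is hypothetical; only the 8 kHz sample rate, the 80 Hz pitch and the 50 harmonics come from the slides.

```python
import math

SAMPLE_RATE_HZ = 8000

def synthesise(pitch_hz, amplitudes, phases, n_samples):
    """Sum harmonics of pitch_hz; harmonic m has frequency m * pitch_hz."""
    out = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE_HZ
        sample = 0.0
        for m, (amp, phase) in enumerate(zip(amplitudes, phases), start=1):
            sample += amp * math.cos(2 * math.pi * m * pitch_hz * t + phase)
        out.append(sample)
    return out

# Male speaker, 80 Hz nominal pitch: 50 harmonics reach 4000 Hz.
L = 50
amps = [1.0 / m for m in range(1, L + 1)]  # placeholder amplitudes
phases = [0.0] * L                         # placeholder phases
frame = synthesise(80.0, amps, phases, 320)  # 320 samples = 40 ms

print(len(frame))
```

Because every component is a harmonic of 80 Hz, the output repeats every 100 samples (8000 / 80). That is the same kind of repetition visible in the waveform plot.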
Encoder Block Diagram
[Figure: encoder block diagram. Input: 16 bit, 8 kHz samples. Blocks: FFT, Pitch est, Pitch Quant, MBE Voicing est, LPC Analysis, LPC to LSP, LSP Quant, Energy Quant, LPC Correction. Output: 2550 bit/s quantised model parameters.]
Notes on the diagram:
• 7 bits of data for the pitch or fundamental frequency of the speaker at that point in time
• 5 bits of energy signal (how loud)
• 1 bit to help with the low pitch frequencies
• 2 bits (sampled at 10 ms intervals) of voicing (a vowel or a consonant)
• 36 bits (the model parameters of the speech)
Bit Allocation
From one of the first versions of the codec: 51 bits per 20 ms frame, or 2550 bit/s.

Parameter                        Bits/frame
Spectral magnitudes (LSPs)       36
Low frequency LPC correction     1
Energy                           5
Voicing (updated each 10 ms)     2
Pitch                            7
Total                            51
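The total and the bit rate in the table can be checked with a short sketch. The dictionary keys simply repeat the table rows; the variable names are made up for illustration.

```python
# Bit allocation per 20 ms frame, copied from the table above.
FRAME_S = 0.020

bit_allocation = {
    "Spectral magnitudes (LSPs)": 36,
    "Low frequency LPC correction": 1,
    "Energy": 5,
    "Voicing (updated each 10 ms)": 2,
    "Pitch": 7,
}

bits_per_frame = sum(bit_allocation.values())
frames_per_second = round(1 / FRAME_S)         # 20 ms frames -> 50 per second
bit_rate = bits_per_frame * frames_per_second  # bit/s

print(bits_per_frame, frames_per_second, bit_rate)
```

Trimming bits from any row lowers the bit rate in steps of 50 bit/s, since there are 50 frames per second.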
Decoder Block Diagram
[Figure: decoder block diagram. Inputs: LSPs, Energy, LPC Correction, Voicing. Blocks: LSP to LPC, FFT, Recover Harm Amps, Post Filter, Phase Synthesis, Inverse FFT, Overlap Add. Output: 16 bit, 8 kHz samples.]
FreeDV: The Digital voice for HF
Speech is compressed down to 1600 bit/s then modulated onto a 1.25 kHz wide 16QPSK signal which is sent to the Mic input of a SSB radio. On receive, the signal is received by the SSB radio, then demodulated and decoded by FreeDV. Communications should be readable down to 2 dB S/N, and long-distance contacts are reported using 1-2 watts power.
FreeDV was built by an international team of Radio Amateurs working together on coding, design, user interface and testing. FreeDV is open source software, released under the GNU Public License version 2.1. The FDMDV modem and Codec 2 Speech codec used in FreeDV are also open source.
Running Codec2
• May 21, 2014: SmartMic announced! An embedded hardware product that allows you to run FreeDV without a PC. Plug SmartMic into your SSB or FM radio, and you now have Digital Voice (DV). (Allows operation with only 1 sound card.) $195.00
Issues over HF
• Fading: The signal is lost, and data is lost. Some fading can be handled by the Forward Error Correction (FEC) bits. Ham Radio HF packet radio communication uses FEC. Codec2 also tries to decode signals even after fading. Most modems are designed to throw out data when signals are missing, since they are designed for data, where everything must be perfect. Codec2 is designed to try to decode even if there are errors. This allows us to still hear the rest of the speech even after a dropout. Since speech has a lot of redundancy, this is not an issue. (We do this all the time on HF.)
• Group delay:
  – Signals bounce off the ionosphere at different heights, so some of them take a longer path and arrive late, mixing with the signals that follow.
    • 5 ms is a common delay on HF. Need a codec that sends one packet of bits for longer than 5 ms to make sure that it does not get affected by the delay. (Ham Radio RTTY at 45 baud sends the same signal for over 5 ms before switching to a new signal to handle this delay. That is why that speed works so well for HF communications!)
• Lining up the signal right in the passband: a problem for us already in HF SSB. Codec 2 starts a transmission by sending a sequence of known tones. The receiver knows the frequencies of these tones and lines up the decoder to match them.
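The group delay point can be checked with simple arithmetic: each symbol has to last longer than the multipath delay. A small sketch, assuming the 5 ms delay and 45 baud RTTY figures from the slide (the 1200 baud comparison and the function names are made up for illustration):

```python
# Compare symbol duration with the typical HF multipath delay.
MULTIPATH_DELAY_S = 0.005  # 5 ms, the common HF delay quoted on the slide

def symbol_duration_s(baud):
    """Time each symbol is held on the air, in seconds."""
    return 1.0 / baud

def outlasts_multipath(baud, delay_s=MULTIPATH_DELAY_S):
    """True if each symbol is held longer than the delay spread."""
    return symbol_duration_s(baud) > delay_s

for baud in (45, 1200):
    ms = symbol_duration_s(baud) * 1000
    print(f"{baud} baud: {ms:.1f} ms per symbol, ok = {outlasts_multipath(baud)}")
```

At 45 baud each symbol lasts about 22 ms, well over the 5 ms delay. A single-carrier 1200 baud signal would change symbols in under 1 ms and smear together.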
What does this sound like?
● http://www.rowetel.com/blog/?page_id=452
● Male vs Female (pitch differences; Codec2 has trouble with low frequencies, so it adds in an extra bit)
● Background noise and speech codecs (if the noise does not sound like a voice with harmonics, then it does not bother the coding very much)
Conclusions
• Digital HF will continue in the experimentation phase into the future
  – Use more bits to make it sound better, or to add in more error correction?
  – Find a way to do this with one sound card? (I am not sure why it requires two cards.)
• Best used when a communication link has been established, then switch to digital.
• When it does work, it eliminates all of the QRM and QRN on SSB. It could be the ragchew method of the future! Maybe, just maybe, even for contests!
Bibliography
• FreeDV: http://freedv.org/tiki-index.php
• Codec 2: http://www.rowetel.com/blog/?page_id=452