Digital Voice Communication on HF Ford Amateur Radio League March 12, 2015 David Treharne, N8HKU.


Digital Voice Communication on HF

Ford Amateur Radio League, March 12, 2015

David Treharne, N8HKU

Overview

• J2E emissions (part of telephony). Need to have protocols that work, 2 or more, and not considered encryption or a cypher.

• Amateurs have been working on it since 2000, or even before that

• A lot of proprietary algorithms that do not work together and are not open for experimentation

• Can Codec 2 work?
• Where will it be useful? (Rag chewing, emergency communications, contests?)

Why digital?

• Analog has done quite well for SSB and FM.
• The commercial world uses a lot of digital signals for both voice and data.
• Digital can improve signal-to-noise ratio by eliminating the channel noise.
• The challenge: how to get a digital compression code down below our 2,700 Hz SSB bandwidth

Conventional compression

• Normal digital sampling
  – A digitized signal needs a sample rate of at least 2x its highest frequency
    • 3000 Hz of voice needs 6,000 samples per second, more reasonably 8,000 samples per second.
  – Compression algorithms:
    • Use patterns to reduce the size
    • Use lossy compression techniques.
  – If you lose some of the compressed data, then you lose the complete signal.

From: David Rowe Presentations on Codec2

• The following slides were from two Codec2 presentations by David Rowe, VK5DGR.

• 2010 codec2_tapr_2010_v0.2
• 2012 Ica_2012_codec2

Codec 2

● Make a digital encoding that models speech, not just compress it.
● Similar to commercial versions, but made simpler, and designed for voice work only.
● open source speech codec
● low bit rate (2400 bit/s down to 1400 bit/s)
● applications include digital speech for HF and VHF radio
● fills gap in open source speech codecs beneath 5000 bit/s
● work in progress
● samples & source: rowetel.com/codec2.html

This is Not a DSP talk

● Pitch Estimation
● Linear Prediction
● Line Spectrum Pairs
● Voicing Estimation
● Vector Quantisation
● Source-Filter Model of Speech production
● Inverse Discrete Fourier Transform
● Overlap-add Synthesis

Codec2 @ 1400 bit/s

● The benefits of a compressed voice signal:
● Send 45 phone calls in a standard 64 kbit/s phone channel
● 175 bytes/s
● 30 second voice mail in 5250 bytes
● 30 minute podcast in 308 Kbytes (normal podcast: 20,000 Kbytes)
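The figures on this slide follow directly from the 1400 bit/s rate. The short Python check below reproduces them; it assumes "Kbytes" means 1024 bytes, which is the reading that gives the slide's 308 figure. The variable names are illustrative only.

```python
# Reproduce the slide's arithmetic for Codec 2 at 1400 bit/s.
CODEC_BIT_RATE = 1400        # bit/s, from the slide
PHONE_CHANNEL_RATE = 64000   # bit/s, standard phone channel from the slide

bytes_per_second = CODEC_BIT_RATE / 8             # 8 bits per byte
voicemail_bytes = bytes_per_second * 30           # 30 second voice mail
podcast_bytes = bytes_per_second * 30 * 60        # 30 minute podcast
podcast_kbytes = podcast_bytes / 1024             # assumption: 1 Kbyte = 1024 bytes
calls_per_channel = PHONE_CHANNEL_RATE // CODEC_BIT_RATE  # whole calls only

print(f"{bytes_per_second:.0f} bytes/s")
print(f"30 s voice mail: {voicemail_bytes:.0f} bytes")
print(f"30 min podcast: {podcast_kbytes:.0f} Kbytes")
print(f"Calls in a 64 kbit/s channel: {calls_per_channel}")
```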

Main Application – Voice over Digital Radio

● RF spectrum is extremely limited, noise, bit errors
● Traditional analog speech systems (FM, SSB) are really efficient in power and bandwidth use
● But there is interest in using digital techniques, if the right codec is available
● Compressed speech requires fewer bit/s over the channel
● This means less bandwidth
● Less transmitter power, saving battery and improving speech quality

Power Efficiency: Tighter Compression = Better Speech

[Figure: power comparison of a 1400 bit/s signal and a 2800 bit/s signal. Labels: 1 Watt; 2 at 0.5 Watt.]

Digital Voice Radio System

[Block diagram. Transmit: A/D, codec2 enc, FEC enc, HF/VHF radio. Receive: HF/VHF radio, FEC dec, codec2 dec, D/A.]

Voice is not like data

● If you get a single error in a data packet the entire packet is useless
● If you get a few errors in a voice packet it probably sounds OK (10% errors are OK)
● If you discard or lose a voice packet every now and again it's probably OK (unlike data, where all data must be perfect)
● In speech packets some bits are more important than others (protect the most important data only)
● these factors can be used to build a better voice system (less power, less spectrum, more robust)
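The idea of protecting only the most important bits can be sketched in a few lines of Python. This is an illustration only: the 3x repetition code, the frame layout (important bits first) and the function names `protect` and `recover` are assumptions made for the example, not the FEC scheme that Codec 2 or FreeDV actually uses.

```python
# Unequal error protection sketch: repeat only the important bits.
REPEAT = 3  # hypothetical repetition factor; must be odd for majority voting

def protect(frame_bits, n_important):
    """Repeat the first n_important bits REPEAT times; send the rest as-is."""
    important = frame_bits[:n_important]
    rest = frame_bits[n_important:]
    coded = [bit for bit in important for _ in range(REPEAT)]
    return coded + rest

def recover(coded_bits, n_important):
    """Majority-vote each repeated group; pass unprotected bits through."""
    protected_len = n_important * REPEAT
    important = []
    for start in range(0, protected_len, REPEAT):
        group = coded_bits[start:start + REPEAT]
        important.append(1 if sum(group) > REPEAT // 2 else 0)
    return important + coded_bits[protected_len:]

# Example: an 8 bit frame whose first 3 bits are treated as important.
example_frame = [1, 0, 1, 1, 0, 0, 1, 0]
sent = protect(example_frame, 3)
sent[4] ^= 1  # a single channel error inside the protected part
print(recover(sent, 3) == example_frame)
```

An error in the unprotected tail would pass through uncorrected, which is the trade-off the slide describes: those bits matter less to how the speech sounds.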

Codec 2 Author - David Rowe

● Ham Radio operator, VK5DGR, first licensed in 1981 at age 13 (first computer in 1982)
● 20 years experience in speech coding
● Built some of the first real time speech codecs in the late 1980's on early DSP chips
● Now works full time on open software/open hardware for developing world communications
● (Wants efficient communications in parts of Africa not covered by phones or even by cell phone towers. Use a type of mesh network to transmit efficient voice over long distances with little power or equipment)

Proprietary Codecs

● come in hardware or licensed software form
● difficult to distribute
● they cannot be modified
● understanding how they work is discouraged
● modification is actually illegal under the license
● D-Star uses the AMBE coding system. We cannot modify it, we can just purchase and use the hardware chip that performs this function.

Speech Coding

● Take speech samples (e.g. 16 bit samples at 8 kHz sampling rate)
● Compress to 1400 to 2400 bit/s
● What can we throw away?
● Retain intelligible speech
● Retain natural speech
● Use a model of speech, send model parameters, more efficient than coding waveform. (Not just using compression, but looking at how the human voice works. These models are not good for music, noises, etc.)
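To see how much has to be thrown away, compare the raw sample stream with the codec output. The sketch below uses only the numbers on the slide (16 bit samples, 8 kHz, 1400 to 2400 bit/s); the function name `compression_ratio` is illustrative.

```python
# Raw PCM bit rate versus the Codec 2 target rates from the slide.
SAMPLE_RATE_HZ = 8000
BITS_PER_SAMPLE = 16

raw_bit_rate = SAMPLE_RATE_HZ * BITS_PER_SAMPLE  # bit/s before any coding

def compression_ratio(codec_bit_rate):
    """How many times smaller the coded stream is than raw PCM."""
    return raw_bit_rate / codec_bit_rate

print(f"Raw PCM: {raw_bit_rate} bit/s")
for rate in (2400, 1400):
    print(f"{rate} bit/s is about {compression_ratio(rate):.0f}:1")
```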

Model Parameter

● example of a model parameter is pitch
● for humans in the range 50 to 500 Hz (100 Hz for males, 500 Hz for children)
● can be accurately represented with 7 bits
● updated every 20 ms
● so 7/0.02 = 350 bit/s to represent pitch
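A 7 bit pitch parameter can be illustrated with a simple quantiser. The sketch below assumes a uniform (linear) spacing of the 128 levels across 50 to 500 Hz, which is a simplification chosen for clarity; the slides do not say how Codec 2 spaces its pitch levels, and the function names are hypothetical.

```python
# Uniform 7 bit pitch quantiser over 50 to 500 Hz (illustrative assumption).
PITCH_MIN_HZ = 50.0
PITCH_MAX_HZ = 500.0
PITCH_BITS = 7
FRAME_MS = 20

LEVELS = 2 ** PITCH_BITS                                # 128 possible codes
STEP_HZ = (PITCH_MAX_HZ - PITCH_MIN_HZ) / (LEVELS - 1)  # spacing between levels

def quantise_pitch(pitch_hz):
    """Map a pitch in Hz to an integer code in 0..127, clamping out-of-range input."""
    clamped = min(max(pitch_hz, PITCH_MIN_HZ), PITCH_MAX_HZ)
    return round((clamped - PITCH_MIN_HZ) / STEP_HZ)

def dequantise_pitch(index):
    """Map a 7 bit code back to a pitch in Hz."""
    return PITCH_MIN_HZ + index * STEP_HZ

# Bits per frame divided by frame duration gives the pitch bit rate.
pitch_bit_rate = PITCH_BITS * 1000 / FRAME_MS

code = quantise_pitch(230.0)
print(code, round(dequantise_pitch(code), 1), pitch_bit_rate)
```

With this spacing the worst-case error is half a step, a little under 2 Hz.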

Sinusoidal Speech Coding

[Figure: time-domain waveform of a female speaker. Axes: Time (samples) and Amplitude (16 bit samples). Annotation: pitch period of 35 samples, or 4.4 ms at 8 kHz sample rate (230 Hz).]

Notice the heavy repetition in the signal; it gives us the pitch for this 40 ms sample.

Sinusoidal Speech Coding

[Figure: speech spectrum. Axes: Frequency (Hz) and Amplitude. Annotations: pitch 230 Hz or 4.3 ms; harmonics of 230 Hz.]

Sinusoidal Speech Model

[Figure: the model as a bank of L sine waves, each with its own parameters: (Amplitude 1, Phase 1, Frequency 1), (Amplitude 2, Phase 2, Frequency 2), ... (Amplitude L, Phase L, Frequency L). Panels: 1 sine wave; 3 sine waves; male speaker, 80 Hz nominal pitch.]

50 sine wave harmonics, just changing phase and amplitude of each, produce a very close signal.
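The sinusoidal model can be sketched as a sum of harmonics of the pitch, each with its own amplitude and phase. In the example below the amplitudes (1/k) and phases (all zero) are made-up values chosen only to show the mechanics; a real decoder would recover them from the transmitted model parameters. The function name `synthesise` is illustrative.

```python
import math

SAMPLE_RATE = 8000

def synthesise(pitch_hz, amplitudes, phases, n_samples):
    """Sum harmonics k = 1..L of pitch_hz, each with its own amplitude and phase."""
    out = []
    for n in range(n_samples):
        t = n / SAMPLE_RATE
        sample = 0.0
        for k, (amp, phase) in enumerate(zip(amplitudes, phases), start=1):
            sample += amp * math.cos(2 * math.pi * k * pitch_hz * t + phase)
        out.append(sample)
    return out

# 50 harmonics of an 80 Hz pitch reach up to 4000 Hz, half the sample rate.
NUM_HARMONICS = 50
PITCH_HZ = 80.0
amps = [1.0 / k for k in range(1, NUM_HARMONICS + 1)]  # made-up spectral envelope
phases = [0.0] * NUM_HARMONICS                         # made-up phases

frame = synthesise(PITCH_HZ, amps, phases, 200)  # 25 ms of signal
print(len(frame), round(frame[0], 3))
```

Because every component is a harmonic of 80 Hz, the output repeats every 100 samples (12.5 ms), which is the pitch period.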

Encoder Block Diagram

[Block diagram. Input: 16 bit, 8 kHz samples. Blocks: FFT, Pitch est, Pitch Quant, MBE Voicing est, LPC Analysis, LPC to LSP, LSP Quant, Energy Quant, LPC Correction. Output: 2550 bit/s quantised model parameters.]

7 bits of data for the pitch or fundamental frequency of the speaker at that point in time.

5 bits of energy signal (how loud)

1 bit to help with the low pitch frequencies

2 bits of voicing (sampled at 10 ms intervals): whether the sound is a vowel or a consonant

36 bits (The model parameters of the speech)

Bit Allocation

From one of the first versions of the codec: 51 bits per 20 ms frame, or 2550 bit/s.

Parameter                        Bits/frame
Spectral magnitudes (LSPs)       36
Low frequency LPC correction     1
Energy                           5
Voicing (updated each 10 ms)     2
Pitch                            7
Total                            51
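The table's total and the resulting bit rate can be checked with a few lines of Python. The dictionary keys are shortened labels for the table rows, chosen for this example.

```python
# Bit allocation per 20 ms frame, copied from the table above.
FRAME_MS = 20

bit_allocation = {
    "spectral_magnitudes_lsps": 36,
    "low_freq_lpc_correction": 1,
    "energy": 5,
    "voicing": 2,
    "pitch": 7,
}

bits_per_frame = sum(bit_allocation.values())
bit_rate = bits_per_frame * 1000 / FRAME_MS  # frames per second times bits per frame

print(f"{bits_per_frame} bits per frame, {bit_rate:.0f} bit/s")
for name, bits in bit_allocation.items():
    print(f"{name}: {100 * bits / bits_per_frame:.0f}% of the frame")
```

The LSPs take 36 of the 51 bits, so most of the bit budget goes to describing the spectral envelope.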

Decoder Block Diagram

[Block diagram. Inputs: LSPs, Energy, LPC Correction, Voicing. Blocks: LSP to LPC, FFT, Recover Harm Amps, Phase Synthesis, Inverse FFT, Overlap Add, Post Filter. Output: 16 bit, 8 kHz samples.]

FreeDV: The Digital voice for HF

Speech is compressed down to 1600 bit/s then modulated onto a 1.25 kHz wide 16QPSK signal which is sent to the Mic input of a SSB radio. On receive, the signal is received by the SSB radio, then demodulated and decoded by FreeDV. Communications should be readable down to 2 dB S/N, and long-distance contacts are reported using 1-2 watts power.

FreeDV was built by an international team of Radio Amateurs working together on coding, design, user interface and testing. FreeDV is open source software, released under the GNU Lesser General Public License (LGPL) version 2.1. The FDMDV modem and Codec 2 speech codec used in FreeDV are also open source.

Running Codec2

• May 21, 2014: SmartMic announced! An embedded hardware product that allows you to run FreeDV without a PC. Plug SmartMic into your SSB or FM radio, and you now have Digital Voice (DV). (Allows operation with only one sound card.) $195.00

Issues over HF

• Fading: The signal is lost, and data is lost. Some fading can be handled with the Forward Error Correction (FEC) bits. Ham Radio HF packet radio communication uses FEC. Codec2 also tries to decode signals even after fading. Most modems are designed to throw out data when signals are missing, since they are designed for data, where everything must be perfect. Codec2 is designed to try to decode even if there are errors. This allows us to still hear the rest of the speech even after a dropout. Since speech has a lot of redundancy, this is not an issue. (We do this all the time on HF)
• Group delay:
  – Signals bounce off the ionosphere at different heights, causing some early signals to arrive later, mixing with the later signals.
    • 5 ms is a common delay on HF. Need a codec that sends one packet of bits for longer than 5 ms to make sure that it is not affected by the delay. (Ham Radio RTTY at 45 baud sends the same signal for over 5 ms before switching to a new signal to handle this delay. That is why that speed works so well for HF communications!)
• Lining up the signal right in the passband. A problem for us already in HF SSB. Codec 2 starts a transmission by sending a sequence of known tones. The receiver knows the frequencies of these tones and lines up the decoder to match them.
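The group delay point comes down to comparing how long each symbol lasts with the 5 ms multipath spread. The sketch below does that comparison for 45 baud RTTY and, for contrast, for a hypothetical single-carrier signal at 1600 baud (that second rate is an assumption picked for illustration, not a figure from the slides). The function names are illustrative.

```python
# Symbol duration versus HF multipath delay.
MULTIPATH_DELAY_MS = 5.0  # common HF delay, from the slide

def symbol_duration_ms(baud):
    """Time one symbol stays on the air, in milliseconds."""
    return 1000.0 / baud

def outlasts_multipath(baud, delay_ms=MULTIPATH_DELAY_MS):
    """True if each symbol lasts longer than the delay spread."""
    return symbol_duration_ms(baud) > delay_ms

for label, baud in (("RTTY", 45), ("hypothetical fast single carrier", 1600)):
    duration = symbol_duration_ms(baud)
    smeared = min(1.0, MULTIPATH_DELAY_MS / duration)
    print(f"{label}: {duration:.2f} ms per symbol, "
          f"{100 * smeared:.0f}% overlapped by a 5 ms echo, "
          f"outlasts delay: {outlasts_multipath(baud)}")
```

At 45 baud each symbol lasts about 22 ms, so a 5 ms echo overlaps only part of it. At the faster rate the echo covers several whole symbols.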

What does this sound like?

● http://www.rowetel.com/blog/?page_id=452
● Male vs Female (pitch differences; Codec2 has trouble with low frequencies, so it adds in an extra bit)
● background noise and speech codecs (if the noise does not sound like a voice with the harmonics, then it does not bother the coding very much)

Conclusions

• Digital HF will continue in experimentation phase into the future
  – Use more bits to make it sound better, or to add in more error correction?
  – Find a way to do this with one sound card? (I am not sure why it requires two cards.)
• Best used when a communication link has been established, then switch to digital.
• When it does work, it eliminates all of the QRM and QRN on SSB. It could be the ragchew method of the future! Maybe, just maybe, even for contests!

Bibliography

• FreeDV: http://freedv.org/tiki-index.php
• Codec 2: http://www.rowetel.com/blog/?page_id=452
