MICROSOFT CONFIDENTIAL – for discussion purposes only. © 2015 Microsoft Corporation. All rights reserved.
Effects (APO/DSP vendor)
• Provide value-added features, e.g. AEC, AGC
• Implemented as COM objects that run in user mode
• Proxy APO for hardware DSP; Windows provides a default proxy APO (MsApoFxProxy.dll)
• Three different locations for an APO:
o Stream Effect (SFX): one instance of the effect for every stream
o Mode Effect (MFX): applied to all streams that are mapped to the same mode
o Endpoint Effect (EFX): applied to all streams that use the same endpoint; always applied, even to RAW
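The three placements above differ only in scope, which can be sketched as a sharing rule: two streams reach the same effect instance only when the placement's scope covers both of them. `ApoPlacement`, `StreamInfo`, and `shares_effect_instance` are hypothetical names for illustration, not Windows audio APIs, and the sketch assumes a mode is scoped to a single endpoint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical illustration types; not part of any Windows header. */
typedef enum { APO_SFX, APO_MFX, APO_EFX } ApoPlacement;

typedef struct {
    int stream_id;   /* unique per stream */
    int mode_id;     /* processing mode the stream is mapped to */
    int endpoint_id; /* endpoint the stream uses */
} StreamInfo;

/* True when streams a and b are processed by the same effect instance. */
bool shares_effect_instance(ApoPlacement placement,
                            const StreamInfo *a, const StreamInfo *b)
{
    assert(a != NULL && b != NULL);
    switch (placement) {
    case APO_SFX: /* one instance per stream */
        return a->stream_id == b->stream_id;
    case APO_MFX: /* same mode; assumed to be scoped to one endpoint */
        return a->endpoint_id == b->endpoint_id && a->mode_id == b->mode_id;
    case APO_EFX: /* every stream on the endpoint */
        return a->endpoint_id == b->endpoint_id;
    }
    return false;
}
```

Two streams in the same mode on one endpoint therefore share the MFX and EFX instances but each keeps its own SFX instance.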
Expose all audio effects, including beamforming, noise suppression, and echo cancellation, via the FX_Stream_CLSID, FX_Mode_CLSID, and FX_Endpoint_CLSID APOs
• Describes the microphones' number, position, type, angle, and so on
• The audio driver reports it to Windows via KSPROPERTY_AUDIO_MIC_ARRAY_GEOMETRY
• Very important for the Windows Speech platform enhancement pipeline
• Descriptor
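As a rough sketch of what such a descriptor carries (number of microphones, and each element's position, type, and angle), the portable structures below build a linear array centered on the origin. All names here (`MicElement`, `MicArrayGeometry`, `make_linear_array`) are hypothetical, and the units and axis choice are assumptions; the real descriptor layout is defined by the Windows driver headers and should be taken from there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_MICS 8 /* arbitrary cap for this sketch */

/* Hypothetical stand-ins; not the Windows driver types. */
typedef enum { MIC_OMNIDIRECTIONAL, MIC_UNIDIRECTIONAL } MicType;

typedef struct {
    MicType type;
    int16_t x_mm, y_mm, z_mm; /* position relative to the array center */
    int16_t horizontal_angle; /* pointing direction (unit assumed) */
    int16_t vertical_angle;
} MicElement;

typedef struct {
    uint16_t   num_mics;
    MicElement mics[MAX_MICS];
} MicArrayGeometry;

/* Fills g with n elements spaced spacing_mm apart along x (an arbitrary
 * axis choice), centered on the origin. Returns 0 on success, -1 on
 * invalid input. */
int make_linear_array(MicArrayGeometry *g, uint16_t n, int16_t spacing_mm)
{
    if (g == NULL || n < 2 || n > MAX_MICS || spacing_mm <= 0)
        return -1;
    int32_t span = (int32_t)(n - 1) * spacing_mm;
    if (span > INT16_MAX)
        return -1;
    g->num_mics = n;
    for (uint16_t i = 0; i < n; ++i) {
        MicElement *m = &g->mics[i];
        m->type = MIC_OMNIDIRECTIONAL;
        m->x_mm = (int16_t)((int32_t)i * spacing_mm - span / 2);
        m->y_mm = 0;
        m->z_mm = 0;
        m->horizontal_angle = 0;
        m->vertical_angle = 0;
    }
    return 0;
}
```

Calling `make_linear_array(&g, 4, 40)` places four elements at -60, -20, 20, and 60 mm.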
• Speech mode specifies:
o That the application expects speech-recognition-specific signal processing at the lowest latency
o The hardware-preferred sample rate for wideband speech (such as 16 kHz)
• Speech mode must be supported if the OEM pipeline is used
#define STATIC_AUDIO_SIGNALPROCESSINGMODE_SPEECH 0xfc1cfc9b, 0xb9d6, 0x4cfa, 0xb5, 0xe0, 0x4b, 0xb2, 0x16, 0x68, 0x78, 0xb2
DEFINE_GUIDSTRUCT("FC1CFC9B-B9D6-4CFA-B5E0-4BB2166878B2", AUDIO_SIGNALPROCESSINGMODE_SPEECH);
#define AUDIO_SIGNALPROCESSINGMODE_SPEECH DEFINE_GUIDNAMED(AUDIO_SIGNALPROCESSINGMODE_SPEECH)
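The definition above states the same GUID twice, once as an initializer list and once as a string. The sketch below shows how the eleven initializer values map onto the string form, using a portable stand-in struct. `Guid`, `kSpeechModeGuid`, `guid_to_string`, and `guid_matches_string` are illustrative names, not the Windows SDK types or macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Portable stand-in with the usual 4-2-2-8 byte GUID field split. */
typedef struct {
    uint32_t data1;
    uint16_t data2;
    uint16_t data3;
    uint8_t  data4[8];
} Guid;

/* Values copied from STATIC_AUDIO_SIGNALPROCESSINGMODE_SPEECH. */
static const Guid kSpeechModeGuid = {
    0xfc1cfc9b, 0xb9d6, 0x4cfa,
    { 0xb5, 0xe0, 0x4b, 0xb2, 0x16, 0x68, 0x78, 0xb2 }
};

/* Writes the 36-character registry-style form plus a terminating NUL. */
void guid_to_string(const Guid *g, char out[37])
{
    snprintf(out, 37,
             "%08lX-%04X-%04X-%02X%02X-%02X%02X%02X%02X%02X%02X",
             (unsigned long)g->data1, (unsigned)g->data2, (unsigned)g->data3,
             (unsigned)g->data4[0], (unsigned)g->data4[1],
             (unsigned)g->data4[2], (unsigned)g->data4[3],
             (unsigned)g->data4[4], (unsigned)g->data4[5],
             (unsigned)g->data4[6], (unsigned)g->data4[7]);
}

/* True when the struct and the uppercase string describe the same GUID. */
bool guid_matches_string(const Guid *g, const char *text)
{
    char buf[37];
    guid_to_string(g, buf);
    return strcmp(buf, text) == 0;
}
```

A check like this is a cheap way to catch a transcription slip between the two forms when a mode GUID is copied into a driver.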
• Mic gain is key to the Cortana experience
• The default mic gain is the OEM-recommended mic gain for customers to use with Cortana
• HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech_OneCore\AudioInput\MicWiz\DefaultDefaultMicGain
• The registry key is set only for integrated mic arrays
• The registry key is set only when the device meets or exceeds the Standard metrics
USB
Terminal Type | Code | I/O | Description
Input Undefined | 0x0200 | I | Input Terminal, undefined Type.
Microphone | 0x0201 | I | A generic microphone that does not fit under any of the other classifications.
Desktop microphone | 0x0202 | I | A microphone normally placed on the desktop or integrated into the monitor.
Personal microphone | 0x0203 | I | A head-mounted or clip-on microphone.
Omni-directional microphone | 0x0204 | I | A microphone designed to pick up voice from more than one speaker at relatively long ranges.
Microphone array | 0x0205 | I | An array of microphones designed for directional processing using host-based signal processing algorithms.
Processing microphone array | 0x0206 | I | An array of microphones with an embedded signal processor.
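The codes in the table can be captured in a small lookup, sketched below. The enum and function names are hypothetical; only the numeric codes and the descriptions come from the table. The two helper predicates encode the distinction the table draws between an array processed on the host (0x0205) and an array with an embedded signal processor (0x0206).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Codes from the table above; identifier names are illustrative. */
typedef enum {
    USB_TT_INPUT_UNDEFINED       = 0x0200,
    USB_TT_MICROPHONE            = 0x0201,
    USB_TT_DESKTOP_MICROPHONE    = 0x0202,
    USB_TT_PERSONAL_MICROPHONE   = 0x0203,
    USB_TT_OMNI_DIR_MICROPHONE   = 0x0204,
    USB_TT_MICROPHONE_ARRAY      = 0x0205,
    USB_TT_PROC_MICROPHONE_ARRAY = 0x0206
} UsbInputTerminalType;

/* Returns a display name for an input terminal type code. */
const char *usb_input_terminal_name(uint16_t code)
{
    switch (code) {
    case USB_TT_INPUT_UNDEFINED:       return "Input Undefined";
    case USB_TT_MICROPHONE:            return "Microphone";
    case USB_TT_DESKTOP_MICROPHONE:    return "Desktop microphone";
    case USB_TT_PERSONAL_MICROPHONE:   return "Personal microphone";
    case USB_TT_OMNI_DIR_MICROPHONE:   return "Omni-directional microphone";
    case USB_TT_MICROPHONE_ARRAY:      return "Microphone array";
    case USB_TT_PROC_MICROPHONE_ARRAY: return "Processing microphone array";
    default:                           return "Unknown";
    }
}

/* True for both array types (host-processed and embedded-processor). */
bool usb_terminal_is_mic_array(uint16_t code)
{
    return code == USB_TT_MICROPHONE_ARRAY ||
           code == USB_TT_PROC_MICROPHONE_ARRAY;
}

/* True only for the array type with an embedded signal processor. */
bool usb_terminal_has_embedded_dsp(uint16_t code)
{
    return code == USB_TT_PROC_MICROPHONE_ARRAY;
}
```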
Flowchart: Start
Decisions (Yes/No): Is Mic an array? Is Mic geometry exposed? Is Raw Mode supported? Is Speech Mode supported? Are AEC and NS exposed?
Outcomes: Run MS pipeline in raw mode; Run OEM pipeline in speech mode; Run MS pipeline in default mode
Note: A single microphone does not require a microphone geometry
• Driver Configuration Verification Tool
o OEMVerificationWin10x86.exe
• Recorder and Sound files
• Score Utility
o OEMScoreUtilityx64.exe
MICROSOFT CONFIDENTIAL – for discussion purposes only. © 2015 Microsoft Corporation. All rights reserved. Shared with
Partners under NDA.
• Good acoustic design is a function of many parameters other than just
microphone design, and is highly dependent on the device integration
and usage
Diagram labels: OEM; Mic EQ, Gain; Microsoft speech pipeline; Multi-channel Echo Canceling; Noise Suppression; Beamforming; Automatic Gain Control; Voice Activation; Speech Recognizer; Acoustic Models; Cortana
• Microsoft recommends two or more microphones
• Benefits:
o Sound source localization
o Reduction of ambient noise
o Partial de-reverberation, because most indirect paths are attenuated
o Reduced effect of electronic noise
Microphone array | Elements | Type | NG, dB | NGA, dB | DI, dB
Linear, small | 2 | uni-directional | -12.7 | -6.0 | 7.4
Linear, big | 2 | uni-directional | -12.9 | -6.7 | 7.1
Linear, 4el | 4 | uni-directional | -13.1 | -7.6 | 10.1
L-shaped | 4 | uni-directional | -12.9 | -7.0 | 10.2
Linear, 4 el second geometry | 4 | integrated | -12.9 | -7.3 | 9.9
Good omnidirectional microphone | 1 | integrated | 0 | 0 | 4.5
Target Characteristics (for reference)
• Covers a quiet office or cubicle with good sound capture
• The person speaking is less than 0.6 meters from the microphone
Figures: Small two-element microphone array; Big two-element microphone array
Figures: L-shaped four-element microphone array; Linear four-element microphone array
• Covers a quiet office or cubicle with good sound capture
• The person speaking is less than 2 meters from the microphone
Circle microphone array geometry
• Important for preserving the temporal relationship between the microphone signals
• Important for beamforming and the sound source localizer
Frequency (Hz) | Phase response matching
250 | < 30 deg (< 20 deg is better)
1K | < 30 deg (< 20 deg is better)
4K | < 30 deg (< 20 deg is better)
7K | < 30 deg (< 25 deg is better)
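To connect the phase figures above to the temporal relationship they protect, a phase mismatch of φ degrees at frequency f corresponds to a time offset of φ / (360 · f). The sketch below applies that conversion and grades a measured mismatch against the table. The type and function names are hypothetical, and the mismatch is assumed to be passed as a non-negative magnitude in degrees.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    PHASE_UNKNOWN_FREQ = -1, /* frequency not listed in the table */
    PHASE_FAIL,
    PHASE_PASS,              /* under the limit */
    PHASE_PREFERRED          /* under the "better" figure */
} PhaseGrade;

typedef struct { double freq_hz, limit_deg, preferred_deg; } PhaseSpec;

/* Limits copied from the phase response matching table. */
static const PhaseSpec kPhaseSpec[] = {
    {  250.0, 30.0, 20.0 },
    { 1000.0, 30.0, 20.0 },
    { 4000.0, 30.0, 20.0 },
    { 7000.0, 30.0, 25.0 },
};

/* Time offset, in microseconds, equivalent to a phase mismatch. */
double phase_to_delay_us(double phase_deg, double freq_hz)
{
    assert(freq_hz > 0.0);
    return phase_deg / 360.0 / freq_hz * 1e6;
}

/* Grades a measured mismatch at one of the tabulated frequencies. */
PhaseGrade grade_phase_match(double freq_hz, double mismatch_deg)
{
    for (size_t i = 0; i < sizeof kPhaseSpec / sizeof kPhaseSpec[0]; ++i) {
        if (kPhaseSpec[i].freq_hz == freq_hz) {
            if (mismatch_deg < kPhaseSpec[i].preferred_deg)
                return PHASE_PREFERRED;
            if (mismatch_deg < kPhaseSpec[i].limit_deg)
                return PHASE_PASS;
            return PHASE_FAIL;
        }
    }
    return PHASE_UNKNOWN_FREQ;
}
```

The same 30-degree limit is a much tighter timing constraint at high frequency: roughly 333 microseconds at 250 Hz, but only about 11.9 microseconds at 7 kHz.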
Frequency (Hz) | THD
250 | < 3.2% (2.5% is better)
1K | < 3.2% (2.5% is better)
4K | < 3.2% (2.5% is better)
5K | < 4%
6K | < 6.3%
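As a worked check against the limits above, the sketch below uses the common definition of THD as the root-sum-square of the harmonic amplitudes divided by the fundamental amplitude. Whether the slide's figures use exactly this definition is an assumption, and the function names are hypothetical. The comparison is done on squared values, so no square root is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Limit from the THD table, in percent; -1 if the frequency is not listed. */
double thd_limit_percent(double freq_hz)
{
    if (freq_hz == 250.0 || freq_hz == 1000.0 || freq_hz == 4000.0)
        return 3.2;
    if (freq_hz == 5000.0)
        return 4.0;
    if (freq_hz == 6000.0)
        return 6.3;
    return -1.0;
}

/* True when sqrt(sum(h_i^2)) / fundamental is below limit_percent.
 * Amplitudes must share one linear unit (for example volts RMS). */
bool thd_below_limit(double fundamental, const double *harmonics,
                     size_t count, double limit_percent)
{
    assert(fundamental > 0.0 && limit_percent > 0.0);
    double harmonic_power = 0.0;
    for (size_t i = 0; i < count; ++i)
        harmonic_power += harmonics[i] * harmonics[i];
    double bound = fundamental * limit_percent / 100.0;
    return harmonic_power < bound * bound;
}
```

For a fundamental of 1.0 with harmonics of 0.03 and 0.02, THD is about 3.6%, which misses the 3.2% limit that applies up to 4 kHz but would meet the 4% limit at 5 kHz.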
Requirement: Mic array far away from speaker
• SNR: better than 61 dB; 63 dB is better
• Mic sensitivity: +/-3 dB; +/-1 dB is better
• No signal leak: there should be no physical leak on the path from the hole to the microphone
Component | Requirement
Sampling rate | 16,000 Hz, synchronized for all ADCs
Sampling synchronization | Better than 1/64th of the sampling period
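The synchronization requirement can be turned into an absolute time budget with a line of arithmetic: the sampling period is 1 / rate, and the allowed skew between ADCs is 1/64th of that. The helper names below are hypothetical.

```c
#include <assert.h>

/* Sampling period in microseconds for a given sample rate in Hz. */
double sample_period_us(double sample_rate_hz)
{
    assert(sample_rate_hz > 0.0);
    return 1e6 / sample_rate_hz;
}

/* Largest inter-ADC skew, in microseconds, that still meets the
 * "better than 1/64th of the sampling period" requirement. */
double max_sync_skew_us(double sample_rate_hz)
{
    return sample_period_us(sample_rate_hz) / 64.0;
}
```

At 16,000 Hz the sampling period is 62.5 microseconds, so the ADCs need to be aligned to within about 0.98 microseconds.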
Document | Detail | Reference Link
Portclass Miniport driver Develop Guide | Describes how to develop PCI- and DMA-based audio drivers and how to write a miniport PortClass audio driver. | https://msdn.microsoft.com/EN-US/library/windows/hardware/ff536829(v=vs.85).aspx
Microphone Array Support in Windows | Describes the microphone array design guide in Windows. | https://msdn.microsoft.com/en-us/library/windows/hardware/dn613960.aspx
Windows Audio Driver Development Guide | Windows has several kinds of audio drivers. This link describes how to choose which kind of audio driver to develop for a device. | https://msdn.microsoft.com/en-us/library/windows/hardware/ff537861(v=vs.85).aspx
Enabling Great Audio Experiences in Windows 10 | Windows 10 audio framework | https://channel9.msdn.com/Events/WinHEC/2015/WHT202
(c) 2015 Microsoft Corporation. All rights reserved. This document is provided "as-is." Information and views
expressed in this document, including URL and other Internet Web site references, may change without notice. You
bear the risk of using it. This document does not provide you with any legal rights to any intellectual property in any
Microsoft product. You may copy and use this document for your internal, reference purposes.
Some information relates to pre-released product which may be substantially modified before it’s commercially
released. Microsoft makes no warranties, express or implied, with respect to the information provided here.