HoloLens and Cognitive Services: A powerful combination!
January 4th, 2018
Developer session – level 200
ABOUT ME
• HoloLens Evangelist / Solutions Architect @ ETTU
• Microsoft MVP since 2018
• Founder of the Mixed Reality User Group
• Email: [email protected] or [email protected]
• Twitter: @ameijers
• Blog: http://www.appzinside.com
TODAY’S TALK
• About realities
• Microsoft HoloLens
• Building blocks and tools
• Azure Cognitive Services
• Computer Vision API explained
• Integrate Computer Vision API with HoloLens
• Wrap-up!
REALITIES EXPLAINED
MIXED REALITY
• Merging of real and virtual worlds to produce new environments and visualizations where physical and digital objects co-exist and interact in real time.
• An overlay of synthetic content on the real world that is anchored to and interacts with the real world.
AUGMENTED REALITY
• Direct or indirect view of a physical, real-world environment whose elements are augmented by computer-generated sensory input such as sound, video, graphics or GPS data.
• An overlay of content on the real world where that content is not anchored to or part of it.
VIRTUAL REALITY
• Generation of realistic images, sounds and other sensations that replicate a real environment or create an imaginary setting.
• An immersive experience created entirely from computer-generated content. Also similar to 360-degree video.
EVERYTHING BECOMES MIXED REALITY
• IMMERSIVE HEADSETS: Virtual Reality devices (SteamVR, AltSpaceVR), October/November 2017
• HOLOLENS: Mixed Reality device, Developer and Commercial Suite version, since October 2016
• 2019: HoloLens v3
HISTORY OF HOLOLENS
• Codenamed Project HoloLens
• Chief inventor Alex Kipman
• Official title is technical fellow
• He dreamed up Kinect at the end of 2007 and set a vision which incorporated HoloLens
• It is the start of a transforming world
• In the new reality, sensors will be anywhere
• A visual computing platform controlled by speech and gesture
MICROSOFT HOLOLENS SPECIFICATIONS
• Windows 10 device based on 32 bit architecture
• 64GB flash
• 2GB memory
• 1GB Holographic Processing Unit (HPU)
• First of its kind
• Device is more powerful than a laptop
• No overheating: warm air flows out to the sides
• Battery life: 2-3 hours of active use and 2 weeks on standby
• Weight 579g
MICROSOFT HOLOLENS SPECIFICATIONS
• Contains a depth camera
• Field of view that spans 120 by 120 degrees
• 18 sensors flooding the device with terabytes of data every second
• Tricks your brain into perceiving holographic images
• Light engine in which light particles bounce around millions of times
• Photons enter the two lenses
• Ricochet between layers of blue, green and red glass
• Hit the back of the eye
DEVELOPMENT TOOLS
• Visual Studio 2017
• UWP workload
• Game development with Unity workload
• Windows 10 SDK (version 1511 or later)
• There is no separate SDK for HoloLens
• Unity 2017.3
• HoloLens emulator
• Hyper-V
• Contains DirectX project templates for Visual Studio
• HoloLens device
• GitHub
• Microsoft/MixedRealityToolkit-Unity
BUILD LIFECYCLE OF A HOLOLENS PROJECT
• Unity: Create Unity project, configure HoloLens settings, create scene, configure build settings, build and generate Visual Studio project
• Visual Studio: Open project with Visual Studio, pair with HoloLens, build & deploy Visual Studio project
• HoloLens: Start application, test & debug, monitor
WINDOWS DEVICE PORTAL
• 3D View
• Mixed Reality Capture
• Performance
• Performance tracing
• System performance
• Processes
• Apps
• Maintenance
• Crash dumps
• Additional tools
• Logging
• File Explorer
• Virtual Input
COGNITIVE SERVICES
Infuse your devices, apps, websites and bots with intelligent algorithms to see, hear, speak, understand and interpret your user needs through natural methods of communication. Transform your business with AI today
• Makes use of Artificial Intelligence (AI)
• The combination with devices such as HoloLens makes it incredibly powerful
• At the moment there are five categories: Vision, Speech, Language, Knowledge and Search
• Try cognitive services
• Some do need an Azure subscription
• Cognitive Services Labs
• Early look at new Cognitive Services technologies
• Lots of research projects
• Examples like Project Prague
COGNITIVE SERVICES OVERVIEW
VISION
Computer Vision API
Content Moderator
Custom Vision Service ☼
Face API
Emotion API ☼
Video Indexer ☼
SPEECH
Translator Speech API
Bing Speech API
Speaker Recognition API ☼
Custom Speech Service ☼
LANGUAGE
Language Understanding (LUIS)
Bing Spell Check API
Web Language Model API (preview)
Text Analytics API
Translator Text API
Linguistic Analysis API ☼
KNOWLEDGE
Recommendations API ☼
Knowledge Exploration Service ☼
Entity Linking Intelligence Service API ☼
Academic Knowledge API ☼
QnA Maker API ☼
Custom Decision Service ☼
SEARCH
Bing Autosuggest API
Bing News Search API
Bing Web Search API
Bing Entity Search API (preview)
Bing Image Search API
Bing Video Search API
Bing Custom Search API
☼ - In preview
PRICING AND COST
• Pricing is often based on the number of requests and/or requests per second
• Some services are free but limited in
• total number of requests per month
• requests per second
• In most cases preview APIs are free
• Keep in mind that when released there will be a price tag
Example Computer Vision API
LEGAL NOTICE
• Microsoft will use content sent to the services to improve their underlying algorithms and models
• You are responsible for getting the right consent from content owners
• The General Privacy and Security Terms in the Online Services Terms do not apply to Cognitive Services.
VISION SERVICES
API | Description | Azure subscription needed | Preview
Computer Vision API | Distill actionable information from images | Yes | -
Content Moderator | Automated image, text, and video moderation | Yes | -
Custom Vision Service | Easily customize your own state-of-the-art computer vision models for your unique use case | - | Yes
Face API | Detect, identify, analyze, organize, and tag faces in photos | Yes | -
Emotion API | Personalize user experiences with emotion recognition | Yes | Yes
Video Indexer | Unlock video insights | - | Yes
COMPUTER VISION API
• Mainly used for generation of tags and coherent full-sentence descriptions of images
• Allows us to analyze visual content in different ways
• Cloud-based and requires an Azure subscription
• Supports raw image binary sent as application/octet-stream
VISUAL ALGORITHMS
• Content related
• Generate tags
• Creates full-sentence descriptions
• Categorize content
• Domain specific content
• Recognition of printed or written text
• Flag adult content
• Photo related
• Identify type and quality of content
• Distinguish color schemes
THE 86-CATEGORY CONCEPT
INTEGRATE SERVICES
Integration of Cognitive Services with HoloLens is mostly done in the same way
• Set up a Cognitive Service and get a key
• Use the Azure portal for setting up a service
• Get a key to access the service
• Get the URL to call the service
• We need to implement asynchronous calls because of call latency
• Response time of a call depends on the pricing tier
• Availability of an internet connection
• User interface implementation
• Asynchronous calls are a challenge with Unity
• Unity is single-threaded
• Why not use a coroutine?
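The pattern behind these bullets can be sketched as follows: the slow service call runs on a worker thread and hands its result back through a queue, so the main (render) loop never blocks while waiting. This is only a minimal sketch in Python rather than the C# of a Unity project. The names `slow_service_call`, `start_call` and `pump_main_thread` are hypothetical, and the sleep merely stands in for network latency.

```python
import queue
import threading
import time

# Results produced by worker threads, consumed on the main thread.
completed = queue.Queue()


def slow_service_call(image_id):
    """Stand-in for a cloud call; the sleep simulates network latency."""
    time.sleep(0.05)
    return {"image": image_id, "tags": ["grass", "sky"]}


def start_call(image_id):
    """Start the call on a worker thread and return immediately."""
    def worker():
        completed.put(slow_service_call(image_id))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def pump_main_thread(on_completed):
    """Call once per frame: deliver finished results without blocking."""
    delivered = 0
    while True:
        try:
            result = completed.get_nowait()
        except queue.Empty:
            return delivered
        on_completed(result)
        delivered += 1


if __name__ == "__main__":
    results = []
    start_call("frame-1")
    # Simulated frame loop: keeps running while the call is in flight.
    while not results:
        pump_main_thread(results.append)
        time.sleep(0.01)
    print(results[0]["tags"])
```

The callback always runs on the thread that calls `pump_main_thread`, which is the property a single-threaded engine needs before touching scene objects.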
CONNECT TO VISION API
https://[location].api.cognitive.microsoft.com/vision/v1.0/analyze[?visualFeatures][&details][&language]&[subscription-key]
• [location] – westus, westeurope, etc.
• [visualFeatures] – Categories (default), Tags, Description, Faces, ImageType, Color and Adult
• [details] – Celebrities and Landmarks
• [language] – English (default) or Simplified Chinese
• [subscription-key] - “13hc77781f7e4754b5fcdd72a8df7156” (or in request header)
Use of request headers
• Content-Type = “application/json” or “application/octet-stream”
• Ocp-Apim-Subscription-Key = “13hc77781f7e4754b5fcdd72a8df7156”
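As a rough illustration of how the URL and headers above fit together, the sketch below assembles them without sending anything. It is written in Python for brevity. `build_analyze_request` is a hypothetical helper, the key is a placeholder, and the language code `en` is an assumption about how "English" is encoded in the query string.

```python
from urllib.parse import urlencode


def build_analyze_request(location, subscription_key,
                          visual_features=("Categories",),
                          details=(), language="en", binary_body=False):
    """Return (url, headers) for a Vision API v1.0 analyze call."""
    # Multiple features or details are passed as comma-separated lists.
    params = {"visualFeatures": ",".join(visual_features)}
    if details:
        params["details"] = ",".join(details)
    params["language"] = language  # assumed code for English

    base = "https://{}.api.cognitive.microsoft.com/vision/v1.0/analyze".format(location)
    url = base + "?" + urlencode(params, safe=",")

    headers = {
        # JSON body for an image URL, octet-stream for raw image bytes.
        "Content-Type": "application/octet-stream" if binary_body else "application/json",
        # The key travels in a header here instead of the query string.
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    return url, headers


if __name__ == "__main__":
    url, headers = build_analyze_request(
        "westeurope", "<your-subscription-key>",
        visual_features=("Tags", "Description"), details=("Landmarks",))
    print(url)
    print(headers["Content-Type"])
```

Passing the key in the header rather than the URL keeps it out of logged request lines, which is why the sketch uses that option.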
APPLICATION ARCHITECTURE
Universal Windows App -> Universal Windows DLL -> Vision API endpoint
• Between the app and the DLL: GetDataAsync, OnDataCompleted
• Inside the DLL: WebRequest, ThreadPool.RunAsync
Returned JSON
{
"tags":[
{
"name":"grass",
"confidence":0.999999761581421
},
{
"name":"outdoor",
"confidence":0.999970674514771
},
{
"name":"sky",
"confidence":0.999289751052856
},
{
"name":"building",
"confidence":0.996463239192963
}]
}
#if !UNITY_EDITOR
…
#endif
• Asynchronous code only inside this block
• Universal Windows DLL included as asset
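Once the JSON comes back, the application typically keeps only the tags it trusts. The following is a minimal Python sketch of that step. `confident_tags` and the threshold are hypothetical, and the sample mirrors the tags shown on the slide, written as strictly valid JSON.

```python
import json

# Sample shaped like the "tags" part of the returned JSON.
SAMPLE_RESPONSE = """
{
  "tags": [
    {"name": "grass", "confidence": 0.999999761581421},
    {"name": "outdoor", "confidence": 0.999970674514771},
    {"name": "sky", "confidence": 0.999289751052856},
    {"name": "building", "confidence": 0.996463239192963}
  ]
}
"""


def confident_tags(response_text, threshold=0.5):
    """Return tag names whose confidence is at or above the threshold."""
    data = json.loads(response_text)
    return [
        tag["name"]
        for tag in data.get("tags", [])
        if tag.get("confidence", 0.0) >= threshold
    ]


if __name__ == "__main__":
    print(confident_tags(SAMPLE_RESPONSE))          # all four tags
    print(confident_tags(SAMPLE_RESPONSE, 0.9995))  # only the strongest two
```

Using `data.get("tags", [])` keeps the helper from failing when the tags feature was not requested and the field is absent.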
ANALYZING VIDEO
• Use frames taken from the live video stream in HoloLens
• Limited to 10 calls per second on the S1 standard pricing tier
• Additional libraries required
• Cognitive-Samples-VideoFrameAnalysis - A library with sample apps for continuous analysis of live video, using the Microsoft Cognitive Services Vision APIs
• Library which contains the class FrameGrabber
• Uses OpenCVSharp
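A live video stream produces more frames than the call limit allows, so frames have to be skipped before they are sent. The sketch below shows one simple way to do that in Python. It is not how the FrameGrabber class works internally. `FrameThrottle` is a hypothetical name and timestamps are assumed to be in milliseconds.

```python
class FrameThrottle:
    """Lets a frame through only if the call rate stays within the limit."""

    def __init__(self, max_calls_per_second=10):
        # Minimum spacing between two analyzed frames, in milliseconds.
        self.min_interval_ms = 1000.0 / max_calls_per_second
        self.last_sent_ms = None

    def should_send(self, timestamp_ms):
        """Return True if the frame captured at timestamp_ms may be analyzed."""
        if (self.last_sent_ms is None
                or timestamp_ms - self.last_sent_ms >= self.min_interval_ms):
            self.last_sent_ms = timestamp_ms
            return True
        return False


if __name__ == "__main__":
    throttle = FrameThrottle(max_calls_per_second=10)
    # One second of video at 25 frames per second (a frame every 40 ms).
    frame_times = range(0, 1000, 40)
    sent = [t for t in frame_times if throttle.should_send(t)]
    print(len(sent), "of", len(frame_times), "frames sent")
```

Because the spacing is enforced between consecutive sent frames, the number of calls in any one-second window cannot exceed the configured limit.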
WRAP-UP
• HoloLens is the Mixed Reality device when it comes to supporting existing processes at enterprise companies in the 3D world
• Microsoft delivers a great number of Cognitive Services that allow you to build AI-supported applications for any device
• Building intelligent applications is possible by using these services
• Keep in mind that you are reaching out to remote services, so you need to build in asynchronous service calls, which is a challenge with Unity
• New tool updates will make it possible to build more and more complex applications
• Devices such as HoloLens together with Cognitive Services make a powerful combination for building any AI-related application