Recent Developments at the IoT Edge - IAR Systems · Arm ML suite of IP: designed for unmatched...

Post on 20-May-2020


© 2018 Arm Limited

Machine Learning

Platform Security Architecture

Recent Developments at the IoT Edge

1 March, 2018

Tim Hartley, Product Manager, Machine Learning Group


History of Arm

• 1990: Joint venture between Acorn Computers and Apple
• 1993 onwards: Designed into first mobile phones and then smartphones
• Today: Now all electronic devices can use intelligent Arm technology


Arm: the Industry’s Architecture of Choice

Extraordinary growth – from sensors to server

• 1991 to 2013 (22 years): 50 billion chips shipped
• 2013 to 2017 (4 years): 50 billion chips shipped
• 2017 to 2021 (4 years): 100 billion chips expected to ship

Machine Learning on Arm Cortex-M Microcontrollers

Why is ML Moving to the Edge?

• Bandwidth
• Power
• Cost
• Reliability
• Security
• Latency


Project Trillium: Arm ML for All Devices

Arm ML suite of IP, designed for unmatched versatility and scalability:

• Machine Learning (ML) processor
• Object Detection (OD) processor
• Neural Network (NN) software libraries

Market growth in units (today to 2028):

• Mobile: 1.7Bn to 2.2Bn (source: Strategy Analytics and Arm forecast)
• Smart IP cameras: 160M to 1.3Bn (source: Gartner and Arm forecast)
• AI-enabled devices: 300M to 3.2Bn (source: IDC WW Embedded and Intelligent Systems Forecast, 2017-2022 and Arm forecast)


Optimum ML Performance on Arm for Any Application

Arm NN software translates existing NN frameworks:

TensorFlow, Caffe, Android NNAPI, MXNet etc.

Developers maintain existing workflow and tools

Reduces overall development time

Abstracts away the complexities of underlying hardware

Software stack (diagram), top to bottom:

• NN Frameworks
• Arm NN
• Libraries: CMSIS-NN, Compute Library, Partner IP driver and SW functions
• Hardware: Cortex-M CPU, Cortex-A CPU, Mali GPU, Arm ML processor, 3rd party IP

Better efficiency and performance for NN functions, faster than other open-source software (OSS):

• CMSIS-NN: 5x
• Compute Library: 15x


ML Use Case Examples

Domain | Big data ML | Small data ML
Vision | ImageNet (1000+ classes) | CIFAR-10 (< 10 classes)
Audio | Large scale speech recognition | Key word spotting, simple commands
Health | Disease detection | Human activity monitor


Cortex-M Challenges for ML

• Limited compute resource
• Lack of deployment flow
• Limited system memory
• Limited energy


CMSIS-NN – Efficient NN Kernels for Cortex-M CPUs

Software stack (diagram): Application; TensorFlow / Caffe etc.; Arm NN (in development); CMSIS-NN; Cortex-M

• Open Source: launched 23 Jan’18

• CMSIS-NN has the equivalent role for Cortex-M CPUs as Compute Library has for Cortex-A CPUs and Arm Mali GPUs (and ML processor in mid 2018)

• But the flow is entirely offline, creating a binary that targets a Cortex-M class platform

• SIMD instructions in Cortex-M7/M4 targeted

• Will run on Cortex-M0
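As a rough illustration of what one offline step could look like, the sketch below converts floating-point weights to signed 8-bit fixed point and emits them as a C array for inclusion in firmware. The power-of-two (Qm.n) scaling, the function names and the array name `conv1_wt` are assumptions for illustration only; the slides do not describe the actual conversion tooling.

```python
import math

def choose_frac_bits(weights):
    """Pick fractional bits n so the largest |w| fits a signed 8-bit value (assumed Qm.n format)."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return 7
    int_bits = math.ceil(math.log2(max_abs))  # bits needed left of the binary point
    return 7 - int_bits

def quantize_q7(weights, frac_bits):
    """Scale by 2**frac_bits, round half up, and saturate to the int8 range."""
    scale = 2 ** frac_bits
    return [max(-128, min(127, math.floor(w * scale + 0.5))) for w in weights]

def to_c_array(name, values):
    """Render quantized values as a C initializer that a firmware build could compile in."""
    body = ", ".join(str(v) for v in values)
    return f"const int8_t {name}[{len(values)}] = {{{body}}};"

# Hypothetical trained weights, for illustration only.
float_weights = [0.9, -0.45, 0.1, 0.0]
n = choose_frac_bits(float_weights)
print(to_c_array("conv1_wt", quantize_q7(float_weights, n)))
```

Doing this conversion on the host means the microcontroller only ever sees 8-bit constants, which is one way to read "entirely offline".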


CMSIS-NN – Efficient NN Kernels for Cortex-M CPUs

Convolution

• Boost compute density with GEMM based implementation

• Reduce data movement overhead with depth-first data layout

• Interleave data movement and compute to minimize memory footprint
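The GEMM-based approach can be sketched as follows: each input patch is unrolled into a row (often called im2col), after which the convolution becomes a single matrix multiplication. This is a minimal pure-Python illustration, not the CMSIS-NN implementation; the function names, the stride-1 no-padding setup and the height-width-channel reading of "depth-first" layout are assumptions.

```python
def im2col_hwc(image, k):
    """Unroll k x k patches of image[y][x][c] into rows (stride 1, no padding)."""
    h, w = len(image), len(image[0])
    rows = []
    for oy in range(h - k + 1):
        for ox in range(w - k + 1):
            patch = []
            for ky in range(k):
                for kx in range(k):
                    # All channels of one pixel are contiguous in this layout.
                    patch.extend(image[oy + ky][ox + kx])
            rows.append(patch)
    return rows

def gemm(a, b_t):
    """Multiply A (M x K) by the transpose of b_t (N x K) using dot products."""
    return [[sum(x * y for x, y in zip(row, col)) for col in b_t] for row in a]

def conv_gemm(image, filters, k):
    """Convolution as im2col followed by one matrix multiply; filters[f][ky][kx][c]."""
    flat = [[wgt for row in f for px in row for wgt in px] for f in filters]
    return gemm(im2col_hwc(image, k), flat)

def conv_direct(image, filters, k):
    """Reference nested-loop convolution, used only to check the GEMM version."""
    h, w = len(image), len(image[0])
    out = []
    for oy in range(h - k + 1):
        for ox in range(w - k + 1):
            pixel = []
            for f in filters:
                acc = 0
                for ky in range(k):
                    for kx in range(k):
                        for c, wgt in enumerate(f[ky][kx]):
                            acc += image[oy + ky][ox + kx][c] * wgt
                pixel.append(acc)
            out.append(pixel)
    return out
```

The sketch unrolls the whole image at once for clarity. Unrolling only a few patches at a time, then multiplying, would be one way to interleave data movement with compute and keep the scratch buffer small.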

Pooling

• Improve performance by splitting pooling into x-y directions

• Improve memory access and footprint with in-situ updates
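Splitting pooling into x and y directions works because a 2D maximum is the maximum of row-wise maxima. The sketch below is an illustrative single-channel version with assumed function names; it clips windows at the border so that a 3x3 window with stride 2 maps 32x32 to 16x16, and it does not attempt the in-situ update.

```python
def max_pool_1d(seq, k, stride):
    """1D max pooling; the last window is clipped at the end of the sequence."""
    n_out = -(-(len(seq) - k) // stride) + 1  # ceiling division
    return [max(seq[i * stride:i * stride + k]) for i in range(n_out)]

def max_pool_2d_split(img, k, stride):
    """Pool along x for every row, then along y for every column of that result."""
    rows = [max_pool_1d(row, k, stride) for row in img]
    cols = [max_pool_1d(list(col), k, stride) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major

def max_pool_2d_direct(img, k, stride):
    """Reference version that scans the full k x k window for every output."""
    h, w = len(img), len(img[0])
    n_y = -(-(h - k) // stride) + 1
    n_x = -(-(w - k) // stride) + 1
    return [[max(v
                 for row in img[y * stride:y * stride + k]
                 for v in row[x * stride:x * stride + k])
             for x in range(n_x)]
            for y in range(n_y)]
```

The direct version reads k x k values per output, while the split version reads roughly k values per output in each of its two passes.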

Activation

• ReLU: Improve parallelism by branch-free implementation

• Sigmoid/Tanh: fast table-lookup instead of exponent computation
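Both activation ideas can be sketched in a few lines. The ReLU below replaces a comparison with a sign mask, and the sigmoid precomputes all 256 possible 8-bit results so that evaluation is a single array read. The fixed-point formats (Q3.4 input, Q0.7 output) and all names are assumptions made for illustration, not the formats used by the library.

```python
import math

def relu_q7_branch_free(x):
    """ReLU for a signed 8-bit value (-128..127) without a conditional branch."""
    mask = x >> 7        # arithmetic shift: -1 (all ones) if x < 0, else 0
    return x & ~mask     # negative inputs are cleared to 0, others pass through

FRAC_BITS_IN = 4         # assumed input format Q3.4: real value = q / 16

# One entry per possible int8 input; output assumed Q0.7 (real value = entry / 128).
SIGMOID_TABLE = [
    min(127, math.floor(128 / (1 + math.exp(-q / 2 ** FRAC_BITS_IN)) + 0.5))
    for q in range(-128, 128)
]

def sigmoid_q7_lookup(q):
    """Sigmoid by table lookup: no exponent is computed at run time."""
    return SIGMOID_TABLE[q + 128]

print([relu_q7_branch_free(v) for v in (-128, -1, 0, 1, 127)])
print(sigmoid_q7_lookup(0) / 128)
```

The table costs 256 bytes of read-only memory in this sketch, which is the trade made to avoid computing an exponential on the device.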

Figure: CNN runtime improvement. Relative throughput of baseline vs. new kernels for Conv, Pooling, Activation (ReLU) and Total; annotated 4.6x higher perf.

Figure: Energy efficiency improvement. Relative ops per joule for the same categories; annotated 4.9x higher eff.

*Baseline uses CMSIS 1D Conv and Caffe-like Pooling/ReLU

CMSIS-NN is now open-sourced


Image Classification - Convolutional Neural Network

• CIFAR-10 classification – classify images into 10 different object classes

• 3 convolution layers, 3 pooling layers and 1 fully-connected layer (~80% accuracy)


CNN on Cortex-M7

Layer | Network parameter | Output activation | Operation count | Runtime on M7
Conv1 | 5x5x3x32 (2.3 KB) | 32x32x32 (32 KB) | 4.9 M | 31.4 ms
Pool1 | 3x3, stride of 2 | 16x16x32 (8 KB) | 73.7 K | 1.6 ms
Conv2 | 5x5x32x32 (25 KB) | 16x16x32 (8 KB) | 13.1 M | 42.8 ms
Pool2 | 3x3, stride of 2 | 8x8x32 (2 KB) | 18.4 K | 0.4 ms
Conv3 | 5x5x32x64 (50 KB) | 8x8x64 (4 KB) | 6.6 M | 22.6 ms
Pool3 | 3x3, stride of 2 | 4x4x64 (1 KB) | 9.2 K | 0.2 ms
ip1 | 4x4x64x10 (10 KB) | 10 | 20 K | 0.1 ms
Total | 87 KB weights | Total: 55 KB; max. footprint: 40 KB | 24.7 M ops | 99.1 ms

• CNN with 8-bit weights and 8-bit activations

• Total memory footprint: 87 KB weights + 40 KB activations + 10 KB buffers (I/O etc.)

NUCLEO-F746ZG - 216 MHz, 320 KB SRAM
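The figures in the table can be re-derived from the layer shapes. The sketch below is a back-of-envelope check, not part of the original material. It assumes one byte per 8-bit weight or activation, two operations (multiply and add) per weight per output position, one operation per pooling window element, and a peak footprint equal to one layer's input plus output buffers. These formulas are assumptions chosen because they reproduce the table's numbers.

```python
def conv_layer(k, c_in, c_out, out_hw):
    """Return (weight bytes, output bytes, ops) for a k x k convolution with 8-bit data."""
    weights = k * k * c_in * c_out
    outputs = out_hw * out_hw * c_out
    ops = 2 * weights * out_hw * out_hw   # multiply + add per weight per output position
    return weights, outputs, ops

def pool_layer(k, channels, out_hw):
    """Return (weight bytes, output bytes, ops) for k x k pooling."""
    outputs = out_hw * out_hw * channels
    return 0, outputs, k * k * outputs    # one op per window element

INPUT_BYTES = 32 * 32 * 3                 # one CIFAR-10 image, 8 bits per channel
RUNTIME_S = 0.0991                        # total runtime from the table
CLOCK_HZ = 216e6                          # board clock quoted above

LAYERS = {
    "Conv1": conv_layer(5, 3, 32, 32),
    "Pool1": pool_layer(3, 32, 16),
    "Conv2": conv_layer(5, 32, 32, 16),
    "Pool2": pool_layer(3, 32, 8),
    "Conv3": conv_layer(5, 32, 64, 8),
    "Pool3": pool_layer(3, 64, 4),
    "ip1": conv_layer(4, 64, 10, 1),      # fully-connected layer written as a 4x4 convolution
}

def summarize(layers, input_bytes):
    """Total weights, total activations, total ops, and peak input+output buffer size."""
    weights = sum(w for w, _, _ in layers.values())
    activations = sum(a for _, a, _ in layers.values())
    ops = sum(o for _, _, o in layers.values())
    peak, prev = 0, input_bytes
    for _, out_bytes, _ in layers.values():
        peak = max(peak, prev + out_bytes)  # assumes only one layer's buffers are live
        prev = out_bytes
    return weights, activations, ops, peak

weights, activations, ops, peak = summarize(LAYERS, INPUT_BYTES)
print(f"weights {weights / 1024:.1f} KB, activations {activations / 1024:.1f} KB, "
      f"peak {peak / 1024:.1f} KB, ops {ops / 1e6:.1f} M")
print(f"{1 / RUNTIME_S:.1f} frames/s, {ops / RUNTIME_S / CLOCK_HZ:.2f} ops per cycle")
```

Under these assumptions the 40 KB peak occurs at Pool1, where the 32 KB Conv1 output and the 8 KB pooled output are live together.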


Demo with Multiple NNs

Demo screen callouts: detected voice command, CNN throughput, classification accuracy, image label and classification output

• Both image classification and keyword spotting are running at the same time
• Voice command controls the start/stop of the image classification
• Total memory footprint:
  • CNN: 87 KB weights + 40 KB activations + 10 KB buffers
  • DNN: 66 KB weights + 1 KB activations + 2 KB buffers
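The memory figures for the two networks can be added up against the 320 KB of SRAM quoted for the NUCLEO-F746ZG. The sketch below is simple bookkeeping with assumed names. The slides do not say whether weights are held in SRAM or flash, so both totals are shown and the flash case is labeled as an assumption.

```python
SRAM_KB = 320  # NUCLEO-F746ZG SRAM size quoted on the CNN slide

FOOTPRINT_KB = {
    "CNN": {"weights": 87, "activations": 40, "buffers": 10},
    "DNN": {"weights": 66, "activations": 1, "buffers": 2},
}

def total_kb(footprint):
    """Everything counted together, as if weights also lived in SRAM."""
    return sum(sum(parts.values()) for parts in footprint.values())

def read_write_kb(footprint):
    """Only activations and buffers, assuming weights are read from flash."""
    return sum(p["activations"] + p["buffers"] for p in footprint.values())

print(f"all in SRAM: {total_kb(FOOTPRINT_KB)} KB of {SRAM_KB} KB")
print(f"weights in flash (assumption): {read_write_kb(FOOTPRINT_KB)} KB of {SRAM_KB} KB")
```

Either way, both networks fit within the quoted SRAM size with room to spare.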


Platform Security Architecture


A recipe for building a secure system & a reference implementation

3 Parts to PSA

• Analyze: Threat models & security analysis
• Architect: Hardware & firmware architecture specifications
• Implement: Firmware source code

Architecture & Specifications

• Software architecture
• Hardware requirements

Common principles across multiple use cases

• Device identity
• Trusted boot sequence
• Certificate-based authentication
• Secure over-the-air software update


Arm Platform Security Architecture (PSA)

• A common framework for scaling connected device security

• Enables consistent level of security

• Broad ecosystem support from industry leaders

• Trusted Firmware-M – Open source reference firmware


First PSA deliverables available

www.arm.com/psa-resources

Threat Models and Security Analyses (TMSA) documentation

• Step 1 of PSA: gather information about threats to a particular device and develop the right security specifications
• Three example TMSAs freely available now

Arm Trusted Firmware-M

• The first open source reference implementation firmware, which conforms to the PSA specification
• Available as a GitHub project in March


Summary

Major initiatives from Arm supporting Cortex-M microcontrollers

• Machine Learning on IoT-class devices
  • Enabling existing ML frameworks on Cortex-M through Arm NN
  • ML enabled everywhere: Cortex-M0 and upwards
  • CMSIS-NN library open source and available now (https://developer.arm.com/embedded/cmsis)

• Platform Security Architecture
  • Security from the ground up, at the core of every device
  • Trusted Firmware-M coming Q1’18
  • First deliverables available now (www.arm.com/psa-resources)


The Arm trademarks featured in this presentation are registered trademarks or trademarks of Arm Limited (or its subsidiaries) in the US and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners.

www.arm.com/company/policies/trademarks