Architecture Overview Programming Model Execution Model Model Training Bibliography
Introduction to TensorFlow Internals
Guangcong Liu
Software Architect@ZTE
2017-11-16
Guangcong Liu (ZTE) TensorFlow Internals 2017.11 1 / 44
Contents
1 Architecture Overview
2 Programming Model
3 Execution Model
4 Model Training
5 Bibliography
Architecture Overview
1 System Architecture
2 Design Principles
3 Graph Transformation
System Architecture
Design Principles
Deferred Execution: Graph construction and graph execution are separated; the graph is executed only after it has been built.
Primitive OP: The OP is the basic unit of computation.
Abstract Accelerator: CPUs, GPUs, and ASICs are supported behind a common device abstraction.
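The deferred-execution principle can be illustrated with a small pure-Python sketch. It is a toy under stated assumptions, not TensorFlow's implementation: the names `Node`, `const`, `add`, `mul`, `KERNELS`, and `run` are all hypothetical and exist only for this illustration.

```python
# Toy illustration of deferred execution; not TensorFlow's implementation.


class Node:
    """A graph node: an operation name plus the nodes it reads from."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value  # only used by "const" nodes


def const(value):
    return Node("const", value=value)


def add(a, b):
    return Node("add", (a, b))


def mul(a, b):
    return Node("mul", (a, b))


# Hypothetical kernel table: maps an op name to the function that computes it.
KERNELS = {
    "add": lambda x, y: x + y,
    "mul": lambda x, y: x * y,
}


def run(node):
    """Evaluate the graph rooted at `node`; nothing is computed before this call."""
    if node.op == "const":
        return node.value
    args = [run(i) for i in node.inputs]
    return KERNELS[node.op](*args)


# Construction phase: only builds nodes, performs no arithmetic.
graph = mul(add(const(2), const(3)), const(4))

# Execution phase: the graph is evaluated on demand.
result = run(graph)
print(result)
```

Building `graph` only records structure; the arithmetic happens inside `run`, which plays the role that a session plays in the slides.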
Graph Transformation
Graph Construction
Graph Execution
Split Graph
Register Graph
Run Graph
Programming Model
1 Dataflow Graph
2 Variable
3 Session
4 Graph Construction & Execution
Dataflow Graph
Graph = Set{OP} + Set{Tensor}
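One way to read this formula is that OPs are the nodes of the graph and tensors are the edges between them. The sketch below models that reading with dataclasses. The class and field names (`Tensor`, `Op`, `Graph`, `producer`, `index`, `add_op`) are assumptions made for illustration and do not mirror TensorFlow's actual classes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tensor:
    """An edge: the `index`-th output of the op named `producer`."""
    producer: str
    index: int = 0


@dataclass
class Op:
    """A node: a named computation that consumes input tensors."""
    name: str
    type: str
    inputs: tuple = ()


@dataclass
class Graph:
    """A graph is a set of ops plus the set of tensors flowing between them."""
    ops: dict = field(default_factory=dict)

    def add_op(self, name, type, inputs=()):
        """Register an op and return a handle to its first output tensor."""
        self.ops[name] = Op(name, type, tuple(inputs))
        return Tensor(name)

    def tensors(self):
        """Collect every tensor that is consumed by some op."""
        return {t for op in self.ops.values() for t in op.inputs}


g = Graph()
a = g.add_op("a", "Const")
b = g.add_op("b", "Const")
c = g.add_op("c", "Add", [a, b])
print(sorted(g.ops))
print(g.tensors())
```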
OP: Abstract Computation
Tensor: Dataflow
Variable
Initialization Model
Initialization Dependency
Session
Life Cycle: Python
Life Cycle: C++
Graph Construction & Execution
Graph Construction & Serialization
Example: OP Constructor
Example: Create OP
Example: Graph Construction
Execution Model
1 Execution Model
2 Distributed Example
Execution Model
Local Runtime
Distributed Runtime
Client Session
Create ClientSession
Polymorphism Creation
Create MasterSession
MasterSession Model
Run Step
Split Graph by Task
Split Graph by Device
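Splitting a graph by device can be sketched as follows: every edge whose endpoints sit on different devices is cut, and a send/receive pair is inserted in its place. This is a toy under stated assumptions. The function `split_by_device`, the dictionary layout, the device strings, and the `send_`/`recv_` naming are all hypothetical and are not TensorFlow's partitioning code.

```python
def split_by_device(ops, edges):
    """Partition a graph by device.

    ops:   {op_name: device_name}
    edges: list of (src_op, dst_op) pairs
    Returns {device: {"ops": [...], "edges": [...]}}. Each cross-device edge
    is replaced by a send op on the source device and a recv op on the
    destination device.
    """
    parts = {}
    for name, device in ops.items():
        parts.setdefault(device, {"ops": [], "edges": []})["ops"].append(name)

    for src, dst in edges:
        src_dev, dst_dev = ops[src], ops[dst]
        if src_dev == dst_dev:
            # The edge stays inside a single partition.
            parts[src_dev]["edges"].append((src, dst))
        else:
            # Cut the edge and bridge it with a send/recv pair.
            send = f"send_{src}_to_{dst}"
            recv = f"recv_{src}_to_{dst}"
            parts[src_dev]["ops"].append(send)
            parts[src_dev]["edges"].append((src, send))
            parts[dst_dev]["ops"].append(recv)
            parts[dst_dev]["edges"].append((recv, dst))
    return parts


ops = {"a": "/cpu:0", "b": "/cpu:0", "c": "/gpu:0"}
edges = [("a", "b"), ("b", "c")]
parts = split_by_device(ops, edges)
for device, part in parts.items():
    print(device, part)
```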
Example
Split Graph
Receive Tensor
Training Model
1 Compute Gradients
2 Apply Gradients
3 Training Workflow
Compute Gradients
Optimizer: Compute Gradients
class Optimizer(object):
    def minimize(self, loss, var_list=None, global_step=None):
        # Step 1: build the gradient subgraph.
        grads_and_vars = self.compute_gradients(
            loss, var_list=var_list)
        # Step 2: build the ops that update the variables.
        return self.apply_gradients(
            grads_and_vars,
            global_step=global_step)

    def compute_gradients(self, loss, var_list):
        grads = gradients(loss, var_list)
        return list(zip(grads, var_list))


def gradients(loss, var_list, grad=1):
    # Simplified pseudocode: walk the graph backwards from the loss and
    # pass each op's gradient function the gradient computed so far.
    ops_and_grads = {}
    for op in reversed_graph(loss).topological_sort():
        grad = op.grad_fn(grad)
        ops_and_grads[op] = grad
    return [ops_and_grads.get(var) for var in var_list]
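The backward walk above can be made concrete with a small scalar example. The following is a minimal sketch of reverse-mode differentiation under stated assumptions, not TensorFlow's `gradients` implementation: the `Var` class and the helpers `add`, `mul`, and `topological_sort` are hypothetical, and values are computed eagerly to keep the example short. Unlike the simplified pseudocode, it accumulates gradients when a node feeds several consumers.

```python
class Var:
    """A scalar node that remembers how it was produced."""

    def __init__(self, value, inputs=(), grad_fn=None):
        self.value = value
        self.inputs = tuple(inputs)
        self.grad_fn = grad_fn  # maps the upstream grad to one grad per input


def add(a, b):
    return Var(a.value + b.value, (a, b), lambda g: (g, g))


def mul(a, b):
    return Var(a.value * b.value, (a, b), lambda g: (g * b.value, g * a.value))


def topological_sort(root):
    """Return nodes so that every node appears after all of its inputs."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for inp in node.inputs:
            visit(inp)
        order.append(node)

    visit(root)
    return order


def gradients(loss, var_list):
    """Propagate d(loss)/d(node) from the loss back to every node."""
    grads = {id(loss): 1.0}
    for node in reversed(topological_sort(loss)):
        if node.grad_fn is None:
            continue  # leaf node: nothing to propagate further
        upstream = grads.get(id(node), 0.0)
        for inp, inp_grad in zip(node.inputs, node.grad_fn(upstream)):
            # Sum the contributions from every consumer of `inp`.
            grads[id(inp)] = grads.get(id(inp), 0.0) + inp_grad
    return [grads.get(id(v)) for v in var_list]


x = Var(3.0)
w = Var(2.0)
loss = add(mul(w, x), mul(w, w))  # loss = w*x + w*w
dw, dx = gradients(loss, [w, x])
print(dw, dx)
```

Visiting nodes in reversed topological order guarantees that a node's gradient is fully accumulated before it is passed on to its inputs.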
Gradient Function
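@ops.RegisterGradient("op_name")
def grad_func(op, grad):
    """Construct the gradient subgraph for an op type.

    Returns:
      A list of gradients, one for each input of op.
    """
    return cons_grad_subgraph(op, grad)

For an op f with n inputs and m outputs, the gradient function g maps the gradients with respect to the outputs to the gradients with respect to the inputs:

$$(y_1, y_2, \ldots, y_m) = f(x_1, x_2, \ldots, x_n)$$

$$\left(\frac{\partial L}{\partial x_1}, \frac{\partial L}{\partial x_2}, \ldots, \frac{\partial L}{\partial x_n}\right) = g\left(x_1, x_2, \ldots, x_n;\ \frac{\partial L}{\partial y_1}, \frac{\partial L}{\partial y_2}, \ldots, \frac{\partial L}{\partial y_m}\right)$$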
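Written out with the chain rule, each component of the gradient function g sums the contributions of all m outputs. The second line specializes this to a single-input, single-output op y = x^2 as a worked instance; it is an illustration, not a statement about any particular TensorFlow kernel.

```latex
\frac{\partial L}{\partial x_i}
  = \sum_{j=1}^{m} \frac{\partial L}{\partial y_j}\,
    \frac{\partial y_j}{\partial x_i},
  \qquad i = 1, \ldots, n

% Worked instance: y = x^2, so n = m = 1 and \partial y / \partial x = 2x.
\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} \cdot 2x
```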
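Example: Square
from tensorflow.python.framework import ops

@ops.RegisterGradient("Square")
def SquareGrad(op, grad):
    x = op.inputs[0]
    # Defer computing 2 * x until the incoming gradient is available.
    with ops.control_dependencies([grad.op]):
        # y = x^2, so dL/dx = dL/dy * 2x.
        return grad * (2.0 * x)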
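The formula `grad * (2.0 * x)` can be checked numerically without TensorFlow. The sketch below compares it against a central finite difference; the helper names `square`, `square_grad`, and `numeric_grad` are hypothetical and exist only for this check.

```python
def square(x):
    return x * x


def square_grad(x, grad):
    """Gradient of L with respect to x, given grad = dL/dy for y = x**2."""
    return grad * (2.0 * x)


def numeric_grad(f, x, eps=1e-6):
    """Central finite-difference estimate of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


x = 3.0
analytic = square_grad(x, grad=1.0)
numeric = numeric_grad(square, x)
print(analytic, numeric)
```

With an upstream gradient of 1, the analytic value is 2x, and the finite-difference estimate should agree with it up to rounding error.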
Apply Gradients
def apply_gradients(grads_and_vars, learning_rate):
    # Apply one gradient-descent update to each variable.
    for (grad, var) in grads_and_vars:
        apply_gradient_descent(learning_rate, grad, var)
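The update performed for each variable is plain gradient descent: the variable moves against its gradient, scaled by the learning rate. The following sketch applies that rule to a one-variable loss, (w - 3)^2. The names `sgd_update` and `loss_grad` and the chosen constants are assumptions for illustration and are not the signature of TensorFlow's kernel.

```python
def sgd_update(var, learning_rate, grad):
    """Return the updated value: var - learning_rate * grad."""
    return var - learning_rate * grad


def loss_grad(w):
    """Derivative of the toy loss (w - 3)^2 with respect to w."""
    return 2.0 * (w - 3.0)


w = 0.0
for _ in range(100):
    w = sgd_update(w, learning_rate=0.1, grad=loss_grad(w))
print(w)
```

Each step shrinks the distance to the minimum at w = 3 by a constant factor of 0.8, so the loop converges quickly.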
Training Workflow
Critical Path: RunStep
Distributed Initialization
Bibliography
Papers
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, Google Inc.
TensorFlow: A System for Large-Scale Machine Learning, Google Inc.
Acknowledgments
Q&A
Thanks