
TensorFlow computational graph

To execute a TensorFlow program, we should be familiar with the concepts of graph creation and session execution. Basically, the first is for building the model, and the second is for feeding in the data and getting the results.
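
As a minimal sketch of these two phases (using the TensorFlow 1.x-style API this book follows; the names a, b, and c are just illustrative), the following snippet first builds a trivial graph and only then evaluates it in a session:

    import tensorflow as tf

    # Phase 1: graph creation. Nothing is computed here; we only
    # describe the computation symbolically.
    a = tf.constant(2.0, name="a")
    b = tf.constant(3.0, name="b")
    c = tf.add(a, b, name="c")  # c is a symbolic tensor, not the value 5.0

    # Phase 2: session execution. Only now is the addition performed.
    with tf.Session() as sess:
        print(sess.run(c))  # 5.0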

Interestingly, TensorFlow does everything in its C++ engine, which means not even a small multiplication or addition is executed in Python; Python is just a wrapper. Fundamentally, the TensorFlow C++ engine consists of the following two things:

  • Efficient implementations of operations, such as convolution, max pooling, and sigmoid (for a CNN, for example)
  • Derivatives of each forward-mode operation, which enable automatic differentiation (a sketch follows this list)
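
The second point is what gradient computation relies on. As a hedged sketch of how a client can ask for those derivatives, using the tf.gradients call from the TensorFlow 1.x API:

    import tensorflow as tf

    x = tf.constant(3.0)
    y = x * x  # forward operation: y = x^2

    # tf.gradients chains the registered derivative of each
    # forward op; here dy/dx = 2x.
    grad = tf.gradients(y, x)

    with tf.Session() as sess:
        print(sess.run(grad))  # [6.0]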

The TensorFlow library is unusual in terms of coding style: it is not like conventional Python code, where you write statements and they get executed immediately. TensorFlow code consists of different operations; even variable initialization is special in TensorFlow. When you perform a complex operation with TensorFlow, such as training a linear regression model, TensorFlow internally represents its computation using a data flow graph. This graph is called a computational graph, and it is a directed graph consisting of the following:

  • A set of nodes, each one representing an operation
  • A set of directed arcs, each one representing the data on which the operations are performed

TensorFlow has two types of edges:

  • Normal: They carry data structures between the nodes. The output of one operation, that is, of one node, becomes the input for another operation; the edge connecting the two nodes carries the values.
  • Special: This edge doesn't carry values; it only represents a control dependency between two nodes, say X and Y. It means that node Y will be executed only after the operation in node X has finished executing.

The TensorFlow implementation defines control dependencies to enforce the order of otherwise independent operations as a way of controlling the peak memory usage.
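
A minimal sketch of such a control dependency, using tf.control_dependencies from the TensorFlow 1.x API (the variable and values here are just illustrative):

    import tensorflow as tf

    x = tf.Variable(1.0)
    assign_x = tf.assign(x, 10.0)  # operation X

    # The special edge carries no tensor; it only forces assign_x
    # to run before any op created inside this block.
    with tf.control_dependencies([assign_x]):
        y = tf.identity(x) + 5.0   # operation Y

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(y))  # 15.0, because the assignment ran first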

A computational graph is basically like a data flow graph. Figure 2 shows a computational graph for a simple computation:


Figure 2: A very simple execution graph that computes a simple equation

In the preceding figure, the circles in the graph indicate the operations, while the rectangles indicate the values (tensors) that flow between them. As stated earlier, a TensorFlow graph contains the following:

  • tf.Operation objects: These are the nodes in the graph, usually referred to simply as ops. An op is tensor-in, tensor-out (TITO): it takes one or more tensors as input and produces one or more tensors as output.
  • tf.Tensor objects: These are the edges of the graph, usually referred to simply as tensors.

Tensor objects flow between various ops in the graph. In the preceding figure, d is also an op. It can be a "constant" op whose output is a tensor that contains the actual value assigned to d.
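
To make this concrete, here is a small sketch (the names d and e are illustrative) that inspects the ops and tensors of a freshly built graph:

    import tensorflow as tf

    g = tf.Graph()
    with g.as_default():
        d = tf.constant(4.0, name="d")     # a "constant" op; its output is a tensor
        e = tf.multiply(d, 2.0, name="e")

    # Every node in the graph is a tf.Operation...
    for op in g.get_operations():
        print(op.name, op.type)

    # ...and every edge is a tf.Tensor produced by some op.
    print(e.op.name)         # 'e'
    print(e.dtype, e.shape)  # <dtype: 'float32'> ()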

It is also possible to perform deferred execution using TensorFlow. In a nutshell, once you have composed a highly compositional expression during the building phase of the computational graph, you can still evaluate it later, in the session-running phase. Technically speaking, TensorFlow schedules the job and executes it in an efficient manner.
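
For instance (a sketch with illustrative names), when a session runs, only the subgraph needed for the requested output is actually executed:

    import tensorflow as tf

    a = tf.constant(1.0)
    b = a + 2.0
    c = b * 3.0      # a small compositional expression,
    d = tf.sqrt(c)   # built now, evaluated later

    with tf.Session() as sess:
        # Fetching b executes only the subgraph {a, b};
        # c and d are not computed in this call.
        print(sess.run(b))  # 3.0
        # Fetching d executes the whole chain in one scheduled pass.
        print(sess.run(d))  # 3.0 (the square root of 9.0)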

For example, the parallel execution of independent parts of the graph on the GPU is shown in the following figure:


Figure 3: Edges and nodes in a TensorFlow graph to be executed on a session on devices such as CPUs or GPUs

After a computational graph is created, TensorFlow needs an active session to execute it across multiple CPUs (and GPUs, if available) in a distributed way. In general, you don't really need to specify explicitly whether to use a CPU or a GPU, since TensorFlow can choose which one to use.

By default, a GPU will be picked for as many operations as possible; otherwise, the CPU will be used. Nevertheless, TensorFlow by default allocates all of the GPU memory, even if it does not consume it all; a sketch of how to change both behaviors follows.
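
A hedged sketch using the TensorFlow 1.x configuration API (the explicit '/gpu:0' pinning assumes a GPU is present; allow_soft_placement lets TensorFlow fall back to the CPU if it is not):

    import tensorflow as tf

    # Optionally pin ops to devices; without tf.device, TensorFlow
    # chooses the placement itself (preferring the GPU).
    with tf.device('/cpu:0'):
        a = tf.constant([1.0, 2.0])
    with tf.device('/gpu:0'):
        b = a * 2.0

    config = tf.ConfigProto(log_device_placement=True,
                            allow_soft_placement=True)
    # Grow GPU memory on demand instead of grabbing all of it up front.
    config.gpu_options.allow_growth = True

    with tf.Session(config=config) as sess:
        print(sess.run(b))  # [2. 4.]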

Here are the main components of a TensorFlow graph:

  • Variables: Used to hold values, such as the weights and biases, that persist across session runs.
  • Tensors: Sets of values that pass between nodes to have operations (ops) performed on them.
  • Placeholders: Used to send data from the program into the TensorFlow graph.
  • Session: Encapsulates the environment in which the graph is executed. When a training operation is run in a session, TensorFlow automatically computes gradients for the operations in the graph using the chain rule. In fact, a session is invoked whenever the graph is to be executed (a minimal example follows this list).
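
Here is a minimal sketch that puts all four components together (the names x, w, and y are illustrative):

    import tensorflow as tf

    # Placeholder: data is fed in from the program at run time.
    x = tf.placeholder(tf.float32, shape=[None], name="x")

    # Variable: holds state, such as a weight, across session runs.
    w = tf.Variable(0.5, name="w")

    # Tensor produced by an op: the symbolic computation itself.
    y = w * x

    # Session: executes the graph.
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(y, feed_dict={x: [1.0, 2.0, 3.0]}))  # [0.5 1.  1.5]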

Don't worry, each of these preceding components will be discussed in later sections. Technically, the program you write can be considered a client. The client creates the execution graph symbolically in C/C++ or Python, and then your code can ask TensorFlow to execute this graph. The whole concept becomes clearer with the following figure:


Figure 4: Using a client-master architecture to execute a TensorFlow graph

A computational graph helps to distribute the workload across multiple computing nodes, each with a CPU or GPU. In this way, a neural network can be viewed as a composite function in which each layer (input, hidden, or output) can be represented as a function. To understand the operations performed on the tensors, a good working knowledge of the TensorFlow programming model is necessary.
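
As a closing sketch of this composite-function view (a minimal example assuming the TensorFlow 1.x tf.layers.dense API; the layer sizes here are arbitrary):

    import numpy as np
    import tensorflow as tf

    x = tf.placeholder(tf.float32, shape=[None, 4])

    # Each layer is itself a function; the network is their
    # composition: output = f_out(f_hidden(x)).
    hidden = tf.layers.dense(x, units=8, activation=tf.nn.relu)  # f_hidden
    output = tf.layers.dense(hidden, units=1)                    # f_out

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(output, feed_dict={x: np.ones((2, 4))}))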