Chain Rule of Calculus
If g is differentiable at x and f is differentiable at g(x), then the composite function F = f ∘ g defined by F(x) = f(g(x)) is differentiable at x and F′ is given by: F′(x) = f′(g(x)) · g′(x)
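A minimal sketch in plain Python (the choices g(x) = x² and f(u) = sin(u) are hypothetical examples, not from the notes) that checks the chain rule against a finite-difference approximation:

```python
import math

# Hypothetical example: g(x) = x**2, f(u) = sin(u), so F(x) = f(g(x)) = sin(x**2).
def g(x):
    return x ** 2

def g_prime(x):
    return 2 * x

def f(u):
    return math.sin(u)

def f_prime(u):
    return math.cos(u)

def F_prime(x):
    # Chain rule: F'(x) = f'(g(x)) * g'(x)
    return f_prime(g(x)) * g_prime(x)

x = 1.3
h = 1e-6
numerical = (f(g(x + h)) - f(g(x - h))) / (2 * h)  # central finite difference
print(F_prime(x), numerical)  # the two values should agree closely
```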
Accumulation
- Forward Accumulation
- Backward Accumulation (backprop)
Two different orders in which to accumulate the derivatives.
Forward accumulation takes more steps. Backward accumulation only requires the derivatives of the loss function to get the results (see the sketch below).
During the calculation we only care about the local node, not the entire network.
Accumulation is not learning.
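A minimal sketch (plain Python, a hypothetical three-node graph x → a → b → z, not from the notes) contrasting the two accumulation orders:

```python
import math

# Toy computational graph: x -> a = x**2 -> b = sin(a) -> z = 3*b

# Forward accumulation: propagate d(node)/dx alongside each value, input to output.
def forward_accumulation(x):
    a, da_dx = x ** 2, 2 * x
    b, db_dx = math.sin(a), math.cos(a) * da_dx
    z, dz_dx = 3 * b, 3 * db_dx
    return z, dz_dx

# Backward accumulation (backprop): run the forward pass first, then propagate
# dz/d(node) from the output back toward the input, using only local derivatives.
def backward_accumulation(x):
    a = x ** 2
    b = math.sin(a)
    z = 3 * b
    dz_db = 3.0                    # local derivative of z = 3*b
    dz_da = dz_db * math.cos(a)    # local derivative of b = sin(a)
    dz_dx = dz_da * 2 * x          # local derivative of a = x**2
    return z, dz_dx

print(forward_accumulation(0.7))   # same value and gradient,
print(backward_accumulation(0.7))  # computed in opposite orders
```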
(Figure: backward propagation visualization)
Generalization to Vectors
Vectors follow the same logic: the scalar derivatives in the chain rule become Jacobians, and gradients are accumulated in the same way.
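A minimal sketch (NumPy, a hypothetical linear layer y = W·x with a squared-sum loss, not from the notes) showing that the vector chain rule multiplies the upstream gradient by the Jacobian transpose:

```python
import numpy as np

# Hypothetical vector example: y = W @ x (vector-valued), z = sum(y**2) (scalar loss).
# The Jacobian of y w.r.t. x is W, so the chain rule gives grad_x z = W.T @ grad_y z.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
x = rng.normal(size=4)

y = W @ x
grad_y = 2 * y            # dz/dy for z = sum(y**2)
grad_x = W.T @ grad_y     # Jacobian transpose times upstream gradient

# Check the first component against a finite-difference approximation of dz/dx_0.
h = 1e-6
x_h = x.copy()
x_h[0] += h
z, z_h = np.sum(y ** 2), np.sum((W @ x_h) ** 2)
print(grad_x[0], (z_h - z) / h)  # should agree closely
```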
Backpropagation methods
- Symbol-to-number differentiation – takes a computational graph and a set of numerical values (for the inputs to the graph) and returns a set of numerical values that describe the gradient at those input values. Relevant libs: Torch and Caffe.
- Symbol-to-symbol differentiation – takes a computational graph and returns a graph with additional nodes that provide a symbolic description of the desired derivatives. Relevant libs: Theano and TensorFlow. (Both styles are sketched below.)
Disabling eager execution in TensorFlow makes it use symbol-to-symbol differentiation.
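A minimal sketch contrasting the two styles (assuming torch and tensorflow are installed, with TensorFlow 2.x using the v1 compatibility API): PyTorch's backward() immediately produces gradient numbers, while tf.gradients in graph mode adds a symbolic derivative node that only yields numbers when the graph is run.

```python
# Symbol-to-number style (PyTorch): calling backward() returns numerical gradients.
import torch

x = torch.tensor(2.0, requires_grad=True)
z = torch.sin(x ** 2)
z.backward()
print(x.grad)  # a concrete number: cos(x**2) * 2x evaluated at x = 2.0

# Symbol-to-symbol style (TensorFlow graph mode): tf.gradients adds derivative
# nodes to the graph; numbers only appear when the graph is run in a session.
import tensorflow as tf

tf.compat.v1.disable_eager_execution()
x_sym = tf.compat.v1.placeholder(tf.float32, shape=())
z_sym = tf.sin(x_sym ** 2)
grad_sym = tf.gradients(z_sym, x_sym)[0]  # a new symbolic node, not a number

with tf.compat.v1.Session() as sess:
    print(sess.run(grad_sym, feed_dict={x_sym: 2.0}))
```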