On the different notions of derivative¶

The derivative is one of the core concepts of mathematical analysis, and it is essential whenever a linear approximation of a function at some point is required. Since the notion of derivative has different meanings in different contexts, this guide introduces the different derivative concepts used in ODL.

In short, the different notions of derivative discussed here are:

• Derivative. When we write “derivative” in ODL code and documentation, we mean the derivative of an Operator $A : X \to Y$ with respect to a disturbance in $x$, i.e. a linear approximation of $A(x + h)$ for small $h$. The derivative at a point $x$ is an Operator $A'(x) : X \to Y$.
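• Gradient. If the operator $A$ is a functional, i.e. $A : X \to \mathbb{R}$, then the gradient is the direction in which $A$ increases the most. The gradient at a point $x$ is a vector $[\nabla A](x)$ in $X$ such that $A'(x)(y) = \langle y, [\nabla A](x) \rangle$. The gradient operator is the operator $x \mapsto [\nabla A](x)$.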
• Hessian. The Hessian at a point $x$ is the derivative operator of the gradient operator, i.e. $H(x) = [\nabla A]'(x)$.
• Spatial Gradient. The spatial gradient is only defined for spaces $\mathcal{F}(\Omega, \mathbb{F})$ whose elements are functions over some domain $\Omega \subset \mathbb{R}^n$ taking values in $\mathbb{R}$ or $\mathbb{C}$. It can be seen as a vectorized version of the usual gradient, taken at each point in $\Omega$.
• Subgradient. The subgradient extends the notion of derivative to any convex functional and is used in some optimization solvers where the objective function is not differentiable.

Derivative¶

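The derivative is usually introduced for functions $f : \mathbb{R} \to \mathbb{R}$ via the limit

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$

Here we say that the derivative of $f$ at $x$ is $f'(x)$.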

This limit makes sense in one dimension, but once we start considering functions in higher dimensions we get into trouble. Consider $f : \mathbb{R}^n \to \mathbb{R}^m$: what would $h$ mean in this case? An extension is the concept of a directional derivative. The derivative of $f$ at $x$ in direction $d$ is $f'(x)(d)$:

$$f'(x)(d) = \lim_{h \to 0} \frac{f(x + dh) - f(x)}{h}.$$

Here we see (as implied by the notation) that $f'(x)$ is actually an operator

$$f'(x) : \mathbb{R}^n \to \mathbb{R}^m.$$

We can rewrite this using the explicit requirement that $f'(x)$ is a linear approximation of $f$ at $x$, i.e.

$$\lim_{\| d \| \to 0} \frac{\| f(x + d) - f(x) - f'(x)(d) \|}{\| d \|} = 0.$$

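This notion naturally extends to an Operator $f : X \to Y$ between Banach spaces $X$ and $Y$ with norms $\| \cdot \|_X$ and $\| \cdot \|_Y$, respectively. Here $f'(x)$ is defined as the bounded linear operator (if it exists) that satisfies

$$\lim_{\| d \|_X \to 0} \frac{\| f(x + d) - f(x) - f'(x)(d) \|_Y}{\| d \|_X} = 0.$$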

This definition of the derivative is called the Fréchet derivative.
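The defining limit can be observed numerically. The following is a minimal sketch in plain Python, not ODL code: the map `f`, its hand-coded derivative `f_derivative` and the helper `frechet_ratio` are illustrative names chosen here. It evaluates the quotient from the definition for shrinking disturbances and shows that it tends to zero.

```python
import math


def f(x):
    """Example map f : R^2 -> R^2 (chosen for illustration only)."""
    x1, x2 = x
    return [x1 * x2, math.sin(x1) + x2 ** 2]


def f_derivative(x):
    """Return the linear operator f'(x) as a function d -> f'(x)(d)."""
    x1, x2 = x

    def apply(d):
        d1, d2 = d
        # Jacobian of f at x applied to the direction d
        return [x2 * d1 + x1 * d2, math.cos(x1) * d1 + 2.0 * x2 * d2]

    return apply


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))


def frechet_ratio(x, d):
    """Compute ||f(x + d) - f(x) - f'(x)(d)|| / ||d||."""
    shifted = f([xi + di for xi, di in zip(x, d)])
    base = f(x)
    linear_part = f_derivative(x)(d)
    remainder = [s - b - l for s, b, l in zip(shifted, base, linear_part)]
    return norm(remainder) / norm(d)


x = [0.7, -1.3]
direction = [0.6, 0.8]

# Shrink the disturbance d = t * direction and watch the quotient decrease
ratios = [frechet_ratio(x, [t * di for di in direction])
          for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

Since the remainder is of second order in $d$ for this smooth example, each reduction of the disturbance by a factor of ten should reduce the quotient by roughly the same factor.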

The Gateaux derivative¶

The concept of directional derivative can also be extended to Banach spaces, giving the Gateaux derivative. The Gateaux derivative is more general than the Fréchet derivative, but it need not be a linear operator. For example, the absolute value function on $\mathbb{R}$ has one-sided directional derivatives in every direction at $0$, given by $d \mapsto |d|$, but this map is not linear, so the function is not Fréchet differentiable there. For this reason, when we write “derivative” in ODL, we generally mean the Fréchet derivative, but in some cases the Gateaux derivative can be used via duck-typing.
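The behaviour of the absolute value function at the origin can be illustrated with a few lines of plain Python. This is only a sketch, not ODL code, and the helper name `one_sided_directional_derivative` is made up for this example. It approximates the one-sided limit of the difference quotient at $0$ in the directions $+1$ and $-1$.

```python
def one_sided_directional_derivative(func, x, d, h=1e-8):
    """Approximate lim_{h -> 0+} (func(x + h*d) - func(x)) / h with a small h > 0."""
    return (func(x + h * d) - func(x)) / h


# Directional derivatives of the absolute value at x = 0
right = one_sided_directional_derivative(abs, 0.0, 1.0)
left = one_sided_directional_derivative(abs, 0.0, -1.0)

print(right, left)

# A linear map L would satisfy L(1) + L(-1) = L(0) = 0,
# but here the sum is close to 2, so d -> |d| is not linear.
print(right + left)
```

Both directions give a value close to 1, which is incompatible with linearity, and hence no Fréchet derivative exists at $0$.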

Rules for the Fréchet derivative¶

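Many of the usual rules for derivatives also hold for the Fréchet derivative, for example:

• Linearity

  $$(a f + b g)'(x)(y) = a f'(x)(y) + b g'(x)(y)$$

• Chain rule

  $$(g \circ f)'(x)(y) = g'(f(x))(f'(x)(y))$$

• Linear operators are their own derivatives. If $f$ is linear, then

  $$f'(x)(y) = f(y)$$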
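As a sanity check of the chain rule $(g \circ f)'(x)(y) = g'(f(x))(f'(x)(y))$, the following plain Python sketch compares it with a central finite difference. It does not use ODL; the maps `f` and `g` and their hand-coded derivatives `df` and `dg` are arbitrary examples chosen for illustration.

```python
import math


def f(x):
    """Inner map f : R^2 -> R^2."""
    return [x[0] * x[1], x[0] + x[1] ** 2]


def df(x, y):
    """f'(x)(y): Jacobian of f at x applied to y."""
    return [x[1] * y[0] + x[0] * y[1], y[0] + 2.0 * x[1] * y[1]]


def g(u):
    """Outer map g : R^2 -> R."""
    return math.sin(u[0]) * u[1]


def dg(u, v):
    """g'(u)(v): derivative of g at u applied to v."""
    return math.cos(u[0]) * u[1] * v[0] + math.sin(u[0]) * v[1]


def composed(x):
    """The composition (g o f)(x)."""
    return g(f(x))


x = [0.4, 1.1]
y = [0.3, -0.5]

# Chain rule: apply f'(x) first, then g'(f(x))
chain = dg(f(x), df(x, y))

# Central finite difference of t -> (g o f)(x + t*y) at t = 0
t = 1e-6
plus = composed([xi + t * yi for xi, yi in zip(x, y)])
minus = composed([xi - t * yi for xi, yi in zip(x, y)])
finite_diff = (plus - minus) / (2.0 * t)

print(chain, finite_diff)
```

The two printed numbers should agree up to the accuracy of the finite difference.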

Implementations in ODL¶

• The derivative is implemented in ODL for Operators via the Operator.derivative method.
• It can be numerically computed using the NumericalDerivative operator.
• Many of the operator arithmetic classes implement the usual rules for the Fréchet derivative, such as the chain rule, distributivity over addition etc.

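Gradient¶

In the classical setting of functionals $f : \mathbb{R}^n \to \mathbb{R}$, the gradient is the vector

$$\nabla f = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} \\ \vdots \\ \dfrac{\partial f}{\partial x_n} \end{bmatrix}.$$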

This can be generalized to the setting of functionals $f : X \to \mathbb{R}$ mapping elements in some Banach space $X$ to the real numbers by noting that the Fréchet derivative can be written as

$$f'(x)(y) = \langle y, [\nabla f](x) \rangle,$$

where $[\nabla f](x)$ lies in the dual space of $X$, denoted $X^*$. Most spaces in ODL are Hilbert spaces, where $X^*$ can be identified with $X$ by the Riesz representation theorem, and hence $[\nabla f](x) \in X$.

We call the (possibly nonlinear) operator $x \mapsto [\nabla f](x)$ the Gradient operator of $f$.
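The relation between the gradient and the derivative, $f'(x)(y) = \langle y, [\nabla f](x) \rangle$, can be checked numerically in the classical setting $X = \mathbb{R}^n$. The sketch below is plain Python rather than ODL code; `numerical_gradient` and `directional_derivative` are hypothetical helper names, and the functional `f` is an arbitrary example.

```python
def f(x):
    """Example functional f : R^3 -> R."""
    return x[0] ** 2 + 3.0 * x[0] * x[1] + x[2] ** 3


def numerical_gradient(func, x, step=1e-6):
    """Approximate the gradient by central differences along each axis."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += step
        backward[i] -= step
        grad.append((func(forward) - func(backward)) / (2.0 * step))
    return grad


def directional_derivative(func, x, y, step=1e-6):
    """Approximate f'(x)(y) by a central difference in direction y."""
    plus = func([xi + step * yi for xi, yi in zip(x, y)])
    minus = func([xi - step * yi for xi, yi in zip(x, y)])
    return (plus - minus) / (2.0 * step)


x = [1.0, 2.0, -1.0]
y = [0.5, -1.0, 2.0]

# Analytic gradient for comparison: [2*x0 + 3*x1, 3*x0, 3*x2**2] = [8, 3, 3]
grad = numerical_gradient(f, x)

# Inner product <y, grad f(x)> versus the directional derivative f'(x)(y)
inner = sum(yi * gi for yi, gi in zip(y, grad))
deriv = directional_derivative(f, x, y)

print(grad)
print(inner, deriv)
```

Up to finite-difference error, the inner product and the directional derivative coincide.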

Implementations in ODL¶

• The gradient is implemented in ODL for Functionals via the Functional.gradient method.
• It can be numerically computed using the NumericalGradient operator.

Hessian¶

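In the classical setting of functionals $f : \mathbb{R}^n \to \mathbb{R}$, the Hessian at a point $x$ is the matrix $H(x)$ given by

$$H(x) = \begin{bmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \, \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \, \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \, \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \, \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \, \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \, \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{bmatrix}$$

with the derivatives evaluated at the point $x$. It has the property that the second-order expansion of $f$ is

$$f(x + d) = f(x) + \langle d, [\nabla f](x) \rangle + \frac{1}{2} \langle d, [H(x)](d) \rangle + o(\| d \|^2),$$

but also that the first-order expansion of the gradient operator is

$$[\nabla f](x + d) = [\nabla f](x) + [H(x)](d) + o(\| d \|).$$

If we take this second property as the definition of the Hessian, it can easily be generalized to the setting of functionals $f : X \to \mathbb{R}$ mapping elements in some Hilbert space $X$ to the real numbers.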
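The characterization of the Hessian as the derivative of the gradient operator can be illustrated with the two-dimensional Rosenbrock function $f(x) = (1 - x_1)^2 + 100 (x_2 - x_1^2)^2$. The following plain Python sketch is not taken from ODL: the gradient and Hessian are coded by hand, and the helper `gradient_derivative` is a hypothetical name. It compares the analytic Hessian applied to a direction with a central difference of the gradient.

```python
def gradient(x):
    """Gradient of the 2D Rosenbrock function (1 - x0)^2 + 100 (x1 - x0^2)^2."""
    x0, x1 = x
    return [-2.0 * (1.0 - x0) - 400.0 * x0 * (x1 - x0 ** 2),
            200.0 * (x1 - x0 ** 2)]


def hessian_apply(x, d):
    """Apply the analytic Hessian H(x) to a direction d."""
    x0, x1 = x
    h00 = 2.0 - 400.0 * x1 + 1200.0 * x0 ** 2
    h01 = -400.0 * x0
    h11 = 200.0
    return [h00 * d[0] + h01 * d[1], h01 * d[0] + h11 * d[1]]


def gradient_derivative(x, d, step=1e-6):
    """Approximate [grad f]'(x)(d) by a central difference of the gradient."""
    plus = gradient([xi + step * di for xi, di in zip(x, d)])
    minus = gradient([xi - step * di for xi, di in zip(x, d)])
    return [(p - m) / (2.0 * step) for p, m in zip(plus, minus)]


x = [-0.5, 0.8]
d = [1.0, -2.0]

exact = hessian_apply(x, d)
approx = gradient_derivative(x, d)

print(exact)
print(approx)
```

Both vectors should agree up to finite-difference error, which is the discrete counterpart of $[\nabla f](x + d) = [\nabla f](x) + [H(x)](d) + o(\| d \|)$.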

Implementations in ODL¶

The Hessian is not explicitly implemented anywhere in ODL. Instead, it is available in the form of the derivative of the gradient operator. This is, however, not implemented for all functionals.

• For an example of a functional whose gradient has a derivative, see RosenbrockFunctional.
• It can be computed by taking the NumericalDerivative of the gradient, which can in turn be computed using the NumericalGradient.

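Spatial Gradient¶

The spatial gradient of a function $f \in \mathcal{F}(\Omega, \mathbb{R}) = \{f : \Omega \to \mathbb{R}\}$ over a domain $\Omega \subset \mathbb{R}^n$ is an element $\operatorname{grad} f$ in the function space $\mathcal{F}(\Omega, \mathbb{R}^n)$ such that for any $x \in \Omega$ and any direction $d \in \mathbb{R}^n$

$$\lim_{h \to 0} \frac{| f(x + h d) - f(x) - \langle h d, \operatorname{grad} f(x) \rangle |}{h} = 0.$$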
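On a sampled function, the spatial gradient is typically approximated by finite differences at every grid point. The sketch below, in plain Python, is only meant to illustrate the idea of "one gradient vector per point"; it is not ODL's Gradient operator, and the function name `spatial_gradient` as well as the boundary handling are assumptions made for this example.

```python
def spatial_gradient(values, hx, hy):
    """Forward-difference gradient of a 2D grid function.

    ``values[i][j]`` holds f(x_i, y_j). The result stores one vector
    (df/dx, df/dy) per grid point. On the last row or column, where no
    forward neighbour exists, this sketch uses a backward difference.
    """
    nx, ny = len(values), len(values[0])
    grad = [[None] * ny for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            i0, i1 = (i, i + 1) if i + 1 < nx else (i - 1, i)
            j0, j1 = (j, j + 1) if j + 1 < ny else (j - 1, j)
            dfdx = (values[i1][j] - values[i0][j]) / hx
            dfdy = (values[i][j1] - values[i][j0]) / hy
            grad[i][j] = (dfdx, dfdy)
    return grad


# Sample f(x, y) = 2*x + 3*y on a uniform grid over [0, 1] x [0, 1]
n = 5
h = 1.0 / (n - 1)
values = [[2.0 * (i * h) + 3.0 * (j * h) for j in range(n)] for i in range(n)]

grad = spatial_gradient(values, h, h)
print(grad[2][3])
```

For this linear test function the finite differences are exact up to rounding, so every grid point should carry a vector close to $(2, 3)$.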

Implementations in ODL¶

• The spatial gradient is implemented in ODL in the Gradient operator.
• Several related operators such as the PartialDerivative and Laplacian are also available.