Define chain rules for complex functions #167

Merged
merged 33 commits into from
Jun 26, 2020
Changes from 30 commits
Commits
024dc29
Define chain rules for complex functions
ettersi Jun 2, 2020
4f9426d
use dots not Deltas
oxinabox Jun 12, 2020
0963d16
make FAQ a question
oxinabox Jun 12, 2020
c063297
say pullback
oxinabox Jun 12, 2020
f1562eb
Add expressions for multiplication with Jacobian, and LaTeXify variou…
Jun 13, 2020
a233860
be as precise for frule as for rrule
oxinabox Jun 16, 2020
a9b577a
Delete trailing whitespaces
Jun 17, 2020
36875b6
Add "the"
Jun 17, 2020
a98c0b1
Replace \dot and \bar with \Delta
Jun 17, 2020
a8aafd6
Fix typo
Jun 17, 2020
73b2b27
Minor language tweaks
Jun 17, 2020
5c3ffca
Add code examples
Jun 17, 2020
40166b8
Split code blocks
Jun 17, 2020
80f97af
Remove \,
ettersi Jun 18, 2020
cd20b31
Remove stray rrule(...)[2]
ettersi Jun 18, 2020
521e72f
Remove stray rrule(...)[2]
Jun 18, 2020
06b9d50
Replace x -> z where necessary
Jun 18, 2020
f5c065c
Replace nothing -> Zero()
Jun 18, 2020
98c7e59
"evaluates the function" -> "returns the value"
Jun 18, 2020
a4f829b
Add note about different notions of complex derivatives
Jun 18, 2020
56693f4
Reword note
Jun 18, 2020
8a8030e
Don't treat function definitions as part of the sentence
Jun 18, 2020
5716029
Discuss multidimensional case
Jun 18, 2020
f6d8e2e
`ChainRules.jl` -> ChainRules
Jun 22, 2020
422efc5
Change rrule interpretation from vjp to j'vp
Jun 22, 2020
c27f743
\overline -> \operatorname{conj}
Jun 22, 2020
6ef79f8
Delete multidimensional case
Jun 22, 2020
62347f6
Minor improvements in Note.
Jun 22, 2020
269adf0
pullback -> pullback function
ettersi Jun 23, 2020
2f13561
Minor clarification
Jun 23, 2020
252c0e5
Change rrule formula to ^T
Jun 23, 2020
f1921e9
Change df_dz and friends to du_dx and friends
ettersi Jun 25, 2020
7997aa8
Change df_dz and friends to du_dx and friends
ettersi Jun 25, 2020
90 changes: 90 additions & 0 deletions docs/src/FAQ.md
@@ -76,3 +76,93 @@ ChainRulesTestUtils.jl has some dependencies, so it is a separate package from C
This means your package can depend on the light-weight ChainRulesCore.jl, and make ChainRulesTestUtils.jl a test-only dependency.

Remember to read the section on [On writing good `rrule` / `frule` methods](@ref).

## How do chain rules work for complex functions?

ChainRules follows the convention that `frule` applied to a function ``f(x + i y) = u(x,y) + i v(x,y)`` with perturbation ``\Delta x + i \Delta y`` returns the value and
```math
\tfrac{\partial u}{\partial x} \, \Delta x + \tfrac{\partial u}{\partial y} \, \Delta y + i \, \Bigl( \tfrac{\partial v}{\partial x} \, \Delta x + \tfrac{\partial v}{\partial y} \, \Delta y \Bigr)
.
```
Similarly, `rrule` applied to the same function returns the value and a pullback function which, when applied to the adjoint ``\Delta u + i \Delta v``, returns
```math
\Delta u \, \tfrac{\partial u}{\partial x} + \Delta v \, \tfrac{\partial v}{\partial x} + i \, \Bigl(\Delta u \, \tfrac{\partial u }{\partial y} + \Delta v \, \tfrac{\partial v}{\partial y} \Bigr)
.
```
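As a worked illustration (an example chosen here, not part of the ChainRules API), consider ``f(z) = z^2``, for which ``u(x,y) = x^2 - y^2`` and ``v(x,y) = 2xy``. Substituting the partial derivatives ``\tfrac{\partial u}{\partial x} = 2x``, ``\tfrac{\partial u}{\partial y} = -2y``, ``\tfrac{\partial v}{\partial x} = 2y`` and ``\tfrac{\partial v}{\partial y} = 2x`` into the two formulas above gives
```math
\begin{aligned}
\text{frule:} \quad & (2x \, \Delta x - 2y \, \Delta y) + i \, (2y \, \Delta x + 2x \, \Delta y) = 2z \, (\Delta x + i \Delta y), \\
\text{rrule:} \quad & (2x \, \Delta u + 2y \, \Delta v) + i \, (-2y \, \Delta u + 2x \, \Delta v) = \operatorname{conj}(2z) \, (\Delta u + i \Delta v).
\end{aligned}
```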
If we interpret complex numbers as vectors in ``\mathbb{R}^2``, then `frule` (`rrule`) corresponds to multiplication with the (transposed) Jacobian of ``f(z)``, i.e. `frule` corresponds to
```math
\begin{pmatrix}
\tfrac{\partial u}{\partial x} \, \Delta x + \tfrac{\partial u}{\partial y} \, \Delta y
\\
\tfrac{\partial v}{\partial x} \, \Delta x + \tfrac{\partial v}{\partial y} \, \Delta y
\end{pmatrix}
=
\begin{pmatrix}
\tfrac{\partial u}{\partial x} & \tfrac{\partial u}{\partial y} \\
\tfrac{\partial v}{\partial x} & \tfrac{\partial v}{\partial y} \\
\end{pmatrix}
\begin{pmatrix}
\Delta x \\ \Delta y
\end{pmatrix}

```
and `rrule` corresponds to
```math
\begin{pmatrix}
\tfrac{\partial u}{\partial x} \, \Delta u + \tfrac{\partial v}{\partial x} \, \Delta v
\\
\tfrac{\partial u}{\partial y} \, \Delta u + \tfrac{\partial v}{\partial y} \, \Delta v
\end{pmatrix}
=
\begin{pmatrix}
\tfrac{\partial u}{\partial x} & \tfrac{\partial v}{\partial x} \\
\tfrac{\partial u}{\partial y} & \tfrac{\partial v}{\partial y} \\
\end{pmatrix}
\begin{pmatrix}
\Delta u \\ \Delta v
\end{pmatrix}
.
```
The Jacobian of ``f:\mathbb{C} \to \mathbb{C}`` interpreted as a function ``\mathbb{R}^2 \to \mathbb{R}^2`` can hence be evaluated using either of the following functions.
```julia
function jacobian_via_frule(f,z)
    # Push the unit perturbations 1 (along x) and im (along y) through f.
    fz,df_dx = frule((Zero(), 1),f,z)
    fz,df_dy = frule((Zero(),im),f,z)
    # Columns hold the partial derivatives with respect to x and y.
    return [
        real(df_dx)  real(df_dy)
        imag(df_dx)  imag(df_dy)
    ]
end
```
```julia
function jacobian_via_rrule(f,z)
    fz, pullback = rrule(f,z)
    # Pull back the unit adjoints 1 (for u) and im (for v).
    _,du_dz = pullback( 1)
    _,dv_dz = pullback(im)
    # Rows hold the partial derivatives of u and v with respect to x and y.
    return [
        real(du_dz)  imag(du_dz)
        real(dv_dz)  imag(dv_dz)
    ]
end
```
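
To see what these two functions are expected to return for a function that is not holomorphic, take ``f(z) = |z|^2`` as an illustrative example (chosen here; it assumes only the convention stated above, not any particular rule implementation). Then ``u(x,y) = x^2 + y^2`` and ``v(x,y) = 0``, so the convention implies
```math
\begin{aligned}
\text{frule with } \Delta z = 1: \quad & 2x,
& \text{frule with } \Delta z = i: \quad & 2y,
\\
\text{pullback applied to } 1: \quad & 2x + 2 i y = 2z,
& \text{pullback applied to } i: \quad & 0,
\end{aligned}
```
and both functions assemble the same Jacobian
```math
\begin{pmatrix}
2x & 2y \\
0 & 0
\end{pmatrix}
.
```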

If ``f(z)`` is holomorphic, then the derivative part of `frule` can be implemented as ``f'(z) \, \Delta z`` with ``\Delta z = \Delta x + i \Delta y``, and the derivative part of `rrule` can be implemented as ``\operatorname{conj}\bigl(f'(z)\bigr) \, \Delta f`` with ``\Delta f = \Delta u + i \Delta v``.
Consequently, holomorphic derivatives can be evaluated using either of the following functions.
```julia
function holomorphic_derivative_via_frule(f,z)
    # For holomorphic f, the perturbation 1 yields f'(z) * 1.
    fz,df_dz = frule((Zero(),1),f,z)
    return df_dz
end
```
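```julia
function holomorphic_derivative_via_rrule(f,z)
    fz, pullback = rrule(f,z)
    # For holomorphic f, the adjoint 1 yields conj(f'(z)) * 1.
    dself, conj_df_dz = pullback(1)
    # Undo the conjugation to recover f'(z).
    return conj(conj_df_dz)
end
```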
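
The holomorphic formulas can be checked against the general convention with a short derivation. It is a sketch that assumes only the Cauchy-Riemann equations ``\tfrac{\partial u}{\partial x} = \tfrac{\partial v}{\partial y}`` and ``\tfrac{\partial u}{\partial y} = -\tfrac{\partial v}{\partial x}``, together with ``f'(z) = \tfrac{\partial u}{\partial x} + i \tfrac{\partial v}{\partial x}``, ``\Delta z = \Delta x + i \Delta y`` and ``\Delta f = \Delta u + i \Delta v``:
```math
\begin{aligned}
f'(z) \, \Delta z
&= \tfrac{\partial u}{\partial x} \, \Delta x - \tfrac{\partial v}{\partial x} \, \Delta y + i \, \Bigl( \tfrac{\partial v}{\partial x} \, \Delta x + \tfrac{\partial u}{\partial x} \, \Delta y \Bigr)
\\
&= \tfrac{\partial u}{\partial x} \, \Delta x + \tfrac{\partial u}{\partial y} \, \Delta y + i \, \Bigl( \tfrac{\partial v}{\partial x} \, \Delta x + \tfrac{\partial v}{\partial y} \, \Delta y \Bigr)
,
\\
\operatorname{conj}\bigl(f'(z)\bigr) \, \Delta f
&= \Delta u \, \tfrac{\partial u}{\partial x} + \Delta v \, \tfrac{\partial v}{\partial x} + i \, \Bigl( -\Delta u \, \tfrac{\partial v}{\partial x} + \Delta v \, \tfrac{\partial u}{\partial x} \Bigr)
\\
&= \Delta u \, \tfrac{\partial u}{\partial x} + \Delta v \, \tfrac{\partial v}{\partial x} + i \, \Bigl( \Delta u \, \tfrac{\partial u}{\partial y} + \Delta v \, \tfrac{\partial v}{\partial y} \Bigr)
.
\end{aligned}
```
Both results agree term by term with the `frule` and `rrule` expressions given at the start of this section.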

!!! note
    There are various notions of complex derivatives (holomorphic and Wirtinger derivatives, Jacobians, gradients, etc.) which differ in subtle but important ways.
    The goal of ChainRules is to provide the basic differentiation rules upon which these derivatives can be implemented, but it does not implement these derivatives itself.
    It is recommended that you carefully check how the above definitions of `frule` and `rrule` translate into your specific notion of complex derivative, since getting this wrong will silently give you wrong results.