Complex (Cognitive) Systems
work in progress...
This marvelous course by Prof. Markus Müller revealed like a hidden treasure in the most unexpected of places: a socialsciences and humanitiesoriented postgrad program, in an unimportant university of a lessthanblooming country (the cognitive science masters degree from the Autonomous State University of Morelos, Mexico). The aim of selfpublishing notes and assignments, and assembling them into a coherent whole, goes beyond giving everyone brave enough^{1} a shot to enjoy it. It also sits as a personal reminder of my interest at the intersection of mathematical modeling, biophysics and the mind.
the brain meets the operational definition of what a complex^{2} system is:
 hierarchical organisation, scalefree structure, manytomany relations among components, interconnectedness (recurrence)
 nonlinear behaviour, emergent properties (hard to determine from the observation of components alone, which number in the millions)
 chaotic to initial conditions (slight differences lead to widely different states)
 nonstationary measurements (dynamical system whose model constants/parameters actually vary; system doesn't stay in an attractor)
 phase transitions, equilibrium depends on critical states
 noisy measurements, plus some rather stochastic processes (spiking), whose measurable fluctuations may be difficult to distinguish from chaotic determinism
• dynamical systems basics
we'll distinguish between two kinds of systems:
 stochastic
 dynamical (deterministic). e.g. the harmonic oscillator:
• harmonic oscillator
consider the mechanics of a mass attached to a spring (no other forces involved, e.g. friction, gravity). let's say this is a very simple brain model (or of one of its components thereof) whose activity behaves like an oscillating variable:
from Newton's second law and Hooke's law:
$$F=ma=kx$$
therefore^{3}
$$m\frac{{d}^{2}x}{d{t}^{2}}=kx$$
$$\stackrel{\u0308}{x(t)}=\frac{k}{m}x$$
$$\stackrel{\u0308}{x(t)}={\omega}_{0}^{2}x$$
in order to solve the secondorder linear ordinary homogeneous differential equation, we partition it into a system of two firstorder equations. let $v=\stackrel{\u0307}{x}$ , therefore:
$$\{\begin{array}{c}\hfill \stackrel{\u0307}{v}={\omega}_{0}^{2}x\hfill \\ \hfill \stackrel{\u0307}{x}=v\hfill \end{array}$$
$$\{\begin{array}{c}\hfill \stackrel{\u0307}{v}={\omega}_{0}^{2}\int v\hfill \\ \hfill \stackrel{\u0307}{x}=\int {\omega}_{0}^{2}x={\omega}_{0}^{2}\int x\hfill \end{array}$$
notice that with the last substitutions both equations are the same, except in terms of different variables. both sin(ω_{0}t) and cos(ω_{0}t) satisfy it, because:
$$\frac{d}{dt}sin({\omega}_{0}t)={\omega}_{0}cos({\omega}_{0}t)={\omega}_{0}^{2}\int sin({\omega}_{0}t)dt$$
and $$\frac{d}{dt}cos({\omega}_{0}t)={\omega}_{0}sin({\omega}_{0}t)={\omega}_{0}^{2}\int cos({\omega}_{0}t)dt$$
more generally, solutions will conform to the linear superposition principle:
$$x(t)=asen({\omega}_{0}t)+bcos({\omega}_{0}t);\phantom{\rule{0.278em}{0ex}}a,b\in \mathbb{R}$$
any possible function of position on time will look sinusoidal, of varying frequency (depending on $\sqrt{k/m}$ ), amplitude and phase (a and b together). we rewrite it as:
$$x(t)=\sqrt{{a}^{2}+{b}^{2}}[\frac{a}{\sqrt{{a}^{2}+{b}^{2}}}sen({\omega}_{0}t)+\frac{b}{\sqrt{{a}^{2}+{b}^{2}}}cos({\omega}_{0}t)]=A[{c}_{1}sen({\omega}_{0}t)+{c}_{2}sen({\omega}_{0}t)]$$
this has the property that $1<={c}_{1},{c}_{2}<=1$ , and ${c}_{1}^{2}+{c}_{2}^{2}=1$ .
let ${\phi}_{0}$ be the initial phase. then $\exists {\phi}_{0}\phantom{\rule{0.278em}{0ex}}(\frac{a}{\sqrt{{a}^{2}+{b}^{2}}}=cos({\phi}_{0})\wedge \frac{b}{\sqrt{{a}^{2}+{b}^{2}}}=cos({\phi}_{0}))$
so:
$$x(t)=A[cos({\phi}_{0})sen({\omega}_{0}t)+sen({\phi}_{0})sen({\omega}_{0}t)]=$$
$$A[sen({\omega}_{0}t+{\phi}_{0})]$$
graphically:
• initial conditions
finally, let's say the spring is totally relaxed at t=0, so $x(0)=A=A[sen({\phi}_{0})]$ . this implies that ${\phi}_{0}=\pi /2$ :
$$x(t)=A[sen({\omega}_{0}t+\pi /2)]$$
• complex polar notation
prove that $z={e}^{i\phi}$ is the unit circle.
following Euler's formula, ${e}^{i\phi}=cos(\phi )+isin(\phi )\in (\mathbb{R}\to \u2102)$ . therefore $\stackrel{\u20d7}{z}=\sqrt{{e}^{i\phi}{e}^{i\phi}}=\sqrt{{e}^{i\phi i\phi}}=\sqrt{{e}^{0}}=1.$ ^{4}
solutions of the form $A{e}^{\lambda t}$ include the aforementioned trigonometric solutions when λ, A ∈ ℂ. so the #original differential equation can be rewritten as:
$${\lambda}^{2}A{e}^{\lambda t}+{\omega}_{0}^{2}A{e}^{\lambda t}=0=({\lambda}^{2}+{\omega}_{0}^{2})A{e}^{\lambda t}$$
$$\u22a2{\lambda}^{2}+{\omega}_{0}^{2}=0$$
$$\u22a2\lambda =\sqrt{{\omega}_{0}^{2}}=\pm i{\omega}_{0}$$
the general solution becomes:
$$x(t)=A{e}^{i{\omega}_{0}t}+B{e}^{i{\omega}_{0}t}$$
if we restrict x(t) to the type ℝ→ℝ (position is real), it follows that x(t) equals its complex conjugate.
moreover, $\overline{x(t)}=\overline{A{e}^{i{\omega}_{0}t}+B{e}^{i{\omega}_{0}t}}=\overline{A{e}^{i{\omega}_{0}t}}+\overline{B{e}^{i{\omega}_{0}t}}=\overline{A}{e}^{i{\omega}_{0}t}+\overline{B}{e}^{i{\omega}_{0}t}$ . this implies that $A=\overline{B}$ and $B=\overline{A}$ . so:
$$x(t)=(a+ib)[cos({\omega}_{0}t)+isin({\omega}_{0}t)]+(aib)[cos({\omega}_{0}t)isin({\omega}_{0}t)]$$
$$=2acos(\omega t)2bsin(\omega t)\in (\mathbb{R}\to \mathbb{R})$$
compare this expression against the one using trigonometric functions.
• energy loss
(also see #dissipative systems and attractors)
now we will damp the oscillator by adding a term for friction, whose force is proportional to velocity (but in opposite direction):
$${F}_{T}=\sum F$$
$$\stackrel{\u0308}{x(t)}={\omega}_{0}^{2}x(t)2\beta \stackrel{\u0307}{x(t)}$$
assume a family of solutions of the form $x(t)=A{e}^{\lambda t}$ . therefore:
$${\lambda}^{2}A{e}^{\lambda t}+2\beta \lambda A{e}^{\lambda t}+{\omega}_{0}^{2}A{e}^{\lambda t}=0$$
$$A{e}^{\lambda t}({\lambda}^{2}+2\beta \lambda +{\omega}_{0}^{2})=0$$
$${\lambda}^{2}+2\beta \lambda ={\omega}_{0}^{2}$$
$${\lambda}^{2}+2\beta \lambda +{\beta}^{2}={\omega}_{0}^{2}+{\beta}^{2}=(\lambda +\beta {)}^{2}$$
$$\lambda =\beta \pm i\sqrt{{\omega}_{0}^{2}{\beta}^{2}}$$
plug that back into the general solution:
$$x(t)=A{e}^{(\beta +i\sqrt{{\omega}_{0}^{2}{\beta}^{2}})t}+B{e}^{(\beta i\sqrt{{\omega}_{0}^{2}{\beta}^{2}})t}$$
$$=A{e}^{\beta t}{e}^{i\sqrt{{\omega}_{0}^{2}{\beta}^{2}}t}+B{e}^{\beta t}{e}^{i\sqrt{{\omega}_{0}^{2}{\beta}^{2}}t}$$
$$={e}^{\beta t}(A{e}^{i\sqrt{{\omega}_{0}^{2}{\beta}^{2}}t}+B{e}^{i\sqrt{{\omega}_{0}^{2}{\beta}^{2}}t})$$
asymptotically, the system will inevitably evolve to the most entropic dynamic state. let's show this on a casebycase basis, taking note on the different (non)monotonic dynamics:
• underdamped: ω_{0} > β
let $\omega =\sqrt{{\omega}_{0}^{2}{\beta}^{2}}$ :
$$x(t)={e}^{\beta t}(A{e}^{i\omega t}+B{e}^{i\omega t})$$
by restricting x(t) to ℝ→ℝ, and from our previous knowledge of the simple harmonic oscillator solution:
$$\overline{x(t)}={e}^{\beta t}Dsin(\omega t+{\phi}_{0})$$
• ω_{0} = β
$${\lambda}_{1}=\beta \pm i\sqrt{0}=\beta ={\lambda}_{2}$$
we leverage the fact that amplitude isn't constant anymore:
$$x(t)=A(t){e}^{\beta t}$$
$$\stackrel{\u0307}{x}(t)=\stackrel{\u0307}{A}(t){e}^{\beta t}A(t)\beta {e}^{\beta t}$$
$$\stackrel{\u0308}{x}(t)=\stackrel{\u0308}{A}(t){e}^{\beta t}\stackrel{\u0307}{A}(t)\beta {e}^{\beta t}\stackrel{\u0307}{A}(t)\beta {e}^{\beta t}+A(t){\beta}^{2}{e}^{\beta t}$$
and the damped harmonic oscillator equation becomes:
$${e}^{\beta t}[\stackrel{\u0308}{A}2\stackrel{\u0307}{A}\beta +A{\beta}^{2}+2\beta \stackrel{\u0307}{A}2{\beta}^{2}A+{\beta}^{2}A]=0$$
$${e}^{\beta t}\left[\stackrel{\u0308}{A}\right]=0$$
$$\u22a2\stackrel{\u0308}{A}(t)=0$$
$$\u22a2\stackrel{\u0308}{A}(t)=0\leftrightarrow A(t)={a}_{1}+{a}_{2}t$$
i.e., amplitude is a linear function. so the solution must be of the form:
$$x(t)={a}_{1}{e}^{\beta t}+{a}_{2}t{e}^{\beta t}$$
graphically, the spring oscillates only once:
• overdamped: ω_{0} < β
$$\lambda =\beta \pm i\sqrt{{\omega}_{0}^{2}{\beta}^{2}}=\beta \pm {i}^{2}\sqrt{{\beta}^{2}{\omega}_{0}^{2}}\in \mathbb{R}$$
$$x(t)=A{e}^{\beta t}[A{e}^{\omega t}+B{e}^{\omega t}]$$
because $\beta >\omega $ , $A{e}^{\omega t}$ is overshadowed by $B{e}^{\omega t}$ . the plot never gets to complete an oscillation:
• energy input
the damped model will get augmented with the influence of external stimuli, driving our cognitiveunit model to activity. this is known as a driven harmonic oscillator. let's go with an engine rotating at frequency $\stackrel{\u0303}{\omega}$ and amplitude F, which in turn pushes and pulls the rest of the system:
$$\stackrel{\u0308}{x(t)}+2\beta \stackrel{\u0307}{x(t)}+{\omega}_{0}^{2}x(t)=Fcos(\stackrel{\u0303}{\omega}t)$$
the equation isn't homogeneous anymore. the solution is the sum of the general and particular solutions: the homogeneous one is of the type ${e}^{\beta t}$ and eventually dissipates. the particular one oscillates forever:
$${x}_{p}(t)=A{e}^{i\stackrel{\u0303}{\omega}t}$$
• resonance
starting from an exploration of the steady state ("final conditions"), we will show that amplitude is a function of the engine's frequency, and that there are privileged frequencies for which steadystate amplitude forms maxima or singularities: resonance frequencies.
$$\stackrel{\u0307}{{x}_{p}(t)}=i\stackrel{\u0303}{\omega}A{e}^{i\stackrel{\u0303}{\omega}t}$$
$$\stackrel{\u0308}{{x}_{p}(t)}={\stackrel{\u0303}{\omega}}^{2}A{e}^{i\stackrel{\u0303}{\omega}t}$$
so we rewrite the full differential equation:
$$[{\stackrel{\u0303}{\omega}}^{2}A+2\beta iA\stackrel{\u0303}{\omega}+{\omega}_{0}^{2}A]{e}^{i\stackrel{\u0303}{\omega}t}=F{e}^{i\stackrel{\u0303}{\omega}t}$$
$$[{\omega}_{0}^{2}{\stackrel{\u0303}{\omega}}^{2}+2\beta i\stackrel{\u0303}{\omega}]A=F$$
$$A(\stackrel{\u0303}{\omega})=\frac{F}{[{\omega}_{0}^{2}{\stackrel{\u0303}{\omega}}^{2}+i2\beta \stackrel{\u0303}{\omega}]}$$
$$=\frac{F}{[({\omega}_{0}^{2}{\stackrel{\u0303}{\omega}}^{2})+i2\beta \stackrel{\u0303}{\omega}]}\frac{[({\omega}_{0}^{2}{\stackrel{\u0303}{\omega}}^{2})i2\beta \stackrel{\u0303}{\omega}]}{[({\omega}_{0}^{2}{\stackrel{\u0303}{\omega}}^{2})i2\beta \stackrel{\u0303}{\omega}]}$$
$$=\frac{F[({\omega}_{0}^{2}{\stackrel{\u0303}{\omega}}^{2})2\beta i\stackrel{\u0303}{\omega}]}{[({\omega}_{0}^{2}{\stackrel{\u0303}{\omega}}^{2}{)}^{2}+(2\beta \stackrel{\u0303}{\omega}{)}^{2}]}$$
(...magic happens here...)
$$A(\stackrel{\u0303}{\omega})=\frac{F}{\sqrt{({\omega}_{0}^{2}{\stackrel{\u0303}{\omega}}^{2}{)}^{2}+(2\beta \stackrel{\u0303}{\omega}{)}^{2}}}$$
• phase space
phase space contains all information of a dynamical system, with a different axis for each variable of interest. each point in it represents a state of the whole system, and a trajectory in this virtual space corresponds to the evolution of the system.
signal: disregarding noise, taking a series of measurements is equivalent to observing variation in a single dimension of phase space. graphically, this is a projection of the system upon some vector in phase space.
• Fourier analysis
• linear independence of Fourier terms
let ${v}_{n}(x)=\frac{1}{\sqrt{2a}}{e}^{i\frac{n\pi}{a}x}$ with $n=0,\pm 1,\pm 2,...$ be functions on the interval $[a,a]$ . prove that ${v}_{n}\xb7{v}_{m}={\delta}_{nm}$ , given the proper definition of the dot product. ${\delta}_{nm}=\{\begin{array}{cc}\hfill 1\hfill & \hfill \iff n=m\hfill \\ \hfill 0\hfill & \hfill \iff n\ne m\hfill \end{array}$ is Dirac's delta distribution.
let ${V}_{n}\xb7{V}_{m}={\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}{V}_{n}^{\star}(x){V}_{m}(x)$ be the scalar product for functions of the vector basis defined by ${V}_{n}(x)$ .
proof by modus ponens follows:
 restatement of conditional premise
given the above definitions; if ${V}_{n}\xb7{V}_{m}={\delta}_{nm}$ , then ${V}_{n}(x)$ is an orthogonal basis. that is:
$$\forall n\forall m\phantom{\rule{0.278em}{0ex}}(({V}_{n}\xb7{V}_{n}=1)\wedge ({V}_{n}\xb7{V}_{m}=0;n\ne m)\Rightarrow {V}_{n}(x)\phantom{\rule{0.333em}{0ex}}\mathrm{\text{is orthogonal}})$$
 proof of the antecedent premise
indeed: $\forall n\forall m\phantom{\rule{0.278em}{0ex}}(({V}_{n}\xb7{V}_{n}=1)\wedge ({V}_{n}\xb7{V}_{m}=0))$
because:
$$\u22a2({\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}{V}_{n}^{\star}(x){V}_{n}(x)=1)\wedge ({\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}{V}_{n}^{\star}(x){V}_{m}(x)=0)$$
$$\u22a2({\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}\frac{1}{\sqrt{2a}}{e}^{in\pi x/a}\frac{1}{\sqrt{2a}}{e}^{in\pi x/a}=1)\wedge ({\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}\frac{1}{\sqrt{2a}}{e}^{in\pi x/a}\frac{1}{\sqrt{2a}}{e}^{im\pi x/a}=0)$$
$$\u22a2(\frac{1}{2a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}{e}^{in\pi x/a}{e}^{in\pi x/a}=1)\wedge (\frac{1}{2a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}{e}^{in\pi x/a}{e}^{im\pi x/a}=0)$$
$$\u22a2(\frac{1}{2a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}{e}^{(in\pi x/a)(in\pi x/a)}=1)\wedge (\frac{1}{2a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}{e}^{(in\pi x/a)(im\pi x/a)}=0)$$
$$\u22a2(\frac{1}{2a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}{e}^{0}=1)\wedge (\frac{1}{2a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}{e}^{i\pi x(nm)/a}=0)$$
from Euler's formula:
$$\u22a2(\frac{1}{2a}{\int}_{a}^{a}dx=1)\wedge (\frac{1}{2a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}(cos(\frac{\pi x(nm)}{a})+i\phantom{\rule{0.278em}{0ex}}sen(\frac{\pi x(nm)}{a}))=0)$$
$$\u22a2({\frac{x}{2a}}_{a}^{a}=1)\wedge (\frac{1}{2a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}(cos(\frac{\pi x(nm)}{a})+i\phantom{\rule{0.278em}{0ex}}sen(\frac{\pi x(nm)}{a}))=0)$$
from the integration of the odd function $sen(\frac{\pi x(nm)}{a})$ over a zerocentered interval, $[a,a]$ :
$$\u22a2(\frac{a}{2a}\frac{a}{2a}=1)\wedge (\frac{1}{2a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}(cos(\frac{\pi x(nm)}{a})+0)=0)$$
$$\u22a2(1=1)\wedge (\frac{1}{2a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}cos(\frac{\pi x(nm)}{a})=0)$$
let $u=\frac{\pi x(nm)}{a}$ ; then $\frac{du}{dx}=\frac{\pi (mn)}{a}$ .
$$\u22a2\top \wedge (\frac{1}{2a}{\int}_{a}^{a}cos(u)\frac{\pi (nm)}{a}\phantom{\rule{0.278em}{0ex}}dx\phantom{\rule{0.278em}{0ex}}\frac{a}{\pi (nm)}=0)$$
$$\u22a2\top \wedge (\frac{a}{2a\pi (nm)}{\int}_{a}^{a}cos(u)\phantom{\rule{0.278em}{0ex}}du=0)$$
$$\u22a2\top \wedge (\frac{a}{2a\pi (nm)}{[sen(u)]}_{a}^{a}=0)$$
$$\u22a2\top \wedge (\frac{a}{2a\pi (nm)}[sen(\frac{\pi (a)(nm)}{a})sen(\frac{\pi (a)(nm)}{a})]=0)$$
note that $(nm)\in \mathbb{Z}$ . $\u22a2\forall n\forall m\phantom{\rule{0.278em}{0ex}}(sen(\pi (nm))=0)$
$$\u22a2\top \wedge (\frac{a}{2a\pi (nm)}[00]=0)$$
$$\u22a2\top \wedge \top $$
 conclusion
$\u22a2{V}_{n}(x)$ is an orthogonal basis.
• Fourier transform
the Fourier series represents a periodic function as a discrete linear combination of sines and cosines . on the other hand, the Fourier transform is capable of representing a more general set of functions, not restricted to being periodic. how is that achieved?
(for convenience, we will start from the series in complex exponential notation)
$$f(x){f}_{0}=\sum _{n=1}^{\infty}[{a}_{n}cos\left(\frac{n\pi x}{a}\right)+{b}_{n}sen\left(\frac{n\pi x}{a}\right)]$$
$$=\sum _{\infty}^{\infty}\underset{{\alpha}_{n}}{\underset{\u23df}{(\frac{1}{2a}{\int}_{a}^{a}f(x){e}^{in\pi x/a}\phantom{\rule{0.278em}{0ex}}dx)}}{e}^{in\pi x/a}$$
if $f(x)$ weren't periodic, the series would be representing it only at $[a,a]$ . nonetheless, we can make the interval arbitrarily large:
$$=\underset{a\to \infty}{lim}\sum _{\infty}^{\infty}\underset{{\alpha}_{n}}{\underset{\u23df}{(\frac{1}{2a}{\int}_{a}^{a}f(x){e}^{in\pi x/a}\phantom{\rule{0.278em}{0ex}}dx)}}{e}^{in\pi x/a}$$
let ${k}_{n}=\frac{n\pi}{a}$ , then $\Delta k=\frac{\pi}{a}$ . substituting in the last expression yields:
$$=\underset{a\to \infty}{lim}\sum _{\infty}^{\infty}\Delta k{\left(\Delta k\right)}^{1}\left(\frac{1}{2a}\right)({\int}_{a}^{a}f(x){e}^{i{k}_{n}x}\phantom{\rule{0.278em}{0ex}}dx)\left({e}^{i{k}_{n}x}\right)$$
$$=\underset{a\to \infty}{lim}\sum _{\infty}^{\infty}\left(\frac{\pi}{a}\right)\left(\frac{a}{2a\pi}\right)({\int}_{a}^{a}f(x){e}^{i{k}_{n}x}\phantom{\rule{0.278em}{0ex}}dx)\left({e}^{i{k}_{n}x}\right)$$
$$=\underset{\Delta k\to 0}{lim}\phantom{\rule{0.278em}{0ex}}\frac{1}{2\pi}\sum _{\infty}^{\infty}\Delta k({\int}_{\infty}^{\infty}f(x){e}^{i{k}_{n}x}\phantom{\rule{0.278em}{0ex}}dx)\left({e}^{i{k}_{n}x}\right)$$
which is a Riemman sum for variable ${k}_{n}$ :
$$=\frac{1}{2\pi}{\int}_{\infty}^{\infty}d{k}_{n}\phantom{\rule{0.278em}{0ex}}({\int}_{\infty}^{\infty}f(x){e}^{i{k}_{n}x}\phantom{\rule{0.278em}{0ex}}dx)\left({e}^{i{k}_{n}x}\right)$$
$$={\left(\sqrt{\frac{1}{2\pi}}\right)}^{2}{\int}_{\infty}^{\infty}({\int}_{\infty}^{\infty}f(x){e}^{i{k}_{n}x}\phantom{\rule{0.278em}{0ex}}dx){e}^{i{k}_{n}x}\phantom{\rule{0.278em}{0ex}}d{k}_{n}$$
$$=\underset{\mathrm{\text{Inverse Fourier transform}}}{\underset{\u23df}{\frac{1}{\sqrt{2\pi}}{\int}_{\infty}^{\infty}\underset{\mathrm{\text{Fourier transform}}}{\underset{\u23df}{(\frac{1}{\sqrt{2\pi}}{\int}_{\infty}^{\infty}f(x){e}^{i{k}_{n}x}\phantom{\rule{0.278em}{0ex}}dx)}}{e}^{i{k}_{n}x}\phantom{\rule{0.278em}{0ex}}d{k}_{n}}}$$
• properties
which symmetry properties does H(f) has if h(t) is:
 odd real
 odd imaginary
 odd complex
and which symmetry properties does h(t) has if H(f ) is:
 odd real
 even complex
 even imaginary
recall that:
$$H(f)={\int}_{\infty}^{\infty}[r(t)+ij(t)][cos(2\pi ft)+isen(2\pi ft)]dt$$
$$={\int}_{\infty}^{\infty}[r(t)cos(2\pi ft)+j(t)sen(2\pi ft)]dt+i{\int}_{\infty}^{\infty}[j(t)cos(2\pi ft)r(t)sen(2\pi ft)]dt;$$
$$h(t)={\int}_{\infty}^{\infty}[R(f)+iJ(f)][cos(2\pi ft)+isen(2\pi ft)]dt$$
$$={\int}_{\infty}^{\infty}[R(f)cos(2\pi ft)+J(f)sen(2\pi ft)]dt+i{\int}_{\infty}^{\infty}[J(f)cos(2\pi ft)R(f)sen(2\pi ft)]dt.$$
 odd real h(t)
we get rid of the imaginary part of H(f):
$$H(f)={\int}_{\infty}^{\infty}[r(t)cos(2\pi ft)+0)]dt+i{\int}_{\infty}^{\infty}[0r(t)sen(2\pi ft)]dt$$
from integration of an odd function (r(t)cos(2πft)) at a symmetric interval:
$$H(f)=0+i{\int}_{\infty}^{\infty}[r(t)sen(2\pi ft)]dt$$
$sen(2\pi ft)$ is odd. $\therefore H(f)$ is odd and imaginary.
 odd imaginary h(t)
same reasoning:
$$H(f)={\int}_{\infty}^{\infty}[0+j(t)sen(2\pi ft)]dt+i{\int}_{\infty}^{\infty}[j(t)cos(2\pi ft)0]dt$$
$$H(f)={\int}_{\infty}^{\infty}[j(t)sen(2\pi ft)]dt+0$$
$sen(2\pi ft)$ is odd. $\therefore H(f)$ is odd and real.
 odd h(t)
$$H(f)={\int}_{\infty}^{\infty}[0+j(t)sen(2\pi ft)]dt+i{\int}_{\infty}^{\infty}[0r(t)sen(2\pi ft)]dt$$
$sen(2\pi ft)$ is odd. $\therefore H(f)$ is odd and complex.
 odd real H(f)
$$h(t)={\int}_{\infty}^{\infty}[R(f)cos(2\pi ft)+0]dt+i{\int}_{\infty}^{\infty}[0R(f)sen(2\pi ft)]dt$$
$$h(t)=0+i{\int}_{\infty}^{\infty}R(f)sen(2\pi ft)dt$$
$sen(2\pi ft)$ is odd. $\therefore h(t)$ is imaginary and odd.
 complex even H(f)
$$h(t)={\int}_{\infty}^{\infty}[R(f)cos(2\pi ft)+0]dt+i{\int}_{\infty}^{\infty}[J(f)cos(2\pi ft)0]dt$$
$cos(2\pi ft)$ is even. $\therefore h(t)$ is even and complex.
 imaginary even H(f)
$$h(t)={\int}_{\infty}^{\infty}[0+J(f)sen(2\pi ft)]dt+i{\int}_{\infty}^{\infty}[J(f)cos(2\pi ft)0]dt$$
$$h(t)=0+i{\int}_{\infty}^{\infty}J(f)cos(2\pi ft)dt$$
$cos(2\pi ft)$ is even. $\therefore h(t)$ is even and imaginary.
• uncertainty principle
the more shortlived the wave in the time domain, the more widespread its power spectrum (a.k.a. frequency spectrum). vice versa.
• convolution theorem
$y(t)=x(t)\ast h(t)={\int}_{\infty}^{\infty}x(\tau )h(t\tau )d\tau $ is the convolution of functions x(t) y h(t). F[y(t)] denotes its Fourier transform:
$$F[y(t)]={\int}_{\infty}^{\infty}y(t){e}^{2\pi ift}dt$$
$$={\int}_{\infty}^{\infty}[{\int}_{\infty}^{\infty}x(\tau )h(t\tau )d\tau ]{e}^{2\pi ift}dt$$
$$={\int}_{\infty}^{\infty}x(\tau )[{\int}_{\infty}^{\infty}h(t\tau ){e}^{2\pi ift}dt]d\tau $$
let u=tτ.
$$={\int}_{\infty}^{\infty}x(\tau )[{\int}_{\infty}^{\infty}h(u){e}^{2\pi if(u+\tau )}du]d\tau $$
$$={\int}_{\infty}^{\infty}x(\tau )[{\int}_{\infty}^{\infty}h(u){e}^{2\pi ifu}{e}^{2\pi if\tau}du]d\tau $$
$$=[{\int}_{\infty}^{\infty}x(\tau ){e}^{2\pi if\tau}d\tau ][{\int}_{\infty}^{\infty}h(u){e}^{2\pi ifu}du]$$
$$=X(f)H(f)$$
therefore:
$$x(t)\ast h(t)={F}^{1}[X(f)H(f)]$$
timedomain convolution is equivalent to frequencydomain product!
• crosscorrelation theorem
${z}_{x,h}(t)={\int}_{\infty}^{\infty}x(\tau )h(t+\tau )d\tau $ is the correlation (covariance, actually) of functions x(t) y h(t) as a function of a time lag between them. F[z(t)] denotes its Fourier transform:
$$F[z(t)]={\int}_{\infty}^{\infty}z(t){e}^{2\pi ift}dt$$
$$={\int}_{\infty}^{\infty}[{\int}_{\infty}^{\infty}x(\tau )h(t+\tau )d\tau ]{e}^{2\pi ift}dt$$
$$={\int}_{\infty}^{\infty}x(\tau )[{\int}_{\infty}^{\infty}h(t+\tau ){e}^{2\pi ift}dt]d\tau $$
let u=t+τ.
$$={\int}_{\infty}^{\infty}x(\tau )[{\int}_{\infty}^{\infty}h(u){e}^{2\pi if(u\tau )}du]d\tau $$
$$={\int}_{\infty}^{\infty}x(\tau )[{\int}_{\infty}^{\infty}h(u){e}^{2\pi ifu}{e}^{2\pi if\tau}du]d\tau $$
$$=[{\int}_{\infty}^{\infty}x(\tau ){e}^{2\pi if\tau}d\tau ][{\int}_{\infty}^{\infty}h(u){e}^{2\pi ifu}du]$$
$$={X}^{*}(f)H(f)$$
therefore:
$${z}_{x,h}={F}^{1}[{X}^{*}(f)H(f)]$$
crosscorrelation is equivalent to frequency domain product (with conjugate X(f)).
• discrete Fourier transform
the computation of a DFT is done segment by segment. the signal is multiplied by some "window" function (e.g. a square pulse) which is zero outside the window. this is known as windowing.
the effect of using a finite window is that unexisting frequencies appear at the power spectrum (around the original frequencies). this phenomenon is known as leakage. the explanation is evident from the #convolution theorem: one can use similar reasoning to show that $h(t)x(t)={F}^{1}[H(f)\ast X(f)]$ . therefore, multiplication by a window will induce a convolution in the frequency domain.
low frequencies (and therefore #nonstationary systems) are the most problematic to compute a DFT for, since the window may not be long enough to capture them. there's no single #test for stationarity. it can be tested for nonetheless, measuring the effect of varying window sizes on different parameters (centrality and spread statistics, Lyapunov exponent, etc.).
• Whittaker–Shannon (sinc) interpolation
what function should we convolve the signal's Fourier transform with in order to obtain the same result as windowing it with a rectangular function? we will analyze the closelyrelated case of a square pulse window:
calculate the first three terms of the Fourier series for function $f(x)=\{\begin{array}{cc}\hfill 1\hfill & \hfill \iff 2n\pi \le x\le (2n+1)\pi \hfill \\ \hfill 1\hfill & \hfill \iff (2n+1)\pi \le x\le (2n+2)\pi \hfill \end{array}$ , with $n=0,\pm 1,\pm 2,...$
in terms of the Fourier series:
$$f(x)=\underset{{f}_{0}}{\underset{\u23df}{\frac{1}{2a}{\int}_{a}^{a}f(x)dx}}+\sum _{n=1}^{\infty}[\underset{{a}_{n}}{\underset{\u23df}{(\frac{1}{a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}f(x)cos\left(\frac{n\pi x}{a}\right))}}cos\left(\frac{n\pi x}{a}\right)+\underset{{b}_{n}}{\underset{\u23df}{(\frac{1}{a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}f(x)sen\left(\frac{n\pi x}{a}\right))}}sen\left(\frac{n\pi x}{a}\right)]$$
from the integration of odd functions ( $f(x)cos\left(\frac{n\pi x}{a}\right)$ and $f(x)$ ) at symmetrical interval ( $[a,a]$ ):
$$\u22a2f(x)=\underset{{f}_{0}}{\underset{\u23df}{0}}+\sum _{n=1}^{\infty}[\underset{{a}_{n}}{\underset{\u23df}{\left(0\right)}}\left(cos\left(\frac{n\pi x}{a}\right)\right)+\underset{{b}_{n}}{\underset{\u23df}{(\frac{1}{a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}f(x)sen\left(\frac{n\pi x}{a}\right))}}sen\left(\frac{n\pi x}{a}\right)]$$
$$\u22a2f(x)=\sum _{n=1}^{\infty}\underset{{b}_{n}}{\underset{\u23df}{(\frac{1}{a}{\int}_{a}^{a}dx\phantom{\rule{0.278em}{0ex}}f(x)sen\left(\frac{n\pi x}{a}\right))}}sen\left(\frac{n\pi x}{a}\right)$$
we will consider the interval $[a,a]=[\pi ,\pi ]$ , which includes a full period of $f(x)sen(n\pi x/\pi )$ for any n. therefore:
$$\u22a2f(x)=\sum _{n=1}^{\infty}\underset{{b}_{n}}{\underset{\u23df}{(\frac{1}{\pi}{\int}_{\pi}^{\pi}dx\phantom{\rule{0.278em}{0ex}}f(x)sen\left(\frac{n\pi x}{\pi}\right))}}sen\left(\frac{n\pi x}{\pi}\right)$$
$$\u22a2f(x)=\sum _{n=1}^{\infty}\underset{{b}_{n}}{\underset{\u23df}{\frac{1}{\pi}({\int}_{\pi}^{0}dx\phantom{\rule{0.278em}{0ex}}(1)sen(nx)+{\int}_{0}^{\pi}dx\phantom{\rule{0.278em}{0ex}}(+1)sen(nx))}}sen(nx)$$
$$\u22a2f(x)=\sum _{n=1}^{\infty}\underset{{b}_{n}}{\underset{\u23df}{\frac{1}{\pi}(\frac{1}{n}{[cos(nx)]}_{\pi}^{0}\frac{1}{n}{[cos(nx)]}_{0}^{\pi})}}sen(nx)$$
$$\u22a2f(x)=\sum _{n=1}^{\infty}\underset{{b}_{n}}{\underset{\u23df}{\frac{1}{n\pi}(cos(0)cos(n\pi )cos(n\pi )+cos(0))}}sen(nx)$$
$$\u22a2f(x)=\sum _{n=1}^{\infty}\underset{{b}_{n}}{\underset{\u23df}{\frac{1}{n\pi}(2cos(0)2cos(n\pi ))}}sen(nx)$$
$$\u22a2f(x)=\sum _{n=1}^{\infty}\underset{{b}_{n}}{\underset{\u23df}{\frac{1}{n\pi}(22cos(n\pi ))}}sen(nx)$$
$$\u22a2f(x)=\sum _{n=1}^{\infty}\underset{{b}_{n}}{\underset{\u23df}{\frac{1}{n\pi}\left(\{\begin{array}{cc}\hfill 22\hfill & \hfill \iff n\phantom{\rule{0.333em}{0ex}}\mathrm{\text{even}}\hfill \\ \hfill 2+2\hfill & \hfill \iff n\phantom{\rule{0.333em}{0ex}}\mathrm{\text{odd}}\hfill \end{array}\right)}}sen(nx)$$
$$\u22a2f(x)=\sum _{n=1}^{\infty}\{\begin{array}{cc}\hfill 0\hfill & \hfill \iff n\phantom{\rule{0.333em}{0ex}}\mathrm{\text{even}}\hfill \\ \hfill \frac{4}{n\pi}sen(nx)\hfill & \hfill \iff n\phantom{\rule{0.333em}{0ex}}\mathrm{\text{odd}}\hfill \end{array}$$
$$\u22a2f(x)=\sum _{n=0}^{\infty}\frac{4}{\pi (2n+1)}sen(x(2n+1))$$
$$=\frac{4}{\pi}\sum _{n=0}^{\infty}\frac{sen(x(2n+1))}{2n+1}$$
$$\u22a2f(x)=\underset{{f}_{0}}{\underset{\u23df}{0}}+\frac{4}{\pi}sen(x)+0+\frac{4}{3\pi}sen(3x)+0+\frac{4}{5\pi}sen(5x)+...$$
f(x) could be just a discrete sample of a continuous signal, but because convolution with sinc(x) produces a continuous function (a sum of sine waves), the operation will fill in all the missing time points.
• NyquistShannon sampling theorem
continuous functions of bounded frequency bandwidth ("bandlimited") contain a bounded amount of information. they can be perfectly represented using a function of countable (i.e. discrete) sets and an interpolation operation.
the Nyquist frequency is the maximum sampling frequency at which the digitization of a continuous signal still losses fidelity. that is, aliasing occurs:
$${f}_{Nyquist}=2\phantom{\rule{0.278em}{0ex}}{f}_{max}$$
sampling frequency should be greater than twice the maximum frequency in the power spectrum of the original signal.
undersampling results in the unconsidered high frequencies aliasing (adding up) over ${f}_{Nyquist}(f{f}_{Nyquist})$ at the frequency domain. oversampling is harmless, beyond being a waste of disk space.
• testing for undersampling
what if the maximum frequency isn't known in advance? it is possible to test that the maximum frequency has been considered, because aliasing will make two power spectra of the same signal look dissimilar under two different sampling frequencies:
 register using two sampling frequencies, one greater than the one to test for the Nyquist criterion.
 compute both power spectra ( ${F[x(t)]}^{2}$ ).
 if they are simply scaled versions of one another, aliasing didn't occur. i.e. the ratio between two features (e.g. maxima) should stay constant.
• other timeseries topics
• stochastic correlations
suppose the signal is generated by an stochastic process. if low frequencies dominate the power spectrum, then the process is nonstationary (timeseries mean value will be trending over long periods of time) and autocorrelation will be large.
corollary: even two independent sources of noise can display high statistical dependence.
averaging the time series would be no good. prediction should take autocovariance times into account, detrending methods are also available.
• stationarity

strict/particular stationarity: system's parameters are truly constant. too hard a requirement, since we must know the system's differential equations in advance.

weak/general stationarity: empirical transition probabilities in the F map stay constant. affected by lowband frequencies. alternatively, 1st moment and autocovariance stay constant and finite. this requires greater data availability, because short acquisition times may not be enough to establish statistical relevance.
• linear prediction
suppose there's a time series $\stackrel{\u20d7}{s}=({s}_{1},{s}_{2},...,{s}_{N})$ whose value ${s}_{N+1}$ we want to predict. in a linear prediction algorithm we define ${\hat{s}}_{N+1}$ as a linear combination of the last m points:
$${\hat{s}}_{N+1}=\sum _{j=1}^{m}{a}_{j}{s}_{nm+j}$$
it's clear that there are as many linear combinations of the last m points as there are permutations (with substitution) of coefficients ${a}_{j}$ . in order to find the optimal coefficients we try to minimize the mean square error predicting the previous m  1 values:
$$\sum _{n=m}^{N1}({\hat{s}}_{n+1}{s}_{n+1}{)}^{2}$$
• crossvalidation
 insample accuracy: estimated from predictions made for the same data segment used during training/optimisation phase. it doesn't reflect the algorithm's true predictive power (overestimate), since tests aren't predictions but "postdictions".
 outsample accuracy: error estimated from true predictions, using the values of data never "seen" before by our predictive model.
• phasespace methods
• dissipative systems and attractors
dissipative system: on average, the volume of the manifold describing the state trajectory contracts under dynamics, after certain initial conditions. alternatively, it can be said that #energy loss creates an attractor for that system, reducing the set of possible states that it can take with time. nonetheless, the system still operates far from thermodynamic equilibrium.
in order to verify the presence of an attractor, the L1 metric of the determinant of the system's Jacobian (i.e., of the state transition function's (F) partial derivatives) must be smaller than 1, on average:
$$\leftdet\left(\left(\begin{array}{ccc}\hfill \frac{\partial {f}_{1}}{\partial {x}_{1}}\hfill & \hfill \cdots \hfill & \hfill \frac{\partial {f}_{1}}{\partial {x}_{n}}\hfill \\ \hfill \vdots \hfill & \hfill \hfill & \hfill \vdots \hfill \\ \hfill \frac{\partial {f}_{n}}{\partial {x}_{1}}\hfill & \hfill \cdots \hfill & \hfill \frac{\partial {f}_{n}}{\partial {x}_{n}}\hfill \end{array}\right)\right)\right<1$$
this means that the transformation's scaling factor produced by change on the different directions is a contracting one. visually, if phase space contains an attractor then the system is dissipative.
• visual inspection
 Poincaré section: allows to tell whether the trajectory behavior is classically deterministic, chaotic deterministic or random. it amounts to finding the intersection between trajectories and a secant plane.
 stroboscopic map: graphical representation of Poincaré's section technique for attractors. section is adjusted until the scatter plot is maximally decorrelated. also used for qualitative detection of strange/fractal attractors.
 "phase portrait": also known as delay representation. it is a bidimensional representation of any mdimensional phase space. each point is given by a pair of subsequent states in phase space (see also #TakensWhitney embedding theorem). it eases the visualisation of multidimensional systems and helps identify determinism, periodicity, linearity and chaos.
• TakensWhitney embedding theorem
it is possible to reconstruct a phase space manifold that is isomorphic to the original one, even when all information we have is a time series, through the method known as delay reconstruction.
requirements:
 stationary data
 lowdimensional system
 negligible noise
the basic idea is that for each subset of m equidistant points in the signal, there's an associated unique point in mdimensional phase space. each value in this subset of the time series becomes a component of the state vector:
note that the closer ${s}_{1}$ and ${s}_{2}$ are, the more correlated the reconstructed attractor; rendering it less useful:
the spacing parameter among points in the time series is called the time lag τ. to set an adequate value for τ, we want τ > L; with L being the autocovariance length. that is, we want the time between two measurements to be long enough so that they aren't correlated anymore.
finding the right number m of dimensions involves increasing the number of embedding components per point, until the manifold can't unfold anymore. we say there are false neighbors (and therefore that the embedding isn't highdimensional enough) whenever an increase in m separates seemingly close points considerably in reconstructed phase space.
• nonlinear prediction
unlike the #linear prediction algorithm for time series; a nonlinear algorithm doesn't use the last m points combined. rather, it builds upon the intuition that the most relevant states for making a prediction some time Δn after the present are the ones that most resembled the current state. it then averages those similar states after Δn:
$${\hat{s}}_{N+\Delta n}=\frac{1}{\mathcal{U}(\stackrel{\u20d7}{{s}_{N}})}\sum _{\stackrel{\u20d7}{{s}_{n}}\in \mathcal{U}(\stackrel{\u20d7}{{s}_{N}})}{s}_{n+\Delta n}$$
where $\mathcal{U}(\stackrel{\u20d7}{{s}_{N}})$ denotes the cardinality of the current state vector's neighborhood (in phase space).
• as stationarity test

divide the series in intervals so short so as to have lineal segments.

predict and crossvalidate using all possible segment pairs.

if some training segments display unusually different performance, then they don't generalise to the whole series. therefore we can conclude that the attractor changed as we swapped training segments; and the time series isn't stationary.
• as a noise smoother
suppose the measured time series is the sum of its true value and noise:
$${s}_{n}={x}_{n}+{z}_{n}.$$
where ${z}_{n}$ is the random component. also suppose ${x}_{n}$ and ${z}_{n}$ are independent, and that temporal correlation is short.
we want to substitute ${s}_{n}$ with ${\hat{s}}_{n}$ . starting from the #nonlinear prediction algorithm:

in phase space, use trajectories which are close to the current one (both in the past and future portions), since the points inside that neighborhood belong to an ndimensional sphere with center at the vector state ${s}_{n}$ under consideration.

apply the nonlinear prediction algorithm afterwards: average the vectors in the neighborhood. thus the stochastic component will be largely cancelled out.
$${\hat{s}}_{{n}_{0}m/2}=\frac{1}{\mathcal{U}(\stackrel{\u20d7}{{s}_{{n}_{0}}})}\sum _{\stackrel{\u20d7}{{s}_{n}}\in \mathcal{U}(\stackrel{\u20d7}{{s}_{{n}_{0}}})}{s}_{nm/2}$$
• chaos theory
• Lyapunov coefficient
quantifies chaos levels in a dynamical system. it is the exponent in the exponential divergence measurement of phase space. the system has as many Lyapunov exponents as dimensions, positive ones are chaotic dimensions:
Lyapunov coefficient  kind of attractor 

λ < 0  stable fixed point 
λ = 0  stable limit cycle 
λ > 0  chaos 
λ → ∞  noise 
• estimation from time series
it is advisable to perform a #phase portrait beforehand, so as to rule out that the system is nondeterministic.
$$\hat{\lambda}(\Delta t)=\frac{1}{N}\sum _{{n}_{0}=1}^{N}ln\left(\frac{1}{\mathcal{U}(\stackrel{\u20d7}{{s}_{{n}_{0}}})}\phantom{\rule{0.278em}{0ex}}\sum _{\stackrel{\u20d7}{{s}_{n}}\in \mathcal{U}(\stackrel{\u20d7}{{s}_{{n}_{0}}})}{s}_{{n}_{0}+{\Delta}_{n}}{s}_{n+{\Delta}_{n}}\right)$$
caveats:
 embedding: if the #embedding dimension is too low divergence will be underestimated because of false neighbors.
 noise level: overestimated because of noise
 #stochastic correlations: if autocorrelation is big, the attractor will look compressed along a diagonal, underestimating true divergence
• fractal dimension
strange attractors are the flagship of chaos: trajectories are still deterministic (no crossroads) yet noncyclic (exact knowledge is required to tell accurately which trajectory the system is in). since some chaotic systems are also #dissipative, all of this implies that a subset of infinitelydense phase space is being relentlessly filled by a line of infinite detail.
• box counting
box counting is one of many methods to assess scaling property of fractals.
let ε be the side of each box (ndimensional squares), which are in turn used to segment the figure. then let N(ε) be the number of boxes. note that:
$$\underset{N(\epsilon )\to \infty}{lim}\phantom{\rule{0.278em}{0ex}}N(\epsilon )={\left(\frac{1}{\epsilon}\right)}^{D}$$
D being the boxcounting fractal dimension.
$$\u22a2\underset{N(\epsilon )\to \infty}{lim}\phantom{\rule{0.278em}{0ex}}ln(N(\epsilon ))=D\phantom{\rule{0.278em}{0ex}}ln(1/\epsilon )$$
$$\u22a2D=\underset{N(\epsilon )\to \infty}{lim}\phantom{\rule{0.278em}{0ex}}\frac{ln(N(\epsilon ))}{ln(1/\epsilon )}.$$
Example:
N(ε)  ε 

3^{0}  2^{0} 
3^{1}  2^{1} 
3^{2}  2^{2} 
...  ... 
3^{n}  2^{n} 
$$D=\underset{n\to \infty}{lim}\phantom{\rule{0.278em}{0ex}}\frac{ln({3}^{n})}{ln({2}^{n})}=\underset{n\to \infty}{lim}\phantom{\rule{0.278em}{0ex}}\frac{n\phantom{\rule{0.278em}{0ex}}ln(3)}{n\phantom{\rule{0.278em}{0ex}}ln(2)}=\frac{ln(3)}{ln(2)}=1.5849...$$
• correlation dimension
the correlation dimension can be used as an estimator of an attractor's dimension.(under certain circumstances), including fractal dimensions.
the idea now isn't to take the ratio of the number of boxes and their length ε in the limit to infinity. instead, we take the ratio of points (among all possible valid pairs) which fall within the neighborhood (defined by a radius of length ε) of some other point:
$$D=\underset{\epsilon \to 0}{lim}\phantom{\rule{0.278em}{0ex}}\underset{N\to \infty}{lim}\phantom{\rule{0.278em}{0ex}}\frac{\partial ln\phantom{\rule{0.278em}{0ex}}C(\epsilon ,N)}{\partial ln\epsilon};$$
where C(N, ϵ) is the correlation sum:
$$C(N,\u03f5)=\frac{2}{(N)(N1)}\sum _{i=1}^{N}\phantom{\rule{0.278em}{0ex}}\sum _{j=i+1+{n}_{min}}^{N}\mathit{\Theta}(\u03f5\stackrel{\u20d7}{{s}_{i}}\stackrel{\u20d7}{{s}_{j}}).$$
𝛩 is Heaviside's step function
• estimation from time series
given the lack of a phase space, it is necessary to #reconstruct it first. the right choice of delay τ during reconstruction of embedded vectors provides a suitable delay to estimate correlation dimension.
the aforementioned estimator of C(m, ϵ) is biased to underestimate the dimension, because data that are close in a trajectory of phase space tend to retain some temporal correlation.
the corrected estimator of C(m, ϵ) excludes sufficientlyclose vector pairs (in phase space); which usually correspond to subsequent vectors in the time series:
$$C(m,\u03f5)=\frac{2}{(N{n}_{min})(N{n}_{min}1)}\sum _{i=1}^{N}\phantom{\rule{0.278em}{0ex}}\sum _{j=i+1+{n}_{min}}^{N}\mathit{\Theta}(\u03f5\stackrel{\u20d7}{{s}_{i}}\stackrel{\u20d7}{{s}_{j}})$$
with ${n}_{min}={t}_{min}/\Delta t$ . ${t}_{min}$ doesn't necessarily have to be the average correlation time.
the following plots from Kantz & Schreiber's book^{5} come from a maplike system. for a ${t}_{min}=500\phantom{\rule{0.278em}{0ex}}steps$ only 2.5% of all pairs is lost in the corrected correlation dimension estimator. this is statistically negligible.
possible error sources:
 the correlation is a welldefined construction (for finite data sets), whereas the correlation dimension is just an extrapolation that must be handled with care.
 it isn't enough to observe fractal scaling for a single choice of m. saturation should also occur for many values.
 if measurements are too noisy, a scaling regime may never be observed. the dimension estimate can be as big as m whenever ϵ is small enough, so that only noise is captured by it.
 noise affects scales 3 orders of magnitude as big as its own level. in order to observe a scaling region, noise should be at most 1/8 of the whole attractor.
 it is therefore advisable to use a #noise reduction algorithm in advance.
 certain stochastic systems may pass for lowdimension deterministic systems.
 lowresolution digitization will make the estimate look smaller
 selfsimilarity may never be observed, even for goodquality deterministic data, depending on the manifold's largescale structure.
• minimum delay
a spacetime separation plot shows the proportion of points inside the neighborhood as a function of Δt and ϵ. we hope to find some value of Δt after which increments in ϵ don't add extra points. such delay marks the ideal choice of Δt when computing the correlation sum.
this is more rigorous a technique than using the autocorrelation distance.
• as a noise meter
figures 19 and 20 introduced the value ε that marks the limit between noise and scaling regimes. we found a resolution limit in the data, when neighborhoods were so tiny that they only capture randomly distributed points in all directions.
in a nutshell, given that ε is a distance in phase space units, we can define the noise level as the ratio between ε and the attractor's "size".
• neuroscience applications
 how not to use it:
Lehnertz and Elger (1998)^{6} analysed localized intracranial EEG data from 16 epilepsy patients. they claimed that a loss of complexity in brain activity should result from the synchronization of pathological cells before and during seizures.
linear techniques can only predict seizures a few seconds in advance. here the intention is to use the correlation dimension estimate as a complexity detector, even though it cannot actually estimate the fractal dimension for these data.
a state called state 2 coincides with preictal synchronization some 10 to 30 minutes in advance to the seizure. nothing is mentioned with regard to confounding sources of complexity reduction, though.
quasiscaling regions were observed for state 2 (D_{2}^{eff} = 3.7) and the seizure state (D_{2}^{eff} = 5.3), but not the control state. in other words, complexity apparently increased, counter to theory, and there's no base level to compare it with.
plots don't really show a clear power law either that would justify the reliability of their estimates. nor was any statistical hypothesis test performed using surrogate data.
 a more credible one?
Martinerie et al (1998)^{7} attempted to predict seizures by looking at phase transitions between two subsections in reconstructed phase space, also built from intracranial EEG series. it is claimed that for most cases (17 out of 19) it is possible to foretell seizure onset 2 to 6 seconds in advance, and that all of them follow a similar trajectory in phase space.
the alternative hypothesis is actually explored with surrogate data, and authors made sure to smooth noise out previous to the embedding. the number of dimensions (16) may be too unwieldy, though.
• hypothesis testing
the maximal Lyapunov coefficient and the correlation dimension come handy in detecting nonlinear dynamics, however they cannot establish such a fact by themselves. it is necessary to observe a clear scaling region.
a stricter analysis would test the hypothesis that observed data are due to a linear process. this implies that certain parameter's conditional distribution (given H_{0}) has to be empirically estimated, so that it can be contrasted with the real data.
• surrogate data
surrogate data is a akin to resampling methods (permutation, Monte Carlo, bootstrapping) for time series. surrogate data share many similarities with true data for some respects, while consistent with H_{0} for others. when testing for nonlinearity, Fourier phases are generated randomly while the power spectrum is retained, since the imaginary part is where nonlinearity could emerge from.
the general procedure for generating surrogate data is as follows:

decompose the signal in amplitudes and phases using the (fast, discrete) Fourier transform:
$$F[{s}_{0}(t)]\phantom{\rule{0.278em}{0ex}}\to \phantom{\rule{0.278em}{0ex}}\{A(f),\phantom{\rule{0.278em}{0ex}}\phi (f)\}$$

substitute the alleged nonlinear properties with random numbers drawn from the uniform distribution ranging from 0 to 2π radians (trying to make ${\zeta}_{Nk}={\zeta}_{k}$ , so that the imaginary part being odd will result in a substitute series belonging to ℝ. see the #Fourier transform properties section):
$$\{A(f),\phantom{\rule{0.278em}{0ex}}\zeta (f)\}$$

wrap the surrogate series back with the inverse Fourier transform:
$${s}_{1}(t)\phantom{\rule{0.278em}{0ex}}\leftarrow \phantom{\rule{0.278em}{0ex}}{F}^{1}[\{A(f),\phantom{\rule{0.278em}{0ex}}\zeta (f)\}]$$

test the series out (dimension estimation, Lyapunov coefficient, autocorrelation, nonlinear prediction error, etc.).

rinse and repeat till the parameter's distribution given H_{0} is dense enough to yield good typeI error estimates.
• statistical significance
as in standard hypothesis testing, for a significance level of α (type I error: probability that H_{0} is erroneously rejected) we would like at most α% of the area under the surrogate data curve to overlap the original measurements, or be equal or more extreme than our point measurement. such a rankbased test is a nonparametric way of deriving a pvalue.
for instance, in order to reject H_{0} with 95% of success, it would be enough to perform $\frac{1}{.05}1=19$ tests (for a oneside test. $\frac{2}{.05}1=39$ for the twotail case).
• as a determinism detector
does passing the test also mean that the system is deterministic (since we tested for nonlinear determinism)?
partially yes, with some probability. recall that all statistical inference is based on limited evidence and inductive reasoning:
accept H_{0}  reject H_{0}  

H_{0} is true  1α (confidence)  α (type I error) 
H_{0} is false  β (type II error)  1β (power) 
more caveats:
 we have assumed that the available power spectrum captures at least one whole period in the time domain (i.e. no systemic sampling bias due to longterm process). otherwise, random events in a nonstationary process could be misinterpreted as deviations from H_{0}.
 the process may not follow the conjectured probability distribution (this problem is mitigated by nonparametric tests).
 multiple comparisons problem: performing many independent tests overblows type I error. for those cases, significance levels should be made more stringent according to some correction method (Bonferroni e.g.)
• as a phase transition detector
the same nonlinearity tests have been used to distinguish states of a system.
compute the same statistic for different time segments and compare with one another. the reasoning follows what we did in the #stationarity test.

complexity doesn't imply advanced math, for it can emerge from very simple descriptions (see rule 110 for instance). that said, this post relies on calculus (including some differential equations), inferential statistics (also the fundamentals of information theory), bits of graph theory and linear algebra; most (or all) of which should be familiar to STEM undergraduates. topics like Fourier analysis and chaos theory are introduced from there. ↩

not to be confused with "complicated" or "difficult". ↩

we could have arrived at the same harmonic oscillator equation using the mechanics of a pendulum, or an LC circuit. See system equivalence. ↩

$\forall \stackrel{\u20d7}{z}\in \u2102\phantom{\rule{0.278em}{0ex}}(\stackrel{\u20d7}{z}=\sqrt{\stackrel{\u20d7}{z}\xb7\overline{\stackrel{\u20d7}{z}}})$ , because $\sqrt{\stackrel{\u20d7}{z}\xb7\overline{\stackrel{\u20d7}{z}}}=\sqrt{{z}_{1}^{2}+i{z}_{1}{z}_{2}i{z}_{1}{z}_{2}+{z}_{2}^{2}}$ ↩

Holger Kantz, Thomas Schreiber (2003). Nonlinear Time Series Analysis. Cambridge University Press. ↩

Klaus Lehnertz, Christian E. Elger. Can Epileptic Seizures be Predicted? Evidence from Nonlinear Time Series Analysis of Brain Electrical Activity (1998). Phys. Rev. Lett. 80, 5019. https://doi.org/10.1103/PhysRevLett.80.5019 ↩

Martinerie, J., Adam, C., Le Van Quyen, M., Baulac, M., Clemenceau, S., Renault, B., & Varela, F. J. (1998). Epileptic seizures can be anticipated by nonlinear analysis. Nature medicine, 4(10), 1173. https://doi.org/10.1038/2667 ↩