机器学习短期课程笔记

Instructor

Prof. Sergio Matteo Savaresi (sergio.savaresi@polimi.it)

How to Solve It?

Problems

Black Box Estimation (Transfer Function) with Input ${u_1,\cdots,u_N}$ & Ouput ${y_1,\cdots,y_N}$
Time Series Modeling and Prediction with Ouput ${y_1,\cdots,y_N}$
Virtual/Software Sensing

\[z = e^{T_s \cdot s}\]

Three Equivalent Representations

S.S.
T.F.
I.R.

I.R $\to$ S.S

4SID = Subspace-based State Space System IDentification

Observability & Controllability

\[\mathbf{O}= \begin{bmatrix} H\\ HF\\ \vdots\\ HF^{n-1}\\ \end{bmatrix}\] \[\mathbf{R}= \begin{bmatrix} G \quad FG \quad \cdots \quad F^{n-1}G\\ \end{bmatrix}\]

Decompose

Hankel Matrix $\mathbf{H_n} = \begin{bmatrix} \omega(1)\quad\omega(2)\quad\cdots\quad\omega(n)\\ \omega(2)\quad\omega(3)\quad\cdots\quad\omega(n+1)\\ \vdots\quad\vdots\quad\quad\vdots\quad\vdots\\ \omega(n)\quad\omega(n+1)\quad\cdots\quad\omega(2n-1)\\ \end{bmatrix}_{n\times n}$

\[\mathbf{H_n}=\mathbf{O} \mathbf{R}\]

Find the ORDER of the System $\mathbf{n}$
Factorization $\mathbf{H_{n+1}}=\mathbf{O_{n+1}R_{n+1}}$

note $\mathbf{O_1}=\mathbf{O_{n+1}(1:n;:)} \\ \mathbf{O_2}=\mathbf{O_{n+1}(2:n+1;:)} \\$ then $\hat{G}=\mathbf{R_{n+1}(:;1)} \\ \hat{H}=\mathbf{O_{n+1}(1;:)} \\ \hat{F}=\mathbf{O_1^{-1}O_2} \\$ But USELESS, Because of NOISY. This idea went sleeping until, at the end of 80s, a new numerical algebra tool was fully developed $\to$ Singular value decomposition (SVD).

SVD

\[\mathbf{M_{m\times n}=U_{m\times m} \sum_{m\times n} V_{n\times n}}\]

Stochastic Process

\[\mathbf{m(t)=E[v(t,s)]}\\ \mathbf{\delta(t_1,t_2)=E[(v(t_1,s)-m(t_1))(v(t_2,s)-m(t_2))]}\]

Stationary Stochastic Process (S.S.P.)

\[\mathbf{m(t)=m \quad \forall t}\\ \mathbf{\delta(\tau)=E[(v(t)-m)(v(t-\tau)-m)]}\]

White Noise

\[\mathbf{e(t)\sim WN(\mu,\lambda^2) \Leftrightarrow E[e(t)]=\mu, \delta_e(0)=\lambda^2, \delta_e(\tau)=0\,\forall \tau\neq 0}\]

Practical Way

\[\mathbf{\hat{m}_{yN}=\dfrac{1}{N}\sum_{t=1}^{N}y(t)}\\ \mathbf{\hat{\delta}_{yN}=\dfrac{1}{N-\tau}\sum_{t=1}^{N-\tau}(y(t)-\hat{m}_{yN})(y(t+\tau)-\hat{m}_{yN})}\]

ARMA(m,n)

\[\mathbf{y(t)=a_1y(t-1)+\cdots+a_my(t-m)+c_0e(t)+c_1e(t-1)+\cdots+c_ne(t-n)}\]

ARMA is NOT a Strictly Proper System

FIR Filter & IIR Filter

$\mathbf{W(z)\to W(e^{j\omega})}$

Representation of ARMA

Time Domain
Transfer Function
Probability Domain
Frequency Domain

2 Steps

Find the BEST Model $y(t)=\dfrac{C(z)}{A(z)}e(t)$ from DATA.
1. Experiment/Data-collection
2. Select a parametric class of Models: $\mathbf{m(\theta)}$
3. Select a performance index: $\mathbf{J(\theta) \geq 0}$
4. Optimization Step: $\mathbf{\hat{\theta}_N=argmin_\theta\{J_N(\theta)\}}$
  - Quadratic Function: AR $\to$ Least Square Solution
  - Without local minima
  - Local minima $\to$ Iterative Method: ARMA or MA
5. Validation
Order selection of Model:

Cross-Validation
Using this Model, make the BEST Prediction. Information-free: $E[\hat{y}(t+k|t)\cdot \varepsilon(t+k)]=0$

\[\mathbf{\hat{y}(t+k|t)=\dfrac{\tilde R(z)}{C(z)}\cdot y(t)}\]

ARMAX(m,n,p+1)

Continue