Single-hidden-layer neural network:
Optimization objective: refine the input-layer weights $IW$ so that the hidden features $H$ are refined as well.
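For concreteness, here is a minimal NumPy setup for the shapes used below. All names and sizes (`d`, `h`, `o`, `n`, the random initialization, and the least-squares initialization of $\bm β$, which the notes do not specify) are illustrative assumptions; columns of `X` and `T` are taken to be samples, matching $H = IW ⋅ X$.

```python
import numpy as np

rng = np.random.default_rng(0)

d, h, o, n = 8, 32, 1, 200            # input dim, hidden width, output dim, number of samples
X = rng.standard_normal((d, n))       # inputs, one sample per column
T = rng.standard_normal((o, n))       # targets
IW = rng.standard_normal((h, d))      # input-layer weights (to be refined)

H = IW @ X                            # hidden features, shape (h, n)
beta = T @ np.linalg.pinv(H)          # initial output weights by least squares, shape (o, h)
```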
- Iteration 1:
    - Forward:
      $$ \begin{aligned} H &= IW ⋅ X \\ Y_{pred} &= \bm β ⋅ H \end{aligned} $$
    - Compute the error $E$:
      $$ E = T - Y_{pred} $$
    - Consider an imaginary data $P$ that produces the error $E$, formulated by the equation $E = \bm β ⋅ P$. The data $P$ can thus be solved with the pseudo-inverse of $\bm β$:
      $$ P = \bm β⁺ ⋅ E $$
    - Use $P$ to solve a "supplemental IW", $IW_{supp}$, from the relationship $P = IW_{supp} ⋅ X$:
      $$ IW_{supp} = P ⋅ X⁺ $$
    - Update the input-layer weights by adding the supplemental $IW_{supp}$ to the initial $IW$:
      $$ IW = IW + IW_{supp} $$
    - Update $\bm β$ based on the updated $IW$ and the equation $T = \bm β ⋅ H$:
      $$ \begin{aligned} H &= IW ⋅ X \\ \bm β &= T ⋅ H⁺ \end{aligned} $$
    - Compute the error at this point:
      $$ \begin{aligned} Y_{pred} &= \bm β ⋅ H \\ E &= T - Y_{pred} \end{aligned} $$
- Iteration 2:
    - Compute $P$
    - Compute the supplemental $IW_{supp}$
    - Update $IW$
    - Update $\bm β$
    - Compute $E$
- Iteration 3:
    - Perform the same five steps (a NumPy sketch of the full loop follows this list).
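A sketch of the whole refinement loop in NumPy, continuing the setup above. This is my reading of the five steps under the column-per-sample assumption; the names `refine` and `n_iter` are mine, and `np.linalg.pinv` stands in for whatever pseudo-inverse routine is actually used.

```python
def refine(IW, beta, X, T, n_iter=3):
    """Refine IW (and re-solve beta) by repeatedly fitting the residual."""
    for _ in range(n_iter):
        # Forward pass and residual
        H = IW @ X
        E = T - beta @ H
        # Imaginary hidden data P that would explain the residual: E = beta @ P
        P = np.linalg.pinv(beta) @ E
        # Supplemental input weights from P = IW_supp @ X, accumulated into IW
        IW = IW + P @ np.linalg.pinv(X)
        # Re-solve the output weights for the new hidden features
        H = IW @ X
        beta = T @ np.linalg.pinv(H)
    return IW, beta

IW, beta = refine(IW, beta, X, T, n_iter=3)
```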
The following may be wrong.
This morning, I forgot that the model architecture consists of two weight matrices. So what I told you this morning only refines a single weight matrix:
Given an equation: $Y_{pred} = A ⋅ X$, one wants to find the coefficient matrix A.
\begin{algorithm}
\caption{SNN}
\begin{algorithmic}
\STATE \COMMENT{The optimal A is solved by least squares (fit to the targets T) with the Moore-Penrose inverse:}
\STATE $A = T ⋅ X⁺$
\STATE \COMMENT {There are still some errors E:}
\STATE $E = T - Y_{pred}$
\STATE \COMMENT{By considering that the error E is attributed to an imaginary data P, there is: $E = A ⋅ P$}
\STATE \COMMENT {The data P can be solved as:}
\STATE $P = A⁺ ⋅ E$
\STATE \COMMENT{To fit the error E, we can use another coefficient matrix A₂ and the equation: $E = A_2 ⋅ P$}
\STATE \COMMENT{So, A₂ can be solved as:}
\STATE $A_2 = E ⋅ P⁺$
\STATE \COMMENT {Update A:}
\STATE $A = A + A_2$
\STATE \COMMENT {Compute the new error:}
\STATE $E = T - A ⋅ X$
\STATE Go back to the step that solves $P$ and repeat.
\end{algorithmic}
\end{algorithm}
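A NumPy sketch of this single-matrix version, under the same column-per-sample assumption as above. The function name `refine_A` and the fixed iteration count `n_iter` are my illustrative choices; the notes simply loop back and repeat.

```python
import numpy as np

def refine_A(X, T, n_iter=5):
    """Fit A by least squares, then repeatedly add a correction A2
    that maps the imaginary data P onto the current residual E."""
    A = T @ np.linalg.pinv(X)            # least-squares fit of T ≈ A @ X
    for _ in range(n_iter):
        E = T - A @ X                    # current residual
        P = np.linalg.pinv(A) @ E        # imaginary data with E = A @ P
        A2 = E @ np.linalg.pinv(P)       # correction fitting E = A2 @ P
        A = A + A2                       # merge the correction into A
    return A
```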
(2024-09-30)
- Training the parameters of a classical neural network uses backpropagation and gradient descent as the optimization procedure, whereas training this sub-network keeps introducing new weights: each iteration uses the new weights to "explain" the residual, and in the end all the weights are merged together.
Doubt Privilege
Questions:
- What are the fundamental differences between deep neural networks and single-layer wide networks?
- Which one is better?
- Why do people prefer deep nets rather than wide nets?
Speculations:
(2024-11-11)
- An accurate inverse matrix is not easy to obtain.
- An SLFNN needs to compute the inverse of a giant matrix, which is difficult and often inaccurate.
  Also, the weight matrix could be singular; this may introduce errors when computing the inverse, although a singular matrix can be made invertible by adding a regularization term (a sketch of such a regularized solve follows below).
  Hence, the precision of the optimized parameters could be low.
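One common workaround is to replace the plain pseudo-inverse with a Tikhonov-regularized least-squares solve, which stays well defined even when the matrix is singular or ill-conditioned. A minimal sketch; the function name `ridge_solve` and the regularization strength `lam` are illustrative assumptions.

```python
import numpy as np

def ridge_solve(H, T, lam=1e-3):
    """Solve beta from T ≈ beta @ H with Tikhonov (ridge) regularization:
    beta = T @ H.T @ (H @ H.T + lam * I)^{-1}."""
    h = H.shape[0]
    return T @ H.T @ np.linalg.inv(H @ H.T + lam * np.eye(h))

# Drop-in replacement for `beta = T @ np.linalg.pinv(H)` in the sketches above:
# beta = ridge_solve(H, T, lam=1e-3)
```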