Suppose the neural network has \(n\) layers,
and layer \(i\) has \(m_i\) neurons.
\(C(...)=\sum_{j=1}^{m_n}(a_j^n-ans_j)^2\)
\(z_j^l=b_j^l+\sum_{k=1}^{m_{l-1}}w_{jk}^la_k^{l-1}\)
\(a_j^l=\sigma(z_j^l)\)
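The three definitions above can be sketched as a forward pass in plain Python. This is only an illustration under some assumptions that the text does not state: \(\sigma\) is taken to be the logistic sigmoid, layer 1 is treated as the input layer (so \(a^1\) is the input vector), and the names `sigmoid`, `forward` and `cost` are made up for this example.

```python
import math

def sigmoid(z):
    # Assumed activation: logistic sigmoid. math.exp can overflow for
    # very large negative z, which is acceptable for a small sketch.
    return 1.0 / (1.0 + math.exp(-z))

def forward(weights, biases, x):
    """Compute every z^l and a^l.

    weights[i][j][k] is the weight from neuron k of one layer to
    neuron j of the next layer; biases[i][j] is the matching bias.
    Returns (zs, activations), where activations[0] is the input.
    """
    activations = [list(x)]
    zs = []
    for W, b in zip(weights, biases):
        prev = activations[-1]
        # z_j = b_j + sum_k w_jk * a_k (previous layer)
        z = [b_j + sum(w_jk * a_k for w_jk, a_k in zip(row, prev))
             for row, b_j in zip(W, b)]
        zs.append(z)
        # a_j = sigma(z_j)
        activations.append([sigmoid(v) for v in z])
    return zs, activations

def cost(output, ans):
    # C = sum_j (a_j^n - ans_j)^2
    return sum((a - t) ** 2 for a, t in zip(output, ans))
```

For instance, `forward([[[0.0, 0.0]]], [[0.0]], [1.0, 2.0])` describes a network with two inputs and one output neuron whose weights and bias are all zero, so its single \(z\) is 0 and its output is \(\sigma(0)=0.5\).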
\(\frac{\partial C}{\partial w_{jk}^{l}}\)
\(=\frac{\partial C}{\partial a_j^l}\frac{\partial a_j^l}{\partial z_j^l}\frac{\partial z_j^l}{\partial w_{jk}^{l}}\)
\(=\sigma'(z_j^l)a_k^{l-1}\frac{\partial C}{\partial a_j^l}\)
\(\frac{\partial C}{\partial a_j^l}\)
\(=\sum_{k=1}^{m_{l+1}}\frac{\partial C}{\partial a_k^{l+1}}\frac{\partial a_k^{l+1}}{\partial z_k^{l+1}}\frac{\partial z_k^{l+1}}{\partial a_j^l}\)
\(=\sum_{k=1}^{m_{l+1}}\sigma'(z_k^{l+1})w_{kj}^{l+1}\frac{\partial C}{\partial a_k^{l+1}}\)
\(\frac{\partial C}{\partial a_j^n}=2(a_j^n-ans_j)\)
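Putting the derivation together, here is a minimal sketch of backpropagation in plain Python, plus a finite-difference check. The assumptions are mine rather than the text's: \(\sigma\) is the logistic sigmoid, layer 1 is the input layer, and all function names (`forward`, `backprop`, `numeric_grad_w`) are hypothetical. The bias gradient `grad_b` is not derived in the text; it is included because \(\partial z_j^l/\partial b_j^l=1\), so it equals \(\sigma'(z_j^l)\frac{\partial C}{\partial a_j^l}\).

```python
import math

def sigmoid(z):
    # Assumed activation; may overflow for very large negative z.
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def forward(weights, biases, x):
    # weights[i][j][k]: weight from neuron k of one layer to neuron j of the next.
    activations, zs = [list(x)], []
    for W, b in zip(weights, biases):
        prev = activations[-1]
        z = [b_j + sum(w * a for w, a in zip(row, prev))
             for row, b_j in zip(W, b)]
        zs.append(z)
        activations.append([sigmoid(v) for v in z])
    return zs, activations

def backprop(weights, biases, x, ans):
    zs, acts = forward(weights, biases, x)
    # Output layer: dC/da_j^n = 2 (a_j^n - ans_j)
    dC_da = [2.0 * (a - t) for a, t in zip(acts[-1], ans)]
    grad_w = [None] * len(weights)
    grad_b = [None] * len(weights)
    for i in reversed(range(len(weights))):
        # delta_j = sigma'(z_j) * dC/da_j for the layer fed by weights[i]
        delta = [sigmoid_prime(z) * d for z, d in zip(zs[i], dC_da)]
        prev = acts[i]
        # dC/dw_jk = sigma'(z_j) * a_k (previous layer) * dC/da_j
        grad_w[i] = [[d * a_k for a_k in prev] for d in delta]
        # dC/db_j = sigma'(z_j) * dC/da_j, since dz_j/db_j = 1
        grad_b[i] = delta[:]
        # Move one layer back:
        # dC/da_j = sum_k sigma'(z_k) * w_kj * dC/da_k (next layer)
        dC_da = [sum(weights[i][k][j] * delta[k] for k in range(len(delta)))
                 for j in range(len(prev))]
    return grad_w, grad_b

def numeric_grad_w(weights, biases, x, ans, i, j, k, eps=1e-6):
    # Central finite difference for one weight, used only as a sanity check.
    def cost_at(shift):
        w = [[row[:] for row in W] for W in weights]
        w[i][j][k] += shift
        out = forward(w, biases, x)[1][-1]
        return sum((a - t) ** 2 for a, t in zip(out, ans))
    return (cost_at(eps) - cost_at(-eps)) / (2 * eps)
```

As a hand-checkable case, `backprop([[[0.0]]], [[0.0]], [1.0], [1.0])` describes a single neuron with output 0.5: \(\frac{\partial C}{\partial a}=2(0.5-1)=-1\) and \(\sigma'(0)=0.25\), so both the weight gradient and the bias gradient come out as -0.25. On larger networks, comparing `grad_w` against `numeric_grad_w` entry by entry is a quick way to catch index mistakes such as swapping \(w_{jk}\) and \(w_{kj}\).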
Original post: https://www.cnblogs.com/int-2147483648/p/14770877.html