## 1. Knowledge Representation

Knowledge refers to stored information or models used by a person or machine to interpret, predict, and appropriately respond to the outside world.

— Fischler and Firschein, *Intelligence: The Eye, the Brain and the Computer*

Two issues are involved in representing knowledge:
1). What information is actually made explicit;
2). How the information is physically encoded for subsequent use.

Knowledge of the world consists of two kinds of information:
1). Prior information: the known facts.
2). Observations (measurements): usually noisy, but they provide the examples (prototypes) used to train the neural network.

Rule 1. Similar inputs from similar classes should produce similar representations inside the network, and they should be classified to the same category.
Rule 2. Items to be categorized as separate classes should be given widely different representations in the network.
Rule 3. If a particular feature is important, there should be a large number of neurons involved in representing it in the network.
Rule 4. Prior information and invariances should be built into the design of a neural network.

Prior information can be built into the network design by:
1). Restricting the network architecture through the use of local connections known as receptive fields;
2). Constraining the choice of synaptic weights through the use of weight-sharing.
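As a sketch of what these two constraints mean in code (a hypothetical illustration, not from the text): with local receptive fields and shared weights, every output unit applies the same small weight kernel to its own local window of the input, which is exactly a convolution.

```python
import numpy as np

def shared_weight_layer(x, kernel):
    """Each output unit sees only a local window of x (its receptive
    field), and all units reuse the same weights (weight sharing)."""
    k = len(kernel)
    return np.array([x[i:i + k] @ kernel for i in range(len(x) - k + 1)])

x = np.array([0., 0., 1., 1., 1., 0., 0.])
kernel = np.array([-1., 1.])   # responds to upward steps in the input
y = shared_weight_layer(x, kernel)
# Only len(kernel) free parameters, regardless of the input size.
```

The layer has as many free parameters as the kernel, not as the input, which is the point of weight sharing.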

Invariances can be achieved in three ways:
1). Invariance by structure: synaptic connections between the neurons are created so that transformed versions of the same input are forced to produce the same output. Drawback: the number of synaptic connections tends to grow very large.
2). Invariance by training: the network is trained on different examples of the same object corresponding to different transformations (for example, rotations). Drawbacks: computational load, and the learned invariance may not generalize to other objects.
3). Invariant feature space: extract features of the data that are invariant to the transformations, and use these instead of the original input data. This is probably the most suitable technique for neural classifiers, but it requires prior knowledge of the problem.
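One concrete illustration of an invariant feature space (an assumed example, not from the text): the magnitude of the discrete Fourier transform is unchanged under circular shifts of a signal, so it can serve as a shift-invariant feature vector.

```python
import numpy as np

def shift_invariant_features(x):
    # The DFT magnitude discards phase, which is where a circular
    # shift of the input lives -- so the feature is shift-invariant.
    return np.abs(np.fft.fft(x))

x = np.array([1., 2., 3., 4., 0., 0.])
shifted = np.roll(x, 2)              # a "transformed version" of x
f1 = shift_invariant_features(x)
f2 = shift_invariant_features(shifted)
# f1 and f2 are identical even though x and shifted are not.
```

A classifier fed `f1`/`f2` never has to learn the shift invariance; it is built into the representation, as Rule 4 recommends.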

## 2. Basic Learning Rules

Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place.

— Simon Haykin, *Neural Networks: A Comprehensive Foundation*

This definition implies the following sequence of events:
1). The neural network is stimulated by an environment.
2). The neural network undergoes changes in its free parameters as a result of this stimulation.
3). The neural network responds in a new way to the environment because of the changes that have occurred in its internal structure.

#### 2.1 Error-Correction Learning

The core idea of error-correction learning is: for a given input, adjust the weights so that the deviation between the network output $y_{k}(n)$ and the desired response $d_{k}(n)$ is minimized. We first define an error signal:

$e_{k}(n) = d_{k}(n) - y_{k}(n)$.

The instantaneous value of the cost function to be minimized is

$\mathscr{E}(n) = \frac{1}{2}e^{2}_{k}(n)$.

Minimizing $\mathscr{E}(n)$ by gradient descent yields the delta rule, where $\eta$ is the learning rate and $x_{j}(n)$ is the input applied to synapse $j$:

$\Delta w_{kj}(n) = \eta e_{k}(n)x_{j}(n)$.
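The equations above can be sketched as a single-neuron training loop (a minimal sketch; the linear neuron, fixed input, and learning rate of 0.1 are illustrative assumptions):

```python
import numpy as np

def error_correction_step(w, x, d, eta=0.1):
    y = w @ x                  # neuron output y_k(n)
    e = d - y                  # error signal e_k(n) = d_k(n) - y_k(n)
    w = w + eta * e * x        # delta rule: dw_kj(n) = eta * e_k(n) * x_j(n)
    return w, 0.5 * e**2       # updated weights and cost E(n)

w = np.zeros(2)
x, d = np.array([1.0, 0.5]), 2.0
costs = []
for _ in range(50):
    w, cost = error_correction_step(w, x, d)
    costs.append(cost)
# The cost shrinks toward zero as y approaches d.
```

Each pass shrinks the error by a constant factor, so the cost decays geometrically toward zero.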

#### 2.2 Memory-Based Learning

Memory-based learning, as the name suggests, is a strategy that explicitly stores all past experiences. Suppose the set of stored experiences is:

{$(x_{1},d_{1}),(x_{2},d_{2}),...,(x_{N},d_{N})$},

where $x_{i}$ is an input vector and $d_{i}$ its desired response. Classifying a new test vector then amounts to retrieving the stored examples in a local neighborhood of that vector, as in the nearest-neighbor rule.
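A minimal sketch of the memory-based idea, using the nearest-neighbor rule (the Euclidean distance and single-neighbor vote are illustrative assumptions): "learning" is just storage, and all the work happens at query time.

```python
import numpy as np

def nearest_neighbor_classify(memory_x, memory_d, x_new):
    """memory_x holds all stored inputs, memory_d their labels.
    The label of the closest stored example is returned."""
    dists = np.linalg.norm(memory_x - x_new, axis=1)
    return memory_d[np.argmin(dists)]

# The entire "training set" is simply kept in memory:
memory_x = np.array([[0., 0.], [0., 1.], [5., 5.], [5., 6.]])
memory_d = np.array([0, 0, 1, 1])
label = nearest_neighbor_classify(memory_x, memory_d, np.array([4.5, 5.2]))
```

The cost of this simplicity is that memory and query time grow with the number of stored experiences.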

#### 2.3 Hebbian Learning

Hebb's rule is the oldest and best-known neural-network learning rule; most rules in use today are extensions of it. Its basic idea is to change a weight according to the activation levels of the neurons it connects: the change in the weight between two neurons depends on their activation values. If the two neurons on either side of a synapse (connection) are activated simultaneously, the connection is strengthened; if they are activated asynchronously, the connection is weakened or even eliminated.

Hebbian learning has the following characteristics:
1). Time-dependent: a weight is modified only at times when the presynaptic signal (e.g., input $x_{j}$) and the postsynaptic signal (e.g., output $y_{k}$) are both present;
2). Local: it uses only information locally available to the neuron;
3). Interactive: the modification depends jointly on the presynaptic and postsynaptic signals, and the interaction between them may be deterministic or stochastic;
4). Conjunctional or correlational: the co-occurrence (correlation) in time of the presynaptic and postsynaptic signals is what drives the weight change.

In its most general form, the Hebbian adjustment is a function $F$ of the presynaptic and postsynaptic signals:

$\Delta w_{kj}(n) = F(y_{k}(n),x_{j}(n))$.
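In the simplest special case, Hebb's activity-product rule, $F$ reduces to the product of the postsynaptic and presynaptic signals scaled by a learning rate (a minimal sketch; the concrete values are illustrative):

```python
import numpy as np

def hebbian_update(w, x, eta=0.05):
    y = w @ x                      # postsynaptic activity y_k(n)
    return w + eta * y * x        # dw_kj(n) = eta * y_k(n) * x_j(n)

# Weights grow when pre- and postsynaptic signals are active together:
w = np.array([0.1, 0.1])
x = np.array([1.0, 1.0])           # both inputs active at the same time
w_after = hebbian_update(w, x)
```

Note that repeated coincident activity makes the weights grow without bound, which is why practical variants add forgetting or normalization terms.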

#### 2.5 Boltzmann Learning

Boltzmann learning is a stochastic learning method. It combines concepts from stochastic processes, probability, and energy to adjust the network's variables so that the network's energy function is minimized (or maximized). During learning, the random changes to the network's variables are not completely random; they are guided by the resulting change in the energy function. The variables may be the connection weights or the states of the neurons, and the energy function may be defined as the objective function of the problem or as the mean squared error of the network output. Neural networks based on Boltzmann learning are called Boltzmann machines; they will be discussed in more detail in a later article.
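A minimal sketch of the stochastic state update underlying Boltzmann learning (the two-neuron network, symmetric weights, ±1 states, and temperature value are illustrative assumptions): a neuron flips its state with a probability governed by the energy change the flip would cause.

```python
import numpy as np

def energy(w, s):
    # E(s) = -1/2 * sum_{j != k} w_kj s_k s_j (symmetric w, zero diagonal)
    return -0.5 * s @ w @ s

def gibbs_step(w, s, k, T, rng):
    """Flip neuron k with probability 1 / (1 + exp(dE / T))."""
    flipped = s.copy()
    flipped[k] = -flipped[k]
    dE = energy(w, flipped) - energy(w, s)
    if rng.random() < 1.0 / (1.0 + np.exp(dE / T)):
        return flipped
    return s

rng = np.random.default_rng(0)
w = np.array([[0., 1.], [1., 0.]])   # the two neurons "prefer" to agree
s = np.array([1., -1.])
for t in range(200):                  # repeated sampling at a fixed temperature
    s = gibbs_step(w, s, t % 2, T=0.5, rng=rng)
```

Energy-lowering flips are accepted with high probability, energy-raising ones only occasionally, so the state distribution settles toward low-energy configurations.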

## 3. Learning Methodologies

#### 3.1 Credit-Assignment Problem

The credit-assignment (CA) problem asks: which internal or intermediate decisions deserve the credit (or blame) for a learning machine's output? In many cases the outcome is determined by a sequence of actions; the intermediate decisions influence which particular actions are taken, and it is these actions, rather than the decisions themselves, that directly determine the final outcome. In this situation the CA problem can be decomposed into two subproblems:
1). The assignment of credit for outcomes to actions. This is called the temporal credit-assignment problem, in that it involves the instants of time when the actions that deserve credit were actually taken.
2). The assignment of credit for actions to internal decisions. This is called the structural credit-assignment problem, in that it involves assigning credit to the internal structures of actions generated by the system.

PS: I found this section confusing and only half understand it. It feels somewhat detached from neural networks, but it appears to be a general problem, so terms such as decision, action, and credit are general as well, which makes it hard to follow. :-(

#### 3.2 Learning with a Teacher

Learning with a teacher is supervised learning; error-correction learning belongs to this category. In supervised learning of a classification or recognition task, the training data contain not only the input features but also the corresponding label, i.e., the class the input belongs to (the answer supplied by the teacher). The objective of error-correction learning is then to minimize the difference between the network's output and the teacher's answer, i.e., to minimize the mean squared error. After supervised training, the network should be able to process new data (classify, recognize, etc.) without the teacher.

#### 3.3 Learning without a Teacher

Learning without a teacher covers two learning paradigms: unsupervised learning and reinforcement learning. In unsupervised learning there is no teacher to guide the process and no critic available either; the network can only try to discover the statistical regularities hidden in the data, for example by fitting a suitable linear model to separate the inputs. Competitive learning and Hebbian learning are both unsupervised. After unsupervised training, the network can form a feature encoding of the input data.
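Competitive learning, mentioned above, can be sketched as a winner-take-all update (the two units, deterministic initialization, Euclidean winner selection, and learning rate are illustrative assumptions): only the unit whose weight vector is closest to the current input learns, moving further toward it.

```python
import numpy as np

def competitive_step(W, x, eta=0.1):
    """Winner-take-all: only the unit whose weight vector lies closest
    to the input learns, moving its weights toward that input."""
    winner = int(np.argmin(np.linalg.norm(W - x, axis=1)))
    W = W.copy()
    W[winner] += eta * (x - W[winner])
    return W, winner

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.1, (20, 2)),    # cluster near (0, 0)
                  rng.normal(3.0, 0.1, (20, 2))])   # cluster near (3, 3)
W = np.array([[1.0, 1.0], [2.0, 2.0]])              # two competing units
for x in rng.permutation(data):
    W, _ = competitive_step(W, x)
# Each unit has drifted toward "its" cluster center.
```

No labels are used anywhere: the units discover the two clusters purely from the statistical structure of the inputs.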

## 4. Learning Tasks

#### 4.1 Pattern Association

Associative memory is a brain-like, distributed memory that learns by association. Association is a prominent feature of human memory and comes in two forms: autoassociation and heteroassociation. In autoassociation, the network is required to store a set of patterns (vectors) by repeatedly presenting them to it; later, the network is presented with a partial description or a noisy version of one of the original patterns, and the task is to retrieve (recall) that particular pattern. In heteroassociation, an arbitrary set of input patterns is paired with another arbitrary set of output patterns. Autoassociation uses unsupervised learning, whereas heteroassociation uses supervised learning.
PS: I do not fully understand this paragraph; some of the concepts are hard to grasp!
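A minimal sketch of autoassociative recall using Hebbian outer-product storage (Hopfield-style; the single stored pattern and synchronous update are illustrative assumptions): the network stores a ±1 pattern and then cleans up a corrupted probe of it.

```python
import numpy as np

def store(patterns):
    """Hebbian outer-product storage of +/-1 patterns."""
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0.0)       # no self-connections
    return W

def recall(W, probe, steps=5):
    """Iteratively clean up a noisy or partial probe."""
    s = probe.copy().astype(float)
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1              # break ties deterministically
    return s

p = np.array([1, -1, 1, -1, 1, -1, 1, -1])
W = store(p[None, :])
noisy = p.copy()
noisy[0] = -noisy[0]               # corrupt one component of the pattern
recovered = recall(W, noisy)
```

With a single stored pattern the corrupted bit is repaired in one update, illustrating recall from a partial or noisy description.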

#### 4.3 Function Approximation

In function approximation, the task is to design a network that approximates an unknown input-output mapping $d = f(x)$, given only a set of labeled examples drawn from it:

$\mathscr{F} =$ { $(x_{1},d_{1}),(x_{2},d_{2}),...,(x_{N},d_{N})$ }.
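A minimal sketch of function approximation with a one-hidden-layer network trained by gradient descent on the mean squared error (the target function, network size, and learning rate are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(-1, 1, 40)[:, None]                     # inputs x_i
d = np.sin(np.pi * X) + rng.normal(0, 0.05, X.shape)    # noisy targets d_i

# One-hidden-layer network F(x, w) with tanh units.
W1 = rng.normal(0, 1, (1, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)
eta = 0.01

errors = []
for _ in range(2000):
    H = np.tanh(X @ W1 + b1)       # hidden activations
    y = H @ W2 + b2                # network output F(x, w)
    e = y - d
    errors.append(float(np.mean(e**2)))
    # Backpropagate the gradient of the mean squared error:
    gy = 2 * e / len(X)
    gW2 = H.T @ gy;          gb2 = gy.sum(0)
    gH = (gy @ W2.T) * (1 - H**2)
    gW1 = X.T @ gH;          gb1 = gH.sum(0)
    W2 -= eta * gW2;         b2 -= eta * gb2
    W1 -= eta * gW1;         b1 -= eta * gb1
# The training error falls as F(x, w) approaches the unknown f.
```

The network never sees $f$ itself, only the noisy samples, yet the squared error on them steadily decreases.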

#### 4.4 Control

Neural networks can also be used in control systems, for example as part of an error-feedback control loop.

## Recommended Resources

Machine Learning Lecture by Andrew Ng, Stanford University
Lecture VIII: Neural Network - Representation
Lecture IX: Neural Network - Learning
Video courses on Coursera: https://class.coursera.org/ml-2012-002/lecture/index
Lecture homepage at Stanford: http://cs229.stanford.edu/

## References

[1] Simon Haykin, "Neural Networks: A Comprehensive Foundation", 3rd edition, 2009.
[2] T-61.3030 PRINCIPLES OF NEURAL COMPUTING (5 CP).