【发布时间】:2011-04-28 15:27:02
【问题描述】:
EDIT2:
新的训练集...
输入:
[
[0.0, 0.0],
[0.0, 1.0],
[0.0, 2.0],
[0.0, 3.0],
[0.0, 4.0],
[1.0, 0.0],
[1.0, 1.0],
[1.0, 2.0],
[1.0, 3.0],
[1.0, 4.0],
[2.0, 0.0],
[2.0, 1.0],
[2.0, 2.0],
[2.0, 3.0],
[2.0, 4.0],
[3.0, 0.0],
[3.0, 1.0],
[3.0, 2.0],
[3.0, 3.0],
[3.0, 4.0],
[4.0, 0.0],
[4.0, 1.0],
[4.0, 2.0],
[4.0, 3.0],
[4.0, 4.0]
]
输出:
[
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[0.0],
[1.0],
[1.0],
[0.0],
[0.0],
[0.0],
[1.0],
[1.0]
]
编辑1:
我已经用我的最新代码更新了这个问题。我修复了一些小问题,但在网络学习后,所有输入组合的输出仍然相同。
这里是反向传播算法的解释:Backprop algorithm
是的,这是一个家庭作业,一开始就说明这一点。
我应该在一个简单的神经网络上实现一个简单的反向传播算法。
我选择 Python 作为这项任务的首选语言,并且我选择了这样的神经网络:
3 层:1 个输入层、1 个隐藏层、1 个输出层:
O O
O
O O
输入神经元上都有一个整数,输出神经元上有 1 或 0。
这是我的整个实现(有点长)。下面我将选择较短的相关 sn-ps,我认为错误可能位于:
import os
import math
import Image
import random
from random import sample
#------------------------------ class definitions
class Weight:
def __init__(self, fromNeuron, toNeuron):
self.value = random.uniform(-0.5, 0.5)
self.fromNeuron = fromNeuron
self.toNeuron = toNeuron
fromNeuron.outputWeights.append(self)
toNeuron.inputWeights.append(self)
self.delta = 0.0 # delta value, this will accumulate and after each training cycle used to adjust the weight value
def calculateDelta(self, network):
self.delta += self.fromNeuron.value * self.toNeuron.error
class Neuron:
def __init__(self):
self.value = 0.0 # the output
self.idealValue = 0.0 # the ideal output
self.error = 0.0 # error between output and ideal output
self.inputWeights = []
self.outputWeights = []
def activate(self, network):
x = 0.0;
for weight in self.inputWeights:
x += weight.value * weight.fromNeuron.value
# sigmoid function
if x < -320:
self.value = 0
elif x > 320:
self.value = 1
else:
self.value = 1 / (1 + math.exp(-x))
class Layer:
def __init__(self, neurons):
self.neurons = neurons
def activate(self, network):
for neuron in self.neurons:
neuron.activate(network)
class Network:
def __init__(self, layers, learningRate):
self.layers = layers
self.learningRate = learningRate # the rate at which the network learns
self.weights = []
for hiddenNeuron in self.layers[1].neurons:
for inputNeuron in self.layers[0].neurons:
self.weights.append(Weight(inputNeuron, hiddenNeuron))
for outputNeuron in self.layers[2].neurons:
self.weights.append(Weight(hiddenNeuron, outputNeuron))
def setInputs(self, inputs):
self.layers[0].neurons[0].value = float(inputs[0])
self.layers[0].neurons[1].value = float(inputs[1])
def setExpectedOutputs(self, expectedOutputs):
self.layers[2].neurons[0].idealValue = expectedOutputs[0]
def calculateOutputs(self, expectedOutputs):
self.setExpectedOutputs(expectedOutputs)
self.layers[1].activate(self) # activation function for hidden layer
self.layers[2].activate(self) # activation function for output layer
def calculateOutputErrors(self):
for neuron in self.layers[2].neurons:
neuron.error = (neuron.idealValue - neuron.value) * neuron.value * (1 - neuron.value)
def calculateHiddenErrors(self):
for neuron in self.layers[1].neurons:
error = 0.0
for weight in neuron.outputWeights:
error += weight.toNeuron.error * weight.value
neuron.error = error * neuron.value * (1 - neuron.value)
def calculateDeltas(self):
for weight in self.weights:
weight.calculateDelta(self)
def train(self, inputs, expectedOutputs):
self.setInputs(inputs)
self.calculateOutputs(expectedOutputs)
self.calculateOutputErrors()
self.calculateHiddenErrors()
self.calculateDeltas()
def learn(self):
for weight in self.weights:
weight.value += self.learningRate * weight.delta
def calculateSingleOutput(self, inputs):
self.setInputs(inputs)
self.layers[1].activate(self)
self.layers[2].activate(self)
#return round(self.layers[2].neurons[0].value, 0)
return self.layers[2].neurons[0].value
#------------------------------ initialize objects etc
inputLayer = Layer([Neuron() for n in range(2)])
hiddenLayer = Layer([Neuron() for n in range(100)])
outputLayer = Layer([Neuron() for n in range(1)])
learningRate = 0.5
network = Network([inputLayer, hiddenLayer, outputLayer], learningRate)
# just for debugging, the real training set is much larger
trainingInputs = [
[0.0, 0.0],
[1.0, 0.0],
[2.0, 0.0],
[0.0, 1.0],
[1.0, 1.0],
[2.0, 1.0],
[0.0, 2.0],
[1.0, 2.0],
[2.0, 2.0]
]
trainingOutputs = [
[0.0],
[1.0],
[1.0],
[0.0],
[1.0],
[0.0],
[0.0],
[0.0],
[1.0]
]
#------------------------------ let's train
for i in range(500):
for j in range(len(trainingOutputs)):
network.train(trainingInputs[j], trainingOutputs[j])
network.learn()
#------------------------------ let's check
for pattern in trainingInputs:
print network.calculateSingleOutput(pattern)
现在的问题是,在学习网络后,对于所有输入组合,即使是那些应该接近 1.0 的组合,似乎都返回了一个非常接近 0.0 的浮点数。
我在 100 个循环中训练网络,在每个循环中我都这样做:
对于训练集中的每一组输入:
- 设置网络输入
- 使用 sigmoid 函数计算输出
- 计算输出层中的错误
- 计算隐藏层中的错误
- 计算权重的增量
然后我根据学习率和累积的增量调整权重。
这是我的神经元激活函数:
def activationFunction(self, network):
"""
Calculate an activation function of a neuron which is a sum of all input weights * neurons where those weights start
"""
x = 0.0;
for weight in self.inputWeights:
x += weight.value * weight.getFromNeuron(network).value
# sigmoid function
self.value = 1 / (1 + math.exp(-x))
这就是我计算增量的方式:
def calculateDelta(self, network):
self.delta += self.getFromNeuron(network).value * self.getToNeuron(network).error
这是我算法的一般流程:
for i in range(numberOfIterations):
for k,expectedOutput in trainingSet.iteritems():
coordinates = k.split(",")
network.setInputs((float(coordinates[0]), float(coordinates[1])))
network.calculateOutputs([float(expectedOutput)])
network.calculateOutputErrors()
network.calculateHiddenErrors()
network.calculateDeltas()
oldWeights = network.weights
network.adjustWeights()
network.resetDeltas()
print "Iteration ", i
j = 0
for weight in network.weights:
print "Weight W", weight.i, weight.j, ": ", oldWeights[j].value, " ............ Adjusted value : ", weight.value
j += j
输出的最后两行是:
0.552785449458 # this should be close to 1
0.552785449458 # this should be close to 0
它实际上返回所有输入组合的输出编号。
我错过了什么吗?
【问题讨论】:
-
我认为您将不得不自己做更多的工作——这比您合理期望人们为您调试的代码要多。在所有重要位置添加
logging.log语句以跟踪边缘的权重,并使用计算器通过数字计算几个步骤以查看它们的不一致之处。 -
阅读:stackoverflow.com/questions/3704570/…。对于 Bayseian 滤波器,这是一个标准问题,具有标准解决方案。对于非常非常小的浮点数,您似乎也有同样的标准问题。
-
@katrielalex 是的,当然,我也会继续努力。
-
@S.Lott:问题不可能来自那里,因为 OP 已经使用对数作为权重,这就是为什么需要
math.exp。这导致了另一个问题:当 x 变得太小或太大时,python 会引发异常,但这与观察到的虚假行为无关(只是一个普通的旧错误)。 -
只需在
calculateSingleOutput中添加:self.layers[2].runActivationFunctionForAllNeurons(self)即可。但是除了错误修正之外,收敛性不如您编辑后的第一个版本,这令人惊讶。我看不出哪个更改会产生这种效果。
标签: python algorithm math neural-network