Here is what I am trying to do. I have a black-and-white image (100×100 pixels):
I need to train a backpropagation neural network with this image. The inputs are the x and y coordinates of a pixel (each from 0 to 99), and the output is either 1 (white) or 0 (black).
Once the network has learned, I want it to reproduce the image from its weights and get a result as close to the original as possible.
Here is my backpropagation implementation:
import os
import math
import Image
import random
from random import sample

#------------------------------ class definitions

class Weight:
    def __init__(self, fromNeuron, toNeuron):
        self.value = random.uniform(-0.5, 0.5)
        self.fromNeuron = fromNeuron
        self.toNeuron = toNeuron
        fromNeuron.outputWeights.append(self)
        toNeuron.inputWeights.append(self)
        self.delta = 0.0 # accumulated delta; used to adjust the weight value after each training cycle

    def calculateDelta(self, network):
        self.delta += self.fromNeuron.value * self.toNeuron.error

class Neuron:
    def __init__(self):
        self.value = 0.0      # the output
        self.idealValue = 0.0 # the ideal output
        self.error = 0.0      # error between output and ideal output
        self.inputWeights = []
        self.outputWeights = []

    def activate(self, network):
        x = 0.0
        for weight in self.inputWeights:
            x += weight.value * weight.fromNeuron.value
        # sigmoid function (clamped to avoid math.exp overflow)
        if x < -320:
            self.value = 0
        elif x > 320:
            self.value = 1
        else:
            self.value = 1 / (1 + math.exp(-x))

class Layer:
    def __init__(self, neurons):
        self.neurons = neurons

    def activate(self, network):
        for neuron in self.neurons:
            neuron.activate(network)

class Network:
    def __init__(self, layers, learningRate):
        self.layers = layers
        self.learningRate = learningRate # the rate at which the network learns
        self.weights = []
        for hiddenNeuron in self.layers[1].neurons:
            for inputNeuron in self.layers[0].neurons:
                self.weights.append(Weight(inputNeuron, hiddenNeuron))
            for outputNeuron in self.layers[2].neurons:
                self.weights.append(Weight(hiddenNeuron, outputNeuron))

    def setInputs(self, inputs):
        self.layers[0].neurons[0].value = float(inputs[0])
        self.layers[0].neurons[1].value = float(inputs[1])

    def setExpectedOutputs(self, expectedOutputs):
        self.layers[2].neurons[0].idealValue = expectedOutputs[0]

    def calculateOutputs(self, expectedOutputs):
        self.setExpectedOutputs(expectedOutputs)
        self.layers[1].activate(self) # activation function for hidden layer
        self.layers[2].activate(self) # activation function for output layer

    def calculateOutputErrors(self):
        for neuron in self.layers[2].neurons:
            neuron.error = (neuron.idealValue - neuron.value) * neuron.value * (1 - neuron.value)

    def calculateHiddenErrors(self):
        for neuron in self.layers[1].neurons:
            error = 0.0
            for weight in neuron.outputWeights:
                error += weight.toNeuron.error * weight.value
            neuron.error = error * neuron.value * (1 - neuron.value)

    def calculateDeltas(self):
        for weight in self.weights:
            weight.calculateDelta(self)

    def train(self, inputs, expectedOutputs):
        self.setInputs(inputs)
        self.calculateOutputs(expectedOutputs)
        self.calculateOutputErrors()
        self.calculateHiddenErrors()
        self.calculateDeltas()

    def learn(self):
        for weight in self.weights:
            weight.value += self.learningRate * weight.delta

    def calculateSingleOutput(self, inputs):
        self.setInputs(inputs)
        self.layers[1].activate(self)
        self.layers[2].activate(self)
        #return round(self.layers[2].neurons[0].value, 0)
        return self.layers[2].neurons[0].value

#------------------------------ initialize objects etc

inputLayer = Layer([Neuron() for n in range(2)])
hiddenLayer = Layer([Neuron() for n in range(10)])
outputLayer = Layer([Neuron() for n in range(1)])
learningRate = 0.4
network = Network([inputLayer, hiddenLayer, outputLayer], learningRate)

# let's get the training set
os.chdir("D:/stuff")
image = Image.open("backprop-input.gif")
pixels = image.load()
bbox = image.getbbox()
width = 5  #bbox[2] # image width
height = 5 #bbox[3] # image height

trainingInputs = []
trainingOutputs = []
b = w = 0
for x in range(0, width):
    for y in range(0, height):
        if (0, 0, 0, 255) == pixels[x, y]:
            color = 0
            b += 1
        elif (255, 255, 255, 255) == pixels[x, y]:
            color = 1
            w += 1
        trainingInputs.append([float(x), float(y)])
        trainingOutputs.append([float(color)])

print "\nOriginal image ... Black:"+str(b)+" White:"+str(w)+"\n"

#------------------------------ let's train

for i in range(500):
    for j in range(len(trainingOutputs)):
        network.train(trainingInputs[j], trainingOutputs[j])
    network.learn()
    for w in network.weights:
        w.delta = 0.0

#------------------------------ let's check

b = w = 0
for x in range(0, width):
    for y in range(0, height):
        out = network.calculateSingleOutput([float(x), float(y)])
        if 0.0 == round(out):
            color = (0, 0, 0, 255)
            b += 1
        elif 1.0 == round(out):
            color = (255, 255, 255, 255)
            w += 1
        pixels[x, y] = color
        #print out

print "\nAfter learning the network thinks ... Black:"+str(b)+" White:"+str(w)+"\n"
Obviously there is something wrong with my implementation. The code above returns:
Original image ... Black:21 White:4

After learning the network thinks ... Black:25 White:0
It does the same thing if I try to use a larger training set (I am only testing 25 pixels from the image above, just for testing). After learning, it claims that every pixel should be black.
Now, if I use a manual training set like this instead:
trainingInputs = [
    [0.0, 0.0],
    [1.0, 0.0],
    [2.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [2.0, 1.0],
    [0.0, 2.0],
    [1.0, 2.0],
    [2.0, 2.0]
]
trainingOutputs = [
    [0.0],
    [1.0],
    [1.0],
    [0.0],
    [1.0],
    [0.0],
    [0.0],
    [0.0],
    [1.0]
]

#------------------------------ let's train

for i in range(500):
    for j in range(len(trainingOutputs)):
        network.train(trainingInputs[j], trainingOutputs[j])
    network.learn()
    for w in network.weights:
        w.delta = 0.0

#------------------------------ let's check

for inputs in trainingInputs:
    print network.calculateSingleOutput(inputs)
then the output is, for example:
0.0330125791296   # this should be 0, OK
0.953539182136    # this should be 1, OK
0.971854575477    # this should be 1, OK
0.00046146137467  # this should be 0, OK
0.896699762781    # this should be 1, OK
0.112909223162    # this should be 0, OK
0.00034058462280  # this should be 0, OK
0.0929886299643   # this should be 0, OK
0.940489647869    # this should be 1, OK
In other words, the network guesses all the pixels correctly, both black and white. Why does it say all pixels should be black when I use the actual pixels from the image instead of a hard-coded training set like the one above?
I have tried changing the number of neurons in the hidden layer (up to 100 neurons), with no success.
This is a homework assignment.
It is also a continuation of my previous question about backpropagation.
Answer:
It has been a while, but I did get a degree in this field, so hopefully some of it has stuck.
From what I can tell, you are overloading your middle-layer neurons with the input set. That is, your input set consists of 10,000 discrete input values (100 px × 100 px), and you are attempting to encode those 10,000 values into 10 neurons. That level of encoding is hard (I suspect it is possible, but certainly difficult); at the very least you would need a LOT of training (more than 500 runs) to get it to reproduce the image reasonably. Even with 100 neurons in the middle layer you are looking at a relatively dense level of compression (100 pixels to 1 neuron).
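To put rough numbers on that (going by the posted code, which wires the layers fully and uses no bias terms): a 2-H-1 network has 2H weights into the hidden layer and H weights out of it, 3H in total. With H = 10 that is only 30 free parameters trying to memorize 10,000 pixel values; even with H = 100 it is just 300.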
As for how to fix it: well, that is tricky. You could increase the number of middle neurons dramatically and you would get a reasonable effect, but of course it would then take a long time to train. I think there may be a different solution, though: if possible, consider using polar rather than cartesian coordinates for the input. A quick eyeballing of the input pattern suggests a high degree of symmetry; effectively you would be looking at a linear pattern with a repeated, predictable deformation along the angular coordinate, which seems like it would encode nicely in a small number of middle-layer neurons.
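For illustration, here is a minimal sketch of that coordinate change. It assumes the poster's setInputs will accept any two-element list; the choice of the image center as origin and the scaling into roughly [0, 1] are my additions, not anything from the original code:

import math

def toPolar(x, y, width=100, height=100):
    # measure from the image center (assuming the 100x100 image described above)
    dx = x - (width - 1) / 2.0
    dy = y - (height - 1) / 2.0
    r = math.sqrt(dx * dx + dy * dy) # radial coordinate
    theta = math.atan2(dy, dx)       # angular coordinate, in (-pi, pi]
    # scale both into roughly [0, 1]; raw coordinate values in the tens
    # tend to saturate sigmoid units like the ones in the code above
    rMax = math.hypot((width - 1) / 2.0, (height - 1) / 2.0)
    return [r / rMax, (theta + math.pi) / (2 * math.pi)]

# hypothetical usage inside the training-set loop:
#   trainingInputs.append(toPolar(x, y))

Whether this actually buys anything depends on how symmetric the image really is, as noted above.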
This stuff is tricky. Going for a general solution to pattern encoding (as your original approach does) gets very complex, and usually (even with a large number of middle-layer neurons) requires a lot of training passes; on the other hand, some up-front heuristic task decomposition and a little problem redefinition (i.e., converting from cartesian to polar coordinates ahead of time) can yield very good solutions for well-defined problem sets. Therein, of course, lies the perennial rub: general solutions are hard to come by, but slightly more narrowly specified solutions can be very good indeed.
Interesting stuff, in any event!