Convolutional Neural Networks (CNNs) are a class of feedforward neural networks that contain convolution operations and have a deep structure; they are one of the representative algorithms of deep learning. CNNs have representation learning ability and can classify input information in a shift-invariant way according to their hierarchical structure, which is why they are also called "Shift-Invariant Artificial Neural Networks" (SIANN).
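To make the "shift-invariant" idea a bit more concrete, here is a small sketch of my own (not part of the original walkthrough): a convolution layer is shift-equivariant, meaning that shifting the input shifts the feature map by the same amount, and the pooling that follows then makes the final classification largely insensitive to that shift.
import numpy as np
import tensorflow as tf

# A conv layer with a fixed all-ones kernel, just for demonstration.
conv = tf.keras.layers.Conv2D(1, kernel_size=3, padding='same', use_bias=False,
                              kernel_initializer='ones')

img = np.zeros((1, 8, 8, 1), dtype=np.float32)
img[0, 2, 2, 0] = 1.0                      # a single bright pixel
shifted = np.roll(img, shift=3, axis=2)    # the same pixel moved 3 columns right

out = conv(img).numpy()[0, :, :, 0]
out_shifted = conv(shifted).numpy()[0, :, :, 0]

# The response pattern is identical, only moved 3 columns to the right.
print(np.allclose(np.roll(out, shift=3, axis=1), out_shifted))  # True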
In this post I will use TensorFlow's Keras API to build a slimmed-down AlexNet and use it to demonstrate convolutional neural networks.
Load the CIFAR-10 dataset
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.cifar10.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
print(train_labels.shape)
train_labels=train_labels.squeeze(axis=1)
print(train_labels.shape)
test_labels=test_labels.squeeze(axis=1)
(50000, 1)
(50000,)
Notice that the training and test labels carry an extra, useless axis of length 1; to make later processing easier, we squeeze it out.
Check that the data was loaded correctly
plt.imshow(train_images[0])
plt.show()
print(class_names[train_labels[0]])

Build the Keras network
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, input_shape=(32, 32, 3), kernel_size=(3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Conv2D(256, kernel_size=(3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Conv2D(256, kernel_size=(3, 3), activation='relu', padding='same'),
    tf.keras.layers.Conv2D(128, kernel_size=(3, 3), activation='relu', padding='same'),
    tf.keras.layers.Conv2D(128, kernel_size=(3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10)
])
The network has 12 layers in total. The first layer is a convolutional layer with 3×3 kernels and ReLU as the activation function.
The second layer is a max-pooling layer, which by default halves the spatial dimensions.
The third layer is another convolutional layer.
The fourth layer is a pooling layer.
The fifth, sixth and seventh layers are three consecutive convolutions.
The eighth layer is a pooling layer.
The ninth layer is a Flatten layer, which turns the 2D feature maps into a flat vector.
The tenth and eleventh layers are fully connected layers, again using ReLU as the activation.
The twelfth layer is the output layer; since the samples fall into 10 classes, it has 10 neurons.
AlexNet replaced Sigmoid with ReLU, which trains faster and avoids the vanishing-gradient problem that sigmoid runs into when training deeper networks.
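To make the vanishing-gradient point concrete, here is a small sketch of my own (not part of the original walkthrough) that compares the gradients of sigmoid and ReLU at a few input values: the sigmoid derivative is at most 0.25 and collapses towards zero for large inputs, while ReLU passes a gradient of 1 for any positive input.
import tensorflow as tf

x = tf.constant([-6.0, -2.0, 0.5, 2.0, 6.0])

with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    y_sigmoid = tf.nn.sigmoid(x)
    y_relu = tf.nn.relu(x)

# sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)) <= 0.25, and nearly 0 for large |x|,
# so gradients shrink as they flow back through many layers;
# ReLU passes a gradient of 1 for every positive input.
print(tape.gradient(y_sigmoid, x).numpy())  # ≈ [0.0025, 0.105, 0.235, 0.105, 0.0025]
print(tape.gradient(y_relu, x).numpy())     # [0., 0., 1., 1., 1.]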
Model summary
model.summary()
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 32, 32, 64) 1792 max_pooling2d (MaxPooling2D (None, 16, 16, 64) 0 ) conv2d_1 (Conv2D) (None, 16, 16, 256) 147712 max_pooling2d_1 (MaxPooling (None, 8, 8, 256) 0 2D) conv2d_2 (Conv2D) (None, 8, 8, 256) 590080 conv2d_3 (Conv2D) (None, 8, 8, 128) 295040 conv2d_4 (Conv2D) (None, 8, 8, 128) 147584 max_pooling2d_2 (MaxPooling (None, 4, 4, 128) 0 2D) flatten (Flatten) (None, 2048) 0 dense (Dense) (None, 256) 524544 dense_1 (Dense) (None, 128) 32896 dense_2 (Dense) (None, 10) 1290 ================================================================= Total params: 1,740,938 Trainable params: 1,740,938 Non-trainable params: 0 _________________________________________________________________
The summary shows that by the time the data reaches the Flatten layer, the repeated convolution and pooling steps have already shrunk the feature maps down to 4×4 (and, as the later steps show, the model still trains fine even so). This is also why I dropped one of AlexNet's convolution stages: with an input this small there is soon nothing left to pool.
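A quick back-of-the-envelope check of the spatial sizes (my own sketch, not part of the original post): with padding='same' the convolutions keep the height and width, so only the pooling layers shrink the feature maps, and each default MaxPool2D() halves them.
size = 32                      # CIFAR-10 images are 32x32; 'same' convolutions keep this size
for pool in range(3):          # the model above contains three default 2x2 max pools
    size //= 2
    print(f"after pool {pool + 1}: {size}x{size}")
# after pool 1: 16x16
# after pool 2: 8x8
# after pool 3: 4x4
# Two more pooling stages would already bring this down to 1x1, which is why the
# deeper stack of the original AlexNet (designed for much larger inputs, roughly
# 224x224) does not fit a 32x32 image without trimming.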
Train the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))
Epoch 1/10
1563/1563 [==============================] - 11s 7ms/step - loss: 1.5650 - accuracy: 0.4183 - val_loss: 1.3051 - val_accuracy: 0.5351
Epoch 2/10
1563/1563 [==============================] - 10s 7ms/step - loss: 1.1022 - accuracy: 0.6071 - val_loss: 1.0242 - val_accuracy: 0.6373
Epoch 3/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.8881 - accuracy: 0.6858 - val_loss: 0.9532 - val_accuracy: 0.6645
Epoch 4/10
1563/1563 [==============================] - 11s 7ms/step - loss: 0.7554 - accuracy: 0.7325 - val_loss: 0.8001 - val_accuracy: 0.7246
Epoch 5/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.6523 - accuracy: 0.7707 - val_loss: 0.8201 - val_accuracy: 0.7166
Epoch 6/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.5700 - accuracy: 0.7990 - val_loss: 0.8234 - val_accuracy: 0.7247
Epoch 7/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.4890 - accuracy: 0.8260 - val_loss: 0.8609 - val_accuracy: 0.7231
Epoch 8/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.4236 - accuracy: 0.8495 - val_loss: 0.8407 - val_accuracy: 0.7398
Epoch 9/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.3664 - accuracy: 0.8699 - val_loss: 0.8994 - val_accuracy: 0.7366
Epoch 10/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.3265 - accuracy: 0.8831 - val_loss: 0.9323 - val_accuracy: 0.7352
We use adam as the optimizer (the optimizer determines how gradient descent is carried out) and SparseCategoricalCrossentropy as the loss function (the loss function measures the error between the model's output and the ground truth).
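As a side note on what SparseCategoricalCrossentropy(from_logits=True) actually computes, here is a minimal sketch with made-up logits (my own example, not taken from the training run): the integer label selects one class, the logits are turned into probabilities with softmax, and the loss is the negative log-probability of the true class.
import numpy as np
import tensorflow as tf

logits = tf.constant([[2.0, 0.5, -1.0]])   # raw model outputs for one sample, 3 classes
label = tf.constant([0])                   # integer class index, not one-hot

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
print(float(loss_fn(label, logits)))       # ~0.24

# The same value computed by hand: softmax, then -log(probability of the true class)
probs = np.exp([2.0, 0.5, -1.0]) / np.sum(np.exp([2.0, 0.5, -1.0]))
print(-np.log(probs[0]))                   # ~0.24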
Evaluate the model's accuracy on the test data
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print('\nTest accuracy:', test_acc)
313/313 - 1s - loss: 0.9323 - accuracy: 0.7352 - 807ms/epoch - 3ms/step

Test accuracy: 0.7351999878883362
Appending a softmax layer converts the raw output logits into probabilities, which are much easier to interpret.
probability_model = tf.keras.Sequential([model,
                                         tf.keras.layers.Softmax()])
predictions = probability_model.predict(test_images)
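As a quick sanity check (my own addition), each row of predictions should now be a probability distribution over the 10 classes, summing to 1, and its argmax is the predicted class:
# Each prediction is a length-10 probability vector over the classes.
print(predictions[0].shape)          # (10,)
print(predictions[0].sum())          # ~1.0
print(class_names[np.argmax(predictions[0])],
      class_names[test_labels[0]])   # predicted vs. true class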
Helper functions for plotting
def plot_image(i, predictions_array, true_label, img):
    # Show the i-th image with its predicted label, confidence and true label.
    true_label, img = true_label[i], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    plt.imshow(img, cmap=plt.cm.binary)
    predicted_label = np.argmax(predictions_array)
    # Blue caption for a correct prediction, red for a wrong one.
    if predicted_label == true_label:
        color = 'blue'
    else:
        color = 'red'
    plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                         100*np.max(predictions_array),
                                         class_names[true_label]),
               color=color)

def plot_value_array(i, predictions_array, true_label):
    # Draw the predicted probability of each of the 10 classes as a bar chart.
    true_label = true_label[i]
    plt.grid(False)
    plt.xticks(range(10))
    plt.yticks([])
    thisplot = plt.bar(range(10), predictions_array, color="#777777")
    plt.ylim([0, 1])
    predicted_label = np.argmax(predictions_array)
    # Predicted class in red, true class in blue (they overlap when correct).
    thisplot[predicted_label].set_color('red')
    thisplot[true_label].set_color('blue')
i = 0
plt.figure(figsize=(6,3))
plt.subplot(1,2,1)
plot_image(i, predictions[i], test_labels, test_images)
plt.subplot(1,2,2)
plot_value_array(i, predictions[i], test_labels)
plt.show()

i = 1
plt.figure(figsize=(6,3))
plt.subplot(1,2,1)
plot_image(i, predictions[i], test_labels, test_images)
plt.subplot(1,2,2)
plot_value_array(i, predictions[i], test_labels)
plt.show()

num_rows = 5
num_cols = 3
num_images = num_rows*num_cols
plt.figure(figsize=(2*2*num_cols, 2*num_rows))
for i in range(num_images):
    plt.subplot(num_rows, 2*num_cols, 2*i+1)
    plot_image(i, predictions[i], test_labels, test_images)
    plt.subplot(num_rows, 2*num_cols, 2*i+2)
    plot_value_array(i, predictions[i], test_labels)
plt.tight_layout()
plt.show()

From the test results above we can see that the model predicts most of the images correctly.
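To put a number on "most", here is a small check (my own addition) that recomputes the overall test accuracy directly from the predictions array; it should agree with the value reported by model.evaluate above, about 0.735:
# Fraction of test images whose argmax prediction matches the true label.
predicted_classes = np.argmax(predictions, axis=1)
print((predicted_classes == test_labels).mean())  # ~0.735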