Convolutional Neural Networks (CNNs) are a class of feedforward neural networks that contain convolution operations and have a deep structure, and they are among the representative algorithms of deep learning. CNNs have representation learning capability and can perform shift-invariant classification of their inputs according to their hierarchical structure, which is why they are also called "Shift-Invariant Artificial Neural Networks (SIANN)".
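To make "shift-invariant" concrete, here is a minimal sketch of my own (an illustration, not from any paper): a convolution's feature map shifts along with the input, and a global pooling on top then produces the same value either way.

import numpy as np
import tensorflow as tf

# A tiny one-channel image and a copy shifted two pixels to the right.
img = np.zeros((1, 8, 8, 1), dtype=np.float32)
img[0, 2:5, 1:4, 0] = 1.0
shifted = np.roll(img, shift=2, axis=2)

# A fixed 3x3 convolution followed by global max pooling.
conv = tf.keras.layers.Conv2D(1, 3, padding='same', use_bias=False,
                              kernel_initializer='ones')
pool = tf.keras.layers.GlobalMaxPooling2D()

# The pooled response is identical even though the input moved.
print(float(pool(conv(img))), float(pool(conv(shifted))))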
In this post I will use TensorFlow's Keras API to build a slimmed-down AlexNet as a demonstration of convolutional neural networks.
Importing the CIFAR-10 dataset
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.cifar10.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']
print(train_labels.shape)
train_labels = train_labels.squeeze(axis=1)
print(train_labels.shape)
test_labels = test_labels.squeeze(axis=1)
(50000, 1)
(50000,)
As the output shows, the training and test labels carry an extra, useless axis; we squeeze it out to make the data easier to handle later.
Check that the data was imported successfully:
plt.imshow(train_images[0])
plt.show()
print(class_names[train_labels[0]])
(Figure: the first training image is displayed; the printed class name is "frog".)
Building the Keras network
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, input_shape=(32, 32, 3), kernel_size=(3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Conv2D(256, kernel_size=(3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Conv2D(256, kernel_size=(3, 3), activation='relu', padding='same'),
    tf.keras.layers.Conv2D(128, kernel_size=(3, 3), activation='relu', padding='same'),
    tf.keras.layers.Conv2D(128, kernel_size=(3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10)
])
The network has 12 layers in total. The first layer is a convolutional layer with a 3×3 kernel, using ReLU as the activation function.
The second layer is a pooling layer, which by default halves each spatial dimension (see the quick check after this list).
The third layer is another convolutional layer.
The fourth is a pooling layer.
The fifth, sixth, and seventh layers apply three convolutions in succession.
The eighth is a pooling layer.
The ninth is a flatten layer, which unrolls the 2-D feature maps into a 1-D vector.
The tenth and eleventh are fully connected layers, again using ReLU as the activation.
The twelfth is the output layer; since the samples fall into 10 classes, it has 10 neurons.
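The halving claim above is easy to verify; a minimal standalone check (my own snippet, separate from the model) shows that MaxPool2D with no arguments uses a 2×2 window with stride 2:

import tensorflow as tf

x = tf.random.normal((1, 32, 32, 64))   # one 32x32 feature map with 64 channels
y = tf.keras.layers.MaxPool2D()(x)       # default pool_size=(2, 2), stride 2
print(y.shape)                           # (1, 16, 16, 64)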
AlexNet replaced sigmoid with ReLU, which trains faster and also avoids the vanishing-gradient problem that sigmoid runs into when training deeper networks.
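To make the vanishing-gradient point concrete, here is a small sketch of my own (not from the original paper): the derivative of sigmoid never exceeds 0.25, so the gradient through a stack of sigmoids shrinks geometrically, while ReLU passes a gradient of exactly 1 wherever its input is positive.

import tensorflow as tf

x = tf.constant(2.0)
with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    s = r = x
    for _ in range(10):          # ten stacked activations, no weights in between
        s = tf.sigmoid(s)
        r = tf.nn.relu(r)

print(tape.gradient(s, x).numpy())   # ~1e-7: product of ten sigmoid derivatives
print(tape.gradient(r, x).numpy())   # 1.0: ReLU's gradient is 1 for positive input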
Model summary
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                    Output Shape          Param #
=================================================================
 conv2d (Conv2D)                 (None, 32, 32, 64)    1792
 max_pooling2d (MaxPooling2D)    (None, 16, 16, 64)    0
 conv2d_1 (Conv2D)               (None, 16, 16, 256)   147712
 max_pooling2d_1 (MaxPooling2D)  (None, 8, 8, 256)     0
 conv2d_2 (Conv2D)               (None, 8, 8, 256)     590080
 conv2d_3 (Conv2D)               (None, 8, 8, 128)     295040
 conv2d_4 (Conv2D)               (None, 8, 8, 128)     147584
 max_pooling2d_2 (MaxPooling2D)  (None, 4, 4, 128)     0
 flatten (Flatten)               (None, 2048)          0
 dense (Dense)                   (None, 256)           524544
 dense_1 (Dense)                 (None, 128)           32896
 dense_2 (Dense)                 (None, 10)            1290
=================================================================
Total params: 1,740,938
Trainable params: 1,740,938
Non-trainable params: 0
_________________________________________________________________
Notice that by the flatten layer the 32×32 input has already been pooled down to 4×4 feature maps (as the later steps show, the model still runs fine even at this scale). This is also why I removed one convolutional stage compared with the full AlexNet: with maps this small, there is no room left for further pooling.
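The shrinkage is easy to track on paper: the 'same'-padded convolutions leave the spatial size unchanged, and each MaxPool2D halves it. A quick sketch of the arithmetic:

size = 32
for stage in range(1, 4):        # the three MaxPool2D layers in the model
    size //= 2
    print(f'after pool {stage}: {size}x{size}')
# after pool 1: 16x16
# after pool 2: 8x8
# after pool 3: 4x4  -> only two more halvings (2x2, then 1x1) would ever be possible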
Training the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
history = model.fit(train_images, train_labels, epochs=10,
                    validation_data=(test_images, test_labels))
Epoch 1/10
1563/1563 [==============================] - 11s 7ms/step - loss: 1.5650 - accuracy: 0.4183 - val_loss: 1.3051 - val_accuracy: 0.5351
Epoch 2/10
1563/1563 [==============================] - 10s 7ms/step - loss: 1.1022 - accuracy: 0.6071 - val_loss: 1.0242 - val_accuracy: 0.6373
Epoch 3/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.8881 - accuracy: 0.6858 - val_loss: 0.9532 - val_accuracy: 0.6645
Epoch 4/10
1563/1563 [==============================] - 11s 7ms/step - loss: 0.7554 - accuracy: 0.7325 - val_loss: 0.8001 - val_accuracy: 0.7246
Epoch 5/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.6523 - accuracy: 0.7707 - val_loss: 0.8201 - val_accuracy: 0.7166
Epoch 6/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.5700 - accuracy: 0.7990 - val_loss: 0.8234 - val_accuracy: 0.7247
Epoch 7/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.4890 - accuracy: 0.8260 - val_loss: 0.8609 - val_accuracy: 0.7231
Epoch 8/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.4236 - accuracy: 0.8495 - val_loss: 0.8407 - val_accuracy: 0.7398
Epoch 9/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.3664 - accuracy: 0.8699 - val_loss: 0.8994 - val_accuracy: 0.7366
Epoch 10/10
1563/1563 [==============================] - 10s 7ms/step - loss: 0.3265 - accuracy: 0.8831 - val_loss: 0.9323 - val_accuracy: 0.7352
We use Adam as the optimizer (the optimizer determines how gradient descent is carried out) and SparseCategoricalCrossentropy as the loss function (the loss measures the error between the model's output and the ground truth). Since the final Dense layer has no activation, the model outputs raw logits, which is why from_logits=True is passed.
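As a small sanity check of what this loss computes (my own sketch, with made-up logits): it is the negative log of the softmax probability assigned to the true class, taking the label as a plain class index rather than a one-hot vector.

import numpy as np
import tensorflow as tf

logits = tf.constant([[2.0, 0.5, -1.0]])   # raw scores for 3 hypothetical classes
label = tf.constant([0])                    # sparse label: the true class index

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
print(float(loss_fn(label, logits)))        # ~0.2413

# Equivalent manual computation: -log(softmax probability of the true class)
probs = tf.nn.softmax(logits).numpy()[0]
print(-np.log(probs[0]))                    # matches the loss above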
Evaluating the model's accuracy on the test data
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print('\nTest accuracy:', test_acc)
313/313 - 1s - loss: 0.9323 - accuracy: 0.7352 - 807ms/epoch - 3ms/step

Test accuracy: 0.7351999878883362
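The gap between the final training accuracy (~0.88) and the test accuracy (~0.74), together with the stalled val_accuracy in the log above, suggests the model starts overfitting after a few epochs. Since fit() returned a History object, we can plot both curves to see this; a short sketch using the history variable captured earlier:

plt.plot(history.history['accuracy'], label='train accuracy')
plt.plot(history.history['val_accuracy'], label='val accuracy')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()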
Adding softmax as a final activation converts the raw outputs into a range that is much easier to interpret:
probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])
predictions = probability_model.predict(test_images)
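As a quick check (my own snippet, reusing the arrays above), each row of predictions is now a probability distribution over the 10 classes: its entries sum to 1, and the argmax is the predicted class.

print(predictions[0])                          # 10 probabilities for the first test image
print(np.sum(predictions[0]))                  # ~1.0
print(class_names[np.argmax(predictions[0])])  # the predicted class name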
Preparing the plotting helpers
def plot_image(i, predictions_array, true_label, img):
    true_label, img = true_label[i], img[i]
    plt.grid(False)
    plt.xticks([])
    plt.yticks([])
    plt.imshow(img, cmap=plt.cm.binary)

    predicted_label = np.argmax(predictions_array)
    if predicted_label == true_label:
        color = 'blue'
    else:
        color = 'red'

    plt.xlabel("{} {:2.0f}% ({})".format(class_names[predicted_label],
                                         100 * np.max(predictions_array),
                                         class_names[true_label]),
               color=color)

def plot_value_array(i, predictions_array, true_label):
    true_label = true_label[i]
    plt.grid(False)
    plt.xticks(range(10))
    plt.yticks([])
    thisplot = plt.bar(range(10), predictions_array, color="#777777")
    plt.ylim([0, 1])
    predicted_label = np.argmax(predictions_array)

    thisplot[predicted_label].set_color('red')
    thisplot[true_label].set_color('blue')
i = 0
plt.figure(figsize=(6, 3))
plt.subplot(1, 2, 1)
plot_image(i, predictions[i], test_labels, test_images)
plt.subplot(1, 2, 2)
plot_value_array(i, predictions[i], test_labels)
plt.show()
(Figure: test image 0 with its predicted label, next to a bar chart of the 10 class probabilities.)
i = 1
plt.figure(figsize=(6, 3))
plt.subplot(1, 2, 1)
plot_image(i, predictions[i], test_labels, test_images)
plt.subplot(1, 2, 2)
plot_value_array(i, predictions[i], test_labels)
plt.show()
(Figure: test image 1 with its predicted label and probability bars.)
num_rows = 5
num_cols = 3
num_images = num_rows * num_cols
plt.figure(figsize=(2 * 2 * num_cols, 2 * num_rows))
for i in range(num_images):
    plt.subplot(num_rows, 2 * num_cols, 2 * i + 1)
    plot_image(i, predictions[i], test_labels, test_images)
    plt.subplot(num_rows, 2 * num_cols, 2 * i + 2)
    plot_value_array(i, predictions[i], test_labels)
plt.tight_layout()
plt.show()
(Figure: a 5×3 grid of test images, each shown next to its probability bar chart.)
From the test results above, the model's predictions are accurate for most of the images.
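To back that impression up with a number (a small sketch reusing the arrays computed above), we can take the argmax of every prediction and compare against the true labels; this reproduces the ~0.735 accuracy reported by model.evaluate():

predicted_labels = np.argmax(predictions, axis=1)
print(np.mean(predicted_labels == test_labels))   # ~0.735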