2020 How do machines understand the world (2)

This is day 24 of my participation in the More Text Challenge. For details of the event, see: More Text Challenge.

In Professor Li Hongyi's analysis of AlexNet, the first-layer filters were shown to respond to simple patterns and colors. The reason is that only the first layer takes the raw image as input; from the second layer on, each filter takes the previous layer's feature maps as input rather than the image itself.
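To make that chaining concrete, here is a minimal sketch (my own illustration, not part of the original post; it reuses the same VGG16 network loaded later) that exposes the outputs of the first two convolution layers, so you can see that the second layer consumes the first layer's 64 feature maps rather than the 3-channel image:

```python
from keras.applications import VGG16
from keras.models import Model
import numpy as np

vgg = VGG16(weights='imagenet', include_top=False)

# Wrap the network so it returns the feature maps of the first two conv layers.
probe = Model(inputs=vgg.input,
              outputs=[vgg.get_layer('block1_conv1').output,
                       vgg.get_layer('block1_conv2').output])

img = np.random.random((1, 150, 150, 3)).astype('float32')
fmap1, fmap2 = probe.predict(img)
# block1_conv1 sees the 3-channel image; block1_conv2 sees the 64 feature maps
# produced by block1_conv1.
print(fmap1.shape)  # (1, 150, 150, 64)
print(fmap2.shape)  # (1, 150, 150, 64)
```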

```python
from keras.applications import VGG16
from keras import backend as K
import matplotlib.pyplot as plt
```

Keras provides classic pretrained models such as VGG16 through `keras.applications`. In practice, most of our networks are not trained from scratch but are built on top of these classic networks.
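As a hedged sketch of what building on a classic network typically looks like (the classifier head, input shape, and training setup here are my own example, not from this post), one freezes the pretrained VGG16 convolutional base and trains only a small head on top:

```python
from keras.applications import VGG16
from keras import models, layers

# Pretrained convolutional base; its weights come from ImageNet training.
conv_base = VGG16(weights='imagenet', include_top=False, input_shape=(150, 150, 3))
conv_base.trainable = False  # keep the classic network's filters fixed

# Small task-specific head trained on top of the frozen base.
model = models.Sequential([
    conv_base,
    layers.Flatten(),
    layers.Dense(256, activation='relu'),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
```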

Take the filter numbered 0 of the block3_conv1 layer and a 150 × 150 input. We want to find what kind of image this filter responds to most strongly. To do that we define an objective function: the mean of all values in the filter's feature map. The larger this mean, the more strongly the feature map is activated.

The object of our study is the first filter (index 0) of the block3_conv1 layer of VGG16. What kind of $150 \times 150$ image, fed into VGG16, makes this filter of block3_conv1 activate most strongly? That is the kind of image the filter is most sensitive to. How do we express this with a mathematical model? We compute the filter's response to the image and take the mean of all values in its feature map; the image that makes this mean largest is the one the filter is most sensitive to. This mean is therefore our objective function, and what we have to do is find an image that maximizes it. Use $x$ to denote the image we are looking for: we want $\max_x f(x)$, and instead of solving the derivative $\frac{df(x)}{dx}$ analytically we use gradient ascent. The input image $x$ is a $150 \times 150 \times 3$ matrix, and we take the derivative of the objective with respect to each component of this matrix.
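Written out explicitly (my own notation, just restating the paragraph above): let $A^{(l,k)}_{i,j}(x)$ be the value at position $(i,j)$ of the feature map produced by filter $k$ of layer $l$ for an input image $x$. Then the objective and the search problem are

$$
f(x)=\frac{1}{HW}\sum_{i=1}^{H}\sum_{j=1}^{W}A^{(l,k)}_{i,j}(x),
\qquad
x^{*}=\arg\max_{x} f(x),
$$

and gradient ascent updates $x \leftarrow x + \eta\,\frac{\partial f(x)}{\partial x}$, component by component over the $150 \times 150 \times 3$ matrix.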

```python
model = VGG16(weights='imagenet', include_top=False)
layer_name = 'block3_conv1'
filter_index = 0
```

Gradient ascent

$w_1 \leftarrow w_0 + \eta \frac{\partial L}{\partial w}$. We would like to take slightly larger steps, but a large step can make the updates unstable, so the gradient vector is L2-normalized first (divided by the square root of its mean squared value, plus a small constant to avoid division by zero).
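A tiny numerical sketch (toy numbers of my own, not from the post) of why this helps: dividing the gradient by its RMS makes the update magnitude roughly the same whether the raw gradient is huge or tiny, so a fixed step size stays stable.

```python
import numpy as np

def normalized_step(grad, step=1.0, eps=1e-5):
    # Same trick as in the code below: scale the gradient by its RMS
    # so the update size no longer depends on the raw gradient scale.
    return step * grad / (np.sqrt(np.mean(np.square(grad))) + eps)

big = np.random.randn(4) * 1000.0   # very large raw gradient
tiny = np.random.randn(4) * 1e-4    # very small raw gradient
print(np.abs(normalized_step(big)).mean())   # roughly 1 in magnitude
print(np.abs(normalized_step(tiny)).mean())  # also roughly 1 in magnitude
```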

```python
layer_output = model.get_layer(layer_name).output
loss = K.mean(layer_output[:, :, :, filter_index])
```
```python
layer_output
```
```
<tf.Tensor 'block1_conv1/Relu:0' shape=(?, ?, ?, 64) dtype=float32>
```
```python
print(loss)
```
```
Tensor("Mean_1170:0", shape=(), dtype=float32)
```
```python
grads = K.gradients(loss, model.input)[0]
```
```python
grads
```
```
<tf.Tensor 'gradients_585/block1_conv1/convolution_grad/Conv2DBackpropInput:0' shape=(?, ?, ?, 3) dtype=float32>
```
```python
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
```
```python
iterate = K.function([model.input], [loss, grads])

import numpy as np
loss_value, grads_value = iterate([np.zeros((1, 150, 150, 3))])
```
```python
input_img_data = np.random.random((1, 150, 150, 3)) * 20 + 128.
step = 1.
for i in range(40):
    loss_value, grads_value = iterate([input_img_data])
    input_img_data += grads_value * step
```

This function converts the generated tensor back into a displayable image: it standardizes the values (zero mean, unit standard deviation), scales and shifts them into the [0, 1] range, and finally maps them to 0–255 uint8 pixels.

```python
def depress_image(x):
    x -= x.mean()
    x /= (x.std() + 1e-5)
    x *= 0.1
    x += 0.5
    x = np.clip(x, 0, 1)
    x *= 255
    x = np.clip(x, 0, 255).astype('uint8')
    return x
```
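As a quick sanity check (a toy input of my own, assuming `numpy` is already imported as above), the function maps an arbitrary float tensor to a valid uint8 image:

```python
# Toy tensor with values far outside the displayable range.
x = np.random.randn(150, 150, 3) * 300 + 50
img = depress_image(x.copy())            # copy: the function modifies its input in place
print(img.dtype, img.min(), img.max())   # uint8, values inside 0..255
```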

This function ties the steps above together: given a layer name and a filter index, it builds the loss and gradient for that filter, runs 40 steps of gradient ascent starting from a noisy gray image, and returns the resulting pattern as a displayable image.

```python
def generate_pattern(layer_name, filter_index, size=150):
    layer_output = model.get_layer(layer_name).output
    loss = K.mean(layer_output[:, :, :, filter_index])
    grads = K.gradients(loss, model.input)[0]
    grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
    iterate = K.function([model.input], [loss, grads])
    input_img_data = np.random.random((1, size, size, 3)) * 20 + 128.
    step = 1.
    for i in range(40):
        loss_value, grads_value = iterate([input_img_data])
        input_img_data += grads_value * step
    img = input_img_data[0]
    return depress_image(img)
```
```python
plt.imshow(generate_pattern('block1_conv1', 0))
plt.show()
```

```python
plt.imshow(generate_pattern('block3_conv1', 0))
plt.show()
```

```python
plt.imshow(generate_pattern('block4_conv1', 1))
plt.show()
```

```python
for layer_name in ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1']:
    size = 64
    margin = 5
    # 8 x 8 grid of the first 64 filter patterns for this layer.
    results = np.zeros((8 * size + 7 * margin, 8 * size + 7 * margin, 3))
    for i in range(8):
        for j in range(8):
            filter_img = generate_pattern(layer_name, i + (j * 8), size=size)
            horizontal_start = i * size + i * margin
            horizontal_end = horizontal_start + size
            vertical_start = j * size + j * margin
            vertical_end = vertical_start + size
            results[horizontal_start:horizontal_end, vertical_start:vertical_end, :] = filter_img
    plt.figure(figsize=(20, 20))
    # results holds 0-255 values in a float array; cast to uint8 so imshow
    # does not clip everything to white.
    plt.imshow(results.astype('uint8'))
    plt.show()
```
```python
model.summary()
```
```
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, None, None, 3)     0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, None, None, 512)   0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
```
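The summary also shows how many filters each convolutional layer has (64, 128, 256, 512), which is exactly the valid range of `filter_index` for `generate_pattern`. A small check (assuming the `model` object loaded above) reads those counts programmatically:

```python
# The last dimension of each conv layer's output is its number of filters,
# so filter_index must be smaller than this value for that layer.
for layer in model.layers:
    if 'conv' in layer.name:
        print(layer.name, layer.output_shape[-1])
```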