# 2020: How do machines understand the world? (2)

This post is day 24 of the writing challenge I am taking part in; see the challenge page for event details.

In Professor Hung-yi Lee's analysis of AlexNet, the first-layer filters were shown to respond to simple patterns and colors. The first layer takes the raw image as input, while the second layer of filters takes the first layer's feature maps as input, so deeper filters respond to increasingly abstract features.

```python
from keras.applications import VGG16
from keras import backend as K
import matplotlib.pyplot as plt
```

Keras ships classic pretrained models such as VGG16, and much of our neural network work builds on these classic networks.

We want to find what kind of image is reflected by a given convolution filter, say filter number 0 of the block3_conv1 layer, for a 150 × 150 input. To do this we define an objective function: the mean over all values of the filter's feature map, which measures how strongly that feature map is activated.

The object of our study is filter number 0 of the block3_conv1 layer of VGG16. What kind of $150 \times 150$ picture, fed into VGG16, makes this filter of block3_conv1 activate most strongly? The filter is most sensitive to such images. How do we express this as a mathematical model? The measure is that the mean of the filter's response over the image is as large as possible; that mean activation is our objective function. What we have to do is find a picture whose input makes the filter's mean activation largest. Using x to denote the picture we are looking for, the problem is $\max_x f(x)$, and we compute the gradient $\frac{df(x)}{dx}$, replacing an analytic solution with gradient ascent. The input picture x here is a $150 \times 150 \times 3$ matrix, and we take the derivative with respect to each component of this matrix.
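Before wiring this up in Keras, the idea of gradient ascent itself can be checked on a toy objective. The sketch below is a hypothetical stand-in for the filter response: $f(x) = -\lVert x - \text{target}\rVert^2$, whose maximum is exactly at `target`, so we can verify that repeatedly stepping uphill along the gradient drives a random "image" toward it. The function names and shapes here are illustrative, not from the original code.

```python
import numpy as np

# Hypothetical stand-in for the filter response: f(x) = -sum((x - target)**2),
# which is maximized exactly when x equals `target`.
target = np.full((4, 4, 3), 0.7)

def f(x):
    return -np.sum((x - target) ** 2)

def grad_f(x):
    # df/dx, taken componentwise, just as we differentiate
    # the real objective w.r.t. every pixel of the input image
    return -2.0 * (x - target)

x = np.random.random((4, 4, 3))  # random starting "image"
step = 0.1
for _ in range(200):
    x += step * grad_f(x)        # gradient *ascent*: move uphill

print(np.abs(x - target).max() < 1e-3)  # → True: x has converged to target
```

With the real network, `grad_f` is replaced by the automatic gradient of the filter's mean activation with respect to the input image.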

```python
model = VGG16(weights='imagenet', include_top=False)
layer_name = 'block3_conv1'
filter_index = 0
```

The update rule is $w_1 \leftarrow w_0 + \eta \frac{\partial L}{\partial w}$. However, if the step is too large the update may become unstable, so the gradient vector is normalized by its L2 norm before each step.
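The normalization trick can be sketched in plain NumPy: dividing the gradient by the square root of its mean square (its RMS) makes the update magnitude roughly the same at every step, no matter how large the raw gradient is. The array shape below mirrors the input image but the values are arbitrary.

```python
import numpy as np

# A deliberately huge gradient, to show the rescaling at work.
grads = np.random.randn(1, 150, 150, 3) * 1000.0

# Same expression as in the Keras code, in NumPy form:
grads_normalized = grads / (np.sqrt(np.mean(np.square(grads))) + 1e-5)

rms = np.sqrt(np.mean(np.square(grads_normalized)))
print(round(rms, 3))  # → 1.0: the normalized gradient always has unit RMS
```

The `1e-5` guards against division by zero when the gradient happens to vanish.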

```python
layer_output = model.get_layer(layer_name).output
loss = K.mean(layer_output[:, :, :, filter_index])
```
```python
layer_output
```

```
<tf.Tensor 'block1_conv1/Relu:0' shape=(?, ?, ?, 64) dtype=float32>
```
```python
print(loss)
```

```
Tensor("Mean_1170:0", shape=(), dtype=float32)
```
```python
grads = K.gradients(loss, model.input)[0]
```
```python
grads
```

```
<tf.Tensor 'gradients_585/block1_conv1/convolution_grad/Conv2DBackpropInput:0' shape=(?, ?, ?, 3) dtype=float32>
```
```python
grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
```
```python
iterate = K.function([model.input], [loss, grads])
import numpy as np
loss_value, grads_value = iterate([np.zeros((1, 150, 150, 3))])
```
```python
# start from a gray image with noise and run 40 steps of gradient ascent
input_img_data = np.random.random((1, 150, 150, 3)) * 20 + 128.
step = 1.
for i in range(40):
    loss_value, grads_value = iterate([input_img_data])
    input_img_data += grads_value * step
```

This function post-processes the resulting tensor into a displayable picture: it first normalizes the values to zero mean and unit standard deviation, then shifts, clips, and rescales them into the 0–255 range.

```python
def depress_image(x):
    # normalize: zero mean, unit standard deviation, then shrink
    x -= x.mean()
    x /= (x.std() + 1e-5)
    x *= 0.1

    # center around 0.5 and clip into [0, 1]
    x += 0.5
    x = np.clip(x, 0, 1)

    # scale to RGB byte values
    x *= 255
    x = np.clip(x, 0, 255).astype('uint8')
    return x
```
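A quick self-contained check of this post-processing (the same steps are reproduced inline here so the snippet runs on its own): whatever the scale and offset of the input tensor, the result is always a valid `uint8` image in the 0–255 range.

```python
import numpy as np

# Same steps as depress_image above, reproduced so this snippet is standalone.
def depress_image(x):
    x = x.astype('float64')
    x -= x.mean()
    x /= (x.std() + 1e-5)
    x *= 0.1
    x += 0.5
    x = np.clip(x, 0, 1)
    x *= 255
    return np.clip(x, 0, 255).astype('uint8')

# an arbitrary tensor with large scale and offset, as gradient ascent might produce
raw = np.random.randn(150, 150, 3) * 37.0 + 500.0
img = depress_image(raw)

print(img.dtype, int(img.min()) >= 0, int(img.max()) <= 255)
# → uint8 True True
```

Clipping after the `x *= 0.1; x += 0.5` step means roughly five standard deviations of the data survive; the rest saturates to pure black or white.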

This function ties everything together: given a layer name and a filter index, it builds the loss, runs 40 steps of gradient ascent starting from a noisy gray image, and returns the picture that maximally activates the filter.

```python
def generate_pattern(layer_name, filter_index, size=150):
    layer_output = model.get_layer(layer_name).output
    loss = K.mean(layer_output[:, :, :, filter_index])

    # gradient of the loss w.r.t. the input image, L2-normalized
    grads = K.gradients(loss, model.input)[0]
    grads /= (K.sqrt(K.mean(K.square(grads))) + 1e-5)
    iterate = K.function([model.input], [loss, grads])

    # start from a noisy gray image and run gradient ascent
    input_img_data = np.random.random((1, size, size, 3)) * 20 + 128.
    step = 1.
    for i in range(40):
        loss_value, grads_value = iterate([input_img_data])
        input_img_data += grads_value * step

    img = input_img_data[0]
    return depress_image(img)
```
```python
plt.imshow(generate_pattern('block1_conv1', 0))
plt.show()
```

```python
plt.imshow(generate_pattern('block3_conv1', 0))
plt.show()
```

```python
plt.imshow(generate_pattern('block4_conv1', 1))
plt.show()
```

```python
for layer_name in ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1']:
    size = 64
    margin = 5
    # empty canvas for an 8 x 8 grid of filter patterns with margins
    results = np.zeros((8 * size + 7 * margin, 8 * size + 7 * margin, 3))
    for i in range(8):
        for j in range(8):
            filter_img = generate_pattern(layer_name, i + (j * 8), size=size)

            horizontal_start = i * size + i * margin
            horizontal_end = horizontal_start + size

            vertical_start = j * size + j * margin
            vertical_end = vertical_start + size

            results[horizontal_start:horizontal_end, vertical_start:vertical_end, :] = filter_img
    plt.figure(figsize=(20, 20))
    plt.imshow(results)
    plt.show()
```
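The index arithmetic of this mosaic is easy to get wrong, so here is a miniature version of the same layout with dummy constant tiles in place of `generate_pattern` output (the tile size and margin are shrunk purely for illustration):

```python
import numpy as np

# 8 x 8 tiles of size 4 with a 1-pixel margin, same start/end arithmetic
# as the full-size mosaic above.
size, margin = 4, 1
results = np.zeros((8 * size + 7 * margin, 8 * size + 7 * margin, 3))
for i in range(8):
    for j in range(8):
        tile = np.full((size, size, 3), i + j * 8)  # stand-in for filter_img
        h0 = i * size + i * margin
        v0 = j * size + j * margin
        results[h0:h0 + size, v0:v0 + size, :] = tile

print(results.shape)                          # → (39, 39, 3)
print(results[0, 0, 0], results[-1, -1, 0])   # → 0.0 63.0
```

The canvas is `8 * size + 7 * margin` on each side because eight tiles need only seven gaps between them; tile `(i, j)` shows filter `i + j * 8`, so the corners hold filters 0 and 63.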
```python
model.summary()
```
```
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, None, None, 3)     0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, None, None, 512)   0
=================================================================
Total params: 14,714,688
Trainable params: 14,714,688
Non-trainable params: 0
_________________________________________________________________
```