I am making a convolutional Keras network that transforms one type of temporal signal to another. So far, I am happy with how it works, and so I decided to see what the network was doing behind the scenes. More specifically, I am trying to plot the 1D filter kernels of my convolutional layers (which are the first three layers in my network):
Model: "sequential_5"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d_15 (Conv1D) (None, 1500, 10) 760
_________________________________________________________________
max_pooling1d_15 (MaxPooling (None, 750, 10) 0
_________________________________________________________________
conv1d_16 (Conv1D) (None, 750, 10) 7510
_________________________________________________________________
max_pooling1d_16 (MaxPooling (None, 375, 10) 0
_________________________________________________________________
conv1d_17 (Conv1D) (None, 375, 10) 7510
_________________________________________________________________
max_pooling1d_17 (MaxPooling (None, 187, 10) 0
_________________________________________________________________
flatten_5 (Flatten) (None, 1870) 0
_________________________________________________________________
dense_25 (Dense) (None, 100) 187100
_________________________________________________________________
dropout_20 (Dropout) (None, 100) 0
_________________________________________________________________
dense_26 (Dense) (None, 100) 10100
_________________________________________________________________
dropout_21 (Dropout) (None, 100) 0
_________________________________________________________________
dense_27 (Dense) (None, 100) 10100
_________________________________________________________________
dropout_22 (Dropout) (None, 100) 0
_________________________________________________________________
dense_28 (Dense) (None, 1000) 101000
_________________________________________________________________
dropout_23 (Dropout) (None, 1000) 0
_________________________________________________________________
dense_29 (Dense) (None, 1500) 1501500
=================================================================
Total params: 1,825,580
Trainable params: 1,825,580
Non-trainable params: 0
_________________________________________________________________
My code:
from tensorflow import keras as k  # "k" is the Keras alias used below

nSamples = 1500  # length of the input signals, matching the summary above

model = k.Sequential([
    k.layers.Conv1D(filters=10, kernel_size=int(nSamples*0.05), padding="same", activation='relu', input_shape=(nSamples, 1)),
    k.layers.MaxPooling1D(pool_size=2),
    k.layers.Conv1D(filters=10, kernel_size=int(nSamples*0.05), padding="same", activation='relu'),
    k.layers.MaxPooling1D(pool_size=2),
    k.layers.Conv1D(filters=10, kernel_size=int(nSamples*0.05), padding="same", activation='relu'),
    k.layers.MaxPooling1D(pool_size=2),
    k.layers.Flatten(),
    # other layers
])
When trying to plot the kernels, I realised that I was confused by the output shapes of the second and third Conv1D layers. The first layer I understand: I have a time series with dimensions (None, 1500, 1), and after the first convolution with 10 different kernels, the output shape is (None, 1500, 10). Good. But the second Conv1D layer also has 10 different kernels, so I would expect the output shape after this stage to be (None, 750, 10, 10) and not (None, 750, 10) as shown. Why are the 10 kernels of the second Conv1D layer not being applied to each feature map generated by the previous layer? Similarly, I would have expected the output after the third Conv1D layer to be (None, 750, 10, 10, 10). Or is something else happening?
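To make sure I was reading the summary correctly, this is roughly how I checked the intermediate output shapes (a minimal sketch, assuming the model above is stored in model and a dummy NumPy batch can be fed straight through the built layers):

import numpy as np

# Dummy batch: one signal of length 1500 with a single channel
x = np.random.rand(1, 1500, 1).astype("float32")

# Feed the dummy input through the convolutional part of the model and
# print each intermediate shape (these match the summary above)
out = x
for layer in model.layers[:6]:
    out = layer(out)
    print(layer.name, out.shape)
# conv1d_15         (1, 1500, 10)
# max_pooling1d_15  (1, 750, 10)
# conv1d_16         (1, 750, 10)
# max_pooling1d_16  (1, 375, 10)
# conv1d_17         (1, 375, 10)
# max_pooling1d_17  (1, 187, 10)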
Furthermore, I found the following (my kernel length is 75 points):
model.layers[0].get_weights()[0].shape # (75, 1, 10)
model.layers[2].get_weights()[0].shape # (75, 10, 10)
model.layers[4].get_weights()[0].shape # (75, 10, 10)
The above seems to suggest that 100 kernels are applied in each of the second and third layers, 10 on each feature map, so again, why is the output shape (None, 750, 10)?
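For reference, this is roughly how I am trying to plot the kernels of the first layer (a minimal matplotlib sketch; the 2x5 subplot layout is just for illustration):

import matplotlib.pyplot as plt

# First Conv1D layer: weights of shape (75, 1, 10), i.e. 10 kernels of
# 75 taps each, applied to the single input channel
w = model.layers[0].get_weights()[0]

fig, axes = plt.subplots(2, 5, figsize=(15, 5), sharex=True)
for i, ax in enumerate(axes.flat):
    ax.plot(w[:, 0, i])  # the 75 taps of kernel i
    ax.set_title(f"kernel {i}")
plt.tight_layout()
plt.show()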