CSE510 Deep Reinforcement Learning (Lecture 8)

Convolutional Neural Networks

A related note from the computer vision course can be found here: CSE559A Lecture 10

Basically, a convolutional neural network (CNN) is a stack of different layers:

  • Convolutional layer
  • Non-linearity layer
  • Pooling layer (or downsampling layer)
  • Fully connected layer

Convolutional layer

Filtering: The math behind the matching.

  1. Line up the feature and the image patch.
  2. Multiply each image pixel by the corresponding feature pixel.
  3. Add them up.
  4. Divide by the total number of pixels in the feature.
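
The four steps above can be sketched in NumPy as follows. This is only a minimal illustration: the function names `match_score` and `filter_image`, the 3x3 diagonal feature, and the 4x4 toy image are all assumptions made for the example, not part of the lecture. Note that the kernel is not flipped, so strictly speaking this is cross-correlation, which is what most deep learning libraries compute under the name "convolution".

```python
import numpy as np

def match_score(patch, feature):
    """Steps 1-4: line up, multiply pixel by pixel, add up, divide by pixel count."""
    assert patch.shape == feature.shape
    return float(np.sum(patch * feature) / feature.size)

def filter_image(image, feature):
    """Slide the feature over every valid position and record the match score."""
    fh, fw = feature.shape
    out_h = image.shape[0] - fh + 1
    out_w = image.shape[1] - fw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = match_score(image[i:i + fh, j:j + fw], feature)
    return out

# Hypothetical 3x3 diagonal feature with pixel values in {-1, +1}.
feature = np.array([[ 1, -1, -1],
                    [-1,  1, -1],
                    [-1, -1,  1]], dtype=float)

# Toy 4x4 image: +1 on the main diagonal, -1 everywhere else.
image = -np.ones((4, 4))
np.fill_diagonal(image, 1.0)

scores = filter_image(image, feature)
print(scores)  # a score of 1.0 marks a perfect match with the feature
```

With pixel values restricted to -1 and +1, dividing by the number of pixels keeps every score in the range [-1, 1], so a score of 1 means every pixel of the patch agrees with the feature.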

The idea of a convolutional neural network is, in some sense, to let the network “learn” the right filters for a specific task.
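
To make "learning a filter" concrete, here is a small sketch under strong simplifying assumptions: a single 3x3 filter, a single random image, a mean squared error loss, and plain gradient descent. The "right" filter (`target_kernel`, a vertical-edge detector) is chosen by hand only so that there is something to recover; in a real network the target would come from the task's loss rather than from a known filter.

```python
import numpy as np

def correlate_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))

# Hypothetical "right" filter for the task: a vertical-edge detector.
target_kernel = np.array([[1, 0, -1],
                          [1, 0, -1],
                          [1, 0, -1]], dtype=float)
target_map = correlate_valid(image, target_kernel)

# Start from an all-zero filter and fit it by gradient descent.
learned_kernel = np.zeros((3, 3))
learning_rate = 0.1
for _ in range(500):
    error = correlate_valid(image, learned_kernel) - target_map
    # Gradient of the mean squared error w.r.t. each filter weight:
    # correlate the input image with the error map.
    grad = 2.0 * correlate_valid(image, error) / error.size
    learned_kernel -= learning_rate * grad

print(np.round(learned_kernel, 3))
```

The filter weights are the only trainable parameters here, which is the sense in which the network "learns" its filters instead of having them designed by hand.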

Non-linearity Layer

Tip

This is irrelevant to the lecture, but consider the following term:

“Bounded rationality”

  • Convolution is a linear operation
  • Non-linearity layer creates an activation map from the feature map generated by the convolutional layer
  • Consists of an activation function (an element-wise operation)
  • Rectified linear units (ReLUs) are advantageous over the traditional sigmoid or tanh activation functions: they are cheaper to compute and less prone to vanishing gradients
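
A minimal sketch of the element-wise activation, assuming a tiny hand-made feature map. The helper names (`relu`, `sigmoid`, and their gradient functions) are illustrative. The gradient comparison shows one commonly cited reason for preferring ReLU: its gradient stays at 1 for positive inputs, while the sigmoid gradient shrinks toward 0 for large inputs.

```python
import numpy as np

def relu(x):
    """Element-wise max(x, 0)."""
    return np.maximum(x, 0.0)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return (x > 0).astype(float)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)

# Hypothetical feature map produced by a convolutional layer.
feature_map = np.array([[-2.0, 0.5],
                        [ 3.0, -0.1]])

# The activation map has the same shape; negative responses are zeroed.
activation_map = relu(feature_map)
print(activation_map)

# Gradients at a large positive input: ReLU passes it through, sigmoid saturates.
large_input = np.array([10.0])
print(relu_grad(large_input), sigmoid_grad(large_input))
```

Without such a non-linearity, stacking convolutional layers would collapse into a single linear operation, since a composition of linear maps is itself linear.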

Pooling layer

Shrinking the Image Stack

  • Motivation: the activation maps can be large
  • Reducing the spatial size of the activation maps
    • Often after multiple stages of other layers (i.e., convolutional and non-linear layers)
  • Steps:
    1. Pick a window size (usually 2 or 3).
    2. Pick a stride (usually 2).
    3. Walk your window across your filtered images.
    4. From each window, take the maximum value.
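
The pooling steps can be sketched as below. The function name `max_pool2d` and the 4x4 example map are assumptions for illustration, and the sketch assumes no padding, so windows that would run past the edge are simply dropped.

```python
import numpy as np

def max_pool2d(activation_map, window=2, stride=2):
    """Walk a window across the map and keep the maximum of each window."""
    h, w = activation_map.shape
    out_h = (h - window) // stride + 1
    out_w = (w - window) // stride + 1
    out = np.empty((out_h, out_w), dtype=activation_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = activation_map[r:r + window, c:c + window].max()
    return out

# Hypothetical 4x4 activation map.
activation_map = np.array([[1, 3, 2, 0],
                           [4, 2, 1, 1],
                           [0, 1, 5, 6],
                           [2, 2, 7, 8]], dtype=float)

pooled = max_pool2d(activation_map, window=2, stride=2)
print(pooled)  # 2x2 map: [[4, 2], [2, 8]]
```

With a window of 2 and a stride of 2, each spatial dimension is halved, so the pooled map holds a quarter of the original values.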

Pros:

  • Reducing the computational requirements
  • Reducing the likelihood of overfitting

Cons:

  • Aggressive reduction can limit the depth of a network and ultimately limit the performance

Fully connected layer

  • Multilayer perceptron (MLP)
  • Mapping the activation volume from previous layers into a class probability distribution
  • Non-linearity is built into the neurons, instead of being a separate layer
  • Can be viewed as 1x1 convolution kernels
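
The 1x1 convolution view can be checked numerically. In this sketch the flattened input is treated as a 1x1 spatial map whose channels are the input features; the sizes and variable names are arbitrary choices for the example. A 1x1 convolution applies the same weight matrix at every pixel, and with only one pixel it reduces to the fully connected computation.

```python
import numpy as np

rng = np.random.default_rng(0)
num_inputs, num_outputs = 8, 3  # arbitrary sizes for the example

x = rng.standard_normal(num_inputs)
weights = rng.standard_normal((num_outputs, num_inputs))
bias = rng.standard_normal(num_outputs)

# Fully connected layer: one weighted sum per output neuron.
fc_out = weights @ x + bias

# The same input viewed as a (height, width, channels) volume of size 1x1.
volume = x.reshape(1, 1, num_inputs)

# A 1x1 convolution mixes channels independently at every pixel:
# out[h, w, o] = sum_c volume[h, w, c] * weights[o, c] + bias[o]
conv_out = np.einsum("hwc,oc->hwo", volume, weights) + bias

print(np.allclose(fc_out, conv_out.reshape(-1)))
```

Applying an activation function to `fc_out` would give the neuron outputs, which is where the built-in non-linearity enters.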

For classification, the output layer is a regular fully connected layer with a softmax non-linearity.

  • Output provides an estimate of the conditional probability of each class
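
A minimal sketch of such an output layer, assuming 16 input features and 4 classes with randomly initialized (untrained) weights. The names `softmax` and `output_layer` are illustrative. Subtracting the maximum logit before exponentiating is a standard trick for numerical stability and does not change the result.

```python
import numpy as np

def softmax(logits):
    """Turn a vector of scores into a probability distribution."""
    shifted = logits - np.max(logits)  # avoids overflow in exp
    exp = np.exp(shifted)
    return exp / np.sum(exp)

def output_layer(features, weights, bias):
    """Fully connected layer followed by softmax."""
    return softmax(weights @ features + bias)

rng = np.random.default_rng(0)
features = rng.standard_normal(16)      # activations from the previous layers
weights = rng.standard_normal((4, 16))  # one row of weights per class
bias = np.zeros(4)

probs = output_layer(features, weights, bias)
print(probs, probs.sum())  # non-negative entries that sum to 1
```

Each entry of `probs` is the network's estimate of the probability of one class given the input, and the predicted class is the index of the largest entry.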

Tip

The golden triangle of machine learning:

  • Data
  • Algorithm
  • Computation