CSE559A Lecture 3
Image formation
Degrees of Freedom
Impact of translation of camera
Projection of a vanishing point or projection of a point at infinity is invariant to translation.
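The invariance can be checked numerically. The following is a minimal sketch, assuming the standard pinhole model `P = K [R | t]`; the intrinsics `K`, the direction `d`, and the helper `project` are hypothetical values chosen for illustration, not taken from the lecture. A point at infinity has homogeneous coordinate 0, so the translation column of `P` never contributes to its projection.

```python
import numpy as np

def project(K, R, t, X_h):
    """Project a homogeneous world point X_h (4,) with P = K [R | t]."""
    P = K @ np.hstack([R, t.reshape(3, 1)])
    x = P @ X_h
    return x[:2] / x[2]  # divide by the homogeneous coordinate

# Hypothetical intrinsics: focal length 500 px, principal point (320, 240)
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t_a = np.array([0.0, 0.0, 0.0])
t_b = np.array([3.0, -1.0, 7.0])

# Point at infinity along direction d: last homogeneous coordinate is 0
d = np.array([1.0, 0.5, 2.0])
X_inf = np.append(d, 0.0)
v_a = project(K, R, t_a, X_inf)
v_b = project(K, R, t_b, X_inf)   # same pixel as v_a

# A finite point, for contrast: its projection does move with the camera
X_fin = np.append(d, 1.0)
p_a = project(K, R, t_a, X_fin)
p_b = project(K, R, t_b, X_fin)
```

Rotating the camera (changing `R`) does move the vanishing point; only translation leaves it fixed.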
Recover world coordinates from pixel coordinates
Key issue: where is the world origin? Suppose
Projective Geometry
Orthographic Projection
Special case of perspective projection when
- Distance from the center of projection to the image plane is infinite
- Also called parallel projection
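- Projection matrix (in homogeneous coordinates) is $\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$, which keeps $x$ and $y$ and discards $z$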
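To make the "parallel projection" idea concrete, here is a minimal sketch assuming the common homogeneous form of the orthographic matrix, which keeps x and y and drops z. The names `P_ortho` and `project_orthographic` are illustrative, not from the lecture. Two points that differ only in depth land on the same image location.

```python
import numpy as np

# Orthographic projection in homogeneous coordinates: keep x and y, drop z
P_ortho = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0],
                    [0.0, 0.0, 0.0, 1.0]])

def project_orthographic(X):
    """Project a 3D point X = (x, y, z) to 2D with P_ortho."""
    X_h = np.append(np.asarray(X, dtype=float), 1.0)
    x = P_ortho @ X_h
    return x[:2] / x[2]

near = project_orthographic([2.0, 3.0, 1.0])
far = project_orthographic([2.0, 3.0, 100.0])   # same image point as `near`
```

Under perspective projection the far point would move toward the image center; here depth has no effect.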
Continued later in the course
Image processing foundations
Motivation for image processing
Representational Motivation:
- We need more than raw pixel values
Computational Motivation:
- Many image processing operations must be run across many locations in an image
- A loop in Python is slow
- High-level libraries reduce errors, developer time, and algorithm runtime
- Two common libraries:
  - Torch+Torchvision: Focus on deep learning
  - scikit-image: Focus on classical image processing algorithms
Operations on images
Point operations
Operations that are applied to one pixel at a time
Negative image
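Power law transformation: $s = c\,r^{\gamma}$, where $r$ is the input pixel value and $s$ is the output pixel value
- $c$ is a constant
- $\gamma$ is the gamma value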
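Both point operations can be written as single array expressions. This is a minimal sketch, assuming 8-bit grayscale images and intensities rescaled to [0, 1] before the power law is applied; the function names and the default values of `c` and `gamma` are illustrative.

```python
import numpy as np

def negative(image):
    """Negative of an 8-bit image: s = 255 - r."""
    return 255 - image

def power_law(image, c=1.0, gamma=1.0):
    """Power-law (gamma) transform s = c * r**gamma on intensities scaled to [0, 1]."""
    r = image.astype(np.float64) / 255.0
    s = c * np.power(r, gamma)
    return np.clip(s * 255.0, 0, 255).round().astype(np.uint8)

image = np.array([[0, 64], [128, 255]], dtype=np.uint8)
neg = negative(image)
brighter = power_law(image, gamma=0.5)   # gamma < 1 brightens mid-tones
darker = power_law(image, gamma=2.0)     # gamma > 1 darkens mid-tones
```

With `c = 1`, black and white stay fixed and only the mid-tones move.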
Contrast stretching
Apply a function $f$ to every pixel to stretch the range of pixel values: $s = f(r)$
- $f$ maps the narrow range of values present in the image onto a wider output range
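One simple choice of stretching function is a linear min-max rescale. The sketch below assumes that choice and an 8-bit output range; the lecture may have used a different (for example piecewise-linear) function, and `stretch_contrast` is an illustrative name.

```python
import numpy as np

def stretch_contrast(image, out_min=0.0, out_max=255.0):
    """Linearly map [image.min(), image.max()] onto [out_min, out_max]."""
    r = image.astype(np.float64)
    lo, hi = r.min(), r.max()
    if hi == lo:  # constant image: nothing to stretch
        return np.full(image.shape, out_min, dtype=np.uint8)
    s = (r - lo) / (hi - lo) * (out_max - out_min) + out_min
    return s.round().astype(np.uint8)

low_contrast = np.array([[100, 110], [120, 150]], dtype=np.uint8)
stretched = stretch_contrast(low_contrast)   # values now span 0..255
```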
Image histogram
- Histogram of an image is a plot of the frequency of each pixel value
Limitations:
- No spatial information
- No information about the relationship between pixels
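The limitations are easy to demonstrate: two images with different layouts can have identical histograms. A minimal sketch, assuming 8-bit grayscale images and using `np.bincount` to count pixel values (the helper name `histogram` is illustrative):

```python
import numpy as np

def histogram(image, levels=256):
    """Count how many pixels take each intensity value 0..levels-1."""
    return np.bincount(image.ravel(), minlength=levels)

# Two images with the same pixel values in different positions
a = np.array([[0, 0], [255, 255]], dtype=np.uint8)   # dark top row, bright bottom row
b = np.array([[0, 255], [255, 0]], dtype=np.uint8)   # checkerboard

hist_a = histogram(a)
hist_b = histogram(b)   # identical to hist_a although the images differ
```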
Linear filtering in spatial domain
Operations that are applied to a neighborhood at each position
Used to:
- Enhance image features
  - Denoise, sharpen, resize
- Extract information about image structure
  - Edge detection, corner detection, blob detection
- Detect image patterns
  - Template matching
  - Convolutional Neural Networks
Image filtering
At each pixel, take the dot product of the kernel with the image neighborhood centered on that pixel
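
    import numpy as np

    def filter2d(image, kernel):
        """
        Apply a 3x3 filter to a 2D image. Do not use this in practice.
        """
        padded = np.pad(image, 1)  # zero-pad so border pixels have a full 3x3 neighborhood
        out = np.zeros(image.shape, dtype=float)  # separate output so the input is not overwritten mid-loop
        for i in range(image.shape[0]):
            for j in range(image.shape[1]):
                # elementwise product, then sum (np.dot on 2D arrays would be a matrix product)
                out[i, j] = np.sum(kernel * padded[i:i+3, j:j+3])
        return out

Computational cost: $O(k^2 mn)$, assuming $k \times k$ is the size of the kernel and $m$ and $n$ are the dimensions of the image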
Do not use this in practice, use built-in functions instead.
Box filter
Smooths the image
Identity filter
Does not change the image
Sharpening filter
Enhances the image edges
Vertical edge detection
Detects vertical edges
Horizontal edge detection
Detects horizontal edges
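The kernels themselves are missing from these notes, so the sketch below uses common textbook choices as an assumption: a 3x3 mean for the box filter, a unit impulse for the identity, `2 * identity - box` for sharpening, and Sobel kernels for the two edge detectors. The helper `correlate2d` is an illustrative NumPy-only implementation with zero padding, not a library function.

```python
import numpy as np

def correlate2d(image, kernel):
    """Cross-correlate a 2D image with an odd-sized kernel, zero-padding the border."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image.astype(np.float64), ((ph, ph), (pw, pw)))
    out = np.zeros(image.shape, dtype=np.float64)
    h, w = image.shape
    for di in range(kh):          # loop over kernel entries, not pixels
        for dj in range(kw):
            out += kernel[di, dj] * padded[di:di + h, dj:dj + w]
    return out

box = np.full((3, 3), 1.0 / 9.0)          # local average
identity = np.zeros((3, 3))
identity[1, 1] = 1.0                      # unit impulse
sharpen = 2.0 * identity - box            # boost the pixel, subtract the local average
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])    # responds to vertical edges
sobel_y = sobel_x.T                       # responds to horizontal edges

# Test image: dark left half, bright right half (one vertical edge)
image = np.zeros((6, 6))
image[:, 3:] = 1.0

unchanged = correlate2d(image, identity)
vertical_response = correlate2d(image, sobel_x)     # strong along the edge
horizontal_response = correlate2d(image, sobel_y)   # zero away from the border
```

Responses near the image border depend on the padding choice, so only interior pixels are meaningful here.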
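Key properties:
- Linear (additive in the filter):
filter(I,f_1+f_2)=filter(I,f_1)+filter(I,f_2)
- Homogeneous (scaling the filter by a constant a scales the output):
filter(I,af)=a·filter(I,f)
- Shift invariant (shifting the image shifts the output by the same amount):
filter(shift(I),f)=shift(filter(I,f))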
The next three properties hold when filtering is convolution, written * (they do not hold in general for correlation):
- Commutative:
f_1*f_2=f_2*f_1
- Associative:
f_1*(f_2*f_3)=(f_1*f_2)*f_3
- Distributive:
f_1*(f_2+f_3)=f_1*f_2+f_1*f_3
- Identity:
filter(I,f_0)=I, where f_0 is the unit impulse (1 at the center, 0 elsewhere)
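These properties can be checked numerically. The following is a minimal sketch, assuming 1D convolution with `np.convolve` on a single image row as a stand-in for 2D filtering; the signal and the filters are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.standard_normal(32)     # one image row, as a 1D stand-in
f1 = rng.standard_normal(5)
f2 = rng.standard_normal(5)
a = 2.5

conv = np.convolve  # 'full' mode by default

linear = np.allclose(conv(signal, f1 + f2), conv(signal, f1) + conv(signal, f2))
homogeneous = np.allclose(conv(signal, a * f1), a * conv(signal, f1))
commutative = np.allclose(conv(f1, f2), conv(f2, f1))
associative = np.allclose(conv(conv(signal, f1), f2), conv(signal, conv(f1, f2)))

# Shift invariance: delaying the input by 3 samples delays the output by 3 samples
shifted_in = np.concatenate([np.zeros(3), signal])
shift_invariant = np.allclose(conv(shifted_in, f1)[3:], conv(signal, f1))

# Identity: the unit impulse leaves the signal unchanged
impulse = np.array([1.0])
identity = np.allclose(conv(signal, impulse), signal)
```

Associativity is what makes it possible to combine several filters into one kernel ahead of time and apply it in a single pass.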
Important filter:
Gaussian filter
Smooths the image (Gaussian blur)
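Common mistake: choosing a Gaussian that is too wide for the kernel window, so that it gets truncated. Visualize the filter before applying it and check that the values at the edge of the kernel are close to 0.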
Properties of Gaussian filter:
- Remove high frequency components
- Convolving a Gaussian with itself gives another Gaussian: two passes with width $\sigma$ equal one pass with width $\sigma\sqrt{2}$
- Separable kernel:
G(x,y)=G(x)G(y) (factorable into the product of two 1D Gaussian filters)
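A Gaussian kernel can be built and inspected before use. This is a minimal sketch, assuming the common rule of thumb of a window radius of about 3σ (an assumption, not something stated in these notes); `gaussian_1d` and `gaussian_2d` are illustrative helpers. It also shows what a truncated kernel looks like when the window is too small for the chosen σ.

```python
import numpy as np

def gaussian_1d(sigma, radius=None):
    """Sampled 1D Gaussian, normalized to sum to 1. Default radius is ceil(3 * sigma)."""
    if radius is None:
        radius = int(np.ceil(3 * sigma))
    x = np.arange(-radius, radius + 1)
    g = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return g / g.sum()

def gaussian_2d(sigma, radius=None):
    """2D Gaussian built as the outer product of two 1D Gaussians (separability)."""
    g = gaussian_1d(sigma, radius)
    return np.outer(g, g)

kernel = gaussian_2d(sigma=1.0)                 # 7 x 7 window
edge_ratio = kernel[0, 3] / kernel[3, 3]        # edge value relative to the peak: small

truncated = gaussian_2d(sigma=3.0, radius=3)    # window too small for this sigma
truncated_ratio = truncated[0, 3] / truncated[3, 3]   # edge value still a large fraction of the peak
```

Comparing `edge_ratio` and `truncated_ratio` is a quick stand-in for visualizing the filter: a well-sized kernel falls to nearly zero at its border.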
Filter Separability
- Separable filter:
f(x,y)=f(x)f(y)
Example:
Gaussian filter is separable
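For a $k \times k$ kernel and an $m \times n$ image, this reduces the computational cost of the filter from $O(k^2 mn)$ to $O(2kmn)$: one 1D pass along the rows followed by one 1D pass along the columns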
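The equivalence between one 2D pass and two 1D passes can be verified directly. The sketch below assumes zero padding at the border and uses a 5-tap binomial kernel as an approximation to a Gaussian; `filter_rows`, `filter_cols`, and `filter_2d` are illustrative NumPy-only helpers.

```python
import numpy as np

def filter_rows(image, f):
    """Convolve every row with the 1D kernel f (same output size, zero padding)."""
    return np.apply_along_axis(lambda r: np.convolve(r, f, mode="same"), 1, image)

def filter_cols(image, f):
    """Convolve every column with the 1D kernel f (same output size, zero padding)."""
    return np.apply_along_axis(lambda c: np.convolve(c, f, mode="same"), 0, image)

def filter_2d(image, kernel):
    """Direct 2D convolution with zero padding: k*k multiplies per pixel."""
    k = kernel.shape[0]
    p = k // 2
    padded = np.pad(image, p)
    flipped = kernel[::-1, ::-1]          # convolution flips the kernel
    out = np.zeros(image.shape, dtype=np.float64)
    h, w = image.shape
    for di in range(k):
        for dj in range(k):
            out += flipped[di, dj] * padded[di:di + h, dj:dj + w]
    return out

rng = np.random.default_rng(0)
image = rng.random((16, 16))
f = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0     # binomial approximation to a Gaussian

two_pass = filter_cols(filter_rows(image, f), f)   # 2k multiplies per pixel
direct = filter_2d(image, np.outer(f, f))          # k*k multiplies per pixel
```

The two results agree to floating-point precision, and the saving grows with kernel size: for a kernel of side 5 the per-pixel work drops from 25 multiplies to 10.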