R-CNN
2019-07-07
Computer Vision
Overview
Regions with CNN features:
Efficient Graph Based Image Segmentation
use a disjoint set (union-find) to speed up the merge operation (see the sketch after this list)
Selective Search
HOG (Histogram of Oriented Gradients)
Multiple criteria (color, texture, size, shape) to merge regions
AlexNet/VGG16
R-CNN
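A minimal union-find sketch (my own illustration, not code from the original post or paper) of the disjoint-set structure that graph-based segmentation uses to merge regions quickly:

```python
class DisjointSet:
    """Union-find with path compression and union by rank.

    Graph-based segmentation merges pixel regions by repeatedly
    uniting the components of the two endpoints of an edge.
    """

    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path compression: point nodes closer to the root while searching.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        # Union by rank keeps the trees shallow.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return ra


# Example: merge pixels 0-1 and 2-3, then check membership.
ds = DisjointSet(4)
ds.union(0, 1)
ds.union(2, 3)
print(ds.find(0) == ds.find(1))  # True
print(ds.find(1) == ds.find(2))  # False
```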
Notice that many descriptions are replicated directly from the original sources.
Some Fundamental Concepts
Batch Size
Stochastic Gradient Descent. Batch Size = 1
Batch Gradient Descent. Batch Size = Size of Training Set
Mini-Batch Gradient Descent. 1 < Batch Size < Size of Training Set
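A rough sketch (my own example, assuming plain NumPy and a squared-error loss) showing that the three variants differ only in how many samples feed each parameter update:

```python
import numpy as np

def gradient_descent(X, y, batch_size, lr=0.01, epochs=10):
    """Linear-regression gradient descent; batch_size selects the variant.

    batch_size = 1          -> stochastic gradient descent
    batch_size = len(X)     -> batch gradient descent
    1 < batch_size < len(X) -> mini-batch gradient descent
    """
    w = np.zeros(X.shape[1])
    n = len(X)
    for _ in range(epochs):
        indices = np.random.permutation(n)
        for start in range(0, n, batch_size):
            batch = indices[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            # Gradient of the mean squared error on this batch.
            grad = 2.0 / len(Xb) * Xb.T @ (Xb @ w - yb)
            w -= lr * grad
    return w

# Example with synthetic data.
X = np.random.randn(100, 3)
y = X @ np.array([1.0, -2.0, 0.5])
w = gradient_descent(X, y, batch_size=16)  # mini-batch
```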
Regularization
A regression model that uses the L1 regularization technique is called Lasso Regression, and a model that uses L2 is called Ridge Regression.
Ridge Regularization
Ridge regression adds the "squared magnitude" of the coefficients as a penalty term to the loss function:

$$\sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda\sum_{j=1}^{p}\beta_j^2$$

The first sum is an example of a loss function (here, the residual sum of squares); $\lambda$ controls the strength of the penalty.
Lasso Regularization
Lasso Regression (Least Absolute Shrinkage and Selection Operator) adds the "absolute value of magnitude" of the coefficients as a penalty term to the loss function:

$$\sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda\sum_{j=1}^{p}|\beta_j|$$
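A small sketch (my own example, using scikit-learn's Ridge and Lasso estimators rather than anything from the original post) showing the effect of the two penalties side by side:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: only the first two features matter.
rng = np.random.RandomState(0)
X = rng.randn(200, 10)
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.randn(200)

# L2 penalty (lambda * sum of squared coefficients): shrinks all weights.
ridge = Ridge(alpha=1.0).fit(X, y)

# L1 penalty (lambda * sum of absolute coefficients): drives many
# weights exactly to zero, performing feature selection.
lasso = Lasso(alpha=0.1).fit(X, y)

print(np.round(ridge.coef_, 2))  # coefficients shrunk but non-zero
print(np.round(lasso.coef_, 2))  # most irrelevant coefficients exactly 0
```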
SIFT
2019-07-03
Computer Vision
Useful Materials
Distinctive Image Features from Scale-Invariant Keypoints[1] by David G. Lowe.
SIFT (Scale-Invariant Feature Transform)[2] on Towards Data Science.
The SIFT (Scale Invariant Feature Transform) Detector and Descriptor[3].
Notes
Uses DoG (Difference of Gaussian) to approximate Scale-normalized LoG (Laplacian of Gaussian)[4].
$$D(x, y, \sigma) = \big(G(x, y, k\sigma) - G(x, y, \sigma)\big) * I(x, y) = L(x, y, k\sigma) - L(x, y, \sigma)$$

where $G(x, y, \sigma)$ is the two-dimensional Gaussian function and $I(x, y)$ is the input image.
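A quick sketch (my own illustration, using scipy's Gaussian filter; not code from the post or from Lowe's paper) of approximating the scale-normalized LoG response at scale $\sigma$ with a difference of two Gaussian-blurred images:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def difference_of_gaussians(image, sigma, k=np.sqrt(2)):
    """Approximate the scale-normalized LoG at scale sigma with a DoG.

    D(x, y, sigma) = L(x, y, k*sigma) - L(x, y, sigma),
    where L is the image blurred by a Gaussian of the given sigma.
    """
    image = image.astype(np.float64)
    L1 = gaussian_filter(image, sigma)
    L2 = gaussian_filter(image, k * sigma)
    return L2 - L1

# Example: DoG response of a random "image" at sigma = 1.6.
img = np.random.rand(128, 128)
dog = difference_of_gaussians(img, sigma=1.6)
```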
[need more consideration] After each octave, the Gaussian image is down-sampled by a factor of 2: we resample the Gaussian image that has twice the initial value of $\sigma$ by taking every second pixel in each row and column, and we start the new octave with $\sigma$.
Since the image size is reduced to 1/4 (halved in each dimension), the sigma for the next octave becomes $2\sigma / 2$, which is equal to $\sigma$.
To understand it, first consider this question: If the image size is reduced to 1/4, but the kernel size of
Image Processing - Noise and denoise
2019-03-20
Computer Vision
Types of Noise
Additive noise
Additive noise is independent of the image signal. The image $g$ with noise can be considered as the sum of the ideal image $f$ and the noise $n$[1]:

$$g(x, y) = f(x, y) + n(x, y)$$
Multiplicative noise
Multiplicative noise is often dependent on the image signal. The relation of image and noise is often modeled as[1]:

$$g(x, y) = f(x, y) + f(x, y)\,n(x, y)$$
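A tiny NumPy sketch (my own illustration, assuming the simple models written above) contrasting the two noise types:

```python
import numpy as np

rng = np.random.default_rng(0)
f = np.full((64, 64), 0.5)           # ideal image, values in [0, 1]
n = rng.normal(0.0, 0.05, f.shape)   # zero-mean noise field

g_additive = f + n            # additive: noise independent of the signal
g_multiplicative = f + f * n  # multiplicative: noise scaled by the signal
```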
Gaussian noise
Gaussian noise, named after Carl Friedrich Gauss, is statistical noise having a probability density function (PDF) equal to that of the normal distribution (also known as the Gaussian distribution); i.e., the values that the noise can take on are Gaussian-distributed.
The PDF of a Gaussian random variable $z$ is given by[2]:

$$p(z) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(z-\mu)^2}{2\sigma^2}}$$

where $z$ represents the gray level, $\mu$ the mean value, and $\sigma$ the standard deviation.
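A brief sketch (my own example) of drawing Gaussian noise with NumPy, using the $\mu$ and $\sigma$ from the PDF above, and adding it to an image:

```python
import numpy as np

def add_gaussian_noise(image, mean=0.0, sigma=0.1):
    """Add Gaussian noise (zero-mean by default) to a float image in [0, 1]."""
    noise = np.random.normal(mean, sigma, image.shape)
    return np.clip(image + noise, 0.0, 1.0)

img = np.full((64, 64), 0.5)
noisy = add_gaussian_noise(img, sigma=0.05)
```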
Salt-and-pepper noise
Fat-tail distributed or "impulsive" noise is sometimes called salt-and-pepper noise or spike noise. An image containing salt-and-pepper noise will have dark pixels in bright regions and bright pixels in dark regions.[2]
The PDF of (bipolar) impulse noise is given by:

$$p(z) = \begin{cases} P_a & \text{for } z = a \\ P_b & \text{for } z = b \\ 0 & \text{otherwise} \end{cases}$$

If $b > a$, gray level $b$ appears as a light dot in the image, while level $a$ appears as a dark dot.
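A short sketch (my own example, with hypothetical probabilities p_a and p_b standing in for $P_a$ and $P_b$) of adding bipolar impulse noise to a float image:

```python
import numpy as np

def add_salt_and_pepper(image, p_a=0.02, p_b=0.02, a=0.0, b=1.0):
    """Set pixels to the dark level a with probability p_a ("pepper")
    and to the bright level b with probability p_b ("salt")."""
    noisy = image.copy()
    r = np.random.rand(*image.shape)
    noisy[r < p_a] = a                        # pepper: dark dots
    noisy[(r >= p_a) & (r < p_a + p_b)] = b   # salt: bright dots
    return noisy

img = np.full((64, 64), 0.5)
noisy = add_salt_and_pepper(img)
```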