Computer Vision

2019

# R-CNN

## Overview

Regions with CNN features:

- Efficient Graph-Based Image Segmentation
  - uses a disjoint set to speed up the merge operation
- Selective Search
  - HOG (Histogram of Oriented Gradients)
  - multiple criteria (color, texture, size, shape) to merge regions
- AlexNet/VGG16
- R-CNN

Note that many descriptions are replicated directly from the original sources.

## Some Fundamental Concepts

### Batch Size

- Stochastic Gradient Descent: batch size = 1
- Batch Gradient Descent: batch size = size of the training set
- Mini-Batch Gradient Descent: 1 < batch size < size of the training set

### Regularization

A regression model that uses the L1 regularization technique is called Lasso Regression, and a model that uses L2 is called Ridge Regression.

#### Ridge Regularization

Ridge regression adds the "squared magnitude" of the coefficients as a penalty term to the loss function:

$$\sum_{i=1}^{n}\Big(y_i - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2 + \lambda\sum_{j=1}^{p}\beta_j^2$$

The first sum is an example of a loss function. Here $y_i$ is the target, $x_{ij}$ are the features, $\beta_j$ are the coefficients, and $\lambda$ is the penalty weight.

#### Lasso Regularization

Lasso Regression (Least Absolute Shrinkage and Selection Operator) adds the "absolute value of magnitude" of the coefficients as a penalty term to the loss function.
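The Ridge and Lasso penalty terms described above can be sketched in a few lines of plain Python. This is a minimal illustration, not a fitting routine: it assumes a sum-of-squares loss with no intercept, and the names `squared_error`, `ridge_loss`, `lasso_loss`, the toy data, and the penalty weight `lam` are all made up for the example.

```python
def squared_error(X, y, beta):
    """Sum of squared residuals: the un-penalized loss term."""
    total = 0.0
    for row, target in zip(X, y):
        prediction = sum(x_ij * b_j for x_ij, b_j in zip(row, beta))
        total += (target - prediction) ** 2
    return total


def ridge_loss(X, y, beta, lam):
    """Squared error plus the L2 penalty: lam * sum(beta_j ** 2)."""
    return squared_error(X, y, beta) + lam * sum(b * b for b in beta)


def lasso_loss(X, y, beta, lam):
    """Squared error plus the L1 penalty: lam * sum(|beta_j|)."""
    return squared_error(X, y, beta) + lam * sum(abs(b) for b in beta)


# Toy data (made up for illustration): two samples, two features.
X = [[1.0, 2.0], [3.0, 4.0]]
y = [5.0, 11.0]
beta = [1.0, 2.0]  # fits the toy data exactly, so the squared error is 0
lam = 0.5          # penalty weight (hypothetical value)

print(ridge_loss(X, y, beta, lam))  # only the L2 penalty remains: 0.5 * (1 + 4)
print(lasso_loss(X, y, beta, lam))  # only the L1 penalty remains: 0.5 * (1 + 2)
```

Because the coefficients fit the toy data exactly, the two printed values differ only through the penalty term, which makes the difference between the squared and absolute-value penalties easy to see.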

# SIFT

## Useful Materials

- Distinctive Image Features from Scale-Invariant Keypoints by David G. Lowe.
- SIFT (Scale-Invariant Feature Transform) on Towards Data Science.
- The SIFT (Scale Invariant Feature Transform) Detector and Descriptor.

## Notes

SIFT uses the DoG (Difference of Gaussians) to approximate the scale-normalized LoG (Laplacian of Gaussian):

$$D(x, y, \sigma) = \big(G(x, y, k\sigma) - G(x, y, \sigma)\big) * I(x, y)$$

where $G(x, y, \sigma)$ is the two-dimensional Gaussian function, $I(x, y)$ is the input image, and $*$ denotes convolution.

[need more consideration] After each octave, the Gaussian image is down-sampled by a factor of 2, by resampling the Gaussian image that has twice the initial value of $\sigma$, taking every second pixel in each row and column. And we start on the new octave with [...]. Since the image size is reduced to 1/4, the sigma for the next octave becomes [...], which is equal to [...]. To understand it, first consider this question: If the image size is reduced to 1/4, but the kernel size of
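The claim in the notes above that the DoG approximates the scale-normalized LoG can be checked numerically. The sketch below compares the difference of two Gaussian kernels with $(k - 1)\sigma^2 \nabla^2 G$ at a single point. It works on the kernels only, not on an image, and the values of `sigma`, `k`, and the sample point are illustrative assumptions; the function names are made up for this example.

```python
import math


def gaussian(x, y, sigma):
    """Two-dimensional isotropic Gaussian G(x, y, sigma)."""
    return math.exp(-(x * x + y * y) / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)


def laplacian_of_gaussian(x, y, sigma):
    """Analytic Laplacian of the two-dimensional Gaussian."""
    r2 = x * x + y * y
    return gaussian(x, y, sigma) * (r2 - 2 * sigma ** 2) / sigma ** 4


def difference_of_gaussian(x, y, sigma, k):
    """G(x, y, k * sigma) - G(x, y, sigma)."""
    return gaussian(x, y, k * sigma) - gaussian(x, y, sigma)


sigma, k = 1.6, 1.01  # illustrative values; k close to 1 keeps the approximation tight
x, y = 1.0, 0.5       # arbitrary sample point

dog = difference_of_gaussian(x, y, sigma, k)
approx = (k - 1) * sigma ** 2 * laplacian_of_gaussian(x, y, sigma)
rel_error = abs(dog - approx) / abs(approx)

print(dog, approx, rel_error)
```

Since convolution is linear, the same relation carries over when both kernels are convolved with an image. The approximation loosens as `k` moves away from 1.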
## Types of Noise

### Additive noise

Additive noise is independent of the image signal. The noisy image $g$ can be considered the sum of the ideal image $f$ and the noise $n$:

$$g = f + n$$

### Multiplicative noise

Multiplicative noise is often dependent on the image signal: the noise term enters multiplied by the signal rather than added independently of it.

### Gaussian noise

Gaussian noise, named after Carl Friedrich Gauss, is statistical noise whose probability density function (PDF) equals that of the normal distribution, also known as the Gaussian distribution. That is, the values the noise can take are Gaussian-distributed. The PDF of a Gaussian random variable $z$ is given by:

$$p(z) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(z - \mu)^2}{2\sigma^2}}$$

where $\mu$ is the mean and $\sigma$ is the standard deviation.

### Salt-and-pepper noise

Fat-tail distributed or "impulsive" noise is sometimes called salt-and-pepper noise or spike noise. An image containing salt-and-pepper noise will have dark pixels in bright regions and bright pixels in dark regions. The PDF of (bipolar) impulse noise is given by:

$$p(z) = \begin{cases} P_a & \text{for } z = a \\ P_b & \text{for } z = b \\ 0 & \text{otherwise} \end{cases}$$

If b > a, gray-level
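The additive Gaussian and salt-and-pepper models described above can be simulated with the standard library alone. This is a minimal sketch under stated assumptions: images are plain nested lists of gray levels, the function names and default probabilities are hypothetical, and the Gaussian result is not clipped back to the 0..255 range.

```python
import random


def add_gaussian_noise(image, mean=0.0, sigma=10.0, rng=random):
    """Additive model g = f + n, with n drawn from a normal distribution."""
    return [[pixel + rng.gauss(mean, sigma) for pixel in row] for row in image]


def add_salt_and_pepper(image, p_pepper=0.05, p_salt=0.05, low=0, high=255, rng=random):
    """Set each pixel to `low` with probability p_pepper, to `high` with
    probability p_salt, and leave it unchanged otherwise."""
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            u = rng.random()
            if u < p_pepper:
                new_row.append(low)
            elif u < p_pepper + p_salt:
                new_row.append(high)
            else:
                new_row.append(pixel)
        noisy.append(new_row)
    return noisy


rng = random.Random(0)                # fixed seed so the run is repeatable
flat = [[128] * 8 for _ in range(8)]  # uniform gray 8x8 test image

gaussian_noisy = add_gaussian_noise(flat, sigma=10.0, rng=rng)
impulse_noisy = add_salt_and_pepper(flat, rng=rng)
```

On a uniform gray image the Gaussian version perturbs every pixel slightly, while the impulse version leaves most pixels untouched and drives a few to the extreme values `low` and `high`.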