Paper Information:
Gong, Yunchao, and Svetlana Lazebnik. "Iterative Quantization: A Procrustean Approach to Learning Binary Codes." In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.
Motivation:
This paper addresses the problem of learning similarity-preserving binary codes for efficient retrieval in large-scale image collections.
Contributions:
The paper shows that the performance of PCA-based binary coding schemes can be greatly improved by simply rotating the projected data, and it proposes a natural and effective iterative quantization method for refining this rotation. Iterative quantization (ITQ) has connections to multi-class spectral clustering and to the orthogonal Procrustes problem, and it can be used both with unsupervised data embeddings such as PCA and with supervised embeddings such as canonical correlation analysis (CCA).
Technical summarization:
Unsupervised Code Learning:
The major novelty of the method is that it preserves the locality structure of the projected data by rotating the data so as to minimize the discretization error.
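Concretely, let X denote the n × d data matrix, W a d × c matrix with orthonormal columns, V = XW the projected data, and B the n × c matrix of binary codes. The quantization loss that ITQ minimizes (Eq. (2) in the paper) is

Q(B, R) = ||B − VR||_F^2,  where B ∈ {−1, +1}^(n×c) and R is an orthogonal c × c rotation,

i.e., the squared distance between the rotated data and the nearest vertices of the binary hypercube.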
Binary Quantization:
Let V = XW be the projected data, where the columns of W are the top c eigenvectors of the data covariance matrix X^T X (i.e., the PCA projection of the zero-centered data matrix X). Beginning with a random orthogonal initialization of the c × c rotation R, they adopt a k-means-like iterative quantization (ITQ) procedure to find a local minimum of the quantization loss Q(B, R) above. In each iteration, each data point is first assigned to the nearest vertex of the binary hypercube by setting B = sgn(VR); then, with this assignment fixed, R is updated to minimize the quantization loss, which is an orthogonal Procrustes problem solved in closed form via the SVD of B^T V. They alternate between updates to B and R for a fixed number of iterations (50 in the paper) to find a locally optimal solution; a sketch of the whole unsupervised pipeline follows.
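A minimal NumPy sketch of this pipeline (my illustration of the formulation above, not the authors' released code; the function names are mine):

```python
import numpy as np

def itq_rotation(V, n_iter=50, seed=0):
    """Alternating refinement of the rotation R minimizing ||B - VR||_F^2."""
    rng = np.random.default_rng(seed)
    c = V.shape[1]
    # Random orthogonal initialization of R via QR decomposition.
    R, _ = np.linalg.qr(rng.standard_normal((c, c)))
    for _ in range(n_iter):
        # Fix R: assign each point to the nearest binary-hypercube vertex.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Fix B: orthogonal Procrustes update. With the SVD
        # B^T V = S Omega S_hat^T, the optimal rotation is R = S_hat S^T.
        S, _, S_hat_T = np.linalg.svd(B.T @ V)
        R = (S @ S_hat_T).T
    return R

def itq_pca(X, c, n_iter=50):
    """Unsupervised ITQ: PCA projection, then rotation refinement.

    X: (n, d) zero-centered data; c: code length. Returns codes, W, R.
    """
    # Columns of W are the top-c right singular vectors of X, i.e. the
    # top-c eigenvectors of the covariance matrix X^T X.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    W = Vt[:c].T                    # (d, c) PCA projection
    V = X @ W                       # projected data (n, c)
    R = itq_rotation(V, n_iter)
    codes = np.where(V @ R >= 0, 1, -1)
    return codes, W, R
```

New points are then encoded with the same matrices, codes = sign(x @ W @ R), after centering x with the training mean.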
Leveraging Label Information:
Their method can be used with any orthogonal basis projection method. Therefore, a supervised dimensionality reduction method can be used to capture the semantic structure of the dataset.
They refine their codes in a supervised setting using Canonical Correlation Analysis (CCA), which has proven to be an effective tool for extracting a common latent space from two views and is robust to noise. The goal of CCA is to find projection directions for the feature and label vectors that maximize the correlation between the projected data; a sketch of this variant follows.
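A minimal sketch of this supervised variant, assuming scikit-learn's CCA as a stand-in for the paper's CCA solver and reusing the hypothetical itq_rotation helper from the sketch above:

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def itq_cca(X, Y, c, n_iter=50):
    """Supervised variant: CCA projection of the features, then ITQ rotation.

    X: (n, d) zero-centered features; Y: (n, k) 0/1 label-indicator matrix.
    CCA yields at most min(d, k) directions, so the code length c is
    bounded by the number of label dimensions.
    """
    cca = CCA(n_components=c).fit(X, Y)
    V = cca.transform(X)            # feature-side canonical projections (n, c)
    R = itq_rotation(V, n_iter)     # rotation loop from the sketch above
    codes = np.where(V @ R >= 0, 1, -1)
    return codes, cca, R
```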
My comment:
Since there are no ground-truth class labels for some datasets, they define the ground truth by Euclidean nearest neighbors. This may be useful when facing the same situation in one's own research; a sketch of such an evaluation protocol follows.
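A minimal sketch of such a protocol (my illustration with a fixed neighbor count k; the paper's exact construction, e.g. its distance threshold, may differ):

```python
import numpy as np

def euclidean_ground_truth(X_db, X_query, k=50):
    """Mark the k Euclidean nearest database points of each query as true neighbors."""
    d2 = ((X_query[:, None, :] - X_db[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(d2, axis=1)[:, :k]
    truth = np.zeros_like(d2, dtype=bool)
    np.put_along_axis(truth, nearest, True, axis=1)
    return truth

def hamming_precision_at_k(B_db, B_query, truth, k=50):
    """Precision@k when the database is ranked by Hamming distance to the query."""
    # For codes in {-1, +1}^c, Hamming distance = (c - <b1, b2>) / 2, so
    # ascending Hamming order equals descending inner-product order.
    ranked = np.argsort(-(B_query @ B_db.T), axis=1)[:, :k]
    return np.take_along_axis(truth, ranked, axis=1).mean()
```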
As we can see from the results, PCA really helps to preserve semantic consistency for the smallest code sizes. Therefore, it is important to apply dimensionality reduction to the data in order to capture its class structure.



