Wednesday, March 16, 2016

[AMMAI] [Lecture 03] - "Iterative Quantization: A Procrustean Approach to Learning Binary Codes"

Paper Information:
  Gong, Yunchao, and Svetlana Lazebnik. "Iterative Quantization: A Procrustean Approach to Learning Binary Codes." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011.

Motivation:
  This paper addresses the problem of learning similarity-preserving binary codes for efficient retrieval in large-scale image collections.


Contributions:
  This paper shows that the performance of PCA-based binary coding schemes can be greatly improved by simply rotating the projected data, and it proposes a very natural and effective iterative quantization method for refining this rotation. Iterative quantization (ITQ) has connections to multi-class spectral clustering and to the orthogonal Procrustes problem, and it can be used both with unsupervised data embeddings such as PCA and with supervised embeddings such as canonical correlation analysis (CCA).

Technical summarization:
  Unsupervised Code Learning:
    The major novelty of the method is that it preserves the locality structure of the projected data by rotating the data so as to minimize the discretization (quantization) error.
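
    Concretely, for projected data V = XW and binary codes B in {-1, +1}^(n x c), the quantization loss that ITQ minimizes (Eq. 2 in the paper) is

      Q(B, R) = || B - V R ||_F^2

    where R is an orthogonal c x c rotation matrix and ||.||_F denotes the Frobenius norm.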
 
  Binary Quantization:

 Beginning with a random initialization of R, they adopt a k-means-like iterative quantization (ITQ) procedure to find a local minimum of the quantization loss Q(B, R). In each iteration, each data point is first assigned to the nearest vertex of the binary hypercube, i.e., B = sgn(VR); then the orthogonal c x c rotation matrix R is updated to minimize the quantization loss given this assignment, which is the classic orthogonal Procrustes problem and is solved in closed form via an SVD of B^T V.
 
  V = XW, where the columns of W are the top c eigenvectors of the data covariance matrix X^T X (i.e., W is the PCA projection of the centered data matrix X).
  They alternate between updates to B and R for several iterations to find a locally optimal solution.
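
  A minimal NumPy sketch of this alternation (my own illustration, assuming a zero-centered data matrix X; not the authors' reference code):

  import numpy as np

  def itq(X, c, n_iter=50, seed=0):
      """PCA projection followed by iterative quantization (ITQ).

      X: (n, d) zero-centered data matrix; c: code length in bits.
      Returns binary codes B in {-1, +1}^(n x c), the rotation R, and W.
      """
      # PCA: right singular vectors of X = top eigenvectors of X^T X
      _, _, Vt = np.linalg.svd(X, full_matrices=False)
      W = Vt[:c].T                  # (d, c) projection matrix
      V = X @ W                     # projected data

      rng = np.random.default_rng(seed)
      R, _ = np.linalg.qr(rng.standard_normal((c, c)))  # random orthogonal init

      for _ in range(n_iter):
          # Fix R, update B: snap each point to the nearest hypercube vertex.
          B = np.sign(V @ R)
          B[B == 0] = 1
          # Fix B, update R: orthogonal Procrustes problem; with the SVD
          # B^T V = S Omega S_hat^T, the optimal rotation is R = S_hat S^T.
          S, _, S_hat_T = np.linalg.svd(B.T @ V)
          R = S_hat_T.T @ S.T
      B = np.sign(V @ R)
      B[B == 0] = 1
      return B, R, W

  With the learned W and R, a query q would then be encoded as sign(q @ W @ R).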
  Leveraging Label Information:
    Their method can be used with any orthogonal basis projection. Therefore, a supervised dimensionality reduction method can be used to capture the semantic structure of the dataset.
    They refine their codes in a supervised setting using Canonical Correlation Analysis (CCA), which has proven to be an effective tool for extracting a common latent space from two views and is robust to noise. The goal of CCA is to find projection directions for feature and label vectors to maximize the correlation between the projected data. 
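
    A sketch of this supervised variant, using scikit-learn's CCA as one possible implementation (the function name and the label-indicator encoding are my own assumptions, not the authors' code):

    import numpy as np
    from sklearn.cross_decomposition import CCA

    def cca_itq_codes(X, Y, c, n_iter=50, seed=0):
        """Supervised variant: embed with CCA, then binarize with ITQ.

        X: (n, d) feature matrix; Y: (n, k) binary label-indicator matrix;
        c: code length (must not exceed the label dimension k here).
        """
        # CCA finds projection directions for features and labels that
        # maximize the correlation between the two projected views.
        cca = CCA(n_components=c).fit(X, Y)
        V = cca.transform(X)          # projected (embedded) features

        rng = np.random.default_rng(seed)
        R, _ = np.linalg.qr(rng.standard_normal((c, c)))
        for _ in range(n_iter):
            B = np.sign(V @ R)
            B[B == 0] = 1
            # Orthogonal Procrustes update of R, as in the unsupervised case.
            S, _, S_hat_T = np.linalg.svd(B.T @ V)
            R = S_hat_T.T @ S.T
        return np.sign(V @ R)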
My comment:
  Since there are no ground-truth class labels for the dataset, they defined the ground truth using Euclidean nearest neighbors (a minimal sketch is given after these comments). This strategy may be useful when facing the same situation in one's own research.
  As we can see from the results, PCA really helps to preserve semantic consistency for the smallest code sizes. Therefore, it is important to apply dimensionality reduction to the data in order to capture its class structure.
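
  A minimal sketch of defining ground truth by Euclidean neighbors (the function name and the choice of k are my own illustration, not the paper's exact protocol):

  import numpy as np
  from sklearn.neighbors import NearestNeighbors

  def euclidean_ground_truth(X, queries, k=50):
      # Treat each query's k Euclidean nearest neighbors in the original
      # feature space as its "relevant" ground-truth set for evaluation.
      nn = NearestNeighbors(n_neighbors=k).fit(X)
      _, idx = nn.kneighbors(queries)
      return idx   # (n_queries, k) indices of ground-truth neighbors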
   



