1 Part 1
Answer the following questions.
1. What do we mean by hand-crafted features? Give at least three examples of hand-crafted features.
2. What do we mean by learned features? Give at least three examples of learned
features.
3. Briefly discuss some of the advantages and disadvantages of hand-crafted features compared to learned features.
4. What are two ways in which one can formulate PCA?
5. The LDA reduces dimensionality from the original number of features to how many
features?
6. What are some of the limitations of LDA?
7. Give two drawbacks of PCA.
8. Many features in computer vision are represented in terms of histograms. Given two
histograms, what are some distance metrics that we can use to compare them? Give
at least three examples.
9. Why does the ℓ0-norm capture sparsity?
10. Why do we use the ℓ1-norm to approximate the ℓ0-norm?
11. What are some disadvantages of k-means clustering?
13. What is the difference between the Nearest Neighbor algorithm and the k-Nearest Neighbor
algorithm?
13. Briefly describe how visual bag-of-words features are extracted.
14. Briefly describe cross-validation.
15. What is the difference between sparse coding and dictionary learning?
2 Part 2
In this exercise, you will implement and evaluate the k-Nearest Neighbor algorithm that
we studied in class.
Grading: You will be graded based on the code you develop, plus your homework report
summarizing your findings. If possible, please write your report using LaTeX.
2.1 Extended YaleB dataset
1. The original images in the Extended YaleB dataset have been cropped and resized
to 32 × 32. This dataset has 38 individuals and around 64 near frontal images under
different illuminations per individual. Sample images from this dataset are shown
in Figure 1. Download the file YaleB-32×32.mat from the course locker. This file
contains variables ‘fea’ and ‘gnd’. Each row of ‘fea’ is a face and ‘gnd’ is the label.
Randomly select (m = 10, 20, 30, 40, 50) images per individual with labels to form the
training set, and use the remaining images in the dataset as the test set. Apply the
k-NN algorithm (with k = 1) on each of these five splits and record the corresponding
classification errors. Use the Euclidean distance metric, i.e., d(x, y) = ||x − y||_2.
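The experiment above (random per-individual split, then 1-NN with Euclidean distance) can be sketched as follows. This is only one possible outline, not the required solution: the function names are our own, and the commented loading snippet assumes the 'fea'/'gnd' variable names stated in the assignment.

```python
import numpy as np

def split_per_class(X, y, m, rng):
    """Randomly select m samples per class for training; the rest form the test set."""
    train_idx, test_idx = [], []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        train_idx.extend(idx[:m])
        test_idx.extend(idx[m:])
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

def nn_classify(train_X, train_y, test_X):
    """1-NN with Euclidean distance, vectorized via squared distances."""
    # ||t - x||^2 = ||x||^2 - 2 t.x + ||t||^2; the ||t||^2 term is constant
    # per test point, so it does not affect the argmin and is dropped.
    d2 = (train_X ** 2).sum(axis=1)[None, :] - 2.0 * test_X @ train_X.T
    return train_y[np.argmin(d2, axis=1)]

# Hypothetical usage on the dataset (file and variable names from the assignment):
# from scipy.io import loadmat
# data = loadmat('YaleB-32x32.mat')
# X, y = data['fea'].astype(float), data['gnd'].ravel()
# rng = np.random.default_rng(0)
# for m in (10, 20, 30, 40, 50):
#     Xtr, ytr, Xte, yte = split_per_class(X, y, m, rng)
#     err = np.mean(nn_classify(Xtr, ytr, Xte) != yte)
#     print(f"m={m}: classification error {err:.3f}")
```

Fixing the random seed as in the sketch makes the five splits reproducible for the report; averaging the error over several random seeds per m would give a more stable estimate.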