Computer Vision Assignment
0 Overview
This assignment has two tasks in total, is graded out of 20 marks, and is worth 15% of your final mark for this course.
0.1 Objectives
The goal of this assignment is to develop and assess proficiency at mid-level image processing, including corner detection, and deep learning techniques, including domain adaptation for deep neural networks.
0.2 Permitted Python libraries
The Python libraries that you may use in this assignment are
• OpenCV (cv2);
• NumPy;
• Matplotlib;
• scikit-image (skimage);
• scikit-learn (sklearn);
• SciPy;
• Pillow (PIL);
• PyTorch (torch); and
• torchvision.
Use of other Python libraries will result in a score of 0 for the task in which the library was used.
0.3 Advice
1. Before writing your report, we recommend you watch the video “how to write a good lab report” on Wattle. The markers do not want to just see a collection of experimental results, but rather a coherent explanation and interpretation of the results, and key parts of your source code with detailed comments. Note that these are suggestions from a previous version of the course, not all of which apply to this assignment. In particular, they do not override any requirements in this document.
2. The requirements for submission are at the end of this document. Please ensure your submission meets the requirements.
3. The report is to be uploaded to the Wattle site before the due time. This course does not allow late submissions. That is, any submission after the deadline will receive a mark of 0.
4. This is an individual assignment. All students must work individually when coding and writing the report.
0.4 Academic Integrity
You are expected to comply with the university policy on academic integrity and plagiarism. Please ensure you appropriately cite any resources that you use (lecture notes, papers, online documents, code), and complete all tasks independently. All academic integrity violations will be reported and can result in significant penalties. More importantly, working through the problems yourself will help you learn the material and will set you up for success in the final exam.
1 Task 1: Harris Corner Detection (5 marks)
Read the partially-completed corner detection code in the file “harris.py”, as shown in Figure 1. Then perform the following tasks.
1. Complete the missing sections in “harris.py” or in a Jupyter Notebook after transferring the contents of “harris.py”. Write the necessary functions with appropriate function signatures. (1 mark)
2. Add a comment on line #53 (starting “g = fspecial(”), and to every non-empty line of your solution after line #60, to make your code readable. (0.5 marks)
3. Test this function on the first four provided test images (Harris-{1,2,3,4}.jpg). Display each image with the detected corners overlaid as circles or crosses. (0.5 marks)
Note: Please make sure that your code can be run successfully on a local machine and will generate these results. If your submitted code cannot replicate your results, you will receive a mark of zero.
4. Compare your results with those obtained from using the library function cv2.cornerHarris for each of the test images. (0.5 marks)
5. Implement an inverse image warping function that takes an image, an (inverted) transformation matrix, and the output image size as inputs, and returns the transformed image. (1 mark)
Note: For any inverse projection that does not lie on the original input image pixels, you should use bilinear interpolation to calculate the pixel values.
6. Select one of the four images (Harris-{1,2,3,4}.jpg), and rotate it by 0, 90, 180, and 270 degrees clockwise using your image warping function from the previous part. Then, apply your Harris corner detection algorithm to the resulting images. Record the coordinates of the detected corners, compare them across the rotations, and report your observations and explanations in your report. (0.5 marks)
7. Using Harris-5.jpg and Harris-6.jpg, in addition to the results already obtained, analyse and discuss the factors that affect the performance of Harris corner detection. Visualising the corner response scores may be helpful for this analysis. (1 mark)
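For reference when visualising the corner response scores, the standard Harris formulation is sketched below. It is an assumption that harris.py follows this form: the window weights w(x, y) (for example a Gaussian) and the constant k (commonly around 0.04 to 0.06) are not specified in this document, so check them against the provided code.

```latex
M = \sum_{(x,y) \in W} w(x,y)
\begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix},
\qquad
R = \det(M) - k \, \bigl(\operatorname{trace}(M)\bigr)^2
```

Here I_x and I_y are the image derivatives. R is large and positive at corners, negative along edges, and close to zero in flat regions.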
In your PDF report, in addition to the text of your report, also include your complete source code with detailed comments (as per part 2) and display your corner detection results and comparisons for each test image.
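The bilinear interpolation mentioned in the note to part 5 can be illustrated with a minimal NumPy sketch. The function name bilinear_sample, the (x, y) = (column, row) convention, and the clamping of coordinates to the image border are all assumptions made for illustration; they are not requirements of this assignment, and this is not a complete warping function.

```python
import numpy as np


def bilinear_sample(image, x, y):
    """Sample an image at a real-valued location (x = column, y = row).

    Hypothetical helper: coordinates outside the image are clamped to
    the border, which is only one of several possible conventions.
    """
    h, w = image.shape[:2]
    # Clamp so that the 2x2 neighbourhood stays inside the image.
    x = float(np.clip(x, 0, w - 1))
    y = float(np.clip(y, 0, h - 1))
    # Integer corners of the cell that contains (x, y).
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    # Fractional offsets inside the cell.
    dx, dy = x - x0, y - y0
    # Interpolate along x on the top and bottom rows, then along y.
    top = (1 - dx) * image[y0, x0] + dx * image[y0, x1]
    bottom = (1 - dx) * image[y1, x0] + dx * image[y1, x1]
    return (1 - dy) * top + dy * bottom
```

In an inverse warp, each output pixel would be mapped through the inverted transformation matrix to a location in the input image, and a function of this kind would supply the pixel value there.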
Figure 1: Code listing for harris.py.
2 Task 2: Domain Adaptation (10 marks)
For this task, your objective is to implement domain adaptation techniques using the pretrained ResNet-34 network available in the torchvision library. No custom network architecture implementation is required for this task. Nevertheless, you are required to develop your own code to load the dataset and to perform training and validation of your network.
In this task, you will be using a subset of the DomainNet dataset [1]. The specific domains selected for this task are Real and Sketch. We use the Real domain as the source domain and the Sketch domain as the target domain. To facilitate neural network training for students who do not have reliable access to GPUs, we have selected only ten classes from each domain: backpack, book, car, pizza, sandwich, snake, sock, tiger, tree, and watermelon. Images in this dataset do not have a uniform shape, so they will need to be resized to 224 × 224 before being fed into the network.
Download the dataset zip file from here or execute the dataset_downloader.py script.
1. Complete the following preparation steps.
(a) Download the dataset directly from here or execute the data downloading script dataset_downloader.py.
Note: There should be 3984 images in the real_train folder, 1712 in the real_test folder, 1956 in the sketch_train folder, and 841 in the sketch_test folder. (0 marks)
(b) Implement a PyTorch Dataset class for the dataset and load the data with appropriate transformations.
Hint: Your transformation should at a minimum include resizing the images to 224 × 224, and normalising the images using the means (0.485, 0.456, 0.406) and standard deviations (0.229, 0.224, 0.225). You are allowed to add data augmentation techniques to the transformation. (0.5 marks)
(c) Load the pretrained ResNet-34 model from torchvision and modify its final fully-connected layer to accommodate the specific number of classes required for this task. (0.5 marks)
2. Implement the code for fine-tuning a model.
(a) Implement the train and test functions for fine-tuning.
Hint: The two functions could, for example, have the following input parameters: (src_loader, tgt_loader, model, optimizer). This is a suggestion, not a restriction on your function interface. You are free to include additional parameters as needed. (1 mark)
(b) Train the ResNet-34 model on the Sketch domain (use images in sketch_train), assuming all training labels are given. Then evaluate the model on the Sketch domain (use images in sketch_test). Include a screenshot of your training curve in your report. (1 mark)
(c) Train the ResNet-34 model on the Real domain (use images in real_train), assuming all training labels are given. Then evaluate the model on the Sketch domain (use images in sketch_test). Include a screenshot of your training curve in your report. (1 mark)
(d) Report the classification accuracy obtained from steps 2(b) and 2(c) on the test set (i.e., sketch_test). Compare and contrast these results and provide an explanation for any observations. (1 mark)
3. Implement the code for domain adaptation.
(a) You have two options for the domain adaptation loss function: the mean discrepancy loss or CORAL loss. Choose one of the two options in your implementation. In your code, add line-by-line comments and include these in your report. (1 mark)
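To illustrate the second option, the sketch below follows the commonly used Deep CORAL formulation: the squared Frobenius distance between the source and target feature covariance matrices, scaled by 1/(4d²). The function name coral_loss, the (batch, d) feature layout, and the exact normalisation are assumptions; the version expected in this course may differ.

```python
import torch


def coral_loss(source, target):
    """Hypothetical CORAL loss between two feature batches.

    source: (n_s, d) tensor, target: (n_t, d) tensor, with n_s, n_t >= 2.
    """
    d = source.size(1)

    def covariance(x):
        # Unbiased covariance of the feature dimensions over the batch.
        n = x.size(0)
        x = x - x.mean(dim=0, keepdim=True)
        return x.t() @ x / (n - 1)

    diff = covariance(source) - covariance(target)
    # Squared Frobenius norm, scaled by 1 / (4 d^2).
    return (diff ** 2).sum() / (4 * d * d)
```

A loss of this kind is typically added, with a weighting factor, to the classification loss computed on the labelled source batch.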