From Immersive Visualization Lab Wiki

Replication Checkpoint #2

For Checkpoint 2 we are going to add deep learning to our toolkit, in order to eventually apply the Grad-CAM XAI techniques to the COCO dataset (but that will not happen until the final checkpoint).

PyTorch is, like TensorFlow, an open-source framework that builds computation graphs to perform numerical operations on tensor data. PyTorch builds on the Torch library and exposes its functionality through a Python API; unlike classic TensorFlow, its graphs are constructed dynamically as the code runs.
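The dynamic-graph idea can be seen in a minimal autograd sketch: operations on tensors are recorded as they execute, and backward() walks the recorded graph to compute gradients.

```python
import torch

# Define a scalar input that autograd should track.
x = torch.tensor(2.0, requires_grad=True)

# The graph for y = x^2 + 3x is built on the fly as this line runs.
y = x ** 2 + 3 * x

# backward() traverses the recorded graph and fills in x.grad.
y.backward()

print(y.item())       # 2**2 + 3*2 = 10.0
print(x.grad.item())  # dy/dx = 2x + 3 = 7.0
```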

If you aren't already familiar with PyTorch, we recommend that you start with the 60 Minute Blitz tutorial. This tutorial consists of four units, which walk you through all the steps required to train a small neural network to classify images. We recommend that you use the "Run in Google Colab" option, available via a button at the top of each tutorial unit.

You can develop and test your Checkpoint project in Google Colab or in a Jupyter Notebook if you prefer, but you need to submit your code as a Python project just like Checkpoint 1.

Code Portion

For this checkpoint you need to create a PyTorch application in Python to do the following things:

  • Import the COCO dataset using the torchvision package. This tutorial may be useful.
  • Train a convolutional neural network to perform image classification on the COCO dataset. You can follow the approach from the Blitz tutorial, but you will need to adapt it to COCO.
  • Implement a demo for your image classification algorithm. This can be similar to the Blitz tutorial, where they recognize objects and calculate the recognition accuracy. Or it can be something different.

Report Portion

  • Describe the classification problem you set out to solve.
  • Describe the approach of your solution: how did you solve the problem?
  • How accurate are the results of your image classification demo? (compare against ground truth)
  • Suggest at least three things one could try to get more accurate results (e.g., network design, training data, training parameters).
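For the accuracy comparison against ground truth, one straightforward metric is top-1 accuracy, computed the same way as the Blitz tutorial's test loop. The `top1_accuracy` helper below is a sketch, assuming the loader yields (images, labels) batches.

```python
import torch


def top1_accuracy(model, loader, device="cpu"):
    """Fraction of images whose top-scoring class matches the
    ground-truth label. Assumes `loader` yields (images, labels)
    batches, as in the Blitz tutorial's test loop."""
    model.eval()
    correct = total = 0
    with torch.no_grad():  # no gradients needed for evaluation
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds == labels.to(device)).sum().item()
            total += labels.size(0)
    return correct / total
```

Reporting this number on a held-out split of COCO, rather than the training split, gives the honest comparison against ground truth the checkpoint asks for.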