{ "cells": [ { "cell_type": "markdown", "id": "f99f87f3", "metadata": {}, "source": [ "# Exercise 9 (solution)\n", "\n", "If you get an error when importing the libraries, you might have to uncomment and execute the following cell." ] }, { "cell_type": "code", "execution_count": null, "id": "879a98d6", "metadata": {}, "outputs": [], "source": [ "# !pip install pillow==6.2.1" ] }, { "cell_type": "code", "execution_count": null, "id": "45ecaefc", "metadata": {}, "outputs": [], "source": [ "import torch\n", "from torch import nn\n", "from torch.utils.data import DataLoader\n", "import torchvision\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "id": "16832160", "metadata": {}, "source": [ "## Loading Data\n", "\n", "- Load the `training_data` of the MNIST dataset from torchvision\n", "- Load the `test_data` of the MNIST dataset from torchvision\n", "- Select the first x tensor from the training data\n", "\n", "In both cases, make sure that the raw image files are transformed to a PyTorch tensor."
] }, { "cell_type": "code", "execution_count": null, "id": "d381aca3", "metadata": {}, "outputs": [], "source": [ "training_data = torchvision.datasets.MNIST(\n", "    root=\"data\", train=True, download=True, transform=torchvision.transforms.ToTensor()\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "7aed6195", "metadata": {}, "outputs": [], "source": [ "test_data = torchvision.datasets.MNIST(\n", "    root=\"data\", train=False, download=True, transform=torchvision.transforms.ToTensor()\n", ")" ] }, { "cell_type": "code", "execution_count": null, "id": "75e6fba4", "metadata": {}, "outputs": [], "source": [ "training_data" ] }, { "cell_type": "code", "execution_count": null, "id": "4510f2a0", "metadata": {}, "outputs": [], "source": [ "x = training_data[0][0]\n", "x.shape" ] }, { "cell_type": "code", "execution_count": null, "id": "63607e1d", "metadata": {}, "outputs": [], "source": [ "fig = plt.imshow(x.squeeze(dim=0))" ] }, { "cell_type": "markdown", "id": "f6f19ff0", "metadata": {}, "source": [ "## Task 1: Specify a Model\n", "\n", "1. Write a model class that implements the same model as last week. To make it flexible, the init function should take the dimensions `n_in`, `n_hidden`, and `n_out` as arguments. Do not add a softmax activation at the end, i.e. stop after the last linear layer.\n", "2. Instantiate a model with the same dimensions as last week (`n_in` = 784, `n_hidden` = 16, `n_out` = 10)\n", "3. Evaluate the model on the first x tensor of the training data.\n", "4. Loop over the model parameters and print the shape of each parameter tensor. Make sure you understand why the shapes are what they are."
] }, { "cell_type": "code", "execution_count": null, "id": "b65185d9", "metadata": {}, "outputs": [], "source": [ "class NeuralNetwork(nn.Module):\n", "    def __init__(self, n_in, n_hidden, n_out):\n", "        super().__init__()\n", "        self.flatten = nn.Flatten()\n", "        self.all_layers = nn.Sequential(\n", "            nn.Linear(n_in, n_hidden),\n", "            nn.ReLU(),\n", "            nn.Linear(n_hidden, n_hidden),\n", "            nn.ReLU(),\n", "            nn.Linear(n_hidden, n_out),\n", "        )\n", "\n", "    def forward(self, x):\n", "        x = self.flatten(x)\n", "        logits = self.all_layers(x)\n", "        return logits" ] }, { "cell_type": "code", "execution_count": null, "id": "c1f05a5f", "metadata": {}, "outputs": [], "source": [ "n_in = 28 * 28\n", "n_hidden = 16\n", "n_out = 10\n", "device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n", "\n", "model = NeuralNetwork(n_in, n_hidden, n_out).to(device)\n", "model" ] }, { "cell_type": "code", "execution_count": null, "id": "ae8c7890", "metadata": {}, "outputs": [], "source": [ "# Move the input to the model's device before evaluating\n", "model(training_data[0][0].to(device))" ] }, { "cell_type": "code", "execution_count": null, "id": "a449f52d", "metadata": {}, "outputs": [], "source": [ "for p in model.parameters():\n", "    print(p.shape)" ] }, { "cell_type": "markdown", "id": "1a83bfd2", "metadata": {}, "source": [ "## Task 2: Define DataLoaders\n", "\n", "1. Define a `train_dataloader` and `test_dataloader`\n", "2. Loop over batches and print the shape of every 100th batch\n", "3. Calculate logits by evaluating the model on the X batch from the last iteration of the loop\n", "\n", "Both dataloaders should have a batch size of 64; the `train_dataloader` should be shuffled." ] }, { "cell_type": "code", "execution_count": null, "id": "2587670f", "metadata": {}, "outputs": [], "source": [ "batch_size = 64\n", "train_dataloader = DataLoader(\n", "    training_data, batch_size=batch_size, shuffle=True, drop_last=True\n", ")\n", "# Keep every test example so the accuracy can be divided by len(dataset)\n", "test_dataloader = DataLoader(test_data, batch_size=batch_size)" ] }, { "cell_type": "code", "execution_count": null, "id": "4b86eb6b", "metadata": {}, "outputs": [], "source": [ "for i, (X, y) in enumerate(train_dataloader):\n", "    if i % 100 == 0:\n", "        print(X.shape)" ] }, { "cell_type": "code", "execution_count": null, "id": "f391efd7", "metadata": {}, "outputs": [], "source": [ "logits = model(X.to(device))\n", "logits.shape" ] }, { "cell_type": "markdown", "id": "0f1541f3", "metadata": {}, "source": [ "## Task 3: Instantiate a loss function\n", "\n", "1. Create an instance of PyTorch's `nn.CrossEntropyLoss` loss function\n", "2. Evaluate the loss function on the logits and labels `y` of the last batch from Task 2" ] }, { "cell_type": "code", "execution_count": null, "id": "85d611ab", "metadata": {}, "outputs": [], "source": [ "loss_func = nn.CrossEntropyLoss()" ] }, { "cell_type": "code", "execution_count": null, "id": "c7d55429", "metadata": {}, "outputs": [], "source": [ "loss_func(logits, y.to(device))" ] }, { "cell_type": "markdown", "id": "8717ed78", "metadata": {}, "source": [ "## Task 4: Instantiate an optimizer" ] }, { "cell_type": "code", "execution_count": null, "id": "8ac52cfb", "metadata": {}, "outputs": [], "source": [ "learning_rate = 0.1\n", "optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)" ] }, { "cell_type": "markdown", "id": "808edea4", "metadata": {}, "source": [ "## Task 5: Write a function for the inner training loop\n", "\n", "Write a function called `train_loop` that takes the following arguments:\n", "- dataloader: The dataloader instance used in training\n", "- model: A model instance\n", "- loss_fn: A loss function\n", "- optimizer: An instance of a PyTorch optimizer\n", "\n", "The function 
should then perform the following steps:\n", "1. Set the model to training mode\n", "2. Loop over batches in the dataset\n", "3. Evaluate the model and loss function\n", "4. Backpropagate the gradients\n", "5. Do an optimizer step\n", "6. Zero the gradients\n", "7. Print the loss every 200 batches\n", "\n", "Evaluate the function once on the inputs you already defined in earlier tasks to make sure that it works." ] }, { "cell_type": "code", "execution_count": null, "id": "5670f93b", "metadata": {}, "outputs": [], "source": [ "def train_loop(dataloader, model, loss_fn, optimizer):\n", "    # Set the model to training mode (best practice)\n", "    model.train()\n", "    for i, (X, y) in enumerate(dataloader):\n", "        # Move the batch to the model's device (global `device` from Task 1)\n", "        X, y = X.to(device), y.to(device)\n", "        pred = model(X)\n", "        loss = loss_fn(pred, y)\n", "        loss.backward()\n", "        optimizer.step()\n", "        optimizer.zero_grad()\n", "\n", "        if i % 200 == 0:\n", "            print(f\"Loss after {i} batches: {loss.item()}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "e49ea7d2", "metadata": {}, "outputs": [], "source": [ "train_loop(train_dataloader, model, loss_func, optimizer)" ] }, { "cell_type": "markdown", "id": "dbf9070a", "metadata": {}, "source": [ "## Task 6: Write a function for the inner test loop\n", "\n", "Write a function called `test_loop` that takes the following arguments:\n", "- dataloader: The dataloader for the testing data\n", "- model: A model instance\n", "\n", "The function should do the following steps:\n", "- Set the model to eval mode\n", "- Initialize `correct` to zero\n", "- Start a `torch.no_grad` context\n", "- Loop over batches in the dataloader\n", "- Evaluate the model, and add the number of correct examples in the batch to `correct`\n", "- Calculate the accuracy by dividing `correct` by the length of the dataset" ] }, { "cell_type": "code", "execution_count": null, "id": "6446633e", "metadata": {}, "outputs": [], "source": [ "def test_loop(dataloader, model):\n", "    # Set the model to evaluation mode (best practice)\n", "    model.eval()\n", "\n", "    correct = 0\n", "    with torch.no_grad():\n", "        for X, y in dataloader:\n", "            # Move the batch to the model's device (global `device` from Task 1)\n", "            X, y = X.to(device), y.to(device)\n", "            pred = model(X)\n", "            correct += (pred.argmax(dim=1) == y).to(torch.float).sum().item()\n", "\n", "    acc = correct / len(dataloader.dataset)\n", "    print(f\"Test Accuracy: {acc}\")" ] }, { "cell_type": "code", "execution_count": null, "id": "17009bf0", "metadata": {}, "outputs": [], "source": [ "test_loop(test_dataloader, model)" ] }, { "cell_type": "markdown", "id": "a8aacadb", "metadata": {}, "source": [ "## Task 7: Run the outer loop with different optimizers\n", "\n", "At the beginning of the next cell we create new instances of the model, optimizer, etc., so that every run of the cell starts from a freshly initialized model. Otherwise, a second run would continue where the first one left off.\n", "\n", "1. Experiment with tuning parameters to get a better accuracy\n", "2. Replace SGD with Adam or another optimizer and see if that speeds up the optimization or gives you better results" ] }, { "cell_type": "code", "execution_count": null, "id": "5f48140a", "metadata": {}, "outputs": [], "source": [ "# training hyperparameters\n", "n_epochs = 3\n", "batch_size = 64\n", "learning_rate = 0.01\n", "\n", "# initialization\n", "model = NeuralNetwork(n_in, n_hidden, n_out).to(device)\n", "loss_func = nn.CrossEntropyLoss()\n", "optimizer = torch.optim.Adam(\n", "    model.parameters(),\n", "    lr=learning_rate,\n", ")\n", "train_dataloader = DataLoader(\n", "    training_data,\n", "    batch_size=batch_size,\n", "    shuffle=True,\n", "    drop_last=True,\n", ")\n", "# No drop_last here: test_loop divides by the full dataset length\n", "test_dataloader = DataLoader(\n", "    test_data,\n", "    batch_size=batch_size,\n", ")\n", "\n", "# training loop\n", "for t in range(n_epochs):\n", "    print(f\"Epoch {t+1}\\n-------------------------------\")\n", "    train_loop(train_dataloader, model, loss_func, optimizer)\n", "    test_loop(test_dataloader, model)\n", "print(\"Done!\")" ] }, { "cell_type": "code", "execution_count": null, "id": 
"6a47ee46", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" } }, "nbformat": 4, "nbformat_minor": 5 }