{ "cells": [ { "cell_type": "markdown", "id": "799724ca", "metadata": {}, "source": [ "# Exercise 2" ] }, { "cell_type": "code", "execution_count": null, "id": "5784c7b5", "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "from pathlib import Path\n", "from seaborn import load_dataset" ] }, { "cell_type": "markdown", "id": "56a0fc42", "metadata": {}, "source": [ "## Task 1: List and dict comprehensions\n", "\n", "Assume that we have a dictionary that stores some quality measures for models of different sizes:" ] }, { "cell_type": "code", "execution_count": null, "id": "568d7727", "metadata": {}, "outputs": [], "source": [ "results = {\n", "    \"large\": {\"acc\": 0.9, \"f1\": 0.85},\n", "    \"medium\": {\"acc\": 0.83, \"f1\": 0.87},\n", "    \"small\": {\"acc\": 0.7, \"f1\": 0.5},\n", "}" ] }, { "cell_type": "markdown", "id": "c793df5e", "metadata": {}, "source": [ "Create a variable called `acc` that maps each model to its accuracy." ] }, { "cell_type": "code", "execution_count": null, "id": "280b285e", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "3b722008", "metadata": {}, "source": [ "Filter the `results` dictionary such that only the information for models with an accuracy over 0.8 is kept." ] }, { "cell_type": "code", "execution_count": null, "id": "84c9f6a1", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "b4ad4d8e", "metadata": {}, "source": [ "## Task 2: Create numpy arrays\n", "\n", "Create the following arrays:\n", "\n", "1. A three-dimensional array of shape `(2, 3, 4)` containing zeros\n", "2. A two-dimensional array with 2 rows and 5 columns that is equivalent to the list `[[0.1, 0.2, 0.3, 0.4, 0.5], [0.6, 0.7, 0.8, 0.9, 1.0]]`. Do not just type in the numbers.\n", "3. A 3 x 3 identity matrix\n", "4. A 3 x 4 empty array (using `np.empty`). Compare its entries with the ones your neighbor gets.
" ] }, { "cell_type": "code", "execution_count": null, "id": "41f09872", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "3e52e90e", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "bd5425ee", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "a36a37b2", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "8ea4e457", "metadata": {}, "source": [ "## Task 3: Numpy indexing\n", "\n", "Through the entire task, work with the arrays a and b from the lecture slides" ] }, { "cell_type": "code", "execution_count": null, "id": "cd3b3045", "metadata": {}, "outputs": [], "source": [ "a = np.arange(5)\n", "b = np.arange(12).reshape(4, 3)" ] }, { "cell_type": "markdown", "id": "da372b36", "metadata": {}, "source": [ "Select the middle element of a" ] }, { "cell_type": "code", "execution_count": null, "id": "5a722b9e", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "a0061ff4", "metadata": {}, "source": [ "Select all of a (but you need to put something into the square brackets)" ] }, { "cell_type": "code", "execution_count": null, "id": "42bf1b56", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "b4b6d349", "metadata": {}, "source": [ "Select the last two rows of b" ] }, { "cell_type": "code", "execution_count": null, "id": "ee0bc6e8", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "4d1d8086", "metadata": {}, "source": [ "Select the last two columns of b" ] }, { "cell_type": "code", "execution_count": null, "id": "748945bf", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "418f1e37", "metadata": {}, "source": [ "Select the last two columns of the last two rows of b" ] }, { "cell_type": "code", "execution_count": null, "id": "d4da771e", "metadata": {}, "outputs": [], 
"source": [] }, { "cell_type": "markdown", "id": "62682bd1", "metadata": {}, "source": [ "## Task 4: Numpy calculations" ] }, { "cell_type": "code", "execution_count": null, "id": "72f58069", "metadata": {}, "outputs": [], "source": [ "x = np.array([[0.5, 1.5], [2.5, 3.5]])\n", "y = np.diag([2, 3])\n", "z = np.array([2, 3])" ] }, { "cell_type": "markdown", "id": "d8d7fc19", "metadata": {}, "source": [ "Do the following calculations with the arrays x, y, z\n", "\n", "1. Do a matrix multiplication of the two arrays x and y\n", "2. Do an elementwise multiplication of the matrices x and y\n", "3. Do an elementwise addition x and z\n", "4. Do an elementwise addition of x and `z.reshape(-1, 1)`\n", "5. Describe the difference between the last two tasks.\n", "6. Take the exponent of the array z\n", "7. Sum the two rows in x" ] }, { "cell_type": "code", "execution_count": null, "id": "11eeca30", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "ff1d5110", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "7385d4cd", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "b82328e7", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "9542ba4d", "metadata": {}, "source": [] }, { "cell_type": "code", "execution_count": null, "id": "f8a00b69", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "id": "b34514fc", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "d3ca5c28", "metadata": {}, "source": [ "## Task 5: File paths" ] }, { "cell_type": "markdown", "id": "efecb1a6", "metadata": {}, "source": [ "Define a path called `ROOT` that leads to the directory in which you store all materials for this course. Define the path relative to this notebook and then convert it to an absolute path. 
Note: The solution is different for everyone and depends on the directory structure you chose." ] }, { "cell_type": "code", "execution_count": null, "id": "215b394b", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "57f1c16b", "metadata": {}, "source": [ "Define a path to this notebook and use it to prove that this notebook exists" ] }, { "cell_type": "code", "execution_count": null, "id": "1432e22c", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "69f8b814", "metadata": {}, "source": [ "## Task 6: Read and save DataFrames" ] }, { "cell_type": "code", "execution_count": null, "id": "469ed6e8", "metadata": {}, "outputs": [], "source": [ "iris = load_dataset(\"iris\")\n", "iris.head()" ] }, { "cell_type": "markdown", "id": "3ce13c47", "metadata": {}, "source": [ "Save this dataset in each of the file formats presented in the slides. Then reload each file into a DataFrame using the corresponding `read` function" ] }, { "cell_type": "code", "execution_count": null, "id": "e757c46d", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "56f19854", "metadata": {}, "source": [ "Look at the dataset you reloaded from the CSV file. What do you see?" ] }, { "cell_type": "markdown", "id": "17f7aade", "metadata": {}, "source": [] }, { "cell_type": "markdown", "id": "bbceda7a", "metadata": {}, "source": [ "## Task 7: Create Variables" ] }, { "cell_type": "markdown", "id": "44d44a9c", "metadata": {}, "source": [ "Add the square and the log of each numerical variable (i.e. all but \"species\") to the dataset. Use \"NAME_squared\" and \"log_NAME\" as naming conventions. Do not type in variable lists."
] }, { "cell_type": "code", "execution_count": null, "id": "01dbd37a", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "0999da32", "metadata": {}, "source": [ "## Task 8: Select data" ] }, { "cell_type": "markdown", "id": "643c711f", "metadata": {}, "source": [ "Select all rows where the species is setosa and the sepal_length is greater than or equal to 5" ] }, { "cell_type": "code", "execution_count": null, "id": "380489cc", "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "markdown", "id": "511844a9", "metadata": {}, "source": [ "## Task 9: Errors and Tracebacks" ] }, { "cell_type": "markdown", "id": "c2475316", "metadata": {}, "source": [ "Write me a message on Zulip in which you describe this error" ] }, { "cell_type": "code", "execution_count": null, "id": "06adc834", "metadata": { "scrolled": false }, "outputs": [], "source": [ "df = pd.DataFrame(data=np.ones(2, 2), columns=[\"a\", \"b\"])" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.3" } }, "nbformat": 4, "nbformat_minor": 5 }