
Exercise 7 (solution)#

I recommend running this notebook on Google Colab or on another machine with access to a GPU. Otherwise, the fine-tuning will take a very long time and you cannot compare the runtimes of the GPU and CPU versions.

[ ]:
import os

IS_ON_COLAB = bool(os.getenv("COLAB_RELEASE_TAG"))

if IS_ON_COLAB:
    !pip install transformers==4.28.0
    !pip install tokenizers datasets sentencepiece huggingface_hub[cli] accelerate
[ ]:
from transformers import pipeline
from datasets import load_dataset
from transformers import logging
import torch
import pandas as pd
import numpy as np
from sklearn.metrics import classification_report
from time import time
from transformers import AutoTokenizer
import scipy.special  # import the submodule explicitly; softmax is used in Task 7

logging.set_verbosity_error()

Task 1: Create a text classification pipeline#

The following task uses concepts from lecture 3 (working with pipeline). Revisit the lecture slides if you get stuck.

  1. Create a classifier using the pipeline function from transformers and the model "jsoutherland/distilbert-base-uncased-finetuned-emotion".

  2. Use the model to classify the validation split of the emotion dataset.

  3. Write down how long the classification took.

[ ]:
emotions = load_dataset("dair-ai/emotion", name="split")
[ ]:
classifier = pipeline(
    "text-classification",
    model="jsoutherland/distilbert-base-uncased-finetuned-emotion",
)
[ ]:
model_output = pd.DataFrame(classifier(emotions["validation"]["text"]))
model_output.head()

The classification took about 2 minutes on the CPU.
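Instead of watching the clock, the call can be wrapped in a small timer. The following is a minimal sketch: the helper name timed is made up for illustration, and a cheap stand-in workload replaces the real classifier call.

```python
from time import perf_counter


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    start = perf_counter()
    result = func(*args, **kwargs)
    elapsed = perf_counter() - start
    return result, elapsed


# Stand-in workload; in the notebook this would be something like
# timed(classifier, emotions["validation"]["text"])
result, seconds = timed(sum, range(1_000_000))
print(f"took {seconds:.3f} s")
```

The same helper can be reused in Task 4, so that the CPU and GPU runs are measured in the same way.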

Task 2: Calculate sklearn scores for huggingface models#

  1. Translate the predictions into numerical values, using the following encoding: "sadness": 0, "joy": 1, "love": 2, "anger": 3, "fear": 4, "surprise": 5.

  2. Create a classification report by comparing the predictions against the true labels from the validation dataset.

  3. If you have time left, create a confusion matrix or calculate other scores.

[ ]:
replacements = {
    "sadness": 0,
    "joy": 1,
    "love": 2,
    "anger": 3,
    "fear": 4,
    "surprise": 5,
}
# map each predicted label string to its integer class id
y_pred = model_output["label"].map(replacements).to_numpy()
y_pred
[ ]:
y_test = np.array(emotions["validation"]["label"])
y_test
[ ]:
print(classification_report(y_test, y_pred))
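For the optional confusion matrix, the counting can be sketched by hand with NumPy. The labels below are made up for illustration; in the notebook, y_test and y_pred would take their place.

```python
import numpy as np

labels = ["sadness", "joy", "love", "anger", "fear", "surprise"]

# Made-up labels for illustration; in the notebook use y_test and y_pred.
y_true_toy = np.array([0, 1, 1, 2, 3, 5, 4, 1])
y_pred_toy = np.array([0, 1, 2, 2, 3, 4, 4, 1])

# Rows are true classes, columns are predicted classes.
cm = np.zeros((len(labels), len(labels)), dtype=int)
np.add.at(cm, (y_true_toy, y_pred_toy), 1)
print(cm)

# Per-class recall: correct predictions divided by true examples per class.
support = cm.sum(axis=1)
recall = np.divide(
    cm.diagonal(), support, out=np.zeros(len(labels)), where=support > 0
)
print(dict(zip(labels, recall.round(2))))
```

sklearn.metrics.confusion_matrix(y_test, y_pred) uses the same convention (rows are true labels, columns are predictions), so it is the more convenient choice in the notebook itself.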

Task 3: Compare CPU and GPU speed#

  1. Go to Edit -> Notebook Settings and select a GPU accelerator.

  2. Create another random matrix of the same shape as a that lives on the GPU (if you have one).

  3. Measure the runtime of multiplying a with itself.

  4. Measure the runtime of multiplying the GPU version with itself.

  5. Calculate how much faster the GPU version was.

  6. If you have time left, try different matrix sizes, especially tiny matrices.

Note: For me, using %timeit did not work for the GPU version on Google Colab. Use time instead.

[ ]:
a = torch.randn(5_000, 5_000)
[ ]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device
[ ]:
a_gpu = torch.randn(5_000, 5_000).to(device)
[ ]:
start = time()
a @ a
cpu_time = time() - start
cpu_time
[ ]:
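if torch.cuda.is_available():
    torch.cuda.synchronize()  # finish pending GPU work before starting the clock
start = time()
a_gpu @ a_gpu
if torch.cuda.is_available():
    torch.cuda.synchronize()  # CUDA kernels run asynchronously; wait for the result
gpu_time = time() - start
gpu_time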
[ ]:
cpu_time / gpu_time
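For the optional comparison of different matrix sizes, a loop along the following lines could be used. This is only a sketch: the helper name matmul_seconds and the chosen sizes are arbitrary, and on a machine without a GPU both measurements run on the CPU.

```python
from time import perf_counter

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def matmul_seconds(n, device):
    """Time one n x n matrix product on the given device."""
    m = torch.randn(n, n, device=device)
    if device.type == "cuda":
        torch.cuda.synchronize()  # make sure the matrix is ready
    start = perf_counter()
    m @ m
    if device.type == "cuda":
        torch.cuda.synchronize()  # wait for the asynchronous kernel
    return perf_counter() - start


timings = {}
for n in (10, 100, 1_000):
    cpu = matmul_seconds(n, torch.device("cpu"))
    other = matmul_seconds(n, device)
    timings[n] = (cpu, other)
    print(f"n={n}: cpu {cpu:.6f} s, {device.type} {other:.6f} s")
```

For tiny matrices, the fixed cost of launching work on the GPU typically dominates, so the speedup tends to shrink or disappear. A single measurement is also noisy, so repeating each size a few times gives a more reliable picture.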

Task 4: Pipeline with GPU#

  1. Create a new pipeline with device="cuda:0" and re-run the classification.

  2. Compare the runtimes.

[ ]:
classifier = pipeline(
    "text-classification",
    model="jsoutherland/distilbert-base-uncased-finetuned-emotion",
    device="cuda:0" if torch.cuda.is_available() else None,
)
[ ]:
model_output = pd.DataFrame(classifier(emotions["validation"]["text"]))
model_output.head()

On the GPU, the classification ran 6 to 8 times faster than on the CPU.

Task 5: Loading a model#

  1. Load a distilbert-base-uncased model for sequence classification that can be fine-tuned on the emotions dataset. Put it on the GPU.

[ ]:
model_name = "distilbert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(model_name)


def tokenize(batch):
    return tokenizer(batch["text"], padding=True, truncation=True)


emotions_encoded = emotions.map(tokenize, batched=True, batch_size=None)
emotions_encoded.set_format(
    "torch",
    columns=["input_ids", "attention_mask", "label"],
)
emotions_encoded
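What padding=True and truncation=True do to a batch can be illustrated without the tokenizer. The sketch below uses made-up token ids and a hypothetical helper pad_batch; the padding id 0 is an assumption of this sketch, and the real tokenizer handles truncation more carefully.

```python
PAD_ID = 0  # assumed padding id for this sketch

# Made-up token id sequences of different lengths.
sequences = [[101, 1045, 2514, 102], [101, 102], [101, 1045, 102]]


def pad_batch(seqs, pad_id=PAD_ID, max_length=None):
    """Truncate to max_length, then pad every sequence to the longest one."""
    if max_length is not None:
        # Naive truncation; the real tokenizer also keeps special tokens in place.
        seqs = [s[:max_length] for s in seqs]
    longest = max(len(s) for s in seqs)
    input_ids = [s + [pad_id] * (longest - len(s)) for s in seqs]
    # 1 marks real tokens, 0 marks padding that the model should ignore.
    attention_mask = [[1] * len(s) + [0] * (longest - len(s)) for s in seqs]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = pad_batch(sequences)
print(batch["input_ids"])
print(batch["attention_mask"])
```

Because padding goes up to the longest sequence in the batch, passing batch_size=None to map (one batch per split) should give all examples within a split the same length.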
[ ]:
from transformers import AutoModelForSequenceClassification

num_labels = 6
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=num_labels
).to(device)

Task 6: Specifying the training options and training the model#

  1. Write a compute_metrics function or copy it from the slides.

  2. Specify the training arguments. Choose the same values as in the slides, but use only 1 epoch to make training faster.

  3. Create a Trainer instance.

  4. Train the model.

[ ]:
from sklearn.metrics import f1_score, accuracy_score


def compute_metrics(pred):
    logits, labels = pred
    preds = logits.argmax(axis=-1)
    f1 = f1_score(labels, preds, average="weighted")
    acc = accuracy_score(labels, preds)
    return {"accuracy": acc, "f1": f1}
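The core of compute_metrics is turning logits into class predictions and comparing them with the labels. A minimal sketch with made-up logits (4 examples, 6 classes) shows the argmax step and the accuracy calculation; the weighted F1 score is left to sklearn as in the function above.

```python
import numpy as np

# Made-up logits for 4 examples and 6 classes (rows are examples).
logits_toy = np.array([
    [2.0, 0.1, 0.0, -1.0, 0.3, 0.0],  # largest value at index 0
    [0.0, 3.0, 1.0, 0.0, 0.0, 0.0],   # largest value at index 1
    [0.2, 0.1, 0.0, 1.5, 0.0, 0.0],   # largest value at index 3
    [0.0, 0.5, 0.0, 0.0, 0.4, 0.1],   # largest value at index 1
])
labels_toy = np.array([0, 1, 3, 4])

# The predicted class is the index of the largest logit in each row.
preds_toy = logits_toy.argmax(axis=-1)

# Accuracy is the fraction of predictions that match the labels.
accuracy = (preds_toy == labels_toy).mean()
print(preds_toy, accuracy)
```

Three of the four toy predictions match their labels, so the accuracy is 0.75.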
[ ]:
from transformers import TrainingArguments

batch_size = 64
logging_steps = len(emotions_encoded["train"]) // batch_size

training_args = TrainingArguments(
    output_dir="results",
    optim="adamw_torch",
    per_device_train_batch_size=batch_size,
    num_train_epochs=1,
    load_best_model_at_end=True,
    metric_for_best_model="f1",
    evaluation_strategy="epoch",
    save_strategy="epoch",
    disable_tqdm=False,
    logging_steps=logging_steps,
)
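The value of logging_steps is chosen so that the training loss is logged about once per epoch. A worked example, assuming the train split contains 16,000 examples (check this with len(emotions_encoded["train"]) in the notebook):

```python
# Assumed size of the train split; verify it in the notebook.
num_train_examples = 16_000
batch_size = 64

# Number of full batches per epoch, used as the logging interval.
logging_steps = num_train_examples // batch_size
print(logging_steps)
```

Under that assumption, one epoch on a single device consists of 250 optimizer steps, so the loss is logged once, at the end of the epoch.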
[ ]:
from transformers import Trainer

trainer = Trainer(
    model=model,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=emotions_encoded["train"],
    eval_dataset=emotions_encoded["validation"],
)
[ ]:
trainer.train()

Task 7: Use the fine-tuned model#

  1. Tokenize the text and run it through the model to get logits.

  2. Use scipy.special.softmax to turn the logits into probabilities.

  3. Plot the probabilities of each class (optional).

[ ]:
custom_text = "I am glad the class is almost over."
[ ]:
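input_tensor = tokenizer.encode(custom_text, return_tensors="pt").to(device)
model.eval()  # make sure dropout is disabled for inference
with torch.no_grad():  # no gradients are needed for a forward pass
    logits = model(input_tensor).logits.cpu()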
[ ]:
logits
[ ]:
# convert the tensor to a NumPy array before handing it to scipy
probs = scipy.special.softmax(logits.flatten().numpy())
probs
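To see what the softmax step computes, it can be written out in a few lines of NumPy. The logits below are made up for illustration and are not output of the fine-tuned model.

```python
import numpy as np

labels = ["sadness", "joy", "love", "anger", "fear", "surprise"]

# Made-up logits, one value per class.
logits_toy = np.array([0.5, 2.0, 0.1, -1.0, -0.5, -2.0])


def softmax(x):
    """Exponentiate and normalize so that the values sum to 1."""
    shifted = x - x.max()  # subtracting the maximum avoids overflow
    exp = np.exp(shifted)
    return exp / exp.sum()


probs_toy = softmax(logits_toy)
predicted = labels[int(probs_toy.argmax())]
print(probs_toy.round(3), predicted)
```

The class with the largest logit also has the largest probability, so the predicted label can be read off with argmax either before or after the softmax.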
[ ]:
labels = ["sadness", "joy", "love", "anger", "fear", "surprise"]
pd.Series(probs, index=labels).plot.barh()