
<FrameworkSwitchCourse {fw} />

# End-of-chapter quiz[[end-of-chapter-quiz]]

Test what you learned in this chapter!

1. The `emotion` dataset contains Twitter messages labeled with emotions. Search for it in the Hub, and read the dataset card. Which of these is not one of its basic emotions?

<Question choices={[ { text: "Joy", explain: "Try again — this emotion is present in that dataset!" }, { text: "Love", explain: "Try again — this emotion is present in that dataset!" }, { text: "Confusion", explain: "Correct! Confusion is not one of the six basic emotions.", correct: true }, { text: "Surprise", explain: "Surprise! Try another one!" } ]} />
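If you'd rather check this in code than on the dataset card, here is a minimal sketch. It assumes the dataset is still available under the short `emotion` Hub ID; if that name no longer resolves, try `dair-ai/emotion` instead.

```python
from datasets import load_dataset

# Hub ID taken from the question; swap in "dair-ai/emotion" if needed.
emotion = load_dataset("emotion")

# The label names list the six basic emotions -- no "confusion" among them.
print(emotion["train"].features["label"].names)
```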

2. Search for the `ar_sarcasm` dataset in the Hub. Which task does it support?

<Question choices={[ { text: "Sentiment classification", explain: "That's right! You can tell thanks to the tags.", correct: true }, { text: "Machine translation", explain: "That's not it — take another look at the dataset card!" }, { text: "Named entity recognition", explain: "That's not it — take another look at the dataset card!" }, { text: "Question answering", explain: "Alas, this question was not answered correctly. Try again!" } ]} />

3. How does the BERT model expect a pair of sentences to be processed?

<Question choices={[ { text: "Tokens_of_sentence_1 [SEP] Tokens_of_sentence_2", explain: "A [SEP] special token is needed to separate the two sentences, but that's not the only thing!" }, { text: "[CLS] Tokens_of_sentence_1 Tokens_of_sentence_2", explain: "A [CLS] special token is required at the beginning, but that's not the only thing!" }, { text: "[CLS] Tokens_of_sentence_1 [SEP] Tokens_of_sentence_2 [SEP]", explain: "That's correct!", correct: true }, { text: "[CLS] Tokens_of_sentence_1 [SEP] Tokens_of_sentence_2", explain: "A [CLS] special token is needed at the beginning as well as a [SEP] special token to separate the two sentences, but that's not all!" } ]} />
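You can verify this directly with the tokenizer, just as we did earlier in the chapter:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Passing two sentences makes the tokenizer build a sentence-pair input.
inputs = tokenizer("This is the first sentence.", "This is the second one.")

# The decoded tokens show the [CLS] ... [SEP] ... [SEP] pattern BERT expects.
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"]))
```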

{#if fw === 'pt'}

4. What are the benefits of the `Dataset.map()` method?

<Question choices={[ { text: "The results of the function are cached, so it won't take any time if we re-execute the code.", explain: "That is indeed one of the neat benefits of this method! It's not the only one, though...", correct: true }, { text: "It can apply multiprocessing to go faster than applying the function on each element of the dataset.", explain: "This is a neat feature of this method, but it's not the only one!", correct: true }, { text: "It does not load the whole dataset into memory, saving the results as soon as one element is processed.", explain: "That's one advantage of this method. There are others, though!", correct: true }, ]} />
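As a reminder, here is roughly how we applied `Dataset.map()` to MRPC in this chapter. `batched=True` and `num_proc` provide the speedups, and the results are cached on disk automatically, so re-running the cell is nearly instant:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

raw_datasets = load_dataset("glue", "mrpc")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize_function(example):
    # Receives a batch of examples at a time when batched=True
    return tokenizer(example["sentence1"], example["sentence2"], truncation=True)

# Cached after the first run; num_proc spreads the work across processes.
tokenized_datasets = raw_datasets.map(tokenize_function, batched=True, num_proc=4)
```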

5. What does dynamic padding mean?

<Question choices={[ { text: "It's when you pad the inputs for each batch to the maximum length in the whole dataset.", explain: "It does imply padding when creating the batch, but not to the maximum length in the whole dataset." }, { text: "It's when you pad your inputs when the batch is created, to the maximum length of the sentences inside that batch.", explain: "That's correct! The \"dynamic\" part comes from the fact that the size of each batch is determined at the time of creation, and all your batches might have different shapes as a result.", correct: true }, { text: "It's when you pad your inputs so that each sentence has the same number of tokens as the previous one in the dataset.", explain: "That's incorrect, plus it doesn't make sense to look at the order in the dataset since we shuffle it during training." }, ]} />
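A quick way to see dynamic padding in action with the `DataCollatorWithPadding` collator from this chapter:

```python
from transformers import AutoTokenizer, DataCollatorWithPadding

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

samples = [
    tokenizer("A short sentence."),
    tokenizer("A noticeably longer sentence that determines this batch's length."),
]
batch = data_collator(samples)

# Padded to the longest sample in *this* batch, not in the whole dataset.
print(batch["input_ids"].shape)
```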

6. What is the purpose of a collate function?

<Question choices={[ { text: "It ensures all the sequences in the dataset have the same length.", explain: "A collate function is involved in handling individual batches, not the whole dataset. Additionally, we're talking about generic collate functions, not DataCollatorWithPadding specifically." }, { text: "It puts together all the samples in a batch.", explain: "Correct! You can pass the collate function as an argument of a DataLoader. We used the DataCollatorWithPadding function, which pads all items in a batch so they have the same length.", correct: true }, { text: "It preprocesses the whole dataset.", explain: "That would be a preprocessing function, not a collate function." }, { text: "It truncates the sequences in the dataset.", explain: "A collate function is involved in handling individual batches, not the whole dataset. If you're interested in truncating, you can use the truncation argument of the tokenizer." } ]} />
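Building on the sketches above, the collate function is simply the callable you hand to the `DataLoader`, as we did in the full training section (`tokenized_datasets` and `data_collator` come from the previous sketches):

```python
from torch.utils.data import DataLoader

# Drop the string columns the model can't tensorize, and rename the label
# column to the name the model expects.
tokenized_datasets = tokenized_datasets.remove_columns(["sentence1", "sentence2", "idx"])
tokenized_datasets = tokenized_datasets.rename_column("label", "labels")

# The DataLoader hands each list of samples to the collate function, which
# pads them into one batch.
train_dataloader = DataLoader(
    tokenized_datasets["train"], shuffle=True, batch_size=8, collate_fn=data_collator
)
```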

7. What happens when you instantiate one of the `AutoModelForXxx` classes with a pretrained language model (such as `bert-base-uncased`) that corresponds to a different task than the one for which it was trained?

<Question choices={[ { text: "Nothing, but you get a warning.", explain: "You do get a warning, but that's not all!" }, { text: "The head of the pretrained model is discarded and a new head suitable for the task is inserted instead.", explain: "Correct. For example, when we used AutoModelForSequenceClassification with bert-base-uncased, we got warnings when instantiating the model. The pretrained head is not used for the sequence classification task, so it's discarded and a new head is instantiated with random weights.", correct: true }, { text: "The head of the pretrained model is discarded.", explain: "Something else needs to happen. Try again!" }, { text: "Nothing, since the model can still be fine-tuned for the different task.", explain: "The head of the pretrained model was not trained to solve this task, so we should discard the head!" } ]} />
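This is exactly what we saw when fine-tuning on MRPC:

```python
from transformers import AutoModelForSequenceClassification

# The pretrained heads are discarded and a new classification head is added
# with random weights -- which is what the warning about newly initialized
# weights is telling you.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
```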

8. What's the purpose of `TrainingArguments`?

<Question choices={[ { text: "It contains all the hyperparameters used for training and evaluation with the Trainer.", explain: "Correct!", correct: true }, { text: "It specifies the size of the model.", explain: "The model size is defined by the model configuration, not the class TrainingArguments." }, { text: "It just contains the hyperparameters used for evaluation.", explain: "In the example, we specified where the model and its checkpoints will be saved. Try again!" }, { text: "It just contains the hyperparameters used for training.", explain: "In the example, we used an evaluation_strategy as well, so this impacts evaluation. Try again!" } ]} />
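The one-liner from this chapter shows both sides at once: an output directory for checkpoints plus an evaluation setting:

```python
from transformers import TrainingArguments

# "test-trainer" is the output directory for the model and its checkpoints;
# evaluation_strategy controls when the Trainer evaluates (the argument has
# been renamed eval_strategy in more recent versions of 🤗 Transformers).
training_args = TrainingArguments("test-trainer", evaluation_strategy="epoch")
```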

9. Why should you use the 🤗 Accelerate library?

<Question choices={[ { text: "It provides access to faster models.", explain: "No, the 🤗 Accelerate library does not provide any models." }, { text: "It provides a high-level API so I don't have to implement my own training loop.", explain: "This is what we did with Trainer, not the 🤗 Accelerate library. Try again!" }, { text: "It makes our training loops work on distributed strategies.", explain: "Correct! With 🤗 Accelerate, your training loops will work for multiple GPUs and TPUs.", correct: true }, { text: "It provides more optimization functions.", explain: "No, the 🤗 Accelerate library does not provide any optimization functions." } ]} />
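Here is the essence of the change, in the spirit of this chapter's training loop (`model` and `train_dataloader` come from the sketches above):

```python
from accelerate import Accelerator
from torch.optim import AdamW

accelerator = Accelerator()
optimizer = AdamW(model.parameters(), lr=5e-5)

# prepare() wraps the objects so the same loop runs on one GPU, several GPUs,
# or a TPU without further changes; it also handles device placement.
train_dataloader, model, optimizer = accelerator.prepare(train_dataloader, model, optimizer)

model.train()
for batch in train_dataloader:
    outputs = model(**batch)
    loss = outputs.loss
    accelerator.backward(loss)  # the only change vs. plain loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```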

{:else}

4. What happens when you instantiate one of the `TFAutoModelForXxx` classes with a pretrained language model (such as `bert-base-uncased`) that corresponds to a different task than the one for which it was trained?

<Question choices={[ { text: "Nothing, but you get a warning.", explain: "You do get a warning, but that's not all!" }, { text: "The head of the pretrained model is discarded and a new head suitable for the task is inserted instead.", explain: "Correct. For example, when we used TFAutoModelForSequenceClassification with bert-base-uncased, we got warnings when instantiating the model. The pretrained head is not used for the sequence classification task, so it's discarded and a new head is instantiated with random weights.", correct: true }, { text: "The head of the pretrained model is discarded.", explain: "Something else needs to happen. Try again!" }, { text: "Nothing, since the model can still be fine-tuned for the different task.", explain: "The head of the pretrained model was not trained to solve this task, so we should discard the head!" } ]} />
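Same check on the TensorFlow side:

```python
from transformers import TFAutoModelForSequenceClassification

# As with the PyTorch version, the pretrained head is dropped and a new one
# is randomly initialized -- hence the warning at instantiation.
model = TFAutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
```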

5. The TensorFlow models from 🤗 Transformers are already Keras models. What benefit does this offer?

<Question choices={[ { text: "The models work on a TPU out of the box.", explain: "Almost! There are some small additional changes required. For example, you need to run everything in a TPUStrategy scope, including the initialization of the model." }, { text: "You can leverage existing methods such as compile(), fit(), and predict().", explain: "Correct! Once you have the data, training on it requires very little work.", correct: true }, { text: "You get to learn Keras as well as transformers.", explain: "Correct, but we're looking for something else :)", correct: true }, { text: "You can easily compute metrics related to the dataset.", explain: "Keras helps us with training and evaluating the model, not computing dataset-related metrics." } ]} />
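Which means training looks like any other Keras workflow, as we saw in this chapter (`tf_train_dataset` and `tf_validation_dataset` being the `tf.data` datasets prepared earlier):

```python
import tensorflow as tf

# The model outputs logits, so tell the loss not to expect probabilities.
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.fit(tf_train_dataset, validation_data=tf_validation_dataset, epochs=3)
```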

6. How can you define your own custom metric?

<Question choices={[ { text: "By subclassing tf.keras.metrics.Metric.", explain: "Great!", correct: true }, { text: "Using the Keras functional API.", explain: "Try again!" }, { text: "By using a callable with signature metric_fn(y_true, y_pred).", explain: "Correct!", correct: true }, { text: "By Googling it.", explain: "That's not the answer we're looking for, but it should help you find it.", correct: true } ]} />
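A minimal sketch of the two code-based options side by side, using accuracy as a stand-in metric:

```python
import tensorflow as tf

# Option 1: a plain callable with the metric_fn(y_true, y_pred) signature.
def simple_accuracy(y_true, y_pred):
    preds = tf.argmax(y_pred, axis=-1)
    return tf.reduce_mean(tf.cast(preds == tf.cast(y_true, preds.dtype), tf.float32))

# Option 2: subclass tf.keras.metrics.Metric to accumulate state across batches.
class TotalCorrect(tf.keras.metrics.Metric):
    def __init__(self, name="total_correct", **kwargs):
        super().__init__(name=name, **kwargs)
        self.correct = self.add_weight(name="correct", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        preds = tf.argmax(y_pred, axis=-1)
        matches = tf.cast(preds == tf.cast(y_true, preds.dtype), tf.float32)
        self.correct.assign_add(tf.reduce_sum(matches))

    def result(self):
        return self.correct
```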

{/if}