Intent Classification and Slot Labeling

References:

- Devlin, Jacob, et al. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." arXiv preprint arXiv:1810.04805 (2018).
- Chen, Qian, et al. "BERT for Joint Intent Classification and Slot Filling." arXiv preprint arXiv:1902.10909 (2019).

Joint Intent Classification and Slot Labeling
Intent classification and slot labeling are two essential problems in Natural Language Understanding (NLU). In intent classification, the agent needs to detect the intention conveyed by the speaker's utterance. For example, when the speaker says "Book a flight from Long Beach to Seattle", the intention is to book a flight ticket. In slot labeling, the agent needs to extract the semantic entities that are related to the intent. In the example above, "Long Beach" and "Seattle" are two semantic constituents related to the flight, i.e., the origin and the destination.
Essentially, intent classification can be viewed as a sequence classification problem, and slot labeling can be viewed as a sequence tagging problem similar to Named-Entity Recognition (NER). Because the two tasks are closely correlated, they are usually trained jointly with a multi-task objective function.
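To make the joint setup concrete, here is a minimal sketch in MXNet Gluon (not the actual GluonNLP training code) of a model with a shared BERT encoder, an intent head on the pooled sentence representation, and a slot head applied to every token; the two cross-entropy losses are simply summed. The `bert_encoder` argument and its output signature are assumptions about the pretrained encoder:

```python
from mxnet.gluon import nn, loss as gloss

class JointIntentSlotModel(nn.Block):
    """Shared BERT encoder with an intent head and a slot head (illustrative)."""
    def __init__(self, bert_encoder, num_intents, num_slot_tags, **kwargs):
        super().__init__(**kwargs)
        # Assumed to return (token_encodings, pooled_cls) like GluonNLP's BERT.
        self.bert = bert_encoder
        self.intent_head = nn.Dense(num_intents)                 # sequence-level
        self.slot_head = nn.Dense(num_slot_tags, flatten=False)  # token-level

    def forward(self, token_ids, token_types, valid_length):
        # token_encodings: (batch, seq_len, hidden); pooled_cls: (batch, hidden)
        token_encodings, pooled_cls = self.bert(token_ids, token_types, valid_length)
        return self.intent_head(pooled_cls), self.slot_head(token_encodings)

intent_loss = gloss.SoftmaxCELoss()
slot_loss = gloss.SoftmaxCELoss()

def joint_loss(intent_logits, slot_logits, intent_label, slot_labels):
    # Multi-task objective: the two cross-entropy terms are summed.
    # (In practice, padding positions would be masked out of the slot term.)
    return intent_loss(intent_logits, intent_label).mean() + \
           slot_loss(slot_logits, slot_labels).mean()
```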
Here’s one example of the ATIS dataset, it uses the IOB2 format.
| Sentence  | Tags                     | Intent Label |
|-----------|--------------------------|--------------|
| are       | O                        | atis_flight  |
| there     | O                        |              |
| any       | O                        |              |
| flight    | O                        |              |
| from      | O                        |              |
| long      | B-fromloc.city_name      |              |
| beach     | I-fromloc.city_name      |              |
| to        | O                        |              |
| columbus  | B-toloc.city_name        |              |
| on        | O                        |              |
| wednesday | B-depart_date.day_name   |              |
| april     | B-depart_date.month_name |              |
| sixteen   | B-depart_date.day_number |              |
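To illustrate the IOB2 convention, the following helper (not part of the training script; the function name `iob2_to_spans` is ours) recovers labeled entity spans from a tag sequence: `B-` opens a span, `I-` continues it, and `O` means the token is outside any span.

```python
def iob2_to_spans(tokens, tags):
    """Collect (label, text) entity spans from an IOB2 tag sequence."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):            # B- always starts a new span
            if current:
                spans.append(current)
            current = [tag[2:], [token]]
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)        # I- extends the open span
        else:                               # O (or an inconsistent I-) closes it
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(words)) for label, words in spans]

tokens = "are there any flight from long beach to columbus".split()
tags = ["O", "O", "O", "O", "O", "B-fromloc.city_name",
        "I-fromloc.city_name", "O", "B-toloc.city_name"]
print(iob2_to_spans(tokens, tags))
# [('fromloc.city_name', 'long beach'), ('toloc.city_name', 'columbus')]
```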
In this example, we demonstrate how to use GluonNLP to fine-tune a pretrained BERT model for joint intent classification and slot labeling. We evaluate on two datasets, ATIS and SNIPS.
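For reference, a pretrained BERT encoder can be loaded from GluonNLP's model zoo as sketched below; this assumes the GluonNLP 0.x API, and the exact model variant and options used by `finetune_icsl.py` may differ:

```python
import mxnet as mx
import gluonnlp as nlp

ctx = mx.gpu(0)  # or mx.cpu()
bert, vocab = nlp.model.get_model(
    'bert_12_768_12',                              # 12-layer BERT-base
    dataset_name='book_corpus_wiki_en_uncased',
    pretrained=True,
    ctx=ctx,
    use_pooler=True,       # pooled representation for intent classification
    use_decoder=False,     # masked-LM decoder not needed for fine-tuning
    use_classifier=False)  # next-sentence classifier not needed either
```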
The training script requires the seqeval and tqdm packages:
$ pip3 install seqeval --user
$ pip3 install tqdm --user
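The seqeval package computes the entity-level (span-exact) F1 used for slot labeling: a predicted span counts only if both its boundaries and its label match the gold span. A quick illustration with hypothetical predictions:

```python
from seqeval.metrics import f1_score

y_true = [["O", "B-fromloc.city_name", "I-fromloc.city_name", "O", "B-toloc.city_name"]]
y_pred = [["O", "B-fromloc.city_name", "I-fromloc.city_name", "O", "O"]]

# One of the two gold spans is recovered: precision 1/1, recall 1/2, F1 ~ 0.667
print(f1_score(y_true, y_pred))
```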
For the ATIS dataset, use the following command to run the experiment:
$ python finetune_icsl.py --gpu 0 --dataset atis
It produces a final slot labeling F1 of 95.83% and an intent classification accuracy of 98.66%.
For the SNIPS dataset, use the following command to run the experiment:
$ python finetune_icsl.py --gpu 0 --dataset snips
It produces a final slot labeling F1 of 96.06% and an intent classification accuracy of 98.71%.
In addition, we train the models with three random seeds and report the mean ± standard deviation.
For ATIS:

| Models                         | Intent Acc (%) | Slot F1 (%) |
|--------------------------------|----------------|-------------|
|                                | 98.77          | 96.52       |
|                                | 97.42          | 95.62       |
| Joint BERT (Chen et al., 2019) | 97.5           | 96.1        |
| Ours                           | 98.66±0.00     | 95.88±0.04  |
For SNIPS:

| Models                         | Intent Acc (%) | Slot F1 (%) |
|--------------------------------|----------------|-------------|
|                                | 99.29          | 93.90       |
| Joint BERT (Chen et al., 2019) | 98.60          | 97.00       |
| Ours                           | 98.81±0.13     | 95.94±0.10  |