Intent Classification and Slot Labeling

References:

- Devlin, Jacob, et al. "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." arXiv preprint arXiv:1810.04805 (2018).
- Chen, Qian, et al. "BERT for Joint Intent Classification and Slot Filling." arXiv preprint arXiv:1902.10909 (2019).

Joint Intent Classification and Slot Labeling
Intent classification and slot labeling are two essential problems in Natural Language Understanding (NLU). In intent classification, the agent needs to detect the intention conveyed by the speaker's utterance. For example, when the speaker says "Book a flight from Long Beach to Seattle", the intention is to book a flight ticket. In slot labeling, the agent needs to extract the semantic entities that are related to the intent. In the example above, "Long Beach" and "Seattle" are two semantic constituents related to the flight, i.e., the origin and the destination.
Essentially, intent classification can be viewed as a sequence classification problem, and slot labeling can be viewed as a sequence tagging problem similar to Named-Entity Recognition (NER). Because the two tasks are closely correlated, they are usually trained jointly with a multi-task objective function.
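To make the joint setup concrete, here is a minimal sketch in MXNet Gluon (not the actual GluonNLP training code) of a model with a shared BERT encoder, an intent head on the pooled sentence representation, and a slot head applied to every token; the two cross-entropy losses are simply summed. The `bert_encoder` argument and its output signature are assumptions about the pretrained encoder:

```python
from mxnet.gluon import nn, loss as gloss

class JointIntentSlotModel(nn.Block):
    """Shared BERT encoder with an intent head and a slot head (illustrative)."""
    def __init__(self, bert_encoder, num_intents, num_slot_tags, **kwargs):
        super().__init__(**kwargs)
        # Assumed to return (token_encodings, pooled_cls) like GluonNLP's BERT.
        self.bert = bert_encoder
        self.intent_head = nn.Dense(num_intents)                 # sequence-level
        self.slot_head = nn.Dense(num_slot_tags, flatten=False)  # token-level

    def forward(self, token_ids, token_types, valid_length):
        # token_encodings: (batch, seq_len, hidden); pooled_cls: (batch, hidden)
        token_encodings, pooled_cls = self.bert(token_ids, token_types, valid_length)
        return self.intent_head(pooled_cls), self.slot_head(token_encodings)

intent_loss = gloss.SoftmaxCELoss()
slot_loss = gloss.SoftmaxCELoss()

def joint_loss(intent_logits, slot_logits, intent_label, slot_labels):
    # Multi-task objective: the two cross-entropy terms are summed.
    # (In practice, padding positions would be masked out of the slot term.)
    return intent_loss(intent_logits, intent_label).mean() + \
           slot_loss(slot_logits, slot_labels).mean()
```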
Here’s one example of the ATIS dataset, it uses the IOB2 format.
| Sentence  | Tags                     | Intent Label |
|-----------|--------------------------|--------------|
| are       | O                        | atis_flight  |
| there     | O                        |              |
| any       | O                        |              |
| flight    | O                        |              |
| from      | O                        |              |
| long      | B-fromloc.city_name      |              |
| beach     | I-fromloc.city_name      |              |
| to        | O                        |              |
| columbus  | B-toloc.city_name        |              |
| on        | O                        |              |
| wednesday | B-depart_date.day_name   |              |
| april     | B-depart_date.month_name |              |
| sixteen   | B-depart_date.day_number |              |
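To illustrate the IOB2 convention, the following helper (not part of the training script; the function name `iob2_to_spans` is ours) recovers labeled entity spans from a tag sequence: `B-` opens a span, `I-` continues it, and `O` means the token is outside any span.

```python
def iob2_to_spans(tokens, tags):
    """Collect (label, text) entity spans from an IOB2 tag sequence."""
    spans, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):            # B- always starts a new span
            if current:
                spans.append(current)
            current = [tag[2:], [token]]
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(token)        # I- extends the open span
        else:                               # O (or an inconsistent I-) closes it
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(label, " ".join(words)) for label, words in spans]

tokens = "are there any flight from long beach to columbus".split()
tags = ["O", "O", "O", "O", "O", "B-fromloc.city_name",
        "I-fromloc.city_name", "O", "B-toloc.city_name"]
print(iob2_to_spans(tokens, tags))
# [('fromloc.city_name', 'long beach'), ('toloc.city_name', 'columbus')]
```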
In this example, we demonstrate how to use GluonNLP to fine-tune a pretrained BERT model for joint intent classification and slot labeling. We evaluate on two datasets, ATIS and SNIPS.
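For reference, a pretrained BERT encoder can be loaded from GluonNLP's model zoo as sketched below; this assumes the GluonNLP 0.x API, and the exact model variant and options used by `finetune_icsl.py` may differ:

```python
import mxnet as mx
import gluonnlp as nlp

ctx = mx.gpu(0)  # or mx.cpu()
bert, vocab = nlp.model.get_model(
    'bert_12_768_12',                              # 12-layer BERT-base
    dataset_name='book_corpus_wiki_en_uncased',
    pretrained=True,
    ctx=ctx,
    use_pooler=True,       # pooled representation for intent classification
    use_decoder=False,     # masked-LM decoder not needed for fine-tuning
    use_classifier=False)  # next-sentence classifier not needed either
```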
The training script requires the seqeval and tqdm packages:
$ pip3 install seqeval --user
$ pip3 install tqdm --user
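The seqeval package computes the entity-level (span-exact) F1 used for slot labeling: a predicted span counts only if both its boundaries and its label match the gold span. A quick illustration with hypothetical predictions:

```python
from seqeval.metrics import f1_score

y_true = [["O", "B-fromloc.city_name", "I-fromloc.city_name", "O", "B-toloc.city_name"]]
y_pred = [["O", "B-fromloc.city_name", "I-fromloc.city_name", "O", "O"]]

# One of the two gold spans is recovered: precision 1/1, recall 1/2, F1 ~ 0.667
print(f1_score(y_true, y_pred))
```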
For the ATIS dataset, use the following command to run the experiment:
$ python finetune_icsl.py --gpu 0 --dataset atis
It produces a final slot labeling F1 of 95.83% and an intent classification accuracy of 98.66%.
For the SNIPS dataset, use the following command to run the experiment:
$ python finetune_icsl.py --gpu 0 --dataset snips
It produces a final slot labeling F1 of 96.06% and an intent classification accuracy of 98.71%.
In addition, we train the models with three random seeds and report the mean ± standard deviation.
For ATIS:

| Models                         | Intent Acc (%) | Slot F1 (%) |
|--------------------------------|----------------|-------------|
|                                | 98.77          | 96.52       |
|                                | 97.42          | 95.62       |
| Joint BERT (Chen et al., 2019) | 97.5           | 96.1        |
| Ours                           | 98.66±0.00     | 95.88±0.04  |
For SNIPS:

| Models                         | Intent Acc (%) | Slot F1 (%) |
|--------------------------------|----------------|-------------|
|                                | 99.29          | 93.90       |
| Joint BERT (Chen et al., 2019) | 98.60          | 97.00       |
| Ours                           | 98.81±0.13     | 95.94±0.10  |