Text Classification - Quick Start

Note: TextClassification is in preview mode and is not feature complete. While the tutorial described below is functional, using TextClassification on custom datasets is not yet supported. As an alternative, text data can be passed to TabularPrediction in tabular format, which has text feature support.
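For instance, a minimal sketch of that alternative might look as follows; the CSV path and the 'text'/'label' column names here are hypothetical placeholders, not part of this tutorial:

from autogluon import TabularPrediction as tab_task

# Hypothetical CSV with a free-form 'text' column and a 'label' column;
# per the note above, TabularPrediction has text feature support.
train_data = tab_task.Dataset(file_path='train.csv')
tab_predictor = tab_task.fit(train_data=train_data, label='label')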

We adopt the task of Text Classification as a running example to illustrate basic usage of AutoGluon’s NLP capability.

The AutoGluon Text functionality depends on the GluonNLP package. Thus, in order to use AutoGluon-Text, you will need to install GluonNLP first:

pip install gluonnlp==0.8.1

In this tutorial, we use sentiment analysis as a text classification example. We will load sentences and their corresponding labels (sentiments) into AutoGluon and use this data to obtain a neural network that can classify new sentences. Unlike traditional machine learning, where we must manually define the neural network and specify its hyperparameters before training, a single call to AutoGluon's fit function automatically trains many models under different hyperparameter configurations and returns the best one.
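In code, the entire workflow boils down to three calls, each of which is walked through step by step in the sections below:

from autogluon import TextClassification as task

dataset = task.Dataset(name='ToySST')                    # load data with built-in splits
predictor = task.fit(dataset, epochs=1, time_limits=30)  # search configurations and train
label = predictor.predict('I feel this is awesome!')     # classify a new sentence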

We begin by specifying TextClassification as our task of interest:

import autogluon as ag
from autogluon import TextClassification as task

Create AutoGluon Dataset

We are using a subset of the Stanford Sentiment Treebank (SST). The original dataset consists of sentences from movie reviews and human annotations of their sentiment. The task is to classify whether a given sentence has positive or negative sentiment (binary classification).

dataset = task.Dataset(name='ToySST')

The above call loads the SST data with the proper train/validation/test splits already applied.

Use AutoGluon to fit Models

Now, we want to obtain a neural network classifier using AutoGluon. In the default configuration, rather than attempting to train complex models from scratch using our data, AutoGluon fine-tunes neural networks that have already been pretrained on a large-scale text dataset such as Wikicorpus. Although our dataset involves entirely different text, lower-level features captured in the representations of the pretrained network (such as general word and phrase patterns) are likely to remain useful for our own text dataset.

While we primarily stick with the default configurations in this Beginner tutorial, the Advanced tutorial covers various options that you can specify for greater control over the training process (see the sketch below for a taste). With just a single call to AutoGluon's fit function, AutoGluon will train many models with different hyperparameter configurations and return the best model.
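As a taste of those options, the sketch below shows what a custom search space might look like. Note that lr and num_trials are assumed argument names for this preview release; consult the Advanced tutorial for the exact fit signature. In the rest of this tutorial we stick with the defaults.

# A sketch only: lr and num_trials are assumed argument names and may
# differ by version; see the Advanced tutorial before relying on them.
predictor = task.fit(dataset,
                     lr=ag.space.Real(1e-5, 1e-4, log=True),  # search learning rates on a log scale
                     num_trials=2,                            # try at most 2 configurations
                     epochs=1,
                     time_limits=60)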

However, neural network training can be quite time-consuming. To ensure quick runtimes, we tell AutoGluon to obey strict limits: epochs specifies how much computational effort can be devoted to training any single network, while time_limits (in seconds) specifies how much time fit has to return a model (more precisely, new training runs are started as long as time_limits has not been reached). For demo purposes, we specify only small values for time_limits and epochs:

predictor = task.fit(dataset, epochs=1, time_limits=30)
TextClassification is in preview mode. Please feel free to request new features in issues if it is not covered in the current implementation. If your dataset is in tabular format, you could also try out our TabularPrediction module.
scheduler_options: Key 'training_history_callback_delta_secs': Imputing default value 60
scheduler_options: Key 'delay_get_config': Imputing default value True

Starting Experiments
Num of Finished Tasks is 0
Time out (secs) is 30
scheduler: FIFOScheduler(
DistributedResourceManager{
(Remote: Remote REMOTE_ID: 0,
    <Remote: 'inproc://172.31.45.231/6809/1' processes=1 threads=8, memory=33.24 GB>, Resource: NodeResourceManager(8 CPUs, 1 GPUs))
})
Using gradient accumulation. Effective batch size = batch_size * accumulate = 32
100%|██████████| 1/1 [00:00<00:00, 15.93it/s]
validation metrics: accuracy: 0.6250

Using gradient accumulation. Effective batch size = batch_size * accumulate = 32
100%|██████████| 1/1 [00:00<00:00, 15.66it/s]
validation metrics: accuracy: 0.7500

Using gradient accumulation. Effective batch size = batch_size * accumulate = 32
100%|██████████| 26/26 [00:14<00:00,  1.79it/s]
100%|██████████| 1/1 [00:00<00:00, 15.59it/s]
validation metrics: accuracy: 0.3750
Using gradient accumulation. Effective batch size = batch_size * accumulate = 32
100%|██████████| 26/26 [00:14<00:00,  1.76it/s]
100%|██████████| 1/1 [00:00<00:00, 15.22it/s]
validation metrics: accuracy: 0.7500

Within fit, the model with the best hyperparameter configuration is selected based on its validation accuracy after being trained on the data in the training split.

The best Top-1 accuracy achieved on the validation set is:

print('Top-1 val acc: %.3f' % predictor.results['best_reward'])
Top-1 val acc: 0.750

Within fit, this model is also finally fitted on our entire dataset (i.e., merging the training and validation splits) using the same optimal hyperparameter configuration. The resulting model is considered the final model to be applied to classify new text.

We now evaluate the final model produced by fit on the test data; the dataset object created above already includes the test split:

test_acc = predictor.evaluate(dataset)
print('Top-1 test acc: %.3f' % test_acc)
Top-1 test acc: 0.750

Given an example sentence, we can easily use the final model to predict the label (and the conditional class-probability):

sentence = 'I feel this is awesome!'
ind = predictor.predict(sentence)
print('The input sentence sentiment is classified as [%d].' % ind.asscalar())
The input sentence sentiment is classified as [1].
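The conditional class-probability mentioned above can be retrieved with a minimal sketch like the following, assuming the predictor exposes a predict_proba method (an assumption; the method may not be available in every preview release):

# Assumption: predict_proba may not exist in this preview release;
# if it does, it returns the probability of each sentiment class.
proba = predictor.predict_proba(sentence)
print('Conditional class probabilities:', proba)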

The results attribute of the predictor contains summaries describing various aspects of the training process. For example, we can inspect the best hyperparameter configuration corresponding to the final model which achieved the above (best) results:

print('The best configuration is:')
print(predictor.results['best_config'])
The best configuration is:
{'lr': 4.442672673345054e-05, 'net.choice': 0, 'pretrained_dataset.choice': 0}
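Other summaries may be recorded as well; assuming results behaves like a standard dict (as the 'best_reward' and 'best_config' lookups above suggest), its available keys can be listed directly:

# Assuming predictor.results is dict-like, list every recorded summary.
print(list(predictor.results.keys()))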