# gluonnlp.utils

GluonNLP Toolkit provides utility functions for file handling, training, serialization, and reproducibility.

## File Handling

| Function | Description |
| --- | --- |
| glob | Return a list of paths matching a pathname pattern. |
| mkdir | Create a directory. |

## Parameter and Training

| Function | Description |
| --- | --- |
| clip_grad_global_norm | Rescales gradients of parameters so that their global 2-norm does not exceed max_norm. |

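## Serialization and Deserialization

| Function | Description |
| --- | --- |
| load_parameters | Load parameters from file previously saved by save_parameters. |
| load_states | Loads trainer states (e.g. optimizer, momentum) from a file. |
| save_parameters | Save parameters to file. |
| save_states | Saves trainer states (e.g. optimizer, momentum) to a file. |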

## Setting Seed

| Function | Description |
| --- | --- |
| set_seed | Sets the seed for reproducibility. |

## API Reference

Module for utility functions.

class gluonnlp.utils.Parallelizable

Base class for a parallelizable unit of work, which can be invoked by Parallel. A subclass must implement the forward_backward method and be used together with Parallel. For example:

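```python
import mxnet as mx
from mxnet import gluon
from gluonnlp.utils import Parallel, Parallelizable

class ParallelNet(Parallelizable):
    def __init__(self):
        self._net = Model()  # Model is a placeholder for a user-defined Gluon Block
        self._loss = gluon.loss.SoftmaxCrossEntropyLoss()

    def forward_backward(self, x):
        data, label = x
        # Record the forward pass so that backward() can compute gradients.
        with mx.autograd.record():
            out = self._net(data)
            loss = self._loss(out, label)
        loss.backward()
        return loss

net = ParallelNet()
ctx = [mx.gpu(0), mx.gpu(1)]
parallel = Parallel(len(ctx), net)
# Gluon block is initialized after forwarding the first batch

# batches, trainer, and batch_size are assumed to be defined elsewhere;
# each batch is assumed to be a (data, label) pair.
for data, label in batches:
    # Split the batch across devices and queue one (data, label) pair per worker.
    data_parts = gluon.utils.split_and_load(data, ctx)
    label_parts = gluon.utils.split_and_load(label, ctx)
    for x in zip(data_parts, label_parts):
        parallel.put(x)
    losses = [parallel.get() for _ in ctx]
    trainer.step(batch_size)
```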

forward_backward(x)

Forward and backward computation.

class gluonnlp.utils.Parallel(num_workers, parallizable, serial_init=True)

Class for parallel processing with Parallelizables. It invokes a Parallelizable with multiple Python threads. For example:

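```python
import mxnet as mx
from mxnet import gluon
from gluonnlp.utils import Parallel, Parallelizable

class ParallelNet(Parallelizable):
    def __init__(self):
        self._net = Model()  # Model is a placeholder for a user-defined Gluon Block
        self._loss = gluon.loss.SoftmaxCrossEntropyLoss()

    def forward_backward(self, x):
        data, label = x
        # Record the forward pass so that backward() can compute gradients.
        with mx.autograd.record():
            out = self._net(data)
            loss = self._loss(out, label)
        loss.backward()
        return loss

net = ParallelNet()
ctx = [mx.gpu(0), mx.gpu(1)]
parallel = Parallel(len(ctx), net)

# batches, trainer, and batch_size are assumed to be defined elsewhere;
# each batch is assumed to be a (data, label) pair.
for data, label in batches:
    # Split the batch across devices and queue one (data, label) pair per worker.
    data_parts = gluon.utils.split_and_load(data, ctx)
    label_parts = gluon.utils.split_and_load(label, ctx)
    for x in zip(data_parts, label_parts):
        parallel.put(x)
    losses = [parallel.get() for _ in ctx]
    trainer.step(batch_size)
```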

Parameters
• num_workers (int) – Number of worker threads. If set to 0, the main thread is used as the worker for debugging purposes.

• parallizable – Parallelizable net whose forward_backward method is invoked by multiple worker threads.

• serial_init (bool, default True) – Execute the first num_workers inputs in the main thread, so that the Block used in parallizable is initialized serially. Initializing a Block from multiple threads may cause unexpected behavior.

get()

Get an output of previous parallizable.forward_backward calls. This method blocks if none of the previous parallizable.forward_backward calls has returned a result yet.

put(x)

Assign input x to an available worker and invoke parallizable.forward_backward with x.
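The put/get interface described above can be pictured with a small stand-alone sketch built only on Python's `queue` and `threading` modules. It is not the library's implementation: `SquareWork` and `MiniParallel` are hypothetical names, and the sketch ignores the `num_workers=0` and `serial_init` behaviors.

```python
import queue
import threading

class SquareWork:
    """Hypothetical stand-in for a Parallelizable: it just squares its input."""
    def forward_backward(self, x):
        return x * x

class MiniParallel:
    """Toy dispatcher: worker threads pull inputs and push results."""
    def __init__(self, num_workers, work):
        self._work = work
        self._inputs = queue.Queue()
        self._outputs = queue.Queue()
        for _ in range(num_workers):
            # Daemon threads exit together with the main program.
            threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            x = self._inputs.get()
            self._outputs.put(self._work.forward_backward(x))

    def put(self, x):
        # Hand the input to whichever worker becomes free first.
        self._inputs.put(x)

    def get(self):
        # Blocks until at least one result is available.
        return self._outputs.get()

parallel = MiniParallel(2, SquareWork())
for value in (1, 2, 3):
    parallel.put(value)
# Results may arrive in any order, so sort them before comparing.
results = sorted(parallel.get() for _ in range(3))
```

Because workers finish in no particular order, each `get()` call returns whichever result is ready first rather than the result of a specific `put()`.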

gluonnlp.utils.grad_global_norm(parameters, max_norm=None)

Calculate the 2-norm of gradients of parameters, and how much they should be scaled down such that their 2-norm does not exceed max_norm, if max_norm is provided.

If gradients exist for more than one context for a parameter, the user needs to explicitly call trainer.allreduce_grads so that the gradients are summed first before calculating the 2-norm.

Note

This function is only for use when update_on_kvstore is set to False in trainer.

Example:

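```python
trainer = Trainer(net.collect_params(), update_on_kvstore=False, ...)
ctx = [mx.gpu(0), mx.gpu(1)]
# X holds the batch data and Y (a placeholder name) the matching labels.
for x, label in zip(mx.gluon.utils.split_and_load(X, ctx),
                    mx.gluon.utils.split_and_load(Y, ctx)):
    with mx.autograd.record():
        y = net(x)
        loss = loss_fn(y, label)
    loss.backward()
trainer.allreduce_grads()
norm = grad_global_norm(net.collect_params().values())
...
```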

Parameters
• parameters (list of Parameters) – The parameters whose gradients are used to compute the norm.

• max_norm (NDArray, optional) – The maximum L2 norm threshold. If provided, ratio and is_finite will be returned.

Returns

• NDArray – Total norm. Shape is (1,)

• NDArray – Ratio for rescaling gradients based on max_norm s.t. grad = grad / ratio. If total norm is NaN, ratio will be NaN, too. Returned if max_norm is provided. Shape is (1,)

• NDArray – Whether the total norm is finite, returned if max_norm is provided. Shape is (1,)
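The three return values can be illustrated with a pure-Python sketch that works on plain lists instead of NDArrays. It is a minimal illustration under stated assumptions, not the library's code: `global_norm_and_ratio` is a hypothetical name, and clamping the ratio to at least 1 (so gradients are never scaled up) is an assumption consistent with `grad = grad / ratio`.

```python
import math

def global_norm_and_ratio(grads, max_norm=None):
    """Return the global 2-norm of `grads`; add (ratio, is_finite) if max_norm is given.

    `grads` is a list of flat lists of floats, one list per parameter.
    """
    # Global 2-norm: square root of the sum of squares over every gradient entry.
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if max_norm is None:
        return total_norm
    scale = total_norm / max_norm
    is_finite = math.isfinite(scale)
    # Assumption: a finite ratio is clamped to >= 1; NaN or Inf is passed through.
    ratio = max(1.0, scale) if is_finite else scale
    return total_norm, ratio, is_finite

grads = [[3.0, 0.0], [0.0, 4.0]]
total_norm, ratio, is_finite = global_norm_and_ratio(grads, max_norm=2.5)
# Dividing by the ratio brings the global norm down to max_norm.
rescaled = [[g / ratio for g in grad] for grad in grads]
```

Here the global norm is 5.0, so with `max_norm=2.5` the ratio is 2.0 and every gradient entry is halved.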

gluonnlp.utils.clip_grad_global_norm(parameters, max_norm, check_isfinite=True)

Rescales gradients of parameters so that their global 2-norm (the 2-norm of all gradients taken together) does not exceed max_norm. If gradients exist for more than one context for a parameter, the user needs to explicitly call trainer.allreduce_grads so that the gradients are summed first before calculating the 2-norm.

Note

This function is only for use when update_on_kvstore is set to False in trainer. In cases where training happens on multiple contexts, this method should be used in conjunction with trainer.allreduce_grads() and trainer.update() (not trainer.step()).

Example:

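```python
trainer = Trainer(net.collect_params(), update_on_kvstore=False, ...)
ctx = [mx.gpu(0), mx.gpu(1)]
# X holds the batch data and Y (a placeholder name) the matching labels.
for x, label in zip(mx.gluon.utils.split_and_load(X, ctx),
                    mx.gluon.utils.split_and_load(Y, ctx)):
    with mx.autograd.record():
        y = net(x)
        loss = loss_fn(y, label)
    loss.backward()
trainer.allreduce_grads()
clip_grad_global_norm(net.collect_params().values(), max_norm)
trainer.update(batch_size)
...
```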

Parameters
• parameters (list of Parameters) – The parameters whose gradients are rescaled.

• max_norm (float) – The maximum allowed 2-norm of the gradients.

• check_isfinite (bool, default True) – If True, check that the total_norm is finite (not nan or inf). This requires a blocking .asscalar() call.

Returns

Total norm. Return type is NDArray of shape (1,) if check_isfinite is False. Otherwise a float is returned.

Return type

NDArray or float
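To make the in-place rescaling concrete, here is a short pure-Python sketch on plain lists. It is an illustration only, not the library's implementation: `clip_by_global_norm` is a hypothetical name, and the finiteness check is left out.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale `grads` in place; return the global 2-norm measured before clipping."""
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        for grad in grads:
            # Slice assignment keeps the same list objects, mimicking an in-place update.
            grad[:] = [g * scale for g in grad]
    return total_norm

large = [[3.0, 0.0], [0.0, 4.0]]   # global norm 5.0, above the threshold
small = [[0.3, 0.0], [0.0, 0.4]]   # global norm about 0.5, below the threshold
norm_large = clip_by_global_norm(large, max_norm=2.5)
norm_small = clip_by_global_norm(small, max_norm=2.5)
```

Only `large` is modified; gradients whose global norm is already within `max_norm` are left untouched.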

gluonnlp.utils.save_parameters(model, filename)

Save parameters to file.

Saved parameters can only be loaded with Block.load_parameters. Note that this method only saves parameters, not model structure.

Both local file system path and S3 URI are supported. For example, ‘s3://mybucket/folder/net.params’, ‘./folder/net.params’.

Parameters
• model (mx.gluon.Block) – The model to save.

• filename (str) – Path to file.
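The two kinds of path mentioned above can be told apart by their URI scheme. The sketch below uses only the standard library and is not how the library itself dispatches; `is_s3_uri` and `split_s3_uri` are hypothetical helpers.

```python
from urllib.parse import urlparse

def is_s3_uri(path):
    """Return True when `path` uses the s3:// scheme."""
    return urlparse(path).scheme == "s3"

def split_s3_uri(uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    parsed = urlparse(uri)
    return parsed.netloc, parsed.path.lstrip("/")

remote = "s3://mybucket/folder/net.params"
local = "./folder/net.params"
remote_is_s3 = is_s3_uri(remote)
local_is_s3 = is_s3_uri(local)
bucket, key = split_s3_uri(remote)
```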

gluonnlp.utils.save_states(trainer, fname)

Saves trainer states (e.g. optimizer, momentum) to a file.

Both local file system path and S3 URI are supported. For example, ‘s3://mybucket/folder/net.states’, ‘./folder/net.states’.

Parameters
• trainer (mxnet.gluon.Trainer) – The trainer whose states will be saved.

• fname (str) – Path to output states file.

Note

optimizer.param_dict, which contains Parameter information (such as lr_mult and wd_mult) will not be saved.

gluonnlp.utils.load_parameters(model, filename, ctx=None, allow_missing=False, ignore_extra=False, cast_dtype=None)

Load parameters from file previously saved by save_parameters.

Both local file system path and S3 URI are supported. For example, ‘s3://mybucket/folder/net.params’, ‘./folder/net.params’.

Parameters
• model (mx.gluon.Block) – The model whose parameters will be loaded.

• filename (str) – Path to parameter file.

• ctx (Context or list of Context, default cpu()) – Context(s) to initialize loaded parameters on.

• allow_missing (bool, default False) – Whether to silently skip loading parameters not represented in the file.

• ignore_extra (bool, default False) – Whether to silently ignore parameters from the file that are not present in this Block.

• cast_dtype (bool, default False) – Cast the data type of the NDArray loaded from the checkpoint to the dtype provided by the Parameter if any.

gluonnlp.utils.load_states(trainer, fname)

Loads trainer states (e.g. optimizer, momentum) from a file.

Both local file system path and S3 URI are supported. For example, ‘s3://mybucket/folder/net.states’, ‘./folder/net.states’.

Parameters
• trainer (mxnet.gluon.Trainer) – The trainer whose states will be loaded.

• fname (str) – Path to input states file.

Note

optimizer.param_dict, which contains Parameter information (such as lr_mult and wd_mult) will not be loaded from the file, but rather set based on current Trainer’s parameters.

gluonnlp.utils.mkdir(dirname)

Create a directory.

Parameters

dirname (str) – The name of the target directory to create.

gluonnlp.utils.glob(url, separator=',')

Return a list of paths matching a pathname pattern.

The pattern may contain simple shell-style wildcards. Input may also include multiple patterns, separated by separator.

Parameters
• url (str) – The pathname pattern of the files, or several patterns joined by separator.

• separator (str, default is ',') – The separator in url to allow multiple patterns in the input.
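The multi-pattern behavior can be sketched with the standard library's `glob` module. This is an approximation for illustration, not the library's code: `glob_multi` is a hypothetical name, and stripping whitespace and expanding `~` are assumptions.

```python
import glob as stdlib_glob
import os
import tempfile

def glob_multi(url, separator=","):
    """Expand every pattern in `url`; patterns are split on `separator`."""
    patterns = [url] if separator is None else url.split(separator)
    paths = []
    for pattern in patterns:
        # Assumption: tolerate spaces around the separator and expand "~".
        paths.extend(stdlib_glob.glob(os.path.expanduser(pattern.strip())))
    return paths

# Usage: create three files in a temporary directory and match two patterns at once.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("train.txt", "dev.txt", "vocab.json"):
        open(os.path.join(tmp, name), "w").close()
    pattern = os.path.join(tmp, "*.txt") + "," + os.path.join(tmp, "*.json")
    matched = sorted(os.path.basename(p) for p in glob_multi(pattern))
```

The results of the individual patterns are simply concatenated, so a file matched by two patterns would appear twice in this sketch.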

gluonnlp.utils.remove(filename)

Remove a file.

Parameters

filename (str) – The name of the target file to remove

gluonnlp.utils.check_version(min_version, warning_only=False, library=None)

Check that the version of gluonnlp (or of the given library) satisfies the provided minimum version. An exception is thrown if the check does not pass, unless warning_only is True.

Parameters
• min_version (str) – Minimum version.

• warning_only (bool) – If True, print a warning instead of throwing an exception.

• library (optional module, default None) – The target library for version check. Checks gluonnlp by default.
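A minimum-version check of this kind can be sketched as follows. The sketch is a simplified stand-in, not the library's implementation: `parse_version` and `check_min_version` are hypothetical names, only dotted numeric versions are handled, and raising `RuntimeError` is an arbitrary choice for the illustration.

```python
import warnings

def parse_version(version):
    """Turn '0.8.1' into (0, 8, 1); handles only dotted numeric versions."""
    return tuple(int(part) for part in version.split("."))

def check_min_version(current, min_version, warning_only=False):
    """Return True if `current` is new enough; otherwise warn or raise."""
    if parse_version(current) >= parse_version(min_version):
        return True
    message = f"version {current} is older than the required {min_version}"
    if warning_only:
        warnings.warn(message)
        return False
    # The exception type is an assumption made for this sketch.
    raise RuntimeError(message)

# Comparing tuples of integers avoids the string-comparison trap "0.10.0" < "0.9.1".
ok = check_min_version("0.10.0", "0.9.1")
```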

gluonnlp.utils.set_seed(seed=0)

Sets the seed for reproducibility.

Parameters

seed (int) – Value of the seed to set
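The effect of seeding can be seen in a small stand-alone sketch. This section does not state which random number generators set_seed covers, so the list below (Python's `random` and, when installed, NumPy) is an assumption, and `seed_everything` is a hypothetical name.

```python
import random

def seed_everything(seed=0):
    """Seed the generators this sketch knows about (an assumed, partial list)."""
    random.seed(seed)
    try:
        import numpy as np  # optional: only seeded when NumPy is installed
        np.random.seed(seed)
    except ImportError:
        pass

# Re-seeding with the same value replays the same random sequence.
seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
```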