v1.2.0rc1
Pre-release

What's new
Added 🎉
- Added a warning when `batches_per_epoch` for the validation data loader is inherited from the train data loader.
- Added a `build-vocab` subcommand that can be used to build a vocabulary from a training config file.
- Added a `tokenizer_kwargs` argument to `PretrainedTransformerMismatchedIndexer`.
- Added `tokenizer_kwargs` and `transformer_kwargs` arguments to `PretrainedTransformerMismatchedEmbedder`.
- Added official support for Python 3.8.
- Added a script, `scripts/release_notes.py`, which automatically prepares markdown release notes from the CHANGELOG and commit history.
- Added a flag `--predictions-output-file` to the `evaluate` command, which tells AllenNLP to write the predictions from the given dataset to the file as JSON lines.
- Added the ability to ignore certain missing keys when loading a model from an archive. This is done by adding a class-level variable called `authorized_missing_keys` to any PyTorch module that a `Model` uses. If defined, `authorized_missing_keys` should be a list of regex string patterns (see the sketch after this list).
- Added `FBetaMultiLabelMeasure`, a multi-label F-beta metric. This is a subclass of the existing `FBetaMeasure` (usage sketch after this list).
- Added the ability to pass additional keyword arguments to `cached_transformers.get()`, which will be passed on to `AutoModel.from_pretrained()` (example after this list).
- Added an `overrides` argument to `Predictor.from_path()` (example after this list).
- Added a `cached-path` command.
- Added a function `inspect_cache` to `common.file_utils` that prints useful information about the cache. This can also be used from the `cached-path` command with `allennlp cached-path --inspect`.
- Added a function `remove_cache_entries` to `common.file_utils` that removes any cache entries matching the given glob patterns. This can be used from the `cached-path` command with `allennlp cached-path --remove some-files-*` (a Python sketch of both cache helpers follows this list).
- Added logging for the main process when running in distributed mode.
- Added a `TrainerCallback` object to support state sharing between batch- and epoch-level training callbacks.
- Added support for `.tar.gz` in `PretrainedModelInitializer`.
- Added classes in `nn/samplers/samplers.py`: `MultinomialSampler`, `TopKSampler`, and `TopPSampler`, for sampling indices from log probabilities.
- Made `BeamSearch` registrable.
- Added `top_k_sampling` and `top_p_sampling` `BeamSearch` implementations.
- Pass `serialization_dir` to `Model` and `DatasetReader`.
- Added an optional `include_in_archive` parameter to the top level of configuration files. When specified, `include_in_archive` should be a list of paths relative to the serialization directory which will be bundled up with the final archived model from a training run (config sketch after this list).
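
For the `authorized_missing_keys` item above, a minimal sketch. The module and sub-module names here are hypothetical; only the class-level variable and its regex-pattern format come from this release:

```python
import torch


class DecoderWithNewHead(torch.nn.Module):
    """A hypothetical module used inside an AllenNLP `Model`.

    State-dict keys matching these regex patterns are allowed to be
    missing when the model is loaded from an archive.
    """

    authorized_missing_keys = [r"^new_head\..*"]

    def __init__(self, hidden_dim: int = 64, num_labels: int = 2) -> None:
        super().__init__()
        self.encoder = torch.nn.Linear(hidden_dim, hidden_dim)
        # A newly added sub-module whose weights won't exist in older archives.
        self.new_head = torch.nn.Linear(hidden_dim, num_labels)
```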
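A usage sketch for `FBetaMultiLabelMeasure`, assuming it follows the same call convention as the existing `FBetaMeasure` (per-label probability predictions scored against multi-hot gold labels; check the metric's docstring for the exact expected shapes and thresholding):

```python
import torch
from allennlp.training.metrics import FBetaMultiLabelMeasure

metric = FBetaMultiLabelMeasure(beta=1.0, average="micro")

predictions = torch.tensor([[0.9, 0.1, 0.8],
                            [0.2, 0.7, 0.6]])   # per-label probabilities
gold_labels = torch.tensor([[1.0, 0.0, 1.0],
                            [0.0, 1.0, 0.0]])   # multi-hot targets

metric(predictions, gold_labels)
print(metric.get_metric(reset=True))  # precision / recall / fscore
```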
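For the `cached_transformers.get()` change, a sketch. The `make_copy` argument reflects the 1.x signature as best understood, and `output_attentions` is just one example of a keyword argument that `AutoModel.from_pretrained()` accepts:

```python
from allennlp.common import cached_transformers

# Any extra keyword arguments are forwarded to AutoModel.from_pretrained().
model = cached_transformers.get(
    "bert-base-uncased",
    make_copy=False,
    output_attentions=True,
)
```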
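For the new `overrides` argument to `Predictor.from_path()`, a sketch. The archive path and the overridden key are illustrative; per the "Changed" section below, `overrides` may be given as a `dict` or as a JSON string:

```python
from allennlp.predictors import Predictor

predictor = Predictor.from_path(
    "/path/to/model.tar.gz",
    overrides={"dataset_reader.lazy": True},
)
```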
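A Python sketch of the two new cache helpers, assuming `remove_cache_entries` accepts a list of glob patterns, as the plural in the note suggests:

```python
from allennlp.common.file_utils import inspect_cache, remove_cache_entries

# Print a summary of the cache, like `allennlp cached-path --inspect`.
inspect_cache()

# Remove matching entries, like `allennlp cached-path --remove "some-files-*"`.
remove_cache_entries(["some-files-*"])
```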
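And a config sketch for `include_in_archive`. Everything here other than the `include_in_archive` key itself is a hypothetical stand-in for a real training config; the listed paths are illustrative and are resolved relative to the serialization directory:

```jsonnet
{
  "dataset_reader": {"type": "my_reader"},   // hypothetical
  "train_data_path": "data/train.jsonl",     // hypothetical
  "model": {"type": "my_model"},             // hypothetical
  "trainer": {"num_epochs": 5, "optimizer": "adam"},
  "include_in_archive": ["predictions", "extra_metrics.json"]
}
```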
Changed ⚠️
- Subcommands that don't require plugins will no longer cause plugins to be loaded or have an `--include-package` flag.
- Allow overrides to be a JSON string or a `dict`.
- `transformers` dependency updated to version 3.1.0.
- When `cached_path` is called on a local archive with `extract_archive=True`, the archive is now extracted into a unique subdirectory of the cache root instead of a subdirectory of the archive's directory. The extraction directory is also unique to the modification time of the archive, so if the file changes, subsequent calls to `cached_path` will know to re-extract the archive (example after this list).
- Removed the `truncation_strategy` parameter from `PretrainedTransformerTokenizer`. Given the way we call the tokenizer, the truncation strategy had no effect anyway.
- Don't use initializers when loading a model, as they are not needed.
- Distributed training will now automatically search for a local open port if the `master_port` parameter is not provided.
- In training, save model weights before evaluation.
- `allennlp.common.util.peak_memory_mb` renamed to `peak_cpu_memory`, and `allennlp.common.util.gpu_memory_mb` renamed to `peak_gpu_memory`, and both now return their results in bytes as integers. Also, the `peak_gpu_memory` function now uses PyTorch functions to find the memory usage instead of shelling out to the `nvidia-smi` command. This is more efficient and also more accurate because it only takes into account the tensor allocations of the current PyTorch process (example after this list).
- Make sure weights are first loaded to the CPU when using `PretrainedModelInitializer`, preventing wasted GPU memory.
- Load dataset readers in `load_archive`.
- Updated the `AllenNlpTestCase` docstring to remove the reference to `unittest.TestCase`.
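
A sketch of the new `cached_path` extraction behavior described above; the archive path is illustrative:

```python
from allennlp.common.file_utils import cached_path

# The archive is extracted into a unique subdirectory of the cache root.
# If the file's modification time changes, the next call re-extracts it.
extracted_dir = cached_path("/path/to/local-archive.tar.gz", extract_archive=True)
```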
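And a sketch of the renamed memory helpers, which report bytes as integers per the note above:

```python
from allennlp.common.util import peak_cpu_memory, peak_gpu_memory

print(f"peak CPU memory: {peak_cpu_memory()} bytes")
print(f"peak GPU memory: {peak_gpu_memory()} bytes")  # uses PyTorch, not nvidia-smi
```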
Removed 👋
- Removed the `common.util.is_master` function.
Fixed ✅
- Fixed a bug where the reported `batch_loss` metric was incorrect when training with gradient accumulation.
- Class decorators are now displayed in the API docs.
- Fixed up the documentation for the `allennlp.nn.beam_search` module.
- Ignore `*args` when constructing classes with `FromParams`.
- Ensured some consistency in the types of the values that metrics return.
- Fixed a PyTorch warning by explicitly providing the `as_tuple` argument (leaving it as its default value of `False`) to `Tensor.nonzero()`.
- Remove the temporary directory when extracting a model archive in `load_archive` at the end of the function rather than via `atexit`.
- Fixed a bug where using `cached_path()` offline could return a cached resource's lock file instead of the cache file.
- Fixed a bug where `cached_path()` would fail if passed a `cache_dir` with the user home shortcut `~/`.
- Fixed a bug in our doc-building script where markdown links did not render properly if the "href" part of the link (the part inside the `()`) was on a new line.
- Changed how gradients are zeroed out with an optimization. See this video from NVIDIA at around the 9-minute mark (a common form of the technique is sketched after this list).
- Fixed a bug where parameters to a `FromParams` class that are dictionaries wouldn't get logged when an instance is instantiated via `from_params`.
- Fixed a bug in distributed training where the vocab would be saved from every worker, when it should have been saved by only the local master process.
- Fixed a bug in the calculation of ROUGE metrics during distributed training where the total sequence count was not being aggregated across GPUs.
- Fixed `allennlp.nn.util.add_sentence_boundary_token_ids()` to use the `device` parameter of the input tensor.
- Be sure to close the TensorBoard writer even when training doesn't finish.
- Fixed the docstring for `PyTorchSeq2VecWrapper`.
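
The gradient-zeroing item above refers to a well-known optimization; a common form of it (not necessarily AllenNLP's exact implementation) is to set each parameter's `.grad` to `None` instead of calling `optimizer.zero_grad()`, so the next backward pass assigns fresh gradient tensors rather than accumulating into zeroed buffers:

```python
import torch

model = torch.nn.Linear(4, 2)  # stand-in model

for param in model.parameters():
    param.grad = None  # cheaper than zeroing the gradient tensors in place
```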
Commits
01644ca Pass serialization_dir to Model, DatasetReader, and support include_in_archive (#4713)
1f29f35 Update transformers requirement from <3.4,>=3.1 to >=3.1,<3.5 (#4741)
6bb9ce9 warn about batches_per_epoch with validation loader (#4735)
00bb6c5 Be sure to close the TensorBoard writer (#4731)
3f23938 Update mkdocs-material requirement from <6.1.0,>=5.5.0 to >=5.5.0,<6.2.0 (#4738)
10c11ce Fix typo in PretrainedTransformerMismatchedEmbedder docstring (#4737)
0e64b4d fix docstring for PyTorchSeq2VecWrapper (#4734)
006bab4 Don't use PretrainedModelInitializer when loading a model (#4711)
ce14bdc Allow usage of .tar.gz with PretrainedModelInitializer (#4709)
c14a056 avoid defaulting to CPU device in add_sentence_boundary_token_ids() (#4727)
24519fd fix typehint on checkpointer method (#4726)
d3c69f7 Bump mypy from 0.782 to 0.790 (#4723)
cccad29 Updated AllenNlpTestCase docstring (#4722)
3a85e35 add reasonable timeout to gpu checks job (#4719)
1ff0658 Added logging for the main process when running in distributed mode (#4710)
b099b69 Add top_k and top_p sampling to BeamSearch (#4695)
bc6f15a Fixes rouge metric calculation corrected for distributed training (#4717)
ae7cf85 automatically find local open port in distributed training (#4696)
321d4f4 TrainerCallback with batch/epoch/end hooks (#4708)
001e1f7 new way of setting env variables in GH Actions (#4700)
c14ea40 Save checkpoint before running evaluation (#4704)
40bb47a Load weights to cpu with PretrainedModelInitializer (#4712)
327188b improve memory helper functions (#4699)
90f0037 fix reported batch_loss (#4706)
39ddb52 CLI improvements (#4692)
edcb6d3 Fix a bug in saving vocab during distributed training (#4705)
3506e3f ensure parameters that are actual dictionaries get logged (#4697)
eb7f256 Add StackOverflow link to README (#4694)
17c3b84 Fix small typo (#4686)
e0b2e26 display class decorators in API docs (#4685)
b9a9284 Update transformers requirement from <3.3,>=3.1 to >=3.1,<3.4 (#4684)
d9bdaa9 add build-vocab command (#4655)
ce604f1 Update mkdocs-material requirement from <5.6.0,>=5.5.0 to >=5.5.0,<6.1.0 (#4679)
c3b5ed7 zero grad optimization (#4673)
9dabf3f Add missing tokenizer/transformer kwargs (#4682)
9ac6c76 Allow overrides to be JSON string or dict (#4680)
55cfb47 The truncation setting doesn't do anything anymore (#4672)
990c9c1 clarify conda Python version in README.md
97db538 official support for Python 3.8 🐍 (#4671)
1e381bb Clean up the documentation for beam search (#4664)
11def8e Update bug_report.md
97fe88d Cached path command (#4652)
c9f376b Update transformers requirement from <3.2,>=3.1 to >=3.1,<3.3 (#4663)
e5e3d02 tick version for nightly releases
b833f90 fix multi-line links in docs (#4660)
d7c06fe Expose from_pretrained keyword arguments (#4651)
175c76b fix confusing distributed logging info (#4654)
fbd2ccc fix numbering in RELEASE_GUIDE
2d5f24b improve how cached_path extracts archives (#4645)
824f97d smooth out release process (#4648)
c7b7c00 Feature/prevent temp directory retention (#4643)
de5d68b Fix tensor.nonzero() function overload warning (#4644)
e8e89d5 add flag for saving predictions to 'evaluate' command (#4637)
e4fd5a0 Multi-label F-beta metric (#4562)
f0e7a78 Create Dependabot config file (#4635)
0e33b0b Return consistent types from metrics (#4632)
2df364f Update transformers requirement from <3.1,>=3.0 to >=3.0,<3.2 (#4621)
6d480aa Improve handling of **kwargs in FromParams (#4629)
bf3206a Workaround for Python not finding imports in spawned processes (#4630)