
Generalize T5 modules (#5166)
* initial commit

* general self attn

* fixing bugs, adding tests, adding docs

* updating other modules

* refactor

* bug fix

* update changelog

* fix shape

* fix format

* address feedback

* small doc fix

* Update allennlp/modules/transformer/transformer_stack.py

Co-authored-by: Pete <petew@allenai.org>

* remove old file

Co-authored-by: epwalsh <epwalsh10@gmail.com>
Co-authored-by: Pete <petew@allenai.org>
3 people authored Jun 2, 2021
1 parent 5b111d0 commit b0aa1d4
Showing 13 changed files with 861 additions and 472 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -37,6 +37,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Added a `min_steps` parameter to `BeamSearch` to set a minimum length for the predicted sequences.
 - Added the `FinalSequenceScorer` abstraction to calculate the final scores of the generated sequences in `BeamSearch`.
 - Added `shuffle` argument to `BucketBatchSampler` which allows for disabling shuffling.
+- Added `allennlp.modules.transformer.attention_module` which contains a generalized `AttentionModule`. `SelfAttention` and `T5Attention` both inherit from this.
 - Added a `Constraint` abstract class to `BeamSearch`, which allows for incorporating constraints on the predictions found by `BeamSearch`,
   along with a `RepeatedNGramBlockingConstraint` constraint implementation, which allows for preventing repeated n-grams in the output from `BeamSearch`.
 - Added `DataCollator` for dynamic operations for each batch.
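A minimal sketch of what the new changelog entry describes; the import path and the inheritance relationship are taken from this commit, and nothing beyond that is shown because constructor signatures are not part of this diff:

```python
# Sketch only: the module path and the base-class relationship come from this commit.
from allennlp.modules.transformer.attention_module import (
    AttentionModule,
    SelfAttention,
    T5Attention,
)

# Both generalized attention variants inherit from the shared AttentionModule base class.
assert issubclass(SelfAttention, AttentionModule)
assert issubclass(T5Attention, AttentionModule)
```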
2 changes: 1 addition & 1 deletion allennlp/modules/transformer/__init__.py
@@ -131,7 +131,7 @@ def forward(self, token_ids: torch.LongTensor, mask: torch.BoolTensor):
     TransformerEmbeddings,
     ImageFeatureEmbeddings,
 )
-from allennlp.modules.transformer.self_attention import SelfAttention
+from allennlp.modules.transformer.attention_module import SelfAttention, T5Attention
 from allennlp.modules.transformer.activation_layer import ActivationLayer
 from allennlp.modules.transformer.transformer_layer import AttentionLayer, TransformerLayer
 from allennlp.modules.transformer.transformer_stack import TransformerStack
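For downstream code, the practical effect of this `__init__.py` change (together with the "remove old file" step in the commit message) is an import-path migration. The sketch below is hypothetical user code, not code from this commit:

```python
# Hypothetical downstream usage, shown for illustration only.

# Before this commit (the self_attention module was removed):
# from allennlp.modules.transformer.self_attention import SelfAttention

# After this commit, matching the updated __init__.py:
from allennlp.modules.transformer.attention_module import SelfAttention, T5Attention

# The same names are also re-exported at the package level:
# from allennlp.modules.transformer import SelfAttention, T5Attention
```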
