Insights: NVIDIA/NeMo
Overview
65 Pull requests merged by 31 people
-
SDXL improvements (and support for Draft+)
#9654 merged
Jul 11, 2024 -
Cherry pick: LITA Integration
#9684 merged
Jul 11, 2024 -
chore: Pin branch in notebooks
#9697 merged
Jul 11, 2024 -
[TTS] Add fullband mel codec checkpoints
#9704 merged
Jul 11, 2024 -
NeMo performance feature documentation
#9482 merged
Jul 11, 2024 -
Alit/mamba
#9696 merged
Jul 11, 2024 -
[Nemo CICD] Add a bit more for timeout
#9702 merged
Jul 11, 2024 -
add documentation for reset_lr feature
#9639 merged
Jul 11, 2024 -
Release
2.0.0rc1
#9631 merged
Jul 11, 2024 -
Huvu/mcore t5
#9677 merged
Jul 11, 2024 -
Distributed checkpointing user guide
#9494 merged
Jul 11, 2024 -
[NeMo-UX] Fix imports so local configuration of runs works again
#9690 merged
Jul 11, 2024 -
Parametrize FPS group
#9669 merged
Jul 11, 2024 -
[Nemo-UX] Including all trainable-params in a PEFT-checkpoint
#9650 merged
Jul 11, 2024 -
LITA integration
#9578 merged
Jul 11, 2024 -
[NeMo-UX] Fix pipeline parallel bug
#9661 merged
Jul 10, 2024 -
Update llama-3 PEFT notebook to download model from NGC
#9667 merged
Jul 10, 2024 -
Cherry-pick megatron export fix from main
#9643 merged
Jul 10, 2024 -
[NeMo-UX] Make 'load_directly_on_device' configurable
#9657 merged
Jul 10, 2024 -
Added CPU offloading docs
#9479 merged
Jul 10, 2024 -
unpin transformers version
#9606 merged
Jul 10, 2024 -
Contrastive Reranker/Reward model
#9171 merged
Jul 10, 2024 -
Revert "enables default data step in megatron parallel to operate on a wider variety of tensors"
#9666 merged
Jul 10, 2024 -
Parametrize FPS group
#9648 merged
Jul 10, 2024 -
enables default data step in megatron parallel to operate on a wider variety of tensors
#9641 merged
Jul 10, 2024 -
llama CI fix
#9663 merged
Jul 10, 2024 -
Use FP8 in GPT TP2 test
#9451 merged
Jul 9, 2024 -
Fixing import error for llama-index (RAG pipeline)
#9662 merged
Jul 9, 2024 -
[NeMo-UX] Fix pipeline parallel bug
#9637 merged
Jul 9, 2024 -
Triton deployment improvements for in-framework models
#9600 merged
Jul 9, 2024 -
SDXL improvements (and support for Draft+) [DRAFT PR]
#9543 merged
Jul 9, 2024 -
Fixing import error for llama-index (RAG pipeline)
#9651 merged
Jul 9, 2024 -
[Nemo CICD] Docker temp files auto-cleanup
#9642 merged
Jul 9, 2024 -
[NeMo-UX] Fix when optimizers are setup for PEFT
#9619 merged
Jul 9, 2024 -
[Cherrypick] support lora when kv_channel != hidden_size / num_heads
#9644 merged
Jul 8, 2024 -
Support LoRA when kv_channel != hidden_size / num_heads
#9636 merged
Jul 8, 2024 -
Nemotron export - fixing megatron_export.py
#9625 merged
Jul 8, 2024 -
Improve error messaging during trt-llm export
#9638 merged
Jul 8, 2024 -
Adding support for mcore generate
#9566 merged
Jul 8, 2024 -
ci: Timeout per step, not job
#9635 merged
Jul 8, 2024 -
Mistral + Mixtral Support for NeVa
#9459 merged
Jul 8, 2024 -
Add REST API to deploy module
#9539 merged
Jul 8, 2024 -
Update NeMo Clip to Use MCore Modules
#9594 merged
Jul 8, 2024 -
[Nemo-UX] Expose transformer_layer_spec inside GPTConfig
#9592 merged
Jul 8, 2024 -
MCore T5 support for NeMo - Training
#9432 merged
Jul 8, 2024 -
Unwrap ckpt_io for model opt (async save) (#9622)
#9634 merged
Jul 8, 2024 -
Change default parallel_save to False
#9633 merged
Jul 8, 2024 -
Unwrap ckpt_io for model opt (async save)
#9622 merged
Jul 8, 2024 -
Change default parallel_save to False
#9632 merged
Jul 8, 2024 -
Fix the arguments of forward_for_export function in msdd_models
#9624 merged
Jul 8, 2024 -
[NeMo-UX] async checkpointing support
#9466 merged
Jul 8, 2024 -
nemo gemma to hf conversion
#9629 merged
Jul 7, 2024 -
Alit/mamba
#9575 merged
Jul 6, 2024 -
[NeMo-UX] fix pretraining data sizes and weights
#9627 merged
Jul 6, 2024 -
NeVA Minor Fixes
#9608 merged
Jul 6, 2024 -
fix ckpt load bug
#9621 merged
Jul 6, 2024 -
Change mixtral moe key name for trt-llm
#9620 merged
Jul 5, 2024 -
Enable MCore checkpointing optimizations
#9505 merged
Jul 5, 2024 -
TitaNet Batch Verify Speaker
#9337 merged
Jul 5, 2024 -
Alit/mamba tmp
#9612 merged
Jul 5, 2024 -
fix: remove non_blocking from PTL's .cuda call
#9618 merged
Jul 5, 2024 -
Akoumparouli/nemo ux mixtral export
#9603 merged
Jul 5, 2024 -
fix converter default args
#9565 merged
Jul 5, 2024 -
Remove .cuda calls, use device instead
#9602 merged
Jul 5, 2024 -
Akoumparouli/mistral import instruct chat template fix
#9567 merged
Jul 5, 2024
32 Pull requests opened by 22 people
-
Add long context recipe
#9623 opened
Jul 5, 2024 -
Adding mamba embedding model
#9646 opened
Jul 9, 2024 -
[NeMo-UX] Fix when optimizers are setup for PEFT
#9647 opened
Jul 9, 2024 -
Huvu/rag import fix
#9652 opened
Jul 9, 2024 -
Making TDT models support all-positive durations (previously duration must contain 0)
#9656 opened
Jul 9, 2024 -
In framework export
#9658 opened
Jul 9, 2024 -
[TTS] Add VietnameseCharsTokenizer
#9665 opened
Jul 10, 2024 -
Fix the serialization of partial functions in nemo 2.0
#9668 opened
Jul 10, 2024 -
Canary Adapters tutorial
#9670 opened
Jul 10, 2024 -
enables default data step in megatron parallel to operate on a wider variety of tensors - second try
#9671 opened
Jul 10, 2024 -
Gemma 2
#9672 opened
Jul 10, 2024 -
[NeMo-UX] Make 'load_directly_on_device' configurable
#9674 opened
Jul 10, 2024 -
Enabling bias_dropout_add_fused with no bias term
#9676 opened
Jul 10, 2024 -
Fix for `train.controlnet.controlnet_v1_5_1node_100steps`
#9678 opened
Jul 10, 2024 -
Adding support for mcore T5 Eval - SFT - PEFT
#9679 opened
Jul 10, 2024 -
Fix few issues and docs for neva and clip in r2.0.0rc1
#9681 opened
Jul 10, 2024 -
Speeds up copying of necessary artifact files with SaveRestoreConnector
#9682 opened
Jul 11, 2024 -
[NeMo-UX] Added Megatron loss function and LR scheduler
#9683 opened
Jul 11, 2024 -
Release automation
#9687 opened
Jul 11, 2024 -
add auto configurator to NeMo
#9688 opened
Jul 11, 2024 -
NeVa::forward - remove device syncs (torch.where) and vectorize over batch dimensions
#9689 opened
Jul 11, 2024 -
[Nemo-UX] Including all trainable-params in a PEFT-checkpoint
#9691 opened
Jul 11, 2024 -
Updating num_weights check in unit_test of RETRO
#9693 opened
Jul 11, 2024 -
[NeMo-UX] Fix imports so local configuration of runs works again
#9694 opened
Jul 11, 2024 -
Update the test checking for cooperative kernels in conditional nodes.
#9698 opened
Jul 11, 2024 -
add dummy vision and text transformer config (assumed mcore to be false)
#9699 opened
Jul 11, 2024 -
add documentation for reset_lr feature
#9700 opened
Jul 11, 2024 -
[NeMo-UX] Fix non-distributed optimizer save
#9703 opened
Jul 11, 2024 -
Integrate TRT-LLM v0.11
#9705 opened
Jul 11, 2024 -
chore: Pin branch in notebooks
#9706 opened
Jul 11, 2024 -
[NeMo-UX] Match nemo 1's default behavior for drop_last and pad_samples_to_global_batch_size
#9707 opened
Jul 12, 2024
9 Issues closed by 4 people
-
Error(s): ConfidenceConfig.__init__() got an unexpected keyword argument 'measure_cfg'
#9357 closed
Jul 12, 2024 -
`EncDecCTCModel.transcribe(audio=...)` changed to `EncDecCTCModel.transcribe(paths2audio_files=...)`
#9230 closed
Jul 11, 2024 -
Add sequence packing and proper attention masking support for LLM pretraining?
#9664 closed
Jul 10, 2024 -
Issue Resuming Training from Checkpoint with Small Validation Dataset
#9317 closed
Jul 9, 2024 -
Can we add emotions to the produced audio?
#9498 closed
Jul 8, 2024 -
Can't launch NeMo containers with CUDA support
#9268 closed
Jul 7, 2024 -
video input 'image_aspect_ratio=pad' not work
#9395 closed
Jul 6, 2024 -
Slow training on Mixtral-8x22B when DP size > 1
#9031 closed
Jul 6, 2024
9 Issues opened by 7 people
-
Question: Which decoder are we supposed to use on parakeet-tdt_ctc-1.1b model?
#9695 opened
Jul 11, 2024 -
Add KV-Cache for MegatronLMEncoderDecoderModel
#9686 opened
Jul 11, 2024 -
RuntimeError: Error(s) in loading state_dict for MegaMolBARTModel after ANY fine tuning
#9685 opened
Jul 11, 2024 -
Util for measuring MFU?
#9673 opened
Jul 10, 2024 -
When should mcore_gpt: True be used?
#9659 opened
Jul 9, 2024 -
More complete example of using S3CheckpointIO
#9645 opened
Jul 8, 2024 -
How to adapt myself speaker model into the diarization pipeline?
#9630 opened
Jul 7, 2024 -
RuntimeError "Unexpected key" when running checkpoint_converters script convert_got_nemo_to_mcore.py
#9626 opened
Jul 5, 2024
39 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
SlimIPL -Iterative pseudo labeling
#9193 commented on
Jul 11, 2024 • 33 new comments -
Yuya/add checkpoints section
#9329 commented on
Jul 12, 2024 • 15 new comments -
[PyTorch] Add context parallel support for packed dataset in THD format
#9540 commented on
Jul 10, 2024 • 13 new comments -
[NeMo-UX] Make TE and Apex dependencies optional
#9550 commented on
Jul 11, 2024 • 5 new comments -
Updating NeVA tutorial
#9588 commented on
Jul 8, 2024 • 3 new comments -
Wrap all fp8 extra states in LocalNonpersistentObject
#9422 commented on
Jul 8, 2024 • 3 new comments -
draft: bert cp support
#9491 commented on
Jul 5, 2024 • 2 new comments -
Allows non-strict load with distributed checkpoints
#9613 commented on
Jul 11, 2024 • 1 new comment -
Can you support DoRA?
#9520 commented on
Jul 5, 2024 • 0 new comments -
Use torch sdpa implementation in ASR mha
#9590 commented on
Jul 9, 2024 • 0 new comments -
Unable to disable validation
#9385 commented on
Jul 6, 2024 • 0 new comments -
Adding support for mcore T5 PEFT
#9584 commented on
Jul 10, 2024 • 0 new comments -
NeMo MoE docs
#9579 commented on
Jul 11, 2024 • 0 new comments -
Add "offline" data cache generation support
#9576 commented on
Jul 5, 2024 • 0 new comments -
Draft: Add MegatronNevaDeployable for serving multimodal models on Triton server
#9553 commented on
Jul 12, 2024 • 0 new comments -
support save tensorrt_llm checkpoint
#9552 commented on
Jul 12, 2024 • 0 new comments -
setuptools 70.0.0 results in ImportError: cannot import name 'packaging' from 'pkg_resources'
#9284 commented on
Jul 6, 2024 • 0 new comments -
Context parallel does not work in some cases which works well using megatron-lm directly
#8992 commented on
Jul 8, 2024 • 0 new comments -
Jpg2p jun18
#9538 commented on
Jul 12, 2024 • 0 new comments -
fix NameError: name 'ApexGuardDefaults' is not defined
#9497 commented on
Jul 11, 2024 • 0 new comments -
Why isn't FSDP supported by DistributedCheckpointIO?
#9394 commented on
Jul 9, 2024 • 0 new comments -
TE guard to avoid MPI dependency
#9489 commented on
Jul 12, 2024 • 0 new comments -
Tiledsiglip
#9441 commented on
Jul 8, 2024 • 0 new comments -
Draft: Add LoRA test with sequence parallelism
#9433 commented on
Jul 8, 2024 • 0 new comments -
CONF-TSASR
#8709 commented on
Jul 10, 2024 • 0 new comments -
Add T5TTS
#9406 commented on
Jul 10, 2024 • 0 new comments -
Riva and k2 ASR WFST decoding (2)
#9391 commented on
Jul 6, 2024 • 0 new comments -
Speaker Diarization - Extracting speaker embeddings for labels from files with multiple speakers from Cluster Diarizer models
#8171 commented on
Jul 10, 2024 • 0 new comments -
Speeds up copying of necessary artifact files with SaveRestoreConnector
#9299 commented on
Jul 6, 2024 • 0 new comments -
Remove some duplicate code.
#9280 commented on
Jul 10, 2024 • 0 new comments -
Speaker Diarization goes haywire due to small segments of audio
#9523 commented on
Jul 10, 2024 • 0 new comments -
Fixed chokepoint in diarization for longer audios
#9114 commented on
Jul 8, 2024 • 0 new comments -
New mcore transformer block spec
#9035 commented on
Jul 10, 2024 • 0 new comments -
Draft: triton inference server for NeMo ASR
#8673 commented on
Jul 9, 2024 • 0 new comments -
Flashlight and Pyctcdecode decoders
#8428 commented on
Jul 8, 2024 • 0 new comments -
Phonemes from file
#8054 commented on
Jul 8, 2024 • 0 new comments -
Fastconformer-CTC crashing with Watchdog caught collective operation timeout
#9563 commented on
Jul 11, 2024 • 0 new comments -
Canary model stuck in a loop? Just repeats the same phrases over and over.
#9030 commented on
Jul 11, 2024 • 0 new comments -
RAM memory leaks for EncDecCTCModelBPE at inference
#9428 commented on
Jul 11, 2024 • 0 new comments