Insights: modelscope/ms-swift
Overview
4 Releases published by 1 person
129 Pull requests merged by 11 people
-
fix qwen2.5 template
#2081 merged
Sep 20, 2024 -
dynamic vit gradient_checkpointing
#2071 merged
Sep 20, 2024 -
Support Mistral-small-inst-2409
#2077 merged
Sep 20, 2024 -
fix RLHF & max_length
#2075 merged
Sep 19, 2024 -
Update qwen2-vl最佳实践.md
#2058 merged
Sep 19, 2024 -
fix rlhf zero3
#2072 merged
Sep 19, 2024 -
Fix yi template
#2067 merged
Sep 19, 2024 -
fix win32 quote
#2065 merged
Sep 18, 2024 -
update qwen2-vl docs
#2063 merged
Sep 18, 2024 -
fix notebook gradio
#2062 merged
Sep 18, 2024 -
support qwen2.5-coder
#2061 merged
Sep 18, 2024 -
vllm support multi-image
#2059 merged
Sep 18, 2024 -
support qwen2-vl-72b/qwen2.5-math/qwen2.5-coder
#2056 merged
Sep 18, 2024 -
Support qwen2.5
#2054 merged
Sep 18, 2024 -
support qwen2-vl-base
#2052 merged
Sep 18, 2024 -
fix qwen2vl position_ids
#2051 merged
Sep 18, 2024 -
update docs
#2050 merged
Sep 17, 2024 -
llama3 tool calling
#2048 merged
Sep 15, 2024 -
Fix multi coordinate grounding
#2047 merged
Sep 15, 2024 -
support multi bbox grounding
#2045 merged
Sep 15, 2024 -
fix mplug-owl3
#2042 merged
Sep 14, 2024 -
Add longwriter filtered dataset
#2037 merged
Sep 14, 2024 -
fix rlhf & zero3
#2034 merged
Sep 14, 2024 -
Fix olora and pissa file saving, which caused the second save to fail
#2032 merged
Sep 13, 2024 -
fix deploy eval kill
#2029 merged
Sep 12, 2024 -
update code
#2028 merged
Sep 12, 2024 -
refactor rlhf
#1975 merged
Sep 12, 2024 -
Florence use _post_encode & template support encoder-decoder
#2019 merged
Sep 11, 2024 -
Add FAQ Document
#2013 merged
Sep 11, 2024 -
fix lmdeploy qwen_vl
#2009 merged
Sep 11, 2024 -
Support llava1.6-llama3.1-8b-instruct
#2005 merged
Sep 10, 2024 -
Fix rlhf ref model
#2003 merged
Sep 10, 2024 -
compat lmdeploy==0.6
#2001 merged
Sep 10, 2024 -
fix EngineGenerationConfig importError of lmdeploy
#1990 merged
Sep 10, 2024 -
Support Deepseek 2.5
#1992 merged
Sep 10, 2024 -
fix
#1995 merged
Sep 10, 2024 -
fix patch
#1997 merged
Sep 10, 2024 -
fix model_mapping
#1982 merged
Sep 9, 2024 -
fix typo
#1980 merged
Sep 9, 2024 -
Add reflection model
#1973 merged
Sep 7, 2024 -
update docs
#1970 merged
Sep 7, 2024 -
support mplug_owl3
#1957 merged
Sep 7, 2024 -
fix bugs
#1959 merged
Sep 6, 2024 -
Fix the lora hook
#1963 merged
Sep 6, 2024 -
Fix data info print in rlhf
#1964 merged
Sep 6, 2024 -
Add lazy_tokenize to RLHF
#1956 merged
Sep 6, 2024 -
Support minicpm 3
#1952 merged
Sep 5, 2024 -
fix rlhf
#1949 merged
Sep 5, 2024 -
support dynamic_eos
#1947 merged
Sep 5, 2024 -
fix qwen2-vl & video
#1950 merged
Sep 5, 2024 -
fix file rename error in megatron when there are multiple processes
#1948 merged
Sep 5, 2024 -
refactor rlhf
#1885 merged
Sep 5, 2024 -
fix do_sample
#1946 merged
Sep 5, 2024 -
fix lmdeploy seed
#1945 merged
Sep 5, 2024 -
update yi-coder
#1942 merged
Sep 5, 2024 -
fix swift deploy
#1936 merged
Sep 4, 2024 -
fix typing
#1933 merged
Sep 4, 2024 -
Support deploy & logprobs
#1833 merged
Sep 4, 2024 -
[TorchAcc] fix: fix the judgement of fsdp_num
#1903 merged
Sep 4, 2024 -
update docs & fix bug
#1926 merged
Sep 4, 2024 -
update wechat
#1925 merged
Sep 3, 2024 -
[TorchAcc] perf: use xm.save instead of torch.save
#1916 merged
Sep 3, 2024 -
refactor docs
#1915 merged
Sep 3, 2024 -
Refactor docs
#1912 merged
Sep 3, 2024 -
fix web-ui push to hub strategy
#1909 merged
Sep 3, 2024 -
update docs
#1908 merged
Sep 3, 2024 -
deepspeed use cosine lr_scheduler
#1907 merged
Sep 2, 2024 -
support logprobs
#1900 merged
Sep 2, 2024 -
fix push_to_ms
#1901 merged
Sep 2, 2024 -
support custom quantized dataset
#1893 merged
Sep 2, 2024 -
Fix push_to_hub when last-checkpoint
#1897 merged
Sep 2, 2024 -
Add some warnings and fix RLHF
#1890 merged
Sep 2, 2024 -
add vllm lmdeploy benchmark
#1889 merged
Sep 2, 2024 -
Fix push to hub logic
#1888 merged
Sep 2, 2024 -
Refactor push_to_hub
#1883 merged
Sep 2, 2024 -
support qwen2-vl gptq awq
#1884 merged
Sep 2, 2024 -
Support freeze vit
#1880 merged
Aug 31, 2024 -
use model.generation_config
#1850 merged
Aug 31, 2024 -
add duet
#1877 merged
Aug 31, 2024 -
Fix neftune doc
#1875 merged
Aug 31, 2024 -
Fix num_proc
#1874 merged
Aug 30, 2024 -
Add train record
#1873 merged
Aug 30, 2024 -
[TorchAcc] fix several bugs for torchacc FSDP.
#1872 merged
Aug 30, 2024 -
Support faster data map
#1871 merged
Aug 30, 2024 -
update docs qwen2-vl
#1869 merged
Aug 30, 2024 -
fix requirements
#1864 merged
Aug 30, 2024 -
fix qwen2-vl docs
#1861 merged
Aug 30, 2024 -
update qwen2-vl docs
#1858 merged
Aug 29, 2024 -
update qwen2-vl docs
#1856 merged
Aug 29, 2024 -
Update new datasets
#1855 merged
Aug 29, 2024 -
support qwen2-vl & video finetune
#1849 merged
Aug 29, 2024 -
Support qwen2 vl grounding
#1854 merged
Aug 29, 2024 -
Fix Pissa and OLoRA
#1852 merged
Aug 29, 2024 -
Fix some datasets for streaming
#1848 merged
Aug 29, 2024 -
Add internvl2 awq models
#1846 merged
Aug 29, 2024 -
support qwen2-vl
#1842 merged
Aug 29, 2024 -
Support eval_nproc
#1843 merged
Aug 29, 2024 -
fix internlm-xcomposer rlhf
#1838 merged
Aug 28, 2024 -
add ddp_timeout parameter
#1836 merged
Aug 27, 2024 -
support qwen2-pro dataset
#1834 merged
Aug 27, 2024 -
fix inject
#1835 merged
Aug 27, 2024 -
Fix code
#1824 merged
Aug 27, 2024 -
fix minicpm-v 2.6 infer device_map
#1832 merged
Aug 27, 2024 -
use default-lora
#1823 merged
Aug 27, 2024 -
Support register loss func
#1822 merged
Aug 26, 2024 -
fix dora deployment
#1821 merged
Aug 26, 2024 -
Support liger
#1819 merged
Aug 26, 2024 -
fix preprocess_num_proc
#1818 merged
Aug 26, 2024 -
fix mp+ddp & resume_from_checkpoint
#1815 merged
Aug 26, 2024 -
Support zero2 offload
#1814 merged
Aug 26, 2024 -
compat with vllm==0.5.5
#1812 merged
Aug 25, 2024 -
fix
#1811 merged
Aug 23, 2024 -
fix offline export
#1805 merged
Aug 23, 2024 -
Support Latex OCR dataset
#1810 merged
Aug 23, 2024 -
Support hd num
#1801 merged
Aug 23, 2024 -
fix megatron_patch_path
#1804 merged
Aug 23, 2024 -
fix CI
#1797 merged
Aug 23, 2024 -
fix mllm rlhf with full sft type
#1800 merged
Aug 22, 2024 -
fix history_roles
#1798 merged
Aug 22, 2024 -
fix imports
#1796 merged
Aug 22, 2024 -
fix bugs
#1794 merged
Aug 22, 2024 -
fix yi-vl template
#1793 merged
Aug 22, 2024 -
support qwen-vl & base64
#1790 merged
Aug 22, 2024 -
update doc
#1789 merged
Aug 22, 2024 -
ReFT
#1785 merged
Aug 21, 2024 -
support phi3.5-vision
#1780 merged
Aug 21, 2024 -
fix moe & gradient_checkpointing
#1782 merged
Aug 21, 2024 -
fix infer dataset_test_ratio
#1779 merged
Aug 21, 2024 -
Fix zero3 & minicpm-v/internvl2/xcomposer
#1772 merged
Aug 20, 2024
4 Pull requests opened by 4 people
-
Add Internvl2TemplateWithAngles template
#1784 opened
Aug 21, 2024 -
# After inspecting the data, the code below was found to filter out some valid samples, e.g.: sure, here are some tools and …
#1931 opened
Sep 4, 2024 -
[WIP]Feat/refactor3
#2030 opened
Sep 12, 2024 -
update qwen2.5 best practices
#2080 opened
Sep 20, 2024
283 Issues closed by 30 people
-
OOM when tokenizing datasets
#1971 closed
Sep 20, 2024 -
Swift DPO Template format issue
#2023 closed
Sep 20, 2024 -
Model inference save path (result_path)
#2026 closed
Sep 20, 2024 -
qwen2-vl 72B fine-tuning error
#2068 closed
Sep 20, 2024 -
internvl2-8b: merge-lora fails after LoRA fine-tuning (if llm_config['architectures'][0] == 'LlamaForCausalLM':)
#2073 closed
Sep 20, 2024 -
Does DPO fine-tuning compute loss on the prompt part?
#2070 closed
Sep 19, 2024 -
After swift 2.4.2 added qwen2.5 support, yi-1.5-34b inference fails (it worked before the update)
#2066 closed
Sep 19, 2024 -
llava video SFT BUG
#2060 closed
Sep 19, 2024 -
Web UI training on Windows fails: invalid float value: "'1e-4'"
#2036 closed
Sep 18, 2024 -
Multiple models all fail with the same error: sh:1:syntax
#1994 closed
Sep 17, 2024 -
DPO fine-tuning incompatible with zero3
#1899 closed
Sep 17, 2024 -
Error when DPO fine-tuning InternVL2-2B
#1979 closed
Sep 17, 2024 -
How is eval_acc computed?
#2006 closed
Sep 17, 2024 -
Does DPO/RLHF tuning support internVL2 video models?
#2015 closed
Sep 17, 2024 -
DPO training with cogvlm2 on the rlaif-v dataset fails
#2025 closed
Sep 17, 2024 -
Why is sft_type set to full after merge-lora? Shouldn't it be lora?
#2041 closed
Sep 17, 2024 -
internlm-xcomposer2-7b-chat errors with use_flash_attn
#2046 closed
Sep 17, 2024 -
transformers>=4.45.0.dev0
#2018 closed
Sep 14, 2024 -
DPO support resume_from_checkpoint
#2031 closed
Sep 14, 2024 -
Why is sft set to full after merge-lora
#2040 closed
Sep 14, 2024 -
cannot import name 'ftp_head' from 'datasets.utils.file_utils'
#2021 closed
Sep 12, 2024 -
DPO with internvl2 has a mismatch
#1930 closed
Sep 12, 2024 -
Internlm-Xcomposer2.5 inference fails with multiple input images
#2020 closed
Sep 11, 2024 -
Uploading models to Hugging Face from MS Swift
#1981 closed
Sep 11, 2024 -
Issue: RuntimeError with Multiple GPUs in MS Swift 2.5.0-dev
#1984 closed
Sep 11, 2024 -
DPO training error `UnboundLocalError: local variable 'num_patches' referenced before assignment`
#1734 closed
Sep 11, 2024 -
O-lora not usable
#1989 closed
Sep 11, 2024 -
DPO training with a built-in dataset fails
#2008 closed
Sep 11, 2024 -
lr_scheduler_type
#1983 closed
Sep 10, 2024 -
lmdeploy's main branch has removed EngineGenerationConfig; swift errors when used with lmdeploy main
#1985 closed
Sep 10, 2024 -
Timeout error during evaluation
#1974 closed
Sep 10, 2024 -
With deepspeed: RuntimeError: 'weight' must be 2-D
#1996 closed
Sep 10, 2024 -
Can't do RLHF with InternVL2-4B
#1987 closed
Sep 9, 2024 -
qwen2-vl-7b-instruct video inference error
#1988 closed
Sep 9, 2024 -
BUG: init_lora leads to wrong distribution?
#1944 closed
Sep 9, 2024 -
GPU memory keeps growing during InternVL2 full-parameter fine-tuning
#1955 closed
Sep 9, 2024 -
Huggingface link is broken
#1976 closed
Sep 8, 2024 -
Cannot get model_type from the deploy service
#1904 closed
Sep 7, 2024 -
How to manually download evaluation datasets and specify the dataset path at eval time
#1972 closed
Sep 7, 2024 -
ImportError: cannot import name 'LlavaOnevisionForConditionalGeneration' from 'transformers'
#1878 closed
Sep 7, 2024 -
Environment image
#1966 closed
Sep 7, 2024 -
support minicpm3-4b
#1962 closed
Sep 6, 2024 -
GLM-4V-9B fine-tuning error: ValueError: 151339 is not in list
#1840 closed
Sep 5, 2024 -
Qwen2-VL-7B-instruct fine-tuning error: RuntimeError: CUDA error: too many resources requested for launch
#1927 closed
Sep 5, 2024 -
TypeError: Qwen2ForCausalLM.forward() got an unexpected keyword argument '_data'
#1929 closed
Sep 5, 2024 -
Qwen2-VL-7B-Instruct Video inference
#1920 closed
Sep 5, 2024 -
The do_sample argument has no effect during swift infer
#1943 closed
Sep 5, 2024 -
DPO of an MLLM with a custom dataset fails: KeyError: 'prompt'
#1922 closed
Sep 5, 2024 -
Does the merge-LoRA & quantization step support bnb quantization?
#1895 closed
Sep 5, 2024 -
DPO fine-tuning of internvl2
#1886 closed
Sep 5, 2024 -
How should over-long multimodal data (images, videos) be truncated?
#1876 closed
Sep 5, 2024 -
Some datasets emit a max-length warning during processing even though the data doesn't actually exceed it; earlier branches didn't have this problem
#1865 closed
Sep 5, 2024 -
Qwen2-VL-7B fine-tuning out of memory
#1860 closed
Sep 5, 2024 -
Can't find forward pass entrypoint for InternVL2
#1851 closed
Sep 5, 2024 -
Agent tools: the chain of thought decides on multiple actions, but tools_call returns only one
#1826 closed
Sep 5, 2024 -
Getting logits at inference time
#1528 closed
Sep 4, 2024 -
With the vllm backend in swift, the OpenAI API doesn't support logprobs output
#1719 closed
Sep 4, 2024 -
Support for LLaVA-OneVision
#1662 closed
Sep 4, 2024 -
lr_scheduler_type
#1894 closed
Sep 4, 2024 -
qwen2-vl-2b-instruct fine-tuning with a custom dataset raises DatasetGenerationError
#1918 closed
Sep 4, 2024 -
internvl2-26b multi-GPU training error: "Expected all tensors to be on the same device..."
#1910 closed
Sep 3, 2024 -
AssertionError: DeepSpeed does not recognize LR scheduler WarmupCosineLR
#1906 closed
Sep 2, 2024 -
Failed to import swift.llm.sft because of the following error
#1898 closed
Sep 2, 2024 -
internvl-40b inference error after fine-tuning
#1881 closed
Aug 31, 2024 -
How to freeze the ViT part during full parameter fine-tuning of Qwen2-vl?
#1879 closed
Aug 31, 2024 -
zero-3 fine-tuning of internvl2-40B
#1845 closed
Aug 31, 2024 -
qwen2-vl fine-tuning error: module 'torch.nn' has no attribute 'RMSNorm'
#1870 closed
Aug 30, 2024 -
Qwen2-VL-7B fine-tuning out of memory
#1859 closed
Aug 30, 2024 -
Error during DPO fine-tuning of a multimodal model
#1853 closed
Aug 29, 2024 -
ValueError: model_type: 'internvl2-8b-awq' is not registered.
#1847 closed
Aug 29, 2024 -
Is continued LoRA training supported? Getting an error
#1828 closed
Aug 28, 2024 -
Question about training data
#1827 closed
Aug 28, 2024 -
How to handle multi-turn data whose conversations field has only one exchange?
#1816 closed
Aug 28, 2024 -
LoRA fine-tuning MiniCPM-V2.6 on custom data fails mid-training: DataLoader worker (pid 184751) is killed by signal: Segmentation fault.
#1806 closed
Aug 28, 2024 -
With packing enabled, a custom dataset fails with KeyError: 'input_ids'; training works fine without packing
#1795 closed
Aug 28, 2024 -
When will glm4v support lmdeploy deployment?
#1791 closed
Aug 28, 2024 -
Probably caused by fp16; could you try bf16 or fp32?
#1788 closed
Aug 28, 2024 -
The change made to support zsh broke installation on Windows
#1776 closed
Aug 28, 2024 -
internvl-chat-v1_5 DPO error
#1773 closed
Aug 28, 2024 -
Could more examples of custom agent datasets for fine-tuning be added?
#1758 closed
Aug 28, 2024 -
token should only be of type types or str
#1757 closed
Aug 28, 2024 -
In what format is a custom dataset accessed? Do we need to wrap it ourselves?
#1750 closed
Aug 28, 2024 -
Multi-tool invocation issue
#1749 closed
Aug 28, 2024 -
internvl-8b imbalanced GPU allocation with device_max_memory
#1735 closed
Aug 28, 2024 -
qwen 2.7B MoE trains slowly with very low GPU utilization; also, the estimated remaining time needs correcting when resuming from checkpoint
#1732 closed
Aug 28, 2024 -
Multi-node InternVL2 training fails even when following the official scripts exactly; single-node works fine
#1728 closed
Aug 28, 2024 -
Fine-tuning InternVL2 uses a lot of GPU memory
#1726 closed
Aug 28, 2024 -
Uploading images via the web UI fails
#1727 closed
Aug 28, 2024 -
Setting args.preprocess_num_proc to 8 fails
#1718 closed
Aug 28, 2024 -
Multimodal Document Optimization: Custom Datasets & Inference Documents
#1716 closed
Aug 28, 2024 -
qwen2 7B DPO on a single node with 8 GPUs fails
#1715 closed
Aug 28, 2024 -
Text-only fine-tuning of GLM-4V fails: ValueError: 151339 is not in list
#1712 closed
Aug 28, 2024 -
Update docker image
#1711 closed
Aug 28, 2024 -
InternVL2 fine-tuning error Message: 'Error occurs in lazy tokenize:'
#1704 closed
Aug 28, 2024 -
module 'wandb.proto.wandb_internal_pb2' has no attribute 'Result'
#1700 closed
Aug 28, 2024 -
NCCL timeout during full-parameter LLaMA-2 SFT on a single node with multiple GPUs
#1690 closed
Aug 28, 2024 -
Error running swift-webui
#1687 closed
Aug 28, 2024 -
Model compression question
#1683 closed
Aug 28, 2024 -
[Help] Cannot fine-tune Qwen2 72B INT4 on 4x NVIDIA L40
#1676 closed
Aug 28, 2024 -
Does sft support training only specified model parameters?
#1661 closed
Aug 28, 2024 -
After deployment, access works via IP+port but fails via domain name
#1642 closed
Aug 28, 2024 -
swift swift-ui error: unable to invoke subcommand: swift-web-ui (No such file or directory)
#1629 closed
Aug 28, 2024 -
torch._C._LinAlgError: linalg.cholesky even with quant_n_samples raised to 2048
#1627 closed
Aug 28, 2024 -
Florence model fine-tuning error
#1624 closed
Aug 28, 2024 -
How to use swift's template.py together with instruction fine-tuning?
#1623 closed
Aug 28, 2024 -
--model_type gemma2-2b-instruct not supported
#1619 closed
Aug 28, 2024 -
Fine-tuning a locally downloaded SD model still requires downloading the model from the web
#1616 closed
Aug 28, 2024 -
GLM4V runtime error
#1611 closed
Aug 28, 2024 -
Does GLM4V support batched multi-sample requests?
#1608 closed
Aug 28, 2024 -
On the use of --rope_scaling in InternVL2
#1591 closed
Aug 28, 2024 -
DPO with deepspeed errors out
#1574 closed
Aug 28, 2024 -
Fine-tuning with deepspeed errors out
#1570 closed
Aug 28, 2024 -
DDP in DPO tuning on MLLM
#1549 closed
Aug 28, 2024 -
swift pt fails even after deleting all cached datasets and re-downloading
#1544 closed
Aug 28, 2024 -
device_map error for internVL2.0, trained on multi-node
#1538 closed
Aug 28, 2024 -
No module named 'swift.cli.main'
#1536 closed
Aug 28, 2024 -
Inference noticeably slower after fine-tuning deepseekcoderv2; how to activate only the MoE parameters?
#1535 closed
Aug 28, 2024 -
Quantizing the model after fine-tuning fails
#1533 closed
Aug 28, 2024 -
InternVL2 grounding: how to combine multiple <box> or <ref-object> tags in one prompt of a custom dataset
#1532 closed
Aug 28, 2024 -
Florence-2: How to add custom tokens during fine-tuning training?
#1524 closed
Aug 28, 2024 -
Support RL training (e.g. DPO) of multimodal LLMs on cogvlm2
#1523 closed
Aug 28, 2024 -
KTO training dataset error
#1513 closed
Aug 28, 2024 -
flashattention3
#1493 closed
Aug 28, 2024 -
Is parallel tool calling currently supported?
#1491 closed
Aug 28, 2024 -
Can swift train already-quantized models, e.g. AWQ or GPTQ models from modelscope?
#1472 closed
Aug 28, 2024 -
internvl: merged-LoRA inference results differ from the original demo inference
#1471 closed
Aug 28, 2024 -
Fine-tuning data format for function calling
#1458 closed
Aug 28, 2024 -
get_dataset errors, likely from loading too much data
#1453 closed
Aug 28, 2024 -
DPO训练报错 AttributeError: 'Seq2SeqTrainingArguments' object has no attribute 'model_init_kwargs'
#1434 closed
Aug 28, 2024 -
Is inference deployment of NPU AWQ-quantized models supported?
#1429 closed
Aug 28, 2024 -
Model fails to load when using LoRA with resume from checkpoint
#1207 closed
Aug 28, 2024 -
Q-GaLore support
#1388 closed
Aug 28, 2024 -
How to prefetch data
#1379 closed
Aug 28, 2024 -
CogVLM2 Video
#1339 closed
Aug 28, 2024 -
DDP multi-GPU inference not supported
#1377 closed
Aug 28, 2024 -
Can custom evaluation sets and metrics be supported?
#1374 closed
Aug 28, 2024 -
Script usage question
#1358 closed
Aug 28, 2024 -
NPU fine-tuning of qwen-7b-chat fails: AssertionError: Torch not compiled with CUDA enabled
#1353 closed
Aug 28, 2024 -
How to make agent deployment use the OpenAI function-call style instead of the documented ReAct style
#1340 closed
Aug 28, 2024 -
Do VLMs support parallel inference?
#1330 closed
Aug 28, 2024 -
Multi-process data processing fails: TypeError: _LazyConfigMapping.__init__() missing 1 required positional argument: 'mapping'
#1328 closed
Aug 28, 2024 -
Custom dataset docs are hard to follow
#1323 closed
Aug 28, 2024 -
Why doesn't the log print the questions and answers from custom_val_dataset during training and post-merge LoRA inference tests?
#1322 closed
Aug 28, 2024 -
Can the preprocessed dataset be cached?
#1315 closed
Aug 28, 2024 -
GPU memory is nearly full during fine-tuning
#1314 closed
Aug 28, 2024 -
Does --max_length not take effect for custom datasets?
#1297 closed
Aug 28, 2024 -
Model fine-tuning error
#1296 closed
Aug 28, 2024 -
swift keeps making network requests even after the dataset is downloaded
#1285 closed
Aug 28, 2024 -
Quantization stops for no reason when GPTQ quantizing Qwen2-72b-Instruct
#1121 closed
Aug 28, 2024 -
MiniCPM-V model trained on multiple GPUs gives different inference results than single-GPU
#1191 closed
Aug 28, 2024 -
SimPO doesn't support zero3_offload distributed training
#1095 closed
Aug 28, 2024 -
Cannot run swift
#1107 closed
Aug 28, 2024 -
GLM4V fine-tuning OOM
#1122 closed
Aug 28, 2024 -
Can multimodal models be fine-tuned on few-shot (in-context) data?
#1094 closed
Aug 28, 2024 -
Low-cost way to update the swift version
#1089 closed
Aug 28, 2024 -
qwen1.5 110B chat repeats itself after fine-tuning
#1084 closed
Aug 28, 2024 -
On evaluation support for VLMs
#1117 closed
Aug 28, 2024 -
failed (exitcode: -9) local_rank: 0 (pid: 1760809) of binary: /home/xxx/miniconda3/envs/swift/bin/python
#1208 closed
Aug 28, 2024 -
Qwen-VL-Chat cannot emit eos_token after full-parameter fine-tuning
#1215 closed
Aug 28, 2024 -
Internvl-chat-v1.5 multi-GPU inference fails
#1228 closed
Aug 28, 2024 -
chatglm3 swift training used to succeed with the same command; now it runs out of memory shortly after starting. Why?
#1230 closed
Aug 28, 2024 -
How to deploy multimodal models across multiple GPUs
#1236 closed
Aug 28, 2024 -
Why has the memory usage of the latest code increased? It seems there is now a data loading module.
#1253 closed
Aug 28, 2024 -
Why does the latest code seem to have slower training speed and increased memory usage?
#1255 closed
Aug 28, 2024 -
The acc is different when inferring a glm4v model trained based on the Tutorial
#1258 closed
Aug 28, 2024 -
Will the qwen2 model series support fp16 fine-tuning?
#1262 closed
Aug 28, 2024 -
Finetuning Florence on the forgetting problem
#1267 closed
Aug 28, 2024 -
Not working with python virtualenv
#1266 closed
Aug 28, 2024 -
How to set different learning rate for different groups of parameters in fine-tuning?
#1271 closed
Aug 28, 2024 -
Where in the Python code to set the path of a fine-tuned Internvl model
#1281 closed
Aug 28, 2024 -
Can swift run training and inference in k8s?
#1283 closed
Aug 28, 2024 -
Can incremental learning be implemented with lora or llamapro?
#1123 closed
Aug 28, 2024 -
Model is not saved when resume_from_checkpoint
#1829 closed
Aug 28, 2024 -
DPO training with internlm-xcomposer2_5-7b-chat fails on the data
#1831 closed
Aug 28, 2024 -
Is it normal for the validation run during minicpm-v 2.5 SFT to take an extra 10 GB of GPU memory?
#1140 closed
Aug 28, 2024 -
GLM4v resume-from-checkpoint training fails after LoRA fine-tuning
#1133 closed
Aug 28, 2024 -
How to build a dataset with coordinates for training MiniCPM-v2.0 with swift
#1137 closed
Aug 28, 2024 -
How can users implement their own parameter-efficient fine-tuning algorithm?
#1147 closed
Aug 28, 2024 -
Weight quantization of qwen2-72b-instruct aborts unexpectedly
#1151 closed
Aug 28, 2024 -
GLM4v errors at runtime after LoRA fine-tuning and model merge
#1163 closed
Aug 28, 2024 -
Pre-training template error
#1166 closed
Aug 28, 2024 -
Improve the docs for "Self-cognition fine-tuning best practices" and "Custom models and datasets"
#1176 closed
Aug 28, 2024 -
Abnormal GPU memory consumption during training and export
#1179 closed
Aug 28, 2024 -
glm4v fine-tuning hits network issues midway; is there another way to load inputs locally?
#1185 closed
Aug 28, 2024 -
How to resolve inference errors with a quantized model
#940 closed
Aug 28, 2024 -
Multi-node training error
#945 closed
Aug 28, 2024 -
How to save the model locally, and how to load a local model?
#950 closed
Aug 28, 2024 -
Saved model weights are incomplete after multi-node multi-GPU full fine-tuning with zero3
#972 closed
Aug 28, 2024 -
LoRA fine-tuning and inference work with qwen-7b int4/int8, but requests fail after deployment
#935 closed
Aug 28, 2024 -
Training qwen14b: lr stays at 0 early on
#943 closed
Aug 28, 2024 -
Multi-node multi-GPU inference
#927 closed
Aug 28, 2024 -
Very high CPU usage when fine-tuning minicpmv2
#1008 closed
Aug 28, 2024 -
How to merge after QLoRA fine-tuning of qwen2-72b-instruct-awq?
#1839 closed
Aug 28, 2024 -
Difference between placing the ReAct template in system vs. user?
#1025 closed
Aug 28, 2024 -
bitsandbytes was compiled without GPU support
#1031 closed
Aug 28, 2024 -
How to set the grounding format when training minicpm-v with swift
#1077 closed
Aug 28, 2024 -
How to batch-export MS datasets into a format swift can load
#1041 closed
Aug 28, 2024 -
Can a multimodal LLM fine-tuned with swift, e.g. InternVL 1.5, be deployed with lmdeploy?
#1033 closed
Aug 28, 2024 -
Is fine-tuning of the idm-vton model supported?
#1040 closed
Aug 28, 2024 -
Pre-training from random initialization
#1045 closed
Aug 28, 2024 -
Why does LISA save GPU memory?
#1056 closed
Aug 28, 2024 -
[Help] Does MiniCPM2.5 support interactive web inference?
#1060 closed
Aug 28, 2024 -
Retested: by the latest code, do you mean the added if batch[0].get('pixel_values') is not None: check? With batch > 1 the KeyError still occurs
#1061 closed
Aug 28, 2024 -
On multi-image fine-tuning and inference
#1074 closed
Aug 28, 2024 -
Is there a way to extend the context length of c4ai-command-r-plus?
#903 closed
Aug 28, 2024 -
Loading the model fails after LoRA fine-tuning and weight merge
#825 closed
Aug 28, 2024 -
Confusing log output when running C-Eval tests with the Eval module
#905 closed
Aug 28, 2024 -
Following up on #691: after the improvements, the model still doesn't reply according to the fine-tuning.
#880 closed
Aug 28, 2024 -
VRAM requirement for full sft deepseek VL 7B
#860 closed
Aug 27, 2024 -
llama3-8b-instruct AWQ quantization OOM
#852 closed
Aug 27, 2024 -
DDP runs out of GPU memory, while model parallel fine-tunes normally but very slowly
#851 closed
Aug 27, 2024 -
How to run multi-node multi-GPU fine-tuning with Slurm?
#828 closed
Aug 27, 2024 -
qwen1.5-moe-A2.7B-chat-gptq-int4 trains slowly on Windows 10
#811 closed
Aug 27, 2024 -
Abnormal inference after deploying the trained model with Langchain-Chatchat
#794 closed
Aug 27, 2024 -
Support QLoRA with HQQ quantization
#786 closed
Aug 27, 2024 -
Can ddp_timeout be an sft command-line argument?
#773 closed
Aug 27, 2024 -
Request: support self-rewarding optimization methods
#757 closed
Aug 27, 2024 -
swift 1.8 -> 2.0.0: old LoRA script errors
#713 closed
Aug 27, 2024 -
With tuner_backend=swift, rsLoRA loss drops normally during training but the trained model outputs empty results at inference
#707 closed
Aug 27, 2024 -
How to reduce first-token latency for a 72B model
#689 closed
Aug 27, 2024 -
Inference issue when deploying Qwen1.5-MoE-A2.7B-Chat
#678 closed
Aug 27, 2024 -
Request: support PiSSA fine-tuning
#665 closed
Aug 27, 2024 -
Custom inference script cannot use more than two GPUs
#638 closed
Aug 27, 2024 -
Training fails on Windows: dataset character-encoding issue
#686 closed
Aug 27, 2024 -
cogvlm inference cannot use num_beam greater than 1
#636 closed
Aug 27, 2024 -
LoRA fine-tuning of cogvlm with quantization enabled fails
#635 closed
Aug 27, 2024 -
Unable to enable web UI
#1756 closed
Aug 27, 2024 -
lora_modules_to_save + deepspeed zero2 prevents saving checkpoints
#628 closed
Aug 27, 2024 -
Why does fine-tuning Qwen1___5-7B-Chat with 4-bit quantization still use so much GPU memory?
#616 closed
Aug 27, 2024 -
Live inference of fine-tuned multimodal LLMs?
#615 closed
Aug 27, 2024 -
Request: enable RoPE extrapolation during training and inference
#610 closed
Aug 27, 2024 -
After closing and reopening the web page, a recovered running fine-tuning task's status cannot be visualized
#588 closed
Aug 27, 2024 -
Does resuming from checkpoint require more GPU memory?
#570 closed
Aug 27, 2024 -
Building agent fine-tuning data by hand: how to add header info?
#544 closed
Aug 27, 2024 -
What training speed does Alibaba officially achieve for Qwen-audio?
#511 closed
Aug 27, 2024 -
Any advice on implementing dynamic batch size?
#489 closed
Aug 27, 2024 -
Could SPIN self-play fine-tuning be supported?
#466 closed
Aug 27, 2024 -
The web demo only supports LLM chat inference, not multimodal inference
#461 closed
Aug 27, 2024 -
python3.8 not supported
#459 closed
Aug 27, 2024 -
LongLoRA fine-tuning of llama2-7b fails
#449 closed
Aug 27, 2024 -
Do swift's LoRA and adapter fine-tuning handle token embeddings the same way?
#391 closed
Aug 27, 2024 -
web-ui: the system field in LLM training allows command injection
#363 closed
Aug 27, 2024 -
Deploying a swift-fine-tuned checkpoint with the official vllm image fails
#1820 closed
Aug 27, 2024 -
Single-node multi-GPU DDP+MP resume_from_checkpoint fails
#596 closed
Aug 26, 2024 -
How to choose a template (or template_type)
#1813 closed
Aug 25, 2024 -
Single-GPU qwen training briefly stalls
#1802 closed
Aug 23, 2024 -
How to set the hd_num parameter for InternLM-XComposer2d5
#1799 closed
Aug 23, 2024 -
About the parameter sft_type
#1803 closed
Aug 23, 2024 -
How to load a model from DPO checkpoint
#1786 closed
Aug 22, 2024 -
InternVL2-8b DPO Failed mapping
#1768 closed
Aug 22, 2024 -
AttributeError: module 'inspect' has no attribute 'getargspec'. Did you mean: 'getargs'?
#1387 closed
Aug 21, 2024 -
The deployment of llama3.1-405b-instruct
#1484 closed
Aug 21, 2024 -
Evaluating Internvl 76B: deployment completes but no inference runs
#1486 closed
Aug 21, 2024 -
Fine-tuning GLM-4V-9B with this framework gives repeated outputs, and the task fails to fit.
#1502 closed
Aug 21, 2024 -
GLM 4V 9B RuntimeError: Expected all tensors to be on the same device, but found at least two devices
#1547 closed
Aug 21, 2024 -
Full fine-tuning resume_from_checkpoint issue
#1577 closed
Aug 21, 2024 -
sft: adding GPUs doesn't improve speed; remaining_time unchanged even after modifying gradient_accumulation_steps
#1674 closed
Aug 21, 2024 -
The loss is always 0 when fine-tuning internlm-xcomposer2d5-7b
#1680 closed
Aug 21, 2024 -
How to enable visual module fine-tuning?
#1685 closed
Aug 21, 2024 -
qwen2-audio analysis: why does a 60s audio file fail to transcribe?
#1691 closed
Aug 21, 2024 -
internlm-xcomposer2-vl-1_8b training support
#1739 closed
Aug 21, 2024 -
qwen2-audio only trains the LLM part. How to modify target_regex?
#1747 closed
Aug 21, 2024 -
Support using lmdeploy for inference and deployment on videos (internvl2)
#1679 closed
Aug 21, 2024 -
Fine-tuning InternVL2 uses a lot of GPU memory
#1725 closed
Aug 21, 2024 -
Error during MiniCPM fine-tuning & zero3
#1135 closed
Aug 20, 2024
80 Issues opened by 66 people
-
V100 GPUs, Ubuntu 22.04, qwen2-vl-2b: single-GPU test script runs fine; 2-, 3-, and 4-GPU runs fail.
#2087 opened
Sep 20, 2024 -
Qwen2-VL 72B API inference error
#2086 opened
Sep 20, 2024 -
Ascend 910B single-node multi-GPU resume-from-checkpoint issue
#2085 opened
Sep 20, 2024 -
How to configure multiple devices for NPU inference and deployment
#2084 opened
Sep 20, 2024 -
Loading the training dataset fails after upgrading swift to 2.4.0 or later
#2083 opened
Sep 20, 2024 -
dpo InternVL2-8B meets OOM
#2082 opened
Sep 20, 2024 -
Request: support training on the user part, or on both user and assistant parts
#2078 opened
Sep 20, 2024 -
KTO training support for MLLMs
#2076 opened
Sep 20, 2024 -
Training llama 3.1 70B using 4 A6000
#2074 opened
Sep 19, 2024 -
internvl2 + lmdeploy video inference: are there parameters for sampling rate or preprocessing? Inference currently takes too long
#2069 opened
Sep 19, 2024 -
Fine-tuning best practices for qwen2.5-72b-instruct and qwen2-vl-72b-instruct.
#2064 opened
Sep 18, 2024 -
Multi-node training no longer works after updating swift to the latest version
#2057 opened
Sep 18, 2024 -
Data preprocessing error
#2055 opened
Sep 18, 2024 -
Pixtral 12B
#2053 opened
Sep 17, 2024 -
How to freeze part of a pretrained model, attach a custom PyTorch model, and train it?
#2049 opened
Sep 16, 2024 -
LISA training either OOMs or errors when using deepspeed
#2035 opened
Sep 14, 2024 -
Seeking advice on fine-tuning + quantization
#2033 opened
Sep 13, 2024 -
internvl2-8b OOMs when training inside Docker
#2027 opened
Sep 12, 2024 -
How to track pooling stride and frame count of llava-next-video in Swift
#2024 opened
Sep 12, 2024 -
Can't swift do continued pre-training of a model? How exactly is it done?
#2022 opened
Sep 12, 2024 -
Dataset format compatibility between LLaVA-NexT and Qwen-VL2 for custom JSON datasets
#2017 opened
Sep 11, 2024 -
internvl2-40b infer
#2016 opened
Sep 11, 2024 -
Deployment or Export
#2014 opened
Sep 11, 2024 -
Qwen2-VL-7B-Instruct training runs out of GPU memory
#2010 opened
Sep 11, 2024 -
Regular lora target module cannot use lmdeploy
#2007 opened
Sep 10, 2024 -
train_on_input
#2004 opened
Sep 10, 2024 -
The inference API doesn't support returning multiple result sets
#2002 opened
Sep 10, 2024 -
Cannot evaluate the LoRA-fine-tuned llava1.5 model
#2000 opened
Sep 10, 2024 -
MiniCPM-V-2 inference after LoRA fine-tuning fails: AssertionError: Current sentence length exceeds the model max_length: 4096
#1999 opened
Sep 10, 2024 -
internlm2_5-20b-chat quantized model doesn't support vllm inference
#1998 opened
Sep 10, 2024 -
internvl 40B: error when saving the checkpoint and loading it for inference
#1993 opened
Sep 9, 2024 -
Inference Not Working on cuda:1
#1991 opened
Sep 9, 2024 -
LLaVA-NeXT-Video model configuration initialize error
#1986 opened
Sep 9, 2024 -
llava1_5-7b-instruct eval outputs normally at first, then stops producing valid output
#1978 opened
Sep 8, 2024 -
Dataset preparation for Object Detection with Florence2
#1977 opened
Sep 8, 2024 -
mplug-owl3-7b-chat fine-tuning document
#1969 opened
Sep 7, 2024 -
Dataset stuck when using --dataloader_num_workers 1 and --streaming true
#1968 opened
Sep 7, 2024 -
Full fine-tuning of minicpm-V2.6 still leaves lora settings in the generated sft_args.json
#1967 opened
Sep 6, 2024 -
How to also train the head while training with LoRA
#1965 opened
Sep 6, 2024 -
qwen2vl deployed via deploy errors under concurrent requests
#1961 opened
Sep 6, 2024 -
Request: sequence_parallel support for RLHF
#1958 opened
Sep 6, 2024 -
qwen2-vl-7b-instruct fails to start the vLLM inference engine: assert "factor" in rope_scaling
#1954 opened
Sep 5, 2024 -
How to set model-internal parameters in swift, e.g. internvl's max_dynamic_patch
#1953 opened
Sep 5, 2024 -
NPU qwen2 model inference error
#1951 opened
Sep 5, 2024 -
Is training multimodal LLMs on NPU supported?
#1941 opened
Sep 5, 2024 -
Fine-tuning is very slow
#1940 opened
Sep 5, 2024 -
Very low GPU memory utilization when reading data in streaming mode
#1939 opened
Sep 5, 2024 -
Training stops for `KTO` after model loads into memory.
#1938 opened
Sep 4, 2024 -
qwen2_audio_7b_instruct vLLM inference error
#1937 opened
Sep 4, 2024 -
Insufficient GPU memory when fine-tuning Qwen2_VL-2B
#1935 opened
Sep 4, 2024 -
glm4v fine-tuning still overflows GPU memory even with checkpointing added to the vision part (lora_target_modules set to 'ALL')
#1934 opened
Sep 4, 2024 -
Error when clicking "Use via API" on the training/inference page.
#1932 opened
Sep 4, 2024 -
gradient_checkpointing support for the vision module
#1928 opened
Sep 4, 2024 -
minicpm-v-v2.6 evaluation sometimes yields 0 results
#1924 opened
Sep 3, 2024 -
Support for Fine-Tuning Best Practices with LLaVA-OV
#1923 opened
Sep 3, 2024 -
[Feature request] Video inference for async client requests with Internvl2 + vLLM backend
#1921 opened
Sep 3, 2024 -
MooER audio support request
#1919 opened
Sep 3, 2024 -
llava-llama-3-8b-v1_1 AttributeError: 'NoneType' object has no attribute 'get_output_embeddings'
#1911 opened
Sep 3, 2024 -
qwen2-vl-chat-instruct example data format
#1896 opened
Sep 2, 2024 -
internvl2-llama3-76b fine-tuning error
#1892 opened
Sep 2, 2024 -
How to change the training loss type? Custom loss function?
#1891 opened
Sep 2, 2024 -
qwen2-vl fine-tuning with flash_attn errors
#1887 opened
Sep 2, 2024 -
CUDA error: too many resources requested for launch (V100, qwen2-vl)
#1867 opened
Aug 30, 2024 -
Memory(not GPU RAM) exceeds when using 'swift deploy'
#1866 opened
Aug 30, 2024 -
minicpm-V-2 best practices: model outputs nothing during inference
#1863 opened
Aug 30, 2024 -
🎉Support for finetuning of Qwen2-VL-Chat series models
#1857 opened
Aug 29, 2024 -
Request: add support for vllm's CPU mode
#1844 opened
Aug 29, 2024 -
How to use XLA/TPU? Potentially SPMD FSDP with InternVL model
#1837 opened
Aug 27, 2024 -
Multi-node training speed issue
#1825 opened
Aug 27, 2024 -
Training suddenly fails mid-run: NCCL watchdog thread terminated with exception
#1817 opened
Aug 26, 2024 -
Phi3.5-vision-instruct fine-tuning best practices. (Latex OCR Fine-tuning)
#1809 opened
Aug 23, 2024 -
Add support for inference accelerators
#1808 opened
Aug 23, 2024 -
lr_scheduler_type
#1807 opened
Aug 23, 2024 -
[BUG] chatGLM4 fine-tuning: evaluation takes too long with predict_with_generate=True
#1792 opened
Aug 22, 2024 -
Multimodal evaluation data download fails
#1787 opened
Aug 22, 2024 -
Problems doing LoRA training again after LoRA fine-tuning MiniCPM-V-2_6 and merging
#1778 opened
Aug 21, 2024
28 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Intern (书生) 26B: merging the adapter after DPO fails
#1667 commented on
Aug 23, 2024 • 0 new comments -
Can /v1/embeddings API calls be supported?
#807 commented on
Aug 27, 2024 • 0 new comments -
export problem: get_model_tokenizer_with_flash_attn() got multiple values for keyword argument 'automodel_class'
#836 commented on
Aug 28, 2024 • 0 new comments -
GPU memory usage issue in versions after 2.0.4
#922 commented on
Aug 28, 2024 • 0 new comments -
Is a custom lr_scheduler supported?
#1075 commented on
Aug 28, 2024 • 0 new comments -
Loss and acc drop to 0 after several steps
#1062 commented on
Aug 28, 2024 • 0 new comments -
Could fine-tuning of InternLM2-Math-Plus-Mixtral8x22B be supported?
#1019 commented on
Aug 28, 2024 • 0 new comments -
Request: use TensorRT to accelerate training and inference
#942 commented on
Aug 28, 2024 • 0 new comments -
Process hang with futex(0x7f403c0199d0, FUTEX_WAIT, 14826, NULL
#1128 commented on
Aug 28, 2024 • 0 new comments -
minicpm-v-v2_5-chat: vpm fine-tuning overflows GPU memory
#1286 commented on
Aug 28, 2024 • 0 new comments -
Error running the eval script on a custom evaluation dataset, with the model served via vllm.entrypoints.openai.api_server
#1295 commented on
Aug 28, 2024 • 0 new comments -
On agent fine-tuning data
#1351 commented on
Aug 28, 2024 • 0 new comments -
Reproducing the Agent best-practice results with Qwen2-7b-instruct outputs unexpected <|endoftext|> and <|im_start|>
#1155 commented on
Aug 28, 2024 • 0 new comments -
Florence2 batched inference
#1441 commented on
Aug 28, 2024 • 0 new comments -
Could multi-card inference on Ascend NPU be added?
#1469 commented on
Aug 28, 2024 • 0 new comments -
smooth quant support
#1489 commented on
Aug 28, 2024 • 0 new comments -
RAG support
#1548 commented on
Aug 28, 2024 • 0 new comments -
Could an expert-parallelism parameter be added for MoE model training?
#1631 commented on
Aug 28, 2024 • 0 new comments -
Any plans to support the DDPO proposed in RLHF-V?
#1639 commented on
Aug 28, 2024 • 0 new comments -
support llama3 megatron
#1736 commented on
Aug 28, 2024 • 0 new comments -
OOM when fine-tuning on 50K samples with 3 GPUs on a single node
#1729 commented on
Aug 30, 2024 • 0 new comments -
swift quantization of the multimodal LLM internvl2-26B fails
#1504 commented on
Aug 31, 2024 • 0 new comments -
ModelScope NPU training & deployment discussion group
#1589 commented on
Sep 4, 2024 • 0 new comments -
AttributeError: module 'transformers_modules.InternVL2-2B-1epoch.tokenization_internlm2' has no attribute 'InternLM2Tokenizer'
#1663 commented on
Sep 8, 2024 • 0 new comments -
SWIFT 2.4 TO DO LIST
#1617 commented on
Sep 10, 2024 • 0 new comments -
Best practice for Qwen2-Audio
#1653 commented on
Sep 11, 2024 • 0 new comments -
Best Practices for Inference and Fine-Tuning with MiniCPM-V 2.6
#1613 commented on
Sep 15, 2024 • 0 new comments -
grad_norm becomes NaN during DPO training
#923 commented on
Sep 17, 2024 • 0 new comments