
In the web demo, inference only supports language-model chat; there is no multimodal inference #461

Closed
a2382625920 opened this issue Feb 27, 2024 · 5 comments

@a2382625920

For example:
I have already fine-tuned Yi-VL-6B and want to chat with it through the web-UI inference page, but that page has no dedicated image-input option and only supports plain chat. Please consider improving this.

@a2382625920
Author

> For example: I have already fine-tuned Yi-VL-6B and want to chat with it through the web-UI inference page, but that page has no dedicated image-input option and only supports plain chat. Please consider improving this.

Correction: without an image input, Yi-VL-6B cannot chat at all, so the web-UI inference only works for pure language models.

@Jintao-Huang
Collaborator

Yes, the web UI does not support multimodal-model inference for now; you can use `swift infer` for command-line inference.
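
For reference, the same command-line inference can also be driven from Python. Below is a minimal sketch, assuming the `swift.llm` entry points (`InferArguments`, `infer_main`) exposed by ms-swift at the time; the `model_type` string and checkpoint path are placeholders to adapt to your setup:

```python
# Minimal sketch: programmatic equivalent of `swift infer` via ms-swift's
# Python API. The model_type value and ckpt_dir are assumptions; check
# `swift infer --help` and your installed ms-swift version.
from swift.llm import InferArguments, infer_main

args = InferArguments(
    model_type='yi-vl-6b-chat',  # assumed model_type name for Yi-VL-6B
    ckpt_dir='output/yi-vl-6b-chat/vx-xxx/checkpoint-xxx',  # your fine-tuned checkpoint
)
infer_main(args)  # starts an interactive command-line chat session
```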

@Jintao-Huang Jintao-Huang added the enhancement New feature or request label Feb 28, 2024
@a2382625920
Author

> Yes, the web UI does not support multimodal-model inference for now; you can use `swift infer` for command-line inference.

OK, thanks. Then would it be possible to loop over a JSON file, extract the key fields from it, and run inference once for every record in the JSON?

@a2382625920
Author

a2382625920 commented Feb 28, 2024

Something like this (this is the batch-prediction code for Qwen-VL):
```python
import json
import time

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

DEFAULT_CKPT_PATH = 'your/checkpoint/path'  # fine-tuned output directory


def _load_model_tokenizer():
    tokenizer = AutoTokenizer.from_pretrained(
        DEFAULT_CKPT_PATH, trust_remote_code=True, resume_download=True,
    )
    model = AutoPeftModelForCausalLM.from_pretrained(
        DEFAULT_CKPT_PATH,  # path to the output directory
        device_map="cuda",
        trust_remote_code=True
    ).eval()
    # model.generation_config = GenerationConfig.from_pretrained(
    #     DEFAULT_CKPT_PATH, trust_remote_code=True, resume_download=True,
    # )
    return model, tokenizer


def _parse_text(text):
    # HTML-escaping helper for display (as in Qwen's web demo).
    lines = text.split("\n")
    lines = [line for line in lines if line != ""]
    count = 0
    for i, line in enumerate(lines):
        if "```" in line:
            count += 1
            items = line.split("`")
            if count % 2 == 1:
                lines[i] = f'<pre><code class="language-{items[-1]}">'
            else:
                lines[i] = "<br></code></pre>"
        else:
            if i > 0:
                if count % 2 == 1:
                    line = line.replace("`", r"\`")
                    line = line.replace("<", "&lt;")
                    line = line.replace(">", "&gt;")
                    line = line.replace(" ", "&nbsp;")
                    line = line.replace("*", "&ast;")
                    line = line.replace("_", "&lowbar;")
                    line = line.replace("-", "&#45;")
                    line = line.replace(".", "&#46;")
                    line = line.replace("!", "&#33;")
                    line = line.replace("(", "&#40;")
                    line = line.replace(")", "&#41;")
                    line = line.replace("$", "&#36;")
                lines[i] = "<br>" + line
    text = "".join(lines)
    return text


def predict(message):
    start = time.time()
    message = _parse_text(message)
    print("User: " + message)
    history = []
    response, history = model.chat(tokenizer, message, history=history)
    full_response = _parse_text(response)
    print("Qwen-VL-Chat: " + full_response)
    print(f"Elapsed: {time.time() - start}s")
    return full_response


model, tokenizer = _load_model_tokenizer()

if __name__ == '__main__':
    with open(r'your/path', 'r', encoding='utf-8') as f:
        valid_data = json.load(f)
    data_list = []
    start = time.time()
    for index, i in enumerate(valid_data):
        image_id = i.get("id")
        print(f'Progress {index + 1}/{len(valid_data)}; {image_id}')
        conversations = i.get("conversations")
        query = conversations[0].get("value")
        true_ = conversations[1].get("value")
        full_response = predict(query)
        print("Ground truth: ", true_)
        data = {"image_id": image_id, "ground_truth": true_, "prediction": full_response, "query": query}
        data_list.append(data)
    print(f"Elapsed: {time.time() - start}s")

    output_path = 'your/predict.json'
    with open(output_path, 'w', encoding='utf-8') as f:
        json.dump(data_list, f, ensure_ascii=False, indent=2)

    print(f"Output saved to {output_path}")
```

@GUOGUO-lab

GUOGUO-lab commented Mar 22, 2024

> Yes, the web UI does not support multimodal-model inference for now; you can use `swift infer` for command-line inference.

In the earlier 1.5.4 version, qwen-vl-chat could be loaded and image inference worked by putting `<img>xxx.jpg</img>` in the input; as of 1.7.0 this is no longer supported either. Could this be restored?
