
⚡️ perf: Optimize the image upload size for gpt-4-vision #669

Merged
merged 7 commits into lobehub:main from pref/compress_image on Dec 18, 2023

Conversation

mushan0x0
Contributor

@mushan0x0 commented Dec 15, 2023

💻 Change Type

  • ✨ feat
  • 🐛 fix
  • ♻️ refactor
  • 💄 style
  • 🔨 chore
  • 📝 docs

🔀 Description of Change

Cap the image's maximum width or height at 2K, then convert the image format to WebP.

Close #668
Close #646
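
For context, here is a minimal sketch of the approach described above, assuming a browser canvas pipeline and that 2K means 2048 px; the helper and constant names are illustrative, and the actual code in src/services/file.ts may differ:

```ts
// Illustrative sketch only: cap the longer edge at 2K, then re-encode as WebP.
const MAX_SIZE = 2048;

export const compressImage = (dataUrl: string): Promise<string> =>
  new Promise((resolve, reject) => {
    const img = new Image();
    img.onload = () => {
      // Scale down so the longer edge is at most MAX_SIZE, keeping the aspect ratio.
      const scale = Math.min(1, MAX_SIZE / Math.max(img.width, img.height));
      const canvas = document.createElement('canvas');
      canvas.width = Math.round(img.width * scale);
      canvas.height = Math.round(img.height * scale);
      const ctx = canvas.getContext('2d');
      if (!ctx) return reject(new Error('2d context unavailable'));
      ctx.drawImage(img, 0, 0, canvas.width, canvas.height);
      // Re-encode as WebP; browsers without WebP encoding fall back to PNG.
      resolve(canvas.toDataURL('image/webp'));
    };
    img.onerror = reject;
    img.src = dataUrl;
  });
```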

📝 Additional Information

After compression:
[image]

Before compression:
[image]


vercel bot commented Dec 15, 2023

@mushan0x0 is attempting to deploy a commit to the LobeHub Team on Vercel.

A member of the Team first needs to authorize it.

@lobehubbot
Member

👍 @mushan0x0

Thank you for raising your pull request and contributing to our community.
Please make sure you have followed our contributing guidelines. We will review it as soon as possible.
If you encounter any problems, please feel free to contact us.

src/services/file.ts (resolved review thread)
@arvinxx
Contributor

arvinxx commented Dec 15, 2023

Also, on the implementation approach, it is worth deciding whether to store only the compressed thumbnail, or to keep the raw image alongside the thumbnail.


@canisminor1990
Member

There is an existing compression helper for custom avatars: https://github.com/lobehub/lobe-chat/blob/main/src/utils/imageToBase64.ts
It feels like the two could be merged.

@mushan0x0
Contributor Author

> There is an existing compression helper for custom avatars: https://github.com/lobehub/lobe-chat/blob/main/src/utils/imageToBase64.ts It feels like the two could be merged.

I started with that one, but it contains image-centering logic; rather than merging them, it is cleaner to extract a separate file.


codecov bot commented Dec 15, 2023

Codecov Report

Attention: 23 lines in your changes are missing coverage. Please review.

Comparison is base (b142c17) 87.55% compared to head (be7a692) 87.36%.
Report is 1 commit behind head on main.

Files | Patch % | Lines
src/services/file.ts | 30.30% | 23 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #669      +/-   ##
==========================================
- Coverage   87.55%   87.36%   -0.19%     
==========================================
  Files         171      172       +1     
  Lines        8045     8107      +62     
  Branches      719      724       +5     
==========================================
+ Hits         7044     7083      +39     
- Misses       1001     1024      +23     

☔ View full report in Codecov by Sentry.

@mushan0x0 force-pushed the pref/compress_image branch 2 times, most recently from 9fb1840 to 016acd9, on December 15, 2023 at 14:10
@mushan0x0 changed the title from "⚡️ feat: Optimize the image upload size for gpt-4-vision" to "⚡️ pref: Optimize the image upload size for gpt-4-vision" on Dec 15, 2023
@mushan0x0 changed the title from "⚡️ pref: Optimize the image upload size for gpt-4-vision" to "⚡️ perf: Optimize the image upload size for gpt-4-vision" on Dec 15, 2023
async uploadFile(file: DB_File) {
// Skip the image upload test
@mushan0x0
Contributor Author

The image compression here can only be skipped in the tests; it would be very hard to mock.

@arvinxx
Contributor

arvinxx commented Dec 16, 2023

You can use the LobeChat Test Engineer to help you write unit tests.

Many of the unit tests in LobeChat today were written with its help: https://shareg.pt/xHPM9NJ
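
(For illustration only: one way the canvas path could be stubbed under Vitest with a jsdom environment. This is a hypothetical sketch; the compressImage helper and its import path are assumed names, not necessarily what this PR uses.)

```ts
import { describe, expect, it, vi } from 'vitest';

describe('image compression', () => {
  it('re-encodes the uploaded image as WebP', async () => {
    // Stub the canvas APIs a compression helper would touch.
    vi.spyOn(HTMLCanvasElement.prototype, 'getContext').mockReturnValue({
      drawImage: vi.fn(),
    } as unknown as CanvasRenderingContext2D);
    vi.spyOn(HTMLCanvasElement.prototype, 'toDataURL').mockReturnValue(
      'data:image/webp;base64,AAAA',
    );

    // Fake image decoding: fire onload as soon as src is assigned.
    vi.stubGlobal(
      'Image',
      class {
        width = 4096;
        height = 4096;
        onload: (() => void) | null = null;
        set src(_value: string) {
          queueMicrotask(() => this.onload?.());
        }
      },
    );

    // `compressImage` and its path are hypothetical names for this sketch.
    const { compressImage } = await import('./compressImage');
    const result = await compressImage('data:image/png;base64,BBBB');
    expect(result.startsWith('data:image/webp')).toBe(true);
  });
});
```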

@mushan0x0
Contributor Author

> Also, on the implementation approach, it is worth deciding whether to store only the compressed thumbnail, or to keep the raw image alongside the thumbnail.

There is no need to store the original image; it takes up space and is a hassle, and the WebP version is not much different from the original anyway.


@canisminor1990
Member

canisminor1990 commented Dec 15, 2023

One more thing to consider: compression is currently triggered by resolution. Would it be better to decide based on a maximum blob.size instead? In theory we should keep the higher-resolution image whenever conditions allow. Just an idea; implementing it might need several compression passes, so performance would probably suffer.
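
(A rough sketch, for illustration, of what the size-based loop being described might look like; this is hypothetical and not what the PR implements.)

```ts
// Hypothetical sketch of size-based compression: lower the WebP quality step
// by step until the encoded blob fits under maxBytes. Each pass re-encodes
// the whole image, which is why several loops would hurt performance.
const compressToMaxBytes = async (
  canvas: HTMLCanvasElement,
  maxBytes: number,
): Promise<Blob> => {
  let quality = 0.92;
  let blob = await encode(canvas, quality);
  while (blob.size > maxBytes && quality > 0.3) {
    quality -= 0.1;
    blob = await encode(canvas, quality);
  }
  return blob;
};

const encode = (canvas: HTMLCanvasElement, quality: number): Promise<Blob> =>
  new Promise((resolve, reject) => {
    canvas.toBlob(
      (blob) => (blob ? resolve(blob) : reject(new Error('encode failed'))),
      'image/webp',
      quality,
    );
  });
```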


@Wxh16144
Contributor

I would also like to get involved, so I have added some test cases on this PR. Please take a look and see if they work 😁


@arvinxx
Contributor

arvinxx commented Dec 16, 2023

> One more thing to consider: compression is currently triggered by resolution. Would it be better to decide based on a maximum blob.size instead? In theory we should keep the higher-resolution image whenever conditions allow. Just an idea; implementing it might need several compression passes, so performance would probably suffer.

@canisminor1990 I don't think so; it should be based on resolution. The reason is that the GPT-4V model charges by resolution:

[image]

refs: https://platform.openai.com/docs/guides/vision/low-or-high-fidelity-image-understanding

My current implementation hard-codes auto mode, so the model determines the detail level from the resolution on its own. For example, a 512x512 image sent over will cost 65 tokens.

Take an image with a resolution of 12800x25600: if we cap the file size but not the resolution, it might only shrink to 5120x12800, which satisfies our size limit, but once it is sent to GPT-4V it wastes far more tokens. A lower resolution may well be enough to describe the content clearly.
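
(For reference, a small helper that applies the high-detail token formula from the linked OpenAI vision guide, 85 base tokens plus 170 per 512 px tile after the documented downscaling, to show why resolution rather than byte size drives the cost; the function is an illustration, not code from this PR.)

```ts
// Illustrative token-cost estimate for GPT-4V high-detail images, following
// the formula in the OpenAI vision guide: scale to fit within 2048x2048,
// scale the shortest side to 768, count 512 px tiles at 170 tokens each,
// then add 85 base tokens.
const highDetailTokens = (width: number, height: number): number => {
  // Fit within a 2048 x 2048 square.
  const fit = Math.min(1, 2048 / Math.max(width, height));
  let w = width * fit;
  let h = height * fit;
  // Scale so the shortest side is at most 768 px.
  const shrink = Math.min(1, 768 / Math.min(w, h));
  w *= shrink;
  h *= shrink;
  const tiles = Math.ceil(w / 512) * Math.ceil(h / 512);
  return 85 + 170 * tiles;
};

// Example: a 5120x12800 image scales to 768x1920, giving 2x4 tiles and 1445
// tokens, whereas a lower resolution (or low-detail mode at a flat 85 tokens)
// would be far cheaper, which is why the cap is on resolution rather than bytes.
console.log(highDetailTokens(5120, 12800)); // 1445
```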

@lobehubbot
Copy link
Member

Bot detected the issue body's language is not English, translate it automatically. 👯👭🏻🧑‍🤝‍🧑👫🧑🏿‍🤝‍🧑🏻👩🏾‍🤝‍👨🏿👬🏿


Another point to think about is that currently compressed images are judged by resolution. Is it better to judge by the maximum value of blob.size? In theory, higher-definition images should be retained when conditions permit. This is just an idea. The implementation may trigger multiple compression cycles, and the performance may not be very good.

@canisminor1990 I don’t think so, it should be based on the resolution. The reason is that the GPT-4v model charges based on resolution:

image

At present, the auto mode is hard-coded in my implementation, so the model will automatically recognize the resolution. For example, if a 512x512 picture is sent, it will cost 65 tokens.

If it is a picture with a resolution of 12800x25600, if the size is limited but not the resolution, it may be reduced to 5120x 12800, which meets our size requirements. However, it is easy to cause more waste of tokens after issuing it to 4v. After all, it is possible to describe clearly using low resolution.

@canisminor1990
Member

> @canisminor1990 I don't think so; it should be based on resolution. The reason is that the GPT-4V model charges by resolution: [...]

I see, I had not looked into that in detail before.


@arvinxx
Contributor

arvinxx commented Dec 17, 2023

@mushan0x0 Could you merge main? There are conflicts.


@mushan0x0
Contributor Author

OK, I will deal with it later.


@arvinxx merged commit d038d24 into lobehub:main on Dec 18, 2023
2 of 5 checks passed
@lobehubbot
Member

❤️ Great PR @mushan0x0 ❤️

The growth of the project is inseparable from user feedback and contributions; thanks for your contribution! If you are interested in the LobeHub developer community, please join our Discord and then DM @arvinxx or @canisminor1990. They will invite you to our private developer channel, where we discuss lobe-chat development and share AI news from around the world.

@lobehubbot
Member

🎉 This PR is included in version 0.114.3 🎉

The release is available on:

Your semantic-release bot 📦🚀

@mushan0x0 deleted the pref/compress_image branch on December 28, 2023 at 15:14