When completions streaming mode is enabled, the response no longer includes the `usage` field.
#291
Clear and concise description of the problem
As the official cookbook article How to stream completions notes, a downside of streaming is that the response does not include the `usage` field, so you cannot tell how many tokens a completion consumed.
Personally, I think it would be useful to count tokens on the client side. Users wouldn't have to check the daily usage breakdown on their account page, and it would make for a more responsive and user-friendly experience.
Suggested solution
I have actually implemented this feature on the front end already, using the @dqbd/tiktoken library, a third-party TypeScript port of the official tiktoken library. OpenAI also provides an example of how to count tokens with tiktoken. For the specific implementation, please refer to my repo.
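The front-end idea can be sketched as follows. This is a minimal illustration, not the repo's actual code: `countStreamedTokens` and `stubEncode` are hypothetical names, and the stub tokenizer (one "token" per whitespace-separated word) only stands in for a real encoder such as `encoding_for_model("gpt-3.5-turbo")` from @dqbd/tiktoken, so the example runs without the wasm module.

```typescript
// `Encode` abstracts the tokenizer; in a real app it would wrap
// @dqbd/tiktoken's encoding_for_model("gpt-3.5-turbo").encode.
type Encode = (text: string) => number[];

function countStreamedTokens(chunks: string[], encode: Encode): number {
  // Concatenate the streamed deltas back into the full completion and
  // tokenize once; tokenizing chunk-by-chunk can miscount because a
  // token boundary may span two chunks.
  const fullText = chunks.join("");
  return encode(fullText).length;
}

// Naive stub tokenizer: one "token" per whitespace-separated word.
// It exists only so this sketch is self-contained.
const stubEncode: Encode = (text) =>
  text.split(/\s+/).filter((w) => w.length > 0).map((_, i) => i);

// Example: three streamed deltas forming "Hello there, world!"
const usage = countStreamedTokens(["Hello ", "there, ", "world!"], stubEncode);
console.log(usage); // 3 with the stub tokenizer
```

Counting once over the joined text is the important design choice; with a real BPE tokenizer, summing per-chunk counts would not be reliable.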
Alternative
Maybe there is a way to implement it on the back end by exposing an API, but I have not managed to achieve that so far: it seems impossible to load a wasm file when deploying on Vercel. I followed the tutorial in the Vercel docs and tried several plugins to load the wasm file, but failed. If anyone knows how to do this, please let me know! 😁
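One commonly suggested workaround for bundling wasm, assuming a Next.js project deployed on Vercel, is to enable webpack's async WebAssembly support. This is an untested sketch of that configuration, not a verified fix for this case:

```javascript
// next.config.js — enable webpack 5's asyncWebAssembly experiment so a
// wasm module (e.g. tiktoken's) can be bundled into server functions.
// Commonly suggested configuration; not verified to fix this issue.
module.exports = {
  webpack(config) {
    config.experiments = { ...config.experiments, asyncWebAssembly: true };
    return config;
  },
};
```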
Additional context
I have not optimized my code, but it suffices for now. There are some bugs, as shown below:

The first completion is primed with `\n\n`, and 20 tokens are used. After running some tests, I have observed that the token count of a completion seems to equal the token count of the completion content only, indicating that special tokens and line breaks are not included in the count (please refer to the code for more details).

The second completion has exactly the same content as the first one, but is not primed with `\n\n`. Since `\n\n` is encoded as the single token 271, the priming accounts for exactly one token, so the result is 19, which is exactly what we expected.

But the paradox is that the daily usage page reports 19 for both completions. I have no idea why; it requires further testing.
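The 20-versus-19 arithmetic above can be made concrete with a toy encoder. The token id 271 for `\n\n` is taken from the observation above, and `encodeStub` is a hypothetical stand-in (one token per word for the content), not tiktoken itself:

```typescript
// Toy encoder reproducing the observed difference: a completion primed
// with "\n\n" counts exactly one token more than the same content
// without it, because "\n\n" encodes to a single token (id 271).
const encodeStub = (text: string): number[] => {
  const tokens: number[] = [];
  let rest = text;
  if (rest.startsWith("\n\n")) {
    tokens.push(271); // "\n\n" as one token, per the observation above
    rest = rest.slice(2);
  }
  // Toy rule: one token per word for the remaining content.
  for (const w of rest.split(/\s+/)) if (w) tokens.push(w.length);
  return tokens;
};

const content = "a b c d e f g h i j k l m n o p q r s"; // 19 words
console.log(encodeStub("\n\n" + content).length); // 20 (primed)
console.log(encodeStub(content).length);          // 19 (unprimed)
```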
If you know about this, please let me know! I would appreciate it.
In addition, I feel that my implementation is still quite rough and only supports the 'gpt-3.5' model; I have not tested it on other models. If you have any advice, please let me know too.