Could you please share the inference code used to generate the Stage 3 dataset from ActivityNet/DiDeMo? Specifically, I am looking for the inference configuration the model used to generate the responses. I tried the following, but the responses differ from the shared Stage 3 data.
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("checkpoints/vicuna-7b-v1.5/")
model = AutoModelForCausalLM.from_pretrained("checkpoints/vicuna-7b-v1.5/")
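For reference, here is a minimal sketch of how the generation step could be pinned down so that runs are at least reproducible. It is not the authors' configuration: the helper names (`build_vicuna_prompt`, `build_generation_kwargs`, `generate`), the default values for `max_new_tokens`, `temperature`, and `top_p`, and the choice of greedy decoding are all assumptions made for illustration. The prompt wrapper follows the conversation template commonly used with Vicuna v1.5 in FastChat; whether the Stage 3 data was produced with that template is unknown.

```python
# Assumed system prompt: the FastChat-style template commonly used with Vicuna v1.5.
VICUNA_SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_vicuna_prompt(user_message, system=VICUNA_SYSTEM):
    """Wrap a single user message in the assumed Vicuna conversation template."""
    return f"{system} USER: {user_message} ASSISTANT:"


def build_generation_kwargs(greedy=True, max_new_tokens=512, temperature=0.7, top_p=0.9):
    """Return keyword arguments for `model.generate`.

    All defaults are illustrative guesses, not values confirmed by the authors.
    Greedy decoding is deterministic; sampling depends on the random seed.
    """
    kwargs = {"max_new_tokens": max_new_tokens, "do_sample": not greedy}
    if not greedy:
        kwargs.update(temperature=temperature, top_p=top_p)
    return kwargs


def generate(user_message, model_dir="checkpoints/vicuna-7b-v1.5/", seed=0, **overrides):
    """Generate one response; heavy imports are deferred until this is called."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed

    set_seed(seed)  # only matters when sampling is enabled
    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForCausalLM.from_pretrained(model_dir)
    model.eval()

    inputs = tokenizer(build_vicuna_prompt(user_message), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, **build_generation_kwargs(**overrides))

    # Drop the prompt tokens so that only the newly generated text is decoded.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

If the shared data was produced with sampling enabled, an exact match is unlikely even with identical settings unless the seed, library versions, and hardware also match, so knowing the original decoding parameters would help narrow down the gap.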
Input Prompt
ActivityNet VideoID - v_QOlSCBRmfWY
Output in the shared Stage 3 JSON
User: What was the primary purpose of the video?
Assistant: The primary purpose of the video appears to be showcasing the young woman's dance routine, highlighting her movements and expression while executing them.
User: Can you pinpoint when the girl changes her pose or dance style?
Assistant: From to , the girl transitions from one pose or dance style to another as the camera captures her movements.
User: Is there any specific reason for the change in dance style or pose?
Assistant: Based on the video content, there is no evident reason for the change in dance style or pose. It could be a creative decision by the videographer or an expression of the dancer's artistic expression.
Generated Output
User: What is the primary action in the video?
Assistant: The primary action in the video is the young woman’s dance, which spans from to .
User: Can you pinpoint when she starts dancing?
Assistant: The dancing commences from to .
User: What is the woman’s expression during her dance?
Assistant: I’m unable to provide an answer as the video doesn’t contain any explicit expressions. However, the woman’s movements can be observed to gain insights into her emotions.
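To make "different" more concrete, the two dialogues can be compared turn by turn. The sketch below is only an illustration using Python's standard `difflib`; the helper names `parse_turns` and `turn_similarity` are hypothetical and not part of the repository.

```python
from difflib import SequenceMatcher


def parse_turns(text):
    """Split 'User: ...' / 'Assistant: ...' lines into (speaker, utterance) pairs."""
    turns = []
    for line in text.strip().splitlines():
        speaker, sep, utterance = line.partition(":")
        if sep and speaker.strip() in ("User", "Assistant"):
            turns.append((speaker.strip(), utterance.strip()))
    return turns


def turn_similarity(reference, generated):
    """Return a character-level similarity ratio in [0, 1] for each aligned turn."""
    pairs = zip(parse_turns(reference), parse_turns(generated))
    return [SequenceMatcher(None, ref[1], gen[1]).ratio() for ref, gen in pairs]
```

A ratio of 1.0 means the two turns are identical; lower values indicate that the questions or answers have diverged.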
Hi @huangb23,
Thanks for sharing the code for this great work!
Input Prompt
![image](https://private-user-images.githubusercontent.com/7984207/342442641-a0b72b0a-f259-4703-bf2b-14570737cd85.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjEwNTU3MzksIm5iZiI6MTcyMTA1NTQzOSwicGF0aCI6Ii83OTg0MjA3LzM0MjQ0MjY0MS1hMGI3MmIwYS1mMjU5LTQ3MDMtYmYyYi0xNDU3MDczN2NkODUucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDcxNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA3MTVUMTQ1NzE5WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9NzY1OTNmN2YwOGFkM2FjZjE3NWFmNTJhMjE0ZDkyNTliNGRjZDQ5MGQyMjk0NzExNTNjNmYxZmVkMTE3YzE0YyZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.UyKzebPgESp_12T9jazkWDO3doTOjtTU-BDPFStLiaE)