Complete noob questions - 1) model purpose? 2) pre-trained weights? 3) other languages? #10
Thank you @astorfi. Suppose a spoofer were attempting to bypass a system that uses both face and speech recognition. He could hold up a video containing the victim's face and recorded voice on, say, an iPad. He would hide from the detection camera (to defeat facial recognition) and use his own voice rather than the voice from the iPad (to defeat the speech recognition system). A simple countermeasure might be to compare the timing of the spoken words against the timing of the lip movements. Of course this isn't perfect, but it is at least a starting point. Any idea which part of your code I could modify to detect this?
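The timing comparison described above could be sketched as a cross-correlation between two 1-D signals: an audio energy envelope and a lip-opening measurement, both resampled to one value per video frame. This is not part of the repository's code; it is a minimal illustration, and the function name `estimate_av_offset` and the synthetic signals are hypothetical.

```python
import numpy as np

def estimate_av_offset(audio_env, lip_open, fps=25.0):
    """Estimate the audio/video offset in seconds via cross-correlation.

    audio_env and lip_open are 1-D arrays sampled once per video frame.
    A positive result means the audio lags behind the lip movement.
    """
    # Normalize both signals so the correlation peak reflects shape, not scale.
    a = (audio_env - audio_env.mean()) / (audio_env.std() + 1e-8)
    v = (lip_open - lip_open.mean()) / (lip_open.std() + 1e-8)
    # Full cross-correlation; the peak index encodes the relative lag.
    corr = np.correlate(a, v, mode="full")
    lag_frames = corr.argmax() - (len(v) - 1)
    return lag_frames / fps

# Synthetic check: delay a signal by 5 frames and recover the offset.
rng = np.random.default_rng(0)
base = rng.random(200)
shifted = np.roll(base, 5)  # "audio" delayed by 5 frames
offset = estimate_av_offset(shifted, base, fps=25.0)
```

A large residual offset (or a flat correlation with no clear peak) would suggest the audio does not match the visible lip motion, which is the spoofing scenario described above. In practice the lip-opening signal would have to come from a landmark detector and the envelope from short-time audio energy, both of which are beyond this sketch.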
No, I personally did not run it on a non-English dataset, but the paper I mentioned did ("Out of time: automated lip sync in the wild"). As for the question you are asking, unfortunately I am not an expert there.
Thank you!
Excuse my complete noob-ness:

1. Is the model trying to accurately determine whether the video (i.e. the shape of the lips) and the audio are in sync?
2. Are there any pre-trained weights I can download to run it?
3. Assuming my Q1 is correct, has anyone tested whether this model can accurately detect audio/video synchronization for non-English languages?