This repository has been archived by the owner on Apr 19, 2023. It is now read-only.

Translate/transcribe audio on upload of video with AWS Lambda/CloudWatch #161

Open
kylekirkby opened this issue Nov 15, 2020 · 15 comments

@kylekirkby

Is your feature request related to a problem? Please describe.
We are building a niche resources hub and have the need to transcribe the audio from a video to create captions. In addition, we would like to translate the captions to a small set of other languages.

Describe the solution you'd like
It would be great to be able to create an S3-triggered Lambda function that kicks off a transcription job for the uploaded video using https://docs.amplify.aws/lib/predictions/transcribe/q/platform/js. The Amplify Predictions library can also translate text into other languages, so ideally a translation job would fire once the video has been transcribed. Note that the Amplify Transcribe API appears to be promise-based, and our videos will be up to an hour long, so running it directly in Lambda and awaiting the promise is probably not practical.
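
A minimal sketch of the first half of this idea, under stated assumptions: it turns the records of an S3 upload notification into request parameters for Amazon Transcribe's StartTranscriptionJob operation, which returns once the job is accepted rather than once the transcript is ready, so the Lambda would not have to wait out an hour-long video. The names `buildJobParams` and `planJobs` and the `en-US` default are hypothetical; only the request field names come from the Transcribe API.

```typescript
// Minimal shape of the S3 notification fields this sketch reads.
interface S3Event {
  Records: { s3: { bucket: { name: string }; object: { key: string } } }[];
}

// Subset of the Amazon Transcribe StartTranscriptionJob request.
interface StartTranscriptionJobParams {
  TranscriptionJobName: string;
  LanguageCode: string;
  MediaFormat: string;
  Media: { MediaFileUri: string };
  OutputBucketName: string;
}

// Hypothetical helper: builds the request for one uploaded object.
function buildJobParams(
  bucket: string,
  rawKey: string,
  outputBucket: string,
  languageCode: string = "en-US" // assumption: source audio is US English
): StartTranscriptionJobParams {
  // S3 URL-encodes object keys in notifications and writes spaces as "+".
  const key = decodeURIComponent(rawKey.replace(/\+/g, " "));
  // Assumes the file extension names a media format Transcribe accepts.
  const mediaFormat = key.slice(key.lastIndexOf(".") + 1).toLowerCase();
  // Job names may only contain letters, digits, ".", "_" and "-" (max 200 chars).
  const jobName = key.replace(/[^0-9a-zA-Z._-]/g, "_").slice(0, 200);
  return {
    TranscriptionJobName: jobName,
    LanguageCode: languageCode,
    MediaFormat: mediaFormat,
    Media: { MediaFileUri: "s3://" + bucket + "/" + key },
    OutputBucketName: outputBucket,
  };
}

// Hypothetical helper: one request per record in the S3 event.
function planJobs(event: S3Event, outputBucket: string): StartTranscriptionJobParams[] {
  return event.Records.map((record) =>
    buildJobParams(record.s3.bucket.name, record.s3.object.key, outputBucket)
  );
}
```

Each returned object could be passed to `StartTranscriptionJobCommand` from the AWS SDK for JavaScript v3 (`@aws-sdk/client-transcribe`). Job names must be unique within an AWS account, so a real handler would probably append a timestamp or upload ID.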

Architecture
Image taken from this source.

Describe alternatives you've considered
We considered using the AWS Predictions API, but there seems to be no support for kicking off an AWS Transcribe job, being notified of the result via CloudWatch, and triggering a response Lambda.
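
For reference, Amazon Transcribe itself (independent of the Amplify Predictions category) publishes job state changes to CloudWatch Events (now EventBridge), so a rule can trigger a follow-up Lambda. The sketch below shows such an event pattern and a hypothetical `nextStep` router; the function name and the action labels it returns are assumptions for illustration only.

```typescript
// Event pattern for a CloudWatch Events / EventBridge rule that matches
// finished Amazon Transcribe jobs.
const transcribeJobRule = {
  source: ["aws.transcribe"],
  "detail-type": ["Transcribe Job State Change"],
  detail: { TranscriptionJobStatus: ["COMPLETED", "FAILED"] },
};

// Subset of the event a rule with this pattern delivers to its target.
interface TranscribeJobStateChange {
  source: string;
  "detail-type": string;
  detail: { TranscriptionJobName: string; TranscriptionJobStatus: string };
}

// Hypothetical result type: what the response Lambda should do next.
interface NextStep {
  action: "translate" | "report-failure" | "ignore";
  jobName: string;
}

// Hypothetical router: decides the follow-up action for one event.
function nextStep(event: TranscribeJobStateChange): NextStep {
  const jobName = event.detail.TranscriptionJobName;
  const isTranscribeEvent =
    event.source === "aws.transcribe" &&
    event["detail-type"] === "Transcribe Job State Change";
  if (!isTranscribeEvent) {
    return { action: "ignore", jobName: jobName };
  }
  if (event.detail.TranscriptionJobStatus === "COMPLETED") {
    return { action: "translate", jobName: jobName };
  }
  if (event.detail.TranscriptionJobStatus === "FAILED") {
    return { action: "report-failure", jobName: jobName };
  }
  return { action: "ignore", jobName: jobName };
}
```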

Additional context

It would be awesome if this could be implemented or if anyone has any suggestions as to how this can be achieved with the current AWS Amplify CLI/Amplify-video plugin.

Thank you in advance!

@wizage
Contributor

wizage commented Dec 8, 2020

@smp would love your input on this one!

@wizage added the enhancement label on Dec 8, 2020
@smp
Contributor

smp commented Dec 13, 2020

@wizage There are a lot of "workflow" features like this one that we may want to start categorizing under one umbrella in order to create an epic for VOD. One approach would be to keep adding functions to the job kickoff Lambda to support orchestration of tasks like captions, metadata extraction, etc. Another approach would be to implement some type of workflow abstraction in the VOD resource that allows users to define a state machine for processing, similar to the VOD solution.

@kylekirkby The core requirement here is caption tracks, correct? Instead of leveraging other categories, we might choose to shield the user from all the intricacies of transcribe/translate/JSON-to-caption conversion and just add a feature to Amplify Video VOD that takes care of this for you if you select an option in the resource configuration. Would that be acceptable?
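
To give a feel for the JSON-to-caption step, here is a rough sketch that converts the `results.items` array of an Amazon Transcribe output file into WebVTT cues. The function names, the eight-words-per-cue grouping, and the decision to attach punctuation to the preceding word are assumptions for illustration, not how Amplify Video does or would do it.

```typescript
// Subset of one entry in `results.items` of an Amazon Transcribe output file.
interface TranscribeItem {
  type: "pronunciation" | "punctuation";
  start_time?: string; // seconds as a string; absent for punctuation
  end_time?: string;
  alternatives: { content: string }[];
}

function pad(value: number, width: number): string {
  let text = String(value);
  while (text.length < width) text = "0" + text;
  return text;
}

// Formats seconds as a WebVTT timestamp (hh:mm:ss.ttt).
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "." + pad(ms, 3);
}

// Hypothetical converter: groups words into cues of at most maxWordsPerCue.
function toWebVtt(items: TranscribeItem[], maxWordsPerCue: number = 8): string {
  const cues: string[] = [];
  let words: string[] = [];
  let cueStart = 0;
  let cueEnd = 0;
  const flush = (): void => {
    if (words.length === 0) return;
    cues.push(
      formatTimestamp(cueStart) + " --> " + formatTimestamp(cueEnd) + "\n" + words.join(" ")
    );
    words = [];
  };
  for (const item of items) {
    const text = item.alternatives.length > 0 ? item.alternatives[0].content : "";
    if (item.type === "punctuation") {
      // Punctuation carries no timing, so attach it to the previous word.
      if (words.length > 0) words[words.length - 1] += text;
      continue;
    }
    // Start a new cue only when the next word arrives, so trailing
    // punctuation stays with the cue it belongs to.
    if (words.length >= maxWordsPerCue) flush();
    if (words.length === 0) cueStart = Number(item.start_time);
    cueEnd = Number(item.end_time);
    words.push(text);
  }
  flush();
  return "WEBVTT\n\n" + cues.join("\n\n") + "\n";
}
```

A production converter would probably also break cues on sentence boundaries and long pauses; this version only caps the word count.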

@kylekirkby
Author

@smp @wizage - thanks for your input on this one! I think the abstraction of the complexity is a good idea! As long as we can caption/translate tracks to a subset of languages then that would be ideal :)
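
As a rough illustration of translating to a subset of languages, the sketch below plans one Amazon Translate TranslateText request per caption cue and target language, which keeps each translated line aligned with its original timing. The helper name `planTranslations` and the language codes in the test data are hypothetical; the three request fields are the ones TranslateText expects.

```typescript
// Subset of the Amazon Translate TranslateText request.
interface TranslateTextParams {
  SourceLanguageCode: string;
  TargetLanguageCode: string;
  Text: string;
}

// Hypothetical helper: one request per (target language, cue text) pair.
function planTranslations(
  cueTexts: string[],
  sourceLanguage: string,
  targetLanguages: string[]
): TranslateTextParams[] {
  const requests: TranslateTextParams[] = [];
  for (const target of targetLanguages) {
    // Nothing to translate when the target equals the source language.
    if (target === sourceLanguage) continue;
    for (const text of cueTexts) {
      requests.push({
        SourceLanguageCode: sourceLanguage,
        TargetLanguageCode: target,
        Text: text,
      });
    }
  }
  return requests;
}
```

Translating cue by cue is the simplest way to preserve timings, at the cost of giving the translator less context per request.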

@kylekirkby
Author

@smp @wizage with the current state of amplify-video, what is the best way to add my own custom lambda functions for achieving this?

@danielvouch

Any updates on the best way to achieve this in the meantime?

@wizage
Contributor

wizage commented Apr 14, 2021

Hey all,

In the meantime, you can achieve this with the two existing Lambdas added on your behalf. We have already written hooks for pre-transcoding and post-transcoding. You can modify these Lambdas by adding a new folder to the project.

  1. Navigate to amplify/backend/video/<projectname>/
  2. Create a new folder called custom
  3. Copy the Lambda function (keeping the folder structure) from build to custom. It should look something like this: custom/LambdaFunctions/InputLambda
  4. Modify the Lambda code to send a request off to Transcribe :)
  5. You might also need to copy over custom/InputTriggerLambda.template and modify the LambdaExecutionRole to include your new permissions.

If you get stuck let me know! (The file names provided should be the right ones, judging by the diagram, but you might need to use the post-transcoding ones if your file format is not supported by Transcribe!)
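
For the permissions in step 5, a hedged sketch of the extra IAM statements such a Lambda would likely need. The helper names and bucket ARNs are hypothetical, and how the statements are merged into the LambdaExecutionRole inside InputTriggerLambda.template depends on that template's actual structure, which is not shown in this thread.

```typescript
interface PolicyStatement {
  Effect: "Allow" | "Deny";
  Action: string[];
  Resource: string | string[];
}

interface PolicyDocument {
  Version: string;
  Statement: PolicyStatement[];
}

// Hypothetical helper: statements for starting Transcribe jobs, calling
// Translate, reading the source video, and writing caption output.
function captionPermissions(inputBucketArn: string, outputBucketArn: string): PolicyStatement[] {
  return [
    {
      Effect: "Allow",
      Action: ["transcribe:StartTranscriptionJob", "transcribe:GetTranscriptionJob"],
      Resource: "*",
    },
    { Effect: "Allow", Action: ["translate:TranslateText"], Resource: "*" },
    { Effect: "Allow", Action: ["s3:GetObject"], Resource: inputBucketArn + "/*" },
    { Effect: "Allow", Action: ["s3:PutObject"], Resource: outputBucketArn + "/*" },
  ];
}

// Returns a new policy document with the extra statements appended,
// leaving the original document untouched.
function withStatements(policy: PolicyDocument, extra: PolicyStatement[]): PolicyDocument {
  return { Version: policy.Version, Statement: policy.Statement.concat(extra) };
}
```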

Edit:
For clarification: the custom folder overrides whatever is in the build folder. The build folder gets updated on every push and every update, which lets us ship updates to customers more easily, so don't make any changes in the build folder. We are working on making this easier to use while still letting us push updates.

@danielvouch

Thanks for the quick response @wizage !

@wizage
Contributor

wizage commented Apr 14, 2021

Sorry about not answering sooner. Been toying with some new ideas, which I hope to share soon :)

@kylekirkby
Author

Thanks for the update @wizage! I’m about to implement this for a serverless resource hub :)

@danielvouch

danielvouch commented Apr 28, 2021

@wizage I've just gotten to this now. I have created the custom directory, but it doesn't seem to be overwriting my build directory. Any ideas why?

My current build directory looks like amplify/video/<projectname>/build/vod-helpers/*

Just confirming that my custom directory should be amplify/video/<projectname>/custom/vod-helpers/* rather than amplify/video/<projectname>/custom/*

@arturocanalda

@wizage I did exactly the same as @danielvouch. InputLambda.zip is created at build time, but the build directory is not overwritten.

Any tips?

@danielvouch

@arturocanalda, I had the same experience but the custom implementation actually overwrites the build implementation once pushed to the cloud (I was a bit confused about this).

I've gotten this to work, so if you need anything let me know!

@arturocanalda

@danielvouch You're totally right. After pushing this last time, it worked as expected. I'm sure I pushed multiple times before and couldn't see any changes in the cloud. Maybe it's because this time I manually deleted the build folder... I can't tell exactly why it worked now, but thanks for the heads up :)

@wizage
Contributor

wizage commented Jun 25, 2021

Yeah sorry for the confusion.

build/ -> Will always be built and have the most recent code in it. We keep it around for those who want to compare it against their custom templates when we release updates. It's a good way to see in git that we updated a template you might have modified, so you can merge the changes.

custom/ -> Lives outside of build. Anything in custom will be pushed instead of its counterpart in build. So even though build/ will exist and might contain the same Lambda, it will not be pushed; custom/ will be pushed instead. Sorry for the confusion.
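
The precedence rule can be pictured with a small sketch. This is only an illustration of "custom wins over build" on a per-path basis, not the plugin's actual implementation; `resolvePushedFiles` and the file paths in the test data are made up.

```typescript
type FileSource = "build" | "custom";

// Hypothetical helper: for every relative path, report which folder's copy
// would be pushed. Paths present in custom/ shadow the same path in build/.
function resolvePushedFiles(
  buildFiles: string[],
  customFiles: string[]
): { [relativePath: string]: FileSource } {
  const resolved: { [relativePath: string]: FileSource } = {};
  for (const path of buildFiles) {
    resolved[path] = "build";
  }
  for (const path of customFiles) {
    // Overwrites the "build" entry when both folders contain the path.
    resolved[path] = "custom";
  }
  return resolved;
}
```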

@kylekirkby
Author

I can confirm that the structure required for custom VOD resources is:
amplify/video/<projectname>/custom/vod-helpers/*

I did come across an error where the VOD stack was left in an UPDATE_COMPLETE_CLEANUP_IN_PROGRESS state. This resolved on its own after waiting around 5 minutes... @wizage is there anything we can do to prevent this from happening?


5 participants