calganaygun/YoutubeTranscriber

YoutubeTranscriber

A Python script that searches for strings in YouTube videos.

It uses the Google Cloud Speech-to-Text API to generate transcripts. You can use Google Cloud Free Tier credits.

How to use?

  1. Clone this repo.
  2. Sign in to GCP.
  3. Go to the Speech-to-Text API page, select a project, and enable the API.
  4. Click "Credentials".
  5. Click "Create Credentials".
  6. Select "Service Account Key".
  7. Under "Service Account" select "New service account".
  8. Name the service account.
  9. Select Role: "Project" -> "Owner".
  10. Finish creating the credential.
  11. Select your service account from the list.
  12. Click the "Add Key" button and select "Create New Key".
  13. Leave the "JSON" option selected.
  14. Click "Create".
  15. Save the generated API key file to the repo's main directory.
  16. Rename the file to "api-key.json", or set the GC_CREDENTIAL environment variable to your JSON file name when running the Docker image.
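
The choice in step 16 amounts to a lookup with a fallback: use the file named by GC_CREDENTIAL when it is set, otherwise use "api-key.json". The sketch below is only an assumption about how such a lookup could be written, not the script's actual code; `resolve_credential_path` and `DEFAULT_KEY_FILE` are hypothetical names introduced for illustration.

```python
import os

# Default file name taken from step 16 of the instructions.
DEFAULT_KEY_FILE = "api-key.json"


def resolve_credential_path(env=None):
    """Return the key file name to use (hypothetical helper).

    Uses GC_CREDENTIAL when it is set and non-empty,
    otherwise falls back to the default file name.
    """
    env = os.environ if env is None else env
    return env.get("GC_CREDENTIAL") or DEFAULT_KEY_FILE


if __name__ == "__main__":
    print(resolve_credential_path())
```

When running the image, Docker's `-e` flag sets an environment variable for the container, for example `-e GC_CREDENTIAL=my-key.json`.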

This project uses parallel processing. Use the NUM_OF_THREADS environment variable to set the number of concurrent threads when running the Docker image. Unless you change it, the program uses 8 threads.
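
As a rough illustration of the thread setting, the sketch below reads NUM_OF_THREADS, falls back to 8, and maps a worker function over audio chunks with a thread pool. It is a minimal sketch under stated assumptions, not the project's implementation: `thread_count`, `transcribe_chunk`, and `transcribe_all` are hypothetical names, and `transcribe_chunk` is a placeholder standing in for a real Speech-to-Text request.

```python
import os
from concurrent.futures import ThreadPoolExecutor

DEFAULT_THREADS = 8  # default stated in the README


def thread_count(env=None):
    """Read NUM_OF_THREADS, falling back to the default on missing or invalid values."""
    env = os.environ if env is None else env
    try:
        n = int(env.get("NUM_OF_THREADS", ""))
    except ValueError:
        return DEFAULT_THREADS
    return n if n > 0 else DEFAULT_THREADS


def transcribe_chunk(chunk):
    """Placeholder for one transcription request (assumption, not the real API call)."""
    return chunk.upper()


def transcribe_all(chunks, workers):
    """Process chunks concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcribe_chunk, chunks))


if __name__ == "__main__":
    print(transcribe_all(["part one", "part two"], thread_count()))
```

Threads suit this kind of workload because each request mostly waits on the network, so several can be in flight at once.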

# Build Docker image
docker build . -t calganaygun/youtube-transcriber:latest

# Run program and search or generate text
docker run -it calganaygun/youtube-transcriber:latest -v <YouTube Video ID> \
-w <Search string | getAll: prints all of video content> \
-l <Language code Example: 'tr-TR'>
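
The three flags above could be handled along the following lines. This is a sketch under assumptions, not the script's actual code: the names `build_parser` and `select_lines` are hypothetical, marking all three flags as required is a guess, and the case-insensitive substring match is one possible reading of "search string". Only the `getAll` behavior (print the whole transcript) comes from the usage text.

```python
import argparse


def build_parser():
    """Build a parser for the -v, -w and -l flags shown in the usage text."""
    parser = argparse.ArgumentParser(description="Search a video transcript")
    parser.add_argument("-v", dest="video_id", required=True, help="YouTube video ID")
    parser.add_argument("-w", dest="word", required=True, help="search string, or getAll")
    parser.add_argument("-l", dest="language", required=True, help="language code, e.g. tr-TR")
    return parser


def select_lines(transcript_lines, word):
    """Return every line for getAll, otherwise only lines containing the search string."""
    if word == "getAll":
        return list(transcript_lines)
    needle = word.lower()  # assumption: matching ignores case
    return [line for line in transcript_lines if needle in line.lower()]


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.video_id, args.word, args.language)
```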
