sermon-app is a collection of Christian sermons (typically Cantonese) in text or audio format. Texts from audio sources are generated from the raw recordings using a speech-to-text engine such as Azure or Whisper (Whisper performs considerably better than Azure).
The sermon books compiled from the generated LaTeX source in this repo are listed below.
sermon book series | file path in this repo | link |
---|---|---|
宣道傳意 講道講章 Alliance Communications Ministry | ./pdf/sermon_ACSMHK.pdf | link |
漢語聖經協會 講道講章 Chinese Bible International | ./pdf/sermon_CBI.pdf | link |
中國神學研究院 講道講章 China Graduate School of Theology | ./pdf/sermon_CGST.pdf | link |
崇基神學院 崇拜講章 Div. Schl of Chung Chi College | ./pdf/sermon_DSCCC_2009-present.pdf | link |
流堂 崇拜講章 Flow Church | ./pdf/sermon_FLWC_2021-present.pdf | link |
宣道會錦繡堂 崇拜講章 Christian Missionary Alliance Fairview Church | ./pdf/sermon_FVC_2017-present.pdf | link |
港九培靈研經會講章 Hong Kong Bible Conference | ./pdf/sermon_HKBC_1928-2007.pdf | link |
| | ./pdf/sermon_HKBC_2008-present.pdf | link |
JohnsonNg Youtube Channel | ./pdf/sermon_JNG_2012-18.pdf | link |
| | ./pdf/sermon_JNG_2019-20.pdf | link |
| | ./pdf/sermon_JNG_2021-22.pdf | link |
| | ./pdf/sermon_JNG_2023-24.pdf | link |
播道會港福堂 崇拜講章 EFCC Kong Fok Church | ./pdf/sermon_KFC_2020-present.pdf | link |
The Porch, Dallas, TX 75251 | ./pdf/sermon_PORCH_2014-present.pdf | link |
沙田浸信會 Shatin Baptist Church | ./pdf/sermon_STBC_2020-present.pdf | link |
葡萄藤教會 The Vine Church | ./pdf/sermon_VINE_2020-present.pdf | link |
環球聖經公會 講道講章 Worldwide Bible Society | ./pdf/sermon_WWBS.pdf | link |
播道會恩福堂 崇拜講章 Yan Fook Church & Youth | ./pdf/sermon_YFCX_2020-2023.pdf | link |
| | ./pdf/sermon_YFCX_2024-2027.pdf | link |
中華宣道會友愛堂信培部 Yau Oi School | ./pdf/sermon_YOS.pdf | link |
sermon source | transcript total count | recent development activity |
---|---|---|
ACSMHK | 8.6% ( 948 / 10976) | 21.7% ( 564 / 2602) |
CBI | 0.2% ( 27 / 10976) | 1.6% ( 42 / 2602) |
CGST | 2.0% ( 218 / 10976) | 0.9% ( 24 / 2602) |
DSCCC | 6.5% ( 713 / 10976) | 1.1% ( 30 / 2602) |
FLWC | 1.9% ( 212 / 10976) | 3.3% ( 85 / 2602) |
FVC | 11.2% ( 1231 / 10976) | 4.2% ( 110 / 2602) |
HKBC | 14.2% ( 1559 / 10976) | 0.1% ( 3 / 2602) |
JNG | 27.3% ( 2997 / 10976) | 9.3% ( 243 / 2602) |
KFC | 7.4% ( 816 / 10976) | 9.1% ( 238 / 2602) |
PORCH | 4.5% ( 493 / 10976) | 3.3% ( 85 / 2602) |
STBC | 2.1% ( 234 / 10976) | 6.9% ( 180 / 2602) |
VINE | 2.6% ( 284 / 10976) | 23.5% ( 611 / 2602) |
WWBS | 0.6% ( 68 / 10976) | 1.3% ( 34 / 2602) |
YFCX | 10.4% ( 1139 / 10976) | 13.2% ( 344 / 2602) |
YOS | 0.3% ( 37 / 10976) | 0.3% ( 9 / 2602) |
This work containerizes many Python packages into one Docker image named "datalab".
You will need to work with the following essential components:
- Python3 (already in datalab)
- Jupyter (already in datalab)
- Docker (install it on your host; see Usage-1)
- LaTeX (install it on your host; see Usage-6)
If you are unfamiliar with these basics, please go to the "Immediately available sermon books" section.
- automation-ready: new sermons can be discovered from designated YouTube channels
- compilation with sorting by preacher, book, etc.
- open to adding more channel sources
- powered by Docker, Jupyter, and Spark
For sermon-app, the author currently focuses on compiling Cantonese sermons so that these valuable resources can be re-archived, redistributed, re-presented, and serve as a reference for future opportunities.
The author uses this project to
- retrieve the audio files of Cantonese Christian sermons from accessible YouTube channels;
- transcribe each audio file into a raw text file using the Azure speech recognition engine for Cantonese;
- generate a PDF compilation from the transcribed sermon text, sorted by preacher, Bible book and chapter, sermon title, and time.
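The sorting in the last step can be sketched as below. This is a minimal illustration, not the project's actual code: the `Sermon` record, its field names, and the sample entries are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sermon:
    # Hypothetical record layout, for illustration only.
    preacher: str
    book_index: int  # position of the Bible book in canonical order (Genesis = 1)
    chapter: int
    title: str
    preached_on: date


def sort_sermons(sermons):
    """Order by preacher, then Bible book and chapter, then title, then date."""
    return sorted(
        sermons,
        key=lambda s: (s.preacher, s.book_index, s.chapter, s.title, s.preached_on),
    )


sermons = [
    Sermon("Preacher B", 19, 127, "Unless the Lord builds", date(2021, 5, 2)),
    Sermon("Preacher A", 43, 3, "Born again", date(2020, 1, 5)),
    Sermon("Preacher A", 1, 1, "In the beginning", date(2022, 3, 6)),
]

for s in sort_sermons(sermons):
    print(s.preacher, s.book_index, s.chapter, s.title, s.preached_on)
```

Using a numeric book index rather than the book name keeps the books in canonical order instead of alphabetical order.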
As it is written in Psalm 127 (NIV):
Unless the Lord builds the house, the builders labor in vain. Unless the Lord watches over the city, the guards stand watch in vain.
This work you see here is truly a blessing from G-d.
Refer to the Docker installation guide.
docker pull michaelchanwahyan/datalab
You will probably need a Docker volume to mount your app directory into the container:
docker run -p 9999:9999 \
-v /your/path/to/app:/app \
--name=ds_workspace \
michaelchanwahyan/datalab:latest \
/usr/bin/bash /startup.sh
Upon successful execution, opening localhost:9999 in your browser should bring you to the JupyterLab interface.
(If prompted for a password, try 'dsteam': the datalab startup script may already specify it in the startup option "--ServerApp.token='dsteam'".)
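If you prefer to script the container launch, the same `docker run` invocation can be assembled in Python. This is only a sketch: the function name and its parameters are hypothetical, and it simply mirrors the command shown above.

```python
import shlex
from pathlib import Path


def datalab_run_command(app_dir, port=9999, name="ds_workspace"):
    """Build the `docker run` argument list for the datalab image."""
    # Docker bind mounts need an absolute host path.
    host_dir = Path(app_dir).expanduser().resolve()
    return [
        "docker", "run",
        "-p", f"{port}:9999",       # host port -> JupyterLab port in the container
        "-v", f"{host_dir}:/app",   # mount the local app directory
        f"--name={name}",
        "michaelchanwahyan/datalab:latest",
        "/usr/bin/bash", "/startup.sh",
    ]


cmd = datalab_run_command("/your/path/to/app")
print(shlex.join(cmd))
# To actually start the container: subprocess.run(cmd, check=True)
```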
The code files for the JohnsonNg Youtube Channel sermon content are under /app/projects/JNG/. In that directory, run the notebook generate_index.ipynb from the launched JupyterLab.
Please note that the scripts may require human interaction, so do not run generate_index.ipynb blindly. Pay attention to the inline comments in the source file.
The inline descriptions in generate_index.ipynb explain how yt-dlp is used to extract raw audio from YouTube.
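For readers who want a feel for the audio extraction before opening the notebook, here is a minimal sketch using yt-dlp's Python API. It is not the notebook's code: the function names, the output file template, and the WAV codec choice are assumptions, and the audio conversion requires ffmpeg on the PATH.

```python
def audio_download_options(output_dir, codec="wav"):
    """Options for yt-dlp's Python API that keep only the audio track."""
    return {
        "format": "bestaudio/best",
        # Name each file after its video id (assumed naming scheme).
        "outtmpl": f"{output_dir}/%(id)s.%(ext)s",
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": codec},
        ],
    }


def download_audio(url, output_dir):
    """Download one video's audio into output_dir."""
    # Imported lazily so the options helper works without yt-dlp installed.
    from yt_dlp import YoutubeDL

    with YoutubeDL(audio_download_options(output_dir)) as ydl:
        ydl.download([url])
```

Call `download_audio(video_url, "audio")` with a video URL of your choice to try it.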
In /app/projects/JNG, the core step is to run the notebook generate_content.ipynb (or its Python counterpart).
The Azure Speech service is required; the Azure subscription info is omitted from this repo.
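Since the subscription info is not in the repo, one way to supply it is through environment variables. The sketch below assumes the variable names `AZURE_SPEECH_KEY` and `AZURE_SPEECH_REGION` (both hypothetical) and uses the `azure-cognitiveservices-speech` SDK with the `zh-HK` locale for Cantonese. It is not the notebook's code.

```python
import os


def load_azure_speech_settings(env=os.environ):
    """Read the subscription info from the environment instead of the repo."""
    key = env.get("AZURE_SPEECH_KEY")        # hypothetical variable name
    region = env.get("AZURE_SPEECH_REGION")  # hypothetical variable name
    if not key or not region:
        raise RuntimeError("Set AZURE_SPEECH_KEY and AZURE_SPEECH_REGION first")
    return key, region


def transcribe_first_utterance(wav_path, language="zh-HK"):
    """Recognize the first utterance of a WAV file (one short phrase only)."""
    # Imported lazily; requires `pip install azure-cognitiveservices-speech`.
    import azure.cognitiveservices.speech as speechsdk

    key, region = load_azure_speech_settings()
    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.speech_recognition_language = language
    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )
    return recognizer.recognize_once().text
```

`recognize_once()` stops after the first utterance, so a full sermon would need the SDK's continuous recognition (`start_continuous_recognition()`) instead.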
A pair of cv_runby_*.py files can be found in the same directory. They serve as concurrent Python scripts that run the speech-to-text (cv_runby_container.py) and the text concatenation (cv_runby_host.py) on the fly.
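The on-the-fly idea can be illustrated with a small sketch: one process writes transcript parts while another repeatedly joins whatever parts exist so far. The directory layout, file naming, and function name below are assumptions and do not describe how the cv_runby_*.py scripts actually work.

```python
from pathlib import Path


def concatenate_ready_parts(parts_dir, output_path, suffix=".txt"):
    """Join the transcript parts written so far, in file-name order."""
    output = Path(output_path).resolve()
    parts = sorted(
        p for p in Path(parts_dir).glob(f"*{suffix}")
        if p.resolve() != output  # never read the output file back in
    )
    text = "\n".join(p.read_text(encoding="utf-8").strip() for p in parts)
    output.write_text(text + "\n", encoding="utf-8")
    return len(parts)
```

The host side could call this in a loop with a short `time.sleep()` between rounds. Zero-padded part names (for example `part_0001.txt`) keep the file-name order equal to the playback order.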
As of 2023, with the OpenAI/whisper model available, speech-to-text has become more effective.
Thanks also to ggerganov/whisper.cpp, a C++ port with Apple Silicon integration, Whisper now runs very fast.
The Whisper model size currently used is medium. ggerganov's ggml-medium.bin model file, together with other sizes, can be found on ggerganov's Hugging Face page.
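A whisper.cpp run with the medium model can be driven from Python roughly as follows. This is a sketch under assumptions: the binary path (`./main` in older whisper.cpp builds, `whisper-cli` in newer ones), the model location, and the `zh` language code are not taken from this repo. Note that whisper.cpp expects 16 kHz WAV input.

```python
from pathlib import Path


def whisper_cpp_command(wav_path, model_path="models/ggml-medium.bin",
                        binary="./main", language="zh"):
    """Build the argument list for a whisper.cpp transcription run."""
    wav = Path(wav_path)
    return [
        binary,
        "-m", str(model_path),            # ggml model file
        "-f", str(wav),                   # 16 kHz WAV input
        "-l", language,                   # spoken language code (assumed)
        "-otxt",                          # write a plain-text transcript
        "-of", str(wav.with_suffix("")),  # output path without extension
    ]


print(whisper_cpp_command("sermon.wav"))
# To actually transcribe: subprocess.run(whisper_cpp_command("sermon.wav"), check=True)
```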
In /app/projects/JNG, run the Python script generate_sermonbook.py to generate the LaTeX source file under the build/ folder.
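To give an idea of what generating LaTeX source from transcripts involves, here is a minimal sketch. It is not generate_sermonbook.py: the function names and the section layout are hypothetical. The key point is that transcript text must be escaped before it is placed in a LaTeX document.

```python
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#", "_": r"\_",
    "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def latex_escape(text):
    """Escape characters that LaTeX would otherwise treat as markup."""
    # Character-by-character replacement avoids escaping an escape twice.
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in text)


def sermon_to_latex(title, preacher, body):
    """Render one transcript as a LaTeX section (hypothetical layout)."""
    return "\n".join([
        rf"\section{{{latex_escape(title)}}}",
        rf"\textit{{{latex_escape(preacher)}}}",
        "",
        latex_escape(body),
        "",
    ])


print(sermon_to_latex("Faith & Works", "Preacher A", "100% of the text_body"))
```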
(For LaTeX installation, see the LaTeX project page.)
In /app/build/, run the build script build.sh with the input argument described below:
./build.sh JNG # this is to compile JohnsonNg Youtube Channel sermon content
./build.sh HKBC # this is to compile Hong Kong Bible Conference sermon content
The core LaTeX package required is XeLaTeX.
Contact: Michael via michaelchan_wahyan@yahoo.com.hk