sermon-app

A Collection of Christian Sermons From Different Sources

sermon-app is a collection of Christian sermons (typically Cantonese) in text or audio format. Texts from audio sources are generated from the raw recordings using a speech-to-text engine such as Azure or Whisper (Whisper performs considerably better than Azure).

Immediately available sermon books

The books compiled from the LaTeX source generated by this repo are listed below.

sermon book series	file path in this repo	link
宣道傳意講道講章 Alliance Communications Ministry	./pdf/sermon_ACSMHK.pdf	link
漢語聖經協會講道講章 Chinese Bible International	./pdf/sermon_CBI.pdf	link
中國神學研究院道講章 China Graduate School of Theology	./pdf/sermon_CGST.pdf	link
崇基神學院崇拜講章 Div. Schl of Chung Chi College	./pdf/sermon_DSCCC_2009-present.pdf	link
流堂崇拜講章 Flow Church	./pdf/sermon_FLWC_2021-present.pdf	link
宣道會錦繡堂崇拜講章 Christian Missionary Alliance Fairview Church	./pdf/sermon_FVC_2017-present.pdf	link
港九培靈研經會講章 Hong Kong Bible Conference	./pdf/sermon_HKBC_1928-2007.pdf	link
	./pdf/sermon_HKBC_2008-present.pdf	link
JohnsonNg Youtube Channel	./pdf/sermon_JNG_2012-18.pdf	link
	./pdf/sermon_JNG_2019-20.pdf	link
	./pdf/sermon_JNG_2021-22.pdf	link
	./pdf/sermon_JNG_2023-24.pdf	link
播道會港福堂崇拜講章 EFCC Kong Fok Church	./pdf/sermon_KFC_2020-present.pdf	link
The Porch, Dallas, TX 75251	./pdf/sermon_PORCH_2014-present.pdf	link
沙田浸信會 Shatin Baptist Church	./pdf/sermon_STBC_2020-present.pdf	link
葡萄藤教會 The Vine Church	./pdf/sermon_VINE_2020-present.pdf	link
環球聖經公會講道講章 Worldwide Bible Society	./pdf/sermon_WWBS.pdf	link
播道會恩福堂崇拜講章 Yan Fook Church & Youth	./pdf/sermon_YFCX_2020-2023.pdf	link
	./pdf/sermon_YFCX_2024-2027.pdf	link
中華宣道會友愛堂信培部 Yau Oi School	./pdf/sermon_YOS.pdf	link

Statistics Overview on this project

sermon source	transcript total count	recent development activity
ACSMHK	8.6% ( 948 / 10976)	21.7% ( 564 / 2602)
CBI	0.2% ( 27 / 10976)	1.6% ( 42 / 2602)
CGST	2.0% ( 218 / 10976)	0.9% ( 24 / 2602)
DSCCC	6.5% ( 713 / 10976)	1.1% ( 30 / 2602)
FLWC	1.9% ( 212 / 10976)	3.3% ( 85 / 2602)
FVC	11.2% ( 1231 / 10976)	4.2% ( 110 / 2602)
HKBC	14.2% ( 1559 / 10976)	0.1% ( 3 / 2602)
JNG	27.3% ( 2997 / 10976)	9.3% ( 243 / 2602)
KFC	7.4% ( 816 / 10976)	9.1% ( 238 / 2602)
PORCH	4.5% ( 493 / 10976)	3.3% ( 85 / 2602)
STBC	2.1% ( 234 / 10976)	6.9% ( 180 / 2602)
VINE	2.6% ( 284 / 10976)	23.5% ( 611 / 2602)
WWBS	0.6% ( 68 / 10976)	1.3% ( 34 / 2602)
YFCX	10.4% ( 1139 / 10976)	13.2% ( 344 / 2602)
YOS	0.3% ( 37 / 10976)	0.3% ( 9 / 2602)
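
Each percentage in the table above appears to be a source's count divided by the column total. A minimal sketch of that arithmetic, assuming plain rounding to one decimal place (the function name `share` is hypothetical and is not part of this repo):

```python
def share(count: int, total: int) -> str:
    """Format a count as a share of the total, e.g. '8.6% (948 / 10976)'."""
    if total <= 0:
        raise ValueError("total must be positive")
    return f"{100 * count / total:.1f}% ({count} / {total})"


if __name__ == "__main__":
    # Figures taken from the ACSMHK and JNG rows of the table above.
    print(share(948, 10976))
    print(share(2997, 10976))
```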

Steps to compile the books from scratch (painful !)

Pre-requisites

This work containerizes many Python packages into one Docker image named "datalab".

You will need to work with the following essential tools:

Python3 (already in datalab)
Jupyter (already in datalab)
Docker (install it on your host, see Usage-1)
LaTeX (install it on your host, see Usage-6)

If you are unfamiliar with these basics, please go to the "Immediately available sermon books" section.

Features

automation-ready: new sermons can be discovered from designated YouTube channels
compilation with sorting by preacher, book, etc.
open to additional channel sources
powered by Docker, Jupyter, and Spark

For sermon-app, the author currently focuses on compiling Cantonese sermons so that these valuable resources can be re-archived, redistributed, re-presented, and serve as a reference for future opportunities.

The author uses this project to

grab Cantonese Christian sermons from accessible YouTube channels and retrieve their audio files;
transcribe each audio file into raw Cantonese text using the Azure speech recognition engine;
generate from the transcribed sermon text a PDF compilation sorted by preacher, Bible book and chapter, sermon title, and time.

As it is written in Psalm 127:1 (NIV):

Unless the Lord builds the house, the builders labor in vain. Unless the Lord watches over the city, the guards stand watch in vain.

This work you see here is truly a blessing from G-d.

Usage

1. Get the [datalab] engine (by host)

1.1 Install Docker

Refer to the Docker installation guide.

1.2 Get datalab container image

docker pull michaelchanwahyan/datalab

2. Start JupyterLab through the [datalab] container (by host)

You will probably need a Docker volume (the -v option) so that the container can access your copy of the app:

docker run -p 9999:9999 \
           -v /your/path/to/app:/app \
           --name=ds_workspace \
           michaelchanwahyan/datalab:latest \
           /usr/bin/bash /startup.sh

Upon successful execution, opening localhost:9999 in your browser should bring you to the JupyterLab interface.

(If prompted for a password, try 'dsteam': the startup script of the datalab platform possibly already sets it through the startup option "--ServerApp.token='dsteam'".)

3. Generate the sermon book table of contents (toc) (by container)

a) the index file (a toc-like csv file)

The code files for the JohnsonNg Youtube Channel's sermon content are under /app/projects/JNG/. In that directory, run the notebook generate_index.ipynb from the launched JupyterLab.

Please note that the scripts may require human interaction, so do not run generate_index.ipynb blindly. Pay attention to the inline comments in the source file.

b) download the sermon audio according to index file

The inline descriptions in generate_index.ipynb explain how to use yt-dlp to extract raw audio from YouTube.
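
As an illustration of the download step, the sketch below builds one yt-dlp command per row of an index CSV. It is not the notebook's actual code: the column name `url`, the output naming pattern, and both function names are assumptions. The `--extract-audio`, `--audio-format`, and `--output` options are standard yt-dlp options, and audio conversion requires ffmpeg to be installed.

```python
import csv


def build_ytdlp_command(url: str, out_dir: str) -> list[str]:
    """Build a yt-dlp command that saves the audio track of one video as WAV."""
    return [
        "yt-dlp",
        "--extract-audio",                          # keep only the audio stream
        "--audio-format", "wav",                    # converted by ffmpeg
        "--output", f"{out_dir}/%(id)s.%(ext)s",    # name files by video id
        url,
    ]


def commands_from_index(index_csv: str, out_dir: str,
                        url_column: str = "url") -> list[list[str]]:
    """Read the index CSV and build one command per row.

    The column name `url` is an assumption; adjust it to the real index file.
    """
    with open(index_csv, newline="", encoding="utf-8") as f:
        return [build_ytdlp_command(row[url_column], out_dir)
                for row in csv.DictReader(f)]
```

Each command can then be executed with `subprocess.run(cmd, check=True)`. Building the commands separately from running them makes it easy to inspect what will be downloaded first.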

4. Convert from audio to text (speech-to-text part, by container)

~~in /app/projects/JNG, the core script is to run the notebook file generate_content.ipynb (or the python counterpart)~~

~~azure speech service is required and the azure subscription info is omitted in this repo~~

A pair of cv_runby_*.py files can be found in the same directory. They serve as concurrent Python scripts that run the speech-to-text step (cv_runby_container.py) and the text concatenation step (cv_runby_host.py) on the fly.
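
The logic of cv_runby_host.py is not reproduced here. As a rough sketch of on-the-fly concatenation, the function below assumes the speech-to-text side writes one .txt file per audio segment into a directory; the segment naming, the directory layout, and the function name are all hypothetical.

```python
from pathlib import Path


def concatenate_segments(segment_dir: str, output_path: str, done: set[str]) -> int:
    """Append transcripts of not-yet-seen segments to output_path, in name order.

    `done` holds the file names already appended, so the function can be
    called repeatedly while the speech-to-text side keeps producing files.
    Keep output_path outside segment_dir so it is not picked up as a segment.
    Returns the number of segments appended in this call.
    """
    new_files = sorted(
        p for p in Path(segment_dir).glob("*.txt") if p.name not in done
    )
    with open(output_path, "a", encoding="utf-8") as out:
        for p in new_files:
            out.write(p.read_text(encoding="utf-8").strip() + "\n")
            done.add(p.name)
    return len(new_files)
```

A caller could invoke this in a loop with a short sleep between calls. Note that a segment arriving late with an earlier name would be appended out of order, so this sketch only suits producers that finish segments in sequence.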

Since 2023, with the OpenAI/whisper model available, speech-to-text has become more effective.

Thanks also to ggerganov/whisper.cpp, a C++ port with Apple Silicon support, Whisper now runs very fast.

The Whisper model size currently used is medium. ggerganov's ggml-medium.bin model file, along with other sizes, can be found on ggerganov's Hugging Face page.
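
For reference, a whisper.cpp transcription call can be sketched as a command built in Python. The binary path, the model path, and the language code `zh` are assumptions rather than this project's actual settings; the binary is named `main` in older whisper.cpp builds and `whisper-cli` in newer ones. whisper.cpp expects 16 kHz WAV input, and `-otxt` asks it to write a plain-text transcript.

```python
def build_whisper_command(
    wav_path: str,
    model_path: str = "models/ggml-medium.bin",  # assumed location of the model
    binary: str = "./main",                      # assumed whisper.cpp binary path
    language: str = "zh",                        # assumed spoken-language code
) -> list[str]:
    """Build a whisper.cpp command that transcribes one WAV file to text."""
    return [
        binary,
        "-m", model_path,   # ggml model file
        "-f", wav_path,     # input audio (16 kHz WAV)
        "-l", language,     # spoken language
        "-otxt",            # write a plain-text transcript
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the whisper.cpp directory.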

5. Compile the sermon texts into a single book source (by host/container)

In /app/projects/JNG, run the Python script generate_sermonbook.py to generate the LaTeX source file under the build/ folder.
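
The internals of generate_sermonbook.py are not shown in this README. The sketch below only illustrates the general idea of sorting sermons and emitting LaTeX; the `Sermon` fields, the chapter/section layout, and all function names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Sermon:
    preacher: str
    book: str      # Bible book, e.g. "Psalms"
    chapter: int
    title: str
    date: str      # ISO date string, e.g. "2021-03-07"
    text: str


# Characters that have a special meaning in LaTeX and must be escaped.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}", "&": r"\&", "%": r"\%", "$": r"\$",
    "#": r"\#", "_": r"\_", "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}


def escape_latex(s: str) -> str:
    """Escape LaTeX special characters one character at a time."""
    return "".join(_LATEX_SPECIALS.get(ch, ch) for ch in s)


def to_latex(sermons: list[Sermon]) -> str:
    """Sort by preacher, book, chapter, title, date and emit LaTeX body text."""
    ordered = sorted(
        sermons, key=lambda s: (s.preacher, s.book, s.chapter, s.title, s.date)
    )
    lines: list[str] = []
    current_preacher = None
    for s in ordered:
        if s.preacher != current_preacher:
            # Start a new chapter whenever the preacher changes.
            lines.append(rf"\chapter{{{escape_latex(s.preacher)}}}")
            current_preacher = s.preacher
        heading = f"{s.title} ({s.book} {s.chapter}, {s.date})"
        lines.append(rf"\section{{{escape_latex(heading)}}}")
        lines.append(escape_latex(s.text))
        lines.append("")
    return "\n".join(lines)
```

This sketch sorts Bible books alphabetically by name; ordering them canonically would need an explicit lookup table of book positions.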

6. Compile the sermon texts into a single book pdf (by host, where LaTeX is required)

(For LaTeX installation, see the LaTeX project's page.)

In /app/build/, run the build script build.sh with the input argument detailed below:

./build.sh JNG # this is to compile JohnsonNg Youtube Channel sermon content
./build.sh HKBC # this is to compile Hong Kong Bible Conference sermon content

The core LaTeX software package required is XeLaTeX.
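
The contents of build.sh are not reproduced here. As a hedged sketch of what a XeLaTeX compilation step can look like, the functions below run `xelatex` twice so that the table of contents is resolved; the function names and the idea of two passes are assumptions, not a description of build.sh.

```python
import subprocess


def build_xelatex_command(tex_file: str, out_dir: str = ".") -> list[str]:
    """Build a non-interactive XeLaTeX command for one source file."""
    return [
        "xelatex",
        "-interaction=nonstopmode",       # do not stop to prompt on errors
        "-halt-on-error",                 # abort at the first error instead
        f"-output-directory={out_dir}",   # where the PDF and aux files go
        tex_file,
    ]


def compile_pdf(tex_file: str, out_dir: str = ".", passes: int = 2) -> None:
    """Run XeLaTeX several times so cross-references and the toc settle."""
    for _ in range(passes):
        subprocess.run(build_xelatex_command(tex_file, out_dir), check=True)
```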

Editor

contact person : Michael via michaelchan_wahyan@yahoo.com.hk
