KanTV ("Kan" is the Chinese PinYin for the HanZi "看", meaning "watch/listen" in English) is an open source project focused on studying and practising state-of-the-art AI technology in real scenarios on Android phones/devices, such as online-TV playback, online-TV transcription (real-time subtitles), online-TV language translation, and online-TV video & audio recording, all working at the same time. It is derived from the original project, with many enhancements and new features:
- Watch online TV and local media with my customized FFmpeg 6.1; the source code of the customized FFmpeg 6.1 can be found in external/ffmpeg, in accordance with FFmpeg's license
- Record online TV to automatically generate videos (useful for short-video creators who need source material, but please respect the IPR of the original content creator/provider); record online TV's video/audio content to gather video/audio data that might be required or useful for AI R&D activity
- AI subtitle: real-time English subtitles for English online TV (aka OTT TV) by the great & excellent & amazing whisper.cpp (PoC finished on Xiaomi 14; a Xiaomi 14 or another powerful Android phone is HIGHLY recommended for the real-time subtitle feature, otherwise unexpected behavior may occur)
- 2D graphic performance
- Set up a customized playlist, then use this software to watch the content of that playlist for R&D activity
- UI refactor (closer to a real commercial Android application; only English is currently supported as the UI language)
- Well-maintained "workbench" for ASR (Automatic Speech Recognition) researchers/developers who are interested in practising state-of-the-art AI tech (such as whisper.cpp) in real scenarios on Android phones/devices
- Well-maintained "workbench" for LLM (Large Language Model) researchers/developers who are interested in practising state-of-the-art AI tech (such as llama.cpp) in real scenarios on Android phones/devices, or who want to run/experience LLM models (such as llama-2-7b, baichuan2-7b, qwen1_5-1_8b, gemma-2b) on an Android phone/device using the magic llama.cpp
- Well-maintained "workbench" for GGML beginners to study and practise the GGML inference framework on Android phones/devices
- Well-maintained "workbench" for NCNN beginners to study and practise the NCNN inference framework on Android phones/devices
- Android turn-key project for AI researchers (who might not be familiar with regular Android software development), developers, and beginners focused on edge/device-side AI learning and R&D activity; some AI R&D activities (AI algorithm validation, AI model validation, performance benchmarks in the ASR, LLM, TTS, NLP, CV, ... fields) can be done very easily with the Android Studio IDE plus a powerful Android phone
(depend on #121 and https://github.com/zhouwg/kantv/issues/176 )
git clone https://github.com/zhouwg/kantv.git
cd kantv
git checkout master
cd kantv
- Build docker image
docker build build -t kantv --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) --build-arg USER_NAME=$(whoami)
- Run docker container
# map source code directory into docker container
docker run -it --name=kantv --volume=`pwd`:/home/`whoami`/kantv kantv

# in docker container
. build/envsetup.sh
./build/prebuild-download.sh
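Because `docker run --name=kantv` fails once a container with that name already exists, re-entering the build environment on later days needs `docker start` instead. The following is a minimal sketch of that choice; the helper name `kantv_container_cmd` is hypothetical and not part of the project's scripts.

```shell
# Sketch: choose between re-entering an existing "kantv" container and creating one.
# $1 is the newline-separated list of existing container names, e.g. the output of:
#   docker ps -a --format '{{.Names}}'
kantv_container_cmd() {
    if printf '%s\n' "$1" | grep -qx 'kantv'; then
        # The container already exists: start it and attach interactively.
        echo 'docker start -ai kantv'
    else
        # First run: same command as in the step above.
        echo 'docker run -it --name=kantv --volume=`pwd`:/home/`whoami`/kantv kantv'
    fi
}
```

A possible way to use it from the project root: `eval "$(kantv_container_cmd "$(docker ps -a --format '{{.Names}}')")"`.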
- Prerequisites
- tools & utilities
- Android Studio: download and install Android Studio manually
- vim settings
Host OS information:
uname -a
Linux 5.8.0-43-generic #49~20.04.1-Ubuntu SMP Fri Feb 5 09:57:56 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
cat /etc/issue
Ubuntu 20.04.2 LTS \n \l
sudo apt-get update
sudo apt-get install build-essential -y
sudo apt-get install cmake -y
sudo apt-get install curl -y
sudo apt-get install wget -y
sudo apt-get install python -y
sudo apt-get install tcl expect -y
sudo apt-get install nginx -y
sudo apt-get install git -y
sudo apt-get install vim -y
sudo apt-get install spawn-fcgi -y
sudo apt-get install u-boot-tools -y
sudo apt-get install ffmpeg -y
sudo apt-get install openssh-client -y
sudo apt-get install nasm -y
sudo apt-get install yasm -y
sudo apt-get install openjdk-17-jdk -y
sudo dpkg --add-architecture i386
sudo apt-get install lib32z1 -y
sudo apt-get install -y android-tools-adb android-tools-fastboot autoconf \
        automake bc bison build-essential ccache cscope curl device-tree-compiler \
        expect flex ftp-upload gdisk acpica-tools libattr1-dev libcap-dev \
        libfdt-dev libftdi-dev libglib2.0-dev libhidapi-dev libncurses5-dev \
        libpixman-1-dev libssl-dev libtool make \
        mtools netcat python-crypto python3-crypto python-pyelftools \
        python3-pycryptodome python3-pyelftools python3-serial \
        rsync unzip uuid-dev xdg-utils xterm xz-utils zlib1g-dev
sudo apt-get install python3-pip -y
sudo apt-get install indent -y
pip3 install meson ninja
echo "export PATH=/home/`whoami`/.local/bin:\$PATH" >> ~/.bashrc
Alternatively, run the script below after fetching the project's source code:
./build/prebuild.sh
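After installing the tools, it can help to confirm that the key commands are actually reachable on PATH. The helper below is a small sketch for that purpose; `missing_tools` is a hypothetical name, and the command names in the example are assumed to match the package names listed above.

```shell
# Print every command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example (prints nothing when everything is installed):
#   missing_tools cmake curl wget git nasm yasm adb meson ninja
```

An empty output means all listed commands were found; any printed name points at a package that still needs to be installed.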
vim settings, borrowed from http://ffmpeg.org/developer.html#Editor-configuration :
set ai
set nu
set expandtab
set tabstop=4
set shiftwidth=4
set softtabstop=4
set noundofile
set nobackup
set fileformat=unix
set undodir=~/.undodir
set cindent
set cinoptions=(0
" Allow tabs in Makefiles.
autocmd FileType make,automake set noexpandtab shiftwidth=8 softtabstop=8
" Trailing whitespace and tabs are forbidden, so highlight them.
highlight ForbiddenWhitespace ctermbg=red guibg=red
match ForbiddenWhitespace /\s\+$\|\t/
" Do not highlight spaces at the end of line while typing on that line.
autocmd InsertEnter * match ForbiddenWhitespace /\t\|\s\+\%#\@<!$/
- Download android-ndk-r26c to prebuilts/toolchain; skip this step if android-ndk-r26c already exists
. build/envsetup.sh
./build/prebuild-download.sh
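The "skip if it already exists" condition can be checked from the shell before downloading. This is only a sketch: the helper name `ndk_present` is hypothetical, and the exact directory `prebuilts/toolchain/android-ndk-r26c` is an assumption inferred from the step above rather than something the download script is known to guarantee.

```shell
# Return success if the NDK directory already exists under the given project root.
# Assumption: the NDK is unpacked to prebuilts/toolchain/android-ndk-r26c.
ndk_present() {
    [ -d "${1:-.}/prebuilts/toolchain/android-ndk-r26c" ]
}

# Example, run from the project root:
#   ndk_present . || { . build/envsetup.sh; ./build/prebuild-download.sh; }
```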
- Modify ggml/CMakeLists.txt and ncnn/CMakeLists.txt accordingly if the target Android device is a Xiaomi 14 or another Qualcomm Snapdragon 8 Gen 3 SoC based Android phone
- Modify ggml/CMakeLists.txt and ncnn/CMakeLists.txt accordingly if the target Android phone is NOT a Qualcomm SoC based Android phone
. build/envsetup.sh
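To decide which of the two cases above applies to a connected phone, the SoC properties exposed by Android 12 and later (`ro.soc.manufacturer`, `ro.soc.model`) can be inspected. The sketch below classifies those two values; the function name `classify_soc` is hypothetical, and it assumes that Qualcomm devices report "QTI" or "Qualcomm" as manufacturer and that Snapdragon 8 Gen 3 reports the model "SM8650".

```shell
# Classify a device from its SoC properties.
# $1: value of ro.soc.manufacturer, $2: value of ro.soc.model (Android 12+).
# Assumptions: Qualcomm devices report "QTI" or "Qualcomm", and
# Snapdragon 8 Gen 3 reports a model starting with "SM8650".
classify_soc() {
    # Normalize the manufacturer: drop a possible carriage return, lowercase it.
    vendor=$(printf '%s' "$1" | tr -d '\r' | tr '[:upper:]' '[:lower:]')
    case "$vendor" in
        qti|qualcomm*)
            case "$2" in
                SM8650*) echo 'snapdragon-8-gen-3' ;;
                *)       echo 'qualcomm-other' ;;
            esac ;;
        *)
            echo 'non-qualcomm' ;;
    esac
}
```

With a phone attached, a possible invocation is `classify_soc "$(adb shell getprop ro.soc.manufacturer)" "$(adb shell getprop ro.soc.model)"`. Which lines of ggml/CMakeLists.txt and ncnn/CMakeLists.txt need to change for each result is not covered by this sketch.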
- Option 1: Build APK from source code by Android Studio IDE
- Option 2: Build APK from source code by command line
. build/envsetup.sh
lunch 1
./build-all.sh android
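Where the command-line build places the resulting APK is not stated here, so one hedged way to locate it is to search the tree. The helper `find_apks` below is a hypothetical convenience, not part of the project's build scripts.

```shell
# List APK files under a directory (default: current directory), sorted by path.
# The output location of build-all.sh is not assumed; the whole tree is searched.
find_apks() {
    find "${1:-.}" -type f -name '*.apk' | sort
}

# Example: list candidates, then install the chosen one on a connected phone with
#   adb install -r <path-to-apk>
```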
This Android APK works well on any mainstream Android phone and the following four permissions are required:
- Access to storage is required to generate necessary temporary files
- Access to device information is required to obtain current phone network status information, distinguishing whether the current network is Wi-Fi or mobile when playing online TV
- Access to camera is needed for AI Agent
- Access to mic(audio recorder) is needed for AI Agent
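For repeated testing on a development phone, the four permissions can be granted from the host instead of tapping through dialogs. The sketch below only prints the `adb shell pm grant` commands. The mapping from the four items above to Android permission names is an assumption, the package name must be supplied by the caller, and `print_grant_commands` is a hypothetical helper.

```shell
# Print the adb commands that would grant four runtime permissions to a package.
# Assumption: the permission names below are a guess at how the four items
# above map to Android permissions; adjust them to the app's real manifest.
print_grant_commands() {
    pkg="$1"
    for perm in \
        android.permission.WRITE_EXTERNAL_STORAGE \
        android.permission.READ_PHONE_STATE \
        android.permission.CAMERA \
        android.permission.RECORD_AUDIO
    do
        echo "adb shell pm grant $pkg $perm"
    done
}
```

Printing rather than executing keeps the helper safe to run without a device; the output can be reviewed and then piped to `sh` once the package name and permission list are confirmed.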
Here is a short video demonstrating AI subtitles by running the great & excellent & amazing whisper.cpp on a Xiaomi 14 device - fully offline, on-device.
realtime-subtitle-by-whispercpp-demo-on-xiaomi14-finetune-20240324.mp4
Here is a screenshot demonstrating LLM inference by running the magic llama.cpp on a Xiaomi 14 device - fully offline, on-device.
Here is a screenshot demonstrating ASR inference by running the excellent whisper.cpp on a Xiaomi 14 device - fully offline, on-device.
Here are some screenshots demonstrating CV inference by running the excellent ncnn on a Xiaomi 14 device - fully offline, on-device.
- Android multimodal AI agent (ASR, LLM, TTS, CV, NLP, ...; an open source GPT-4o style multimodal AI agent on Android phone) by GGML + NCNN
- bugfix in UI layer (Java)
- bugfix in native layer (C/C++)
Be sure to review the open issues before contributing to project KanTV. We use GitHub issues for tracking requests and bugs; please see how to submit an issue in this project.
Issue reports from various Android phones, and pull requests to this project, are very welcome.
- How to set up a customized KanTV server in a local dev env
- How to create a customized playlist for the kantv apk
- How to integrate proprietary/open source code into project KanTV for personal/proprietary/commercial R&D activity
- How to use whisper.cpp and ffmpeg to add subtitles to a video
- Acknowledgement
- ChangeLog
- F.A.Q
- AI inference framework
  - GGML by Georgi Gerganov
  - NCNN by Ni Hui
- AI application engine
  - ASR engine whisper.cpp by Georgi Gerganov
  - LLM engine llama.cpp by Georgi Gerganov
  - TTS engine bark.cpp by PABannier
  - Text2Image engine stablediffusion.cpp by leejet
  - ASR engine sherpa-ncnn (an open-source ASR engine using next-generation Kaldi with ncnn) by k2-fsa
- Students: understand calculus, linear algebra, mathematical statistics and probability theory; have some experience in C/C++/Java software development; want to learn Android software development (UI, NDK, audio/video with FFmpeg, streaming media such as HLS, RTMP and WebRTC, on-device AI applications), for example to find a job paying RMB 20,000+ per month.
- Programmers: have solid experience in C/C++ software development, know little or nothing about real/hardcore AI technology, and want to learn AI technology in depth (know-how).
- Authors/maintainers of AI inference frameworks: compare the advantages of the two on-device inference frameworks ggml and ncnn on Android (why this project focuses only on ggml and ncnn).
- AI experts/algorithm engineers in a specific field (ASR, TTS, CV, NLP, LLM, ...): debug and validate AI algorithms/models on Android devices with the framework provided in this project (how to debug and validate AI algorithms/models on Android using this project).
Copyright (c) 2021 - 2023 Project KanTV
Copyright (c) 2024 - Authors of Project KanTV
Licensed under Apache v2.0 or later