Workbench for learning and practising AI technology in real scenarios on Android devices, powered by GGML (Georgi Gerganov Machine Learning), NCNN (Tencent NCNN), and FFmpeg

License

Apache-2.0 and 2 other licenses found: Apache-2.0 (LICENSE), MIT (LICENSE-llamacpp), Unknown (LICENSE-zh)

KanTV

KanTV ("Kan" is the Chinese PinYin for the HanZi "看", meaning "watch/listen" in English) is an open source project focused on studying and practising state-of-the-art AI technology in real applications and real scenarios on mobile devices, such as online TV playback, online TV transcription (real-time subtitles), online TV language translation, and online TV video and audio recording, all working at the same time. It is derived from the original ijkplayer, with many enhancements and new features:

  • Watch online TV and local media with a customized FFmpeg 6.1; in accordance with FFmpeg's license, the source code of the customized FFmpeg 6.1 can be found in external/ffmpeg

  • Record online TV to automatically generate videos (useful for short-video creators who need source material, but please respect the IPR of the original content creator/provider); record online TV video/audio content to gather video/audio data that might be required or useful for AI R&D activity

  • ASR (Automatic Speech Recognition, a subfield of AI) study with the great whisper.cpp

  • LLM (Large Language Model, a subfield of AI) study with the great llama.cpp

  • SD (Text to Image by Stable Diffusion, a subfield of AI) study with the amazing stablediffusion.cpp

  • Real-time English subtitles for English online TV (aka OTT TV) with the great, excellent, and amazing whisper.cpp (PoC finished on Xiaomi 14; a Xiaomi 14 or another powerful Android phone is HIGHLY recommended for the real-time subtitle feature, otherwise unexpected behavior may occur)

  • Run/experience LLMs (such as llama-2-7b, baichuan2-7b, qwen1_5-1_8b, gemma-2b) on Xiaomi 14 using the amazing llama.cpp

  • Set up a customized playlist and then use this software to watch its content for R&D activity

  • UI refactor (closer to a real commercial Android application; only English is currently supported as the UI language)

Some goals of this project are:

  • Well-maintained "workbench" for ASR (Automatic Speech Recognition) researchers who are interested in practising state-of-the-art AI tech (like whisper.cpp) in real scenarios on mobile devices (currently focused on Android)

  • Well-maintained "workbench" for LLM (Large Language Model) researchers who are interested in practising state-of-the-art AI tech (like llama.cpp) in real scenarios on mobile devices (currently focused on Android)

  • Android turn-key project for AI experts/researchers (who might not be familiar with regular Android software development) focused on device-side AI R&D activity; part of that activity (algorithm improvement, model training, model generation, algorithm validation, model validation, performance benchmarking, ...) can be done very easily with the Android Studio IDE and a powerful Android phone

Software architecture of KanTV Android

(this is a proposal and depends on #121)

kantv-android-arch

How to build project

Prerequisites
    Host OS information:
    
    uname -a
    
    Linux 5.8.0-43-generic #49~20.04.1-Ubuntu SMP Fri Feb 5 09:57:56 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
    
    cat /etc/issue
    
    Ubuntu 20.04.2 LTS \n \l
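
    The host information above describes the environment the build steps were written against. As a minimal sketch (the helper name `matches_reference_os` is hypothetical and not part of the project), a script could compare the local `/etc/issue` against that reference release before installing anything:

```shell
#!/bin/sh
# Hypothetical helper, not part of KanTV: succeeds when the given
# issue file starts with "Ubuntu 20.04", the release shown above.
matches_reference_os() {
    grep -q '^Ubuntu 20\.04' "$1" 2>/dev/null
}

if matches_reference_os /etc/issue; then
    echo "host matches the reference Ubuntu 20.04 environment"
else
    echo "host differs from the reference environment; package names may vary"
fi
```

    A mismatch does not necessarily mean the build will fail; it only signals that some of the package names listed in this section may need adjusting.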
    
    
    • tools & utilities
    sudo apt-get update
    sudo apt-get install build-essential -y
    sudo apt-get install cmake -y
    sudo apt-get install curl -y
    sudo apt-get install wget -y
    sudo apt-get install python -y
    sudo apt-get install tcl expect -y
    sudo apt-get install nginx -y
    sudo apt-get install git -y
    sudo apt-get install vim -y
    sudo apt-get install spawn-fcgi -y
    sudo apt-get install u-boot-tools -y
    sudo apt-get install ffmpeg -y
    sudo apt-get install openssh-client -y
    sudo apt-get install nasm -y
    sudo apt-get install yasm -y
    sudo apt-get install openjdk-17-jdk -y
    
    sudo dpkg --add-architecture i386
    sudo apt-get install lib32z1 -y
    
    sudo apt-get install -y android-tools-adb android-tools-fastboot autoconf \
            automake bc bison build-essential ccache cscope curl device-tree-compiler \
            expect flex ftp-upload gdisk acpica-tools libattr1-dev libcap-dev \
            libfdt-dev libftdi-dev libglib2.0-dev libhidapi-dev libncurses5-dev \
            libpixman-1-dev libssl-dev libtool make \
            mtools netcat python-crypto python3-crypto python-pyelftools \
            python3-pycryptodome python3-pyelftools python3-serial \
            rsync unzip uuid-dev xdg-utils xterm xz-utils zlib1g-dev
    
    sudo apt-get install python3-pip -y
    sudo apt-get install indent -y
    pip3 install meson ninja
    
    echo "export PATH=/home/`whoami`/.local/bin:\$PATH" >> ~/.bashrc
    
    

    Alternatively, run the script below after fetching the project's source code:

    
    ./build/prebuild.sh
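
    Whichever route is used, it can help to confirm afterwards that the tools actually ended up on PATH. The following is a minimal sketch, not something shipped with the project: the function name `missing_tools` is hypothetical, and the tool list is only an illustrative subset of the packages installed above.

```shell
#!/bin/sh
# Hypothetical helper, not part of KanTV: print every argument that
# cannot be resolved as a command on the current PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Illustrative subset of the tools installed in this section.
missing_tools cmake curl wget git nasm yasm meson ninja
```

    An empty output means every listed tool was found. If meson or ninja is reported missing after `pip3 install`, the `~/.local/bin` PATH entry added above may not be active in the current shell yet.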
    
    
    

    Vim settings, borrowed from http://ffmpeg.org/developer.html#Editor-configuration:

    set ai
    set nu
    set expandtab
    set tabstop=4
    set shiftwidth=4
    set softtabstop=4
    set noundofile
    set nobackup
    set fileformat=unix
    set undodir=~/.undodir
    set cindent
    set cinoptions=(0
    " Allow tabs in Makefiles.
    autocmd FileType make,automake set noexpandtab shiftwidth=8 softtabstop=8
    " Trailing whitespace and tabs are forbidden, so highlight them.
    highlight ForbiddenWhitespace ctermbg=red guibg=red
    match ForbiddenWhitespace /\s\+$\|\t/
    " Do not highlight spaces at the end of line while typing on that line.
    autocmd InsertEnter * match ForbiddenWhitespace /\t\|\s\+\%#\@<!$/
    
    

Fetch source code


git clone https://github.com/zhouwg/kantv.git

cd kantv

git checkout master

cd kantv

Configure local development environment

  • Download android-ndk-r26c to prebuilts/toolchain; skip this step if android-ndk-r26c already exists
. build/envsetup.sh

./build/prebuild-download.sh
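
The "skip this step if it already exists" rule can be sketched as a small check run from the project root. This is an illustration under assumptions: the helper name `ndk_present` is hypothetical, and the exact directory layout (`prebuilts/toolchain/android-ndk-r26c`) is assumed from the step above rather than confirmed by the project's scripts.

```shell
#!/bin/sh
# Hypothetical helper, not part of KanTV: succeeds when the NDK
# directory $2 exists inside the toolchain directory $1.
ndk_present() {
    [ -d "$1/$2" ]
}

# Assumed layout: prebuilts/toolchain/android-ndk-r26c under the project root.
if ndk_present prebuilts/toolchain android-ndk-r26c; then
    echo "android-ndk-r26c found, download can be skipped"
else
    echo "android-ndk-r26c not found, run ./build/prebuild-download.sh"
fi
```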

Build native code


. build/envsetup.sh



Build Android APK

  • Option 1: Build APK from source code by Android Studio IDE

  • Option 2: Build APK from source code by command line

      . build/envsetup.sh
      lunch 1
      ./build-all.sh android
    

    Note that some source code in ASRResearchFragment.java affects how the ASR demo runs and the size of the generated APK.

Run Android APK on real Android phone

This Android APK works well on any mainstream Qualcomm mobile SoC based Android phone.

The UI layer of Project KanTV (this Android APK) follows the principles of 'minimum permissions' and 'do not collect unnecessary user data', in line with the EU's GDPR principles. When installing or using it for the first time on an Android phone, only the following two permissions are required:

  • Access to storage is required to generate necessary temporary files
  • Access to device information is required to obtain current phone network status information, distinguishing whether the current network is Wi-Fi or mobile when playing online TV

Here is a short video demonstrating AI subtitles by running the great & excellent & amazing whisper.cpp on a Xiaomi 14 device, fully offline and on-device.
realtime-subtitle-by-whispercpp-demo-on-xiaomi14.mp4

Better performance and stability after fine-tuning (whisper.cpp sometimes produces meaningless repeated tokens) with the new method introduced in ggerganov/whisper.cpp#1951
realtime-subtitle-by-whispercpp-demo-on-xiaomi14-finetune-20240324.mp4

Hot topics

  • Add a Qualcomm mobile SoC native backend for GGML

  • Improve the quality of real-time English subtitles powered by the great, excellent, and amazing whisper.cpp

  • Real-time Chinese subtitles for online English TV with the great, excellent, and amazing whisper.cpp

  • Bug fixes in the UI layer (Java)

  • Bug fixes in the native layer (C/C++)

  • Participate in improving whisper.cpp on Android devices and feed results back to upstream

Contribution

Be sure to review the open issues before contributing to project KanTV. We use GitHub issues for tracking requests and bugs; please see how to submit issue in this project.

Reporting issues on various Android-based phones, or even submitting PRs to this project, is greatly welcomed.

English is preferred in this project (to avoid comments similar to those in torvalds/linux#818). Thanks for your cooperation and understanding.

Docs

Special Acknowledgement

The AI part of this project depends heavily on ggml, whisper.cpp, and llama.cpp by Georgi Gerganov.

License

Copyright (c) 2021 - 2023 Project KanTV

Copyright (c) 2024 - Authors of Project KanTV

Licensed under Apache v2.0 or later
