llamacpp:integrate ggml's excellent and amazing llama.cpp to kantv (#104)
zhouwg authored Mar 26, 2024
1 parent a24dec5 commit 559d2ce
Showing 228 changed files with 214,962 additions and 2,132 deletions.
42 changes: 29 additions & 13 deletions README.md
@@ -6,9 +6,9 @@ KanTV("Kan", aka Chinese PinYin "Kan" or Chinese HanZi "看" or English "watch/l

- Record online TV to automatically generate videos (useful for short-video creators producing source material, but please respect the IPR of the original content creator/provider); record online TV video/audio content to gather data that might be required by or useful for AI R&D activity

- ASR(Automatic Speech Recognition, a sub-filed of AI) research by the great <a href="https://github.com/ggerganov/whisper.cpp"> whisper.cpp </a>
- ASR(Automatic Speech Recognition, a subfield of AI) research by the great <a href="https://github.com/ggerganov/whisper.cpp"> whisper.cpp </a>

- LLM(Large Language Model, a sub-filed of AI) research by the great <a href="https://github.com/ggerganov/llama.cpp"> llama.cpp </a>
- LLM(Large Language Model, a subfield of AI) research by the great <a href="https://github.com/ggerganov/llama.cpp"> llama.cpp </a>


- Real-time English subtitles for English online TV (aka OTT TV) by the great & excellent & amazing <a href="https://github.com/ggerganov/whisper.cpp"> whisper.cpp </a> (<a href="https://github.com/zhouwg/kantv/issues/64">PoC finished on Xiaomi 14</a>. A Xiaomi 14 or another powerful Android phone is HIGHLY recommended for the real-time subtitle feature; otherwise unexpected behavior may occur)
@@ -27,7 +27,7 @@ Some goals of this project are:

- Well-maintained "workbench" for LLM(Large Language Model) researchers who are interested in practicing state-of-the-art AI tech (like [llama.cpp](https://github.com/ggerganov/llama.cpp)) in real scenarios on mobile devices (Android)

- Android <b>turn-key project</b> for AI experts(whom mightbe not familiar with <b>regular Android software development</b>) focus on AI research activity, part of AI R&D activity(algorithm improvement, model training, model generation, algorithm validation, model validation, performance benchmark......) could be done by Android Studio IDE + a powerful Android phone very easily
- Android <b>turn-key project</b> for AI experts/researchers (who might not be familiar with <b>regular Android software development</b>) to focus on device-side AI R&D activity; part of that activity (algorithm improvement, model training, model generation, algorithm validation, model validation, performance benchmark...) can be done very easily with the Android Studio IDE plus a powerful Android phone


### How to build project
@@ -150,24 +150,38 @@ autocmd InsertEnter * match ForbiddenWhitespace /\t\|\s\+\%#\@<!$/
#### Fetch source codes

```
git clone https://github.com/zhouwg/kantv.git
cd kantv
git checkout master
cd kantv
```

#### Build native codes
#### Configure local development environment

modify <a href="https://github.com/zhouwg/kantv/blob/master/build/envsetup.sh#L85">build/envsetup.sh</a> accordingly before launch build
- download android-ndk-r26c to prebuilts/toolchain; skip this step if android-ndk-r26c already exists

pay attention <a href="https://github.com/zhouwg/kantv/blob/master/external/whispercpp/CMakeLists.txt#L54">here and modify it accordingly</a> if build-target is kantv-android and running Android device is NOT Xiaomi 14
```
(TIP: a VERY powerful Linux PC / Linux workstation is HIGHLY recommended for this step)
./build/prebuild-download.sh
```

- modify <a href="https://github.com/zhouwg/kantv/blob/master/build/envsetup.sh#L85">build/envsetup.sh</a> accordingly before launching the build

- modify <a href="https://github.com/zhouwg/kantv/blob/master/external/whispercpp/CMakeLists.txt#L54">whispercpp/CMakeLists.txt</a> accordingly if the build target is kantv-android and the target Android device is NOT a Xiaomi 14


#### Build native codes

```
. build/envsetup.sh
(download android-ndk-r26c to prebuilts/toolchain, skip this step if android-ndk-r26c is already exist)
./build/prebuild-download.sh
```
![Screenshot from 2024-03-21 21-41-41](https://github.com/zhouwg/kantv/assets/6889919/3e13946f-596b-44be-9716-5793ce0c7263)
@@ -184,7 +198,7 @@ pay attention <a href="https://github.com/zhouwg/kantv/blob/master/external/whis

### Run Android APK on real Android phone

The UI Layer of Project KanTV(this Android APK) is designed for R&D activity. and follows the principles of '**minimum permissions**' and '**do not collect unnecessary user data**' or EU's GDPR principle. When installing/using for the first time on an Android phone, only the following two permissions are required:
The UI layer of Project KanTV (this Android APK) follows the principles of '**minimum permissions**' and '**do not collect unnecessary user data**', in the spirit of the EU's GDPR. When installed or used for the first time on an Android phone, only the following two permissions are required:

- Access to storage is required to generate necessary temporary files
- Access to device information is required to obtain current phone network status information, distinguishing whether the current network is Wi-Fi or mobile when playing online TV
@@ -201,15 +215,17 @@ https://github.com/zhouwg/kantv/assets/6889919/2fabcb24-c00b-4289-a06e-05b98ecd2

----

![778994889](https://github.com/zhouwg/kantv/assets/6889919/ef554f25-a7a5-4bd3-8db8-368af6e45702)

<details>
<summary>some English screenshots</summary>
<summary>some other screenshots</summary>
<ol>

![784269893](https://github.com/zhouwg/kantv/assets/6889919/8fe74b2a-21bc-452c-a6bb-5fb7fb2a567a)
![205726588](https://github.com/zhouwg/kantv/assets/6889919/16411854-c67b-4975-9ca1-fabcfe95a62b)
![1904016769](https://github.com/zhouwg/kantv/assets/6889919/a6b14cb1-8e3c-436d-89f1-b0c7adeaf00a)
![880686930](https://github.com/zhouwg/kantv/assets/6889919/fb2add6c-94d1-42c5-83f7-a0d3b0ec9f9b)
![2147012199](https://github.com/zhouwg/kantv/assets/6889919/2a2590f9-8343-4886-9ace-74a4880d9bed)
![778994889](https://github.com/zhouwg/kantv/assets/6889919/ef554f25-a7a5-4bd3-8db8-368af6e45702)
![1778831978](https://github.com/zhouwg/kantv/assets/6889919/92774cbc-c716-4819-a0c1-6bc0ae495d1d)



1 change: 1 addition & 0 deletions build/envsetup.sh
@@ -84,6 +84,7 @@ export ANDROID_NDK=${KANTV_TOOLCHAIN_PATH}/android-ndk-r26c

#modify following lines to adapt to local dev envs
export UPSTREAM_WHISPERCPP_PATH=~/whisper.cpp
export UPSTREAM_LLAMACPP_PATH=~/llama.cpp


. ${PROJECT_ROOT_PATH}/build/public.sh || (echo "can't find public.sh"; exit 1)
@@ -1,37 +1,46 @@
//TODO: re-write entire whispercpp.java with standard Android JNI specification
// interaction between KANTVMgr.java and whispercpp.java
// TODO: 03-05-2024, re-write entire whispercpp.java with standard Android JNI specification
// TODO: 03-26-2024, rename this file to ggmljni to unify the JNI of whisper.cpp and llama.cpp, as these projects are all based on ggml

package org.ggml.whispercpp;

public class whispercpp {
private static final String TAG = whispercpp.class.getName();
private static final String TAG = whispercpp.class.getName();

public static final int WHISPER_ASR_MODE_NORMAL = 0;
public static final int WHISPER_ASR_MODE_PRESURETEST = 1;
public static final int WHISPER_ASR_MODE_BECHMARK = 2;
public static final int WHISPER_ASR_MODE_NORMAL = 0;
public static final int WHISPER_ASR_MODE_PRESURETEST = 1;
public static final int WHISPER_ASR_MODE_BECHMARK = 2;

public static native int asr_init(String strModelPath, int nThreadCounts, int nASRMode);
public static native int asr_init(String strModelPath, int nThreadCounts, int nASRMode);

public static native void asr_finalize();
public static native void asr_finalize();

public static native void asr_start();
public static native void asr_stop();
public static native int asr_reset(String strModelPath, int nThreadCounts, int nASRMode);
public static native void asr_start();

public static native String get_systeminfo();
public static native void asr_stop();

public static native int get_cpu_core_counts();
public static native int asr_reset(String strModelPath, int nThreadCounts, int nASRMode);

//TODO: not work as expected, just skip this during PoC stage
public static native void set_benchmark_status(int bExitBenchmark);
public static native String asr_get_systeminfo();

/**
*
* @param modelPath /sdcard/kantv/ggml-xxxxx.bin
* @param audioPath /sdcard/kantv/jfk.wav
* @param nBenchType 0: asr 1: memcpy 2: mulmat 3: full/whisper_encode
* @param nThreadCounts 1 - 8
* @return
*/
public static native String bench(String modelPath, String audioPath, int nBenchType, int nThreadCounts);
public static native int get_cpu_core_counts();

// TODO: does not work as expected
public static native void asr_set_benchmark_status(int bExitBenchmark);

/**
* @param modelPath /sdcard/kantv/ggml-xxxxx.bin
* @param audioPath /sdcard/kantv/jfk.wav
* @param nBenchType 0: asr(transcription) 1: memcpy 2: mulmat 3: full/whisper_encode
* @param nThreadCounts 1 - 8
* @return
*/
public static native String asr_bench(String modelPath, String audioPath, int nBenchType, int nThreadCounts);


public static native String llm_get_systeminfo();


public static native String llm_bench(String modelPath, String prompt, int nBenchType, int nThreadCounts);

public static native String llm_inference(String modelPath, String prompt, int nBenchType, int nThreadCounts);
}
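
The new `llm_bench` and `llm_inference` entry points take the same kind of arguments as `asr_bench` (a model path, an input, a benchmark type, and a thread count). A caller may want to validate those arguments in Java before crossing the JNI boundary. The following is only a minimal sketch under stated assumptions: `NativeBench` is a hypothetical stand-in for the native method so that the example runs without the native library, `clampThreads` and `runBench` are illustrative helpers that do not exist in the project, and the upper bound of 8 threads is borrowed from the "1 - 8" note in the `asr_bench` Javadoc rather than from any documented limit of `llm_bench`.

```java
import java.util.Objects;

// Illustrative sketch only: NativeBench is a hypothetical stand-in for a JNI
// entry point with the same shape as whispercpp.llm_bench(...).
final class GgmlBenchSketch {

    @FunctionalInterface
    interface NativeBench {
        String run(String modelPath, String prompt, int nBenchType, int nThreadCounts);
    }

    // Assumed upper bound, taken from the "1 - 8" note in the asr_bench Javadoc.
    static final int MAX_THREADS = 8;

    /** Clamp the requested thread count into [1, min(MAX_THREADS, cpuCores)]. */
    static int clampThreads(int requested, int cpuCores) {
        int upper = Math.max(1, Math.min(MAX_THREADS, cpuCores));
        return Math.max(1, Math.min(requested, upper));
    }

    /** Validate arguments in Java, then delegate to the (stand-in) native call. */
    static String runBench(NativeBench bench, String modelPath, String prompt,
                           int benchType, int requestedThreads, int cpuCores) {
        Objects.requireNonNull(bench, "bench");
        if (modelPath == null || modelPath.isEmpty()) {
            throw new IllegalArgumentException("modelPath must not be empty");
        }
        if (prompt == null) {
            throw new IllegalArgumentException("prompt must not be null");
        }
        return bench.run(modelPath, prompt, benchType,
                clampThreads(requestedThreads, cpuCores));
    }

    public static void main(String[] args) {
        // A fake backend that echoes its arguments, so no native library is needed.
        NativeBench fake = (m, p, t, n) -> "model=" + m + " type=" + t + " threads=" + n;
        System.out.println(runBench(fake, "/sdcard/kantv/ggml-xxxxx.bin", "hello", 0, 16, 4));
    }
}
```

In the real app the lambda would presumably be replaced by a method reference such as `whispercpp::llm_bench`, with the core count coming from `whispercpp.get_cpu_core_counts()`; that wiring is an assumption and is not shown in this commit.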
@@ -173,7 +173,7 @@ public void initView() {
}

CDELog.j(TAG, "load ggml's whisper model");
String systemInfo = whispercpp.get_systeminfo();
String systemInfo = whispercpp.asr_get_systeminfo();
String phoneInfo = "Device info:" + "\n"
+ "Brand:" + Build.BRAND + "\n"
+ "Hardware:" + Build.HARDWARE + "\n"
@@ -335,12 +335,12 @@ public void run() {
strBenchmarkInfo = "";

initKANTVMgr();
whispercpp.set_benchmark_status(0);
whispercpp.asr_set_benchmark_status(0);


while (isBenchmarking.get()) {
beginTime = System.currentTimeMillis();
strBenchmarkInfo = whispercpp.bench(
strBenchmarkInfo = whispercpp.asr_bench(
CDEUtils.getDataPath() + ggmlModelFileName,
CDEUtils.getDataPath() + ggmlSampleFileName,
benchmarkIndex,
@@ -402,7 +402,7 @@ public void run() {
public void onCancel(DialogInterface dialogInterface) {
if (mProgressDialog != null) {
CDELog.j(TAG, "stop GGML benchmark");
whispercpp.set_benchmark_status(1);
whispercpp.asr_set_benchmark_status(1);
isBenchmarking.set(false);
mProgressDialog.dismiss();
mProgressDialog = null;
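
The hunks above pair a `while (isBenchmarking.get())` loop with a cancel callback that clears the flag. That cooperative-cancellation pattern can be sketched in isolation as follows. This is a simplified illustration, not the app's code: `BenchLoopSketch`, `benchStep`, and `maxRounds` are hypothetical names, and `benchStep` stands in for one blocking call such as `whispercpp.asr_bench(...)`.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntFunction;

// Illustrative sketch only: benchStep is a hypothetical stand-in for one
// blocking benchmark call such as whispercpp.asr_bench(...).
final class BenchLoopSketch {
    // Shared flag, playing the role of isBenchmarking in the hunks above.
    private final AtomicBoolean running = new AtomicBoolean(false);

    /** Run benchStep until cancel() is called or maxRounds is reached; return rounds completed. */
    int run(IntFunction<String> benchStep, int maxRounds) {
        running.set(true);
        int rounds = 0;
        while (running.get() && rounds < maxRounds) {
            long begin = System.currentTimeMillis();
            String info = benchStep.apply(rounds);
            long elapsed = System.currentTimeMillis() - begin;
            System.out.println("round " + rounds + ": " + info + " (" + elapsed + " ms)");
            rounds++;
        }
        running.set(false);
        return rounds;
    }

    /** Safe to call from another thread, for example from a dialog's cancel callback. */
    void cancel() {
        running.set(false);
    }

    public static void main(String[] args) {
        BenchLoopSketch loop = new BenchLoopSketch();
        // Cancel from inside the third round; the loop stops at its next flag check.
        int done = loop.run(i -> { if (i == 2) loop.cancel(); return "ok"; }, 10);
        System.out.println("rounds completed: " + done);
    }
}
```

A flag like this is only checked between rounds, so a round that is already running inside native code is not interrupted. That is presumably why the Java side also calls `asr_set_benchmark_status(1)` on cancel, although the TODO in the JNI class notes that this call does not yet work as expected.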

2 comments on commit 559d2ce

@zhouwg (Owner, Author) commented on 559d2ce Mar 26, 2024

I still don't know (have no idea) how llama.cpp should actually be used in project kantv, in the way that whisper.cpp powers real-time subtitles for online TV.

A GPT-style chat tool?

Sometimes the LLM's answers are wildly incorrect.

@zhouwg (Owner, Author) commented on 559d2ce Mar 26, 2024

Another question:

How does LLaMA handle pictures/video? Is LLaMA a multimodal large model?
