This repository has been archived by the owner on Aug 28, 2024. It is now read-only.

ASL recognition demo #169

Merged
merged 27 commits into from
Aug 3, 2021

Conversation

@jeffxtang (Contributor) commented Jul 12, 2021

American Sign Language Recognition on Android

Introduction

American Sign Language (ASL) is a natural language used by deaf communities in many countries around the world. Its fingerspelling alphabet has 26 signs, one for each letter of the English alphabet. This repo contains Python scripts that train a deep learning model to recognize the 26 ASL alphabet signs (plus 3 additional signs for delete, space, and nothing) and that convert and optimize the trained model to the Mobile Interpreter format, along with an Android app that uses the model to recognize the 26 signs.

Prerequisites

  • PyTorch 1.9.0 and torchvision 0.10.0 (Optional)
  • Python 3.8 or above (Optional)
  • PyTorch Android libraries pytorch_android_lite:1.9.0 and pytorch_android_torchvision:1.9.0
  • Android Studio 4.0.1 or later

Quick Start

To test run the ASL recognition Android app, follow the steps below:

1. Train and Prepare the Model

If you don't have PyTorch 1.9.0 and torchvision 0.10.0 installed, or don't want to install them, you can skip this step. The trained, scripted, and optimized model is already included in the repo, located at ASLRecognition/app/src/main/assets.

Otherwise, open a terminal window and confirm that torch 1.9.0 and torchvision 0.10.0 are installed with a command like pip list | grep torch, or install them with a command like pip install torch torchvision. Then run the following commands:

git clone https://github.com/pytorch/android-demo-app
cd android-demo-app/ASLRecognition/scripts

Download the ASL alphabet dataset here and unzip it into the ASLRecognition/scripts folder. Then run the scripts below (based on this tutorial) to pre-process the training images, train the model, and convert and optimize the trained model for the mobile interpreter:

python preprocess_image.py
python create_csv.py
python train.py --epochs 5 # on a machine without GPU this can take hours
python convert_lite.py

If all goes well, the model asl.ptl will be generated and you can copy it to ASLRecognition/app/src/main/assets.
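The conversion script's contents are not shown here, but as a hedged sketch of what the convert_lite.py step involves, the following uses the standard PyTorch 1.9 TorchScript and lite-interpreter APIs. TinyASLNet is a hypothetical stand-in for the trained classifier (26 letters plus delete, space, and nothing, i.e. 29 classes), not the repo's actual model:

```python
import torch
import torch.nn as nn
from torch.utils.mobile_optimizer import optimize_for_mobile

# Hypothetical stand-in for the trained ASL classifier: 29 output classes
# (26 letters + delete, space, nothing).
class TinyASLNet(nn.Module):
    def __init__(self, num_classes=29):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x))).flatten(1)
        return self.fc(x)

model = TinyASLNet().eval()
scripted = torch.jit.script(model)               # compile to TorchScript
optimized = optimize_for_mobile(scripted)        # apply mobile-specific passes
optimized._save_for_lite_interpreter("asl.ptl")  # save in lite-interpreter format
```

The resulting .ptl file is what LiteModuleLoader expects on the Android side; a model saved with plain torch.jit.save cannot be loaded by pytorch_android_lite.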

You can also run python test.py to see the result of a test image located at ../app/src/main/assets/C1.jpg:

Predicted output: C
0.043 seconds

For more information on how to use a test script like the one above to find the expected model input and output and use them in an Android app, see Step 2 of the tutorial Image Segmentation DeepLabV3 on Android.
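The shape of such a test script can be sketched as below. The label order here is an assumption (the actual order is determined by the dataset's class folders during training), so treat LABELS as illustrative:

```python
import torch

# Assumed label order: 26 letters followed by the 3 extra classes.
# The real order comes from the training dataset's class folders.
LABELS = [chr(ord("A") + i) for i in range(26)] + ["del", "nothing", "space"]

def predict(model, tensor):
    """Run one normalized image batch of shape (1, 3, H, W) through the
    model and map the highest-scoring class index to its label."""
    with torch.no_grad():
        scores = model(tensor)
    return LABELS[int(scores.argmax(dim=1))]
```

For the bundled test image C1.jpg, a correctly trained model's argmax would land on index 2, producing the "Predicted output: C" shown above.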

2. Use Android Studio

Open the ASLRecognition project using Android Studio. Note the app's build.gradle file has the following lines:

implementation 'org.pytorch:pytorch_android_lite:1.9.0'
implementation 'org.pytorch:pytorch_android_torchvision:1.9.0'

and in MainActivity.java, the code below loads the model:

mModule = LiteModuleLoader.load(MainActivity.assetFilePath(getApplicationContext(), "asl.ptl"));

3. Run the App

Select an Android emulator or device and build and run the app. Some of the 26 test images of the ASL alphabet and their recognition results are as follows:



To test live ASL alphabet gesture recognition, first get familiar with the 26 ASL signs by tapping Next and Recognize, then select the LIVE button and make ASL gestures in front of the camera. A screencast of the app running is available here.

4. What's Next

With a different sign language dataset, such as the RWTH-PHOENIX-Weather 2014 MS Public Hand Shape Dataset or the Continuous Sign Language Recognition Dataset, and a state-of-the-art transformer-based sign language model, a more powerful sign language recognition Android app can be developed based on the app here.

@jeffxtang jeffxtang marked this pull request as ready for review July 26, 2021 19:04
@@ -0,0 +1,21 @@
# Add project specific ProGuard rules here.
File can be removed


public AnalysisResult(String results) {
    mResults = results;
}

needs formatting

Comment on lines 121 to 123
if (maxScoreIdx == DELETE) result = "DELETE";
else if (maxScoreIdx == NOTHING) result = "NOTHING";
else if (maxScoreIdx == SPACE) result = "SPACE";

nit: Imo using blocks for every case will be more readable:

if (maxScoreIdx == DELETE) {
    result = "DELETE";
} else if (maxScoreIdx == NOTHING) {
    result = "NOTHING";
} else if (maxScoreIdx == SPACE) {
    result = "SPACE";
}

btnNext.setOnClickListener(new View.OnClickListener() {
    public void onClick(View v) {
        mStartLetterPos = (mStartLetterPos + 1) % 26;
        if (mStartLetterPos == 0) mStartLetterPos = 26;

nit:

if (mStartLetterPos == 0) {
    mStartLetterPos = 26;
}

@IvanKobzarev IvanKobzarev merged commit f09816a into pytorch:master Aug 3, 2021