scChat: A Large Language Model-Powered Co-Pilot for Contextualized Single-Cell RNA Sequencing Analysis
Welcome to the scChat page. scChat is a pioneering AI assistant designed to enhance single-cell RNA sequencing (scRNA-seq) analysis by incorporating research context into the workflow. Powered by a large language model (LLM), scChat goes beyond standard tasks like cell annotation by offering advanced capabilities such as research context-based experimental analysis, hypothesis validation, and suggestions for future experiments.
If you found this work useful, please cite this preprint as:
@misc{lu2024scchat,
title={scChat: A Large Language Model-Powered Co-Pilot for Contextualized Single-Cell RNA Sequencing Analysis},
author={Yen-Chun Lu and Ashley Varghese and Rahul Nahar and Hao Chen and Kunming Shao and Xiaoping Bao and Can Li},
year={2024},
eprint={2024.10.01.616063},
archivePrefix={bioRxiv},
doi={10.1101/2024.10.01.616063}
}
Data-driven methods such as unsupervised and supervised learning are essential tools in single-cell RNA sequencing (scRNA-seq) analysis. However, these methods often lack the ability to incorporate research context, which can lead to missed insights. scChat addresses this by integrating contextualized conversation with data analysis to provide a deeper understanding of experimental results. It supports the exploration of research hypotheses and generates actionable insights for future experiments.
Please read our scChat paper for more on the motivation and for details about how scChat works.
Model: scChat currently supports analysis using AnnData-formatted single-cell RNA sequencing datasets.
Capabilities: scChat integrates an LLM with specialized tools to enable tasks such as marker gene identification, UMAP clustering, and custom literature searches, all through conversational interactions.
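Before uploading, it can help to confirm that a file really is an AnnData dataset. The sketch below is not part of scChat; it assumes the dataset is stored as an .h5ad file (the usual on-disk form of AnnData, which is HDF5-based) and that the HDF5 signature sits at the very start of the file. The function names looks_like_h5ad and summarize are hypothetical helpers chosen for illustration.

```python
from pathlib import Path

# First 8 bytes of a standard HDF5 file (assumes no user block precedes them).
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_h5ad(path):
    """Cheap pre-upload check: .h5ad suffix plus HDF5 signature at offset 0."""
    p = Path(path)
    if p.suffix.lower() != ".h5ad" or not p.is_file():
        return False
    with p.open("rb") as fh:
        return fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE


def summarize(path):
    """Load the file with anndata and report its basic dimensions.

    anndata is imported lazily so the cheap check above works without it.
    """
    import anndata as ad

    adata = ad.read_h5ad(path)
    return {
        "n_cells": adata.n_obs,
        "n_genes": adata.n_vars,
        "obs_columns": list(adata.obs.columns),
    }
```

Calling looks_like_h5ad on a path only reads the first few bytes, so it is safe on large files; summarize loads the whole dataset and needs the anndata package installed.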
To set up the project environment and run the server, follow these steps:
- Install the required dependencies:
pip3 install -r requirements.txt
Follow these steps to use the application:
- Run python3 manage.py runserver
- Open http://127.0.0.1:8000/schatbot in a web browser.
- Upload the AnnData (adata) file.
- Upload the sample mapping (.json file), if required.
- Ask scChat to generate a UMAP for RNA analysis.
- The UMAP request returns a Python dictionary. Then type that you want to label/annotate clusters for overall cells.
- You can now ask to display the annotated UMAP for overall cells or view the non-annotated UMAP for overall cells.
- You can ask for the rationale or pose research questions specific to your dataset.
- Optionally, ask to filter and process a specific cell type.
- The filtering request returns a Python dictionary. Then type that you want to label/annotate clusters for the processed cell type.
- You can ask for reasoning, possible hypotheses, and so on.
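If the page does not load in the browser, a quick script can tell whether the development server is answering at all. This is a minimal sketch using only the Python standard library; the helper name server_is_up and the default timeout are assumptions, and the URL is the one given in the steps above. Any HTTP response, even an error status, is treated as "up", since the goal is only to confirm that something is listening.

```python
import urllib.error
import urllib.request

SCCHAT_URL = "http://127.0.0.1:8000/schatbot"  # address from the steps above


def server_is_up(url=SCCHAT_URL, timeout=3.0):
    """Return True if anything answers HTTP at `url`, False if the connection fails."""
    # An empty ProxyHandler disables proxy auto-detection, so the request
    # goes straight to the local server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status (for example 404 or 500).
        return True
    except OSError:
        # Covers URLError (such as connection refused) and socket timeouts.
        return False
```

Run server_is_up() in a second terminal after starting the server; a False result usually means the runserver command is not running or is bound to a different port.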
The datasets used for testing can be found at https://docs.google.com/spreadsheets/d/1NwN5GydHn0B3-W0DLcAfvnNtZVJEMUgBW9YyzXnS83A/edit?usp=sharing