Large-scale generative pretraining of single-cell data using transformers.
scGPT
This is the official codebase for scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI.
Installation
scGPT is available on PyPI. To install scGPT, run the following command:
$ pip install scgpt
[Optional] We recommend using wandb for logging and visualization.
$ pip install wandb
For development, we use the Poetry package manager. To install Poetry, follow the instructions here.
$ git clone this-repo-url
$ cd scGPT
$ poetry install
Note: The flash-attn dependency usually requires a specific GPU and CUDA version. If you encounter any issues, please refer to the flash-attn repository for installation instructions.
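After installation, it can help to confirm which of the packages mentioned above are visible to the current Python environment. The following is a minimal sketch using only the standard library; the helper name `check_packages` is hypothetical, and the import names `scgpt`, `wandb`, and `flash_attn` are assumed to match the installed distributions.

```python
from importlib.util import find_spec


def check_packages(names):
    """Return {import_name: bool} telling whether each top-level package can be located."""
    # find_spec returns None when a top-level module cannot be found,
    # and it does not import the module itself.
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Import names below are assumptions, not taken from the scGPT docs.
    status = check_packages(["scgpt", "wandb", "flash_attn"])
    for name, ok in status.items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A package reported as missing here only means it is not importable in this environment; it does not say whether the GPU or CUDA setup is compatible with flash-attn.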
Pretrained scGPT checkpoints
Please download the pretrained scGPT checkpoints from here.
Fine-tune scGPT for scRNA-seq integration
Please see our example code in examples/finetune_integration.py. By default, the script assumes that the scGPT checkpoint folder is stored in the examples/save directory.
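Before running the fine-tuning script, one might verify that a checkpoint folder is actually present under examples/save. The sketch below is an assumption-laden helper, not part of scGPT: the function name `find_checkpoint_dirs` is hypothetical, and it only lists subfolders without inspecting their contents, since the checkpoint file layout is not described here.

```python
from pathlib import Path


def find_checkpoint_dirs(save_dir="examples/save"):
    """Return the sorted subfolders of save_dir, or an empty list if it does not exist."""
    root = Path(save_dir)
    if not root.is_dir():
        return []
    # Each subfolder is treated as a candidate checkpoint folder (assumed layout).
    return sorted(p for p in root.iterdir() if p.is_dir())


if __name__ == "__main__":
    candidates = find_checkpoint_dirs()
    if candidates:
        for path in candidates:
            print(f"candidate checkpoint folder: {path}")
    else:
        print("no checkpoint folder found under examples/save")
```

If the list is empty, the downloaded checkpoint folder likely still needs to be moved into examples/save, or the script needs to be pointed at its actual location.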
To-do-list
- Upload the pretrained model checkpoint
- Publish to PyPI
- Provide the pretraining code with generative attention masking
- Finetuning examples for multi-omics integration, cell type annotation, perturbation prediction, cell generation
- Example code for Gene Regulatory Network analysis
- Documentation website with readthedocs
- Bump up to pytorch 2.0
- New pretraining on larger datasets
- Reference mapping example
- Finetuning with LoRA
- Publish to Hugging Face model hub
Contributing
We greatly welcome contributions to scGPT. Please submit a pull request if you have any ideas or bug fixes. We also welcome reports of any issues you encounter while using scGPT.
Acknowledgements
We sincerely thank the authors of the following open-source projects:
Citing scGPT
@article{cui2023scGPT,
title={scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI},
author={Cui, Haotian and Wang, Chloe and Maan, Hassaan and Wang, Bo},
journal={bioRxiv},
year={2023},
publisher={Cold Spring Harbor Laboratory}
}
Hashes for scgpt-0.1.2.post1-py3-none-any.whl
Algorithm | Hash digest |
---|---|
SHA256 | b2a096de3c75a33dca03d3e2228ba5394544cdc562a4713796677f490a2263ce |
MD5 | d3a83692c23983e31feac682ddc7ef05 |
BLAKE2b-256 | 18c578cde91ae516be5c9512fe43609b461d88d5a8d33cd2710f3797ce8d8a5e |