.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "python/gallery/beginner_guide.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_python_gallery_beginner_guide.py:

.. _beginner-guide:

Beginner guide
==============

The `atlasapprox-disease `_ Python API provides access to over 600 disease-related
single-cell datasets. Its initial data source is the CELLxGENE Census, which covers
diseases such as COVID-19, diabetes, acute kidney failure, and gastritis, and includes
metadata such as cell type, developmental stage, and sex.

The API lets you quickly explore cellular and gene expression patterns in disease
contexts. This tutorial covers the basics of using it.

.. GENERATED FROM PYTHON SOURCE LINES 19-34

Installation
------------

(Optional) To keep dependencies consistent, we recommend setting up a virtual environment:

.. code-block:: bash

    python -m venv ./venv
    source ./venv/bin/activate

Then install the ``atlasapprox-disease`` package using ``pip``:

.. code-block:: bash

    pip install atlasapprox-disease

.. GENERATED FROM PYTHON SOURCE LINES 36-40

Python quick start
------------------

Below are two examples of common operations with the ``atlasapprox_disease`` Python API:

.. GENERATED FROM PYTHON SOURCE LINES 40-46

.. code-block:: Python

    # Import the package and initialise the API
    import atlasapprox_disease

    api = atlasapprox_disease.API()

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Loading atlasapprox_disease from: /home/docs/checkouts/readthedocs.org/user_builds/cell-atlas-approximations-disease-api/checkouts/latest/Python/atlasapprox_disease/__init__.py

.. GENERATED FROM PYTHON SOURCE LINES 47-53

Querying cell metadata
^^^^^^^^^^^^^^^^^^^^^^

The ``metadata`` function lets you explore cell metadata across datasets by filtering on
attributes such as tissue, disease, or developmental stage. For example, the following
call selects cells from lung tissue at the adult stage:

.. GENERATED FROM PYTHON SOURCE LINES 53-59

.. code-block:: Python

    api.metadata(
        tissue="lung",
        development_stage="adult"
    )

.. raw:: html

    <table border="1" class="dataframe">
      <thead>
        <tr><th></th><th>unique_id</th><th>dataset_id</th><th>cell_type</th><th>tissue_general</th><th>disease</th><th>development_stage_general</th><th>sex</th><th>cell_count</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>98ac1a55676e61a854d68c3f3e5f791a</td><td>01209dce-3575-4bed-b1df-129f57fbc031</td><td>CD4-positive, alpha-beta T cell</td><td>lung</td><td>normal</td><td>adult</td><td>male</td><td>1993</td></tr>
        <tr><th>1</th><td>01b9218f253b6c07a021b1b4f3954871</td><td>01209dce-3575-4bed-b1df-129f57fbc031</td><td>CD4-positive, alpha-beta thymocyte</td><td>lung</td><td>normal</td><td>adult</td><td>male</td><td>3056</td></tr>
        <tr><th>2</th><td>bbff9e63b378470377555ed3f97bedb2</td><td>01209dce-3575-4bed-b1df-129f57fbc031</td><td>CD8-positive, alpha-beta T cell</td><td>lung</td><td>normal</td><td>adult</td><td>male</td><td>2391</td></tr>
        <tr><th>3</th><td>2540425305dd88c7aedd91fcebea46d0</td><td>01209dce-3575-4bed-b1df-129f57fbc031</td><td>CD8-positive, alpha-beta thymocyte</td><td>lung</td><td>normal</td><td>adult</td><td>male</td><td>3350</td></tr>
        <tr><th>4</th><td>b387d4572395da54759d1b9fdd9b4a66</td><td>01209dce-3575-4bed-b1df-129f57fbc031</td><td>immature alpha-beta T cell</td><td>lung</td><td>normal</td><td>adult</td><td>male</td><td>171</td></tr>
        <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>
        <tr><th>1413</th><td>2430bd747f2aad4f2ead5a33ff9ef3b8</td><td>f72958f5-7f42-4ebb-98da-445b0c6de516</td><td>type II pneumocyte</td><td>lung</td><td>normal</td><td>adult</td><td>male</td><td>30969</td></tr>
        <tr><th>1414</th><td>548cac77e1039631066deb4d65bdaa39</td><td>f72958f5-7f42-4ebb-98da-445b0c6de516</td><td>unknown</td><td>lung</td><td>normal</td><td>adult</td><td>female</td><td>866</td></tr>
        <tr><th>1415</th><td>bb0f087ae00b96f2552bd1e84e8fa105</td><td>f72958f5-7f42-4ebb-98da-445b0c6de516</td><td>unknown</td><td>lung</td><td>normal</td><td>adult</td><td>male</td><td>1358</td></tr>
        <tr><th>1416</th><td>1c4e1440f60ee38746e5200b635337c7</td><td>f72958f5-7f42-4ebb-98da-445b0c6de516</td><td>vein endothelial cell</td><td>lung</td><td>normal</td><td>adult</td><td>female</td><td>2028</td></tr>
        <tr><th>1417</th><td>077eee6802a2c541daf0fd33c379f677</td><td>f72958f5-7f42-4ebb-98da-445b0c6de516</td><td>vein endothelial cell</td><td>lung</td><td>normal</td><td>adult</td><td>male</td><td>5882</td></tr>
      </tbody>
    </table>
    <p>1418 rows × 8 columns</p>

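Because ``metadata`` returns an ordinary ``pandas.DataFrame``, the result can be summarized with standard pandas operations. The sketch below is one possible way to total ``cell_count`` per cell type. So that it runs without network access, it rebuilds a small frame from the last five rows of the output above instead of calling the API; the helper name ``total_cells_per_type`` is hypothetical and not part of the package.

```python
import pandas as pd

# With the live API you would start from:
#   meta = api.metadata(tissue="lung", development_stage="adult")
# Here a few rows are copied by hand from the output shown above
# (unique_id omitted for brevity) so the example runs offline.
columns = [
    "dataset_id", "cell_type", "tissue_general", "disease",
    "development_stage_general", "sex", "cell_count",
]
dataset = "f72958f5-7f42-4ebb-98da-445b0c6de516"
rows = [
    (dataset, "type II pneumocyte", "lung", "normal", "adult", "male", 30969),
    (dataset, "unknown", "lung", "normal", "adult", "female", 866),
    (dataset, "unknown", "lung", "normal", "adult", "male", 1358),
    (dataset, "vein endothelial cell", "lung", "normal", "adult", "female", 2028),
    (dataset, "vein endothelial cell", "lung", "normal", "adult", "male", 5882),
]
meta = pd.DataFrame(rows, columns=columns)


def total_cells_per_type(df):
    """Sum cell_count over all rows of each cell type, largest first.

    Hypothetical helper for illustration; not part of atlasapprox_disease.
    """
    return (
        df.groupby("cell_type")["cell_count"]
        .sum()
        .sort_values(ascending=False)
    )


totals = total_cells_per_type(meta)
print(totals)
```

Summing over ``sex`` merges the separate male and female rows of a cell type, which gives a quick sense of how well each cell type is sampled before querying expression data.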
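.. GENERATED FROM PYTHON SOURCE LINES 60-64

The output is a ``pandas.DataFrame`` with over 1400 rows. Each row is a unique combination
of dataset (``dataset_id``), cell type, disease, and sex for the requested tissue and
developmental stage, with the number of matching cells in ``cell_count``. It gives a quick
overview of the available cell types and conditions, which you can then use when querying
other API functions.

.. GENERATED FROM PYTHON SOURCE LINES 66-71

Querying average gene expression
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``average`` function retrieves average gene expression across cell types, tissues,
and diseases. For example, to query immune-related genes in COVID-19:

.. GENERATED FROM PYTHON SOURCE LINES 71-77

.. code-block:: Python

    api.average(
        features="ACE2,TLR4,NLRP3,MBL2,IL6",
        disease="COVID-19"
    )

.. raw:: html

    <table border="1" class="dataframe">
      <thead>
        <tr><th></th><th>cell_count</th><th>cell_type</th><th>tissue_general</th><th>disease</th><th>dataset_id</th><th>ACE2</th><th>TLR4</th><th>NLRP3</th><th>MBL2</th><th>IL6</th></tr>
      </thead>
      <tbody>
        <tr><th>0</th><td>42850</td><td>B cell</td><td>blood</td><td>COVID-19</td><td>01ad3cd7-3929-4654-84c0-6db05bd5fd59</td><td>0.000000</td><td>0.020387</td><td>0.014871</td><td>0.0</td><td>0.246985</td></tr>
        <tr><th>1</th><td>111297</td><td>CD4-positive, alpha-beta T cell</td><td>blood</td><td>COVID-19</td><td>01ad3cd7-3929-4654-84c0-6db05bd5fd59</td><td>0.000000</td><td>0.011792</td><td>0.016089</td><td>0.0</td><td>0.000325</td></tr>
        <tr><th>2</th><td>64766</td><td>CD8-positive, alpha-beta T cell</td><td>blood</td><td>COVID-19</td><td>01ad3cd7-3929-4654-84c0-6db05bd5fd59</td><td>0.000000</td><td>0.016499</td><td>0.018025</td><td>0.0</td><td>0.000705</td></tr>
        <tr><th>3</th><td>113753</td><td>classical monocyte</td><td>blood</td><td>COVID-19</td><td>01ad3cd7-3929-4654-84c0-6db05bd5fd59</td><td>0.000000</td><td>0.636274</td><td>0.534583</td><td>0.0</td><td>0.006524</td></tr>
        <tr><th>4</th><td>4776</td><td>conventional dendritic cell</td><td>blood</td><td>COVID-19</td><td>01ad3cd7-3929-4654-84c0-6db05bd5fd59</td><td>0.000000</td><td>0.123744</td><td>0.248952</td><td>0.0</td><td>0.002513</td></tr>
        <tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td><td>...</td></tr>
        <tr><th>565</th><td>118</td><td>mast cell</td><td>respiratory system</td><td>COVID-19</td><td>f156606a-dd9a-49fd-bc40-0e069b6cf07c</td><td>0.000000</td><td>0.090158</td><td>0.000000</td><td>0.0</td><td>0.000000</td></tr>
        <tr><th>566</th><td>1108</td><td>mature NK T cell</td><td>respiratory system</td><td>COVID-19</td><td>f156606a-dd9a-49fd-bc40-0e069b6cf07c</td><td>0.000000</td><td>0.042362</td><td>0.104304</td><td>0.0</td><td>0.013103</td></tr>
        <tr><th>567</th><td>7438</td><td>myeloid cell</td><td>respiratory system</td><td>COVID-19</td><td>f156606a-dd9a-49fd-bc40-0e069b6cf07c</td><td>0.000000</td><td>0.736049</td><td>0.948961</td><td>0.0</td><td>0.164485</td></tr>
        <tr><th>568</th><td>4</td><td>neutrophil</td><td>respiratory system</td><td>COVID-19</td><td>f156606a-dd9a-49fd-bc40-0e069b6cf07c</td><td>0.000000</td><td>0.000000</td><td>4.633920</td><td>0.0</td><td>0.000000</td></tr>
        <tr><th>569</th><td>241</td><td>unknown</td><td>respiratory system</td><td>COVID-19</td><td>f156606a-dd9a-49fd-bc40-0e069b6cf07c</td><td>0.001793</td><td>0.250800</td><td>0.319913</td><td>0.0</td><td>0.081020</td></tr>
      </tbody>
    </table>
    <p>570 rows × 10 columns</p>
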
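Averages computed from very few cells can be misleading: in the output above, the neutrophil row has the highest NLRP3 value but is based on only 4 cells. The sketch below shows one possible way to rank cell types by a gene after dropping sparsely sampled rows. It rebuilds a small frame from the last five rows of the output above so that it runs offline; the helper ``rank_by_gene`` and the threshold of 100 cells are illustrative assumptions, not part of the package or a recommended cutoff.

```python
import pandas as pd

# With the live API you would start from:
#   avg = api.average(features="ACE2,TLR4,NLRP3,MBL2,IL6", disease="COVID-19")
# Here the respiratory-system rows are copied by hand from the output
# shown above (only two gene columns kept) so the example runs offline.
rows = [
    (118, "mast cell", 0.090158, 0.000000),
    (1108, "mature NK T cell", 0.042362, 0.104304),
    (7438, "myeloid cell", 0.736049, 0.948961),
    (4, "neutrophil", 0.000000, 4.633920),
    (241, "unknown", 0.250800, 0.319913),
]
avg = pd.DataFrame(rows, columns=["cell_count", "cell_type", "TLR4", "NLRP3"])

# Arbitrary illustrative threshold; choose one suited to your analysis.
MIN_CELLS = 100


def rank_by_gene(df, gene, min_cells=MIN_CELLS):
    """Rank rows by average expression of `gene`, highest first,
    keeping only rows backed by at least `min_cells` cells.

    Hypothetical helper for illustration; not part of atlasapprox_disease.
    """
    well_sampled = df[df["cell_count"] >= min_cells]
    return well_sampled.sort_values(gene, ascending=False).reset_index(drop=True)


ranked = rank_by_gene(avg, "NLRP3")
print(ranked[["cell_type", "cell_count", "NLRP3"]])
```

With the filter in place the 4-cell neutrophil row is excluded and the myeloid cell row ranks first; calling ``rank_by_gene(avg, "NLRP3", min_cells=0)`` shows the unfiltered ordering for comparison.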
.. GENERATED FROM PYTHON SOURCE LINES 78-82

The output is a ``pandas.DataFrame`` with descriptive columns such as ``cell_type``,
``tissue_general``, ``disease``, and ``dataset_id``, plus one column per queried gene
holding its average expression (in counts per 10k). Use it to explore gene activity in
specific conditions and to identify genes worth further analysis.

.. GENERATED FROM PYTHON SOURCE LINES 84-92

Next steps
----------

This tutorial introduced the basics of the ``atlasapprox-disease`` API. To learn more,
explore additional functions such as ``dotplot`` for visualizing gene expression, or query
differential gene expression data. Visit the `official documentation `_ for further details.

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 27.095 seconds)

.. _sphx_glr_download_python_gallery_beginner_guide.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: beginner_guide.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: beginner_guide.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: beginner_guide.zip `

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_