Python
The Python interface provides access to the disease atlas approximation API and enables efficient querying of disease-related cell atlas approximations.
Requirements
You need the following Python packages:
requests
pandas
Installation
You can use pip to install the atlasapprox_disease package:
pip install atlasapprox-disease
Getting Started
To use the API, import the atlasapprox_disease package:
import atlasapprox_disease as aad
and initialise the API object:
api = aad.API()
Here’s an example of querying metadata for datasets related to COVID-19 in lung tissue and then using the unique_ids to query average gene expression:
# Step 1: Query metadata to get unique_ids
metadata = api.metadata(disease="covid", tissue="lung")
print(metadata.head())
# Step 2: Use a unique_id to query average expression of specific genes
unique_id = metadata["unique_id"].iloc[0] # Select the first unique_id
avg_expr = api.average(features="IGHG1,CXCL13,S100A8", unique_ids=unique_id)
print(avg_expr)
Note
When using unique_ids in methods like average, fraction_detected, or dotplot, only specify the features parameter alongside it. Do not include other metadata filters (disease, cell_type, tissue, sex, development_stage), as unique_ids already encapsulate these conditions. Combining them will raise a ParamsConflictError.
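The rule in this note can also be checked locally before a request is sent. The helper below is a minimal sketch: `build_query_kwargs` is a hypothetical name, not part of atlasapprox_disease, and it raises a plain `ValueError` rather than the library's ParamsConflictError.

```python
# Hypothetical helper, not part of atlasapprox_disease: it mirrors the rule
# that unique_ids must not be combined with other metadata filters.
METADATA_FILTERS = ("disease", "cell_type", "tissue", "sex", "development_stage")


def build_query_kwargs(features, unique_ids=None, **filters):
    """Return keyword arguments for average, fraction_detected or dotplot."""
    unknown = set(filters) - set(METADATA_FILTERS)
    if unknown:
        raise ValueError(f"Unknown filters: {sorted(unknown)}")
    # Drop filters that were left unset.
    active = {name: value for name, value in filters.items() if value is not None}
    if unique_ids is not None and active:
        raise ValueError(f"unique_ids cannot be combined with: {sorted(active)}")
    kwargs = {"features": features}
    if unique_ids is not None:
        kwargs["unique_ids"] = unique_ids
    kwargs.update(active)
    return kwargs


# Usage (needs network access and an initialised API object):
# avg_expr = api.average(**build_query_kwargs("IGHG1,CXCL13", unique_ids=unique_id))
```

Building the arguments in one place means a conflicting call fails before any request is made.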
Examples
Explore practical examples of using the Python API to analyze disease-related single-cell data:
Reference API
Cell atlas approximations, Python API interface.
- class atlasapprox_disease.API(url=None)
Bases: object
Main object used to access the atlasapprox-disease REST API.
- metadata(disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None)
Retrieves metadata records from the atlasapprox-disease API. Each record represents a unique combination of dataset, cell type, tissue, disease condition, sex, and developmental stage that meets the query criteria.
- Parameters:
disease (str, optional) – Filter by disease name (e.g., “covid”).
cell_type (str, optional) – Filter by cell type (e.g., “fibroblast”).
tissue (str, optional) – Filter by tissue (e.g., “lung”).
sex (str, optional) – Filter by sex (e.g., “male”, “female”).
development_stage (str, optional) – Filter by developmental stage (e.g., “adult”).
- Returns:
A DataFrame containing metadata records for datasets matching the filters.
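Several metadata records can be forwarded to average, fraction_detected, or dotplot at once, because their unique_ids parameter takes a comma-separated string. The sketch below assumes the returned DataFrame has a unique_id column, as in the Getting Started example; `join_unique_ids` is a hypothetical helper name.

```python
def join_unique_ids(metadata, limit=None):
    """Join the unique_id values of metadata records into one string.

    `metadata` is anything indexable by "unique_id", for example the
    DataFrame returned by api.metadata(). `limit` optionally keeps only
    the first few IDs.
    """
    ids = [str(uid) for uid in metadata["unique_id"]]
    if limit is not None:
        ids = ids[:limit]
    return ",".join(ids)


# Usage (needs network access and an initialised API object):
# metadata = api.metadata(disease="covid", tissue="lung")
# avg_expr = api.average(features="IGHG1", unique_ids=join_unique_ids(metadata, limit=3))
```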
- differential_cell_type_abundance(differential_axis: str = None, disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None)
Get differential cell type abundance between a specified condition (e.g., disease) and a baseline (e.g., normal) across datasets.
- Parameters:
differential_axis (str, optional) – Axis for comparison (default: “disease”). Options: “disease” (disease vs. normal), “sex” (male vs. female).
disease (str, optional) – Filter by disease name (e.g., “flu”).
cell_type (str, optional) – Filter by cell type (e.g., “macrophage”).
tissue (str, optional) – Filter by tissue (e.g., “lung”).
sex (str, optional) – Filter by sex (e.g., “male”, “female”).
development_stage (str, optional) – Filter by developmental stage (e.g., “adult”).
- Returns:
A DataFrame containing differential cell type abundance between conditions.
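As a usage sketch, the wrapper below runs the same query along both documented axes so the results can be compared side by side. The function name `abundance_by_axis` is hypothetical; the keyword arguments are the ones listed above.

```python
def abundance_by_axis(api, tissue, disease=None):
    """Query differential cell type abundance along both documented axes.

    Returns a dict mapping each axis name to whatever the API call returns
    (a DataFrame, according to the reference above).
    """
    results = {}
    for axis in ("disease", "sex"):
        results[axis] = api.differential_cell_type_abundance(
            differential_axis=axis,
            disease=disease,
            tissue=tissue,
        )
    return results


# Usage (needs network access and an initialised API object):
# results = abundance_by_axis(api, tissue="lung", disease="flu")
# print(results["disease"].head())
```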
- differential_gene_expression(differential_axis: str = None, disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None, top_n: int = None, feature: str = None, method: str = None)
Get differential gene expression between conditions.
- Parameters:
differential_axis (str, optional) – Axis for comparison (default: “disease”). Options: “disease” (disease vs. normal), “sex” (male vs. female), “age” (e.g., adult vs. other).
disease (str, optional) – Filter by disease name (e.g., “COVID”).
cell_type (str, optional) – Filter by cell type (e.g., “T cell”).
tissue (str, optional) – Filter by tissue (e.g., “lung”).
sex (str, optional) – Filter by sex (e.g., “male”, “female”).
development_stage (str, optional) – Filter by developmental stage (e.g., “adult”).
top_n (int, optional) – Number of top up- and down-regulated genes to return (default: 10). Ignored if feature is provided.
feature (str, optional) – Specific gene to query (e.g., “IL6”). If provided, top_n is ignored.
method (str, optional) – Method to compute differential expression (default: “delta_fraction”). Options: “delta_fraction”, “ratio_average”.
- Returns:
A DataFrame containing differential gene expression results between conditions.
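Because top_n is ignored whenever feature is given, it can be clearer to pass only one of the two. The following sketch does that; `de_query` is a hypothetical wrapper name, and the defaults shown are the ones documented above.

```python
def de_query(api, disease, cell_type=None, tissue=None, feature=None,
             top_n=10, method="delta_fraction"):
    """Run a disease-vs-normal differential expression query.

    Passes either `feature` (single-gene query) or `top_n` (ranked list),
    never both, so the call states its intent explicitly.
    """
    kwargs = {
        "differential_axis": "disease",
        "disease": disease,
        "cell_type": cell_type,
        "tissue": tissue,
        "method": method,
    }
    if feature is not None:
        kwargs["feature"] = feature  # top_n would be ignored anyway
    else:
        kwargs["top_n"] = top_n
    return api.differential_gene_expression(**kwargs)


# Usage (needs network access and an initialised API object):
# top_genes = de_query(api, disease="COVID", cell_type="T cell", tissue="lung", top_n=5)
# il6_only = de_query(api, disease="COVID", tissue="lung", feature="IL6")
```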
- highest_measurement(feature: str = None, number: int = None)
Retrieves the top N cell type and tissue combinations with the highest expression of a specified gene across datasets. This helps identify which cell types express a gene of interest most highly in different diseases and tissues.
- Parameters:
feature (str) – The gene to query (e.g., “IL6”).
number (int, optional) – Number of top-expressing cell types to return (default: 10).
- Returns:
A DataFrame containing the top cell types with the highest expression of the specified feature.
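Since highest_measurement takes a single gene, querying several genes means one call per gene. A minimal sketch, with `top_expressors` as a hypothetical helper name:

```python
def top_expressors(api, genes, number=5):
    """Return a dict mapping each gene to its highest_measurement result."""
    return {
        gene: api.highest_measurement(feature=gene, number=number)
        for gene in genes
    }


# Usage (needs network access and an initialised API object):
# results = top_expressors(api, ["IL6", "AGT"], number=5)
# print(results["IL6"])
```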
- average(features: str = None, disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None, unique_ids: str = None, include_normal: bool = None)
Get the average expression levels of one or more genes across cell types, tissues, sexes, developmental stages, and diseases.
- Parameters:
features (str) – A comma-separated list of genes to query (e.g., “IL6,AGT”).
disease (str, optional) – Filter by disease name (e.g., “diabete”).
cell_type (str, optional) – Filter by cell type (e.g., “T cell”).
tissue (str, optional) – Filter by tissue (e.g., “lung”).
sex (str, optional) – Filter by sex (e.g., “male”, “female”).
development_stage (str, optional) – Filter by developmental stage (e.g., “adult”).
unique_ids (str, optional) – A comma-separated list of unique IDs from metadata results to filter specific dataset entries.
include_normal (bool, optional) – If True, includes the corresponding normal condition alongside the queried disease (default: False). Only applicable when a disease filter is provided. If no disease is specified, results include both disease and normal conditions.
- Returns:
A DataFrame containing average expression data for the specified features across conditions.
Note
If unique_ids is provided, it should only be used with the features parameter. Other metadata filters (disease, cell_type, tissue, sex, development_stage) should not be specified, as the unique_ids already encapsulate these metadata conditions. Using both will raise a ParamsConflictError.
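When filtering by disease instead of unique_ids, include_normal=True adds the matching normal condition to the result. The sketch below shows such a call; `average_with_normal` is a hypothetical wrapper name, and it joins a Python list of genes into the comma-separated string that features expects.

```python
def average_with_normal(api, genes, disease, tissue=None):
    """Average expression for `genes` in a disease plus its normal baseline."""
    return api.average(
        features=",".join(genes),  # the API expects a comma-separated string
        disease=disease,
        tissue=tissue,
        include_normal=True,
    )


# Usage (needs network access and an initialised API object):
# avg_expr = average_with_normal(api, ["IL6", "AGT"], disease="covid", tissue="lung")
```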
- fraction_detected(features: str, disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None, unique_ids: str = None, include_normal: bool = None)
Get the fraction of cells expressing the specified features across conditions.
- Parameters:
features (str) – A comma-separated list of genes to query.
disease (str, optional) – Filter by disease name.
cell_type (str, optional) – Filter by cell type.
tissue (str, optional) – Filter by tissue.
sex (str, optional) – Filter by sex.
development_stage (str, optional) – Filter by developmental stage.
unique_ids (str, optional) – A comma-separated list of unique IDs from metadata results to filter specific dataset entries.
include_normal (bool, optional) – If True, includes the corresponding normal condition alongside the queried disease (default: False). Only applicable when a disease filter is provided. If no disease is specified, results include both disease and normal conditions.
- Returns:
A DataFrame containing the fraction of cells expressing the specified features across conditions.
- dotplot(features: str, disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None, unique_ids: str = None, include_normal: bool = None)
Prepare data for a dot plot, including average expression and fraction detected. The data is suitable for visualizing in a dot plot format, where dot size represents fraction detected and color represents average expression.
- Parameters:
features (str) – A comma-separated list of genes to query.
disease (str, optional) – Filter by disease name.
cell_type (str, optional) – Filter by cell type.
tissue (str, optional) – Filter by tissue.
sex (str, optional) – Filter by sex.
development_stage (str, optional) – Filter by developmental stage.
unique_ids (str, optional) – A comma-separated list of unique IDs from metadata results to filter specific dataset entries.
include_normal (bool, optional) – If True, includes the corresponding normal condition alongside the queried disease (default: False). Only applicable when a disease filter is provided. If no disease is specified, results include both disease and normal conditions.
- Returns:
A DataFrame containing dot plot data with average expression and fraction detected for the specified features.
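The metadata and dotplot methods can be chained: filter records first, then request dot plot data for the selected unique IDs only. This is a sketch under the assumption that the metadata result has a unique_id column, as in the Getting Started example; `dotplot_for_records` and its `max_records` argument are hypothetical.

```python
def dotplot_for_records(api, genes, max_records=3, **metadata_filters):
    """Fetch dot plot data for the first few records matching the filters."""
    metadata = api.metadata(**metadata_filters)
    ids = [str(uid) for uid in metadata["unique_id"]][:max_records]
    if not ids:
        raise ValueError("No metadata records matched the filters")
    # Only features and unique_ids are passed, so no filter conflict arises.
    return api.dotplot(features=",".join(genes), unique_ids=",".join(ids))


# Usage (needs network access and an initialised API object):
# data = dotplot_for_records(api, ["IGHG1", "CXCL13"], disease="covid", tissue="lung")
```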