Python

The Python interface provides access to the disease atlas approximation API. It enables efficient querying of disease-related cell atlas approximations.

Requirements

You need the following Python packages:

  • requests

  • pandas

Installation

You can use pip to install the atlasapprox_disease package:

pip install atlasapprox-disease

Getting Started

To use the API, import the atlasapprox_disease package:

import atlasapprox_disease as aad

and initialise the API object:

api = aad.API()

Here’s an example that queries metadata for datasets related to COVID-19 in lung tissue, then uses one of the returned unique_id values to query average gene expression:

# Step 1: Query metadata to get unique_ids
metadata = api.metadata(disease="covid", tissue="lung")
print(metadata.head())

# Step 2: Use a unique_id to query average expression of specific genes
unique_id = metadata["unique_id"].iloc[0]  # Select the first unique_id
avg_expr = api.average(features="IGHG1,CXCL13,S100A8", unique_ids=unique_id)
print(avg_expr)

Note

When using unique_ids in methods like average, fraction_detected, or dotplot, only specify the features parameter alongside it. Do not include other metadata filters (disease, cell_type, tissue, sex, development_stage), as unique_ids already encapsulate these conditions. Combining them will raise a ParamsConflictError.
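The rule in this note can be checked on the client side before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named build_query that is not part of atlasapprox_disease; it only assembles keyword arguments, joins lists into the comma-separated strings the API expects, and raises a plain ValueError rather than the library's own ParamsConflictError.

```python
# Metadata filters that must not accompany unique_ids.
METADATA_FILTERS = ("disease", "cell_type", "tissue", "sex", "development_stage")


def _as_csv(value):
    """Accept one string or an iterable of strings; return a comma-separated string."""
    if isinstance(value, str):
        return value
    return ",".join(str(item) for item in value)


def build_query(features, unique_ids=None, **filters):
    """Build keyword arguments for average, fraction_detected, or dotplot.

    Hypothetical helper for illustration; not part of atlasapprox_disease.
    """
    unknown = sorted(set(filters) - set(METADATA_FILTERS))
    if unknown:
        raise TypeError("Unknown filter(s): " + ", ".join(unknown))

    # Filters left as None are treated as "not specified".
    active = {name: value for name, value in filters.items() if value is not None}
    if unique_ids is not None and active:
        raise ValueError(
            "unique_ids cannot be combined with: " + ", ".join(sorted(active))
        )

    query = {"features": _as_csv(features)}
    if unique_ids is not None:
        query["unique_ids"] = _as_csv(unique_ids)
    query.update(active)
    return query


# Intended use (requires the package and network access):
#   avg_expr = api.average(**build_query(["IGHG1", "CXCL13"], unique_ids=unique_id))
```

Validating locally keeps the conflict visible at the call site; the server-side check remains the authority on what is accepted.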

Examples

Explore practical examples of using the Python API to analyze disease-related single-cell data:

Reference API

Cell atlas approximations, Python API interface.

class atlasapprox_disease.API(url=None)

Bases: object

Main object used to access the atlasapprox-disease REST API.

metadata(disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None)

Retrieves metadata records from the atlasapprox-disease API. Each record represents a unique combination of dataset, cell type, tissue, disease condition, sex, and developmental stage that meets the query criteria.

Parameters:
  • disease (str, optional) – Filter by disease name (e.g., “covid”).

  • cell_type (str, optional) – Filter by cell type (e.g., “fibroblast”).

  • tissue (str, optional) – Filter by tissue (e.g., “lung”).

  • sex (str, optional) – Filter by sex (e.g., “male”, “female”).

  • development_stage (str, optional) – Filter by developmental stage (e.g., “adult”).

Returns:

A DataFrame containing metadata records for datasets matching the filters.
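Since methods such as average take unique_ids as a comma-separated string, it can help to turn the metadata result into that form. The sketch below assumes only that the result exposes a "unique_id" column, as in the Getting Started example; collect_unique_ids is a hypothetical helper name, not a library function.

```python
def collect_unique_ids(metadata, limit=None):
    """Return distinct unique_id values, in first-seen order, as a comma-separated string.

    `metadata` is anything indexable by column name whose "unique_id" column
    can be iterated (for example, the DataFrame returned by api.metadata).
    """
    # dict.fromkeys removes duplicates while preserving order.
    ids = list(dict.fromkeys(str(uid) for uid in metadata["unique_id"]))
    if limit is not None:
        ids = ids[:limit]
    return ",".join(ids)


# Intended use (requires the package and network access):
#   metadata = api.metadata(disease="covid", tissue="lung")
#   avg_expr = api.average(features="IL6", unique_ids=collect_unique_ids(metadata, limit=5))
```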

differential_cell_type_abundance(differential_axis: str = None, disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None)

Get differential cell type abundance between a specified condition (e.g., disease) and a baseline (e.g., normal) across datasets.

Parameters:
  • differential_axis (str, optional) – Axis for comparison (default: “disease”). Options: “disease” (disease vs. normal), “sex” (male vs. female).

  • disease (str, optional) – Filter by disease name (e.g., “flu”).

  • cell_type (str, optional) – Filter by cell type (e.g., “macrophage”).

  • tissue (str, optional) – Filter by tissue (e.g., “lung”).

  • sex (str, optional) – Filter by sex (e.g., “male”, “female”).

  • development_stage (str, optional) – Filter by developmental stage (e.g., “adult”).

Returns:

A DataFrame containing differential cell type abundance between conditions.
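To compare the same selection along both documented axes, the method can be called once per axis. This is a rough sketch; abundance_by_axis is a hypothetical wrapper, and whether a given filter combination is meaningful for the "sex" axis is left to the API to decide.

```python
def abundance_by_axis(api, axes=("disease", "sex"), **filters):
    """Call differential_cell_type_abundance once per axis.

    Returns a dict mapping each axis name to the result for that axis.
    `api` is expected to be an atlasapprox_disease.API instance.
    """
    return {
        axis: api.differential_cell_type_abundance(differential_axis=axis, **filters)
        for axis in axes
    }


# Intended use (requires the package and network access):
#   results = abundance_by_axis(api, tissue="lung")
#   print(results["disease"].head())
```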

differential_gene_expression(differential_axis: str = None, disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None, top_n: int = None, feature: str = None, method: str = None)

Get differential gene expression between conditions.

Parameters:
  • differential_axis (str, optional) – Axis for comparison (default: “disease”). Options: “disease” (disease vs. normal), “sex” (male vs. female), “age” (e.g., adult vs. other).

  • disease (str, optional) – Filter by disease name (e.g., “COVID”).

  • cell_type (str, optional) – Filter by cell type (e.g., “T cell”).

  • tissue (str, optional) – Filter by tissue (e.g., “lung”).

  • sex (str, optional) – Filter by sex (e.g., “male”, “female”).

  • development_stage (str, optional) – Filter by developmental stage (e.g., “adult”).

  • top_n (int, optional) – Number of top up- and down-regulated genes to return (default: 10). Ignored if feature is provided.

  • feature (str, optional) – Specific gene to query (e.g., “IL6”). If provided, top_n is ignored.

  • method (str, optional) – Method to compute differential expression (default: “delta_fraction”). Options: “delta_fraction”, “ratio_average”.

Returns:

A DataFrame containing differential gene expression results between conditions.
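Because top_n is ignored whenever feature is given, and method accepts only two values, the arguments can be normalised before the call. The following is a minimal sketch; dge_arguments is a hypothetical helper, and the defaults it applies simply mirror the ones documented above.

```python
# Methods documented for differential_gene_expression.
DGE_METHODS = ("delta_fraction", "ratio_average")


def dge_arguments(feature=None, top_n=10, method="delta_fraction"):
    """Return keyword arguments for differential_gene_expression.

    Sends either `feature` or `top_n`, never both, so the call states
    explicitly which mode is being used.
    """
    if method not in DGE_METHODS:
        raise ValueError("method must be one of: " + ", ".join(DGE_METHODS))

    args = {"method": method}
    if feature is not None:
        # A single-gene query; top_n would be ignored, so it is omitted.
        args["feature"] = feature
    else:
        if top_n < 1:
            raise ValueError("top_n must be a positive integer")
        args["top_n"] = top_n
    return args


# Intended use (requires the package and network access):
#   result = api.differential_gene_expression(disease="covid", **dge_arguments(feature="IL6"))
```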

highest_measurement(feature: str = None, number: int = None)

Retrieves the top N cell type and tissue combinations with the highest expression of a specified gene across datasets. This helps identify which cell types most highly express a gene of interest in different diseases and tissues.

Parameters:
  • feature (str) – The gene to query (e.g., “IL6”).

  • number (int, optional) – Number of top-expressing cell types to return (default: 10).

Returns:

A DataFrame containing the top cell types with the highest expression of the specified feature.
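highest_measurement takes a single gene per call, so querying several genes means looping. A short sketch of that loop follows, with highest_for_genes as a hypothetical wrapper name.

```python
def highest_for_genes(api, genes, number=10):
    """Query highest_measurement once per gene.

    Returns a dict mapping each gene to the result for that gene.
    `api` is expected to be an atlasapprox_disease.API instance.
    """
    return {
        gene: api.highest_measurement(feature=gene, number=number)
        for gene in genes
    }


# Intended use (requires the package and network access):
#   tops = highest_for_genes(api, ["IL6", "CXCL13"], number=5)
#   print(tops["IL6"])
```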

average(features: str = None, disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None, unique_ids: str = None, include_normal: bool = None)

Get the average expression levels of one or more genes across cell types, tissues, sexes, developmental stages, and diseases.

Parameters:
  • features (str) – A comma-separated list of genes to query (e.g., “IL6,AGT”).

  • disease (str, optional) – Filter by disease name (e.g., “diabetes”).

  • cell_type (str, optional) – Filter by cell type (e.g., “T cell”).

  • tissue (str, optional) – Filter by tissue (e.g., “lung”).

  • sex (str, optional) – Filter by sex (e.g., “male”, “female”).

  • development_stage (str, optional) – Filter by developmental stage (e.g., “adult”).

  • unique_ids (str, optional) – A comma-separated list of unique IDs from metadata results to filter specific dataset entries.

  • include_normal (bool, optional) – If True, includes the corresponding normal condition alongside the queried disease (default: False). Only applicable when a disease filter is provided. If no disease is specified, results include both disease and normal conditions.

Returns:

A DataFrame containing average expression data for the specified features across conditions.

Note

If unique_ids is provided, it should only be used with the features parameter. Other metadata filters (disease, cell_type, tissue, sex, development_stage) should not be specified, as the unique_ids already encapsulate these metadata conditions. Using both will raise a ParamsConflictError.
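The include_normal flag only applies when a disease filter is set. One way to keep calls consistent with that rule is to drop the flag when no disease is given, as in this minimal sketch; average_arguments is a hypothetical helper, not part of the library.

```python
def average_arguments(features, disease=None, include_normal=False):
    """Return keyword arguments for average.

    include_normal is passed only together with a disease filter; without
    one, results already cover both disease and normal conditions.
    """
    args = {"features": features}
    if disease is not None:
        args["disease"] = disease
        args["include_normal"] = include_normal
    return args


# Intended use (requires the package and network access):
#   avg_expr = api.average(**average_arguments("IL6,AGT", disease="covid", include_normal=True))
```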

fraction_detected(features: str, disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None, unique_ids: str = None, include_normal: bool = None)

Get the fraction of cells expressing the specified features across conditions.

Parameters:
  • features (str) – A comma-separated list of genes to query.

  • disease (str, optional) – Filter by disease name.

  • cell_type (str, optional) – Filter by cell type.

  • tissue (str, optional) – Filter by tissue.

  • sex (str, optional) – Filter by sex.

  • development_stage (str, optional) – Filter by developmental stage.

  • unique_ids (str, optional) – A comma-separated list of unique IDs from metadata results to filter specific dataset entries.

  • include_normal (bool, optional) – If True, includes the corresponding normal condition alongside the queried disease (default: False). Only applicable when a disease filter is provided. If no disease is specified, results include both disease and normal conditions.

Returns:

A DataFrame containing the fraction of cells expressing the specified features across conditions.
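average and fraction_detected accept the same parameters, so one set of arguments can drive both queries. The sketch below illustrates this with a hypothetical wrapper, average_and_fraction.

```python
def average_and_fraction(api, features, **filters):
    """Run the same query against average and fraction_detected.

    Returns a dict with one entry per method.
    `api` is expected to be an atlasapprox_disease.API instance.
    """
    return {
        "average": api.average(features=features, **filters),
        "fraction_detected": api.fraction_detected(features=features, **filters),
    }


# Intended use (requires the package and network access):
#   both = average_and_fraction(api, "IL6,AGT", disease="covid", tissue="lung")
#   print(both["fraction_detected"].head())
```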

dotplot(features: str, disease: str = None, cell_type: str = None, tissue: str = None, sex: str = None, development_stage: str = None, unique_ids: str = None, include_normal: bool = None)

Prepare data for a dot plot, including average expression and fraction detected. In the resulting plot, dot size represents the fraction detected and color represents the average expression.

Parameters:
  • features (str) – A comma-separated list of genes to query.

  • disease (str, optional) – Filter by disease name.

  • cell_type (str, optional) – Filter by cell type.

  • tissue (str, optional) – Filter by tissue.

  • sex (str, optional) – Filter by sex.

  • development_stage (str, optional) – Filter by developmental stage.

  • unique_ids (str, optional) – A comma-separated list of unique IDs from metadata results to filter specific dataset entries.

  • include_normal (bool, optional) – If True, includes the corresponding normal condition alongside the queried disease (default: False). Only applicable when a disease filter is provided. If no disease is specified, results include both disease and normal conditions.

Returns:

A DataFrame containing dot plot data with average expression and fraction detected for the specified features.
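The mapping from values to dot size and color can be sketched independently of any plotting library. The function below is an illustrative assumption, not part of atlasapprox_disease: it takes one fraction-detected value and one average-expression value (however they are named in the returned DataFrame) and converts them to a marker size and a color intensity between 0 and 1.

```python
def dot_encoding(fraction_detected, average_expression, max_expression, max_size=200.0):
    """Map one dot plot entry to (marker_size, color_intensity).

    Size scales linearly with the fraction of cells detecting the gene;
    color intensity is the average expression relative to max_expression.
    Both inputs are clipped so the outputs stay within their ranges.
    """
    fraction = min(max(fraction_detected, 0.0), 1.0)
    size = max_size * fraction

    if max_expression > 0:
        intensity = min(max(average_expression / max_expression, 0.0), 1.0)
    else:
        # No expression anywhere: use the lowest color intensity.
        intensity = 0.0
    return size, intensity
```

The sizes and intensities can then be handed to a scatter-plot routine of choice, with genes on one axis and conditions or cell types on the other.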