GraphDDP – a graph-embedding approach to detect differentiation pathways in single-cell-data using prior class knowledge

Cell types can be characterized by expression profiles derived from single-cell RNA-seq. Subpopulations are identified via clustering, yielding intuitive outcomes that can be validated by marker genes. Clustering, however, implies a discretization that cannot capture the continuous nature of differentiation processes. One could give up the detection of subpopulations and directly estimate the differentiation process from cell profiles. A combination of both types of information, however, is preferable. Crucially, clusters can serve as anchor points of differentiation trajectories.

Here, researchers from the University of Exeter and the University of Freiburg present GraphDDP, which integrates both viewpoints in an intuitive visualization. GraphDDP starts from a user-defined cluster assignment and then applies a force-based graph layout to two types of carefully constructed edges: one emphasizing cluster membership and the other, based on density gradients, emphasizing differentiation trajectories. On intestinal epithelial cell and myeloid progenitor data, the researchers show that GraphDDP identifies differentiation pathways that are not easily detected by other approaches.
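To make the two edge types concrete, the following is a minimal sketch and not the GraphDDP implementation. It assumes Euclidean distance between profiles, a density estimate given by the inverse mean distance to the nearest cells, and a candidate pool of the closest cells from other classes; the names `build_edges`, `local_density`, `k_shift` and `n_candidates` are hypothetical and chosen only for illustration.

```python
import math


def distance(a, b):
    """Euclidean distance between two expression profiles (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def local_density(profiles, i, k):
    """Assumed density estimate: inverse mean distance to the k nearest cells."""
    dists = sorted(distance(profiles[i], p) for j, p in enumerate(profiles) if j != i)
    nearest = dists[:k]
    return 1.0 / (sum(nearest) / max(len(nearest), 1) + 1e-12)


def build_edges(profiles, labels, k=2, k_shift=1, n_candidates=3):
    """Return (knn_edges, shift_edges); each edge is (i, j, desired_distance)."""
    n = len(profiles)
    density = [local_density(profiles, i, k) for i in range(n)]
    knn_edges, shift_edges = [], []
    for i in range(n):
        def by_dist(j):
            return distance(profiles[i], profiles[j])

        # Edge type 1: the k nearest cells that share the class of cell i.
        same = sorted((j for j in range(n) if j != i and labels[j] == labels[i]),
                      key=by_dist)
        knn_edges += [(i, j, by_dist(j)) for j in same[:k]]

        # Edge type 2: among the closest cells of a different class,
        # keep the k_shift cells with the highest estimated density.
        other = sorted((j for j in range(n) if labels[j] != labels[i]), key=by_dist)
        pool = sorted(other[:n_candidates], key=lambda j: density[j], reverse=True)
        shift_edges += [(i, j, by_dist(j)) for j in pool[:k_shift]]
    return knn_edges, shift_edges


# Toy usage: two well-separated classes in a 2D "expression" space.
toy_profiles = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                [1.0, 1.0], [1.1, 1.0], [1.0, 1.1]]
toy_labels = ["A", "A", "A", "B", "B", "B"]
toy_knn, toy_shift = build_edges(toy_profiles, toy_labels)
```

Keeping the two edge lists separate makes it possible to weight them differently in a later layout step, which is one plausible way to trade off cluster compactness against trajectory visibility.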

Steps in the visualization approach

(a) Each cell is initially assigned to the class determined by the user-provided clustering; additional pre-processing such as filtering and feature selection is also performed. (b) For each pair of cells, the similarity of the expression profiles is calculated using different metrics (see Methods). (c) To emphasize class membership in the layout, we add for each cell an edge to its k nearest neighbors of the same class; each edge is annotated with the desired distance between the two cells. (d) To visualize differentiation pathways, we add another type of edge, called k-shift edges, which connects cells to the k′ densest neighbors of a different class. (e) A force layout algorithm interprets each edge as a spring. (f) The optimal 2D configuration is determined by minimizing the total energy of the system. (g) We determine the convex hull of a given class in the layout. (h) Ternary plots are provided to further investigate differentiation pathways. Using a multi-class prediction approach, cells that are clearly members of a class are placed close to the corners, cells on the differentiation pathway between two classes lie on the corresponding edges, and undetermined cells are placed in the center of the plot.
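Steps e, f and h can be illustrated with a short sketch. It is not the authors' code: it assumes a quadratic spring energy, E = Σ ½(‖pᵢ − pⱼ‖ − dᵢⱼ)², minimized by plain gradient descent, and it assumes that the ternary plot places three class probabilities by barycentric coordinates on an equilateral triangle. The function names and the `steps`, `lr` and `seed` parameters are hypothetical.

```python
import math
import random


def spring_energy(pos, edges):
    """Total energy of the system: each edge (i, j, d) is a spring of rest length d."""
    total = 0.0
    for i, j, d in edges:
        dist = math.hypot(pos[i][0] - pos[j][0], pos[i][1] - pos[j][1])
        total += 0.5 * (dist - d) ** 2
    return total


def spring_layout(n, edges, steps=500, lr=0.05, seed=0):
    """Find a 2D layout by gradient descent on the spring energy."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(n)]
    for _ in range(steps):
        grad = [[0.0, 0.0] for _ in range(n)]
        for i, j, d in edges:
            dx = pos[i][0] - pos[j][0]
            dy = pos[i][1] - pos[j][1]
            dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
            coeff = (dist - d) / dist          # dE/d(dist), divided by dist
            grad[i][0] += coeff * dx
            grad[i][1] += coeff * dy
            grad[j][0] -= coeff * dx
            grad[j][1] -= coeff * dy
        for p, g in zip(pos, grad):
            p[0] -= lr * g[0]
            p[1] -= lr * g[1]
    return pos


def ternary_xy(p_a, p_b, p_c):
    """Map three class probabilities to a point in an equilateral triangle.

    Corners: class A at (0, 0), class B at (1, 0), class C at (0.5, sqrt(3)/2).
    """
    s = p_a + p_b + p_c
    p_b, p_c = p_b / s, p_c / s  # normalize so the weights sum to 1
    return (p_b + 0.5 * p_c, (math.sqrt(3) / 2) * p_c)


# Toy usage: three cells joined by springs with a desired distance of 1.
toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
toy_layout = spring_layout(3, toy_edges)
```

With this mapping, a cell predicted with certainty lands on a corner, a cell split evenly between two classes lands at the midpoint of the edge joining them, and a cell with equal probability for all three classes lands at the centroid, which matches the reading of the ternary plot described in the caption.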

Availability – The source code is available at: https://github.com/fabriziocosta/GraphEmbed.

Costa F, Grün D, Backofen R. (2018) GraphDDP: a graph-embedding approach to detect differentiation pathways in single-cell-data using prior class knowledge. Nat Commun 9(1):3685.
