Cell clustering is one of the most common routines in single-cell RNA-seq data analysis, and a number of specialized methods are available for it. Evaluations of these methods, however, typically ignore an important biological characteristic: the cell types in a population are organized hierarchically. Ignoring this structure can produce misleading evaluation results. In this work, researchers from Brown University and Emory University develop two new metrics that take the hierarchical structure of cell types into account. They illustrate the metrics on constructed examples as well as several real single-cell datasets and show that the metrics provide more biologically plausible results.
Illustrative examples of using RI/MI and wRI/wMI to evaluate clustering results
a, b Two examples of hierarchical relationships among a group of A1, A2, B1, and B2 cells. Text under the trees indicates the cell type assignments from R, reference; C1, clustering 1; and C2, clustering 2. c Confusion matrices of the two clusterings and measures of clustering performance under reference a or b
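The idea behind the figure can be sketched in a few lines of Python. This is a toy illustration, not the authors' implementation and not their definition of wRI. The cell counts, the two clusterings, the `similarity` values, and the `pair_score` helper are all hypothetical and chosen only to show why a hierarchy-blind score can be misleading. The plain Rand index (RI) counts the pairs of cells on which the reference and the clustering agree. The weighted variant below assumes, purely for illustration, that a pair of cells from different but related reference types earns partial credit when placed in the same cluster.

```python
from itertools import combinations


def pair_score(ref, clu, similarity=None):
    """Pair-counting agreement between reference labels and cluster labels.

    With similarity=None this is the plain Rand index. With a similarity
    table (hypothetical, values in [0, 1]), a pair of cells from different
    reference types that lands in the same cluster earns partial credit
    equal to the similarity of the two types.
    """
    total = 0.0
    n_pairs = 0
    for i, j in combinations(range(len(ref)), 2):
        n_pairs += 1
        same_ref = ref[i] == ref[j]
        same_clu = clu[i] == clu[j]
        if same_ref == same_clu:
            # Reference and clustering agree on this pair.
            total += 1.0
        elif same_clu and similarity is not None:
            # Different types merged: partial credit if the types are related.
            total += similarity.get(frozenset((ref[i], ref[j])), 0.0)
    return total / n_pairs


# Hypothetical toy data: three cells of each type.
reference = ["A1"] * 3 + ["A2"] * 3 + ["B1"] * 3 + ["B2"] * 3
clustering_1 = [1] * 6 + [2] * 3 + [3] * 3            # merges A1 with A2
clustering_2 = [1] * 3 + [2] * 3 + [1] * 3 + [3] * 3  # merges A1 with B1

# Assumed hierarchy: A1/A2 are siblings, B1/B2 are siblings (made-up weights).
similarity = {
    frozenset(("A1", "A2")): 0.8,
    frozenset(("B1", "B2")): 0.8,
}

ri_c1 = pair_score(reference, clustering_1)
ri_c2 = pair_score(reference, clustering_2)
w_c1 = pair_score(reference, clustering_1, similarity)
w_c2 = pair_score(reference, clustering_2, similarity)

print(f"plain RI:        C1={ri_c1:.3f}  C2={ri_c2:.3f}")
print(f"hierarchy-aware: C1={w_c1:.3f}  C2={w_c2:.3f}")
```

In this toy setup the plain RI cannot distinguish the two clusterings, because each merges two reference types of equal size. The hierarchy-aware score ranks clustering 1 higher because it merges sibling types rather than unrelated ones. Under a different assumed tree, the ranking could reverse.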
Availability – An implementation of the method is available at https://github.com/haowulab/Wind as open-source software under the GPL license.