Research Article
CAVaLRi: An Algorithm for Rapid Identification of Diagnostic Germline Variation
Figure 3
Calculating phenotype concordance between patient and disease phenotype sets. (a) Given a set of patient phenotypes and a set of disease phenotypes, CAVaLRi calculates phenotype-disease concordance by iterating through each patient phenotype term and comparing it to the disease phenotype set. First, the ancestral closure of a term is determined by selecting all nodes separating that term from the root node (HP:0000118 in the case of the Human Phenotype Ontology, or HPO). Next, the ancestral closure of the disease phenotype set is defined as the union of the ancestral closures of its member terms. (b) A disease-phenotype frequency lookup table is then populated by propagating annotated disease-phenotype frequencies up the HPO graph. When a node has more than one child node, the maximum disease-phenotype frequency among the child nodes is assigned. Gene counts () are indexed for all and . (c) is calculated according to Supplemental Materials Equation 5 ( when , when ). For , the set of common ancestors () is determined by intersecting with . The common ancestor term with the highest () is identified and used to complete the Equation 5 calculation.
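The graph operations described in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CAVaLRi implementation: the toy ontology (terms T1 to T6), the function names, and the `score` lookup are all hypothetical. The sketch assumes that an ancestral closure includes the term itself and the root, and that the propagated frequency of a node is the maximum over its own annotation and those of its descendants. Equation 5 is not reproduced here because it appears only in the Supplemental Materials, so the quantity used to rank common ancestors is left as a caller-supplied lookup.

```python
from typing import Dict, Iterable, Optional, Set

ROOT = "HP:0000118"  # root node named in the caption

# Hypothetical toy ontology: child -> set of parents (a term may have several).
PARENTS: Dict[str, Set[str]] = {
    "T1": {ROOT},
    "T2": {ROOT},
    "T3": {"T1"},
    "T4": {"T1", "T2"},
    "T5": {"T3"},
    "T6": {"T3"},
}


def ancestral_closure(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return the term plus every node on any path from it to the root.

    Assumption: the closure includes the term itself and the root.
    """
    closure: Set[str] = set()
    stack = [term]
    while stack:
        node = stack.pop()
        if node not in closure:
            closure.add(node)
            stack.extend(parents.get(node, ()))
    return closure


def set_closure(terms: Iterable[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Union of the ancestral closures of every term in a phenotype set."""
    result: Set[str] = set()
    for term in terms:
        result |= ancestral_closure(term, parents)
    return result


def propagate_frequencies(
    annotated: Dict[str, float], parents: Dict[str, Set[str]]
) -> Dict[str, float]:
    """Push annotated frequencies up the graph, keeping the maximum per node."""
    table: Dict[str, float] = {}
    for term, freq in annotated.items():
        for node in ancestral_closure(term, parents):
            table[node] = max(table.get(node, 0.0), freq)
    return table


def best_common_ancestor(
    patient_term: str,
    disease_terms: Iterable[str],
    parents: Dict[str, Set[str]],
    score: Dict[str, float],
) -> Optional[str]:
    """Pick the common ancestor with the highest caller-supplied score.

    `score` is a hypothetical stand-in for the ranking quantity in the caption.
    """
    common = ancestral_closure(patient_term, parents) & set_closure(
        disease_terms, parents
    )
    if not common:
        return None
    # Sort first so that ties are broken deterministically.
    return max(sorted(common), key=lambda t: score.get(t, 0.0))


# Toy disease annotations (term -> frequency) and a toy ranking score.
disease = {"T5": 0.8, "T4": 0.3}
table = propagate_frequencies(disease, PARENTS)
toy_score = {ROOT: 0.0, "T1": 1.0, "T3": 2.0}
best = best_common_ancestor("T6", disease, PARENTS, toy_score)
```

In this toy example the patient term T6 is not annotated to the disease, so the lookup falls back to the terms it shares with the disease closure (T3, T1, and the root) and selects the one that the toy score ranks highest.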