Some superfamilies contain only a single conserved domain model (a singleton), and these are not indexed in Entrez.

Only superfamilies that contain two or more conserved domain models are indexed in Entrez and will therefore appear in search results.

Superfamily members are clustered through an automated process that involves the following steps: Identify domain models that have overlapping hits on sequences in the Entrez Protein database from at least five different identical protein groups (IPGs).

Store the overlapping domain models as pairwise associations, and use those pairwise associations to populate a similarity matrix.
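The two steps above can be illustrated with a minimal sketch. It is not the actual CDD pipeline: the input layout (one tuple per hit, with one representative sequence per IPG), the function names, the overlap test (any shared residue), and the choice to store the number of supporting IPGs as the matrix value are all assumptions made for illustration. Only the threshold of five IPGs comes from the text.

```python
from collections import defaultdict
from itertools import combinations

MIN_IPGS = 5  # threshold stated in the text


def overlaps(a, b):
    """Assumed overlap test: two half-open (start, end) intervals share a residue."""
    return a[0] < b[1] and b[0] < a[1]


def pairwise_associations(hits, min_ipgs=MIN_IPGS):
    """Count, per pair of domain models, the IPGs on which their hits overlap.

    hits: iterable of (ipg_id, model_id, start, end) tuples (hypothetical layout).
    Returns {(model_a, model_b): ipg_count} for pairs supported by >= min_ipgs IPGs.
    """
    by_ipg = defaultdict(list)
    for ipg_id, model_id, start, end in hits:
        by_ipg[ipg_id].append((model_id, (start, end)))

    support = defaultdict(set)  # (model_a, model_b) -> set of supporting IPG ids
    for ipg_id, ipg_hits in by_ipg.items():
        for (m1, iv1), (m2, iv2) in combinations(ipg_hits, 2):
            if m1 != m2 and overlaps(iv1, iv2):
                support[tuple(sorted((m1, m2)))].add(ipg_id)

    return {pair: len(ipgs) for pair, ipgs in support.items() if len(ipgs) >= min_ipgs}


def similarity_matrix(associations, models):
    """Populate a symmetric matrix from the pairwise associations.

    The cell value (number of supporting IPGs) is an assumption of this sketch.
    """
    index = {m: i for i, m in enumerate(models)}
    matrix = [[0] * len(models) for _ in models]
    for (a, b), count in associations.items():
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = count
    return matrix


# Usage with made-up identifiers: modelA and modelB overlap on five IPGs,
# modelC overlaps the others on only four, so it forms no association.
hits = []
for i in range(5):
    hits.append((f"ipg{i}", "modelA", 10, 100))
    hits.append((f"ipg{i}", "modelB", 50, 150))
for i in range(4):
    hits.append((f"ipg{i}", "modelC", 90, 120))

associations = pairwise_associations(hits)
matrix = similarity_matrix(associations, ["modelA", "modelB", "modelC"])
```

In this sketch a model that never reaches the threshold with any other model keeps an all-zero row, which is consistent with it remaining a singleton.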

NCBI CDD curators attempt to split off "child" nodes where they see evidence of ancient gene duplications resulting in orthologous groups, which often occur together with functional divergence.

An illustrated example of a subfamily hierarchy is provided below.

A superfamily cluster is a set of conserved domain models that generate overlapping annotation on the same protein sequences.

It enables you to view a graphical display of the concise or full search result for any individual protein from your input list, or to download the results for the complete set of proteins.

