# About CDR Clustering

**What is CDR clustering?**

Complementarity-determining regions (CDRs) are known to form structurally similar groups termed canonical classes. Here we present our version of this clustering, which was developed using the Chothia CDRs from the latest version of the SAbDab database.

**How do we calculate it?**

We calculate the RMSD for each pair of CDRs that share the same CDR type and length under a given definition (Kabat, Chothia or Contact). This produces one distance matrix per CDR type and length, which is then processed by UPGMA. The output of UPGMA is a rooted tree in which CDR structures separated by smaller RMSDs sit closer together along the branches.
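The two steps above can be illustrated with a minimal sketch. It is not the code used to build the database: it assumes the loops are already superimposed (the text does not say how superposition is done), it represents each CDR as a list of `(x, y, z)` tuples, and the names `rmsd`, `distance_matrix`, `upgma` and the toy loops `A` to `D` are all hypothetical. The returned tree is a nested tuple with loop identifiers as leaves.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) points.

    Assumption: the two loops are already superimposed, so no fitting is done here.
    """
    if len(a) != len(b):
        raise ValueError("CDRs must have the same number of atoms")
    squared = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(squared / len(a))


def distance_matrix(loops):
    """Symmetric dict-of-dicts of pairwise RMSDs, keyed by loop identifier."""
    dist = {name: {name: 0.0} for name in loops}
    for i, j in combinations(loops, 2):
        dist[i][j] = dist[j][i] = rmsd(loops[i], loops[j])
    return dist


def upgma(dist):
    """Average-linkage (UPGMA) clustering; returns a rooted tree as nested tuples."""
    # Each active cluster maps its sub-tree to the list of leaves it contains.
    clusters = {name: [name] for name in dist}
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            la, lb = clusters[a], clusters[b]
            # UPGMA distance: mean of all leaf-to-leaf distances between clusters.
            d = sum(dist[x][y] for x in la for y in lb) / (len(la) * len(lb))
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[(a, b)] = clusters.pop(a) + clusters.pop(b)
    return next(iter(clusters))


# Toy data: four hypothetical two-atom "loops" of the same type and length.
loops = {
    "A": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    "B": [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0)],
    "C": [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)],
    "D": [(0.0, 3.2, 0.0), (1.0, 3.2, 0.0)],
}
tree = upgma(distance_matrix(loops))
print(tree)
```

In this toy case the two closest loops (`A` and `B`) are joined first, then `C` and `D`, and the root joins the two pairs. In practice a library implementation of average linkage would be preferable to this quadratic-per-merge loop.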

**How do we calculate the clusters from the UPGMA tree?**

We produce the clusters by descending the tree from the root and checking, at each branching point, whether the CDRs in the given sub-tree all have pairwise RMSDs below a chosen cutoff. If they do, that sub-tree becomes a cluster; otherwise we continue into its branches. In this database we use four cutoffs: 1.5 Å, 1.0 Å, 0.75 Å and 0.5 Å. Thus, if one selects a UPGMA cutoff of 1.0 Å, all the structures within any cluster are guaranteed to be no more than 1.0 Å RMSD from each other.
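The descent described above can be sketched as follows. This is an illustration under stated assumptions, not the database's own code: the tree is assumed to be a nested tuple with loop identifiers as leaves, the distances are a dict-of-dicts, the comparison is taken to be strictly "less than", and the names `leaves`, `max_pairwise`, `cut_tree` and the toy RMSD values are hypothetical.

```python
from itertools import combinations


def leaves(tree):
    """Collect the leaf identifiers of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def max_pairwise(members, dist):
    """Largest pairwise RMSD among the members (0.0 for a single member)."""
    return max((dist[a][b] for a, b in combinations(members, 2)), default=0.0)


def cut_tree(tree, dist, cutoff):
    """Descend from the root; accept a sub-tree once all its pairs are below cutoff."""
    members = leaves(tree)
    if not isinstance(tree, tuple) or max_pairwise(members, dist) < cutoff:
        return [members]
    clusters = []
    for child in tree:
        clusters.extend(cut_tree(child, dist, cutoff))
    return clusters


# Toy pairwise RMSDs in angstroms (made-up values for illustration only).
pairs = {
    ("A", "B"): 0.1, ("A", "C"): 3.0, ("A", "D"): 3.2,
    ("B", "C"): 2.9, ("B", "D"): 3.1, ("C", "D"): 0.8,
}
dist = {}
for (a, b), d in pairs.items():
    dist.setdefault(a, {})[b] = d
    dist.setdefault(b, {})[a] = d

tree = (("A", "B"), ("C", "D"))  # assumed output of a UPGMA step
for cutoff in (1.5, 1.0, 0.75, 0.5):
    print(cutoff, cut_tree(tree, dist, cutoff))
```

With these toy values, `C` and `D` stay together at the 1.5 Å and 1.0 Å cutoffs but are split into singletons at 0.75 Å and 0.5 Å, because their RMSD of 0.8 Å exceeds the tighter thresholds. Checking the maximum pairwise distance, rather than the UPGMA merge height, is what gives the guarantee that no two members of a cluster are further apart than the cutoff.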

**How do we get the mapping to other methods?**

Since the canonical structures of CDRs have been studied for a long time, through the work of Chothia (latest work carried out by Al-Lazikani