SAbPred: ANARCII

Antigen receptor Numbering And Receptor ClassificatIIon.

ANARCII is a suite of language models that assign IMGT numbering to amino-acid sequences of antibody and T cell receptor (TCR) variable domains.
Use the form below to identify domains (antibody chains H, K, L or TCR chains A, B, D, G) and annotate them according to the IMGT numbering scheme.
The models have been trained on IMGT-numbered antibody or TCR sequences.
Select a model by choosing the type of sequences to be numbered (Antibody, TCR or Shark) and the inference mode (Speed or Accuracy).
If another numbering scheme is required, such as Kabat or Martin, it can be selected below. The sequence is IMGT numbered first and then converted to the chosen scheme.
The language models differ in either the number of hyperparameters or the contents of their training data. Depending on the model chosen, the numbering and sequence scores may differ.
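
As an illustration of what an IMGT-numbered domain makes possible, the sketch below pulls out the CDR loops using the standard IMGT boundaries (CDR1 27-38, CDR2 56-65, CDR3 105-117). The input layout, a list of ((position, insertion code), residue) pairs with "-" for gaps, is an assumption made for this example and is not confirmed to be the format ANARCII returns; the function name extract_cdrs and the toy fragment are hypothetical.

```python
# Standard IMGT CDR boundaries (inclusive) for variable domains.
IMGT_CDR_RANGES = {
    "CDR1": (27, 38),
    "CDR2": (56, 65),
    "CDR3": (105, 117),
}


def extract_cdrs(numbering):
    """Return the CDR sequences from an IMGT-numbered domain.

    `numbering` is assumed (for this sketch) to be a list of
    ((position, insertion_code), residue) pairs, with "-" marking gaps.
    """
    cdrs = {}
    for name, (start, end) in IMGT_CDR_RANGES.items():
        cdrs[name] = "".join(
            residue
            for (position, _insertion), residue in numbering
            if start <= position <= end and residue != "-"
        )
    return cdrs


# Hypothetical toy fragment around CDR1, not a real antibody sequence.
example = [
    ((26, " "), "S"),  # framework, just before CDR1
    ((27, " "), "G"),
    ((28, " "), "F"),
    ((29, " "), "T"),
    ((30, " "), "-"),  # gap: position unoccupied in this sequence
    ((38, " "), "Y"),
    ((39, " "), "M"),  # framework, just after CDR1
]

print(extract_cdrs(example))
```

Because the positions carry the structural meaning, the same boundaries apply to every chain type once it is IMGT numbered; after conversion to another scheme such as Kabat, different boundaries would be needed.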

The ANARCII Python package is freely available on GitHub.

For a full description of the pipeline, or if you use this software, please refer to: Greenshields-Watson et al. 2025