Welcome to OTS


The Observed T-cell Receptor Space database, or “OTS” for short, is a project to collect single-cell sequenced TCRs from immune repertoires in the public domain, facilitating their use in immunoinformatic studies. It currently contains around 5.33M redundant (1.63M non-redundant) alpha/beta TCR sequences derived from 50 separate studies. These repertoires are predominantly human (a few are murine) and cover diverse immune states and hundreds of individuals.

The data has been consistently processed: sorted, cleaned, annotated, translated, numbered in the IMGT scheme, and organised in an AIRR Community Standard-compliant format.

All sequences are available as complete (full-length) variable domain reads. Study metadata has been manually curated for completeness, and the sequencing data is organised into data units (unique combinations of metadata, e.g. individual + T cell type) akin to the Paired Observed Antibody Space database to facilitate precise dataset curation. The OTS web application enables users to rapidly search through the data and download bespoke subsets of sequences.
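As a rough illustration of the data unit idea, the sketch below groups sequence records by a unique combination of metadata and then selects only the units that match some criteria. The field names (`study`, `individual`, `t_cell_type`, `sequence_id`), the helper functions, and the records themselves are hypothetical placeholders chosen for this example; they are not the OTS schema or its API.

```python
from collections import defaultdict

# Hypothetical metadata fields that together define one data unit.
UNIT_KEYS = ("study", "individual", "t_cell_type")

# Placeholder records for illustration only; not real OTS data.
records = [
    {"sequence_id": "s1", "study": "study_A", "individual": "donor_1", "t_cell_type": "CD8"},
    {"sequence_id": "s2", "study": "study_A", "individual": "donor_1", "t_cell_type": "CD8"},
    {"sequence_id": "s3", "study": "study_A", "individual": "donor_1", "t_cell_type": "CD4"},
    {"sequence_id": "s4", "study": "study_B", "individual": "donor_2", "t_cell_type": "CD8"},
]


def group_into_data_units(records, keys=UNIT_KEYS):
    """Group records so each unique metadata combination forms one unit."""
    units = defaultdict(list)
    for rec in records:
        unit_key = tuple(rec[k] for k in keys)
        units[unit_key].append(rec)
    return dict(units)


def select_units(units, keys=UNIT_KEYS, **criteria):
    """Return all records from units whose metadata match every criterion."""
    selected = []
    for unit_key, members in units.items():
        meta = dict(zip(keys, unit_key))
        if all(meta.get(field) == value for field, value in criteria.items()):
            selected.extend(members)
    return selected


units = group_into_data_units(records)
cd8_records = select_units(units, t_cell_type="CD8")
```

Because each unit corresponds to exactly one metadata combination, a bespoke subset can be built by filtering whole units rather than inspecting individual sequences.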

We aim to update OTS on a quarterly basis, keeping track of the increasing amounts of paired V(D)J sequencing data of TCRs in the public domain.

Please consider citing our preprint if making use of the database for your research.

The data contained within OTS is available under a CC-BY 4.0 license.

Matthew I. J. Raybould & Alexander Greenshields-Watson et al.
“The Observed T cell receptor Space database enables paired-chain repertoire mining, coherence analysis and language modelling” 2024. [link]
