Sequencing TCR repertoires is difficult because natively paired alpha and beta chains must be physically linked or labelled. With the advent of 10xGenomics sequencing, full-length natively paired alpha/beta sequences can now be obtained, although at lower T cell throughput than unpaired Illumina sequencing.
The Observed TCR Space (OTS) database now provides access to annotated paired sequences from 10xGenomics studies. To date, OTS collates over 1.6M non-redundant paired sequences from 50 different studies. The data is available for download, or you can filter the sequences by metadata parameters using our search form. To download the data, go to the Search page.
Paired sequences in OTS can be filtered according to attributes such as species, disease and treatment. The fields are non-exclusive, meaning that a user can choose a combination of field values that does not exist in our database.
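As a minimal sketch of what non-exclusive filtering means, the snippet below filters a small list of metadata records in Python. It is not the OTS search form or its API: the function `filter_units`, the list `DATA_UNITS` and all metadata values are hypothetical, and only the field names (species, disease, treatment) come from the text above.

```python
# Hypothetical data-unit metadata. The field names follow the attributes
# named above; the values are invented purely for illustration.
DATA_UNITS = [
    {"species": "human", "disease": "none", "treatment": "none"},
    {"species": "human", "disease": "disease-A", "treatment": "drug-X"},
    {"species": "mouse", "disease": "none", "treatment": "none"},
]


def filter_units(units, **criteria):
    """Return the units whose metadata match every requested field value."""
    return [
        unit
        for unit in units
        if all(unit.get(field) == value for field, value in criteria.items())
    ]


# A combination that exists in the toy data.
human_healthy = filter_units(DATA_UNITS, species="human", disease="none")

# Each value exists on its own, but no unit has both: the result is empty.
mouse_treated = filter_units(DATA_UNITS, species="mouse", treatment="drug-X")

print(len(human_healthy))
print(len(mouse_treated))
```

Because each field is chosen independently, nothing prevents a query such as the second one, where every individual value is valid but the combination matches no data-unit.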
Similarly to the unpaired version of OAS, all datasets are organized into studies, which are in turn subdivided into data-units. A single data-unit is a set of sequences uniquely identified by its metadata. The range of meta-parameters is:
Ideally, 10xGenomics sequencing would yield a 1-to-1 alpha/beta chain pairing for each interrogated T cell. However, in many cases more than one alpha and/or beta chain sequence harbours the same 10xGenomics cell barcode. Linking all such alpha and beta chain sequences can combinatorially inflate the real number of sequences and lead to incorrect estimates of repertoire diversity; see the example relating to antibody heavy and light chains (Figure 2).
One solution is to filter out sequences whose 10xGenomics barcodes are shared by more than one unique alpha and/or beta V(D)J recombination event. This step has already been performed for each data-unit in OTS.
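The inflation and the barcode filter can be illustrated with a short Python sketch. It assumes one reading of the filter, namely that a barcode is kept only when it carries exactly one unique alpha and one unique beta sequence; the OTS pipeline itself may differ in detail. The record layout, the function names and the toy barcodes and sequences are all hypothetical.

```python
from collections import defaultdict


def group_by_barcode(records):
    """Collect the unique alpha and beta sequences seen under each barcode."""
    chains = defaultdict(lambda: {"alpha": set(), "beta": set()})
    for barcode, chain, sequence in records:
        chains[barcode][chain].add(sequence)
    return chains


def naive_pair_count(records):
    """Count pairs when every alpha is linked to every beta per barcode."""
    grouped = group_by_barcode(records)
    return sum(len(c["alpha"]) * len(c["beta"]) for c in grouped.values())


def unambiguous_pairs(records):
    """Keep only barcodes with exactly one unique alpha and one unique beta."""
    pairs = {}
    for barcode, c in group_by_barcode(records).items():
        if len(c["alpha"]) == 1 and len(c["beta"]) == 1:
            (alpha,) = c["alpha"]
            (beta,) = c["beta"]
            pairs[barcode] = (alpha, beta)
    return pairs


# Toy records: (barcode, chain, sequence). All values are invented.
RECORDS = [
    ("BC1", "alpha", "A1"), ("BC1", "beta", "B1"),   # clean 1-to-1 pairing
    ("BC2", "alpha", "A2"), ("BC2", "alpha", "A3"),  # two alphas ...
    ("BC2", "beta", "B2"), ("BC2", "beta", "B3"),    # ... and two betas
    ("BC3", "beta", "B4"),                           # beta only, no pair
]

print(naive_pair_count(RECORDS))   # 1 + 2*2 + 0 = 5
print(unambiguous_pairs(RECORDS))  # {'BC1': ('A1', 'B1')}
```

In this toy example the three barcodes would yield five pairs if every alpha were linked to every beta, four of them from the single ambiguous barcode, whereas the filter retains only the one unambiguous pairing.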
If you would like to contact us or to submit your study to OTS, please drop an email to opig@stats.ox.ac.uk.
Matthew I. J. Raybould & Alexander Greenshields-Watson et al.
“The Observed T cell receptor Space database enables paired-chain repertoire mining, coherence analysis and language modelling” 2024.