pairSEQ breaks down another barrier to fixing, well, everything.

There are a ton of things that make Adaptive a special place — but for me, the real magic is our team’s unique ability to blur the line between biology and computation. A year ago when I was trying to decide if I should join the company, it was Harlan describing pairSEQ that tipped me over the edge. I’ve wanted to write about it here ever since, and now that we’ve published the paper I can finally do it. Woo hoo!

pairSEQ is one of the key advancements driving our expansion into therapeutics. And that’s great, because it puts us one step closer to actually fixing immune-related diseases. But it’s also just a mind-blowing festival of geeky awesome brain crack. That’s what I’m focused on today. 😉

Single-chain immunosequencing is super-useful…

If you’ve been paying attention, you’ll remember that our core technology uses next-generation sequencing to determine the “thumbprints” for millions of adaptive immune cells in a sample of blood, bone marrow or other tissue. These thumbprints, and the way they group together, tell us some incredible things about the state of the immune system. We can use them to track the progression and recurrence of certain blood cancers, predict the likelihood that immunotherapy will be effective, and much more. It’s cool stuff.

But it’s also only part of the story. T-Cell and B-Cell receptors are “heterodimers” … a word that kind of sounds dirty but really just means that they’re composed of a pair of distinct protein chains. T-Cell receptors have “alpha” and “beta” (or “gamma” and “delta”) chains; B-Cell receptors have “heavy” and “light”.

Our core immunosequencing process measures the gene sequences for these chains independently. That is, we can tell you which alpha clones are in a sample, and which beta clones are in the same sample, but we historically haven’t been able to tell you which alphas were paired with which betas.

For diagnostic purposes, this isn’t a big deal. The TCR Beta sequence is incredibly diverse, and its CDR3 “thumbprint” is more than enough to use as a marker for disease (and similar for heavy-chain sequences in B-Cells). This is our bread and butter and frankly we’re rocking the world with it.

… but sometimes you just gotta find the pairs.

Still, as useful as the individual chains are, they only get you so far. It turns out that the “shape” of a T-Cell receptor is determined by the alpha and beta chains together — and it’s that unique shape that enables the cell to precisely target one specific antigen.

This targeting is the basis of some of the most exciting work in immunotherapy: identify a T-Cell that attacks a particular bad guy, then copy that receptor shape to create a therapeutic (the idea behind engineered T-Cell therapies such as TCR-T, a close cousin of CAR T-Cell therapy). Simple in concept, but until you can identify paired chains, basically impossible.

Past attempts focused on the physical biology of single cells. For example: extract the genes from a single cell, paste them together using bridge PCR, and then sequence them as a unit. It works — but only one cell at a time. A more recent approach tries to automate the process by isolating single cells in tiny droplets of oil. This seems to work better, and has identified thousands of pairs, but is cumbersome to manage at scale.

Hooray for math!

This is where the Adaptive magic, combining biology and computation, makes the difference. Harlan and his team realized that we didn’t need to isolate the individual cells at all. Because the sequences are so highly diverse, we can instead use probability and combinatorics to do the hard work for us. Here’s how:

  1. Take a sample and distribute it randomly across N (we used 96) wells.
  2. Amplify the alpha and beta chains within each well, just as we’d do for traditional immunosequencing.
  3. Use standard barcode adapters to tag each chain with a unique identifier corresponding to the well it was placed in.
  4. Mix the whole soup back together and run it through the sequencer.
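
If it helps to see those four steps as code, here is a toy simulation. It is only a sketch under some loud assumptions: the function names `simulate_reads` and `wells_by_chain` are made up for this post, every cell lands in a uniformly random well, and every chain in every well gets amplified and read perfectly (real PCR is not that kind).

```python
import random
from collections import defaultdict


def simulate_reads(clones, n_wells=96, seed=0):
    """Toy model of steps 1-4 (hypothetical helper, not the real pipeline).

    clones maps an (alpha, beta) pair to the number of cells of that clone.
    Returns a shuffled list of (chain_sequence, well_barcode) reads.
    """
    rng = random.Random(seed)
    reads = []
    for (alpha, beta), n_cells in clones.items():
        for _ in range(n_cells):
            well = rng.randrange(n_wells)   # step 1: random well per cell
            # steps 2-3: each chain is amplified and tagged with its well
            reads.append((alpha, well))
            reads.append((beta, well))
    rng.shuffle(reads)                      # step 4: pool everything together
    return reads


def wells_by_chain(reads):
    """Group the pooled reads back into 'which wells did each chain show up in'."""
    occupancy = defaultdict(set)
    for chain, well in reads:
        occupancy[chain].add(well)
    return dict(occupancy)


clones = {("alpha_A", "beta_A"): 3, ("alpha_B", "beta_B"): 5}
occupancy = wells_by_chain(simulate_reads(clones))
print(sorted(occupancy["alpha_A"]), sorted(occupancy["beta_A"]))
```

Notice that the pooled reads carry no direct record of which alpha went with which beta. The only thing that survives is the well barcode, and in this idealized version the two chains of a true pair always end up with identical sets of wells.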

Now, say we’ve found alpha sequence A in wells A1, B5 and E3. We then find beta sequence B in the same wells. Because we know the number of wells and the total number of cells we started with, we can say with X% confidence that these chains must have come from a pair. Want to be more than X% sure? Just add more wells.
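
To put a rough number on that intuition, here is one simple way to ask "how likely is this overlap by pure chance?" This is my own back-of-the-envelope sketch, not the model from the paper: it assumes the two chains are scattered across wells independently, ignores missed reads, and does not correct for the huge number of alpha/beta combinations you would test in a real sample. The function name `chance_overlap` is made up for illustration.

```python
from math import comb


def chance_overlap(n_wells, alpha_wells, beta_wells, shared):
    """Probability that two unrelated chains share at least `shared` wells.

    Assumes the beta chain's wells are a uniformly random subset of the
    plate, independent of the alpha chain's wells (a hypergeometric tail).
    """
    total = comb(n_wells, beta_wells)
    tail = 0
    for j in range(shared, min(alpha_wells, beta_wells) + 1):
        # j of beta's wells fall inside alpha's wells, the rest outside
        tail += comb(alpha_wells, j) * comb(n_wells - alpha_wells, beta_wells - j)
    return tail / total


# Alpha in 3 of 96 wells, beta in the exact same 3 wells
p = chance_overlap(96, 3, 3, 3)
print(p)  # 1 / 142880, or about 7 in a million
```

The smaller that chance probability, the more comfortable you can be calling the two chains a pair. It also shows why more wells help: the number of possible well combinations grows fast, so an exact match gets harder and harder to explain as a coincidence.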

Of course, the math is a bit more complicated than that, because there are a bunch of confounding factors. Like, even though sequence B may have actually been present in a well, our PCR process may have missed it. So the paper is, like any good scientific piece, full of impenetrable equation porn.

But the basics are pretty simple — and incredibly effective. Our first run identified more than an order of magnitude more pairs than previous known methods, and did so using standard lab equipment and consumables. Therapeutics here we come.

This is why bringing both biology and computation to the party makes such a difference. We simply have double the weaponry at our disposal to attack hard problems. And dang if we don’t use those weapons really well. I’m super-proud to be a part of it.

Yeah, just another day at the coolest company around.
