Sherloc: Standardized and consistent variant classification

Keith Nykamp, Lead Scientist, Invitae

Genetics is the future of medicine. But how soon will it arrive? Thanks to tremendous advances in DNA sequencing technology, we can now identify the letters of a genome quickly and at remarkably low cost. Yet the biggest challenge is still ahead. How do we translate the letters into a story, and what does this story tell me about my health? Will I develop cancer? Are there drugs that can stop it? Will my siblings or children also have cancer?

Genetics has the answers, but they are written in a language we are still learning.

Genetic variants are changes in the letters and meaning of a person’s genome. Variant classifications can directly impact surgery, surveillance, and treatment. But as genetic testing becomes more common and clinicians test more genes, there is growing concern that inconsistent classifications associated with a genetic variant will add to the uncertainty for patients, slow our understanding of genetics, and limit the potential for improving healthcare.

What does it mean for a variant to be “pathogenic” or “benign”? What about “likely pathogenic” or “uncertain”? When a rare variant is detected, how are these conclusions reached?

At Invitae, the way we build consistent and accurate classifications is through a careful evaluation and logical application of evidence, using a refined schema derived from the recent standards and guidelines published by the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) (Richards et al., Genet Med 2015).

The ACMG/AMP evidence checklist represents a major step forward in the standardization of evidence assessment. However, many of the evidence criteria are quite broad and require a fair amount of subjective interpretation, which could result in inconsistent application. To date, no rigorous study has been published examining the impact of these new guidelines on variant classification and clinical reporting. It also remains to be seen whether the new evidence checklist increases interpretation concordance between clinical laboratories.

Using the 2013 ACMG draft checklist as a framework, we developed a score-based classification method, called Sherloc, designed to be scalable and consistent across a large clinical variant-interpretation team. This method has been revised and refined based on experience with over 11,000 variants and the input of our highly qualified PhD scientists, board-certified medical geneticists, and genetic counselors.
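To make the idea of a score-based method concrete, here is a minimal sketch in Python. It is not the Sherloc schema: the point values, thresholds, evidence descriptions, and the names `Evidence` and `classify` are all hypothetical, chosen only to show how weighted evidence can be summed and mapped to a classification in a repeatable way.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; these are NOT Sherloc's values.
# Each list is ordered from the strictest threshold to the loosest.
PATHOGENIC_THRESHOLDS = [(5.0, "pathogenic"), (4.0, "likely pathogenic")]
BENIGN_THRESHOLDS = [(5.0, "benign"), (3.0, "likely benign")]
UNCERTAIN = "uncertain significance"


@dataclass(frozen=True)
class Evidence:
    description: str
    points: float    # weight of this piece of evidence (hypothetical scale)
    direction: str   # "pathogenic" or "benign"


def _call(score, thresholds):
    """Return the label of the first threshold the score reaches, else None."""
    return next((label for cutoff, label in thresholds if score >= cutoff), None)


def classify(evidence):
    """Sum evidence in each direction and map the totals to a classification."""
    path_score = sum(e.points for e in evidence if e.direction == "pathogenic")
    benign_score = sum(e.points for e in evidence if e.direction == "benign")
    path_call = _call(path_score, PATHOGENIC_THRESHOLDS)
    benign_call = _call(benign_score, BENIGN_THRESHOLDS)
    # Strong evidence in both directions is treated as a conflict.
    if path_call and benign_call:
        return UNCERTAIN
    return path_call or benign_call or UNCERTAIN


if __name__ == "__main__":
    example = [
        Evidence("hypothetical: loss-of-function variant", 4.0, "pathogenic"),
        Evidence("hypothetical: absent from population data", 1.0, "pathogenic"),
    ]
    print(classify(example))
```

Because every analyst applies the same weights and cutoffs, two people looking at the same evidence should arrive at the same call, which is the property a large interpretation team needs.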

As with any clinical or laboratory technique, validation is essential. To assess the concordance of Sherloc with current community standards, we compared Sherloc classifications of over 800 variants to a consensus classification derived from entries with multiple submissions in ClinVar, a public database that collects and reports variant classifications for clinically important genes. We found that Sherloc interpretations fell in the consensus majority 92.2 percent of the time, more often than other ClinVar submissions benchmarked the same way (85.7 percent). We also compared classifications based on strict adherence to the ACMG checklist and found that these had lower overall agreement with the ClinVar consensus (61 percent) and a higher rate of variants of uncertain significance.
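The comparison described above can be sketched in a few lines of Python. The exact rule used to derive the consensus is not spelled out here, so this sketch assumes a strict majority among at least two submissions. The function names and the variant identifiers in the toy data are made up for illustration.

```python
from collections import Counter


def consensus(submissions):
    """Return the strict-majority classification, or None if there is none.

    Assumption: a consensus needs at least two submissions and more than
    half of them agreeing on the same classification.
    """
    if len(submissions) < 2:
        return None
    label, count = Counter(submissions).most_common(1)[0]
    return label if count > len(submissions) / 2 else None


def concordance(own_calls, public_submissions):
    """Fraction of variants with a consensus where our call matches it."""
    compared = 0
    agreed = 0
    for variant, call in own_calls.items():
        majority = consensus(public_submissions.get(variant, []))
        if majority is None:
            continue  # no consensus available, so the variant is skipped
        compared += 1
        if call == majority:
            agreed += 1
    return agreed / compared if compared else float("nan")


if __name__ == "__main__":
    # Toy data with invented variant names, for illustration only.
    public = {
        "variant_A": ["pathogenic", "pathogenic", "likely pathogenic"],
        "variant_B": ["benign", "benign"],
        "variant_C": ["benign", "uncertain significance"],  # tie: no consensus
    }
    ours = {
        "variant_A": "pathogenic",
        "variant_B": "likely benign",
        "variant_C": "benign",
    }
    print(f"{concordance(ours, public):.1%}")
```

Running the same function over every other submitter's calls, one submitter at a time, gives the "benchmarked the same way" comparison: each set of calls is scored against a consensus using the same rule.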

While a large number of ClinVar records had multiple submissions and a derived consensus interpretation, many did not. To get a sense of the overall concordance with ClinVar, we also compared Sherloc classifications to individual ClinVar submissions and those without a consensus. The overall agreement was substantially lower (67 percent), although the vast majority (95 percent) of the differences were relatively minor and found among VUS, likely benign, and benign classifications.

To better understand the nature of the disagreements for these discordant submissions, we reevaluated our own classifications. This was also a great opportunity to assess the robustness of the Sherloc method by asking variant analysts to reinterpret 42 of the most clinically significant disagreements. (The analysts were blinded to the original interpretation.) Importantly, these 42 variants are strongly enriched for difficult-to-interpret cases and are not representative of most variants.

Nevertheless, our reinterpretations of these variants showed high reproducibility, with 39 of 42 (93 percent) exactly matching the original interpretation. One classification changed from pathogenic to likely pathogenic based on inconsistent evaluation of clinical observations for this variant; two classifications changed (one from benign to likely benign and the other from VUS to likely pathogenic) due to inconsistent application of evidence criteria that, at the time, were still in development.
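The reproducibility figure is simple arithmetic: 39 of 42 is roughly 92.9 percent. A small Python sketch shows one way to tally it. The three changed pairs come from the paragraph above. The original classes of the 39 matching variants are not given, so a placeholder label stands in for them, and the function name `reproducibility` is our own.

```python
from collections import Counter


def reproducibility(pairs):
    """Return (exact-match rate, Counter of (original, reinterpreted) changes)."""
    matches = sum(1 for original, repeat in pairs if original == repeat)
    changes = Counter((o, r) for o, r in pairs if o != r)
    return matches / len(pairs), changes


if __name__ == "__main__":
    # 39 exact matches; "unchanged" is a placeholder because the text does
    # not list the classes of these variants.
    pairs = [("unchanged", "unchanged")] * 39
    # The three changes described in the text.
    pairs += [
        ("pathogenic", "likely pathogenic"),
        ("benign", "likely benign"),
        ("VUS", "likely pathogenic"),
    ]
    rate, changes = reproducibility(pairs)
    print(f"{rate:.1%} exact match over {len(pairs)} variants")
    for (original, repeat), n in changes.items():
        print(f"{n} x {original} -> {repeat}")
```

Listing the changes by type, rather than reporting only the match rate, makes it easy to see which of them cross a clinically meaningful boundary, such as VUS to likely pathogenic.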

While disagreements are expected to occur, we found that Sherloc consistently and logically lays out the data such that our team can quickly evaluate the evidence and come to a consensus. In the process, we have improved our understanding of variant interpretation and the language of the genome.

Overall, the Sherloc system has proven to be highly adaptable, efficient, and consistent across multiple disease areas and a large variant-interpretation team. It adheres to the guidelines, but also illustrates the specific application and evolution of the ACMG criteria in a clinical molecular laboratory. We are in the process of publishing and releasing Sherloc for community feedback with the hope that it may be useful to others.

For more information about Sherloc and the comparison between its classifications and ClinVar entries, please view the Invitae webinar “Sherloc: A weighted, score-based variant classification system based on the 2015 ACMG ISV guidelines,” presented by Keith Nykamp.