- Commensal human blood viruses are identified in more than 8,000 physiologically healthy individuals.
- About 80% of the detected viruses (75 of 94) are due to contamination from commercial kits or are of environmental origin, while the prevalence of 19 human viruses was reliably quantified.
- The characterization of the normal human blood virome sets a benchmark for studies aiming to detect novel human blood pathogens.
The normal practice for metagenomic studies is to map metagenomic sequences to the reference genome of the organism of interest and discard all unmapped reads. In this study, a team from Human Longevity Inc. characterized the viruses present in human blood by examining the reads discarded after mapping whole-genome sequences from 8,240 healthy individuals to the human reference genome. The authors detected a total of 94 viruses. The most prevalent commensal human blood viruses were herpesviruses, anelloviruses, Merkel cell polyomavirus, and T-lymphotropic virus. However, about 80% of the detected viruses (75 of 94) were attributed to contamination from commercial kits (e.g. phiX174, used as a spike-in control in the sequencing process) or to environmental sources, which makes stringent control experiments necessary when attempting to detect new human blood pathogens. In addition, viral prevalence was associated with age, with higher viral loads identified in younger groups. The findings are of immediate relevance to blood transfusion practice in clinical settings, as they expand the list of viruses that could potentially be transmitted via blood products. Equally important, the authors demonstrate how examining conventionally ignored parts of a dataset (e.g. unmapped reads) can lead to findings of significant biological impact.
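To make the "look at the discarded reads" idea concrete, here is a minimal sketch of how unmapped reads can be pulled out of an alignment file. It relies only on the SAM format convention that bit 0x4 of the FLAG field marks an unmapped segment. The function name `unmapped_reads` and the toy records are hypothetical, and this is not the authors' pipeline, which is not described in this summary; in practice a tool such as `samtools view -f 4` performs the same filtering.

```python
UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def unmapped_reads(sam_lines):
    """Yield (read_name, sequence) for each record whose FLAG has bit 0x4 set.

    Hypothetical helper for illustration; expects tab-separated SAM text lines.
    """
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:  # a SAM record has 11 mandatory fields
            continue
        flag = int(fields[1])
        if flag & UNMAPPED_FLAG:
            yield fields[0], fields[9]  # QNAME, SEQ


# Toy input (made-up reads): one mapped read and two unmapped reads.
example_sam = [
    "@SQ\tSN:chr1\tLN:248956422",
    "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tACGTACGT\tFFFFFFFF",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tGAGTTTTA\tFFFFFFFF",
    "read3\t77\t*\t0\t0\t*\t*\t0\t0\tTTGACCGA\tFFFFFFFF",
]

# These are the reads that would go on to a viral similarity search.
candidates = list(unmapped_reads(example_sam))
```

In a study like this one, the retained reads would then be compared against viral reference sequences, which is the step where reagent contaminants such as phiX174 also show up.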
Overall, the findings of this study set a reference for the human blood virome that can be used to benchmark future studies aiming to identify new human blood viruses.
The blood DNA virome in 8,000 humans
The characterization of the blood virome is important for the safety of blood-derived transfusion products, and for the identification of emerging pathogens. We explored non-human sequence data from whole-genome sequencing of blood from 8,240 individuals, none of whom were ascertained for any infectious disease. Viral sequences were extracted from the pool of sequence reads that did not map to the human reference genome. Analyses sifted through close to 1 Petabyte of sequence data and performed 0.5 trillion similarity searches. With a lower bound for identification of 2 viral genomes/100,000 cells, we mapped sequences to 94 different viruses, including sequences from 19 human DNA viruses, proviruses and RNA viruses (herpesviruses, anelloviruses, papillomaviruses, three polyomaviruses, adenovirus, HIV, HTLV, hepatitis B, hepatitis C, parvovirus B19, and influenza virus) in 42% of the study participants. Of possible relevance to transfusion medicine, we identified Merkel cell polyomavirus in 49 individuals, papillomavirus in blood of 13 individuals, parvovirus B19 in 6 individuals, and the presence of herpesvirus 8 in 3 individuals. The presence of DNA sequences from two RNA viruses was unexpected: Hepatitis C virus is revealing of an integration event, while the influenza virus sequence resulted from immunization with a DNA vaccine. Age, sex and ancestry contributed significantly to the prevalence of infection. The remaining 75 viruses mostly reflect extensive contamination of commercial reagents and from the environment. These technical problems represent a major challenge for the identification of novel human pathogens. Increasing availability of human whole-genome sequences will contribute substantial amounts of data on the composition of the normal and pathogenic human blood virome. Distinguishing contaminants from real human viruses is challenging.
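The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted above; the variable names are illustrative, and the participant count is approximate because the 42% figure is itself rounded.

```python
total_viruses = 94        # distinct viruses with mapped sequences
human_viruses = 19        # human DNA viruses, proviruses and RNA viruses
participants = 8240       # individuals sequenced
positive_fraction = 0.42  # share of participants with a human virus (rounded)

# Viruses attributed mostly to reagent or environmental contamination.
non_human = total_viruses - human_viruses

# Share of all detected viruses that fall in that group (roughly 80%).
contaminant_share = non_human / total_viruses

# Approximate number of participants carrying at least one human virus.
approx_positive = round(participants * positive_fraction)

# Detection bound expressed per cell: 2 viral genomes per 100,000 cells.
genomes_per_cell = 2 / 100_000

print(non_human, f"{contaminant_share:.1%}", approx_positive, genomes_per_cell)
```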
Moustafa A, Xie C, Kirkness E, Biggs W, Wong E, Turpaz Y, et al. (2017) The blood DNA virome in 8,000 humans. PLoS Pathog 13(3): e1006292. doi:10.1371/journal.ppat.1006292