Analogy between language and biology
Understanding the mechanisms of viral escape is necessary. The flu vaccine must be renewed every year and is only 20 to 50% effective. There is no vaccine against HIV, the virus that causes AIDS, and the long-term efficacy of vaccines against COVID-19 is already in question.
Mutations in a protein can affect its function in the same way that the modification or substitution of a word in a sentence changes its meaning.
To escape, a mutant virus must meet two conditions. It must retain its infectivity and its ability to evolve, that is, obey a “grammar” of biological rules. It must also no longer be recognized by the immune system, which is analogous to a change in the “meaning” or “semantics” of the virus.
- grammar of the virus: the replicative capacity of the virus once its mutations are taken into account.
- virus semantics: the antigenic properties of the virus, which the model tries to predict. Antigens are substances foreign to the organism that stimulate an immune response aimed at eliminating them.
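To make the analogy concrete, a protein can be treated as a sentence written in a 20-letter alphabet of amino acids, and a point mutation as the substitution of one "word". The sketch below enumerates every single-residue substitution of a sequence. The function name `single_substitutions` and the three-residue fragment are hypothetical and chosen only for illustration; they do not come from the study.

```python
# The 20 standard amino acids, one letter each.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence):
    """Yield (position, original, replacement, mutant) for every
    single-residue substitution of the given sequence."""
    for i, original in enumerate(sequence):
        for replacement in AMINO_ACIDS:
            if replacement != original:
                mutant = sequence[:i] + replacement + sequence[i + 1:]
                yield i, original, replacement, mutant


toy = "MKT"  # hypothetical toy fragment, not a real viral protein
mutants = list(single_substitutions(toy))
print(len(mutants))  # 3 positions x 19 alternatives = 57
```

Each mutant produced this way is a candidate "sentence" whose grammar and semantics would then have to be evaluated.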
An NLP model to predict viral escape
Bonnie Berger, professor and researcher at MIT, and her team have built an NLP model that, for the first time, analyzes both the grammar and the semantics of a virus. The model predicts viral escape when a mutation scores high on both counts: the mutant remains grammatical (it is still viable) and its semantics change substantially (it is no longer recognized by the immune system).
An important advantage of the model developed at MIT is that it only requires sequence information, which is much easier to obtain than protein structure information.
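One simple way to look for mutations that are high on both grammar and semantics is to rank the candidates on each criterion and add the ranks. The sketch below does this on made-up numbers. The function names, the weight `beta` and the scores are all assumptions for illustration, and this is not necessarily the exact formulation used by the MIT team.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n (largest); ties keep input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def escape_priority(candidates, beta=1.0):
    """candidates: list of (name, grammaticality, semantic_change).
    Returns the names sorted from most to least likely escape candidate,
    using rank(semantic_change) + beta * rank(grammaticality)."""
    gram = ranks([c[1] for c in candidates])
    sem = ranks([c[2] for c in candidates])
    scored = [(sem[i] + beta * gram[i], c[0]) for i, c in enumerate(candidates)]
    return [name for _, name in sorted(scored, key=lambda t: t[0], reverse=True)]


# Hypothetical scores: (name, grammaticality, semantic change).
candidates = [
    ("mutant_a", 0.90, 0.80),  # viable and very different: escape candidate
    ("mutant_b", 0.95, 0.02),  # viable but almost unchanged
    ("mutant_c", 0.05, 0.90),  # very different but probably not viable
    ("mutant_d", 0.10, 0.05),  # neither
]
ranking = escape_priority(candidates)
print(ranking[0])
```

Working on ranks rather than raw values avoids having to put the two scores on a common scale, at the cost of ignoring how large the gaps between candidates are.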
It also has the advantage that it can be trained with a relatively small amount of data. The research team used 60,000 HIV sequences, 45,000 influenza sequences and 4,000 coronavirus sequences in this study.
The model developed by the MIT researchers is based on BiLSTMs (bidirectional long short-term memory networks). A BiLSTM reads a sequence in both directions, so each prediction can draw on the context before and after a given position (for example, knowing which words immediately precede and follow a word in a sentence to be translated).
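The idea of bidirectional context can be illustrated without any neural network. The sketch below only shows which parts of a sequence are visible from each direction at a given position. It does not implement an LSTM, and the function name and toy sequence are hypothetical.

```python
def bidirectional_contexts(tokens):
    """For each position, return (left context, token, right context).
    A forward pass would see the left context; a backward pass would see
    the right context; a bidirectional model combines both."""
    return [(tokens[:i], tokens[i], tokens[i + 1:]) for i in range(len(tokens))]


sequence = list("MKTAY")  # hypothetical toy fragment
contexts = bidirectional_contexts(sequence)
for left, token, right in contexts:
    print("".join(left), "[" + token + "]", "".join(right))
```

In a real BiLSTM, each context would be summarized by a learned hidden state rather than kept as raw tokens.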
Creating more effective antivirals and vaccines
Viral escape occurs when genetic mutations alter the regions of viral proteins that the immune system has learned to recognize. If these target areas are modified, specialized antibodies are no longer able to neutralize the virus.
With this model, the MIT researchers aim to enable the development of vaccines and antivirals that target the areas of the protein least likely to be the site of viral escape.
For example, in the case of the SARS-CoV-2 Spike protein, the researchers found that the S2 region is less prone to viral escape than the RBD (receptor-binding domain) region, which makes S2 a more promising target.