Science and Technology in Russian Federation

August 20 | The FTP 2014-2020

Russian Scientists to Create a "Protein" Search Engine

Just a drop of blood, and the diagnosis is ready. Building knowledge of all the proteins in living organisms and developing technologies to identify them in biological samples are steps toward this long-sought future of personalized medicine. One such technology, a "proteomic search engine", is being developed by scientists at the RAS V.L. Tal'roze Institute of Energy Problems of Chemical Physics.

In a world ideal for medical science, proteins would be well-understood, controllable mechanisms that let doctors diagnose any disease and choose an effective treatment for each case and patient from a single blood test. This bright future is tied to progress in proteomics, the science that studies the functions and interactions of the hundreds of thousands of proteins in living organisms. Scientists in almost every country have been laying the groundwork for the "proteomic revolution" for more than 20 years. One can argue about how well proteomics has delivered on its early promises. What is clear, however, is that its current stage is marked by the active development of liquid chromatography and mass spectrometry technologies. Modern instrumental methods make it possible to look deep into the proteome of a biological sample, such as blood or body tissue, and to determine the properties and relationships of the proteins it contains. The huge volume of data produced by such experiments cannot, of course, be processed by the human brain: sorting by hand the output of a single day of work on a modern liquid chromatography-mass spectrometry instrument would take years. Making use of this information requires specialized software.

Mikhail Gorshkov

In Russia, a system for processing massive amounts of experimental proteomic data, also called a "proteomic search engine", is being created by a team of scientists from the RAS V.L. Tal'roze Institute of Energy Problems of Chemical Physics with support from the federal target program "Research and Development in Priority Areas of the Scientific and Technological Complex of Russia for 2014-2020".

It will be a search engine much like the familiar Yandex or Google, except that it will perform a specialized search in protein databases derived from genomic data and will be aimed at a specific circle of users in biology, bioinformatics and medicine.

"Work with the search engine will be organized in the usual way, with access to its functions via the Internet. The user will log in with a username and password to a personal account on the server that hosts the search engine, where temporary storage of experimental data will also be provided. The user will then be able to analyze the uploaded data independently and obtain a table listing the proteins that were present in the sample," says Mikhail Gorshkov, head of the development team and head of the Institute's laboratory of physical and chemical methods for analyzing the structure of substances.

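To give a feel for what such a search involves, here is a deliberately simplified Python sketch of one classic idea behind protein database searching: proteins from a database are cut "in silico" into peptides, the peptide masses are computed, and these are matched against masses observed in the experiment. This is not the code of the Institute's system, whose internals the article does not describe. The function names, the toy sequences and the mass tolerance are all assumptions made for illustration, and real search engines match fragment spectra and score the matches statistically rather than comparing bare masses.

```python
# Monoisotopic amino acid residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the sum of residues of a free peptide


def tryptic_digest(sequence):
    """Cut after K or R unless the next residue is P (simplified trypsin rule,
    no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def search(observed_masses, database, tolerance=0.01):
    """Return (protein, number of matched peptides, matched peptides) rows,
    best-supported proteins first. `tolerance` is in daltons (assumed value)."""
    rows = []
    for name, sequence in database.items():
        matched = [
            p for p in tryptic_digest(sequence)
            if any(abs(peptide_mass(p) - m) <= tolerance for m in observed_masses)
        ]
        if matched:
            rows.append((name, len(matched), matched))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical two-protein "database" and made-up observed masses.
    toy_database = {"protA": "MKTAYIAKQR", "protB": "GGGGKPLLR"}
    observed = [peptide_mass("TAYIAK"), peptide_mass("QR")]
    for protein, n_peptides, peptides in search(observed, toy_database):
        print(protein, n_peptides, ", ".join(peptides))
```

The result is the kind of tabular list of proteins mentioned in the quote above, here reduced to a protein name, the number of supporting peptides and the peptides themselves.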

Identifying proteins and comparing their relative abundance in samples from patients and healthy people will let scientists form hypotheses about possible protein markers of pathological processes in a particular organism. These disease "signal points", which are likely to be specific to the individual, could soon be used to design targeted drugs that eliminate the cause of the malfunction in that organism.
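The comparison of relative protein content between the two groups can be illustrated with a short Python sketch. It is only an assumption about how such a comparison might look in its simplest form, not a description of the developers' method: the function name, the input layout and the cutoff of a twofold change are hypothetical, and the abundance values are invented. A real study would also apply statistical tests across many samples.

```python
import math
from statistics import mean


def log2_fold_changes(patient, healthy, threshold=1.0):
    """Compare mean protein abundance between two groups.

    `patient` and `healthy` map a protein name to a list of abundance
    values, one per sample. Returns (protein, log2 ratio, flagged) rows,
    where `flagged` marks proteins whose mean abundance differs at least
    2**threshold-fold between the groups (assumed cutoff).
    """
    rows = []
    # Only proteins quantified in both groups can be compared.
    for protein in sorted(set(patient) & set(healthy)):
        ratio = mean(patient[protein]) / mean(healthy[protein])
        lfc = math.log2(ratio)
        rows.append((protein, lfc, abs(lfc) >= threshold))
    return rows


if __name__ == "__main__":
    # Invented abundance values for two hypothetical proteins.
    patient_samples = {"P1": [4.0, 4.0], "P2": [1.0, 1.0]}
    healthy_samples = {"P1": [1.0, 1.0], "P2": [1.0, 1.0]}
    for protein, lfc, flagged in log2_fold_changes(patient_samples, healthy_samples):
        print(protein, round(lfc, 2), "candidate marker" if flagged else "unchanged")
```

Proteins flagged in this way are only candidates: they are the starting point for the kind of hypotheses about disease markers described above.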


Russian researchers are certainly not pioneers in building "proteomic search engines". Since the early 2000s, as liquid chromatography and mass spectrometry developed and large volumes of experimental data accumulated, many similar resources have appeared around the world, both commercial and freely available. However, only highly skilled bioinformaticians can work with the "open systems", while the weakness of commercial products is their price: an annual license starts at $10,000. The Russian search engine is intended to be more affordable for domestic users.

In addition to its cost advantage, it will have original scientific content. Its built-in capabilities will optimize search parameters automatically, without human intervention, which will make the system easier to use for analysts, or even biology students, who lack a background in computational mathematics and the fundamentals of bioinformatics and do not have programming skills at the required level.

"Unfortunately, there are not many research groups in our country that work in computational proteomics and understand all the complexities and aspects of the field. In our country, biological samples are analyzed primarily by biologists and biochemists. They need a search engine that is easy to use but still has all the features of existing proteomic systems," says Mikhail Gorshkov.


The domestic protein search engine is almost ready: the program code and the server have been developed. What remains is the "packaging": refining the interface and the design of the web resource and, of course, delivering the resource to the end user.

According to the project participants, bringing deep proteomic analysis of biological samples into practical diagnostics will depend primarily on progress in instrumental methods, but better computational methods for data analysis will still significantly advance proteomics research. The developers of the Russian search engine hope that their efforts will also help bring closer the era of personalized medicine that humanity has long awaited.


Comment: Mark Ivanov, graduate student at MIPT: "In addition to processing chromatography-mass spectrometry data and identifying the peptides and proteins present in the sample, which any similar search engine does, we added post-processing of the resulting identifications using 12 parameters derived from the experimental data already obtained. This not only increases the number of peptide identifications but also makes them more reliable."

