A joint team of scientists and developers will create a neural network algorithm for monitoring the Baikal ecosystem. The algorithm will automatically analyze samples of Baikal water, recognizing and classifying the microorganisms they contain. This will ease the work of scientists, who currently have to distinguish between more than 400 species of Baikal plankton and systematize the data manually.
The new technological solution will be used in the Baikal ecological monitoring project «Point No1». The project involves regular analysis of phyto- and zooplankton in the lake water. The observations show how the Baikal ecosystem is developing and how global climate change affects it. The algorithm will not only automate plankton analysis but also make it possible to scale the project by opening new observation points.
The algorithm is being developed by specialists of the Research Institute of Biology of Irkutsk State University (ISU), MaritimeAI (a developer of artificial intelligence models for studying marine ecosystems), the Yandex.Cloud platform team, and the Lake Baikal Foundation for the Support of Applied Environmental Research and Development.
To train the algorithm, scientists from the ISU Institute of Biology provided more than 1,000 images of each type of plankton. Based on these data, the MaritimeAI team will build a plankton species classifier using Yandex DataSphere, a Yandex.Cloud service for data analysis and for developing and operating machine learning models. Images of microorganisms will be transmitted to Yandex.Cloud directly from the microscopes in the institute's laboratory, and the algorithm will automatically determine the species of each plankton particle. The algorithm is expected to identify up to 99% of all plankton species, and the institute's specialists will monitor the quality of its work. A working prototype of the system will be presented this summer.
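One way specialists could monitor the classifier's quality is to review only the images it is unsure about. The announcement does not describe how this review will work, so the following is a minimal sketch under stated assumptions: the `Prediction` record, the `route_predictions` function, the 0.9 threshold, and the placeholder species names are all hypothetical and are not part of the actual system.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """One classifier output for a single microscope image (hypothetical format)."""
    image_id: str
    species: str       # predicted species label
    confidence: float  # model confidence in the range 0..1


# Assumed cut-off; a real value would be chosen by the specialists.
REVIEW_THRESHOLD = 0.9


def route_predictions(predictions, threshold=REVIEW_THRESHOLD):
    """Split predictions into auto-accepted ones and ones sent for manual review."""
    accepted, for_review = [], []
    for p in predictions:
        if p.confidence >= threshold:
            accepted.append(p)
        else:
            for_review.append(p)
    return accepted, for_review


if __name__ == "__main__":
    batch = [
        Prediction("img-001", "species_a", 0.98),
        Prediction("img-002", "species_b", 0.62),
        Prediction("img-003", "species_a", 0.91),
    ]
    accepted, for_review = route_predictions(batch)
    print("accepted:", [p.image_id for p in accepted])
    print("needs review:", [p.image_id for p in for_review])
```

Under this scheme the specialists would look only at the low-confidence images, and their corrections could later be added to the training set.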
The «Point No1» program was launched in 1945 and is listed in the Russian Book of Records as the longest-running regular environmental monitoring project in the history of science. For more than 75 years, scientists have taken water samples at depths of 0 to 800 meters every 7-10 days. The accumulated data make it possible to monitor the state of the Baikal ecosystem and predict its development.
Why scientists and developers joined forces
In recent years, the «Point No1» project has been under threat of closure. The data recognition technique currently used in the project is technologically outdated: scientists identify the types of microorganisms using classical microscopy methods. To do this, a specialist has to learn to distinguish between more than 400 types of phyto- and zooplankton, and training such a specialist takes more than 10 years of continuous practice. Maintaining the project would require several dozen highly qualified specialists who are also willing to perform routine operations. To preserve and develop the project, the scientists of the ISU Institute of Biology and the staff of the Lake Baikal Foundation set a goal: to create an intelligent system that digitally supports sample analysis, using artificial intelligence that can be trained to recognize microorganisms and so automate the bulk of the scientists' routine work.
In practice, the task ran into an obstacle: building such a neural network from scratch required technical expertise and IT infrastructure that the ISU Institute of Biology did not have.
AI experts from the Yandex.Cloud platform suggested using cloud computing power in the project, as well as the DataSphere ML development service, which speeds up the development of artificial intelligence models. The Yandex.Cloud team also helped bring in MaritimeAI, a company with expertise in building ML algorithms for studying marine ecosystems.
Alexey Bashkeev, Head of the Yandex.Cloud platform:
"Scientists currently identify the types of microorganisms using classical microscopy methods. To do this, they need to learn to distinguish between more than 400 forms of phyto- and zooplankton, which takes more than 10 years of almost continuous work. At Yandex.Cloud, we decided to help scientists apply the new Yandex DataSphere service to ease their work and bring this unique project of collecting and analyzing data on the state of Lake Baikal to a new level."
Maxim Timofeev, Doctor of Biological Sciences, Director of the Institute of Biology of the ISU:
"The phyto- and zooplankton community is essentially the foundation of the entire Baikal ecosystem. By understanding the processes in this foundation and their dynamics, we can predict how the lake's entire ecosystem will develop. The monitoring project «Point No1» is unique in that it allows analysis based on long-term, continuous series of observations accumulated over 75 years. The partnership with Yandex.Cloud will solve the important task of moving monitoring from the technological approaches of the 20th century to the paradigm of the 21st century: from manual analysis of samples to methods based on machine recognition and learning. At the same time, we will be able not only to maintain the continuity of the entire multi-year program, but also to scale the project by launching new observation points."
Anastasia Tsvetkova, CEO of the Lake Baikal Foundation for the Support of Applied Environmental Research and Development:
"The joint work of the Foundation, Yandex.Cloud and other partners meets the 17th UN Sustainable Development Goal, which draws attention to the value of multilateral cooperation, including through the mobilization of resources, technology and knowledge. For five years, the Lake Baikal Foundation has been supporting the long-term Lake Baikal monitoring project «Point No1» with grants. In 2016, we helped to avoid the closure of the program and have been supporting its comprehensive development ever since. Yandex.Cloud joining the project opens up new prospects for monitoring by bringing machine learning technologies to the regular analysis of phyto- and zooplankton samples from Lake Baikal. This cooperation is direct evidence of how business, science and society can work together to implement the ESG agenda."
Pavel Golubev, CEO of MaritimeAI:
"The MaritimeAI team combines expertise in geology and oceanology with the latest achievements in machine learning and artificial intelligence. This project is special for us for many reasons. Firstly, it is an opportunity to apply our knowledge and experience to monitoring the largest freshwater reservoir on the planet. Secondly, unlike our previous automation projects, here we are dealing with a unique scientific observation process spanning 75 years. There are ocean scientists and geneticists on our team, and we fully understand the importance of preserving the observation process itself as it is digitalized. Finally, it is important for us that this project, unlike our previous ones, is not industrial but environmental. One of the key factors in the project's success is the speed of its implementation, and this is where the capabilities of Yandex DataSphere will help us. We use high-performance virtual machines with 4-8 GPUs, thanks to which the algorithm training time has been reduced from hours to minutes. We also use the Yandex.Toloka service in training the algorithm, namely for data labeling."
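The data labeling step that Golubev mentions typically means several annotators label the same image, after which their answers are combined into a single training label. The announcement does not say how the labels are combined in this project. The sketch below assumes simple majority voting; the `majority_label` function, the 0.6 agreement threshold, and the placeholder labels are hypothetical.

```python
from collections import Counter


def majority_label(votes, min_agreement=0.6):
    """Return the most common label if enough annotators agree, else None.

    votes: list of labels given by different annotators to one image.
    min_agreement: assumed minimum share of votes the winning label needs.
    """
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    if count / len(votes) >= min_agreement:
        return label
    return None  # no consensus: the image could be sent to an expert instead


if __name__ == "__main__":
    print(majority_label(["species_a", "species_a", "species_b"]))
    print(majority_label(["species_a", "species_b"]))
```

Images without a consensus label would not be used for training until a specialist has resolved them.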