
Scientists have made reading data stored in DNA 3,200 times faster: 10 minutes instead of several days


Researchers at the Israel Institute of Technology (Technion) have developed an AI-based method that speeds up the search for data stored in DNA by three orders of magnitude while improving accuracy.

The DNA molecule preserves the genetic code of living organisms and consists of a sequence of organic compounds called nucleotides. There are four types of nucleotide, denoted by the letters A, C, G, and T. Unlike traditional computing, where data is encoded with only two digits (0 and 1), DNA storage uses sequences of four letters, so each position can take one of four values and carry up to two bits instead of one.
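To make the four-letter alphabet concrete, here is a minimal Python sketch that packs two bits into each nucleotide. The mapping (A=00, C=01, G=10, T=11) and the function names `bytes_to_dna` and `dna_to_bytes` are assumptions made for illustration; the article does not describe the encoding the researchers use, and practical systems typically add constraints and error correction on top of any such mapping.

```python
# Hypothetical mapping for illustration: A=00, C=01, G=10, T=11.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Decode a strand produced by bytes_to_dna back into bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError on any letter outside A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

Under this mapping one byte always becomes four letters, so `bytes_to_dna(b"Hi")` yields an eight-letter strand that `dna_to_bytes` turns back into `b"Hi"`.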

Placing data in DNA can provide truly long-term storage of information (hundreds of thousands of years) and a data density 100 million times higher than existing digital storage. Storing data using this technology requires DNA synthesis, the creation of DNA molecules based on sequences that encode information. Reading the stored data requires DNA sequencing, a method for determining the primary structure of unbranched biopolymers such as DNA.

Storing information in DNA involves several technological challenges. Synthesis and sequencing are lengthy processes prone to deletion, insertion, and substitution errors. Because of the limitations of the synthesis process, multiple copies of each DNA molecule encoding data are created. These copies are stored together, in no particular order. Sequencing then produces many copies of these molecules; most of them contain errors, and some are lost entirely.
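The error model described above can be illustrated with a small Python simulation. This is a sketch under simple assumptions, not the simulator developed at Technion: errors are independent per position, and the rates `p_del`, `p_ins`, `p_sub`, and `p_drop` as well as the names `noisy_copy` and `sequence_pool` are hypothetical.

```python
import random

BASES = "ACGT"


def noisy_copy(strand, rng, p_del=0.01, p_ins=0.01, p_sub=0.01):
    """Return one copy of `strand` with independent per-base errors."""
    out = []
    for base in strand:
        r = rng.random()
        if r < p_del:
            continue  # deletion: the base is dropped
        if r < p_del + p_ins:
            out.append(rng.choice(BASES))  # insertion: extra random base first
            out.append(base)
        elif r < p_del + p_ins + p_sub:
            # substitution: replace with one of the three other bases
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def sequence_pool(strands, copies, rng, p_drop=0.1, **rates):
    """Simulate an unordered pool: several noisy copies per strand, some lost."""
    pool = [
        noisy_copy(strand, rng, **rates)
        for strand in strands
        for _ in range(copies)
        if rng.random() >= p_drop  # a copy is lost with probability p_drop
    ]
    rng.shuffle(pool)  # copies are stored together, in no particular order
    return pool
```

Calling `sequence_pool(["ACGTACGT", "TTGGCCAA"], copies=5, rng=random.Random(0))` returns a shuffled list of reads, which is the kind of input a reconstruction method has to start from.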

Original research illustration / Technion

New research, published in the journal Nature Machine Intelligence, presents a comprehensive computational solution for finding and correcting errors in complex DNA-based storage systems. Using advanced algorithms and encoding techniques, the researchers demonstrated that their solution reduces the time it takes to search and read data from days to 10 minutes.

The DNAformer method developed at Technion is based on a transformer model trained on simulated data generated by a simulator also developed at Technion. The method reconstructs accurate DNA sequences from erroneous copies. It includes a special error correction code adapted for DNA.
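To show what reconstructing a sequence from erroneous copies means in the simplest case, here is a Python sketch of position-wise majority voting. This is not DNAformer, which uses a trained transformer model; it is a deliberately simple stand-in that handles substitution errors only and assumes all copies have the same length. The function name `majority_consensus` is hypothetical.

```python
from collections import Counter


def majority_consensus(copies):
    """Pick the most common base at each position across equal-length copies."""
    if not copies:
        raise ValueError("at least one copy is required")
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        # Insertions and deletions shift positions, which this sketch cannot handle.
        raise ValueError("this sketch only handles equal-length copies")
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*copies)
    )


copies = ["ACGTAC", "ACGTAC", "TCGTAC", "ACGAAC", "ACGTCC"]
print(majority_consensus(copies))  # prints ACGTAC
```

Insertions and deletions break the position-by-position alignment this vote relies on, which is one reason the reconstruction problem in real DNA storage is much harder than this example suggests.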

An additional safety margin mechanism detects the noisiest DNA sequences and applies algorithmic tools to process them more efficiently. Noise here means unwanted signals or errors that arise during sequencing and can interfere with accurate interpretation of the data. At the end of the process, the data is converted into digital information.

The new method reads 100 MB of data 3,200 times faster than the most accurate existing method, without losing accuracy. Compared to previously known fast methods, DNAformer also improves accuracy by up to 40%. This was demonstrated on a 3.1 MB dataset that included a 24-second audio recording of astronaut Neil Armstrong’s words on the moon, a written text discussing the benefits of DNA as a promising data storage method, and random data.

The researchers plan to develop individual versions of DNAformer adapted to different needs. They emphasize that their technology is scalable and adaptable, meaning that it can be optimized for large-scale data storage applications in response to market demands.

Source: TechXplore


