    A world-leading assembly algorithm for third-generation gene sequencing, where sequencing takes only one day

    • Last Update: 2020-08-05
    • Source: Internet
    • Author: User
    When Illumina's genome sequencing technology entered the market 10 years ago, the unprecedented volume of data it produced made earlier sequence analysis tools obsolete.
    History always repeats itself.
    Today, third-generation sequencing technology has reached a tipping point for low-cost, population-scale sequencing.
    On December 10, Nature Methods published online the first assembly algorithm that can keep up with the speed at which genome sequencing data is produced.
    The paper's authors are Dr. Ruan Jue of the Institute of Agricultural Genomics of the Chinese Academy of Agricultural Sciences and Dr. Li Heng of Harvard Medical School. Their new assembly algorithm for third-generation sequencing data is called wtdbg.
    The awkward position of third-generation sequencing
    Twenty years ago, deciphering the human genetic code was a formidable scientific undertaking; the Human Genome Project was ranked alongside the Manhattan Project and the Apollo program as one of the three great science projects.
    Now, sequencing a person's whole genome is a routine matter that ordinary laboratories and even families can afford.
    With third-generation sequencing technology, an individual's whole genome can be sequenced in just one day, at a cost of less than 50,000 yuan.
    In 2011, PacBio officially announced the commercialization of third-generation single-molecule sequencing.
    Compared with the few hundred base pairs per read of second-generation sequencing, third-generation reads average tens of thousands of base pairs and can reach millions of base pairs.
    Qiu Qiang, a professor at the School of Ecology at Northwestern Polytechnical University, told China Science Daily that when the technology emerged, researchers hoped to use it to fill in the highly repetitive, highly heterozygous regions of genome sequences and to tackle difficult genomes.
    However, it quickly became clear that the adoption of this new technology faced great difficulties.
    "There were two main reasons. First, third-generation sequencing initially cost far more than second-generation sequencing. Second, because of its high error rate, the assembly methods previously used for second-generation data failed, and efficient assembly tools were lacking; in particular, the Falcon method officially released by PacBio consumed a great deal of resources," Qiu Qiang explained.
    A few years later, ONT introduced nanopore sequencing technology, and market competition gradually reduced the cost of third-generation sequencing.
    In genome assembly, despite the availability of several assemblers such as Canu and MARVEL, "assembly is still a time-consuming and laborious process, with a mammalian genome taking weeks to assemble."
    For example, assembling a human genome in 2014 cost 500,000 CPU hours and could only be done on a very large computer cluster.
    "Under those conditions, assembling and analyzing a large number of individuals at the same time would be inconceivable. But the reality is that population sequencing through whole-genome assembly has become a trend in biomedical research," he said.
    For the first time, data analysis is faster than data production
    "wtdbg and the tools that follow it could fundamentally change the current practice of sequencing data analysis," he told China Science Daily.
    Previously, "data output was much faster than data analysis. So in recent years, a group of scientists in bioinformatics has worked to change this awkward situation and to develop more efficient assembly algorithms."
    For example, following Falcon, Canu and other algorithms, in April 2019 Pavel A. Pevzner, director of the NIH Center for Computational Mass Spectrometry at the University of California, San Diego, published the Flye algorithm in Nature Biotechnology; it runs much faster than Falcon and Canu.
    The third-generation sequencing assembly algorithm wtdbg, now formally published by Ruan Jue and Li Heng, is five times faster than Flye, and for the first time makes data analysis take less time than data production.
    Scientists at the School of Ecology at Northwestern Polytechnical University have assembled more than a dozen mammalian genomes using wtdbg. "We've used assembly methods like Falcon and Canu, and in comparison, wtdbg has the fastest assembly time, uses fewer resources and saves a lot of time," Chen Yi, a professor at Northwestern Polytechnical University, told China Science Daily.
    The assembled genomes have high contiguity, and the assembly quality is consistent with current mainstream genome assessments.
    "In particular, for the assembly of very large genomes, wtdbg should be one of the few assemblers that can be used efficiently today."
    "wtdbg is dozens of times faster than published tools on human genome data, while achieving comparable contiguity and accuracy. It represents a major advance in algorithms and paves the way for future population-scale assembly analysis," he said.
    The fuzzy Bruijn graph is born
    In the 1990s, Pavel A. Pevzner introduced the de Bruijn graph to genome assembly.
    A de Bruijn graph is a directed graph that represents the overlap relationships between sequences of symbols.
    Because second-generation sequencing has a low error rate, most short strings (k-mers) are correct, and identical k-mers can be merged to form an assembly graph following the de Bruijn principle.
    But third-generation sequencing data has a very high error rate. If short k-mers are still used, most of them contain sequencing errors and cannot be merged.
    As a result, the de Bruijn graph had never been successfully applied to third-generation sequencing data.
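    The effect described above can be illustrated with a minimal sketch. It is not taken from wtdbg or any published assembler; the sequences, the value of k and the helper names are all hypothetical. It builds a small de Bruijn graph from exact k-mers and then measures how few k-mers survive when a read carries one substitution every seven bases, a rough stand-in for a noisy long read.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return every length-k substring of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def de_bruijn_edges(reads, k):
    """Nodes are (k-1)-mers; each k-mer adds a directed edge from its prefix
    to its suffix. Identical k-mers from different reads collapse into one edge."""
    graph = defaultdict(set)
    for read in reads:
        for kmer in kmers(read, k):
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def substitute_every(seq, step):
    """Hypothetical error model: replace every step-th base with another base."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    return "".join(swap[b] if i % step == step - 1 else b
                   for i, b in enumerate(seq))


def shared_kmer_fraction(a, b, k):
    """Fraction of the distinct k-mers of a that also occur exactly in b."""
    kmers_a = set(kmers(a, k))
    return len(kmers_a & set(kmers(b, k))) / len(kmers_a)


# Two error-free, overlapping reads share k-mers, so their paths merge.
graph = de_bruijn_edges(["ACGTTGCA", "TTGCATGG"], k=4)
edge_count = sum(len(targets) for targets in graph.values())

# A made-up reference and a noisy copy with an error every 7 bases.
reference = "ACGTTGCATGGACCTAGCTAAGGCTTACGATCGGATCCATGCAAGTCGTA"
noisy = substitute_every(reference, 7)

print("edges in merged graph:", edge_count)
print("shared 8-mers, clean vs clean:", shared_kmer_fraction(reference, reference, 8))
print("shared 8-mers, clean vs noisy:", shared_kmer_fraction(reference, noisy, 8))
```

    With errors spaced more closely than k, every window of k bases in the noisy read contains at least one error, so exact k-mer matching finds almost nothing to merge.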
    A breakthrough method rests on a breakthrough theoretical foundation.
    In 2013, Ruan Jue and Li Heng each began working on the problem of third-generation sequencing assembly, developing SMARTdenovo and Miniasm respectively, both of which performed well in the field.
    They then designed a new assembly graph theory based on the de Bruijn graph.
    They redefined the "short string", cutting the sequencing data into new fixed-length short strings called k-bins, which are longer than k-mers.
    "The new fuzzy Bruijn graph is suited to high-noise data, and we subsequently did a great deal of corresponding restructuring to generate assembly graphs and recover genome sequences, making the method both efficient and highly fault-tolerant," he said.
    "general software assembly of the third generation of sequencing data is the idea of the sequencing data to compare and correct errors, and then the genome sequence construction.
    " Qiu Qiang said that wtdbg is directly genomic assembly, avoiding the time-consuming steps that need to correct errors in advance and directly obtaining a relatively reliable assembly result.
    " the real improvement in the problem of the time-consuming and laborious assembly began with the wtdbg algorithm developed by Gong and Li Heng.
    ," Qiu said.
    In their research group, the wtdbg algorithm has been widely used, greatly improving the efficiency of their work.
    In addition, they communicated in depth with the wtdbg developers and optimized the assembly of a very large genome: "we obtained about 40G of high-quality genome sequence."
    Technical improvement with public participation
    In 2016, so that the genome sequencing field could make timely use of the new technology, Ruan Jue and Li Heng made the wtdbg research results freely available.
    Over three years, wtdbg has not only been cited in dozens of academic papers but has also been adopted by a number of domestic genome sequencing and analysis companies as their main assembly tool, and it served as a performance test problem in the 2019 world university supercomputing competition.
    "We have received a lot of feedback through email, GitHub and other channels, which not only helped us fix bugs in the software but also brought new ideas.
    To put it another way, the paper now published has gone through more than three years of 'public review', thanks to the peers who have taken part in and followed wtdbg's development over the years," he said.
    Qiu Qiang believes that the wtdbg algorithm not only has advantages in efficiency and accuracy over earlier algorithms such as Falcon and Canu, but is also more reliable than assembly algorithms such as Flye that have appeared since.
    "This research result shows that China has internationally leading strength in the field of genome algorithms, and it also represents the soft power of China's scientific and technological development."
    Source: BrightNet.
