
    DeepMind meets a rival: Meta AI predicts 600 million protein structures

    • Last Update: 2022-11-14
    • Source: Internet
    • Author: User

    DeepMind, Google's artificial intelligence (AI) company, this year unveiled predicted structures for 220 million proteins, covering nearly every protein from known organisms in DNA databases. Now, another tech giant is filling in the dark matter of the protein universe.

    Researchers at Meta (formerly Facebook) used artificial intelligence to predict the structures of about 600 million proteins from bacteria, viruses and other microbes that have not yet been characterized. The study was published on Nov. 1 on the preprint server bioRxiv.

    "These are very mysterious proteins that offer the possibility of gaining insight into biology," said Alexander Rives, head of research on the Meta AI protein team.

    The team generated these predictions using a "large language model," a type of artificial intelligence that serves as the basis for tools that predict text from a few letters or words.

    Language models are usually trained on large volumes of text. To apply them to proteins, Rives' team fed them the sequences of known proteins, which can be written as chains of 20 different amino acids, each represented by a letter. The model then learned to "autocomplete" proteins in which a proportion of the amino acids had been hidden.
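
The "autocomplete" training task described above can be illustrated with a minimal sketch. It only shows how a training example might be built by hiding some amino acids; it is not Meta's code. The function name `mask_sequence`, the `?` mask symbol, the 15% masking fraction and the example sequence are all assumptions made for illustration, since the article does not give these details.

```python
import random

# The 20 standard amino acids, one letter each.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK = "?"  # hypothetical mask symbol, chosen for this sketch


def mask_sequence(sequence, fraction=0.15, seed=0):
    """Hide a fraction of the amino acids in a protein sequence.

    Returns the masked sequence and a dict mapping each hidden
    position to the original letter the model would have to predict.
    The 15% default is an assumption, not a figure from the article.
    """
    if not sequence:
        return "", {}
    rng = random.Random(seed)
    n_masked = max(1, round(len(sequence) * fraction))
    positions = sorted(rng.sample(range(len(sequence)), n_masked))
    masked = list(sequence)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]
        masked[pos] = MASK
    return "".join(masked), targets


# Made-up sequence used only to demonstrate the masking step.
example = "MKTAYIAKQRQISFVKSHFSRQ"
masked, targets = mask_sequence(example)
print(masked)   # same length as the input, with some letters hidden
print(targets)  # the hidden letters the model is asked to recover
```

During training, a language model would be shown the masked sequence and scored on how well it recovers the letters in `targets`; repeating this over many known sequences is what the article calls learning to autocomplete.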

    Rives says this training gives the model an intuitive understanding of protein sequences, which contain information about the shape of proteins.

    A second step, inspired by AlphaFold, DeepMind's pioneering AI algorithm for protein structure, combines this understanding with information about the relationships between known protein structures and sequences to generate predicted structures from protein sequences.

    Earlier this summer, Rives' team reported that its model, called ESMFold, is not as accurate as AlphaFold but is about 60 times faster at predicting structures. "This means we can scale structure prediction to a much larger database," Rives said.

    As a test case, the team applied the model to a large database of sequenced "metagenomic" DNA from environmental sources, including soil, seawater, the human gut, skin and other microbial habitats. The vast majority of the DNA entries encoding potential proteins come from organisms that have never been cultured and are unknown to science.

    In total, the Meta team predicted the structures of more than 617 million proteins, and the work took only two weeks. Rives says the predictions are freely available for anyone to use, as is the code underlying the model.
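
As a rough back-of-the-envelope check, the two figures quoted above imply an aggregate prediction rate that can be computed directly. The sketch rounds "more than 617 million" down to 617 million and takes "two weeks" as exactly 14 days. The article does not say how much hardware was used, so this is a combined rate, not a per-machine figure.

```python
# Figures taken from the article; both are rounded.
PREDICTED_STRUCTURES = 617_000_000  # "more than 617 million proteins"
DAYS = 14                           # "two weeks", assumed to be exactly 14 days

seconds = DAYS * 24 * 60 * 60
per_second = PREDICTED_STRUCTURES / seconds

print(f"{seconds:,} seconds in two weeks")
print(f"about {per_second:.0f} structures predicted per second overall")
```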

    The model rates more than one-third of those 617 million predictions as high quality, meaning researchers can be confident that the overall shape of the protein is correct and, in some cases, can make out finer atomic-level details. Notably, millions of these structures are entirely novel, unlike anything in databases of experimentally determined protein structures or in the AlphaFold database of predictions from known organisms.

    A large portion of the AlphaFold database consists of structures that are nearly identical to one another, whereas the metagenomic database is expected to cover a large portion of the never-before-seen protein universe.

    Sergey Ovchinnikov, an evolutionary biologist at Harvard University, is skeptical of ESMFold's hundreds of millions of predictions. He believes that some of the proteins may lack a defined structure, while others may be noncoding DNA mistaken for protein-coding material.

    Burkhard Rost, a computational biologist at the Technical University of Munich in Germany, was impressed by the speed and accuracy of Meta's model. But he questioned whether, for proteins from metagenomic databases, it really offers an advantage over AlphaFold's accuracy. Prediction methods based on language models are better suited to quickly determining how mutations change protein structure, which AlphaFold cannot do.

    According to a DeepMind representative, the company currently has no plans to include metagenomic structure predictions in its database, but does not rule out doing so in the future.

    Martin Steinegger, a computational biologist at Seoul National University in South Korea, believes the obvious next step for such tools is to study the dark matter of biology. "We'll soon see an explosion in the analysis of these metagenomic structures," he said.

    Related paper information: https://doi.org/10.1101/2022.07.20.500902


    This article is an English version of an article which is originally in the Chinese language on echemi.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to service@echemi.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
