
    Gao Yiqin's research group and Huawei Cloud jointly released an open source dataset of protein multiple sequence alignments

    • Last Update: 2021-09-20

    Source: HUAWEI CLOUD

     

    Recently, Professor Gao Yiqin's research group (Peking University Biomedical Frontier Innovation Center (BIOPIC), Peking University School of Chemistry and Molecular Engineering, and Shenzhen Bay Laboratory) and Huawei jointly released a protein multiple sequence alignment (Protein MSA) data set.


    The open source Protein MSA data set completely covers the protein sequences in the latest version (released in February 2021) of the UniRef50 database.


    More than 440 million protein sequences are known, but it is difficult to understand the relationships between proteins from databases of single sequences alone.


    To better serve researchers across fields, the Protein MSA data set will be organized into multiple data formats.


    Professor Gao Yiqin said: “We encourage and look forward to a full exchange of ideas and cooperation among experts from bioinformatics, data science and AI research, to introduce, improve or design new AI models that fully explore what is hidden in the Protein MSA data set.”


      From a scientific point of view, the quantity and quality of MSAs largely determine the prediction speed and accuracy of the most advanced structure prediction models, and the non-parametric algorithm that generates the MSA remains one of the main rate-limiting steps in many protein prediction methods.
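To make "quantity and quality of MSA" concrete, here is a minimal sketch in Python that computes three simple statistics on a toy alignment: depth, gap fraction, and per-column conservation. The sequences, the function names, and the choice of statistics are assumptions made for illustration; they are not taken from the Protein MSA data set or from any tool described in this article.

```python
from collections import Counter

# Toy alignment (hypothetical sequences, NOT from the Protein MSA data set).
# Each row is one aligned sequence; '-' marks a gap.
TOY_MSA = [
    "MKT-AYIA",
    "MKTQAYLA",
    "MRT-AYIA",
    "MKSQAFIA",
]


def msa_depth(msa):
    """Number of aligned sequences: one simple notion of MSA 'quantity'."""
    return len(msa)


def gap_fraction(msa):
    """Fraction of alignment cells that are gaps: a crude 'quality' proxy."""
    total = sum(len(row) for row in msa)
    gaps = sum(row.count("-") for row in msa)
    return gaps / total


def column_conservation(msa):
    """Per column: count of the most common non-gap residue divided by the
    number of rows (gaps count in the denominator). An all-gap column scores 0.0."""
    length = len(msa[0])
    if any(len(row) != length for row in msa):
        raise ValueError("all rows of an alignment must have equal length")
    scores = []
    for col in zip(*msa):  # iterate over alignment columns
        residues = [c for c in col if c != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(col))
    return scores


if __name__ == "__main__":
    print("depth:", msa_depth(TOY_MSA))
    print("gap fraction:", gap_fraction(TOY_MSA))
    print("conservation:", column_conservation(TOY_MSA))
```

Highly conserved columns show up only when related sequences are placed side by side, which is the kind of relationship a single-sequence database cannot expose. Real pipelines use more elaborate measures than these.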


      

      The database is released on the HUAWEI CLOUD AI Gallery platform, which ensures that users at home and abroad can access and download the data sets, provides data maintenance that can be continuously updated and expanded, and offers related support for downstream AI applications and deployment.


      

      Attached:

      Data set open source description:

      https://gitee.


      Data set download address:

      https://marketplace.


      


