
    Can AI automate chemical synthesis?

    • Last Update: 2017-07-31
    • Source: Internet
    • Author: User
    Artificial intelligence (AI) is currently the biggest "internet celebrity" in the scientific community. From the Watson system diagnosing leukemia in under 10 minutes to AlphaGo defeating the world's top-ranked Go player, from the battlefield to space, AI can be seen everywhere. It has already penetrated scientific research and is being applied to it in various ways, including unlocking the secrets of chemical synthesis.

    In the 1960s, organic chemistry labs looked like an alchemist's paradise: rows of reagent bottles, old wooden test-tube racks, and chemists busy at the bench. After 50 years of rapid development, the laboratory scene has changed. Today's labs have rows of fume cupboards and analytical instruments. However, the essence of researchers' work is the same. Organic chemists usually plan their work on paper, drawing hexagons and carbon chains until they come up with the sequence of reactions needed to synthesize a given molecule, and then painstakingly carry out that sequence by hand.

    Chemists have been trying to integrate machines and artificial intelligence into the research process, freeing their hands from the bench by creating devices that can automatically synthesize organic molecules. Such a device must meet three requirements. First, it must be able to access existing knowledge databases on how molecules are synthesized. Second, it must be able to feed this knowledge to an algorithm in order to plan the synthesis steps. Finally, it must be able to carry out the planned reactions automatically, using the reagents in the machine's reactor in sequence.

    [Figure: Automation system at the Novartis-MIT Center for Continuous Manufacturing (source: Nature)]

    The last step is where technology has advanced fastest. People have already made a lot of progress in carrying out experiments with machines. For example, pharmaceutical companies routinely use automated high-throughput platforms for drug design. Transcriptic and Emerald Cloud Lab, two California start-ups, are creating systems to
automate almost all of the experimental tasks that biochemists perform. Scientists submit experimental plans online; the steps are converted into code and fed to a robotic platform, which then carries out the series of experiments automatically. This approach is most common in disciplines that require intensive experimental work, such as molecular biology and chemical engineering.

    Although automation is becoming more versatile, teaching a computer to design its own synthesis is still a major challenge. The hardware has long been available; data and software are the bottlenecks.

    Sir Francis Bacon, the 17th-century English philosopher and founder of modern experimental science, put forward in his book Novum Organum the model of scientific discovery now known as Baconian induction: using inductive logic, observations of specific phenomena are systematically collected, tabulated and objectively analyzed in order to arrive at general conclusions. Bacon's view reveals an important fact: the process of scientific discovery is itself algorithmic. It repeats a limited number of steps to produce meaningful results. Bacon explicitly used the word "machine" to describe his method. His scientific algorithm has three main steps: first, collect observations of phenomena and integrate them into a knowledge base; second, form new hypotheses from new observations; finally, test the hypotheses through careful experiments. If science follows an algorithm, then it can in principle be automated.

    Corey established the rules of retrosynthetic analysis in the 1960s. Over the following 10 years, Corey developed the LHASA (Logic and Heuristics Applied to Synthetic Analysis) software, which uses these rules to suggest sequences of synthesis steps, so that synthesis design could become a teachable science rather than an art that depends on individual skill. But because the databases contained too few reactions and too many errors, neither LHASA nor its
successors succeeded.

    In 2001, the Polish Academy of Sciences and the Ulsan National Institute of Science and Technology in South Korea started to develop a program called Chematica, hoping to help chemists quickly find the best synthetic route. So far, the team has manually entered more than 10 million molecules and reactions and linked them to one another to form a network. Unlike molecular retrieval databases such as SciFinder and Reaxys, which are widely used at present, Chematica is based on "deep learning": it can predict reactions in a short time and even propose synthesis pathways that have not been reported in the literature. Chemists only need to enter a target molecule into Chematica to get reaction routes ranked by cost, substrate availability and number of steps, which takes only a few seconds.

    Each reaction step and each product is graded using two functions: a reaction scoring function and a compound scoring function. If a reaction is difficult to carry out, its reaction score is lower; if a compound's structure is simple or common, its compound score is higher and the route is considered more reasonable. These scoring functions enable Chematica to evaluate each route and discard those that are obviously not feasible.

    [Figure: The best synthesis route to epicolactone found by Syntaurus (source: Chemistry World)]

    In 2016, the team developed a new functional module, Syntaurus, which contains more than 20,000 chemical synthesis rules covering functional groups that cannot coexist, protection strategies, and even subtle differences in bond lengths and bond angles.

    But Chematica is only one of the synthesis-analysis programs and databases developed in recent years. Wiley, the publishing giant, has also developed a chemical synthesis program, ChemPlanner, based on "big data" and "machine learning". As a computer-aided organic synthesis design system, it can help chemists to select simple
and efficient routes from among the various possible synthesis paths, using cloud computing. More importantly, the system is not limited to the existing literature: it can use selected synthesis rules to predict a reaction roadmap and complete the retrosynthetic analysis from the target product back to available starting materials. ChemPlanner can also redesign routes as needed (for example, to control cost or to require or exclude catalysts).

    [Figure: Record of failed experiments from the Norquist team (source: Haverford College)]

    The Norquist team at Haverford College in the United States has carried out a similar project. They developed a machine learning algorithm that, after training on a large amount of experimental data (including failed experiments and, of course, successful ones), predicts crystal preparation strategies with a success rate of up to 89%. The team used a standard machine learning method, training the algorithm on data from nearly 4,000 crystal synthesis experiments run under different reaction conditions (such as temperature, concentration, amount of reactants and acidity). They converted the data recorded in archived laboratory notebooks, including the failed experiments, into a machine-readable format. The computer then worked out the rules that distinguish successful experiments from failed ones. The team also set up a website called the "Dark Reactions Project" (http://darkreactions.haverford.edu/) to encourage chemists to share data from their failed attempts to prepare new crystals.

    The introduction of artificial intelligence has indeed made waves in organic synthetic chemistry, especially in the total synthesis of natural products and drugs. It is a bit like a GPS navigation system for chemistry: very good at finding routes, but the current map is still too small. The most challenging step toward full automation is collecting reliable data on a large scale, but at present, there is not a
large enough central database to cover most known chemical knowledge. The major scientific publishers have put barriers around their data. In addition, papers themselves are shaped by their authors' interpretations and contain complex concepts and methods that are difficult to extract and quantify.
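
    The route-scoring idea described for Chematica (a reaction score and a compound score for each step, with obviously infeasible routes discarded) can be illustrated with a small sketch. This is not Chematica's actual algorithm: the Step class, the score values, the averaging rule and the min_reaction_score threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Step:
    reaction: str
    reaction_score: float  # hypothetical scale: lower means harder to carry out
    compound_score: float  # hypothetical scale: higher means simpler or more common product


def route_score(steps):
    """Average the combined (reaction + compound) score over all steps.

    Averaging is an assumption of this sketch, not a documented formula.
    """
    return sum(s.reaction_score + s.compound_score for s in steps) / len(steps)


def rank_routes(routes, min_reaction_score=0.5):
    """Discard routes containing a clearly infeasible step, then sort best-first."""
    feasible = {
        name: steps
        for name, steps in routes.items()
        if all(s.reaction_score >= min_reaction_score for s in steps)
    }
    return sorted(feasible, key=lambda name: route_score(feasible[name]), reverse=True)


# Toy routes with invented scores, for illustration only.
example_routes = {
    "A": [Step("esterification", 0.9, 0.8), Step("reduction", 0.8, 0.7)],
    "B": [Step("cross-coupling", 0.6, 0.9)],
    "C": [Step("exotic rearrangement", 0.2, 0.9)],
}

if __name__ == "__main__":
    for name in rank_routes(example_routes):
        print(name, round(route_score(example_routes[name]), 2))
```

    In this toy data, route C is dropped because its only step falls below the feasibility threshold, and the remaining routes are ordered by their average score. A real system would also have to weigh cost, substrate availability and number of steps, which this sketch leaves out.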
    This article is an English version of an article which is originally in the Chinese language on echemi.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to service@echemi.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
