Chemistry and computer science work together to apply artificial intelligence to chemical reactions

In recent years, researchers have increasingly turned to data science techniques to aid in problem solving in organic synthesis.

Researchers in the laboratory of Abigail Doyle, the A. Barton Hepburn Professor of Chemistry at Princeton, in collaboration with Ryan Adams, professor of computer science, have developed open source software that gives chemists a state-of-the-art optimization algorithm for their daily work and folds what has been learned in machine learning into synthetic chemistry.

The software adapts the key principles of Bayesian optimization to enable faster and more efficient synthesis of chemicals.

Based on Bayes’ theorem, a mathematical formula for determining conditional probability, Bayesian optimization is a strategy that is widely used in science. In the broadest sense, it enables people and computers to use previous knowledge to inform and optimize future decisions.
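For reference, Bayes' theorem can be written as follows. The symbols H (a hypothesis) and D (observed data) are labels chosen here for illustration and do not come from the article:

```latex
P(H \mid D) = \frac{P(D \mid H)\, P(H)}{P(D)}
```

Here P(H) is the prior belief, P(D | H) is the likelihood of the data under that hypothesis, and P(H | D) is the updated (posterior) belief once the data have been seen.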

The chemists in Doyle’s laboratory, working with Adams and colleagues from Bristol-Myers Squibb, compared human decision-making with the software package on a test reaction. They found that the optimization tool was both more efficient and less biased than the human participants. Their work appears in the current issue of the journal Nature.

“Reaction optimization is ubiquitous in chemical synthesis in both science and the chemical industry as a whole,” said Doyle. “Because the chemical space is so large, it is impossible for chemists to evaluate the entire reaction space experimentally. We wanted to develop and evaluate Bayesian optimization as a tool for synthetic chemistry because it is successful for related optimization problems in the sciences.”

Benjamin Shields, a former postdoctoral fellow in the Doyle lab and lead author of the paper, created the Python package.

“I come from synthetic chemistry, so I definitely appreciate that synthetic chemists are pretty good at addressing these problems on their own,” Shields said. “I think the real strength of Bayesian optimization is that we can model these high-dimensional problems and capture trends that we may not see in the data itself so that the data can be processed much better.

“And second, within a given space, it’s not held back by the biases of a human chemist,” he added.

How it works

The software started as an out-of-field project to meet Shields’ doctoral requirements. Doyle and Shields then formed a team under the Center for Computer Assisted Synthesis (C-CAS), a National Science Foundation initiative launched at five universities to transform the way people plan and conduct the synthesis of complex organic molecules. Doyle has been a lead investigator at C-CAS since 2019.

“Optimizing reactions can be an expensive and time-consuming process,” said Adams, who is also the director of the Program in Statistics and Machine Learning. “This approach not only speeds it up using cutting-edge techniques, it also finds better solutions than humans would normally identify. I think this is just the beginning of what Bayesian optimization can do in this area.”

Users first define a search space (the plausible experiments to be considered), such as a list of catalysts, reagents, ligands, solvents, temperatures, and concentrations. Once this space is prepared and the user defines how many experiments to run, the software selects the initial experimental conditions to evaluate. New experiments are then proposed and run in repeated rounds, drawing on a progressively narrower set of choices, until the reaction is optimized.
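The loop described above can be illustrated with a minimal Python sketch. It is not the Doyle lab's package or its API: the ligand, solvent and temperature names, the toy `run_experiment` yield function, and the choice of a Gaussian process surrogate with an expected-improvement rule are all assumptions made here for illustration.

```python
import itertools
import math
import random

# Hypothetical search space: every name and value below is illustrative.
LIGANDS = ["ligand_A", "ligand_B", "ligand_C"]
SOLVENTS = ["solvent_X", "solvent_Y"]
TEMPS_C = [60, 80, 100]
SPACE = list(itertools.product(LIGANDS, SOLVENTS, TEMPS_C))

def run_experiment(cond):
    """Toy stand-in for a lab experiment; returns a made-up yield in [0, 1]."""
    ligand, solvent, temp = cond
    base = {"ligand_A": 0.2, "ligand_B": 0.5, "ligand_C": 0.35}[ligand]
    bonus = {"solvent_X": 0.1, "solvent_Y": 0.0}[solvent]
    return base + bonus + 0.3 * (1 - ((temp - 80) / 20) ** 2)

def encode(cond):
    """One-hot encode the categorical choices; rescale temperature to [0, 1]."""
    ligand, solvent, temp = cond
    return ([float(ligand == l) for l in LIGANDS]
            + [float(solvent == s) for s in SOLVENTS]
            + [(temp - 60) / 40])

def rbf(a, b, variance=0.05, length=1.0):
    """Squared-exponential kernel (hyperparameters are assumed, not tuned)."""
    d2 = sum((p - q) ** 2 for p, q in zip(a, b))
    return variance * math.exp(-d2 / (2 * length ** 2))

def cholesky(A):
    """Lower-triangular L with L * L^T = A, for a positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(A[i][i] - s) if i == j else (A[i][j] - s) / L[j][j]
    return L

def chol_solve(L, b):
    """Solve (L * L^T) x = b by forward then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(X, y, noise=1e-4):
    """Fit a GP surrogate; return a function giving posterior mean and std. dev."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    y_mean = sum(y) / len(y)
    alpha = chol_solve(L, [v - y_mean for v in y])

    def predict(x):
        k = [rbf(a, x) for a in X]
        mu = y_mean + sum(ki * ai for ki, ai in zip(k, alpha))
        var = rbf(x, x) - sum(ki * vi for ki, vi in zip(k, chol_solve(L, k)))
        return mu, math.sqrt(max(var, 1e-12))

    return predict

def expected_improvement(mu, sigma, best):
    """Expected gain over the best yield observed so far (maximization)."""
    z = (mu - best) / sigma
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    return (mu - best) * cdf + sigma * pdf

def optimize(budget=10, n_init=3, seed=0):
    """Pick random initial conditions, then propose one new experiment per round."""
    rng = random.Random(seed)
    tested = rng.sample(SPACE, n_init)
    yields = [run_experiment(c) for c in tested]
    while len(tested) < budget:
        predict = fit_gp([encode(c) for c in tested], yields)
        best = max(yields)
        untested = [c for c in SPACE if c not in tested]
        nxt = max(untested,
                  key=lambda c: expected_improvement(*predict(encode(c)), best))
        tested.append(nxt)
        yields.append(run_experiment(nxt))
    return tested, yields
```

Calling `tested, yields = optimize()` returns the conditions tried and their simulated yields, and `tested[yields.index(max(yields))]` gives the best condition found within the budget. In a real campaign, `run_experiment` would be replaced by an actual measurement, and the encoding and kernel settings would need to be chosen for the chemistry at hand.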

“In developing the software, I tried to include ways in which people could inject what they know about a reaction,” Shields said. “No matter how you use this, or machine learning in general, there will always be a case where human expertise is valuable.”

The software and examples of its use are available on GitHub. GitHub links are available for: software that represents the chemicals to be evaluated in a machine-readable format via density functional theory; the reaction optimization software; and the game that collects the chemists’ decisions on optimizing the test reaction.

“Bayesian Reaction Optimization as a Tool for Chemical Synthesis” by Benjamin J. Shields, Jason Stevens, Jun Li, Marvin Parasram, Farhan Damani, Jesus I. Martinez Alvarado, Jacob M. Janey, Ryan P. Adams and Abigail G. Doyle appears in the February 3 issue of the journal Nature.

More information:
Benjamin J. Shields et al. Bayesian Reaction Optimization as a Tool for Chemical Synthesis, Nature (2021). DOI: 10.1038/s41586-021-03213-y

Provided by Princeton University

Citation: Chemistry and Computer Science Collaborate to Apply Artificial Intelligence to Chemical Reactions (2021, February 5), retrieved March 26, 2021 from -chemical.html
