Photo: Zhiling Zheng and a MOF-powered water harvester. (Photo courtesy of Zhiling Zheng.)
UC Berkeley experts taught ChatGPT how to quickly create datasets on difficult-to-aggregate research about certain materials that can be used to fight climate change, according to a new paper published in the Journal of the American Chemical Society.
These datasets on the synthesis of the highly porous materials known as metal-organic frameworks (MOFs) will inform predictive models. The models will accelerate chemists' ability to create or optimize MOFs, including ones that alleviate water scarcity and capture air pollution. Because the approach relies on AI-powered chatbots, all chemists – not just coders – can build these databases.
"In a world where you have sparse data, now you can build large datasets," said Omar Yaghi, the Berkeley chemistry professor who invented MOFs and an author of the study. "There are hundreds of thousands of MOFs that have been reported, but nobody has been able to mine that information. Now we can mine it, tabulate it and build large datasets."
This breakthrough by experts at the College of Computing, Data Science, and Society's Bakar Institute of Digital Materials for the Planet (BIDMaP) will lead to efficient and cost-effective MOFs more quickly, an urgent need as the planet warms. It can also be applied to other areas of chemistry. It is one example of how AI can augment and democratize scientific research.
"We show that ChatGPT can be a very helpful assistant," said Zhiling Zheng, lead author of the study and a chemistry Ph.D. student at Berkeley. "Our ultimate goal is to make [research] much easier."
Other authors of the study, "ChatGPT Chemistry Assistant for Text Mining and Prediction of MOF Synthesis," include the Department of Chemistry's Oufan Zhang and the Department of Electrical Engineering and Computer Sciences' Christian Borgs and Jennifer Chayes. All are affiliated with BIDMaP, except Zhang.
Certain authors are also affiliated with the Kavli Energy Nanoscience Institute, the Department of Mathematics, the Department of Statistics, the School of Information and the KACST-UC Berkeley Center of Excellence for Nanomaterials for Clean Energy Applications.