by Mylène André

EMBL Grenoble’s Marquez Team has developed an AI-based training method that automates crystal screening and identification for macromolecular crystallography studies

Representation of the lab-in-the-loop approach used for AXIS, an AI-based training method in which machine learning predictions are corrected by expert scientists in a continuous learning loop to automate crystal identification, accelerating macromolecular crystallography studies. Credit: Daniela Velasco/EMBL

What is ‘lab-in-the-loop’?

Lab-in-the-loop aims to improve experimental processes with machine learning and generative AI methods, using iterative loops and user feedback to continually correct the model.

By focusing on discrepancies between machine learning predictions and human scores, this method accelerates the learning process. Humans can also make mistakes, however, so for each conflicting score, the HTX experts had to determine whether the machine or the scientist was correct. In each iteration, Personnaz added an extended training dataset together with the curated data to retrain the model. “The first iteration showed obvious errors, but at the second iteration, it made fewer ‘silly’ mistakes,” commented Personnaz. “This shows that CRIMS-AXIS made good progress, because increasingly, what remains are cases that are impossible to solve.” These are cases in which experts could not tell whether an image showed very small crystalline material or amorphous precipitate.
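One iteration of such a loop can be sketched in a few lines of Python. This is only an illustration of the idea described above, not the CRIMS-AXIS code: the names `Scored`, `find_conflicts`, `curate` and `next_training_labels` are hypothetical, scores are reduced to a simple crystal/no-crystal flag, and retraining itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scored:
    """One crystallisation image with two scores (hypothetical structure)."""
    image_id: str
    human: bool  # scientist's score: does the image show a crystal?
    model: bool  # model's prediction for the same image


def find_conflicts(items):
    """Keep only the images where the human score and the model disagree."""
    return [it for it in items if it.human != it.model]


def curate(conflicts, expert_verdicts):
    """Resolve conflicts with expert verdicts.

    expert_verdicts maps image_id to True/False, or to None when the
    experts cannot decide; undecidable images are left out.
    """
    curated = {}
    for it in conflicts:
        verdict = expert_verdicts.get(it.image_id)
        if verdict is not None:
            curated[it.image_id] = verdict
    return curated


def next_training_labels(extended_labels, curated):
    """Combine the extended training set with the curated labels.

    Curated labels take precedence, since they were checked by experts.
    """
    merged = dict(extended_labels)
    merged.update(curated)
    return merged
```

In this sketch, only the conflicting images reach the experts, which is what keeps the review effort small in each round; the merged labels would then be used to retrain the model before the next iteration.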

Fully integrated into the CRIMS software, CRIMS-AXIS identifies crystals, as well as needles and other crystallisation forms. The model has received positive feedback from users. “AXIS removes critical bottlenecks, particularly in the context of extensive crystallisation screens, unlocking the potential for higher levels of automation that are key in both fundamental and translational research,” explained Sihyun Sung, Staff Scientist in the Marquez Team and user of CRIMS-AXIS.

This work was supported by the European Commission via the Fragment-Screen project, coordinated by Instruct-ERIC. Other labs can easily integrate it, as the machine learning models have been deposited in a central repository and are available for the scientific community to use.

Next steps

Personnaz is now working with EMBL Grenoble colleagues on improving CRIMS-AXIS and upgrading their automated pipelines.

On the machine learning front, he is working with Alana de Sousa, an astrophysicist specialising in AI studies, who is currently doing a traineeship in the Marquez Team. They are attempting to apply ‘self-supervised learning’ to CRIMS-AXIS, leveraging the large number of unlabelled crystallography images produced over many years using the HTX platform. The aim is to pre-train the model using only unlabelled crystallography images, thereby restricting the diversity of training images. This would let the model ‘learn to understand’ crystallography images and potentially achieve better results for crystal identification. The researchers also plan to test whether it can be used for other tasks such as multi-class classification, crystal detection, or segmentation.

To move towards fully automated crystallisation pipelines, Personnaz is collaborating with software engineer Jeremy Sinoir from the Papp Team to integrate automated crystal harvesting into CRIMS. Currently, HTX operators have to select in the software which crystals need to be harvested and prepared for diffraction data collection experiments, and how. The ‘automated harvesting’ that Personnaz and Sinoir are developing would be integrated into CrystalDirect Harvester 4, the latest version of the harvesting machine, soon to be used on the HTX platform. The Marquez Team is also extending the lab-in-the-loop approach to other steps of the crystallography process.

This project has received funding from the European Union’s Horizon 2020 research and innovation program under the Marie Skłodowska-Curie grant agreement No. 945405.