Hi there! I'm currently a PhD student in chemical engineering and applied chemistry at the University of Toronto. My research focuses on accelerating materials discovery with deep learning and on exploring the multimodality of metal-organic frameworks (MOFs). Feel free to connect with me on LinkedIn or Bluesky if you'd ever like to talk about chemistry, cheminformatics, materials science, engineering or just life!
There is a huge incentive (especially now!) to explore deep learning for MOFs. When designing MOFs, there are millions of possible combinations of the available metals and organic linkers, which means that synthesizing them experimentally or screening them with molecular dynamics would be extremely expensive and time-consuming. Machine (and deep) learning offers a much cheaper and faster way of exploring this space without sacrificing much predictive accuracy. Deep learning lays the foundation not only for accelerating materials discovery in MOFs, but also for application discovery! At the end of the day, we want to find MOFs that serve a diverse set of applications, and deep learning gives us a way of finding them.
In many cases, a single modality is insufficient for creating robust models. This is especially true for metal-organic frameworks (MOFs), where both the chemical composition and the structural geometry must be adequately represented to build effective models that can accelerate materials discovery. Therefore, it is essential to combine multiple modalities. Examples of modalities in MOFs include atomistic embeddings of the crystal structure, energy grids, textual representations like SMILES or MOFid, spectral data (e.g., PXRD, NMR), chemical descriptors (e.g., revised autocorrelations), and geometric descriptors (e.g., largest included sphere, largest free sphere, accessible surface area, and volumes). However, multimodality raises important questions, such as which modalities are most appropriate for a given scenario. Ideally, the chosen modalities should be readily available, or easy to obtain, close to the synthesis stage.
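As a rough illustration of what combining modalities can look like in practice, here is a minimal sketch of early fusion: each modality's feature vector is concatenated into one input vector, and a mask records which modalities were actually available for a given MOF. The modality names, feature sizes, and the `fuse` helper are all hypothetical and chosen only for illustration. This is just one simple way to combine modalities, not a description of any particular model.

```python
# Hypothetical modality names and feature sizes, chosen only for illustration.
MODALITY_DIMS = {"geometric": 3, "chemical": 4, "pxrd": 5}


def fuse(features):
    """Early fusion: concatenate per-modality vectors in a fixed order.

    `features` maps a modality name to a list of numbers. Modalities that
    are missing are zero-filled, and a 0/1 mask records which were present.
    """
    fused, mask = [], []
    for name, dim in MODALITY_DIMS.items():
        vec = features.get(name)
        if vec is None:
            # Modality not available for this MOF: pad with zeros.
            fused.extend([0.0] * dim)
            mask.append(0)
        else:
            if len(vec) != dim:
                raise ValueError(f"{name}: expected {dim} values, got {len(vec)}")
            fused.extend(float(x) for x in vec)
            mask.append(1)
    return fused, mask


# Example: a MOF with geometric descriptors and a (toy) PXRD vector,
# but no chemical descriptors.
fused, mask = fuse({"geometric": [4.2, 3.1, 1500.0], "pxrd": [0.1, 0.0, 0.7, 0.2, 0.0]})
```

Keeping the mask alongside the fused vector lets a downstream model tell "this modality was missing" apart from "this modality was measured as zero", which matters when only some modalities are easy to obtain near the synthesis stage.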
One of my current interests is the generative design of MOFs using available modalities. Generative design has many incentives, but the most important ones are that it removes the complexity of constructing computational structures of MOFs by hand and that it opens the door to discovering entirely new MOFs. Currently, generative approaches for MOFs include diffusion models, variational autoencoders and even large language models, and I wish to explore and expand on these.
I will keep updating this page as my PhD carries on...