Deep learning chemical space


Harvard Chemistry and Chemical Biology

Abstract: Virtual screening is increasingly proven as a tool for testing new molecules for a given application. Through simulation and regression, we can gauge in an automatic and robust way whether a molecule is a promising candidate. A large remaining challenge, however, is how to perform optimization over a discrete space of size at least 10^60. Despite the size of chemical space, or perhaps precisely because of it, coming up with novel, stable, makeable molecules that are effective is not trivial. First-principles approaches to generating new molecules fail to capture the intuition embedded in the ~100 million existing molecules. I will report our progress towards developing an autoencoder that projects molecular space into a continuous, differentiable representation in which we can perform molecular optimization.