Pushing the Boundaries of Molecular Property Prediction for Drug Discovery with Multitask Learning BERT Enhanced by SMILES Enumeration

January 2022 | Xiaochen Zhang, Chengkun Wu, Jiacai Yi, Xiangxiang Zeng, Canqun Yang, Aiping Lyu, Tingjun Hou, Dongsheng Cao | Research

This work applies a BERT framework to multitask molecular property prediction and uses SMILES enumeration, which writes the same molecule as several equivalent SMILES strings, to augment the input sequences. The study reports that large-scale pretraining combined with this sequence augmentation improves robustness and predictive performance across diverse drug discovery tasks.
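
To make the idea of SMILES enumeration concrete, here is a minimal sketch in pure Python. It is not the authors' implementation. It assumes a toy SMILES subset (single-letter atoms, optional `=` and `#` bonds, parenthesised branches, no rings, charges, aromaticity, or stereochemistry), and the helper names `parse_smiles`, `write_smiles`, `enumerate_smiles`, and `signature` are hypothetical. Real pipelines would normally rely on a cheminformatics toolkit instead.

```python
import random

# Toy assumption: only single-letter atoms and acyclic molecules are supported.
ATOMS = set("BCNOSPFI")
BONDS = {"-": 1, "=": 2, "#": 3}
BOND_CHARS = {1: "", 2: "=", 3: "#"}


def parse_smiles(smiles):
    """Parse a toy acyclic SMILES string into (atoms, adjacency)."""
    atoms = []    # atom symbols, indexed by order of appearance
    adj = {}      # atom index -> list of (neighbor index, bond order)
    stack = []    # atoms at which currently open branches started
    prev = None   # atom that the next atom will bond to
    order = 1     # bond order pending for the next atom
    for ch in smiles:
        if ch in ATOMS:
            idx = len(atoms)
            atoms.append(ch)
            adj[idx] = []
            if prev is not None:
                adj[prev].append((idx, order))
                adj[idx].append((prev, order))
            prev, order = idx, 1
        elif ch in BONDS:
            order = BONDS[ch]
        elif ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
    return atoms, adj


def write_smiles(atoms, adj, start, rng):
    """Write the molecule from `start`, visiting neighbors in random order."""
    def visit(node, parent):
        out = atoms[node]
        children = [(n, o) for n, o in adj[node] if n != parent]
        rng.shuffle(children)
        for i, (child, order) in enumerate(children):
            piece = BOND_CHARS[order] + visit(child, node)
            # Every child except the last is written as a parenthesised branch.
            out += piece if i == len(children) - 1 else f"({piece})"
        return out
    return visit(start, None)


def enumerate_smiles(smiles, n, seed=0):
    """Return n randomized SMILES strings for one molecule (may repeat)."""
    rng = random.Random(seed)
    atoms, adj = parse_smiles(smiles)
    return [write_smiles(atoms, adj, rng.randrange(len(atoms)), rng)
            for _ in range(n)]


def signature(smiles):
    """Order-independent summary: sorted atoms and sorted typed bonds."""
    atoms, adj = parse_smiles(smiles)
    bonds = sorted((tuple(sorted((atoms[a], atoms[b]))), o)
                   for a in adj for b, o in adj[a] if a < b)
    return sorted(atoms), bonds


if __name__ == "__main__":
    # Ethyl acetate written several ways; each variant is the same molecule.
    for variant in enumerate_smiles("CC(=O)OCC", 5, seed=1):
        print(variant)
```

Each variant starts from a randomly chosen atom and walks the molecular graph in a random branch order, so a sequence model sees several different strings for one molecule. The `signature` helper is only a sanity check that atoms and bonds are preserved; it is not a full graph-isomorphism test.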