FormulationDT

Correctly assessing the formulatability of lead compounds and selecting an appropriate formulation strategy are critical for reducing risk and improving efficiency. Leveraging formulation data from approved drugs, this work designed and developed the first data-driven and knowledge-guided artificial intelligence (AI) system for formulation strategy decision-making. First, formulation data for approved small-molecule drugs, covering both oral and injectable products, were compiled. A formulation strategy decision route was then established based on insights gleaned from the approved drugs, and a binary classification model was developed for each decision step. Because marketed drug data contain no confirmed negative samples, we improved and validated a positive-unlabeled learning algorithm to score and label the unlabeled data. Next, for each of the 12 classification tasks, the top-performing algorithm was selected from 8 commonly used supervised learning algorithms; the selected models achieved average accuracy, recall, precision, and AUC of 86%, 82%, 86%, and 91%, respectively. Lastly, the AI formulation strategy decision platform, FormulationDT, was constructed by integrating the 12 trained models with expert knowledge. It can be applied at multiple drug development stages, from lead screening to commercial formulation development. FormulationDT demonstrates the value of partially supervised learning in pharmaceutical decision-making and holds significant potential as an in silico tool for formulatability assessment and formulation strategy decisions, facilitating efficiency gains in drug discovery and development.
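The positive-unlabeled step can be illustrated with a generic PU bagging loop. This is a minimal sketch only: the summary above does not describe the improved algorithm, so it is not the FormulationDT implementation. The function names (`pu_bagging`, `fit_score`), the toy nearest-centroid base learner, and the parameters `n_rounds` and `seed` are all assumptions made for illustration. In each round, a bootstrap sample of unlabeled items is treated as provisional negatives, a scorer is fitted against the known positives, and only the out-of-bag unlabeled items are scored; the per-item scores are then averaged.

```python
import random


def centroid(rows):
    """Column-wise mean of a non-empty list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_score(pos, neg):
    """Toy stand-in base learner (assumption): nearest-centroid scorer.

    Returns a function mapping a vector to a score in [0, 1];
    higher means closer to the positive centroid.
    """
    c_pos, c_neg = centroid(pos), centroid(neg)

    def score(x):
        d_pos, d_neg = sq_dist(x, c_pos), sq_dist(x, c_neg)
        total = d_pos + d_neg
        return 0.5 if total == 0 else d_neg / total

    return score


def pu_bagging(positives, unlabeled, n_rounds=100, seed=0):
    """Average out-of-bag scores for each unlabeled item."""
    rng = random.Random(seed)
    sums = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    # Bag size matches the positive set, but always leaves items out of bag.
    k = min(len(positives), len(unlabeled) - 1)
    for _ in range(n_rounds):
        # Bootstrap (with replacement) unlabeled items as provisional negatives.
        bag = [rng.randrange(len(unlabeled)) for _ in range(k)]
        in_bag = set(bag)
        score = fit_score(positives, [unlabeled[i] for i in bag])
        # Score only the items that were not used as negatives this round.
        for i, x in enumerate(unlabeled):
            if i not in in_bag:
                sums[i] += score(x)
                counts[i] += 1
    # None marks an item that was never out of bag (unlikely for many rounds).
    return [s / c if c else None for s, c in zip(sums, counts)]
```

Unlabeled items with high averaged scores resemble the known positives, while low-scoring items are candidates for reliable negatives. Where to place the labeling thresholds is a separate design choice that this sketch does not address.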


Figure 1: a. Data flow; b. Formulation strategy distribution of marketed small molecule drugs
Table 1: Machine learning task definition and description.
Table 2: The dataset and outcome of positive-unlabeled bagging for 12 tasks
Table 3: The performance of the optimal models for 12 tasks (mean ± standard deviation, 5 repeats)
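For readers unfamiliar with the metrics named in the Table 3 caption, the following sketch shows how accuracy, recall, precision, and AUC can be computed for one binary task and summarized as mean ± standard deviation over repeats. The helper names (`binary_metrics`, `summarize`), the 0.5 decision threshold, and the use of the sample standard deviation are assumptions for illustration; the source does not state which conventions were used.

```python
from statistics import mean, stdev


def binary_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, recall, precision, and AUC for one binary task.

    Assumes y_true contains at least one positive (1) and one negative (0).
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    # AUC as the fraction of positive/negative pairs ranked correctly
    # (ties count as half).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "auc": wins / (len(pos) * len(neg)),
    }


def summarize(runs):
    """Map each metric to (mean, sample std) across repeated runs.

    Requires at least two runs, since the sample std is undefined for one.
    """
    return {k: (mean(r[k] for r in runs), stdev(r[k] for r in runs))
            for k in runs[0]}
```

Calling `binary_metrics` once per repeat and passing the resulting list to `summarize` yields one (mean, standard deviation) pair per metric.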
Browser compatibility

We tested FormulationDT on the following systems and browsers:

| OS | Chrome | Firefox | Microsoft Edge | Safari |
| --- | --- | --- | --- | --- |
| Linux Ubuntu 20.04 LTS | not tested | 80.01 (64 bit) | n/a | n/a |
| Windows 10 | 106.0.5249.119 (64 bit) | 107.0.1 (64 bit) | 105.13.1343.50 (64 bit) | not tested |
| Mac OSX | 107.0.5304.121 | not tested | not tested | 16.1 |
| Android 11 | not tested | 107.2.0 | 107.0.1418.62 | n/a |