ARKA descriptors in QSAR

One of the most commonly used in silico approaches for assessing the activity, property, or toxicity of new molecules is Quantitative Structure-Activity/Property/Toxicity Relationship (QSAR/QSPR/QSTR) modeling, which builds predictive models that can be applied efficiently to query compounds.[1] QSAR/QSPR/QSTR encodes chemical information numerically in the form of molecular descriptors and correlates these with the response (activity, property, or toxicity) using statistical techniques.[2] Because QSAR is essentially a similarity-based approach, the occurrence of activity/property cliffs, that is, compounds that are similar in structure but differ considerably in their activity or property, may greatly reduce the predictive accuracy of the developed models.[3] The Arithmetic Residuals in K-groups Analysis (ARKA) approach is a supervised dimensionality reduction technique that can be used to identify activity cliffs in a data set.[4] The basic idea of the ARKA descriptors is to group the conventional QSAR descriptors based on a predefined criterion and then assign a weight to each descriptor in each group. ARKA descriptors have also been used to develop classification-based[5] and regression-based[6] QSAR models with acceptable quality statistics.
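The grouping-and-weighting idea described above can be illustrated with a minimal Python sketch. This is not the published ARKA formulation: the grouping criterion (a descriptor joins the class in which its training-set mean is higher), the weight (the absolute difference between the two class means), the function names `fit_groups` and `transform`, and the toy data are all assumptions made here for illustration. The sketch only shows how many descriptors, assumed to be pre-scaled to [0, 1], can be collapsed into K = 2 weighted sums.

```python
from statistics import mean


def fit_groups(X, y):
    """Assign each descriptor (column) to a group and give it a weight.

    Assumed criterion (illustrative only): a descriptor joins the group of
    the class in which its training-set mean is higher; its weight is the
    absolute difference between the two class means.
    """
    groups, weights = [], []
    for j in range(len(X[0])):
        m1 = mean(row[j] for row, label in zip(X, y) if label == 1)
        m0 = mean(row[j] for row, label in zip(X, y) if label == 0)
        groups.append(1 if m1 >= m0 else 0)
        weights.append(abs(m1 - m0))
    return groups, weights


def transform(X, groups, weights, k=2):
    """Collapse each compound (row) into k values: one weighted sum per group."""
    out = []
    for row in X:
        sums = [0.0] * k
        for x, g, w in zip(row, groups, weights):
            sums[g] += w * x
        out.append(sums)
    return out


# Hypothetical toy data: 4 compounds, 3 descriptors scaled to [0, 1],
# with binary class labels (1 = active, 0 = inactive).
X = [[0.9, 0.1, 0.8],
     [0.8, 0.2, 0.7],
     [0.1, 0.9, 0.2],
     [0.2, 0.8, 0.3]]
y = [1, 1, 0, 0]

groups, weights = fit_groups(X, y)
arka = transform(X, groups, weights)
for label, pair in zip(y, arka):
    print(label, [round(v, 3) for v in pair])
```

Each compound is reduced from three descriptors to two group-wise sums. In a scatter plot of the two values, compounds that fall among members of the opposite class would be candidates for closer inspection as possible activity cliffs, under the assumptions stated above.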

Multiple research groups have used the ARKA descriptors to identify activity cliffs in QSAR studies and/or to develop models [7-22]. Recently, a multiclass ARKA framework has been proposed for generating improved q-RASAR (https://en.wikipedia.org/wiki/Q-RASAR) models [23].

References

  1. Muratov, Eugene N.; et al. (June 8, 2020). "QSAR without borders". Chemical Society Reviews. 49 (11): 3525–3564. doi:10.1039/D0CS00098A. PMC 8008490. PMID 32356548.
  2. Cherkasov, Artem; et al. (June 26, 2014). "QSAR Modeling: Where Have You Been? Where Are You Going To?". Journal of Medicinal Chemistry. 57 (12): 4977–5010. doi:10.1021/jm4004285. PMC 4074254. PMID 24351051.
  3. Dablander, Markus; Hanser, Thierry; Lambiotte, Renaud; et al. (April 17, 2023). "Exploring QSAR models for activity-cliff prediction". Journal of Cheminformatics. 15 (1): 47. doi:10.1186/s13321-023-00708-w. PMC 10107580. PMID 37069675.
  4. Qin, Li-Tang; Zhang, Jun-Yao; Nong, Qiong-Yuan; Xu, Xia-Chang-Li; Zeng, Hong-Hu; Liang, Yan-Peng; Mo, Ling-Yun (November 2024). "Classification and regression machine learning models for predicting the combined toxicity and interactions of antibiotics and fungicides mixtures". Environmental Pollution. 360: 124565. doi:10.1016/j.envpol.2024.124565. PMID 39033842.
  5. Banerjee, Arkaprava; Roy, Kunal (2024). "ARKA: a framework of dimensionality reduction for machine-learning classification modeling, risk assessment, and data gap-filling of sparse environmental toxicity data". Environmental Science: Processes & Impacts. 26 (6): 991–1007. doi:10.1039/D4EM00173G. PMID 38743054.
  6. Sobańska, Anna W.; Banerjee, Arkaprava; Roy, Kunal (November 18, 2024). "Organic Sunscreens and Their Products of Degradation in Biotic and Abiotic Conditions—In Silico Studies of Drug-Likeness and Human Placental Transport". International Journal of Molecular Sciences. 25 (22): 12373. doi:10.3390/ijms252212373. PMC 11595199. PMID 39596438.

[7] Classification and Regression Machine Learning Models for Predicting the Combined Toxicity and Interactions of Antibiotics and Fungicides Mixtures. Environmental Pollution (2024). https://doi.org/10.1016/j.envpol.2024.124565
[8] Evaluating ionic liquid toxicity with machine learning and structural similarity methods. Green Chemical Engg. (2024). https://doi.org/10.1016/j.gce.2024.08.008
[9] Development of a robust Machine learning model for Ames test outcome prediction. Chem Phys Lett (2024). https://doi.org/10.1016/j.cplett.2024.141663
[10] Comparative QSAR and q-RASAR Modeling for Aquatic Toxicity of Organic Chemicals to Three Trout Species: O. Clarkii, S. Namaycush, and S. Fontinalis. J Hazard Mater (2024). https://doi.org/10.1016/j.jhazmat.2024.136060
[11] New binary mixtures of fungicides against Macrophomina phaseolina: machine learning-driven QSAR, read-across prediction, and molecular dynamics simulation. Chemosphere (2024). https://doi.org/10.1016/j.chemosphere.2024.143533
[12] Contributions to the development of prediction models for the toxicity of ionic liquids. Struct Chem (2024). https://doi.org/10.1007/s11224-024-02411-4
[13] Explainable machine learning models for predicting the acute toxicity of pesticides to sheepshead minnow (Cyprinodon variegatus). Sci Tot Environ (2024). https://doi.org/10.1016/j.scitotenv.2024.177399
[14] Unveiling the interspecies correlation and sensitivity factor analysis of rat and mouse acute oral toxicity of antimicrobial agents: first QSTR and QTTR Modeling report. Toxicology Research, 13, tfae191 (2024). https://doi.org/10.1093/toxres/tfae191
[15] A comprehensive machine learning-based models for predicting mixture toxicity of azole fungicides toward algae (Auxenochlorella pyrenoidosa). Environ Internat (2024). https://doi.org/10.1016/j.envint.2024.109162
[16] Development of Quantitative Structure Property Relationship Models and Tool for Predicting the Soil Adsorption Coefficient (logKOC). Environ Pollut (2025). https://doi.org/10.1016/j.envpol.2025.125703
[17] Web Server-based Deep Learning-Driven Predictive Models for Respiratory Toxicity of Environmental Chemicals: Mechanistic Insights and Interpretability. J Hazard Mater (2025). https://doi.org/10.1016/j.jhazmat.2025.137575
[18] PBScreen: A Server for the High-Throughput Screening of Placental Barrier–Permeable Contaminants Based on Multifusion Deep Learning. Environ Pollut (2025). https://doi.org/10.1016/j.envpol.2025.125858
[19] Explainable machine learning models enhance prediction of PFAS bioactivity using quantitative molecular surface analysis-derived representation. Water Research (2025). https://doi.org/10.1016/j.watres.2025.123500
[20] Modeling and Interpretability Study of the Structure–Activity Relationship for Multigeneration EGFR Inhibitors. ACS Omega (2025). https://doi.org/10.1021/acsomega.4c10464
[21] Risk Assessment of Industrial Chemicals Towards Salmon Species Amalgamating QSAR, q-RASAR, and ARKA Framework. Toxicol Reports (2025). https://doi.org/10.1016/j.toxrep.2025.102017
[22] Predicting chemical toxicity towards Raphidocelis subcapitata with quantum chemical descriptors. Algal Research (2025) 104055. https://doi.org/10.1016/j.algal.2025.104055
[23] The multiclass ARKA framework for developing improved q-RASAR models for environmental toxicity endpoints. Environ Sci: Processes Impacts (2025). https://doi.org/10.1039/D5EM00068H