Iranian Journal of War and Public Health

eISSN (English): 2980-969X
eISSN (Persian): 2008-2630
pISSN (Persian): 2008-2622
JMERC
0.4
Volume 16, Issue 4 (2024) | Iran J War Public Health 2024, 16(4): 363-368


How to cite this article
Kavehie B. Disease Prediction in Victims of Chemical Exposure Using Forced Expiratory Volume; Logistic Regression vs. Machine Learning Methods. Iran J War Public Health 2024; 16 (4) :363-368
URL: http://ijwph.ir/article-1-1531-en.html
Authors B. Kavehie *
Department of Statistics, National Organization for Educational Testing, Tehran, Iran
* Corresponding Author Address: Department of Statistics, National Organization for Educational Testing, No. 204, Karim Khan Zand Street, Tehran, Iran. Post Box: 15875-4378 (kavehiebehrooz@yahoo.com)
Introduction
Chemical weapons are among the most inhumane tools of war ever created [1]. They were first used extensively during World War I, marking the beginning of modern chemical warfare. Subsequent conflicts, such as the Korean War, the Vietnam War, and the Iran-Iraq War, also saw the use of such agents; in the Iran-Iraq War, Iraqi forces used them repeatedly against Iran. Chemical agents are considered incapacitating weapons that disrupt various bodily systems, with effects including, but not limited to, numbness, respiratory disorders, hematological diseases, gastrointestinal issues, severe dermal blistering, genetic mutations, and even cancer. Mustard gas, the most frequently used agent in these conflicts, was confirmed by United Nations experts as a prominent chemical weapon [2, 3, 24, 25]. Sulfur mustard was first synthesized in 1822, and its blistering effects were described by Guthrie in 1860. The compound was purified in 1886, and Fritz Haber facilitated its use as a chemical warfare agent during World War I [2]. A comprehensive cohort study of the victims of the mustard gas attack on the city of Sardasht, Iran, reported in 2009, provided descriptive statistics on the victims in this city [3]. Sulfur mustard is a dense liquid that is colorless and odorless when pure. However, when combined with other chemicals, it appears brownish and emits an odor like garlic, onions, or mustard [4, 5].
The acute toxic effects of sulfur mustard are most prominent in the eyes, the respiratory tract, and the skin. The eyes are the most sensitive organs, and the first symptoms of exposure are usually ocular. Next to the eye lesions, the greatest discomfort produced by mustard gas results from irritation and toxicity of the respiratory system. Respiratory effects occur in a dose-dependent manner from the nasal mucosa to the terminal bronchioles [6].
After the ocular complications, the respiratory system sustains the greatest damage from mustard gas exposure. Late complications of chemical warfare agents lead to chronic diseases among the victims, and many of these complications are irreversible. Chronic obstructive pulmonary disease (COPD) is a common complication associated with mustard gas. In one study, 33% of individuals exposed to mustard gas had chronic respiratory issues two years post-exposure. Various diagnostic tools are available for detecting pulmonary injuries, including clinical and paraclinical assessments such as spirometry, arterial blood gas analysis, and chest radiography. Nearly all the victims complained of cough, dyspnea, and suffocation. Mortality was 7.5% (respiratory failure) and 25% (renal failure) in the victim group and was 14.2% (malignancy), 14.2% (heart failure), 28.5% (accident), and 42.8% (aging) in the control group [7]. A study on the late complications of mustard gas exposure, conducted 2-28 months post-exposure among 233 Iranian soldiers, reported the prevalence of respiratory complications (78%), central nervous system disorders (45%), skin injuries (41%), and ocular problems (39%) [8]. A separate study involving 34,000 Iranians exposed to mustard gas, conducted 13-20 years after exposure, identified the most common complications as respiratory (42.5%), ocular (39%), and dermatological (24.5%) diseases [9]. More recently, a study examined 40 Iranian individuals 16-20 years post-exposure, reporting respiratory complications in 95% of the cases, peripheral nervous system damage in 77.5%, skin disorders in 75%, and ocular diseases in 65% [10]. A report on the delayed toxic effects of sulfur mustard poisoning in 236 Iranian veterans 2 years after exposure identified hyperpigmentation (34%), hypopigmentation (16%), and dermal scarring (8%) as the most common findings.
The most common skin complaint among these patients was itching, followed by a burning sensation and desquamation. These symptoms are caused by dryness of the skin and worsen in dry weather and after physical activity. Even after two decades, pruritus was still the most common subjective finding [11]. In a study of chemical weapons victims, the mean forced expiratory volume in 1 second (FEV₁) of the cohort was 81.78% of the predicted value.
During the 8-year Iran-Iraq war, Iran was subjected to repeated chemical attacks. On July 7, 1987, the city of Sardasht, with a population of 12,000, was bombed with chemical weapons. Six locations in the city were struck with mustard gas bombs. Sardasht now has the highest number of mustard gas casualties, estimated at 1,400 individuals, of whom approximately 500 required immediate hospitalization. To this day, many of these casualties suffer from chronic health conditions, with respiratory diseases being the most prevalent. Severe mustard gas exposure frequently results in chronic and often irreversible pulmonary complications, with chronic bronchitis being the most common condition, observed in approximately 50% of the cases [12].
Statistical methods serve as the foundation of such research. In recent years, statistical methods that leverage the computational power of modern computers for data analysis and model fitting have advanced significantly. These advances include the integration of Artificial Intelligence (AI) techniques, such as Artificial Neural Networks, Machine Learning (ML), and Deep Learning.
In one study, machine learning models were used to predict the toxicity of chemical gases. A set of 144 toxic and non-toxic gases was identified according to the United States Environmental Protection Agency, the Occupational Safety and Health Administration, and the Centers for Disease Control and Prevention. Six machine learning models were used to predict the toxicity of these gases, and their performance was verified through internal and external validation [13]. A systematic review categorized and summarized more than 100 papers to present the current development of machine learning- and deep learning-based research in the area of chemical health and safety [14]. Another article reviewed machine learning methods that have been applied to toxicity prediction, including deep learning, random forests, k-nearest neighbors, and support vector machines [15]. A further review summarized the cutting-edge embedding techniques and model designs in synthetic performance prediction published up to June 2022, elaborating how chemical knowledge can be incorporated into machine learning. By merging organic synthesis tactics and chemical informatics, its authors hoped to provide a guide map and to prompt chemists to revisit the digitalization and computerization of organic chemistry principles [16]. The information produced by pairing chemistry and ML through data-driven analyses, neural network predictions, and monitoring of chemical systems allows 1) promoting the ability to understand the complexity of chemical data, 2) streamlining and designing experiments, 3) discovering new molecular targets and materials, and 4) planning or rethinking forthcoming chemical challenges. In fact, optimization encompasses all of these tasks directly [17]. There are different definitions of artificial intelligence.
Russell & Norvig define AI as “the study and design of intelligent systems capable of performing tasks that traditionally require human intelligence, such as natural language processing, learning, reasoning, problem-solving, and perception” [18]. Similarly, Bishop describes AI as “the study of how to enable computers to perform tasks currently better accomplished by humans” [19].
At the core of AI are Machine Learning and Deep Learning methodologies. Machine learning (ML) is defined as “a branch of AI focused on developing algorithms that enable computers to learn from data and make predictions based on it” [20]. Simply put, machine learning is a subset of AI that allows systems to improve automatically through data analysis without explicit programming. Key components of machine learning include data, statistical models, model training, and model evaluation. Common types of machine learning include supervised learning, unsupervised learning, and reinforcement learning. Applications of machine learning are diverse, ranging from facial recognition and natural language processing to financial market prediction and medical diagnostics [18, 20].
This research aimed to develop statistical models for predicting an individual’s health status from spirometry, specifically Forced Expiratory Volume in 1 Second (FEV₁) measurements, and to compare the performance and accuracy of two approaches: logistic regression and machine learning.


Materials and Methods
This cross-sectional analytical study was conducted in 2023 on 320 male individuals in Sardasht City who were affected by the harmful effects of mustard gas (a sulfur-based chemical warfare agent). The researcher personally visited the archives of the Janbazan Foundation in Sardasht and conducted a thorough review of all the files on chemical victims made available to him. Consequently, this study adopts a census method, eliminating the need for sampling or sample size determination.
The data used in this research were extracted from the medical records of victims of chemical attacks held by the Foundation of Martyrs and Veterans Affairs in Sardasht. The dataset included basic demographic information, such as age and height, and lung function test results (FEV₁) measured twice, two years apart. The degree of lung damage was classified by a physician, using a spirometer (Chest HI801; Japan), into four categories: healthy lungs, reduced lung volume, airway obstruction, and a combination of reduced lung volume and airway obstruction. For analysis with a logit link function, the extent of lung damage was dichotomized into healthy and unhealthy lungs (the latter covering any of the other three categories).
This article uses FEV₁ values measured in 2005 and 2007. Whenever the term FEV₁ is mentioned, it refers not to the raw FEV₁ value but to its adjusted form: the volume of air exhaled in the first second (ERV) expressed as a percentage of the value estimated by the spirometer from the patient’s physical characteristics. The goal of this research was to model the factors influencing lung damage in individuals exposed to mustard gas using two methods: logistic regression and machine learning. The study also aimed to compare the accuracy of these methods.
To perform the statistical analysis, binary logistic regression was first applied. Then, a machine learning model (using a logistic activation function) was developed using the same predictor variables as the logistic regression model. Notably, the machine learning model was trained using five different training set proportions (90%, 80%, 70%, 60%, and 50%) to evaluate its accuracy under various conditions. In total, six models (one logistic regression model and five machine learning models) were developed with the predictors X₁ (age), X₂ (height), FEV₁86 (Expiratory Reserve Volume percentage of 2007), and FEV₁84 (Expiratory Reserve Volume percentage of 2005) to predict the probability of illness.
The logistic regression analysis was run in SPSS 22, using a modified version of the logistic regression formula for the probability of occurrence, tailored to the specific needs of the problem.
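The dichotomization described above can be sketched in a few lines of Python. This is only an illustration: the category labels below are hypothetical names chosen for this example, not identifiers taken from the study's records.

```python
# Hypothetical labels for the four physician-assigned categories (assumption).
CATEGORIES = (
    "healthy",
    "reduced_volume",
    "obstruction",
    "reduced_volume_and_obstruction",
)


def dichotomize(category: str) -> int:
    """Return 0 for healthy lungs and 1 for any of the other three categories."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    return 0 if category == "healthy" else 1


# One binary outcome per category, in the order listed above.
labels = [dichotomize(c) for c in CATEGORIES]
print(labels)  # [0, 1, 1, 1]
```

The binary outcome (0 = healthy, 1 = unhealthy) is the form that a binary logistic regression expects as its response variable.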
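As a small illustration of the adjusted FEV₁ described above, the percentage of the predicted value can be computed as follows. The function name and the example volumes are assumptions made for this sketch; in practice the predicted value is supplied by the spirometer from the patient's physical characteristics.

```python
def percent_predicted(measured_litres: float, predicted_litres: float) -> float:
    """Express a measured FEV1 as a percentage of the spirometer's predicted value."""
    if predicted_litres <= 0:
        raise ValueError("predicted value must be positive")
    return 100.0 * measured_litres / predicted_litres


# Hypothetical patient: 3.0 L measured against 4.0 L predicted.
print(percent_predicted(3.0, 4.0))  # 75.0
```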
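The comparison across training set proportions can be sketched as below. This is not the author's code: it uses only the Python standard library, fits the logistic model by plain gradient descent, runs on synthetic records generated by an invented rule, and assumes that accuracy is measured on the held-out remainder of the data. All function names and parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic(n=320, seed=0):
    """Made-up records: ([age, height, fev_2005, fev_2007], unhealthy flag)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        fev_2005 = rng.gauss(73.0, 20.0)
        fev_2007 = fev_2005 + rng.gauss(9.0, 10.0)
        x = [rng.gauss(43.5, 12.5), rng.gauss(170.0, 7.0), fev_2005, fev_2007]
        # Invented rule: lower FEV1 makes the "unhealthy" label more likely.
        y = 1 if rng.random() < sigmoid(-0.08 * (fev_2007 - 78.0)) else 0
        rows.append((x, y))
    return rows


def fit_scaler(xs):
    """Column means and standard deviations, computed on training rows only."""
    cols = list(zip(*xs))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return means, sds


def scale(x, means, sds):
    return [(v - m) / s for v, m, s in zip(x, means, sds)]


def predict_proba(w, b, x):
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def fit_logistic(xs, ys, lr=0.5, epochs=200):
    """Batch gradient descent on the mean log-loss."""
    w, b, n = [0.0] * len(xs[0]), 0.0, len(xs)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for x, y in zip(xs, ys):
            err = predict_proba(w, b, x) - y
            gb += err
            gw = [g + err * xi for g, xi in zip(gw, x)]
        b -= lr * gb / n
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
    return w, b


def holdout_accuracy(rows, train_fraction, seed=1):
    """Train on a random fraction of the rows; return accuracy on the rest."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    train, test = shuffled[:cut], shuffled[cut:]
    means, sds = fit_scaler([x for x, _ in train])
    w, b = fit_logistic([scale(x, means, sds) for x, _ in train],
                        [y for _, y in train])
    hits = sum((predict_proba(w, b, scale(x, means, sds)) >= 0.5) == (y == 1)
               for x, y in test)
    return hits / len(test)


rows = make_synthetic()
results = {f: holdout_accuracy(rows, f) for f in (0.9, 0.8, 0.7, 0.6, 0.5)}
for fraction, acc in results.items():
    print(f"train fraction {fraction:.1f}: held-out accuracy {acc:.3f}")
```

Because a single logistic unit is the same model family as binary logistic regression, differences between the variants in a setup like this come mainly from which rows are used for fitting, not from the model itself.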
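For reference, the standard binary logistic regression model with the four predictors named above can be written as follows. This is the textbook form, shown under the assumption that the predictors enter linearly; the coefficients β₀ to β₄ are placeholders, and the modified formula and fitted values used in the study are not reproduced here.

```latex
P(Y = 1 \mid X) =
  \frac{1}{1 + \exp\!\left[-\left(\beta_0 + \beta_1 X_1 + \beta_2 X_2
        + \beta_3\,\mathrm{FEV_1}84 + \beta_4\,\mathrm{FEV_1}86\right)\right]}
```

Here Y = 1 denotes an unhealthy individual, and a case is classified as unhealthy when the predicted probability exceeds a chosen cut-off (commonly 0.5).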

Findings
The mean age of the participants was 43.50±12.45 years. All participants were male, 41.6% were under 40, and 58.4% were 40 and above. Unhealthy individuals constituted 46.0% of the sample, while 54.0% were healthy.
The mean FEV₁ value at the first measurement (2005) was 73.10±20.70, and at the second measurement (2007) was 82.10±21.81. After performing significance tests, the final model was obtained as follows (α=0.01):



This model's accuracy was calculated to be 81.6%, meaning that 261 of the 320 cases were correctly predicted. In other words, in these cases the model classified unhealthy individuals as “Unhealthy” and healthy individuals as “Healthy.” The Nagelkerke R² of this model was 0.478; as a check of preconditions, VIF was 1.92 and Tolerance was 0.552.
Subsequently, five machine learning models were developed in Python using libraries such as scikit-learn (sklearn). The main difference among these five models was the proportion of data allocated to the training set. Table 1 shows the accuracy obtained for the different models.
The predictive accuracies of the ML 0.9, ML 0.8, ML 0.7, ML 0.6, ML 0.5, and logistic regression models were 0.813, 0.809, 0.806, 0.813, 0.813, and 0.806, respectively, where the number after "ML" is the proportion of data used for training. The agreement between the logistic regression model and ML 0.9, ML 0.8, ML 0.7, ML 0.6, and ML 0.5 was 98.8%, 98.8%, 98.4%, 98.4%, and 99.6%, respectively. No significant differences were observed between the performance of the machine learning and logistic regression models on data from chemical warfare casualties (Table 1).
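The reported accuracy follows directly from the counts given above, as this short check shows. The function name is an assumption made for the example.

```python
def accuracy(correct: int, total: int) -> float:
    """Fraction of cases whose predicted class matches the observed class."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    return correct / total


acc = accuracy(261, 320)
print(f"{acc:.1%}")  # 81.6%
```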
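Agreement between two models is taken here to mean the share of cases to which both assign the same class, which is an assumption about how the figures above were obtained. A minimal sketch with made-up prediction vectors:

```python
def agreement(pred_a, pred_b) -> float:
    """Share of cases on which two models predict the same class."""
    if len(pred_a) != len(pred_b) or not pred_a:
        raise ValueError("prediction lists must be non-empty and equally long")
    same = sum(a == b for a, b in zip(pred_a, pred_b))
    return same / len(pred_a)


# Hypothetical predictions (1 = unhealthy, 0 = healthy) for eight cases.
lr_pred = [1, 0, 0, 1, 1, 0, 1, 0]
ml_pred = [1, 0, 0, 1, 0, 0, 1, 0]
print(agreement(lr_pred, ml_pred))  # 0.875
```

Note that agreement compares the two models with each other, while accuracy compares each model with the observed outcome, so two models can agree closely and still share the same misclassifications.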

Table 1. Accuracy values calculated for various machine learning models and traditional logistic regression


Discussion
Mustard gas exposure has profound and often long-term effects on the respiratory system. The most common complications include respiratory dysfunction, excessive airway secretions, airway bleeding, perfusion abnormalities, biochemical imbalances, vascular damage, and compromised pulmonary defense mechanisms [4, 21, 22]. Chronic bronchitis, characterized by persistent cough, shortness of breath (dyspnea), and increased sputum production, is reported as the most prevalent condition among affected individuals [4, 21-23]. These findings highlight the critical need for targeted interventions and efficient diagnostic tools in managing respiratory diseases arising from toxic exposures.
Traditional diagnostic methods for respiratory disorders often involve invasive procedures such as bronchoscopy or complex imaging techniques, which, while accurate, are resource-intensive, time-consuming, and ethically challenging, especially in vulnerable populations. For this reason, there is a growing emphasis on developing diagnostic approaches that are less intrusive, more cost-effective, and capable of yielding rapid preliminary results. Such approaches can guide clinicians in identifying high-risk cases early, ensuring that only those individuals undergo more exhaustive and detailed testing.
This study introduces a Logistic Regression model designed to function as a surrogate tool for the preliminary identification of respiratory complications. The model facilitates early risk stratification by leveraging existing patient data and incorporating targeted variables, such as forced expiratory volume in one second (FEV₁) from prior assessments. Specifically, the study recommends using this model with two additional FEV₁ measurements to predict pulmonary disease onset. For individuals identified as high-risk, further diagnostic investigations, such as advanced imaging or pulmonary function testing, can then be pursued.
A growing body of evidence supports the utility of Logistic Regression in clinical diagnostics. A detailed respiratory survey of 40 severely sulfur mustard (SM)-intoxicated Iranian war veterans documented that chronic obstructive pulmonary disease (COPD) was the most common delayed complication, affecting 35% of the cohort. Other complications included bronchiectasis (32.5%), asthma (25%), large airway narrowing (15%), pulmonary fibrosis (7.5%), and simple chronic bronchitis (5%) [22]. These findings underscore the diversity and severity of respiratory sequelae linked to mustard gas exposure, emphasizing the importance of early diagnosis and intervention.
The Logistic Regression model offers an interpretable and computationally efficient alternative to more complex machine learning (ML) algorithms in diagnostic applications. In this study, Logistic Regression demonstrated performance comparable to ML methods when applied to a dataset of 320 observations and four predictor variables. A study involving 12,608 cardiac patients observed that while Logistic Regression and ML methods exhibited similar discrimination capabilities, Logistic Regression outperformed ML algorithms in calibration. Calibration, which reflects the agreement between predicted probabilities and observed outcomes, is particularly critical in clinical settings to ensure the reliability of prognostic models [20]. However, machine learning methods have demonstrated distinct advantages in handling larger datasets and modeling non-linear relationships. For example, a study introduced an innovative ML-based model that outperformed traditional approaches in terms of accuracy while also significantly reducing system complexity [24]. Similarly, the results of another study highlighted the potential of deep learning frameworks for diagnosing chronic respiratory conditions by effectively integrating multimodal data, such as imaging and patient history, to improve diagnostic precision [25, 26].
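To make the notion of calibration concrete, the sketch below groups predicted probabilities into bins and compares the mean prediction in each bin with the observed event rate, and also computes the Brier score. The probabilities and outcomes are invented for illustration, and the function names are assumptions of this example.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """Per non-empty bin: (mean predicted probability, observed rate, count)."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    table = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            rate = sum(y for _, y in members) / len(members)
            table.append((mean_p, rate, len(members)))
    return table


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Made-up predictions and observed outcomes (1 = disease present).
probs = [0.1, 0.1, 0.3, 0.7, 0.9, 0.9]
outcomes = [0, 0, 0, 1, 1, 1]

table = calibration_table(probs, outcomes)
for mean_p, rate, count in table:
    print(f"predicted {mean_p:.2f}  observed {rate:.2f}  n={count}")
print(f"Brier score: {brier_score(probs, outcomes):.4f}")
```

A well-calibrated model shows observed rates close to the mean predicted probability in every bin; two models with the same accuracy can still differ on this measure.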
The selection of diagnostic tools depends significantly on the nature and availability of data. Logistic Regression is particularly well-suited for datasets with fewer predictors and linear relationships, as it provides interpretable insights into variable relationships. On the other hand, ML approaches may be preferable in cases involving high-dimensional datasets or when uncovering complex, non-linear patterns is essential [27, 28].
Finally, Logistic Regression remains a valuable tool for the early detection of pulmonary diseases, particularly in individuals exposed to chemical agents like mustard gas. This approach can serve as an effective starting point for identifying high-risk patients by providing a balance of accuracy, interpretability, and computational simplicity. However, the potential of machine learning, especially as it evolves to address challenges related to calibration and interpretability, should not be overlooked. Future research should explore hybrid approaches that combine traditional statistical methods' strengths with machine learning's advanced capabilities to optimize diagnostic accuracy and efficiency.

Conclusion
Logistic regression is often preferred for its simplicity and interpretability, providing reliable results when statistical assumptions like the linearity of the logit and independence of errors are met. In contrast, machine learning methods excel in complex scenarios involving large datasets or non-linear relationships but may pose challenges in interpretability and computational demands. The choice between these approaches should be guided by data characteristics, research goals, and analytical context, ensuring the most suitable method is selected.

Acknowledgments: The author extends sincere gratitude to the Janbazan Medical and Engineering Research Center for its invaluable support throughout this study. The author also extends profound appreciation to the esteemed veterans whose medical records were made available for this research, with heartfelt wishes for their continued health and well-being.
Ethical Permissions: Since the data for this study were obtained from the archives of health units and pertain to cases involving victims of the Iran-Iraq War (without including any names or personally identifiable information) and because the researcher is conducting this work independently and not as part of a government research project, obtaining an ethics code for the study is neither feasible nor required.
Conflicts of Interests: The author declares no conflicts of interest.
Authors' Contribution: Kavehie B (First Author), Introduction Writer/Methodologist/Main Researcher/Discussion Writer/Statistical Analyst (100%)
Funding/Support: This study did not receive financial support from any organization.

References
1. United Nations Security General. Report of the mission dispatched by the secretary-general to investigate allegations of the use of chemical weapons in the conflict between the Islamic Republic of Iran and Iraq. New York: United Nations; 1988.
2. Prentiss AM. Vesicant agents. In: Chemicals in warfare: A treatise on chemical warfare. London: McGraw-Hill; 1937. p. 177-300.
3. Ghazanfari T, Faghihzadeh S, Aragizadeh H, Soroush MR, Yaraee R, Mohammad Hassan Z, et al. Sardasht-Iran cohort study of chemical warfare victims: Design and methods. Arch Iran Med. 2009;12(1):5-14.
4. Marrs TC, Maynard RL, Sidell FR. Chemical warfare agents: Toxicology and treatment. Hoboken: John Wiley and Sons; 2007. [DOI:10.1002/9780470060032]
5. Somani SM. Chemical warfare agents. San Diego: Academic Press; 1992.
6. Balali-Mood M, Balali-Mood B. Sulphur mustard poisoning and its complications in Iranian veterans. Iran J Med Sci. 2009;34(3):155-71.
7. Dadpey M, Ghahari L. Respiratory complication of Mustard gas in Iraq-Iran war victims living in Kermanshah. Ann Mil Health Sci Res. 2007;5(3):1331-5. [Persian]
8. Balali Mood M, Hefazati M. Acute poisoning with sulfur mustard gas. J Birjand Univ Med Sci. 2004;11(2):9-15. [Persian]
9. Khateri S, Ghanei M, Keshavarz S, Soroush M, Haines D. Incidence of lung, eye and skin lesions as late complications in 34,000 Iranians with wartime exposure to mustard agent. J Occup Environ Med. 2003;45(11):1136-43. [DOI:10.1097/01.jom.0000094993.20914.d1]
10. Balali-Mood M, Hefazi M, Mahmoudi M, Jalali E, Attaran D, Maleki M, et al. Long-term complications of Sulphur mustard poisoning in severely intoxicated Iranian veterans. Fundam Clin Pharmacol. 2005;19(6):713-21. [DOI:10.1111/j.1472-8206.2005.00364.x]
11. Balali-Mood M, Navaeian A. Clinical and paraclinical findings in 233 patients with sulfur mustard poisoning. Proceedings of the Second World Congress on New Compounds in Biological and Chemical Warfare Ghent. Ghent: Ghent University; 1986.
12. Kavehie B, Faghihzadeh S, Eskandari F, Kazemnejad S, Ghazanfari T, Soroosh MR. Studying the surrogate validity of respiratory indexes in predicting the respiratory illnesses in wounded people exposed to sulfur mustard. J Arak Univ Med Sci. 2011;13(4):75-82. [Persian]
13. Erturan AM, Karaduman G, Durmaz H. Machine learning-based approach for efficient prediction of toxicity of chemical gases using feature selection. J Hazard Mater. 2023;455:131616. [DOI:10.1016/j.jhazmat.2023.131616]
14. Zeren J, Hu P, Xu H, Wang Q. Machine learning and deep learning in chemical health and safety: A systematic review of techniques and applications. ACS Chem Health Saf. 2020;27(6):316-34. [DOI:10.1021/acs.chas.0c00075]
15. Wu Y, Wang G. Machine learning based toxicity prediction: From chemical structural description to transcriptome analysis. Int J Mol Sci. 2018;19(8):2358. [DOI:10.3390/ijms19082358]
16. Zhang SQ, Xu LC, Li SW, Oliveira JCA, Li X, Ackermann L, et al. Bridging chemical knowledge and machine learning for performance prediction of organic synthesis. Chemistry. 2023;29(6):e202202834. [DOI:10.1002/chem.202380662]
17. Cova TFGG, Pais AACC. Deep learning for deep chemistry: Optimizing the prediction of chemical patterns. Front Chem. 2019;7:809. [DOI:10.3389/fchem.2019.00809]
18. Russell S, Norvig P. Artificial intelligence: A modern approach. 3rd ed. London: Pearson; 2009.
19. Bishop CM. Pattern recognition and machine learning. New York: Springer; 2006.
20. Jordan MI, Mitchell TM. Machine learning: Trends, perspectives, and prospects. Science. 2015;349(6245):255-60. [DOI:10.1126/science.aaa8415]
21. Mandel M, Gibson WS. Clinical manifestations and treatment of gas poisoning. JAMA. 1917;LXIX(23):1970-1. [DOI:10.1001/jama.1917.25910500001015]
22. Hefazi M, Attaran D, Mahmoudi M, Balali-Mood M. Late respiratory complications of mustard gas poisoning in Iranian veterans. Inhal Toxicol. 2005;17(11):587-92. [DOI:10.1080/08958370591000591]
23. Emad A, Rezaian GR. The diversity of the effects of sulfur mustard gas inhalation on respiratory system 10 years after a single, heavy exposure: Analysis of 197 cases. Chest. 1997;112(3):734-8. [DOI:10.1378/chest.112.3.734]
24. Sabahi H, Vali M, Shafie D. In-hospital mortality prediction model of heart failure patients using imbalanced registry data: A machine learning approach. Sci Iran. 2023. [DOI:10.24200/sci.2023.61637.7412]
25. Evison D, Hinsley D, Rice P. Chemical weapons. BMJ. 2002;324(7333):332-5. [DOI:10.1136/bmj.324.7333.332]
26. Bullman T, Kang H. A fifty years mortality follow-up study of veterans exposed to low level chemical warfare agent, mustard gas. Ann Epidemiol. 2000;10(5):333-8. [DOI:10.1016/S1047-2797(00)00060-0]
27. James G, Witten D, Hastie T, Tibshirani R. An introduction to statistical learning: With applications in R. New York: Springer; 2013. [DOI:10.1007/978-1-4614-7138-7]
28. Murphy KP. Machine learning: A probabilistic perspective (adaptive computation and machine learning series). Cambridge: The MIT Press; 2012.
