Evaluating the Generalizability of Support Vector Machine for Breast Cancer Detection

Authors

  • Ezekiel Olorunshola Oluwaseyi, Department of Computer Science, Faculty of Computing, Air Force Institute of Technology, Kaduna, Nigeria
  • Dominic Ebuka Okeh, Department of Computer Science, Faculty of Computing, Air Force Institute of Technology, Kaduna, Nigeria
  • Kolade Ademuwagun Adeniran, Department of Computer Science, Faculty of Computing, Air Force Institute of Technology, Kaduna, Nigeria

Keywords:

Malignant, Benign, Support Vector Machine, Classifiers, Evaluation Metrics, Breast Cancer

Abstract

Breast cancer arises from abnormal cell growth in the breast. Early detection is crucial for successful treatment, which makes accurate detection methods essential. Machine learning models, particularly Support Vector Machines (SVMs), have shown promise. However, concerns remain about how well they generalize across real-world scenarios with varying software environments and data processing techniques. This research investigates that gap by comparing SVM performance with that of other classifiers such as Naïve Bayes, Random Forest, Multilayer Perceptron and Decision Tree. The classifiers were tested on the Wisconsin Breast Cancer dataset using both the Waikato Environment for Knowledge Analysis (WEKA) and Jupyter Notebook. The study recorded performance metrics such as accuracy, precision, recall and F1 score. In WEKA, Support Vector Machine achieved the highest accuracy under both 10-fold cross-validation (0.981) and the 70% split (0.977); Multilayer Perceptron matched it at 0.977 under the 70% split. In Jupyter Notebook, Support Vector Machine again achieved the highest accuracy under the 70% split (0.99). Under 10-fold cross-validation, however, Random Forest achieved the highest accuracy (0.97), closely followed by Support Vector Machine (0.96).
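The two evaluation protocols and the four metrics named in the abstract can be illustrated with a minimal plain-Python sketch. This is not the authors' code: the function names (binary_metrics, holdout_split, k_fold_splits) and the toy labels are hypothetical, the positive class is assumed to be "malignant" encoded as 1, and the splits are taken in index order for brevity, whereas a real experiment would normally shuffle or stratify the data first.

```python
from typing import Dict, List, Sequence, Tuple


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def holdout_split(n: int, train_fraction: float = 0.7) -> Tuple[List[int], List[int]]:
    """Single percentage split: the first 70% of indices train, the rest test."""
    cut = round(n * train_fraction)
    indices = list(range(n))
    return indices[:cut], indices[cut:]


def k_fold_splits(n: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """k-fold cross-validation: every sample is held out exactly once."""
    indices = list(range(n))
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for fold in folds:
        held_out = set(fold)
        train = [j for j in indices if j not in held_out]
        splits.append((train, fold))
    return splits


# Toy labels (1 = malignant, 0 = benign); these are illustrative, not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
scores = binary_metrics(y_true, y_pred)

train_idx, test_idx = holdout_split(20)   # one 70/30 split
cv_splits = k_fold_splits(20, k=10)       # ten train/test pairs
```

Under the percentage split a model is scored once on the held-out 30%, while under 10-fold cross-validation it is trained and scored ten times and the per-fold scores are typically averaged. That difference is one plausible reason the ranking of classifiers can change between the two protocols.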

Published

2024-11-26

How to Cite

Oluwaseyi, E. O., Okeh, D. E., & Adeniran, K. A. (2024). Evaluating the Generalizability of Support Vector Machine for Breast Cancer Detection. Journal of Computing and Social Informatics, 4(1), 1–10. Retrieved from https://publisher.unimas.my/ojs/index.php/jcsi/article/view/7461