Updates README.md discussion and conclusion section

GhostCat12 · Dec 17, 2023 · 9f8d5ef
1 parent 105aca0
commit 9f8d5ef
Showing 1 changed file with 9 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -153,6 +153,15 @@ A sensitivity analysis was carried out to reduce the uncertainty surrounding bot
 Overall, BI-RADS and Margin are key factors in prediction, whereas Density and Shape are not. The case for Age cannot be ascertained as it improves one model whilst deteriorating the other. However, the insight given from this report can only be validated by cross-checking with other datasets, as the dataset used is relatively small and could contain biases. It should be noted that the RF method does create a new forest when altering the data (despite setting the forest as replicable), thus leading to minor deviations in prediction capabilities, nevertheless still included within this sensitivity analysis.
 
 ## 4. Discussion and Conclusion
+The mammographic masses (MM) dataset has been used to design other machine learning models. One research group implemented a Naïve Bayes model on various datasets, including MM [12]; it achieved an AUC of 0.90, compared to 0.891 (KNN) and 0.865 (RF) in this report. Unfortunately, no other metrics were provided for further comparison. Another research group employed artificial neural networks (ANN) and decision trees, with the decision trees performing marginally worse (AUC of 0.87 vs 0.88 for the ANN) [13]. The KNN model performed better than both; however, the RF only performed better than the decision trees. Both studies stated that further testing and validation were required.
+
+Although RF and KNN did not perform as well as expected, many other models performed well and, with better-optimised parameters, could improve diagnostic accuracy. For example, using Orange to implement Naïve Bayes on the pre-processed data gave AUC=0.902 and F1=0.835. With little effort spent on hyperparameter optimisation, ANN gave AUC=0.899; with further tuning this could improve considerably and potentially outperform KNN.
+
+To further improve the methodology, a separate dataset would help reduce uncertainties. The MM dataset was collected in Germany; to ensure the models work equally well elsewhere, they would need to be tested on datasets from other countries. Furthermore, because the MM dataset is retrospective, validating the models for clinical use would require running them on real-time data; this would also reveal whether the models were over-fitted.
+
+Overall, the computer-aided diagnosis approach can improve breast cancer screening by reducing false-positive (FP) rates. Realistically, the models would be used in conjunction with the expertise of a doctor, who can consider the complete case history when deciding on further action after screening.
+
+The RF and KNN models achieved F1 scores of 0.839 and 0.805. This was not expected for RF, as it is tailored towards classification predictions; it may perform better with a larger dataset and better optimisation. KNN, which performed better, would have too high a computational cost on a large dataset to be used clinically. Testing these methods against another dataset would allow us to determine whether these results are accurate and could suggest ways to improve the accuracy of both models. There are many other robust models for diagnosing breast cancer, such as ANN and Naïve Bayes, which need further validation and hold potential for clinical application.
 
 
 ## 5. References