Malaria Detection using Microscopic Image Analysis: A Convolution Neural Network Based Approach

Abstract: Malaria is a potentially fatal disease caused by Plasmodium parasites. These parasites are transmitted to humans through the bites of female Anopheles mosquitoes, which act as the disease vector. Five Plasmodium species cause malaria: P. falciparum, P. vivax, P. ovale, P. knowlesi, and P. malariae. Among these, P. falciparum and P. vivax are particularly lethal to humans. Early detection of malaria is therefore essential to avoid loss of human life. Various automatic and semi-automatic malaria detection techniques are available in the literature, and they reduce the chance of human error in the diagnosis of malaria. In recent years, deep learning-based methods have proven effective for object detection and have therefore attracted the attention of researchers for detecting malarial parasites in human blood. In this paper, we propose a Convolutional Neural Network (CNN) model that detects malarial parasites in microscopic images of human blood samples with high accuracy. The proposed model comprises 15 layers: 8 convolution layers with the ReLU activation function, 4 max-pooling layers, 1 flattening layer, and 2 fully connected layers. The proposed method has been evaluated against existing state-of-the-art methods using various statistical measures, and the quantitative results show its effectiveness. It achieves 97.42% testing accuracy, 97.42% sensitivity, 97.41% specificity, 97.70% precision, 97.42% recall, 97.97% F1-score, 97.41% Area Under Curve (AUC), and a 94.82% Matthews correlation coefficient.


I. INTRODUCTION
According to the World Health Organization (WHO), 228 million cases of malaria were reported worldwide in 2018 [1]. The early symptoms of malaria resemble those of flu and start within a few days of the mosquito bite [3]. Early treatment is essential for people infected with malaria to avoid further spread. To diagnose malaria, pathologists place a drop of blood on a glass slide and count the infected cells under a microscope [3]. Building a system that reduces the dependency on skilled pathologists and replicates human performance in malaria detection requires feature extraction and the selection of the algorithm best suited to those features [4]. A recent study states that a Convolutional Neural Network (CNN) can learn features by itself, much like human perception. This automatic feature extraction yields strong results for image classification [3]. Deep learning (DL) models have proven their efficacy on medical images [6]. In recent years, DL models such as CNN and VGGNet have achieved higher accuracy in image classification. Automated detection of malaria includes dataset acquisition, data preprocessing, and feature extraction; conventional machine-learning techniques need human input for these steps, while DL models such as CNN, ResNet, and VGGNet extract the features automatically. DL models show good results in malaria parasite detection, although this field also requires attention to the security and privacy of patient data [9]. Our proposed CNN model achieves better results in accuracy, sensitivity, specificity, precision, recall, F-score, Area Under Curve (AUC), and Matthews Correlation Coefficient (MCC). Moreover, the proposed method performs strongly when compared with the existing state-of-the-art models [2] [7]. Figure 1 shows the general block diagram of malaria detection. Such techniques consist of five stages: (i) data acquisition, (ii) data preprocessing, (iii) segmentation, (iv) feature extraction, and (v) classification of data [5]. The rest of the paper is organized as follows. Section 2 reviews related work on the diagnosis and classification of the malaria parasite. Section 3 describes the proposed method and the data collection process. Section 4 presents and discusses the experimental results. Finally, Section 5 concludes the paper.

II. RELATED WORK
CNN-based deep learning is the current state-of-the-art technique for detection, classification, and segmentation [6]. Rajaraman et al. [2] proposed a 16-layer CNN model that achieves 95.46% accuracy in malaria detection. Vijayalakshmi et al. [4] proposed a transfer-learning system for malaria-affected cells that combines the VGG19 network with a support vector machine (SVM). The VGG-SVM model was compared with LeNet-5, AlexNet, and GoogleNet and achieved better accuracy (93.13%) than all of them. Liang et al. [7] proposed a 16-layer CNN architecture with an accuracy of 97.37%. Hung et al. [10] trained the Faster R-CNN model on P. vivax images to detect objects in malaria images. They applied a Region Proposal Network (RPN) based CNN as a first stage for binary classification of red blood cell (RBC) and non-RBC objects, and AlexNet as a second stage on the non-RBC objects for detection of further classes. Narayanan et al. [11] proposed a CNN model and compared its performance with transfer-learning models, i.e., ResNet, VGG19, VGG16, and DenseNet. Reddy et al. [12] trained ResNet-50 on 27,558 P. falciparum images for malaria classification and reported 95.91% training accuracy and 95.4% validation accuracy. Prasad et al. [15] developed a decision support system to detect malaria from thin blood smear images; the study included 200 images of P. vivax and P. falciparum. Bibin et al. [16] applied a Deep Belief Network to malaria detection for the first time, using 4,100 infected and non-infected images. Timely and accurate detection of malaria parasites is important, but effective detection requires expert technicians. The manual process is tedious and error-prone because it is subjective, and errors may lead to incorrect medication or even the death of the patient [16]. Dong et al. [17] applied DL techniques to automate malaria diagnosis. Three CNN models, LeNet, AlexNet, and GoogleNet, were used to classify malaria images and were compared with an SVM. The deep learning models reached an accuracy of 95%, higher than the 92% obtained by the SVM. Pamungkas et al. [19] used a decision tree algorithm for malaria classification and obtained 87.67% accuracy. Razzak et al. [26] used an artificial neural network (ANN) for malaria classification on a dataset of P. falciparum, P. vivax, P. malariae, and P. ovale. Suryawanshi et al. [29] used an SVM and a Euclidean distance classifier; the SVM gave 93.33% accuracy and the Euclidean distance classifier 83.33% accuracy in malaria classification. Sharma et al. [33] predicted malaria using SVM and ANN on 1,680 samples; the SVM gave higher accuracy and earlier detection of malaria than the ANN.
Go et al. [34] used six machine learning techniques for malaria classification and reported that SVM gave the best result, with an accuracy of 96.78%, compared with logistic regression, decision tree, k-nearest neighbor (KNN), linear discriminant classifier, and quadratic discriminant classifier. For this purpose, 280 infected images were used and classification was performed with 10-fold cross-validation. Park et al. [35] identified the schizont and trophozoite stages of P. falciparum using Linear Discriminant Classification (LDC), Logistic Regression (LR), and KNN. LR is based on the maximum likelihood method, while KNN finds the nearest neighboring points and assigns the class held by the majority of them. A total of 1,237 infected and uninfected P. falciparum images were used for classification, with an equal dataset split.
Rosado et al. [36] used an SVM classifier and achieved 98.2% sensitivity and 72.1% specificity in identifying white blood cells (WBC) among Giemsa-stained blood cells, and 80.5% sensitivity and 93.8% specificity in identifying the trophozoite stage of P. falciparum. Jagtap et al. [37] proposed a machine learning model that achieved higher accuracy than the SVM and Bayesian models considered in their study. Savkare et al. [38] used an SVM to classify blood cells as infected or non-infected and achieved 96.42% accuracy. Linear, polynomial, and RBF kernels were tested, and the RBF kernel gave higher accuracy than the other SVM kernels in identifying stages on 71 images. Pandit et al. [39] used an ANN for malaria detection and obtained 90% accuracy. Table I summarizes the existing machine learning and deep learning techniques used for malaria detection.
Our proposed CNN architecture achieves higher accuracy, sensitivity, specificity, precision, recall, F-score, AUC, and MCC than the transfer-learning model.

III. PROPOSED METHOD
The proposed method comprises several steps: (1) data collection, (2) pre-processing, (3) the proposed model, (4) splitting the dataset, (5) training the model, and (6) testing the proposed model; the last part describes the CNN architecture. Each step is explained below. Figure 2 shows the malaria parasite types and their species, Figure 3 shows the CNN architecture for malaria detection, and Figure 4 shows the training and validation accuracy and loss.

A. DATA COLLECTION
The dataset was collected from a GitHub repository [21] and Kaggle [22]; it contains 34,625 microscopic images. Figure 2 shows the different types of malaria parasite and their species. Earlier deep learning models for malaria detection [2] [7] [12] were evaluated on a dataset of 27,558 images, whereas our proposed 15-layer CNN model uses 34,625 microscopic images and shows better results. The GitHub repository [21] contains the malaria Plasmodium types, i.e., falciparum, vivax, and ovale parasites, while Kaggle [22] contains all species of malaria, namely ring, leukocyte, gametocyte, trophozoite, and schizont.

FIGURE 2. Malaria parasite types are shown in the first row (falciparum, vivax, ovale, malariae) and their species in the second row (ring, trophozoite, schizont, gametocyte, and leukocyte).

B. PRE-PROCESSING
To preprocess the dataset, we resize each image to 64x64 pixels for training and testing.
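The resizing step can be illustrated with a minimal sketch. The text does not state which library or interpolation method was used, so nearest-neighbour sampling on a nested-list image is an assumption made here purely for illustration, and the function name `resize_nearest` and the synthetic sample image are hypothetical.

```python
def resize_nearest(image, out_h=64, out_w=64):
    """Resize a row-major image (list of rows of pixels) to out_h x out_w.

    ASSUMPTION: nearest-neighbour sampling; the interpolation method
    actually used in the study is not stated.
    """
    in_h, in_w = len(image), len(image[0])
    resized = []
    for y in range(out_h):
        # Map each output row/column back to the closest source row/column.
        src_y = min(in_h - 1, y * in_h // out_h)
        row = [image[src_y][min(in_w - 1, x * in_w // out_w)] for x in range(out_w)]
        resized.append(row)
    return resized


# A synthetic 128x96 grayscale image stands in for a microscopic blood image.
sample = [[(r + c) % 256 for c in range(96)] for r in range(128)]
small = resize_nearest(sample)
print(len(small), len(small[0]))  # 64 64
```

In practice an image library would normally perform this step; the sketch only shows that every image ends up with the same 64x64 shape before it reaches the network.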

C. PROPOSED MODEL
The proposed model comprises eight convolutional layers, four pooling layers, one flatten layer, and two fully connected layers. The layers are arranged so that every two convolutional layers are followed by a pooling layer; the flatten layer and the two fully connected layers come at the end of the network. The network has 848,018 parameters, all of which are trained. The batch size is 86, and the number of steps per epoch is obtained by dividing the total number of training examples by 86. All filters have a size of 3 x 3. The number of filters starts at 16 and ends at 128: two layers with 16 filters, two with 32, two with 64, and two with 128. ReLU [13] is used as the activation function, and batch normalization [14] is employed in the proposed model to avoid the vanishing gradient problem. Figure 3 shows the 15-layer CNN architecture for malaria detection.

D. IMPLEMENTATION AND TRAINING
To implement the proposed model, we used Python 3.6.9, Keras 1.14.0, and TensorFlow 2.3.1 on Ubuntu 18.04 with Jupyter Notebook and 8 GB of RAM. We used 70% of the dataset for training, 15% for validation, and 15% for testing. Softmax is used as the activation function on the fully connected layer. Categorical cross-entropy (categorical_crossentropy) is used as the loss function for the two class labels. Adam is used as the optimizer to speed up the training process. The number of epochs is 30.
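The layer ordering and the training schedule described above can be traced with a short sketch. It is not the authors' implementation: the padding mode ('same'), the 2x2 pooling window, and rounding the steps per epoch upward are assumptions, the width of the hidden fully connected layer is not given in the text and is left unspecified, and all function names are hypothetical.

```python
import math

# Filter counts as stated in the text: two layers each with 16, 32, 64, 128 filters.
FILTERS = [16, 16, 32, 32, 64, 64, 128, 128]


def build_layer_plan(filters=FILTERS):
    """Return the ordered layer list: one max-pooling layer after every two convolutions."""
    plan = []
    for index, n_filters in enumerate(filters, start=1):
        plan.append(("conv3x3_relu", n_filters))
        if index % 2 == 0:
            plan.append(("maxpool", None))
    plan.append(("flatten", None))
    plan.append(("dense", None))         # hidden width is not given in the text
    plan.append(("dense_softmax", 2))    # two output classes
    return plan


def side_after_pooling(plan, side=64, pool=2):
    """Feature-map side length after all pooling layers.

    ASSUMPTION: 'same' padding (convolutions keep the size) and 2x2 pooling.
    """
    for kind, _ in plan:
        if kind == "maxpool":
            side //= pool
    return side


def split_and_steps(n_images=34625, batch_size=86, train_pct=70, val_pct=15):
    """Sizes of the 70/15/15 split and steps per epoch (ASSUMPTION: rounded up)."""
    n_train = n_images * train_pct // 100
    n_val = n_images * val_pct // 100
    n_test = n_images - n_train - n_val
    steps_per_epoch = math.ceil(n_train / batch_size)
    return n_train, n_val, n_test, steps_per_epoch


plan = build_layer_plan()
print(len(plan))                 # 15 layers in total
print(side_after_pooling(plan))  # 4 under the stated assumptions
print(split_and_steps())         # (24237, 5193, 5195, 282)
```

Under these assumptions a 64x64 input is reduced to 4x4 feature maps with 128 channels before flattening. Because the padding, the hidden layer width, and the placement of batch normalization are not specified, the sketch does not attempt to reproduce the reported parameter count.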

IV. RESULTS & DISCUSSION
The results of our model demonstrate its accuracy for malaria detection using microscopic images. The proposed CNN model achieves 97.42% accuracy. Table II shows that the proposed model has higher accuracy in malaria detection than existing image processing, machine learning, and deep learning techniques. Previously, a 16-layer model [2] was proposed using a dataset of 27,558 falciparum images, whereas in our study we used a dataset of 34,625 parasite images that contains all four Plasmodium types and their species, as shown in Fig. 2. The computational cost is lower than in previous work [2], and the accuracy of detecting malaria is high, as shown in Fig. 4. Table III compares the proposed CNN model with the transfer-learning model on the same dataset; the proposed CNN model achieves better results in accuracy, specificity, sensitivity, recall, precision, F-score, MCC, and AUC. Figure 4 shows the training accuracy, validation accuracy, training loss, and validation loss, and Table IV lists the training and validation accuracy and loss. Figure 4 shows that the proposed model achieves 97.42% accuracy for malaria detection with a loss lower than 1%.

V. CONCLUSIONS
Malaria is among the dangerous epidemic diseases, so there is a need for an intelligent system that detects it accurately and efficiently. In this study, a 15-layer CNN model was presented for malaria detection. The study shows that the proposed deep learning-based model achieves better results in diagnosing malaria. Furthermore, to evaluate the performance of the proposed model, a transfer-learning model was also trained on the same dataset, and the CNN model achieved better results in terms of accuracy, sensitivity, specificity, precision, recall, F-score, and AUC, all above 97%, and an MCC of 94.82%. It also performs better than existing state-of-the-art models such as VGG19, ResNet-50, LeNet, AlexNet, and GoogleNet. More than 34,000 images are used in this study, and the lower complexity of the model reduces the computational cost.
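For reference, the evaluation measures reported in Section IV can be computed from the four counts of a binary confusion matrix, as in the sketch below. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper's results; the sketch also assumes both classes are present so that no denominator is zero. AUC is omitted because it requires the predicted scores rather than a single confusion matrix.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard measures for a two-class problem (infected = positive class)."""
    sensitivity = tp / (tp + fn)   # identical to recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    # Matthews correlation coefficient; defined as 0 when the denominator vanishes.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_den if mcc_den else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "recall": sensitivity,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the formulas.
for name, value in binary_metrics(tp=90, fp=20, tn=80, fn=10).items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and recall are the same quantity, which is consistent with the identical values reported for both in this study.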