Deep Hybrid Learning for Radiographic Image Analysis
Deep Learning algorithms are powerful and effective for image analysis. Complex Deep Neural Network architectures like DenseNet-201 have been extremely powerful for classifying complicated medical images such as radiographic (X-ray) images. As discussed in my Medium article, the main reason Deep Learning approaches succeed is that deep learning removes the need for manual feature engineering and can learn features on its own. Feature engineering for complicated medical images like X-rays is difficult, so classical machine learning generally fails to outperform deep learning methods.
But can we combine the benefits of both deep learning and machine learning and create a fusion network that might perform better than either of them?
Actually, we can! We will call this fusion network a Deep Hybrid Network (DHN), and the approach to creating it Deep Hybrid Learning (DHL)!
For more details on DHL, please take a look at the Medium article. But overall, in DHL, we use a Deep Neural Network to generate features from radiographic images and then perform the classification task with a classical Machine Learning classifier.
As we can see from the above figure, this is a two-step process. In the first step, we use a Deep Neural Network to extract features from the raw image dataset, which removes the need for manual feature engineering. In the second step, we run a machine learning classification algorithm on these extracted features, which replaces the fully connected classification layers of a typical Deep Learning model.
How to implement a DHN?
In this section, we will use Keras, TensorFlow and Scikit-Learn to implement a Deep Hybrid Network. The following code snippet shows how to load the data:

    from keras.preprocessing import image

    # `data` (a table with a 'Path' column) and `orig_path` (the path prefix
    # to strip) are assumed to be defined earlier.
    opath_len = len(orig_path)
    path = '../data'
    train_image = []
    added_features = []
    for image_path in data['Path'][:20000]:
        # Load each image resized to 250x250 (RGB by default)
        img = image.load_img(path + image_path[opath_len:], target_size=(250, 250))
        img = image.img_to_array(img)
        img = img / 255  # scale pixel values to [0, 1]
        train_image.append(img)
Now, we will see what our simple DNN looks like:

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
    from keras.callbacks import EarlyStopping, ReduceLROnPlateau

    num_classes = 2

    ## CNN model - Deep Neural Network code
    model_LN = Sequential()
    model_LN.add(Conv2D(filters=16, kernel_size=(5, 5), activation='relu', input_shape=(250, 250, 3)))
    model_LN.add(MaxPooling2D(pool_size=(2, 2)))
    model_LN.add(Dropout(0.25))
    model_LN.add(Conv2D(filters=32, kernel_size=(5, 5), activation='relu'))
    model_LN.add(MaxPooling2D(pool_size=(2, 2)))
    model_LN.add(Dropout(0.25))
    model_LN.add(Conv2D(filters=64, kernel_size=(5, 5), activation='relu'))
    model_LN.add(MaxPooling2D(pool_size=(2, 2)))
    model_LN.add(Dropout(0.25))
    model_LN.add(Conv2D(filters=64, kernel_size=(5, 5), activation='relu'))
    model_LN.add(MaxPooling2D(pool_size=(2, 2)))
    model_LN.add(Dropout(0.25))

    # Fully connected classification head
    model_LN.add(Flatten())
    model_LN.add(Dense(128, activation='relu'))
    model_LN.add(Dropout(0.5))
    model_LN.add(Dense(64, activation='relu'))
    model_LN.add(Dropout(0.5))
    #model.add(Dense(num_classes, activation='sigmoid'))
    model_LN.add(Dense(num_classes, activation='softmax'))

    model_LN.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

    # Stop early and lower the learning rate when validation loss plateaus
    early_stop = EarlyStopping(monitor='val_loss', patience=8, verbose=1, min_delta=1e-4)
    reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=4, verbose=1, min_delta=1e-4)
    callbacks_list = [early_stop, reduce_lr]
But for deep hybrid learning, as we have said, we will not have the fully connected layers of the DNN model, so this is what it will look like:

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten

    num_classes = 2

    ## Deep Hybrid Network's DNN layers, which will be used for feature extraction
    conv_base = Sequential()
    conv_base.add(Conv2D(filters=16, kernel_size=(5, 5), activation='relu', input_shape=(250, 250, 3)))
    conv_base.add(MaxPooling2D(pool_size=(2, 2)))
    conv_base.add(Dropout(0.25))
    conv_base.add(Conv2D(filters=32, kernel_size=(5, 5), activation='relu'))
    conv_base.add(MaxPooling2D(pool_size=(2, 2)))
    conv_base.add(Dropout(0.25))
    conv_base.add(Conv2D(filters=64, kernel_size=(5, 5), activation='relu'))
    conv_base.add(MaxPooling2D(pool_size=(2, 2)))
    conv_base.add(Dropout(0.25))
    conv_base.add(Conv2D(filters=64, kernel_size=(5, 5), activation='relu'))
    conv_base.add(MaxPooling2D(pool_size=(2, 2)))
    conv_base.add(Dropout(0.25))
    conv_base.add(Flatten())  # no Dense layers: the output is the feature vector
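To size the array that will hold the extracted features, it helps to know how many values this `conv_base` produces per image. The following is a minimal sketch of that arithmetic, assuming the Keras defaults used above (`padding='valid'` and stride 1 for `Conv2D`, and a pooling stride equal to `pool_size`). The helper names `conv_out` and `pool_out` are hypothetical and are not part of Keras.

```python
def conv_out(size, kernel):
    """Output width of a 'valid' convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output width of a 'valid' max-pooling layer whose stride equals its pool size."""
    return size // pool


size = 250  # input height/width used throughout the article
for _ in range(4):  # four Conv2D(5x5) + MaxPooling2D(2x2) stages
    size = pool_out(conv_out(size, 5), 2)

last_filters = 64  # filters in the final Conv2D layer
flat_features = size * size * last_filters
print(size, flat_features)  # 11 7744
```

Under these assumptions, the custom `conv_base` yields 7,744 values per image after `Flatten()`, so a feature buffer for it would have shape `(sample_count, 7744)`.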
A DHN can also use a pre-trained network, as in transfer learning. The code below shows how we can use a DenseNet model for this purpose:

    from keras.applications import DenseNet121

    # include_top=False drops the ImageNet classification head
    conv_base = DenseNet121(include_top=False, weights='imagenet', input_shape=(250, 250, 3))

Now, let's see how to extract features using a DNN.

    import numpy as np
    from keras.preprocessing.image import ImageDataGenerator

    # If the images were already divided by 255 when loaded, drop `rescale`
    # here so they are not scaled twice.
    datagen = ImageDataGenerator(rescale=1./255)

    # Feature extraction
    def extract_features(X_train, y_train, sample_count):
        batch_size = 32
        # (7, 7, 1024) is the DenseNet121 output shape for 250x250 inputs
        features = np.zeros(shape=(sample_count, 7, 7, 1024))
        labels = np.zeros(shape=(sample_count))
        # fit() only matters for featurewise statistics; it is harmless here
        datagen.fit(X_train)
        # flow() shuffles by default, so use the returned labels, not y_train
        generator = datagen.flow(X_train, y_train, batch_size=batch_size)
        i = 0
        for inputs_batch, labels_batch in generator:
            features_batch = conv_base.predict(inputs_batch)
            features[i * batch_size : (i + 1) * batch_size] = features_batch
            # ravel() accepts labels of shape (n,) or (n, 1)
            labels[i * batch_size : (i + 1) * batch_size] = np.ravel(labels_batch)
            i += 1
            if i * batch_size >= sample_count:
                break
        return features, labels
    train_features, train_labels = extract_features(xTrain, yTrain, xTrain.shape[0])
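One practical detail in batch-wise feature extraction is that labels may arrive as a column vector of shape `(n, 1)`, while the label buffer is one-dimensional. The following is a minimal sketch of an order-preserving alternative that collects batches in a list and flattens the labels. The names `collect_features` and `fake_extractor` are hypothetical; `fake_extractor` only stands in for `conv_base.predict` so the example runs without a trained network.

```python
import numpy as np


def collect_features(predict_fn, X, y, batch_size=32):
    """Run predict_fn over X in order and stack the outputs with 1-D labels."""
    feature_batches, label_batches = [], []
    for start in range(0, len(X), batch_size):
        batch = X[start:start + batch_size]
        feature_batches.append(np.asarray(predict_fn(batch)))
        # ravel() turns labels of shape (n, 1) into shape (n,)
        label_batches.append(np.ravel(y[start:start + batch_size]))
    return np.concatenate(feature_batches), np.concatenate(label_batches)


def fake_extractor(batch):
    """Hypothetical stand-in for conv_base.predict: mean and max per image."""
    flat = batch.reshape(len(batch), -1)
    return np.stack([flat.mean(axis=1), flat.max(axis=1)], axis=1)


rng = np.random.default_rng(0)
X_demo = rng.random((70, 8, 8, 3))         # 70 tiny fake images
y_demo = rng.integers(0, 2, size=(70, 1))  # column-vector labels

features, labels = collect_features(fake_extractor, X_demo, y_demo)
print(features.shape, labels.shape)  # (70, 2) (70,)
```

Because the batches are concatenated rather than written into a pre-sized buffer, this version does not need to know the feature shape in advance, and the last, smaller batch needs no special handling.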
Finally, we will fit a Machine Learning classification algorithm on the extracted features. We will see how to apply Random Forest, AdaBoost and XGBoost to them.
Random Forest :
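    from sklearn.ensemble import RandomForestClassifier

    # Flatten the extracted features to 2-D; test_features is assumed to come from extract_features() on the test split
    train_X = train_features.reshape(len(train_features), -1)
    test_X = test_features.reshape(len(test_features), -1)

    hybrid_model_RF = RandomForestClassifier()
    hybrid_model_RF.fit(train_X, train_labels)
    predicted_test_labels = hybrid_model_RF.predict(test_X)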
AdaBoost :
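    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.tree import DecisionTreeClassifier

    # Flatten the extracted features to 2-D; test_features is assumed to come from extract_features() on the test split
    train_X = train_features.reshape(len(train_features), -1)
    test_X = test_features.reshape(len(test_features), -1)

    hybrid_model_AB = AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=3),
        n_estimators=10
    )
    hybrid_model_AB.fit(train_X, train_labels)
    predicted_test_labels = hybrid_model_AB.predict(test_X)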
XGBoost :
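    from xgboost import XGBClassifier

    # Flatten the extracted features to 2-D; test_features is assumed to come from extract_features() on the test split
    train_X = train_features.reshape(len(train_features), -1)
    test_X = test_features.reshape(len(test_features), -1)

    hybrid_model_XG = XGBClassifier()
    hybrid_model_XG.fit(train_X, train_labels)
    predicted_test_labels = hybrid_model_XG.predict(test_X)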
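To choose between classifiers like these, one option is to compare them on the same flattened features with cross-validation. The following is a minimal sketch under stated assumptions: the feature maps are synthetic random arrays rather than real CNN output, and the shapes, seeds and `n_estimators` values are arbitrary choices for illustration. XGBoost is left out only to keep the example dependent on Scikit-Learn alone.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)

# Synthetic stand-in for CNN feature maps: 200 samples of shape (4, 4, 8).
labels = rng.integers(0, 2, size=200)
feature_maps = rng.normal(size=(200, 4, 4, 8))
feature_maps[labels == 1] += 0.5  # shift class 1 so the classes are separable

# Classical classifiers expect a 2-D matrix: one row per sample.
X = feature_maps.reshape(len(feature_maps), -1)

candidates = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "adaboost": AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=3), n_estimators=10, random_state=0
    ),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {
    name: cross_val_score(model, X, labels, cv=5).mean()
    for name, model in candidates.items()
}
for name, score in scores.items():
    print(f"{name}: mean CV accuracy = {score:.3f}")
```

With real data, `X` would be the flattened output of the feature extractor and `labels` the labels returned alongside it.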
And finally, as discussed in my Medium article, this is what the final model architecture looks like:
In the Medium article, I talked about model evaluation. Below is the code to perform model evaluation using the various metrics discussed in the article:

    import numpy as np
    from sklearn import metrics
    import matplotlib.pyplot as plt
    from sklearn.preprocessing import LabelBinarizer
    import seaborn as sns

    def multiclass_roc_auc_score(y_test, y_pred, average="macro"):
        # Binarize the labels so roc_auc_score can handle more than two classes
        lb = LabelBinarizer()
        lb.fit(y_test)
        y_test = lb.transform(y_test)
        y_pred = lb.transform(y_pred)
        return metrics.roc_auc_score(y_test, y_pred, average=average)

    def evaluate_model_performance_valid_data(y_orig, y_pred):
        accuracy = metrics.accuracy_score(y_orig, y_pred)
        print('Accuracy is:', accuracy)
        # For precision of each class individually use average=None
        precision = metrics.precision_score(y_orig, y_pred, average='macro')
        print('Precision is:', precision)
        # For recall of each class individually use average=None
        recall = metrics.recall_score(y_orig, y_pred, average='macro')
        print('Recall is:', recall)
        # For F1 score of each class individually use average=None
        f1_score_value = metrics.f1_score(y_orig, y_pred, average='macro')
        print('F1 Score is:', f1_score_value)
        # AUC score
        auc_score = multiclass_roc_auc_score(y_orig, y_pred)
        print('AUC score is:', auc_score)
        # Confusion matrix, printed and plotted as a heatmap
        cm = metrics.confusion_matrix(y_orig, y_pred)
        print(cm)
        plt.figure(figsize=(5, 5))
        sns.heatmap(cm, annot=True, fmt="d", linewidths=.5, square=True, cmap='Blues')
        plt.ylabel('Actual label')
        plt.xlabel('Predicted label')
        all_sample_title = 'AUC Score: {:.2f}'.format(auc_score)
        plt.title(all_sample_title, size=12)
        plt.show()
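The helper above computes AUC from predicted class labels. When the classifier exposes `predict_proba`, the score can instead be computed from class probabilities, which is how ROC AUC is usually defined. The following is a minimal sketch comparing the two on synthetic binary data. The dataset from `make_classification` and the Random Forest settings are assumptions for illustration only, not the article's data or results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary data standing in for extracted features.
X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_train, y_train)

# AUC from hard 0/1 predictions: a single operating point.
auc_from_labels = roc_auc_score(y_test, clf.predict(X_test))

# AUC from class-1 probabilities: uses the full ranking of test samples.
auc_from_scores = roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1])

print(f"AUC from labels: {auc_from_labels:.3f}")
print(f"AUC from scores: {auc_from_scores:.3f}")
```

The two numbers generally differ, so it is worth stating which variant is reported when comparing models.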
And this is what the result will look like:
I have seen remarkable results so far using this approach. I encourage all readers to try it out and let me know whether they see an improvement in both execution time and model performance.
With this, we come to the end of this article. Please comment or connect with me to share your feedback or suggestions!
25 Responses
Hey! Thanks for this explanation. Can I get the complete code?
If you could provide it, it would be very helpful to me.
Thanks
Hello Renu,
I think the snippets of code are already included in the article. You would just have to put everything together and run your own analysis. There is a similar work for which the code is well documented. Please feel free to refer to this: https://github.com/adib0073/NUMODRIL
Best Regards,
Aditya
Hi,
Thank you for your explanation. I am trying to use your code with my data (OCT images) for regression. I put all the code together and saved the images in a list as you have done (train_image).
When I split the data and call the extract_features function, I think there are missing steps!
If you could explain that, it would be very helpful for me.
Much appreciated.
Thanks.
Hi Sumeia,
Thanks for reaching out. Can you help me understand your problem in a little more depth? You mentioned using this approach for regression, so I am not sure how helpful it will be. But if you can share your code or notebook through email or LinkedIn, I can take a look and try to help you solve the problem you are facing.
Thanks,
Aditya
Hey Aditya, I am working on multi-disease classification of chest X-rays using the NIH dataset, but as the dataset is heavily imbalanced, our model has some issues with accuracy. I was exploring balancing techniques, which landed me here. I wanted to know how I can apply SMOTE to a multi-class dataset: should I apply it to the dataset first to create a balanced dataset and then feed that to our model, or is there another way? It would be really helpful to me.
In short, how do I balance multiple classes using SMOTE?
Hi, SMOTE can be applied to a multi-class classification problem in the same way as to a binary classification problem. You can refer to the code in this repository of mine: https://github.com/adib0073/NUMODRIL/blob/main/Code%20Snippets/SMOTE.py
If the statistical properties of all the classes are quite similar, you may want to try a different technique like focal loss, which has proved effective in many use cases.
Hi,
I’m working on a similar problem, but with thermal images instead of X-Ray. Although, I have a major doubt about this approach. You use the images from the training set to train the CNN and then you use the same images to extract features. Is this considered good practice? Also, do you think it’s also possible to combine handcrafted features with the CNN features? Thanks in advance.
Regards,
Akash
Hi Akash,
Thanks for reaching out. To answer your question, you would first train your model on your training dataset and then use the trained model to extract features from your test or validation dataset. There are multiple ways to do the feature extraction. You can cut the final FC layers from a conventional DNN model and train only the encoder part, or you can train a full model and then cut out the FC layers to get the features. You can also refer to this code: https://github.com/adib0073/NUMODRIL/blob/main/Code%20Snippets/DHL.py and observe how the feature extraction is done differently and how it can be done for the training, testing and validation datasets. For your second question, yes, it is absolutely possible. Usually the CNN-extracted features are flattened and can then be concatenated with other external or handcrafted features.
Can I get the complete code of this implementation?
Can I get the complete code for this implementation? I would also like to know how you applied SMOTE and other imbalance techniques.
Hi,
Thanks for reaching out. You can refer to the code for a similar but slightly different problem in this repository of mine: https://github.com/adib0073/NUMODRIL/blob/main/Code%20Snippets/SMOTE.py
Hi Aditya,
Are you aware of additional use-cases of this hybrid approach, which is used in production?
Best,
Saar
Hi Saar,
Thanks for the question. In short, any problem that can be solved using Deep Learning can also be solved using Deep Hybrid Learning. Usually I have seen better results for supervised learning problems. But unsupervised problems, such as image-based search systems, are another good example where DHL is a go-to approach.
Thanks,
Aditya
Thank you for this very useful article. I would like to ask: do you think this method will be effective for multi-class classification?
And is it possible to show a realistic example of multi-class classification using this method?
Thank you very much…
Yes, absolutely! But it might also depend on the dynamics of your dataset. Another approach that worked for me for multi-class classification is Focal Loss. You may try that as well.
Hi Aditya,
I understood X_train and y_train, but what are xTrain and yTrain in this line?
train_features, train_labels = extract_features(xTrain, yTrain, xTrain.shape[0])
Please explain; I am just badly confused.
Thanks in advance.
Hi Priyanka,
Thanks for your question. xTrain/yTrain are equivalent to X_train/y_train; it's just that I had applied SMOTE to upsample the minority classes, and after applying SMOTE I stored the x_train values under the new variable name xTrain.
OK…Thank you so much, Aditya.
Hi Aditya,
I came across another error while extracting features:

    ValueError                                Traceback (most recent call last)
    in ()
    ----> 1 train_features, train_labels = extract_features(x_train, y_train, x_train.shape[0])

    in extract_features(x_train, y_train, sample_count)
         23
         24         features[i * batch_size : (i + 1) * batch_size] = features_batch
    ---> 25         labels[i * batch_size : (i + 1) * batch_size] = labels_batch
         26         i += 1
         27         if i * batch_size >= sample_count:

    ValueError: could not broadcast input array from shape (32,1) into shape (32)

I have been trying for the last 15 days but have not been able to solve it. Kindly help me out. Thanks in advance.
Hi sir, thank you very much for the video. It was very helpful and the explanation was very good.
Sir, can you suggest any recent de-noising algorithms for medical images like X-ray images?
Hi Ashwini,
Thank you for your question. For image de-noising, I have come across certain encoder-decoder based architectures, as well as certain unsupervised and semi-supervised methods.
Dear Aditya Sir,
I am doing text classification on Twitter data. I am very new to research. Now I am applying a deep hybrid model to text classification.
Could you please share your email ID so that I can describe my problem to you? That would help me.
Thanks for reaching out. Please feel free to connect with me through any means mentioned here: https://aditya-bhattacharya.net/contact-me/