Radiographic Image Analysis

by Aditya
in Computer Vision, Data Science, Deep Learning
on July 27, 2020

Deep Hybrid Learning for Radiographic Image Analysis

Deep learning algorithms are powerful and effective for image analysis. Complex deep neural network architectures such as DenseNet-201 have been extremely effective at classifying complicated medical images such as radiographic (X-ray) images. As discussed in my Medium article, the main reason deep learning approaches succeed is that they remove the need for manual feature engineering and learn features on their own. Feature engineering for complicated medical images such as X-rays is difficult, so classical machine learning generally fails to outperform deep learning methods.

But can we combine the benefits of both deep learning and machine learning and create a fusion network that might perform better than either of them?

Actually, we can! We will call this fusion network a Deep Hybrid Network (DHN), and the approach to building it Deep Hybrid Learning (DHL).

For more details on DHL, please take a look at the Medium article. In short, in DHL we use a deep neural network to generate features from radiographic images and then perform the classification with a classical machine learning classifier.

As the figure above shows, this is a two-step process. In the first step, a deep neural network extracts features from the raw image dataset, which removes the need for manual feature engineering. In the second step, a machine learning classification algorithm is run on those extracted features, which replaces the fully connected classification layers of a typical deep learning model.
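
To make the two steps concrete before moving to real images, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the toy 8x8 "images", the `extract` helper, and the fixed random projection that stands in for a trained convolutional base. Only the overall shape of the pipeline (extract features, then fit a classical classifier) reflects the approach described above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Toy stand-in for an image dataset: 200 "images" of 8x8 pixels, 2 classes.
images = rng.random((200, 8, 8))
labels = rng.integers(0, 2, size=200)
images[labels == 1] += 0.5  # shift one class so there is something to learn

# Step 1: feature extraction. A fixed random projection followed by ReLU
# stands in for the convolutional base (assumption for illustration only).
projection = rng.normal(size=(64, 16))

def extract(batch):
    flat = batch.reshape(len(batch), -1)       # (n, 64)
    return np.maximum(flat @ projection, 0.0)  # (n, 16)

train_feats, test_feats = extract(images[:150]), extract(images[150:])

# Step 2: a classical classifier takes the place of the fully connected layers.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(train_feats, labels[:150])
preds = clf.predict(test_feats)
print(train_feats.shape, preds.shape)
```

In the real pipeline the `extract` step is the network's `predict` call, and the classifier can be swapped freely because it only ever sees a 2-D feature matrix.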

How to implement a DHN?

In this section, we will use Keras, TensorFlow and scikit-learn to implement a Deep Hybrid Network. The following code snippet shows how to load the data:

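from keras.preprocessing import image

# `data` (a DataFrame with a 'Path' column) and `orig_path` (the path prefix
# stored in that column) are assumed to be defined beforehand.
opath_len = len(orig_path)
path = '../data'
train_image = []
added_features = []
for image_path in data['Path'][:20000]:
    # Swap the original path prefix for the local data directory
    img = image.load_img(path + image_path[opath_len:], target_size=(250, 250))
    img = image.img_to_array(img)
    img = img / 255  # scale pixel values to [0, 1]
    train_image.append(img)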
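
The loop above leaves the images in a Python list, while the later steps work with arrays named like `X_train`. A minimal sketch of the step in between, assuming a plain stratified train/test split is what is wanted: the blank images and the label array `y` below are made-up stand-ins, not the real dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-ins (assumptions): ten blank 250x250 RGB "images" in place of the
# loaded train_image list, and made-up binary labels.
train_image = [np.zeros((250, 250, 3), dtype="float32") for _ in range(10)]
y = np.array([0, 1] * 5)

# Stack the list into one array of shape (n_samples, height, width, channels)
X = np.array(train_image)

# Hold out a test split; stratify keeps the class ratio the same in both parts
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(X.shape, X_train.shape, X_test.shape)
```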

Now, let's see what our simple DNN looks like:

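from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.callbacks import EarlyStopping, ReduceLROnPlateau

num_classes = 2
## CNN model - Deep Neural Network code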
model_LN = Sequential()
model_LN.add(Conv2D(filters=16, kernel_size=(5, 5), activation="relu", input_shape=(250,250,3)))
model_LN.add(MaxPooling2D(pool_size=(2, 2)))
model_LN.add(Dropout(0.25))
model_LN.add(Conv2D(filters=32, kernel_size=(5, 5), activation='relu'))
model_LN.add(MaxPooling2D(pool_size=(2, 2)))
model_LN.add(Dropout(0.25))
model_LN.add(Conv2D(filters=64, kernel_size=(5, 5), activation="relu"))
model_LN.add(MaxPooling2D(pool_size=(2, 2)))
model_LN.add(Dropout(0.25))
model_LN.add(Conv2D(filters=64, kernel_size=(5, 5), activation='relu'))
model_LN.add(MaxPooling2D(pool_size=(2, 2)))
model_LN.add(Dropout(0.25))
model_LN.add(Flatten())
model_LN.add(Dense(128, activation='relu'))
model_LN.add(Dropout(0.5))
model_LN.add(Dense(64, activation='relu'))
model_LN.add(Dropout(0.5))
#model.add(Dense(num_classes, activation='sigmoid'))
model_LN.add(Dense(num_classes, activation='softmax'))

model_LN.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

early_stop = EarlyStopping(monitor='val_loss', patience=8, verbose=1, min_delta=1e-4)
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=4, verbose=1, min_delta=1e-4)
callbacks_list = [early_stop, reduce_lr]
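
Because the model ends in a two-unit softmax and is compiled with `categorical_crossentropy`, it expects one-hot labels rather than integer class ids. A minimal NumPy sketch of that conversion follows; the label values are hypothetical. `keras.utils.to_categorical` performs the same conversion, and compiling with `sparse_categorical_crossentropy` instead would let the model accept the integer labels directly.

```python
import numpy as np

num_classes = 2
y = np.array([0, 1, 1, 0])  # hypothetical integer class labels

# Row i of the identity matrix is the one-hot vector for class i
y_one_hot = np.eye(num_classes, dtype="float32")[y]
print(y_one_hot)

# argmax maps one-hot rows (or softmax outputs) back to integer labels
assert (y_one_hot.argmax(axis=1) == y).all()
```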

But for deep hybrid learning, as we have said, we drop the fully connected layers of the DNN model, so this is what it will look like:

num_classes = 2 
## Deep Hybrid Network's DNN layer which will be used for feature extraction
conv_base= Sequential()
conv_base.add(Conv2D(filters=16, kernel_size=(5, 5), activation="relu", input_shape=(250,250,3)))
conv_base.add(MaxPooling2D(pool_size=(2, 2)))
conv_base.add(Dropout(0.25))
conv_base.add(Conv2D(filters=32, kernel_size=(5, 5), activation='relu'))
conv_base.add(MaxPooling2D(pool_size=(2, 2)))
conv_base.add(Dropout(0.25))
conv_base.add(Conv2D(filters=64, kernel_size=(5, 5), activation="relu"))
conv_base.add(MaxPooling2D(pool_size=(2, 2)))
conv_base.add(Dropout(0.25))
conv_base.add(Conv2D(filters=64, kernel_size=(5, 5), activation='relu'))
conv_base.add(MaxPooling2D(pool_size=(2, 2)))
conv_base.add(Dropout(0.25))
conv_base.add(Flatten())

DHN also supports pre-trained networks, as in transfer learning. The code below shows how we can use a DenseNet model for this purpose:

from keras.applications import DenseNet121
conv_base = DenseNet121(include_top=False, weights='imagenet',input_shape=(250, 250, 3))
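
The two feature extractors produce outputs of different sizes, which matters when allocating an array to hold the features. As a rough check, the sizes can be worked out by hand. The sketch below assumes the Keras defaults for the custom network ('valid' padding, stride-1 convolutions, non-overlapping pooling) and the standard DenseNet121 stem; the helper names are made up for illustration.

```python
def conv_valid(size, kernel):
    """Output size of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def pool(size, window=2):
    """Output size of non-overlapping pooling with 'valid' padding."""
    return size // window

# Custom conv_base: four Conv2D(5x5) + MaxPooling2D(2x2) pairs, then Flatten
size = 250
for _ in range(4):
    size = pool(conv_valid(size, 5))
flat_features = size * size * 64   # the last Conv2D has 64 filters
print(size, flat_features)         # 11 7744

# DenseNet121 without the top, for a 250x250 input
dn = (250 + 2 * 3 - 7) // 2 + 1    # 7x7 stride-2 conv after 3-pixel padding
dn = (dn + 2 * 1 - 3) // 2 + 1     # 3x3 stride-2 max pool after 1-pixel padding
for _ in range(3):                 # three transition layers with 2x2 avg pooling
    dn //= 2
print(dn)                          # 7
```

So the custom base yields 7744 values per image after flattening, while DenseNet121 yields a 7 x 7 x 1024 feature map, which is the shape used for the feature array in the extraction code that follows.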

Now, let's see how to extract features using a DNN.

import numpy as np
from keras.preprocessing.image import ImageDataGenerator

# If the images were already divided by 255 when loaded, drop this rescale
# so that they are not scaled twice
datagen = ImageDataGenerator(rescale=1./255)

# Feature Extraction
def extract_features(X_train, y_train, sample_count):
    batch_size = 32
    # (7, 7, 1024) is the DenseNet121 output shape for 250x250 inputs
    features = np.zeros(shape=(sample_count, 7, 7, 1024))
    labels = np.zeros(shape=(sample_count))
    generator = datagen.flow(X_train, y_train, batch_size=batch_size)
    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = conv_base.predict(inputs_batch)
        #print(features_batch.shape)
        features[i * batch_size : (i + 1) * batch_size] = features_batch
        # np.ravel also accepts labels stored with shape (n, 1)
        labels[i * batch_size : (i + 1) * batch_size] = np.ravel(labels_batch)
        i += 1
        if i * batch_size >= sample_count:
            break
    return features, labels

train_features, train_labels = extract_features(xTrain, yTrain, xTrain.shape[0])
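
The `labels` buffer in `extract_features` is one-dimensional, so each batch written into it has to be one-dimensional too. If the labels are stored as a column vector of shape (n, 1), NumPy refuses the assignment with a broadcast error. The small sketch below reproduces that situation with made-up arrays and shows that flattening the batch resolves it.

```python
import numpy as np

batch_size = 32
labels = np.zeros(shape=(64,))            # 1-D buffer, as in extract_features
labels_batch = np.ones((batch_size, 1))   # labels stored as a column vector

try:
    labels[0:batch_size] = labels_batch   # shape (32, 1) into shape (32,)
    failed = False
except ValueError as err:
    failed = True
    print("ValueError:", err)

# Flattening the batch first makes the shapes agree
labels[0:batch_size] = labels_batch.ravel()
print(labels[:batch_size].sum())
```

Flattening the whole label array once before calling the function, for example with `y_train.ravel()`, has the same effect.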

Finally, we fit a classical machine learning classification algorithm on the extracted features. Below, we apply Random Forest, AdaBoost and XGBoost to them.

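Random Forest:

from sklearn.ensemble import RandomForestClassifier

# scikit-learn expects 2-D input, so flatten each feature map into one row.
# test_features is assumed to come from extract_features() on the test split.
train_X = train_features.reshape(train_features.shape[0], -1)
test_X = test_features.reshape(test_features.shape[0], -1)

hybrid_model_RF = RandomForestClassifier()
hybrid_model_RF.fit(train_X, train_labels)
predicted_test_labels = hybrid_model_RF.predict(test_X)

AdaBoost:

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
# Boost shallow decision trees on the same flattened features
hybrid_model_AB = AdaBoostClassifier(
    DecisionTreeClassifier(max_depth=3),
    n_estimators=10
)
hybrid_model_AB.fit(train_X, train_labels)
predicted_test_labels = hybrid_model_AB.predict(test_X)

XGBoost:

from xgboost import XGBClassifier
hybrid_model_XG = XGBClassifier()
hybrid_model_XG.fit(train_X, train_labels)
predicted_test_labels = hybrid_model_XG.predict(test_X)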

And finally, as discussed in my Medium article, this is what the final model architecture looks like:

In the Medium article, I talked about model evaluation. Below is the code to evaluate the model using the various metrics discussed in the article:

import numpy as np
from sklearn import metrics
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelBinarizer
import seaborn as sns

def multiclass_roc_auc_score(y_test, y_pred, average="macro"):
        lb = LabelBinarizer()
        lb.fit(y_test)
        y_test = lb.transform(y_test)
        y_pred = lb.transform(y_pred)
        return metrics.roc_auc_score(y_test, y_pred, average=average)
        
def evaluate_model_performance_valid_data(y_orig, y_pred):
        accuracy = metrics.accuracy_score(y_orig, y_pred)
        print('Accuracy is:',accuracy)

        # For precision of each class individually use average= None
        precision = metrics.precision_score(y_orig, y_pred, average='macro')
        print('Precision is:',precision)

        # For recall of each class individually use average= None
        recall = metrics.recall_score(y_orig, y_pred, average='macro')
        print('Recall is:',recall)

        # For F1 score of each class individually use average= None
        f1_score_value = metrics.f1_score(y_orig, y_pred, average='macro')
        print('F1 Score is:',f1_score_value)
        
        # AUC Scores
        auc_score = multiclass_roc_auc_score(y_orig, y_pred)
        print('AUC score is:', auc_score)

        cm = metrics.confusion_matrix(y_orig, y_pred)
        print(cm)

        plt.figure(figsize=(5,5))
        sns.heatmap(cm, annot=True, fmt=".2f", linewidths=.5, square = True, cmap = 'Blues');
        plt.ylabel('Actual label');
        plt.xlabel('Predicted label');
        all_sample_title = 'AUC Score: {:.2f}'.format(auc_score)
        plt.title(all_sample_title, size = 12)
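
One thing worth keeping in mind when reading the AUC value: the evaluation function receives predicted class labels, not probabilities. For a binary problem, the ROC AUC computed from hard labels reduces to the balanced accuracy, that is, the macro-averaged recall. The toy labels below are made up to show this.

```python
import numpy as np
from sklearn import metrics

y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 1])   # hard class predictions (hypothetical)

accuracy = metrics.accuracy_score(y_true, y_pred)
recall = metrics.recall_score(y_true, y_pred, average='macro')
auc_hard = metrics.roc_auc_score(y_true, y_pred)
balanced = metrics.balanced_accuracy_score(y_true, y_pred)
print(accuracy, recall, auc_hard, balanced)
```

All four values come out as 0.75 for these labels. Passing class probabilities, for example from a classifier's `predict_proba`, to `roc_auc_score` usually gives a more informative AUC than passing hard labels.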

And this is what the result will look like:

I have seen remarkable results so far using this approach. I would encourage all readers to try it out and let me know whether they see an improvement, both in execution time and in model performance.

With this we come to the end of this article. Please comment or connect with me, with your feedback or any suggestions!

Tags: Deep Hybrid Learning, Deep Hybrid Network, Radiographic Image Analysis

25 Responses

Renu says:

March 9, 2021 at 3:26 pm

Hey! Thanks for this explanation. Can I get the complete code?
If you could provide it, that would be very helpful to me.
Thanks

- Aditya says:
  
  March 10, 2021 at 8:12 am
  
  Hello Renu,
  
  I think the code snippets are already included in the article. You would just have to put everything together and run your own analysis. There is similar work for which the code is well documented. Please feel free to refer to it: https://github.com/adib0073/NUMODRIL
  
  Best Regards,
  Aditya
  
  - sumeia says:
    
    March 10, 2021 at 6:12 pm
    
    Hi,
    Thank you for your explanation. I am trying to use your code with my data (OCT images) for regression, and I put all the code together. I saved the images in a list as you have done (train_image).
    When I split the data and call the extract_features function, I think there are missing steps!
    If you could explain that, it would be very helpful for me.
    Much appreciated.
    Thanks.
    
    - Aditya says:
      
      April 3, 2021 at 8:23 am
      
      Hi Sumeia,
      
      Thanks for reaching out. Can you help me understand your problem in a little more depth? You mentioned using this approach for regression, so I am not sure how helpful it will be. But if you can share your code or notebook through email or LinkedIn, I can take a look and try to help you solve the problem you are facing.
      
      Thanks,
      Aditya
      
Vishal Savade says:

May 3, 2021 at 1:39 pm

Hey Aditya, I am working on multi-disease classification of chest X-rays using the NIH dataset, but as the dataset is heavily imbalanced our model has some issues with accuracy, so I was exploring balancing techniques, which landed me here. I wanted to know how I can apply SMOTE to a multi-class dataset: should I apply it to the dataset first to create a balanced dataset and then train the model on that, or is there another way? It would be really helpful to me.
In short, how do I balance multiple classes using SMOTE?

- Aditya says:
  
  May 9, 2021 at 10:32 pm
  
  Hi, SMOTE can be applied to a multi-class classification problem in the same way as it is applied to a binary classification problem. You can refer to the code in this repository of mine: https://github.com/adib0073/NUMODRIL/blob/main/Code%20Snippets/SMOTE.py
  If the statistical properties of all the classes are quite similar, you may want to try a different technique such as focal loss, which has proved effective in many use cases.
  
Akash Ravi Prame says:

June 8, 2021 at 3:15 pm

Hi,

I’m working on a similar problem, but with thermal images instead of X-Ray. Although, I have a major doubt about this approach. You use the images from the training set to train the CNN and then you use the same images to extract features. Is this considered good practice? Also, do you think it’s also possible to combine handcrafted features with the CNN features? Thanks in advance.

Regards,
Akash

- Aditya says:
  
  June 11, 2021 at 6:56 pm
  
  Hi Akash,
  
  Thanks for reaching out. To answer your question: initially you would need to train your model on your training dataset and then use the trained model to extract features from your test or validation dataset. There are multiple ways to do the feature extraction. Either you can cut the final FC layers from a conventional DNN model and train only the encoder part, or you can train a full model and then cut out the FC layers to get the features. You can also refer to this code: https://github.com/adib0073/NUMODRIL/blob/main/Code%20Snippets/DHL.py and observe how the feature extraction is done differently, and how it can be done for the training, testing and validation datasets. For your second question: yes, it is absolutely possible. Usually the CNN-extracted features are flattened and can be concatenated with other external or handcrafted features.
  
Sharath Manjunath says:

June 21, 2021 at 11:00 pm

Can I get the complete code for this implementation? I would also like to know how you applied SMOTE and other imbalance techniques.

- Aditya says:
  
  June 26, 2021 at 11:05 pm
  
  Hi,
  
  Thanks for reaching out. You can refer to the code for a similar but slightly different problem in this repository of mine: https://github.com/adib0073/NUMODRIL/blob/main/Code%20Snippets/SMOTE.py
  
Saar Eliad says:

August 4, 2021 at 8:12 pm

Hi Aditya,
Are you aware of additional use-cases of this hybrid approach, which is used in production?

Best,
Saar

- Aditya says:
  
  August 12, 2021 at 3:11 pm
  
  Hi Saar,
  
  Thanks for the question. In short, any problem that can be solved using deep learning can also be solved using Deep Hybrid Learning. Usually I have seen better results for supervised learning problems. But unsupervised problems such as image-based search systems are another good example where DHL is a go-to approach.
  
  Thanks,
  Aditya
  
bayan says:

August 31, 2021 at 12:46 am

Thank you for this very useful article, and I would like to ask you, do you think this method will be effective in the case of multi-class classification ??

and, is it possible to apply a realistic example of a multi-class classification using this method,

Thank you very much…

- Aditya says:
  
  September 7, 2021 at 10:52 am
  
  Yes absolutely! But it might also depend on the dynamics of your dataset. Another approach that worked for me for multi class classification is the Focal Loss approach. You may try that as well.
  
PRIYANKA says:

September 3, 2021 at 11:33 am

Hi Aditya,

I understood X_train, y_train but what are xTrain, yTrain…in….. ?
train_features, train_labels = extract_features(xTrain, yTrain, xTrain.shape[0]).
Please explain am just badly confused.
Thanks in advance.

- Aditya says:
  
  September 7, 2021 at 11:03 am
  
  Hi Priyanka,
  
  Thanks for your question. So, xTrain/yTrain are equivalent to X_train/y_train. It's just that I had applied SMOTE to upsample the minority classes, and after applying SMOTE I stored the X_train values under the new variable name xTrain.
  
  - PRIYANKA says:
    
    September 9, 2021 at 7:01 am
    
    OK…Thank you so much, Aditya.
    
  - PRIYANKA says:
    
    September 26, 2021 at 8:37 am
    
    Hi Aditya,
    
    I came across another error while extracting features:
    
    ValueError Traceback (most recent call last)
    in ()
    ----> 1 train_features, train_labels = extract_features(x_train, y_train, x_train.shape[0])
    
    in extract_features(x_train, y_train, sample_count)
    23
    24 features[i * batch_size : (i + 1) * batch_size] = features_batch
    ---> 25 labels[i * batch_size : (i + 1) * batch_size] = labels_batch
    26 i += 1
    27 if i * batch_size >= sample_count:
    
    ValueError: could not broadcast input array from shape (32,1) into shape (32)
    
    I have been trying for the last 15 days but have not been able to solve it. Kindly help me out. Thanks in advance.
    
ashwini says:

September 9, 2021 at 10:17 am

Hi sir, thank you very much for the video. It was very helpful and the explanation was very good…
Sir, can you suggest any recent de-noising algorithms for medical images such as X-ray images?

- Aditya says:
  
  September 10, 2021 at 7:02 pm
  
  Hi Ashwini,
  
  Thank you for your question. For image de-noising, I have come across certain encoder-decoder-based architectures and certain unsupervised and semi-supervised methods.
  
siva krishna says:

August 2, 2022 at 4:03 pm

Dear Aditya Sir,

I am doing text classification on Twitter data. I am very new to research. Now I am applying a deep hybrid model to text classification.

Could you please share your email ID so that I can describe my problem to you? It would help me.

- Aditya says:
  
  August 4, 2022 at 10:47 pm
  
  Thanks for reaching out. Please feel free to connect with me through any means mentioned here: https://aditya-bhattacharya.net/contact-me/
  