Neural network image classification of stained scans with Keras

Before saving your paper documents in a product like Sismics Docs, it’s always nice to evaluate the quality of the scans.

For one of our projects, we had to automate this evaluation using a neural network. The obvious framework for machine learning these days is Keras. All the code below works with Keras 2.0 and TensorFlow as the backend.

To follow this article, you will need:

  • A machine with Python 3, Keras 2 and TensorFlow installed
  • Preferably a configured NVIDIA GPU to speed up the learning process. We used a GTX 970 and training took only a few minutes for 100 epochs
  • Some stained and clean documents

The full working code is in this GitHub repository: https://github.com/sismics/keras-neural-net-image-classification-stain

We decided to train our model using “homemade” data, so we didn’t have a lot of data at our disposal. We simply took a few clean documents and did the dirty work ourselves 😉

Making machine learning data the poor man’s way

Once this was done, we scanned those dirty documents, along with some clean ones, and sliced the scans into 10,000 images of 64×64 pixels. The slicing increases the volume of input data and decreases the resolution of each input image. After some testing, 5,000 images seems to be the minimum needed to achieve reasonable accuracy, but as always, the more the better. Then we manually sorted those images into two folders, “stain” and “clean”.
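The slicing step can be sketched as follows. This is a minimal illustration, not the code from the repository: the function name tile_boxes is hypothetical, and dropping the partial tiles on the right and bottom edges is an assumption, since the article does not say how page edges were handled. Each box uses the (left, upper, right, lower) layout that Pillow's Image.crop expects.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (left, upper, right, lower), in pixels


def tile_boxes(width: int, height: int, tile: int = 64) -> List[Box]:
    """Return crop boxes for every full tile x tile square in a page.

    Partial tiles on the right and bottom edges are dropped. That is an
    assumption of this sketch, not something the article specifies.
    """
    if tile <= 0:
        raise ValueError("tile must be positive")
    boxes = []
    # Walk the page row by row, stepping one full tile at a time.
    for top in range(0, height - tile + 1, tile):
        for left in range(0, width - tile + 1, tile):
            boxes.append((left, top, left + tile, top + tile))
    return boxes
```

For instance, a page of 1654 × 2339 pixels (roughly A4 at 200 DPI, an assumed resolution) gives 25 × 36 = 900 tiles, so about a dozen such pages would be enough to reach 10,000 images.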

Input sliced images

To reduce overfitting of our model, and increase the variations in our images, we used the Keras image data generator.

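from keras.preprocessing.image import ImageDataGenerator

# Random augmentations applied to each training image
train_datagen = ImageDataGenerator(
    rescale=1./255,          # scale pixel values from [0, 255] to [0, 1]
    horizontal_flip=True,    # randomly mirror left/right
    vertical_flip=True,      # randomly mirror top/bottom
    fill_mode='nearest',     # fill pixels exposed by a transform with the nearest edge value
    rotation_range=20,       # rotate by up to 20 degrees
    width_shift_range=0.2,   # shift horizontally by up to 20% of the width
    height_shift_range=0.2,  # shift vertically by up to 20% of the height
    zoom_range=0.2           # zoom in or out by up to 20%
)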

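batch_size = 16  # assumed value; the article does not state the batch size

train_generator = train_datagen.flow_from_directory(
    'data/train',  # directory with one sub-folder per class ("clean" and "stain")
    target_size=(64, 64),  # all images will be resized to 64x64
    batch_size=batch_size,
    color_mode='grayscale',  # load images with a single channel
    class_mode='binary')  # one 0/1 label per image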

This generator takes our images, applies random changes such as zooming and rotating, and feeds the results to the model during training.
The model has 3 convolution layers, each with a ReLU activation and followed by a max-pooling layer, as recommended in this official Keras article.

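from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Activation, Dropout, Flatten, Dense

model = Sequential()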
model.add(Conv2D(32, (3, 3), input_shape=(64, 64, 1)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
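To see what this architecture does to a 64×64 grayscale tile, the feature-map sizes and parameter counts can be worked out by hand. The sketch below does that arithmetic in plain Python. It relies on the Keras defaults used in the model above (unpadded "valid" 3×3 convolutions with stride 1, and 2×2 max-pooling whose stride equals the pool size); the helper names are hypothetical and are not part of Keras.

```python
def conv_out(size: int, kernel: int = 3) -> int:
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_out(size: int, pool: int = 2) -> int:
    """Spatial size after non-overlapping max-pooling (odd sizes round down)."""
    return size // pool


def conv_params(in_channels: int, filters: int, kernel: int = 3) -> int:
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(in_units: int, out_units: int) -> int:
    """Weights plus one bias per output unit."""
    return in_units * out_units + out_units


def describe_model(input_size: int = 64, channels: int = 1):
    """Return (final feature-map size, flattened length, trainable parameters)."""
    size, in_channels, total = input_size, channels, 0
    for filters in (32, 32, 64):  # the three Conv2D layers of the model above
        total += conv_params(in_channels, filters)
        size = pool_out(conv_out(size))  # 64 -> 31 -> 14 -> 6
        in_channels = filters
    flat = size * size * in_channels  # input length of the first Dense layer
    total += dense_params(flat, 64) + dense_params(64, 1)
    return size, flat, total


print(describe_model())  # (6, 2304, 175649)
```

Most of the 175,649 parameters sit in the first Dense layer (2304 × 64 + 64 = 147,520). Calling describe_model(channels=3) shows that a colour input would only add 576 parameters, all of them in the first convolution.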

Training the model was quite fast on a GPU, and after 100 epochs we reached over 85% accuracy on our validation data. The validation accuracy is higher than the training accuracy and the validation loss is lower than the training loss, a good sign that our model is not overfitting. Part of that gap is expected, because dropout is only active during training.

Tensorboard output

The end result gives quite good information about the quality of our input scan.

Red is detected as stained, green is ignored and the rest is detected as clean
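One way to turn per-tile predictions into a page-level summary is sketched below. It is an illustration under stated assumptions, not the code behind the picture above: the function name summarize_page, the 0.5 threshold and the ignored mask are all hypothetical, and the article does not say how the green (ignored) tiles are selected. With the folder names used earlier, flow_from_directory assigns class indices in alphanumeric order, so "clean" maps to 0 and "stain" to 1, and a sigmoid output close to 1 means "stained".

```python
from typing import Dict, Optional, Sequence


def summarize_page(
    stain_probs: Sequence[float],
    ignored: Optional[Sequence[bool]] = None,
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Count stained, clean and ignored tiles from per-tile sigmoid outputs.

    stain_probs: model output for each tile (close to 1 means stained).
    ignored: optional mask of tiles to skip; how such tiles are chosen is
             left to the caller, since the article does not specify it.
    threshold: assumed decision boundary between clean and stained.
    """
    if ignored is None:
        ignored = [False] * len(stain_probs)
    if len(ignored) != len(stain_probs):
        raise ValueError("ignored must have one entry per tile")

    kept = [p for p, skip in zip(stain_probs, ignored) if not skip]
    stained = sum(1 for p in kept if p >= threshold)
    return {
        "stained": stained,
        "clean": len(kept) - stained,
        "ignored": len(stain_probs) - len(kept),
        # Share of the evaluated tiles that look stained (0.0 for an empty page)
        "stained_ratio": stained / len(kept) if kept else 0.0,
    }
```

The stained_ratio value could then serve as a simple quality score for the whole scan, with the acceptable limit chosen per use case.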

After that, we developed a small interface to test our trained model, using Bootstrap 4 and Vue.js, with Flask as the backend.

Dirty or not?

As further improvements, we could think of:

  • Obviously, add more data, more kinds of stains and more edge cases
  • Tweak the hyperparameters to achieve better accuracy
  • We explicitly chose to convert our images to grayscale, but keeping the color information may be a better idea

Sismics co-founder, Docs open-source developer
