5 Streamlined Approaches to Image Classification Using Keras in Python

💡 Problem Formulation: This article walks through several methods for performing image classification with the Keras library in Python. The task is to map an input image to a categorized output, typically a label from a predefined set. For example, given a photograph of a cat, the desired output is the label ‘cat’, indicating the image’s content.

Method 1: Using Pretrained Models for Transfer Learning

Transfer learning reuses a model pretrained on a large benchmark dataset so that its learned features can be applied to a custom task with limited data. Keras provides access to several such architectures, including VGG16, InceptionV3, and ResNet, which can be used as-is to predict ImageNet classes or fine-tuned for specific image classification tasks.

Here’s an example:

from tensorflow.keras.applications import VGG16
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.vgg16 import preprocess_input, decode_predictions
import numpy as np

# Load model with pretrained weights
model = VGG16(weights='imagenet')

# Load an image file, resizing it to 224x224 pixels (the input size VGG16 expects with its ImageNet classifier)
img_path = 'path_to_your_image.jpg'
img = image.load_img(img_path, target_size=(224, 224))

# Convert the image to a numpy array and preprocess it
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

# Classify the image
predictions = model.predict(x)
print('Predicted:', decode_predictions(predictions, top=3)[0])

Output:

Predicted: [('n02504458', 'African_elephant', 0.8265823),
             ('n01871265', 'tusker', 0.1122357),
             ('n02504013', 'Indian_elephant', 0.061040461)]

This snippet loads the VGG16 model with ImageNet weights, loads an image, preprocesses it, and then uses the model to predict the image’s class. The decode_predictions function maps the raw predictions to (class ID, class name, probability) tuples, here for the three most likely classes.
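To make the decoding step more concrete, here is a minimal NumPy sketch of the top-k selection that decode_predictions performs conceptually. The helper name top_k_labels, the label list, and the probability values are made up for illustration; the real function looks up ImageNet class IDs and names instead.

```python
import numpy as np

def top_k_labels(probabilities, labels, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probabilities = np.asarray(probabilities, dtype=float)
    # argsort is ascending, so reverse it and keep the first k indices
    top_indices = np.argsort(probabilities)[::-1][:k]
    return [(labels[i], float(probabilities[i])) for i in top_indices]

# Hypothetical four-class softmax output (not real model predictions)
labels = ['cat', 'dog', 'elephant', 'horse']
probs = [0.05, 0.10, 0.80, 0.05]

result = top_k_labels(probs, labels, k=2)
print(result)  # [('elephant', 0.8), ('dog', 0.1)]
```

The same idea applies to the 1000-element probability vector that VGG16 returns for each image in the batch.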

Method 2: Constructing a Sequential Model from Scratch

Building a Sequential model from scratch allows for the customization of the neural network architecture. This method involves stacking layers one after the other in a sequence using Keras, providing great flexibility in designing models tailored to specific classification tasks.

Here’s an example:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Flatten, MaxPooling2D

# Initialize the model
model = Sequential()

# Add a convolutional layer
model.add(Conv2D(32, kernel_size=(3, 3),
          activation='relu',
          input_shape=(28,28,1)))

# Add a pooling layer
model.add(MaxPooling2D(pool_size=(2, 2)))

# Add a flattening layer
model.add(Flatten())

# Add a dense (fully connected) layer
model.add(Dense(128, activation='relu'))

# Add the output layer
model.add(Dense(10, activation='softmax'))

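# Compile the model (categorical_crossentropy expects one-hot encoded labels)
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Print the architecture shown below
model.summary()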

Output:

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 26, 26, 32)        320       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 13, 13, 32)        0         
_________________________________________________________________
flatten (Flatten)            (None, 5408)              0         
_________________________________________________________________
dense (Dense)                (None, 128)               692352    
_________________________________________________________________
dense_1 (Dense)              (None, 10)                1290      
=================================================================
Total params: 693,962
Trainable params: 693,962
Non-trainable params: 0
_________________________________________________________________

This code constructs a small convolutional network for image classification, suited to 28x28 grayscale datasets such as MNIST. Each layer serves a specific purpose: Conv2D applies learned convolutional filters, MaxPooling2D downsamples the resulting feature maps, Flatten converts them into a single vector, and the Dense layers form the fully connected classifier that ends in a 10-class softmax output.
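The output shapes and parameter counts in the summary can be checked by hand. The following sketch redoes that arithmetic in plain Python; the helper names conv2d_params and dense_params are illustrative and not part of Keras.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    """Each unit has one weight per input plus one bias."""
    return (inputs + 1) * units

conv_out = 28 - 3 + 1             # 3x3 kernel, no padding, stride 1 -> 26
pool_out = conv_out // 2          # 2x2 max pooling halves each side -> 13
flat = pool_out * pool_out * 32   # 13 * 13 * 32 feature values -> 5408

total = (conv2d_params(3, 3, 1, 32)
         + dense_params(flat, 128)
         + dense_params(128, 10))

print(conv_out, pool_out, flat, total)  # 26 13 5408 693962
```

Most of the parameters sit in the first Dense layer, which is why adding more pooling before Flatten is a common way to shrink a model like this.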

Method 3: Image Augmentation with ImageDataGenerator

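ImageDataGenerator in Keras is an easy-to-use utility that augments image data in real time during training. It artificially expands a training set by creating modified versions of its images, which can improve the model’s ability to generalize. Recent TensorFlow releases mark ImageDataGenerator as deprecated in favor of Keras preprocessing layers, but it remains common in existing code.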

Here’s an example:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Define an instance with transformations
datagen = ImageDataGenerator(
            rotation_range=40,
            width_shift_range=0.2,
            height_shift_range=0.2,
            shear_range=0.2,
            zoom_range=0.2,
            horizontal_flip=True,
            fill_mode='nearest')

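# Example usage with model.fit (x_train, y_train, and epochs are assumed to be defined)
# steps_per_epoch must be an integer, hence the floor division
# model.fit(datagen.flow(x_train, y_train, batch_size=32),
#           steps_per_epoch=len(x_train) // 32, epochs=epochs)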

Output:

This code does not produce immediate output, as it defines a data generator for training.

The snippet creates an ImageDataGenerator instance and specifies a range of random transformations. When its flow method is passed to model.fit, each batch is transformed on the fly, so the model rarely sees the exact same image twice during training.
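To illustrate what such transformations do to the pixel data, here is a simplified NumPy sketch of two of them: a random horizontal flip and a random width shift with edge filling. It is not how ImageDataGenerator is implemented; the function name random_flip_and_shift and its parameters are hypothetical.

```python
import numpy as np

def random_flip_and_shift(img, rng, max_shift_frac=0.2):
    """Return a randomly flipped and horizontally shifted copy of an HxWxC image."""
    out = img.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1, :]  # mirror the image left to right
    height, width, _ = out.shape
    max_shift = int(width * max_shift_frac)
    shift = int(rng.integers(-max_shift, max_shift + 1))
    # Pad both sides with edge pixels (similar in spirit to fill_mode='nearest'),
    # then crop a window of the original width at the shifted position
    padded = np.pad(out, ((0, 0), (max_shift, max_shift), (0, 0)), mode='edge')
    start = max_shift - shift
    return padded[:, start:start + width, :]

rng = np.random.default_rng(0)
img = np.arange(4 * 5).reshape(4, 5, 1)  # tiny stand-in for a real image
aug = random_flip_and_shift(img, rng)
print(aug.shape)  # (4, 5, 1)
```

Because a new random flip and shift are drawn on every call, repeated passes over the same dataset yield different training inputs.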

Method 4: Feature Extraction with a Convolutional Neural Network

Feature Extraction with a ConvNet involves using the convolutional base of a pretrained model to extract meaningful features from new images. These features can then be fed into a new classifier, which is trained from scratch, allowing for customization while leveraging powerful learned features.

Here’s an example:

from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D

# Load the convolutional base with pretrained weights
base_model = VGG16(weights='imagenet', include_top=False)

# Add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)

# Add a fully connected layer
x = Dense(1024, activation='relu')(x)

# Add a softmax output layer (10 classes assumed here)
predictions = Dense(10, activation='softmax')(x)

# This is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

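# Freeze the convolutional base so that only the new top layers are trained
for layer in base_model.layers:
    layer.trainable = False

# Compile the model (must be done after changing the trainable flags)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])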

Output:

This code produces no output by itself. Calling model.summary() prints the architecture, showing the VGG16 convolutional base followed by the newly added classification layers.

This example demonstrates the extension of a pretrained VGG16 model by adding custom layers for classification. Only the top layers are trained initially to avoid destroying the features learned during pretraining.
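One detail worth spelling out is why a Dense layer can be attached even though the base model is loaded without a fixed input size. GlobalAveragePooling2D averages each channel over its spatial dimensions, so the result always has one value per channel. The NumPy sketch below mimics that operation; the function name and the random arrays are illustrative stand-ins for real feature maps.

```python
import numpy as np

def global_average_pool(feature_maps):
    """Average each channel over height and width.

    feature_maps: array of shape (batch, height, width, channels)
    returns:      array of shape (batch, channels)
    """
    return feature_maps.mean(axis=(1, 2))

# The VGG16 convolutional base downsamples by a factor of 32 and ends with 512 channels
small = np.random.rand(1, 7, 7, 512)     # e.g. from a 224x224 input
large = np.random.rand(1, 10, 10, 512)   # e.g. from a 320x320 input

print(global_average_pool(small).shape)  # (1, 512)
print(global_average_pool(large).shape)  # (1, 512)
```

Both inputs collapse to a 512-element vector, which is the fixed-size input the new Dense layers need.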

Bonus Method 5: Quick Model Customization with Keras Tuner

Keras Tuner is a library for hyperparameter tuning that helps to find the best model configuration. With a simple search command, it can iterate over a predefined hyperparameter space to find the most effective model architecture and parameters for a particular task.

Here’s an example:

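from keras_tuner import RandomSearch
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Conv2D, Dense, Flatten, MaxPooling2D

# x_train, y_train, x_val, and y_val are assumed to be loaded already,
# with one-hot encoded labels (required by categorical_crossentropy)

# Define a model-building function
def build_model(hp):
    model = Sequential()
    # Tune the number of filters in the first convolutional layer (32 to 256, step 32)
    model.add(Conv2D(hp.Int('input_units', 32, 256, 32), (3, 3), input_shape=x_train.shape[1:]))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    # Continue model building...
    # Minimal classification head so the model can be trained (an assumption; adjust as needed)
    model.add(Flatten())
    model.add(Dense(y_train.shape[1], activation='softmax'))
    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy'])
    return model

# Run the hyperparameter search
tuner = RandomSearch(build_model, objective='val_accuracy', max_trials=5, executions_per_trial=3)
tuner.search(x_train, y_train, validation_data=(x_val, y_val))

# Print a summary of the best trials
tuner.results_summary()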

Output:

During the search, the tuner logs the progress and score of each trial. Calling tuner.results_summary() afterwards prints the best hyperparameter configurations that were found.

Given a model-building function that accepts hyperparameters, Keras Tuner searches over the specified ranges. This snippet uses a random search strategy to find a well-performing number of filters for the first convolutional layer; with only five trials, the result is the best configuration tried, not a guaranteed optimum.
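The random search strategy itself is simple enough to sketch in plain Python. The example below is not Keras Tuner code: random_search and toy_score are hypothetical names, and toy_score is a stand-in for training the model and returning its validation accuracy.

```python
import random

def random_search(score_fn, candidates, max_trials, seed=0):
    """Try max_trials randomly chosen candidates and return the best (value, score)."""
    rng = random.Random(seed)
    best_value, best_score = None, float('-inf')
    for _ in range(max_trials):
        value = rng.choice(candidates)
        score = score_fn(value)
        if score > best_score:
            best_value, best_score = value, score
    return best_value, best_score

# Same search space as hp.Int('input_units', 32, 256, 32): 32, 64, ..., 256
candidates = list(range(32, 256 + 1, 32))

def toy_score(filters):
    """Made-up score that peaks at 128 filters; replaces a real training run."""
    return 1.0 - abs(filters - 128) / 256

best_filters, best_score = random_search(toy_score, candidates, max_trials=5)
print(best_filters, best_score)
```

In the real tuner, each trial is much more expensive because it trains a model, and executions_per_trial repeats that training to average out run-to-run variation.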

Summary/Discussion

Method 1: Pretrained Models for Transfer Learning. Strengths: Saves time and leverages models trained on large datasets. Weaknesses: May be overkill for simple tasks, and adapting the model to labels outside its original training set requires additional fine-tuning.
Method 2: Constructing a Sequential Model from Scratch. Strengths: Customizable and suitable for learning model building fundamentals. Weaknesses: Time-consuming and requires a robust dataset to perform well.
Method 3: Image Augmentation with ImageDataGenerator. Strengths: Augments datasets and improves model generalization. Weaknesses: Can introduce noise and may help only marginally if the dataset is already diverse.
Method 4: Feature Extraction with a Convolutional Neural Network. Strengths: Efficient use of pretrained models for feature extraction. Weaknesses: Requires understanding of which layers to retrain and may lead to overfitting on small datasets.
Method 5: Quick Model Customization with Keras Tuner. Strengths: Automates the trial-and-error process of model architecture design. Weaknesses: Requires computational resources for multiple training iterations and may still require final tweaking.