5 Best Ways to Use Keras with a Pre-Trained Model in Python

💡 Problem Formulation: Many machine learning practitioners face the challenge of leveraging powerful pre-trained models to solve specific tasks without reinventing the wheel. For instance, a developer may want to use a model trained on ImageNet to recognize everyday objects in a new set of photographs. The desired output is a system that accurately labels the photos using the knowledge encapsulated in the pre-trained model.

Method 1: Feature Extraction with a Pre-Trained Model

Feature extraction uses the representations learned by a previously trained network to extract meaningful features from new samples. You freeze the pre-trained model and add a new classifier on top of it; only this new classifier is trained. Keras makes this easy: setting the include_top parameter to False excludes the top of the network, i.e., its final fully connected layers.

Here’s an example:

from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D

# Load the VGG16 model, excluding the top fully connected layers.
base_model = VGG16(weights='imagenet', include_top=False)

# Freeze the convolutional base so its weights are not updated during training.
for layer in base_model.layers:
    layer.trainable = False

# Add a GlobalAveragePooling2D layer and a Dense layer for classification.
x = base_model.output
x = GlobalAveragePooling2D()(x)
predictions = Dense(10, activation='softmax')(x)

# This is the model we'll train; only the new top layers are trainable.
model = Model(inputs=base_model.input, outputs=predictions)

Output: A Keras model with the VGG16 architecture minus its fully connected layers, topped with a global average pooling layer and a new dense layer for classification; the convolutional base is frozen.

This snippet builds a new model from a pre-trained VGG16 network whose fully connected layers have been removed and whose convolutional base is frozen. A global average pooling layer followed by a dense softmax layer is added for classification, and only these new top layers are trained on the new dataset. This method is efficient when you have a small dataset and can take advantage of the already-learned features.
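
Training the new classifier is then just a matter of compiling and fitting the model. The following is a minimal sketch, assuming hypothetical x_train and y_train NumPy arrays that stand in for your own preprocessed images and one-hot labels.

import numpy as np

# Hypothetical data: 100 RGB images at 224x224 with one-hot labels for 10 classes.
# Real images should be preprocessed with keras.applications.vgg16.preprocess_input.
x_train = np.random.rand(100, 224, 224, 3)
y_train = np.eye(10)[np.random.randint(0, 10, 100)]

# Only the new pooling and dense layers are trainable, so training is fast.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=32)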

Method 2: Fine-Tuning a Pre-Trained Model

Fine-tuning adjusts the weights of a pre-trained model by continuing backpropagation on the new data. Because it modifies the original model, it requires a larger dataset to avoid overfitting. With Keras, you fine-tune by first training the top-level classifier with the base frozen, and then unfreezing the later layers of the base model and training them together with a very low learning rate.

Here’s an example:

from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras.optimizers import Adam

# Load the VGG16 model, pretrained on ImageNet.
base_model = VGG16(weights='imagenet', include_top=False)

# Freeze all layers in base model.
for layer in base_model.layers:
    layer.trainable = False

# Create new top layers.
x = base_model.output
x = GlobalAveragePooling2D()(x)
predictions = Dense(10, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)

# Train only the top layers.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(...)

# Unfreeze some layers in the base model.
for layer in base_model.layers[-4:]:
    layer.trainable = True

# Fine-tune these layers with a very low learning rate.
model.compile(optimizer=Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(...)

Output: A Keras model whose later VGG16 layers have been fine-tuned on data from the specific task.

This snippet demonstrates how to fine-tune a model with Keras. The VGG16 model is loaded with weights pre-trained on ImageNet and all of its layers frozen. The new top-level classifier is trained first; then the last few VGG16 layers are unfrozen and the network is fine-tuned with a much lower learning rate. This approach works best when your new dataset is large and not very similar to the dataset the base model was trained on.
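
The model.fit(...) calls above are placeholders for your own training code. Below is a hedged sketch of what the two training phases might look like with in-memory data; x_train and y_train are hypothetical NumPy arrays standing in for your preprocessed images and one-hot labels.

import numpy as np

# Hypothetical data: 200 RGB images at 224x224 with one-hot labels for 10 classes.
x_train = np.random.rand(200, 224, 224, 3)
y_train = np.eye(10)[np.random.randint(0, 10, 200)]

# Phase 1: train only the new top layers while the base model stays frozen.
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=32)

# Phase 2: unfreeze the last few base layers and fine-tune with a low learning rate.
for layer in base_model.layers[-4:]:
    layer.trainable = True
model.compile(optimizer=Adam(learning_rate=1e-4), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=32)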

Method 3: Using Pre-Trained Models as Predictors

Using a pre-trained model as a predictor means running inference on new data without changing the model at all. Keras models shipped with pre-trained weights can predict classes directly. This is useful for quick prototyping or when the new data is very similar to the data the model was trained on.

Here’s an example:

from keras.preprocessing import image
from keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
import numpy as np

# Load the model including the top fully connected layers.
model = VGG16(weights='imagenet')

# Load an image and preprocess it.
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

# Predict class probabilities.
predictions = model.predict(x)

# Print the top-5 predicted classes.
print('Predicted:', decode_predictions(predictions, top=5)[0])

Output: The top-5 predicted classes for the input image as (class_id, class_name, probability) tuples, e.g., [('n02504458', 'African_elephant', …), …, ('n01871265', 'tusker', …)]

This code uses Keras to load a VGG16 model with weights pre-trained on ImageNet and predicts the classes for a new image. The image is resized and preprocessed to match the model's expected input, and the model's predict method returns class probabilities, which decode_predictions maps to human-readable labels. This method is straightforward and requires no additional training.
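
Note that decode_predictions returns (class_id, class_name, score) triples, so the confidence of each prediction is also available. Here is a small sketch that reuses the loaded model to label several images; the file names in image_paths are hypothetical placeholders.

# Hypothetical image files; replace with your own paths.
image_paths = ['elephant.jpg', 'dog.jpg', 'cat.jpg']

for path in image_paths:
    img = image.load_img(path, target_size=(224, 224))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    preds = model.predict(x)
    for class_id, class_name, score in decode_predictions(preds, top=3)[0]:
        print(f'{path}: {class_name} ({score:.2%})')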

Method 4: Transfer Learning with Custom Input Shapes

Keras pre-trained models allow you to customize the input shape as long as the include_top parameter is False. This is especially useful if your images differ in size from the standard size the model was trained on (for VGG16, any width and height of at least 32 pixels is accepted). You can adapt any pre-trained model to your specific input shape and then use it for feature extraction or fine-tuning.

Here’s an example:

from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Dense, Flatten
from keras import Input

# Load VGG16 model without the classifier layers and with a custom input shape.
input_tensor = Input(shape=(160, 160, 3))
base_model = VGG16(weights='imagenet', include_top=False, input_tensor=input_tensor)

# Add new top layers.
x = base_model.output
x = Flatten()(x)
predictions = Dense(10, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)

Output: A Keras model with the VGG16 architecture adapted to a custom 160×160×3 input shape.

This example illustrates how to customize the input shape of a pre-trained VGG16 model in Keras. The model is loaded with a new input size, and the classification layers are then built on top of the resulting feature maps. Because Flatten is used, the input shape must be fixed up front; a GlobalAveragePooling2D layer, as in Methods 1 and 2, would also work with variable sizes. This gives flexibility when dealing with images of sizes different from those in the pre-trained dataset.
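
One quick way to see the effect of the new input shape is to inspect the adapted model before training. A brief sketch using the model built above, with the expected shapes noted as comments:

# For 160x160 inputs, VGG16's convolutional base produces 5x5x512 feature maps
# instead of the 7x7x512 it produces for the default 224x224 inputs.
print(base_model.output_shape)   # (None, 5, 5, 512)
print(model.output_shape)        # (None, 10)

# As in Method 1, the base can be frozen so only the new Flatten/Dense head is trained.
for layer in base_model.layers:
    layer.trainable = False
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])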

Bonus One-Liner Method 5: Loading a Model with Keras for Instant Use

If you simply want to load a pre-trained model and use it without any changes, Keras provides a one-liner. This is the quickest way to leverage a pre-trained model in your Python applications for out-of-the-box predictions or transfer learning.

Here’s an example:

from keras.applications import ResNet50
model = ResNet50(weights='imagenet')

Output: A fully loaded ResNet50 Keras model with pre-trained weights ready for use.

This example shows the simplicity of using Keras to load a pre-trained model. With just one line of code, the ResNet50 model is ready to perform predictions. This approach is excellent for rapid prototyping where no custom architecture or fine-tuning is needed.
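
The loaded ResNet50 can be used exactly like the VGG16 predictor in Method 3, provided you apply the ResNet50-specific preprocessing. A short sketch, assuming a local image file named 'elephant.jpg':

from keras.applications.resnet50 import preprocess_input, decode_predictions
from keras.preprocessing import image
import numpy as np

# Load and preprocess a single image, then print the top-3 ImageNet labels.
img = image.load_img('elephant.jpg', target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
print(decode_predictions(model.predict(x), top=3)[0])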

Summary/Discussion

  • Method 1: Feature Extraction with a Pre-Trained Model. Best for small datasets. Can leverage powerful features learned on large datasets. May not capture dataset-specific intricacies.
  • Method 2: Fine-Tuning a Pre-Trained Model. Great for larger datasets. Customizes the model more towards your specific task. Requires more computational resources and can overfit if not managed properly.
  • Method 3: Using Pre-Trained Models as Predictors. Quick and easy. Great for similar datasets. No training needed, but inflexible for dataset-specific features.
  • Method 4: Transfer Learning with Custom Input Shapes. Offers flexibility in handling different image sizes. Requires customization of the top layer according to the new shape.
  • Method 5: Loading a Model with Keras for Instant Use. Most user-friendly approach for immediate prediction tasks. Limited to using the model “as is” – no customization.