💡 Problem Formulation: When working with Keras in Python, it’s crucial for data scientists and machine learning engineers to be able to save their models after training. Saving a model allows for operational deployment, further training, evaluation, or sharing with others. Imagine you’ve just trained a sophisticated neural network, and you need to save the complete model with its architecture, weights, and training configuration to be reused later without retraining. We explore solutions to save your Keras model efficiently.
Method 1: Using the model.save() Function
This method saves a Keras model into a single HDF5 file containing the architecture of the model, the weights of the model, the training configuration (loss, optimizer), and the state of the optimizer. This allows you to resume training exactly where you left off.
Here’s an example:
from keras.models import Sequential
from keras.layers import Dense
import numpy as np

# Assuming you have a simple model
model = Sequential([
    Dense(64, activation='relu', input_shape=(32,)),
    Dense(10, activation='softmax')
])

# Compile your model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Saving the model
model.save('my_model.h5')
Output: The model is saved as ‘my_model.h5’.
The code snippet demonstrates how to define a simple Keras model and save it using the model.save() function. The filename you pass determines where the model’s details are stored, and the ‘.h5’ extension indicates the HDF5 file format used for storage.
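To verify the round trip, the saved file can be restored with load_model(). Here is a minimal sketch that rebuilds the small example model above, saves it, and loads it back:

```python
from keras.models import Sequential, load_model
from keras.layers import Dense

# Rebuild the small example model from above and save it
model = Sequential([
    Dense(64, activation='relu', input_shape=(32,)),
    Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.save('my_model.h5')

# load_model() restores architecture, weights, and training configuration
restored = load_model('my_model.h5')
```

The restored model is already compiled, so it can make predictions or continue training immediately.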
Method 2: Using save_weights() and model.to_json()
Saving weights separately from the model architecture can be beneficial if you only need to preserve the model’s state but not its configuration. The save_weights() function saves the learned weights, while model.to_json() serializes the architecture of the model as a JSON string.
Here’s an example:
# Serialize the architecture to JSON
model_json = model.to_json()
with open('model.json', 'w') as json_file:
    json_file.write(model_json)

# Save weights to HDF5
model.save_weights('model_weights.h5')
Output: Model architecture in ‘model.json’ and weights in ‘model_weights.h5’.
We first convert the model’s architecture to JSON format and write it to a file. Then, we save the trained weights into an HDF5 file. This approach is useful when you need the flexibility to load the weights into a different architecture.
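Loading the pieces back takes two steps: rebuild the architecture with model_from_json(), then load the weights into it. A minimal sketch follows; note that recent Keras versions require weight filenames to end in .weights.h5, so the name below differs slightly from the snippet above:

```python
from keras.models import Sequential, model_from_json
from keras.layers import Dense

model = Sequential([
    Dense(64, activation='relu', input_shape=(32,)),
    Dense(10, activation='softmax')
])

# Serialize architecture and weights separately
model_json = model.to_json()
model.save_weights('model.weights.h5')  # newer Keras enforces the .weights.h5 suffix

# Rebuild the architecture from JSON, then load the weights into it
restored = model_from_json(model_json)
restored.load_weights('model.weights.h5')
```

Because the restored model is rebuilt from configuration alone, it is not compiled; call compile() again before resuming training.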
Method 3: Using save_weights() and model.to_yaml()
Similar to the JSON method, this approach saves the model weights and architecture separately. The architecture is saved in YAML format, which might be preferred in some environments for its readability. Note that to_yaml() was removed in TensorFlow 2.6, so this method only applies to older Keras versions.
Here’s an example:
# Serialize the architecture to YAML
model_yaml = model.to_yaml()
with open('model.yaml', 'w') as yaml_file:
    yaml_file.write(model_yaml)

# Save weights to HDF5
model.save_weights('model_weights.h5')
Output: Model architecture in ‘model.yaml’ and weights in ‘model_weights.h5’.
First, we serialize the model architecture into YAML format and save it into a file. Then, we store the weights separately in an HDF5 file. This two-stage approach is useful when the model configuration needs to be edited or read by humans or different software.
Method 4: Using the callbacks.ModelCheckpoint Callback
This method is ideal when you want to automatically save checkpoints of your model at regular intervals during training, or to keep only the best model according to its performance on a validation set.
Here’s an example:
from keras.callbacks import ModelCheckpoint

# Create model checkpoint callback
checkpoint = ModelCheckpoint('model_best_checkpoint.h5',
                             save_best_only=True,
                             monitor='val_loss',
                             mode='min')

# Fit the model, with the checkpoint callback
model.fit(X_train, Y_train, validation_split=0.2, callbacks=[checkpoint])
Output: The best model during training saved as ‘model_best_checkpoint.h5’.
By setting up a ModelCheckpoint callback, you can have Keras save the whole model automatically. In this case, we save the model that performs best on validation loss. The checkpointed file stores not just the weights but the entire model configuration.
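Since the snippet above assumes X_train and Y_train already exist, here is an end-to-end sketch using stand-in random arrays (the shapes are hypothetical; substitute your real training set):

```python
import numpy as np
from keras.models import Sequential, load_model
from keras.layers import Dense
from keras.callbacks import ModelCheckpoint

# Hypothetical random data standing in for a real dataset
X_train = np.random.rand(100, 32)
Y_train = np.random.rand(100, 10)

model = Sequential([
    Dense(64, activation='relu', input_shape=(32,)),
    Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy')

# Save the full model whenever validation loss improves
checkpoint = ModelCheckpoint('model_best_checkpoint.h5',
                             save_best_only=True,
                             monitor='val_loss',
                             mode='min')
model.fit(X_train, Y_train, epochs=2, validation_split=0.2,
          callbacks=[checkpoint], verbose=0)

# The checkpoint is a complete model and loads like any saved file
best = load_model('model_best_checkpoint.h5')
```

Because the checkpoint contains the full model, the best epoch can be restored even if training later diverges or is interrupted.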
Bonus One-Liner Method 5: Save and Load with joblib
For a quick save and load without going through Keras’ specific methods, you might consider Python’s joblib library, which serializes large numpy arrays efficiently. However, this does not save the training configuration or the state of the optimizer, and pickling a Keras model can fail outright depending on the Keras version.
Here’s an example:
from joblib import dump, load

# Saving the model
dump(model, 'model.joblib')

# Later on, to load the model
loaded_model = load('model.joblib')
Output: The model is saved to ‘model.joblib’ and can be loaded back from that file.
The one-liner uses dump from joblib to save the model and load to read it back from disk. Keep in mind that not everything may be preserved, such as the optimizer state.
Summary/Discussion
- Method 1: model.save(). Saves everything about the model into a single file. Very convenient, but creates large files and depends on HDF5.
- Method 2: save_weights() and model.to_json(). Offers flexibility to modify the architecture before loading weights. Requires additional steps when loading the model back.
- Method 3: save_weights() and model.to_yaml(). Human-readable format for the architecture. Same as Method 2 but with YAML.
- Method 4: callbacks.ModelCheckpoint. Best for saving models during training. Can save only the best model or checkpoint at intervals.
- Bonus One-Liner Method 5: joblib. Quick for serializing models. Not a native Keras method and doesn’t save optimizer state or configuration.