Loading and serving Gluon models on Multi Model Server (MMS)
Multi Model Server (MMS) supports loading and serving MXNet imperative and hybrid Gluon models. This short tutorial shows how to write a custom Gluon model and then serve it with MMS.
This tutorial covers serving a pre-trained Gluon model, a custom imperative Gluon model, and a custom hybrid Gluon model.
Prerequisites
- Basic Gluon knowledge. If you are using Gluon for the first time, but are familiar with creating a neural network with MXNet or another framework, you may refer to this 10-minute Gluon crash course: Predict with a pre-trained model.
- Gluon naming. Fine-tuning pre-trained Gluon models requires some understanding of how the naming conventions work. Take a look at the Naming of Gluon Parameter and Blocks tutorial for more information.
- Basic MMS knowledge. If you are using MMS for the first time, you should take advantage of the MMS QuickStart tutorial.
- MMS installed. If you haven’t already, install MMS with pip or install MMS from source. Either installation will also install MXNet.
Refer to the MXNet model zoo documentation for examples of accessing other models.
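For instance, any of the pre-trained vision models in the zoo can be pulled with a single call. A minimal sketch (resnet18_v1 is used here purely as an illustration; the rest of this tutorial uses AlexNet):

from mxnet.gluon.model_zoo import vision

# Downloads the pre-trained weights on first use and returns a ready-to-run Gluon network.
net = vision.resnet18_v1(pretrained=True)
print(net)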
Load and serve a Gluon model
There are three scenarios for serving a Gluon model with MMS:
- Load and serve a pre-trained Gluon model.
- Load and serve a custom imperative Gluon model.
- Load and serve a custom hybrid Gluon model.
To learn more about the differences between imperative and hybrid Gluon models, refer to the Gluon hybridize tutorial.
Load and serve a pre-trained Gluon model
Loading and serving a pre-trained Gluon model is the simplest of the three scenarios. These models don't require you to provide symbols and params files.
It is easy to access a model with a couple of lines of code. The following code snippet shows how to load and serve a pretrained Gluon model.
import mxnet

# gluon_base_service.py is copied next to this handler (see the steps below).
from gluon_base_service import GluonBaseService


class PretrainedAlexnetService(GluonBaseService):
    """
    Pretrained alexnet Service
    """
    def initialize(self, params):
        # Pull the pre-trained AlexNet from the Gluon model zoo.
        self.net = mxnet.gluon.model_zoo.vision.alexnet(pretrained=True)
        self.param_filename = "alexnet.params"
        super(PretrainedAlexnetService, self).initialize(params)

    def postprocess(self, data):
        # Return the top-5 classes with their probabilities.
        idx = data.topk(k=5)[0]
        return [[{'class': (self.labels[int(i.asscalar())]).split()[1],
                  'probability': float(data[0, int(i.asscalar())].asscalar())}
                 for i in idx]]


svc = PretrainedAlexnetService()


def pretrained_gluon_alexnet(data, context):
    """Handler entry point referenced by the model archive."""
    res = None
    if not svc.initialized:
        svc.initialize(context)
    if data is not None:
        res = svc.predict(data)
    return res
For the full implementation, refer to the custom-service code, which uses the pre-trained AlexNet.
Serve pre-trained model with MMS
To serve the pre-trained model with MMS, you need to create a model archive file. Follow the steps below to load and serve the example custom service with MMS.
- Create a models directory.
  mkdir /tmp/models
- Copy the example code and the other required artifacts into this folder.
  cp ../model_service_template/gluon_base_service.py ../model_service_template/mxnet_utils/ndarray.py gluon_pretrained_alexnet.py synset.txt signature.json /tmp/models/.
- Run the model-archiver tool on this folder.
  model-archiver --model-name alexnet --model-path /tmp/models --handler gluon_pretrained_alexnet:pretrained_gluon_alexnet --runtime python --export-path /tmp
  This creates a model archive file /tmp/alexnet.mar.
- Run the server with this model archive to serve the pre-trained AlexNet.
  multi-model-server --start --models alexnet.mar --model-store /tmp
- Test your service. (A Python alternative is sketched after this list.)
  curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
  curl -X POST http://127.0.0.1:8080/alexnet/predict -F "data=@kitten.jpg"
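If you prefer to test from Python rather than curl, here is a minimal sketch using the requests library (an assumption of this sketch, not part of the example), pointing at the same endpoint and the kitten.jpg downloaded above:

import requests

# POST the image as multipart form data to the prediction endpoint.
with open('kitten.jpg', 'rb') as f:
    resp = requests.post('http://127.0.0.1:8080/alexnet/predict',
                         files={'data': f})
print(resp.json())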
Load and serve a custom Gluon imperative model
To load an imperative model for use with MMS, you must instantiate the network in an MMS custom service. Once instantiated, MMS can load the pre-trained parameters and start serving the imperative model. You also need to handle pre-processing and post-processing of the image input.
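In this example, the pre-processing itself is handled by the gluon_base_service.py template and mxnet_utils/ndarray.py that you copy alongside the handler, but conceptually it amounts to decoding the request image and shaping it for the network. A rough sketch of that kind of transform, assuming a 224x224 RGB input (the exact steps in the template may differ):

import mxnet as mx

def transform_image(raw_bytes, size=224):
    # Decode the uploaded image bytes into an HWC uint8 NDArray.
    img = mx.image.imdecode(raw_bytes)
    # Resize to the network's expected input size.
    img = mx.image.imresize(img, size, size)
    # HWC uint8 -> CHW float32, scaled to [0, 1], plus a batch dimension.
    img = img.transpose((2, 0, 1)).astype('float32') / 255.0
    return img.expand_dims(axis=0)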
We created a custom imperative model using Gluon; refer to the custom service code. The network definition used in the example is as follows:
from mxnet import gluon
from mxnet.gluon import nn


class GluonImperativeAlexNet(gluon.Block):
    """
    Fully imperative gluon Alexnet model
    """
    def __init__(self, classes=1000, **kwargs):
        super(GluonImperativeAlexNet, self).__init__(**kwargs)
        with self.name_scope():
            self.features = nn.Sequential(prefix='')
            with self.features.name_scope():
                self.features.add(nn.Conv2D(64, kernel_size=11, strides=4,
                                            padding=2, activation='relu'))
                self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
                self.features.add(nn.Conv2D(192, kernel_size=5, padding=2,
                                            activation='relu'))
                self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
                self.features.add(nn.Conv2D(384, kernel_size=3, padding=1,
                                            activation='relu'))
                self.features.add(nn.Conv2D(256, kernel_size=3, padding=1,
                                            activation='relu'))
                self.features.add(nn.Conv2D(256, kernel_size=3, padding=1,
                                            activation='relu'))
                self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
                self.features.add(nn.Flatten())
                self.features.add(nn.Dense(4096, activation='relu'))
                self.features.add(nn.Dropout(0.5))
                self.features.add(nn.Dense(4096, activation='relu'))
                self.features.add(nn.Dropout(0.5))
            self.output = nn.Dense(classes)

    def forward(self, x):
        x = self.features(x)
        x = self.output(x)
        return x
The pre-processing, inference, and post-processing steps are similar to the service code that we saw in the section above.
# gluon_base_service.py is copied next to this handler (see the steps below).
from gluon_base_service import GluonBaseService


class ImperativeAlexnetService(GluonBaseService):
    """
    Gluon alexnet Service
    """
    def initialize(self, params):
        # Instantiate the imperative network; the base service loads the
        # weights from alexnet.params in the model archive.
        self.net = GluonImperativeAlexNet()
        self.param_filename = "alexnet.params"
        super(ImperativeAlexnetService, self).initialize(params)

    def postprocess(self, data):
        # Return the top-5 classes with their probabilities.
        idx = data.topk(k=5)[0]
        return [[{'class': (self.labels[int(i.asscalar())]).split()[1],
                  'probability': float(data[0, int(i.asscalar())].asscalar())}
                 for i in idx]]


svc = ImperativeAlexnetService()


def imperative_gluon_alexnet_inf(data, context):
    """Handler entry point referenced by the model archive."""
    res = None
    if not svc.initialized:
        svc.initialize(context)
    if data is not None:
        res = svc.predict(data)
    return res
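The alexnet.params file downloaded in the steps below is a saved parameter dictionary that matches this network. A minimal sketch of how such a file could be produced and loaded back (the initialization and path here are illustrative, not how the example's file was actually generated):

import mxnet as mx

net = GluonImperativeAlexNet()
net.initialize(mx.init.Xavier())
# One forward pass materializes the deferred-initialized parameter shapes.
net(mx.nd.random.uniform(shape=(1, 3, 224, 224)))
net.save_parameters('/tmp/models/alexnet.params')

# Load the parameters back into a fresh instance, as the custom service does internally.
restored = GluonImperativeAlexNet()
restored.load_parameters('/tmp/models/alexnet.params')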
Test your imperative Gluon model service
To serve imperative Gluon models with MMS, you need to create a model archive file. Follow the steps below to load and serve the example custom service with MMS.
- Create a models directory.
  mkdir /tmp/models
- Copy the example code and the other required artifacts into this folder.
  cp ../model_service_template/gluon_base_service.py ../model_service_template/mxnet_utils/ndarray.py gluon_imperative_alexnet.py synset.txt signature.json /tmp/models/.
- Download or copy the parameters file into the /tmp/models directory. For this example, the parameters file is hosted in an S3 bucket.
  wget https://s3.amazonaws.com/gluon-mms-model-files/alexnet.params
  mv alexnet.params /tmp/models
- Run the model-archiver tool on this folder.
  model-archiver --model-name alexnet --model-path /tmp/models --handler gluon_imperative_alexnet:imperative_gluon_alexnet_inf --runtime python --export-path /tmp
  This creates a model archive file /tmp/alexnet.mar.
- Run the server with this model archive to serve the imperative AlexNet.
  multi-model-server --start --models alexnet.mar --model-store /tmp
- Test your service.
  curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
  curl -X POST http://127.0.0.1:8080/alexnet/predict -F "data=@kitten.jpg"
The output should be close to the following:
{"prediction":[{"class":"lynx,","probability":0.9411474466323853},{"class":"leopard,","probability":0.016749195754528046},{"class":"tabby,","probability":0.012754007242619991},{"class":"Egyptian","probability":0.011728651821613312},{"class":"tiger","probability":0.008974711410701275}]}
Load and serve a hybrid Gluon model
To serve hybrid Gluon models with MMS, let's consider gluon_imperative_alexnet.py in the multi-model-server/examples/gluon_alexnet folder. We first convert the model to a Gluon hybrid block. For additional background on using HybridBlocks and the need to hybridize, refer to this Gluon hybridize tutorial. After this conversion, the above network looks as follows:
from mxnet.gluon import nn, HybridBlock


class GluonHybridAlexNet(HybridBlock):
    """
    Hybrid Block gluon model
    """
    def __init__(self, classes=1000, **kwargs):
        super(GluonHybridAlexNet, self).__init__(**kwargs)
        with self.name_scope():
            self.features = nn.HybridSequential(prefix='')
            with self.features.name_scope():
                self.features.add(nn.Conv2D(64, kernel_size=11, strides=4,
                                            padding=2, activation='relu'))
                self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
                self.features.add(nn.Conv2D(192, kernel_size=5, padding=2,
                                            activation='relu'))
                self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
                self.features.add(nn.Conv2D(384, kernel_size=3, padding=1,
                                            activation='relu'))
                self.features.add(nn.Conv2D(256, kernel_size=3, padding=1,
                                            activation='relu'))
                self.features.add(nn.Conv2D(256, kernel_size=3, padding=1,
                                            activation='relu'))
                self.features.add(nn.MaxPool2D(pool_size=3, strides=2))
                self.features.add(nn.Flatten())
                self.features.add(nn.Dense(4096, activation='relu'))
                self.features.add(nn.Dropout(0.5))
                self.features.add(nn.Dense(4096, activation='relu'))
                self.features.add(nn.Dropout(0.5))
            self.output = nn.Dense(classes)

    def hybrid_forward(self, F, x):
        x = self.features(x)
        x = self.output(x)
        return x
We can use almost the same custom service code as in the section above; the only addition is a call to self.net.hybridize() in initialize():
# gluon_base_service.py is copied next to this handler (see the steps below).
from gluon_base_service import GluonBaseService


class HybridAlexnetService(GluonBaseService):
    """
    Gluon alexnet Service
    """
    def initialize(self, params):
        self.net = GluonHybridAlexNet()
        self.param_filename = "alexnet.params"
        super(HybridAlexnetService, self).initialize(params)
        # Compile the network into a static graph for faster inference.
        self.net.hybridize()

    def postprocess(self, data):
        # Return the top-5 classes with their probabilities.
        idx = data.topk(k=5)[0]
        return [[{'class': (self.labels[int(i.asscalar())]).split()[1],
                  'probability': float(data[0, int(i.asscalar())].asscalar())}
                 for i in idx]]


svc = HybridAlexnetService()


def hybrid_gluon_alexnet_inf(data, context):
    """Handler entry point referenced by the model archive."""
    res = None
    if not svc.initialized:
        svc.initialize(context)
    if data is not None:
        res = svc.predict(data)
    return res
As with the imperative model, this model doesn't require symbol files, because the call to .hybridize() compiles the neural network and stores the symbols implicitly.
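As a minimal sketch of what that means in practice, assuming the same alexnet.params file used in the steps below (the export call and output prefix are illustrative; MMS does not require this step):

import mxnet as mx

net = GluonHybridAlexNet()
net.load_parameters('/tmp/models/alexnet.params')
net.hybridize()
# The first forward pass triggers compilation of the static graph.
net(mx.nd.random.uniform(shape=(1, 3, 224, 224)))
# Optionally write the compiled graph out explicitly; this produces
# alexnet-hybrid-symbol.json and alexnet-hybrid-0000.params.
net.export('/tmp/alexnet-hybrid')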
Test your hybrid Gluon model service
To serve hybrid Gluon models with MMS, you need to create a model archive file. Follow the steps below to load and serve the example custom service with MMS.
- Create a models directory.
  mkdir /tmp/models
- Copy the example code and the other required artifacts into this folder.
  cp ../model_service_template/gluon_base_service.py ../model_service_template/mxnet_utils/ndarray.py gluon_hybrid_alexnet.py synset.txt signature.json /tmp/models/.
- Download or copy the parameters file into the /tmp/models directory. For this example, the parameters file is hosted in an S3 bucket.
  wget https://s3.amazonaws.com/gluon-mms-model-files/alexnet.params
  mv alexnet.params /tmp/models
- Run the model-archiver tool on this folder.
  model-archiver --model-name alexnet --model-path /tmp/models --handler gluon_hybrid_alexnet:hybrid_gluon_alexnet_inf --runtime python --export-path /tmp
  This creates a model archive file /tmp/alexnet.mar.
- Run the server with this model archive to serve the hybrid AlexNet.
  multi-model-server --start --models alexnet.mar --model-store /tmp
- Test your service.
  curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
  curl -X POST http://127.0.0.1:8080/alexnet/predict -F "data=@kitten.jpg"
The output should be close to the following:
{"prediction":[{"class":"lynx,","probability":0.9411474466323853},{"class":"leopard,","probability":0.016749195754528046},{"class":"tabby,","probability":0.012754007242619991},{"class":"Egyptian","probability":0.011728651821613312},{"class":"tiger","probability":0.008974711410701275}]}
Conclusion
In this tutorial you learned how to serve Gluon models in three scenarios: a pre-trained imperative model directly from the model zoo, a custom imperative model, and a hybrid model. For further examples of customizing Gluon models, try the Gluon tutorial for Transferring knowledge through fine-tuning. For an advanced custom service example, try the MMS SSD example.