Quick Start
This example demonstrates how to tune the configuration of a MobileNet model deployed with TensorFlow Serving under Morphling.
For demonstration, we tune two configurations: the number of CPU cores (a resource allocation) and the maximum serving batch size (a runtime parameter). We use grid search for configuration sampling.
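Grid search evaluates every combination of the candidate values, so the number of sampling trials is the product of the list sizes. The sketch below only illustrates that enumeration. The candidate values are hypothetical placeholders; the real search space is defined in the experiment YAML applied in the next step.

```shell
# Hypothetical candidate values, for illustration only.
CPU_VALUES="500m 1000m 2000m"
BATCH_VALUES="1 2 4"

# Print every (cpu, BATCH_SIZE) combination, one per line.
grid_points() {
  for cpu in $CPU_VALUES; do
    for batch in $BATCH_VALUES; do
      echo "cpu=${cpu} BATCH_SIZE=${batch}"
    done
  done
}

grid_points
```

With three values per parameter, this prints nine combinations, which would correspond to nine trials under these assumed values.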
Submit the configuration tuning experiment
kubectl apply -f https://raw.githubusercontent.com/alibaba/morphling/main/examples/experiment/experiment-mobilenet-grid.yaml
Monitor the tuning experiment status
kubectl get pe
kubectl describe pe
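Instead of re-running the command by hand, the status check can be wrapped in a small polling loop. This is a minimal sketch, not part of Morphling: the helper names `experiment_state` and `wait_for_experiment` are made up here, and the parsing assumes STATE is the second column of `kubectl get pe`, as in the expected output shown later in this guide.

```shell
# Read `kubectl get pe` output on stdin and print the STATE column
# (second field) of the row whose NAME matches $1.
experiment_state() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# Poll until the experiment reports Succeeded or the attempts run out.
# Arguments: experiment name, number of attempts, seconds between attempts.
wait_for_experiment() {
  name=$1
  attempts=$2
  interval=$3
  i=0
  while [ "$i" -lt "$attempts" ]; do
    state=$(kubectl get pe | experiment_state "$name")
    echo "state: ${state:-unknown}"
    if [ "$state" = "Succeeded" ]; then
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  return 1
}

# Example call (the attempt count and interval are arbitrary choices):
# wait_for_experiment mobilenet-experiment-grid 60 10
```

The loop only treats `Succeeded` as done; any other state keeps it polling until the attempts are exhausted, so inspect `kubectl describe pe` if it times out.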
Monitor the sampling trials (performance tests)
kubectl get trial
Get the searched optimal configuration
kubectl get pe
Expected output:
NAME                        STATE       AGE     OBJECT NAME   OPTIMAL OBJECT VALUE   OPTIMAL PARAMETERS
mobilenet-experiment-grid   Succeeded   5m59s   qps           31                     [map[category:env name:BATCH_SIZE value:2] map[category:resource name:cpu value:2000m] map[category:resource name:memory value:2000Mi]]
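The OPTIMAL PARAMETERS column is printed as a list of `map[...]` entries, which is awkward to reuse in scripts. The sketch below turns those entries into `name=value` lines. The helper name `optimal_parameters` is hypothetical, and the pattern assumes the exact `name:... value:...` layout shown in the output above.

```shell
# Read text containing "map[category:... name:X value:Y]" entries on stdin
# and print one "X=Y" line per entry.
optimal_parameters() {
  grep -o 'name:[^] ]* value:[^] ]*' |
    sed 's/name:\([^ ]*\) value:\(.*\)/\1=\2/'
}

# Sample row copied from the expected output above.
row='mobilenet-experiment-grid Succeeded 5m59s qps 31 [map[category:env name:BATCH_SIZE value:2] map[category:resource name:cpu value:2000m] map[category:resource name:memory value:2000Mi]]'

printf '%s\n' "$row" | optimal_parameters
# Prints:
# BATCH_SIZE=2
# cpu=2000m
# memory=2000Mi
```

Against a live cluster, the same function can be fed directly, for example `kubectl get pe | optimal_parameters`.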
Delete the tuning experiment
kubectl delete pe --all