# Deployment on Azure Machine Learning

## Prerequisites

```
cd inference/triton_server
```

Set the environment variables for AML:

```
export RESOURCE_GROUP=Dhruva-prod
export WORKSPACE_NAME=dhruva--central-india
export DOCKER_REGISTRY=dhruvaprod
```
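These variables are substituted into the commands in the later sections. As a quick local sanity check, the image reference used in the tag/push step resolves like this:

```shell
# Compose the fully qualified image reference used later for tag/push.
DOCKER_REGISTRY=dhruvaprod
IMAGE="$DOCKER_REGISTRY.azurecr.io/nmt/triton-indictrans-v2:latest"
echo "$IMAGE"
```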
Also remember to edit the `yml` files under `azure_ml/` so their values match your resource names.
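For orientation, an Azure ML model spec of the kind `azure_ml/model.yml` describes typically looks like the sketch below. The field values here are illustrative assumptions, not the repository's actual file:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/model.schema.json
name: indictrans2            # hypothetical model name
version: 1
path: ../model_repository    # hypothetical path to the Triton model repository
type: triton_model
```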
## Registering the model

```
az ml model create --file azure_ml/model.yml --resource-group $RESOURCE_GROUP --workspace-name $WORKSPACE_NAME
```
## Pushing the Docker image to the Container Registry

```
az acr login --name $DOCKER_REGISTRY
docker tag indictrans2_triton $DOCKER_REGISTRY.azurecr.io/nmt/triton-indictrans-v2:latest
docker push $DOCKER_REGISTRY.azurecr.io/nmt/triton-indictrans-v2:latest
```
## Creating the execution environment

```
az ml environment create -f azure_ml/environment.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
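The environment spec pairs the pushed image with the routes AML probes. A hedged, illustrative sketch follows; the repository's `azure_ml/environment.yml` is authoritative, and the name and routes here are assumptions based on Triton's default HTTP port (8000) and standard health endpoints:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: triton-indictrans-env    # hypothetical environment name
image: dhruvaprod.azurecr.io/nmt/triton-indictrans-v2:latest
inference_config:
  liveness_route:
    port: 8000
    path: /v2/health/live
  readiness_route:
    port: 8000
    path: /v2/health/ready
  scoring_route:
    port: 8000
    path: /v2/models
```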
## Publishing the endpoint for online inference

```
az ml online-endpoint create -f azure_ml/endpoint.yml -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
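A managed online endpoint spec is small; the sketch below shows the typical shape (the endpoint name is a hypothetical placeholder, and the repository's `azure_ml/endpoint.yml` is authoritative):

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: indictrans-v2-endpoint   # hypothetical endpoint name
auth_mode: key
```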
Now, from the Azure Portal, open the Container Registry and grant the `AcrPull` role to the endpoint's identity, so that it is allowed to pull the Docker image.
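The same role assignment can be sketched from the CLI instead of the Portal. This is an assumption about your setup, not a step from the original guide; `<endpoint-name>` is a placeholder for the name in `azure_ml/endpoint.yml`:

```shell
# Look up the endpoint's managed identity and the registry's resource ID,
# then grant the AcrPull role on the registry to that identity.
ENDPOINT_PRINCIPAL_ID=$(az ml online-endpoint show -n <endpoint-name> \
  -g $RESOURCE_GROUP -w $WORKSPACE_NAME --query identity.principal_id -o tsv)
ACR_ID=$(az acr show -n $DOCKER_REGISTRY --query id -o tsv)
az role assignment create --assignee "$ENDPOINT_PRINCIPAL_ID" \
  --role AcrPull --scope "$ACR_ID"
```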
## Attaching a deployment

```
az ml online-deployment create -f azure_ml/deployment.yml --all-traffic -g $RESOURCE_GROUP -w $WORKSPACE_NAME
```
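The deployment spec ties the registered model, the environment, and the compute SKU to the endpoint. The sketch below is illustrative only; every name and the instance type are assumptions, and the repository's `azure_ml/deployment.yml` is authoritative:

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: default                          # hypothetical deployment name
endpoint_name: indictrans-v2-endpoint  # must match the endpoint created above
model: azureml:indictrans2@latest      # hypothetical registered-model reference
environment: azureml:triton-indictrans-env@latest
instance_type: Standard_NC6s_v3        # assumption: a GPU SKU for Triton
instance_count: 1
```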
## Testing if inference works

1. From Azure ML Studio, go to the "Consume" tab, and get the endpoint domain (without `https://` or a trailing `/`) and an authentication key.
2. In `client.py`, set `ENABLE_SSL = True`, then set the `ENDPOINT_URL` variable as well as the `Authorization` value inside `HTTP_HEADERS`.
3. Run `python3 client.py`.
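The settings in step 2 might look like the following sketch. The variable names mirror those the steps mention, but the exact structure of the repository's `client.py` may differ, and the URL and key below are placeholders:

```python
# Hedged sketch of the client.py settings described above.
ENABLE_SSL = True
ENDPOINT_URL = "my-endpoint.centralindia.inference.ml.azure.com"  # hypothetical domain
AUTH_KEY = "<key-from-consume-tab>"  # placeholder; copy from the Consume tab

# AML online endpoints expect a Bearer token in the Authorization header.
HTTP_HEADERS = {"Authorization": f"Bearer {AUTH_KEY}"}

scheme = "https" if ENABLE_SSL else "http"
print(f"{scheme}://{ENDPOINT_URL}")
```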