Overview
This post is entirely for fun. I am trying a developer preview product: the OpenShift Container Platform 4 (OCP 4) Installer Provisioned Infrastructure (IPI) on Microsoft Azure.
I really didn’t want the day to end in blood, sweat, and tears, so I went through as much documentation as I could find on OCP 4.1, covering AWS, Azure, and some of the code. I created a pay-as-you-go account and purchased a domain name. For now let’s call it example.com.
Blood, Sweat, and Tears (or not)
The first thing I created in Azure was a resource group called openshift4-azure. In that resource group I created a public DNS zone for my domain and delegated the domain to the Azure DNS servers. The OCP 4 installer uses this zone to create the DNS entries it needs to manage traffic into the cluster.
I then set up my local Go environment and GOPATH by following https://golang.org/doc/install. I needed this to compile the installer for Azure, as the binaries are not readily available yet. To test that this was working I ran env | grep GOPATH. My path is $HOME/go.
I then forked the openshift/installer repository and cloned it into the Go path: $HOME/go/src/github.com/openshift/. I added the upstream remote to my fork with git remote add upstream https://github.com/openshift/installer.git in case I made code or documentation changes for PRs. To build the binary I ran ./hack/build.sh from the installer directory. This created the installer in the bin folder.
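The fork, clone, and build steps above could be scripted roughly as follows. This is only a sketch under a few assumptions: the fork URL is passed in as an argument (use your own), GOPATH falls back to $HOME/go as in this post, and the function is defined here but not run.

```shell
# Fall back to $HOME/go when GOPATH is not already set (the path used in this post).
GOPATH="${GOPATH:-$HOME/go}"
export GOPATH

SRC_DIR="$GOPATH/src/github.com/openshift"
INSTALLER_DIR="$SRC_DIR/installer"

# clone_and_build FORK_URL
# Clones the fork into the Go path, adds the upstream remote, and builds.
# Runs in a subshell so the caller's working directory is left alone.
clone_and_build() {
  (
    mkdir -p "$SRC_DIR" &&
    git clone "$1" "$INSTALLER_DIR" &&
    cd "$INSTALLER_DIR" &&
    git remote add upstream https://github.com/openshift/installer.git &&
    ./hack/build.sh
  )
}

echo "The installer binary will end up in $INSTALLER_DIR/bin"
```

Calling clone_and_build with the URL of your fork would then perform the whole sequence in one go.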
I followed the instructions at https://github.com/openshift/installer/tree/master/docs/user/azure/install.md to clone an image for CoreOS in my region. In every region where I want to create a cluster I need to copy the same image. I wanted to run these steps repeatedly, so I downloaded the Azure CLI from https://docs.microsoft.com/en-us/cli/azure/install-azure-cli-yum?view=azure-cli-latest. I’m running this in uksouth, so these are the commands I needed to run:
export VHD_NAME=rhcos-410.8.20190504.0-azure.vhd
az storage account create --location uksouth --name ckocp4storage --kind StorageV2 --resource-group openshift-azure
az storage container create --name vhd --account-name ckocp4storage
az group create --location uksouth --name rhcos_images
ACCOUNT_KEY=$(az storage account keys list --account-name ckocp4storage --resource-group openshift-azure --query "[0].value" -o tsv)
az storage blob copy start --account-name "ckocp4storage" --account-key "$ACCOUNT_KEY" --destination-blob "$VHD_NAME" --destination-container vhd --source-uri "https://openshifttechpreview.blob.core.windows.net/rhcos/$VHD_NAME"
It took me a few tries to find a unique storage account name. The name has to be unique across all of Azure, not just within a region.
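To save a few of those tries, the name format can be checked locally before calling Azure. The sketch below assumes the usual Azure rule of 3 to 24 lowercase letters and digits; the helper name is made up for illustration, and it says nothing about whether the name is already taken.

```shell
# Returns 0 when the name is 3-24 characters long and contains only
# lowercase letters and digits.
is_valid_storage_account_name() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{3,24}$'
}

if is_valid_storage_account_name "ckocp4storage"; then
  echo "name format looks valid"
fi

# Availability still has to be checked against Azure, for example with:
#   az storage account check-name --name ckocp4storage
```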
It is recommended to use the Premium_LRS SKU. To get premium storage in a PAYG account, I needed to register the storage resource provider under PayAsYouGo subscription -> Resource Providers. The storage blob also needs to finish copying before the image is created, otherwise you get the following error:
Cannot import source blob https://ckocp4storage.blob.core.windows.net/vhd/rhcos-410.8.20190504.0-azure.vhd since it has not been completely copied yet. Copy status of the blob is CopyPending.
export RHCOS_VHD=$(az storage blob url --account-name ckocp4storage -c vhd --name "$VHD_NAME" -o tsv)
az image create --resource-group rhcos_images --name rhcostestimage --os-type Linux --storage-sku Premium_LRS --source "$RHCOS_VHD" --location uksouth
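One way to avoid the CopyPending error is to poll the blob until the copy reports success before running az image create. This is a rough sketch: it reuses the VHD_NAME and ACCOUNT_KEY variables and the ckocp4storage account from the commands above, while the function names, attempt count, and sleep interval are my own choices.

```shell
# Print the copy status of the VHD blob (for example "pending" or "success").
blob_copy_status() {
  az storage blob show \
    --account-name ckocp4storage \
    --account-key "$ACCOUNT_KEY" \
    --container-name vhd \
    --name "$VHD_NAME" \
    --query "properties.copy.status" -o tsv
}

# wait_for_blob_copy MAX_ATTEMPTS SLEEP_SECONDS
# Returns 0 as soon as the status is "success", 1 if it never gets there.
wait_for_blob_copy() {
  attempts=0
  status="unknown"
  while [ "$attempts" -lt "$1" ]; do
    status=$(blob_copy_status) || status="unknown"
    if [ "$status" = "success" ]; then
      return 0
    fi
    attempts=$((attempts + 1))
    sleep "$2"
  done
  echo "blob copy did not finish, last status: $status" >&2
  return 1
}
```

Running wait_for_blob_copy 60 10 before az image create would wait up to about ten minutes for the copy to finish.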
I created a service principal for my installation and copied the following somewhere safe:
az ad sp create-for-rbac --name openshift4azure
{
  "appId": "serviceprincipal",
  "displayName": "openshift4azure",
  "name": "http://openshift4azure",
  "password": "serviceprincipalpassword",
  "tenant": "tenant id"
}
And gave it the following access:
az role assignment create --assignee serviceprincipal --role "User Access Administrator"
az role assignment create --assignee serviceprincipal --role "Contributor"
I then got my oc client and pull secret as described at https://cloud.redhat.com/openshift/install/azure/user-provisioned.
I tried my first Azure IPI OCP4 install and the first thing that I got was the following.
openshift-install create cluster
? SSH Public Key $HOME/.ssh/id_rsa.pub
? azure subscription id yyyy-xxxx-nnnn-bbbb-fffffff
? azure tenant id yyy-xxxx-nnnn-bbbb-nnnnnn
? azure service principal client id yyy-xxxx-nnnn-bbbb-ccccccccc
? azure service principal client secret [? for help] ************************************
INFO Saving user credentials to "$HOME/.azure/osServicePrincipal.json"
? Region uksouth
? Base Domain example.com
? Cluster Name attempt1
? Pull Secret [? for help] ************************************
INFO Creating infrastructure resources...
^CERROR
ERROR Error: compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="OperationNotAllowed" Message="Operation results in exceeding quota limits of Core. Maximum allowed: 10, Current in use: 8, Additional requested: 8. Please read more about quota increase at https://aka.ms/ProdportalCRP/?#create/Microsoft.Support/Parameters/{\"subId\":\"ae90eef6-f8ea-479c-8c6a-9dd4bf9e47d0\",\"pesId\":\"15621\",\"supportTopicId\":\"32447243\"}."
ERROR
ERROR on ../../../../../../../../tmp/openshift-install-216822811/bootstrap/main.tf line 117, in resource "azurerm_virtual_machine" "bootstrap":
ERROR 117: resource "azurerm_virtual_machine" "bootstrap" {
ERROR
ERROR
ERROR
ERROR Error: compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="OperationNotAllowed" Message="Operation results in exceeding quota limits of Core. Maximum allowed: 10, Current in use: 8, Additional requested: 8. Please read more about quota increase at https://aka.ms/ProdportalCRP/?#create/Microsoft.Support/Parameters/{\"subId\":\"ae90eef6-f8ea-479c-8c6a-9dd4bf9e47d0\",\"pesId\":\"15621\",\"supportTopicId\":\"32447243\"}."
ERROR
ERROR on ../../../../../../../../tmp/openshift-install-216822811/master/master.tf line 44, in resource "azurerm_virtual_machine" "master":
ERROR 44: resource "azurerm_virtual_machine" "master" {
ERROR
ERROR
ERROR
ERROR Error: compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=0 -- Original Error: autorest/azure: Service returned an error. Status=<nil> Code="OperationNotAllowed" Message="Operation results in exceeding quota limits of Core. Maximum allowed: 10, Current in use: 8, Additional requested: 8. Please read more about quota increase at https://aka.ms/ProdportalCRP/?#create/Microsoft.Support/Parameters/{\"subId\":\"ae90eef6-f8ea-479c-8c6a-9dd4bf9e47d0\",\"pesId\":\"15621\",\"supportTopicId\":\"32447243\"}."
ERROR
ERROR on ../../../../../../../../tmp/openshift-install-216822811/master/master.tf line 44, in resource "azurerm_virtual_machine" "master":
ERROR 44: resource "azurerm_virtual_machine" "master" {
ERROR
ERROR
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply using Terraform
A standard PAYG account does not allow for the amount of resources that IPI will create. The install needs more than the 10 cores available by default, so I needed to increase the compute quota to allow for creation:
Resource Manager, UKSOUTH, DSv2 Series from 10 to 100
Resource Manager, UKSOUTH, DSv3 Series from 10 to 100
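The arithmetic behind the quota error can be sketched in a couple of lines. The helper below is hypothetical and just restates the numbers from the error message (limit, cores in use, cores requested); it does not talk to Azure.

```shell
# quota_shortfall LIMIT IN_USE REQUESTED
# Prints how many cores the request goes over the limit by (0 if it fits).
quota_shortfall() {
  over=$(( $2 + $3 - $1 ))
  if [ "$over" -gt 0 ]; then
    echo "$over"
  else
    echo 0
  fi
}

# Numbers from the error above: limit 10, 8 in use, 8 more requested.
quota_shortfall 10 8 8   # prints 6

# Current usage per VM family can be listed with:
#   az vm list-usage --location uksouth -o table
```

With the quota raised to 100 the same call, quota_shortfall 100 8 8, prints 0.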
I also needed to export the environment variable that overrides the RHCOS install image, pointing it at the image I had created from the storage account blob:
export OPENSHIFT_INSTALL_OS_IMAGE_OVERRIDE="/resourceGroups/rhcos_images/providers/Microsoft.Compute/images/rhcostestimage"
I destroyed the stack with openshift-install destroy cluster --dir=cluster-dir to try again and watched with glee as my Azure attempt-1 resource group diminished. It was then time for attempt 2, for which I also passed the location of the Azure authentication credentials in a JSON file by exporting the variable AZURE_AUTH_LOCATION=creds.json. Baaaad idea. The installer overwrote the file at my credentials location. It’s a good thing I had a copy and didn’t particularly care.
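Given that experience, it may be worth backing up any credentials files before each install attempt. A small sketch, assuming the two file locations mentioned in this post; the backup_file helper and the timestamp suffix are my own invention.

```shell
# backup_file PATH
# Copies PATH to PATH.bak.<timestamp> and prints the name of the backup.
backup_file() {
  if [ ! -f "$1" ]; then
    echo "no such file: $1" >&2
    return 1
  fi
  backup="$1.bak.$(date +%Y%m%d%H%M%S)"
  cp -p "$1" "$backup" && echo "$backup"
}

# Both paths appear earlier in this post; back up whichever ones exist.
for f in "$HOME/.azure/osServicePrincipal.json" creds.json; do
  if [ -f "$f" ]; then
    backup_file "$f"
  fi
done
```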
Attempt 2 seems to have worked. I have an operational cluster. All my operators are running in a good state (not degraded and not progressing):
$ ~/bin/oc get co
NAME                                       VERSION                         AVAILABLE   PROGRESSING   DEGRADED   SINCE
authentication                             4.2.0-0.okd-2019-06-25-110619   True        False         False      46m
cloud-credential                           4.2.0-0.okd-2019-06-25-110619   True        False         False      66m
cluster-autoscaler                         4.2.0-0.okd-2019-06-25-110619   True        False         False      65m
console                                    4.2.0-0.okd-2019-06-25-110619   True        False         False      50m
dns                                        4.2.0-0.okd-2019-06-25-110619   True        False         False      65m
image-registry                             4.2.0-0.okd-2019-06-25-110619   True        False         False      59m
ingress                                    4.2.0-0.okd-2019-06-25-110619   True        False         False      53m
kube-apiserver                             4.2.0-0.okd-2019-06-25-110619   True        False         False      61m
kube-controller-manager                    4.2.0-0.okd-2019-06-25-110619   True        False         False      62m
kube-scheduler                             4.2.0-0.okd-2019-06-25-110619   True        False         False      61m
machine-api                                4.2.0-0.okd-2019-06-25-110619   True        False         False      66m
machine-config                             4.2.0-0.okd-2019-06-25-110619   True        False         False      62m
marketplace                                4.2.0-0.okd-2019-06-25-110619   True        False         False      59m
monitoring                                 4.2.0-0.okd-2019-06-25-110619   True        False         False      52m
network                                    4.2.0-0.okd-2019-06-25-110619   True        False         False      66m
node-tuning                                4.2.0-0.okd-2019-06-25-110619   True        False         False      60m
openshift-apiserver                        4.2.0-0.okd-2019-06-25-110619   True        False         False      60m
openshift-controller-manager               4.2.0-0.okd-2019-06-25-110619   True        False         False      62m
openshift-samples                          4.2.0-0.okd-2019-06-25-110619   True        False         False      53m
operator-lifecycle-manager                 4.2.0-0.okd-2019-06-25-110619   True        False         False      63m
operator-lifecycle-manager-catalog         4.2.0-0.okd-2019-06-25-110619   True        False         False      63m
operator-lifecycle-manager-packageserver   4.2.0-0.okd-2019-06-25-110619   True        False         False      59m
service-ca                                 4.2.0-0.okd-2019-06-25-110619   True        False         False      66m
service-catalog-apiserver                  4.2.0-0.okd-2019-06-25-110619   True        False         False      60m
service-catalog-controller-manager         4.2.0-0.okd-2019-06-25-110619   True        False         False      60m
storage                                    4.2.0-0.okd-2019-06-25-110619   True        False         False      59m
support                                    4.2.0-0.okd-2019-06-25-110619   True        False         False      66m
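Rather than eyeballing that table, the same check could be done with a small filter. This is a sketch that assumes the six-column layout shown above (every operator reporting a version); the function name is made up, and the two sample rows are invented purely to show the behaviour.

```shell
# Reads `oc get co --no-headers` style rows on stdin and prints the names of
# operators that are not Available=True, Progressing=False, Degraded=False.
unhealthy_operators() {
  awk '$3 != "True" || $4 != "False" || $5 != "False" { print $1 }'
}

# Example with two made-up rows in the same column layout; prints "ingress".
printf '%s\n' \
  'dns 4.2.0 True False False 65m' \
  'ingress 4.2.0 True True False 53m' | unhealthy_operators
```

Against a real cluster this would be run as oc get co --no-headers | unhealthy_operators, and no output means every operator is in a good state.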
Conclusion
For a first attempt on a developer preview, things went very well. I’ve trawled through the Azure logs and found things like access role issues, so I still don’t know if I’ve made a mistake on my Service Principal allocation. I think some better error handling and messages would help with the installer. I’d hate to see things like Machine Sets not being able to be expanded because my IAM is wrong and I didn’t know about it. Of course, general things like installation behind a proxy, bring your own DNS or SecurityGroups/Networking, and better publicising of the CoreOS images would also help.
I’m hoping to find out more as I use the cluster over the next few days. If you haven’t yet, try the installer on Azure and let me know what you think:
- To get started, visit try.openshift.com and click on “Get Started”.
- Log in or create a Red Hat account and follow the instructions for setting up your first cluster on Azure.
References
- Many thanks to https://github.com/mjudeikis for his help.
- My fork of the installer https://github.com/ckyriakidou/installer
- The installer overview https://github.com/openshift/installer/blob/master/docs/user/overview.md
- The OpenShift Container Platform Azure side of things https://github.com/openshift/installer/tree/master/docs/user/azure
- Using DNS Zone delegation in Azure https://docs.microsoft.com/en-us/azure/dns/dns-delegate-domain-azure-dns
- To install the OCP4 cli and get the Pull secret for OCP in Azure follow https://cloud.redhat.com/openshift/install/azure/user-provisioned
- More on OCP 4 can be found at https://docs.openshift.com/container-platform/4.1/welcome/index.html