As all of you know, we are in the process of migrating to OpenShift 4. Here is an update on the progress and some details about the changes you can expect.
Note that most documentation changes will be delayed until the end of the migration. Meanwhile, refer to this document to inform yourself about the differences.
Migration Plan and Schedule
Schedule
With support of OpenShift 3 expiring, we are on a tight schedule.
Due date | Description
---|---
2022-05-03 | Finish migrating all test systems
2022-05-05 | Start prod migrations
2022-05 | Migrate address service
2022-05 | Migrate thumbnail service
2022-05 | Migrate commit info service (Jira integration)
2022-06 | Migrate Sonar
2022-06 | Migrate manual
2022-06-30 | Prod migrations completed
Migration Details
There are two migration scenarios:
a) When we control all DNS records:
This is the case for all test systems and numerous production systems.
In this scenario, the migration path is straightforward:
- Set up installation on OpenShift 4
- Copy TLS certificate
- Deploy and start Nice on new platform
- Adjust DNS and wait for TTL to expire (1h)
- Stop installation on OpenShift 3
- Enable renewal of TLS certificates via ACME
During the migration, requests are spread between OpenShift 3 and OpenShift 4, on both of which Nice is running.
No downtime is expected and the full migration is completed within about an hour.
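If you want to verify that the DNS change has propagated before stopping OpenShift 3, a quick dig query does the trick. The hostname below is just an example; compare the answer against the addresses listed in the DNS section further down:
dig +short example.net A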
b) When the customer controls DNS records:
The migration path here is a bit more involved:
- Set up installation on OpenShift 4
- Forward traffic to /.well-known/acme-challenge/ from OS3 to OS4 employing a reverse proxy
- Issue TLS certificates via ACME on OpenShift 4
- Deploy and start Nice on new platform
- Forward all traffic from OS3 to OS4
- Wait for customer to update DNS (may take days or weeks)
- Remove reverse proxy on OS3
All traffic from OpenShift 3 is forwarded to OpenShift 4 to give the customer time to adjust the DNS records.
The migration takes as long as the customer needs to adjust the DNS records. Here too, no downtime is expected, but deployments will be unavailable for about an hour.
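While waiting for the customer, it can be useful to confirm that requests still hitting OpenShift 3 are forwarded correctly. One way to do so, assuming the old OpenShift 3 address from the DNS examples below and an example hostname, is to pin the hostname to that address with curl and check that the expected response comes back:
curl -sI --resolve example.net:443:5.102.151.2 https://example.net/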
Accessing OpenShift 4
Terminal
Login:
oc login -u <username> https://api.c-tocco-ocp4.tocco.ch:6443
Note that the toco- prefix has been dropped on OpenShift 4. That is, the project behind master is now called nice-master rather than toco-nice-master:
oc project nice-master
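If you are unsure about the exact project name, all projects you have access to can be listed:
oc projects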
On OpenShift 4, some of you have limited access to nodes. See Nodes / Resources. Those with access can also fetch resources across all namespaces using --all-namespaces.
List pods in all namespaces:
oc get pods --all-namespaces
List resource usage of all pods in all namespaces:
kubectl top pods --all-namespaces --sort-by cpu
Or list all pods in the Failed phase across all namespaces:
oc get pods --all-namespaces --field-selector 'status.phase==Failed' -o custom-columns='Namespace:metadata.namespace,Pod Name:metadata.name,Reason:status.reason'
OpenShift Web Console
The Web Console is available at https://console.apps.openshift.tocco.ch.
On OpenShift 4, some of you have limited access to nodes. See Nodes / Resources.
Changes in Ansible
In order to support OpenShift 4, many changes have been made to Ansible and TeamCity. These changes are at a lower level of abstraction and are, thus, transparent to users of Ansible.
The one and only change required to run an installation on OpenShift 4 is an explicit location:
location: cloudscale-os4
Once everything is moved, cloudscale-os4 will be made the default and removed again from the installations’ configurations.
Let me also point out another change, which isn't specific to OpenShift 4: some of you used to change the DOCKER_PULL_URL parameter in TeamCity manually. This parameter is now managed by Ansible to ensure the image is fetched from the right platform, so there is no need to adjust it manually anymore. With the new naming scheme, the Docker image for a production deployment is unconditionally fetched from <installation_name>test, provided that installation exists.
Nodes / Resources
Those of you with admin access (the same people who have root access) can now access node details.
List nodes:
$ oc get nodes
NAME STATUS ROLES AGE VERSION
infra-a5b4 Ready infra,worker 35d v1.22.5+5c84e52
infra-c235 Ready infra,worker 35d v1.22.5+5c84e52
infra-fc11 Ready infra,worker 35d v1.22.5+5c84e52
master-c946 Ready master 35d v1.22.5+5c84e52
master-d7ca Ready master 35d v1.22.5+5c84e52
master-fb50 Ready master 35d v1.22.5+5c84e52
worker-0188 Ready app,worker 6d17h v1.22.5+5c84e52
worker-565d Ready app,worker 35d v1.22.5+5c84e52
worker-61aa Ready app,worker 6d17h v1.22.5+5c84e52
The nodes prefixed with worker- are the ones that run instances of Nice.
Show resource consumption:
$ kubectl top nodes
NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%
infra-a5b4 1275m 17% 15970Mi 51%
infra-c235 584m 7% 11260Mi 36%
infra-fc11 826m 11% 16377Mi 52%
master-c946 1051m 30% 10937Mi 73%
master-d7ca 795m 22% 10103Mi 67%
master-fb50 658m 18% 8345Mi 56%
worker-0188 2257m 34% 32650Mi 51%
worker-565d 1134m 17% 17556Mi 27%
worker-61aa 1687m 25% 18507Mi 29%
Show resource requests and limits:
$ oc describe node worker-0188
…
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 3374m (51%) 200m (3%)
memory 35298Mi (55%) 73964Mi (116%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
…
Resource utilization is also available, in the form of graphs, in the Web Console.
Logging
Kibana can be accessed at https://kibana-openshift-logging.apps.openshift.tocco.ch.
The main difference is that search is no longer segregated by project/namespace. Filter by kubernetes.namespace_name to search the logs of a specific installation. As a result of this change, it's now possible to list or visualize log messages across all installations or any selection of them.
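For instance, to narrow a search down to the nice-master installation, something like this can be entered in Kibana's search bar:
kubernetes.namespace_name: "nice-master"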
DNS
Installations on OpenShift 4 require different DNS records.
Type A Records
This (OpenShift 3):
example.net. 3600 IN A 5.102.151.2
example.net. 3600 IN A 5.102.151.3
becomes (OpenShift 4):
example.net. IN A 5.102.151.37
Type CNAME/ANAME/ALIAS Records
This (OpenShift 3):
extranet.example.net IN CNAME ha-proxy.tocco.ch.
becomes (OpenShift 4):
extranet.example.net IN CNAME os4.tocco.ch.
See the DNS section in the Tocco Docs for details.
Projects / Namespaces
OpenShift Projects, which are built on top of Kubernetes Namespaces, can now be created via the Kubernetes API.
Create a project:
oc new-project <project_name>
This creates a fresh project and grants you access to it. To give everyone else access, it's recommended to add role bindings for the groups tocco-admin and tocco-dev:
Grant access to tocco-admin:
oc create -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tocco-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: tocco-admin
EOF
Grant access to tocco-dev:
oc create -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tocco-dev
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: tocco-dev
EOF
Whenever Ansible is used to manage a service, it needs access too:
oc create -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ansible-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
subjects:
- kind: ServiceAccount
  name: ansible
  namespace: serviceaccounts
EOF
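To verify the result, the role bindings of the project can be listed and inspected:
oc get rolebindings
oc describe rolebinding tocco-admin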
To remove a project again:
oc delete project <project_name>
Monitoring
For the time being, monitoring stays at https://monitoring.vshn.net; only the label has been adjusted so that OpenShift 3 and 4 can easily be distinguished.
Long term, the plan is to switch to Prometheus for monitoring, which will also allow us to monitor metrics like memory usage, queue sizes, or thread pool usage.
Persistent Volumes
Persistent volumes are used to store data persistently and are made available to pods via the filesystem. On OpenShift 3, Gluster-based storage was used, which added some additional, unwanted complexity. On OpenShift 4, volumes are obtained directly from the storage provided by Cloudscale. This is potentially faster and more reliable.
Single Writer
The drawback of this setup is that only so-called ReadWriteOnce storage is supported. That is, a volume can only be mounted for writing by a single node, and thus effectively a single pod, at a time. On OpenShift 3, multiple concurrent writers could exist.
Use of volumes within Nice is currently very limited:
- Some legacy web sites store resources on the filesystem.
- The LMS module, before version 3.0, stored e-tests on the filesystem.
Any installation using such a volume can no longer run multiple instances. Consequently, rolling deployments can no longer be used for such installations: during a rolling deployment, a new pod is started and verified to be online before any old pod is shut down, which would lead to concurrent write access. Hence, such installations have to use a different deployment strategy, namely recreate. During a recreate deployment, the installation is stopped first, schema changes are applied, and only then is the installation started again.
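Ansible presumably takes care of this for Nice installations, but should the strategy ever need to be set manually, a patch along these lines should do the trick, assuming the DeploymentConfig is called nice as in the volume examples below (the rolling-specific parameters are cleared at the same time):
oc patch dc/nice -p '{"spec": {"strategy": {"type": "Recreate", "rollingParams": null}}}'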
It's worth noting that new installations are not affected by this. Currently, I expect that two customers will have to use this strategy until their installations can be updated.
EDIT:
On second thought, only customers with the LMS module are affected, namely iffp, sfb and spi, of which sfb isn't running on our infrastructure at all. The aforementioned volumes for web resources are unaffected; those can be stored on read-only volumes.
This will lead to downtime during code and configuration deployments. First measurements indicate that simple configuration changes cause < 2 minutes downtime and minor schema upgrades < 4 minutes.
See also Using deployment strategies in the OpenShift 4 documentation.
Storage Classes
There are also advantages. In addition to the previously mentioned reduction in complexity, the storage is much cheaper, as no additional Gluster service is needed, and we can pick between SSD and even cheaper bulk (HDD) storage:
Request SSD volume:
oc set volume dc/nice -c nice --add --claim-class=ssd --name=lms --claim-name=lms --claim-size=10Gi --mount-path=/app/var/lms
This is the default; it is used when --claim-class is omitted.
Request HDD volume:
oc set volume dc/nice -c nice --add --claim-class=bulk --name=lms --claim-name=lms --claim-size=10Gi --mount-path=/app/var/lms
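The storage classes actually available on the cluster, and thus the valid values for --claim-class, can be listed at any time:
oc get storageclass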
Memory / Heap Dumps
In order to preserve heap dumps across an application crash, a persistent volume was used. With OpenShift 4, /app/var/heap_dumps/, where heap dumps go, has been converted to an emptyDir volume. Such volumes are ephemeral and bound to a single pod. Yet, importantly, they survive an application crash and restart.
This means that, in order to enable automatic memory dumps on OOM, this is now sufficient:
oc set env dc/nice NICE2_DUMP_ON_OOM=true
There is no need to create a volume for automatic or manual dumps.
Note that, while emptyDir volumes are preserved across restarts, they vanish together with the pod. So, if you need to retrieve a dump, do not delete the pod or stop it by scaling down.
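To copy a dump to your machine before the pod goes away, something like this should work (the pod name is just an example):
mkdir -p heap_dumps
oc rsync nice-42-abcde:/app/var/heap_dumps/ heap_dumps/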
Ingress and ACME
On OpenShift 3, we used routes:
$ oc get route
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
nice master.tocco.ch ... 1 more nice 80-tcp edge/Redirect None
nice-tocco.bitserver.ch tocco.bitserver.ch nice 80-tcp edge/Redirect None
On OpenShift 4, ingresses are used instead:
$ oc get ingress
NAME CLASS HOSTS ADDRESS PORTS AGE
nice <none> master.tocco.ch router-default.apps.openshift.tocco.ch 80, 443 9d
nice-tocco.bitserver.ch <none> tocco.bitserver.ch router-default.apps.openshift.tocco.ch 80, 443 9d
Routes are OpenShift-specific while ingresses are what native Kubernetes uses. The reason we are switching is that the new ACME integration only supports ingresses. ACME is the protocol used by Let's Encrypt, and others, to fully automate TLS certificate issuance.
In the background, a route is created automatically for every ingress:
$ oc get route
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
nice-7v864 master.tocco.ch / nice 80-tcp edge/Redirect None
nice-tocco.bitserver.ch-zn4hv tocco.bitserver.ch / nice 80-tcp edge/Redirect None
However, using ingress directly is preferred.
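For illustration, a minimal ingress for an installation might look roughly like this; host, service name and secret name are merely examples, and in practice Ansible creates these objects for Nice installations:
oc create -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nice
spec:
  rules:
  - host: master.tocco.ch
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nice
            port:
              name: 80-tcp
  tls:
  - hosts:
    - master.tocco.ch
    secretName: nice-tls
EOF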
Enabling ACME also differs slightly; namely, a different annotation needs to be set:
oc annotate ingress/<name> cert-manager.io/cluster-issuer=letsencrypt-production
Of course, Ansible still does this automatically for Nice installations.
I do not yet have any experience troubleshooting the new ACME integration; no failure to issue a certificate has occurred yet. If needed, in addition to the Troubleshooting section on Tocco Docs, you may want to check the upstream troubleshooting guide.
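As a starting point, the annotation above suggests cert-manager does the issuing, so I'd expect its resources to reveal the state of a certificate and any pending ACME challenges (the certificate name below is just an example):
oc get certificates
oc describe certificate nice-tls
oc get challenges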
On a side note, other objects, too, are based on native Kubernetes objects. For instance, OpenShift's Project is an extension of Kubernetes' Namespace, and DeploymentConfig of Deployment. As a general rule, the native Kubernetes object should be preferred whenever none of the features of the OpenShift object are needed. The reason for this is that we want to keep open the possibility of switching to alternatives, like SUSE's Rancher, in the future. Staying as close to native Kubernetes as possible will ease any such transition considerably.