Adopting the data plane
Adopting the Red Hat OpenStack Services on OpenShift (RHOSO) data plane involves the following steps:
-
Stop any remaining services on the Red Hat OpenStack Platform (RHOSP) 17.1 control plane.
-
Deploy the required custom resources.
-
Perform a fast-forward upgrade on Compute services from RHOSP 17.1 to RHOSO 18.0.
-
If applicable, adopt Networker nodes to the RHOSO data plane.
After the RHOSO control plane manages the newly deployed data plane, you must not re-enable services on the RHOSP 17.1 control plane and data plane. If you re-enable services, workloads are managed by two control planes or two data planes, resulting in data corruption, loss of control of existing workloads, inability to start new workloads, or other issues.
Stopping infrastructure management and Compute services
You must stop cloud Controller nodes, database nodes, and messaging nodes on the Red Hat OpenStack Platform 17.1 control plane. Do not stop nodes that are running the Compute, Storage, or Networker roles on the control plane.
The following procedure applies to a single node standalone director deployment. You must remove conflicting repositories and packages from your Compute hosts, so that you can install libvirt packages when these hosts are adopted as data plane nodes, where modular libvirt daemons are no longer running in podman containers.
-
Define the shell variables. Replace the following example values with values that apply to your environment:
EDPM_PRIVATEKEY_PATH="/home/lab-user/.ssh/my-guidkey.pem"
declare -A computes
computes=(
  ["compute02.localdomain"]="172.22.0.110"
  ["compute03.localdomain"]="172.22.0.112"
)
-
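Stop the Pacemaker-managed control plane services (Galera, HAProxy, and RabbitMQ). The script expects the CONTROLLER1_SSH, CONTROLLER2_SSH, and CONTROLLER3_SSH variables to hold the SSH command for each Controller node, and it uses the first one that is set: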
PacemakerResourcesToStop=(
  "galera-bundle"
  "haproxy-bundle"
  "rabbitmq-bundle"
)

echo "Stopping pacemaker services"
for i in {1..3}; do
  SSH_CMD=CONTROLLER${i}_SSH
  if [ ! -z "${!SSH_CMD}" ]; then
    echo "Using controller $i to run pacemaker commands"
    for resource in ${PacemakerResourcesToStop[*]}; do
      if ${!SSH_CMD} sudo pcs resource config $resource; then
        ${!SSH_CMD} sudo pcs resource disable $resource
      fi
    done
    break
  fi
done
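Before you run commands on each Compute host, it can help to preview the SSH invocations that the shell variables produce. The following is a minimal sketch and not part of the official procedure: `SSH_USER` and `print_ssh_targets` are hypothetical names, and the `root` login user is an assumption that you should adjust to your environment.

```shell
#!/usr/bin/env bash
# Sketch: preview one SSH command per Compute host without connecting.
# Assumptions: SSH_USER and print_ssh_targets are illustrative names only.
EDPM_PRIVATEKEY_PATH="/home/lab-user/.ssh/my-guidkey.pem"
SSH_USER="${SSH_USER:-root}"   # assumed login user; change as needed

declare -A computes
computes=(
  ["compute02.localdomain"]="172.22.0.110"
  ["compute03.localdomain"]="172.22.0.112"
)

print_ssh_targets() {
  local host
  # Sort host names so the output order is stable across runs.
  for host in $(printf '%s\n' "${!computes[@]}" | sort); do
    printf 'ssh -i %s %s@%s  # %s\n' \
      "$EDPM_PRIVATEKEY_PATH" "$SSH_USER" "${computes[$host]}" "$host"
  done
}

print_ssh_targets
```

The function only prints commands, so you can review the host-to-address mapping before anything is executed remotely.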
Adopting Compute services to the RHOSO data plane
Adopt your Compute (nova) services to the Red Hat OpenStack Services on OpenShift (RHOSO) data plane.
-
You have stopped the remaining control plane nodes, repositories, and packages on the Compute service (nova) hosts. For more information, see Stopping infrastructure management and Compute services.
-
In the bastion, create the dataplane network (IPAM):
cd /home/lab-user/labrepo/content/files/
oc apply -f osp-ng-dataplane-netconfig-adoption.yaml
-
Get the libvirt secret password:
LIBVIRT_PASSWORD=$(cat ~/tripleo-standalone-passwords.yaml | grep ' LibvirtTLSPassword:' | awk -F ': ' '{ print $2; }')
LIBVIRT_PASSWORD_BASE64=$(echo -n "$LIBVIRT_PASSWORD" | base64)
-
Create the libvirt secret:
oc apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: libvirt-secret
  namespace: openstack
type: Opaque
data:
  LibvirtPassword: ${LIBVIRT_PASSWORD_BASE64}
EOF
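The `data` field of a Secret must hold a single-line base64 string, and some `base64` implementations wrap long output across lines. The following sketch shows one way to extract and encode the value while guarding against wrapping. The `extract_password` and `encode_secret_value` helpers are hypothetical names, and the password in the demonstration is made up.

```shell
#!/usr/bin/env bash
# Sketch: extract a password from a passwords file and encode it for a Secret.
# Assumptions: extract_password and encode_secret_value are illustrative helpers.

extract_password() {
  # $1: path to the passwords YAML file, $2: key name (for example LibvirtTLSPassword)
  awk -F ': ' -v key="$2" '$1 ~ ("^ *" key "$") { print $2; exit }' "$1"
}

encode_secret_value() {
  # Strip newlines that base64 may insert when it wraps long output.
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Demonstration with a throwaway file and a made-up password.
demo_file="$(mktemp)"
printf 'parameter_defaults:\n  LibvirtTLSPassword: example-not-a-real-password\n' > "$demo_file"

password="$(extract_password "$demo_file" LibvirtTLSPassword)"
encoded="$(encode_secret_value "$password")"
echo "$encoded"
rm -f "$demo_file"
```

Decoding the printed value with `base64 --decode` should return the original password, which is a quick check before you apply the Secret.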
-
You have defined the shell variables to run the script that runs the fast-forward upgrade:
PODIFIED_DB_ROOT_PASSWORD=$(oc get -o json secret/osp-secret | jq -r .data.DbRootPassword | base64 -d)
alias openstack="oc exec -t openstackclient -- openstack"
declare -A computes
computes=(
  ["compute02.localdomain"]="172.22.0.110"
  ["compute03.localdomain"]="172.22.0.112"
)
-
Create an SSH authentication secret for the data plane nodes:
oc create secret generic dataplane-ansible-ssh-private-key-secret \
  --save-config \
  --dry-run=client \
  --from-file=authorized_keys=/home/lab-user/.ssh/my-guidkey.pub \
  --from-file=ssh-privatekey=/home/lab-user/.ssh/my-guidkey.pem \
  --from-file=ssh-publickey=/home/lab-user/.ssh/my-guidkey.pub \
  -n openstack \
  -o yaml | oc apply -f-
-
Replace /home/lab-user/.ssh/my-guidkey.pem with the path to your SSH key.
-
-
Generate an SSH key pair and create the nova-migration-ssh-key secret:
cd "$(mktemp -d)"
ssh-keygen -f ./id -t ecdsa-sha2-nistp521 -N ''
oc get secret nova-migration-ssh-key || oc create secret generic nova-migration-ssh-key \
  -n openstack \
  --from-file=ssh-privatekey=id \
  --from-file=ssh-publickey=id.pub \
  --type kubernetes.io/ssh-auth
rm -f id*
cd -
-
Because this environment uses a local storage back end for libvirt, create the nova-extra-config ConfigMap for the Compute services. The ConfigMap enables the pre-fast-forward upgrade workaround (disable_compute_service_check_for_ffu=true), which you remove later when you perform the fast-forward upgrade:
cat << EOF | oc apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-extra-config
  namespace: openstack
data:
  19-nova-compute-cell1-workarounds.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=true
EOF
The nova-cell<X>-compute-config secret is auto-generated for each cell<X>. You must specify values for the nova-cell<X>-compute-config and nova-migration-ssh-key parameters for each custom OpenStackDataPlaneService CR that is related to the Compute service. The resources in the ConfigMap contain cell-specific configurations.
-
Create a secret for the subscription manager:
oc create secret generic subscription-manager \
  --from-literal rhc_auth='{"login": {"username": "<subscription_manager_username>", "password": "<subscription_manager_password>"}}'
-
Replace <subscription_manager_username> with the applicable user name.
-
Replace <subscription_manager_password> with the applicable password.
-
-
Create a secret for the Red Hat registry:
oc create secret generic redhat-registry \
  --from-literal edpm_container_registry_logins='{"registry.redhat.io": {"<registry_username>": "<registry_password>"}}'
-
Replace <registry_username> with the applicable user name.
-
Replace <registry_password> with the applicable password.
-
-
Create the OpenStackDataPlaneNodeSet CR:
oc apply -f osp-ng-dataplane-node-set-deploy-adoption-compute.yaml
-
Review the osp-ng-dataplane-node-set-deploy-adoption-compute.yaml file. In this environment, edpm_ovn_bridge_mappings is set to "datacentre:br-ex".
-
-
Run the pre-adoption validation:
-
Create the validation service:
cat << EOF | oc apply -f -
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: pre-adoption-validation
spec:
  playbook: osp.edpm.pre_adoption_validation
EOF
-
Create an OpenStackDataPlaneDeployment CR that runs only the validation:
cat << EOF | oc apply -f -
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-pre-adoption
spec:
  nodeSets:
  - compute
  servicesOverride:
  - pre-adoption-validation
EOF
-
When the validation is finished, confirm that the status of the Ansible EE pods is Completed:
watch oc get pod -l app=openstackansibleee
oc logs -l app=openstackansibleee -f --max-log-requests 20
-
Wait for the deployment to reach the Ready status:
oc wait --for condition=Ready openstackdataplanedeployment/openstack-pre-adoption --timeout=10m
If any openstack-pre-adoption validations fail, you must reference the Ansible logs to determine which ones were unsuccessful, and then try the following troubleshooting options:
-
If the hostname validation failed, check that the hostname of the data plane node is correctly listed in the OpenStackDataPlaneNodeSet CR.
-
If the kernel argument check failed, ensure that the kernel argument configuration in the edpm_kernel_args and edpm_kernel_hugepages variables in the OpenStackDataPlaneNodeSet CR is the same as the kernel argument configuration that you used in the Red Hat OpenStack Platform (RHOSP) 17.1 node.
-
If the tuned profile check failed, ensure that the edpm_tuned_profile variable in the OpenStackDataPlaneNodeSet CR is configured to use the same profile as the one set on the RHOSP 17.1 node.
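When the kernel argument check fails, a direct comparison of the expected arguments against the running kernel command line can narrow down the difference. The following is a minimal sketch under stated assumptions: `missing_kernel_args` is a hypothetical helper, not part of any RHOSO tooling, and the example argument values are made up.

```shell
#!/usr/bin/env bash
# Sketch: list expected kernel arguments that are absent from a kernel command line.
# Assumption: missing_kernel_args is an illustrative helper name.

missing_kernel_args() {
  # $1: expected arguments (space separated), $2: actual kernel command line
  local expected="$1" actual="$2" arg
  for arg in $expected; do
    case " $actual " in
      *" $arg "*) ;;                 # present, nothing to report
      *) printf '%s\n' "$arg" ;;     # missing from the actual command line
    esac
  done
}

# Example values are made up for illustration.
expected="default_hugepagesz=1GB hugepagesz=1G hugepages=16"
actual="BOOT_IMAGE=/vmlinuz root=/dev/vda1 hugepagesz=1G hugepages=16"
missing_kernel_args "$expected" "$actual"
```

On a Compute host, you could pass `"$(cat /proc/cmdline)"` as the second argument and the values that you set in the OpenStackDataPlaneNodeSet CR as the first.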
-
-
-
Remove the remaining director services:
-
Create an OpenStackDataPlaneService CR to clean up the data plane services you are adopting:
cat << EOF | oc apply -f -
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneService
metadata:
  name: tripleo-cleanup
spec:
  playbook: osp.edpm.tripleo_cleanup
EOF
-
Create the OpenStackDataPlaneDeployment CR to run the clean-up:
cat << EOF | oc apply -f -
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: tripleo-cleanup
spec:
  nodeSets:
  - compute2-3-set
  servicesOverride:
  - tripleo-cleanup
EOF
-
-
When the clean-up is finished, deploy the OpenStackDataPlaneDeployment CR:
cat << EOF | oc apply -f -
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: compute-adoption
spec:
  nodeSets:
  - compute2-3-set
EOF
-
Confirm that all the Ansible EE pods reach a Completed status:
watch oc get pod -l app=openstackansibleee
oc logs -l app=openstackansibleee -f --max-log-requests 20
-
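Wait for the data plane node set to reach the Ready status. The osdpns resource type refers to the node set (compute2-3-set in the preceding steps), not to the compute-adoption deployment:
oc wait --for condition=Ready osdpns/compute2-3-set --timeout=30m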
-
Verify that the Networking service (neutron) agents are running:
oc exec openstackclient -- openstack network agent list
+--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
| ID                                   | Agent Type                   | Host                   | Availability Zone | Alive | State | Binary                     |
+--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
| 174fc099-5cc9-4348-b8fc-59ed44fcfb0e | DHCP agent                   | standalone.localdomain | nova              | :-)   | UP    | neutron-dhcp-agent         |
| 10482583-2130-5b0d-958f-3430da21b929 | OVN Metadata agent           | standalone.localdomain |                   | :-)   | UP    | neutron-ovn-metadata-agent |
| a4f1b584-16f1-4937-b2b0-28102a3f6eaa | OVN Controller agent         | standalone.localdomain |                   | :-)   | UP    | ovn-controller             |
+--------------------------------------+------------------------------+------------------------+-------------------+-------+-------+----------------------------+
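The agent listing can also be checked mechanically instead of by eye. The following sketch parses the default table output and reports any agent whose Alive column is not `:-)`. The `dead_agents` helper is a hypothetical name, and the sketch assumes the column order shown in the listing for this step.

```shell
#!/usr/bin/env bash
# Sketch: report agents whose Alive column is not ":-)" in the default table output.
# Assumption: dead_agents is an illustrative helper that relies on the column
# order ID, Agent Type, Host, Availability Zone, Alive, State, Binary.

dead_agents() {
  # Reads table text on stdin; prints "<agent type> on <host>" for each agent not alive.
  awk -F '|' '
    NF >= 8 && $2 !~ /^ *ID *$/ {
      alive = $6; gsub(/^ +| +$/, "", alive)
      if (alive != ":-)") {
        type = $3; host = $4
        gsub(/^ +| +$/, "", type); gsub(/^ +| +$/, "", host)
        print type " on " host
      }
    }'
}

# Usage against a live environment (not executed here):
#   oc exec openstackclient -- openstack network agent list | dead_agents
```

An empty result means that every listed agent reports as alive. Border lines and the header row are skipped because they either contain no column separators or carry the literal ID heading.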
-
You must perform a fast-forward upgrade on your Compute services. For more information, see Performing a fast-forward upgrade on Compute services.
Performing a fast-forward upgrade on Compute services
You must upgrade the Compute services from Red Hat OpenStack Platform 17.1 to Red Hat OpenStack Services on OpenShift (RHOSO) 18.0 on the control plane and data plane by completing the following tasks:
-
Update the cell1 Compute data plane services version.
-
Remove pre-fast-forward upgrade workarounds from the Compute control plane services and Compute data plane services.
-
Run Compute database online migrations to update live data.
-
Wait for cell1 Compute data plane services version to update:
oc exec openstack-cell1-galera-0 -c galera -- mysql -rs -uroot -p$PODIFIED_DB_ROOT_PASSWORD \
  -e "select a.version from nova_cell1.services a join nova_cell1.services b where a.version!=b.version and a.binary='nova-compute' and a.deleted=0;"
The query returns an empty result when the update is completed. No downtime is expected for virtual machine workloads.
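Because the query must be repeated until it returns nothing, a small polling loop can save manual re-runs. The following is a sketch under stated assumptions: `wait_until_empty` and `version_mismatch_query` are hypothetical helper names, and the attempt count and interval in the usage line are arbitrary.

```shell
#!/usr/bin/env bash
# Sketch: poll a command until it succeeds and prints nothing, or give up.
# Assumptions: wait_until_empty and version_mismatch_query are illustrative helpers.

wait_until_empty() {
  # $1: max attempts, $2: seconds between attempts, remaining args: command to run
  local attempts="$1" interval="$2" i output
  shift 2
  for ((i = 1; i <= attempts; i++)); do
    # Treat the result as complete only if the command itself succeeded.
    if output="$("$@")" && [ -z "$output" ]; then
      echo "empty result after $i attempt(s)"
      return 0
    fi
    sleep "$interval"
  done
  echo "still returning rows after $attempts attempts" >&2
  return 1
}

version_mismatch_query() {
  # Same query as in this step; requires PODIFIED_DB_ROOT_PASSWORD to be set.
  oc exec openstack-cell1-galera-0 -c galera -- mysql -rs -uroot -p"$PODIFIED_DB_ROOT_PASSWORD" \
    -e "select a.version from nova_cell1.services a join nova_cell1.services b where a.version!=b.version and a.binary='nova-compute' and a.deleted=0;"
}

# Usage (not executed here): wait_until_empty 60 10 version_mismatch_query
```

Checking the exit status of the command avoids mistaking a failed `oc exec` call, which also prints nothing to standard output, for a completed update.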
Review any errors in the nova Compute agent logs on the data plane, and the nova-conductor journal records on the control plane.
-
Patch the OpenStackControlPlane CR to remove the pre-fast-forward upgrade workarounds from the Compute control plane services:
oc patch openstackcontrolplane openstack -n openstack --type=merge --patch '
spec:
  nova:
    template:
      cellTemplates:
        cell0:
          conductorServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
        cell1:
          metadataServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
          conductorServiceTemplate:
            customServiceConfig: |
              [workarounds]
              disable_compute_service_check_for_ffu=false
      apiServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      metadataServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
      schedulerServiceTemplate:
        customServiceConfig: |
          [workarounds]
          disable_compute_service_check_for_ffu=false
'
-
Wait until the Compute control plane services CRs are ready:
oc wait --for condition=Ready --timeout=300s Nova/nova
-
Complete the steps in Adopting Compute services to the RHOSO data plane.
-
Remove the pre-fast-forward upgrade workarounds from the Compute data plane services:
cat << EOF | oc apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: nova-extra-config
  namespace: openstack
data:
  20-nova-compute-cell1-workarounds.conf: |
    [workarounds]
    disable_compute_service_check_for_ffu=false
---
apiVersion: dataplane.openstack.org/v1beta1
kind: OpenStackDataPlaneDeployment
metadata:
  name: openstack-nova-compute-ffu
  namespace: openstack
spec:
  nodeSets:
  - openstack-edpm-compute-2
  servicesOverride:
  - nova
EOF
The service included in the servicesOverride key must match the name of the service that you included in the OpenStackDataPlaneNodeSet CR. For example, if you use a custom service called nova-custom, ensure that you add it to the servicesOverride key.
-
Wait for the Compute data plane services to be ready:
oc wait --for condition=Ready openstackdataplanedeployment/openstack-nova-compute-ffu --timeout=5m
-
Run Compute database online migrations to complete the fast-forward upgrade:
oc exec -it nova-cell0-conductor-0 -- nova-manage db online_data_migrations
oc exec -it nova-cell1-conductor-0 -- nova-manage db online_data_migrations
-
Discover the Compute hosts in the cell:
oc rsh nova-cell0-conductor-0 nova-manage cell_v2 discover_hosts --verbose
-
Verify if the existing test VM instance is running:
${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test-server 2>&1 || echo FAIL
-
Verify if the Compute services can stop the existing test VM instance:
${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && ${BASH_ALIASES[openstack]} server stop test || echo PASS
${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" || echo FAIL
${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test 2>&1 || echo PASS
-
Verify if the Compute services can start the existing test VM instance:
${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test SHUTOFF" && ${BASH_ALIASES[openstack]} server start test || echo PASS
${BASH_ALIASES[openstack]} server list -c Name -c Status -f value | grep -qF "test ACTIVE" && \
  ${BASH_ALIASES[openstack]} server --os-compute-api-version 2.48 show --diagnostics test --fit-width -f json | jq -r '.state' | grep running || echo FAIL
After the data plane adoption, the Compute hosts continue to run Red Hat Enterprise Linux (RHEL) 9.2. To take advantage of RHEL 9.4, perform a minor update procedure after finishing the adoption procedure.
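To confirm which RHEL release a Compute host is running before and after the minor update, you can read the VERSION_ID field from its /etc/os-release file. The following is a minimal sketch: `os_release_version` is a hypothetical helper name, and the sample content is only for a local demonstration.

```shell
#!/usr/bin/env bash
# Sketch: read VERSION_ID from os-release formatted text.
# Assumption: os_release_version is an illustrative helper name.

os_release_version() {
  # Reads os-release content on stdin and prints VERSION_ID without quotes.
  sed -n 's/^VERSION_ID="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p'
}

# Local demonstration with sample content.
printf 'NAME="Red Hat Enterprise Linux"\nVERSION_ID="9.2"\n' | os_release_version
```

On a live system, you could pipe the output of `cat /etc/os-release`, run over SSH on each Compute host, into the same function.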