Yann Neuhaus


Patching Oracle databases with Nutanix NDB

Thu, 2024-03-14 10:04

A few years back I had the great opportunity to test Nutanix NDB, at that time called Nutanix Era, and I wrote a few blogs about provisioning Oracle databases, taking snapshots, log catch-up, cloning a database, restoring a database, and so on, and about how all of this interacts with the Oracle database itself. What I was not able to test at that time was the patching of Oracle databases. I recently got access to a new environment and finally had the opportunity to test it. I would like to share my experience with you and show you how easy patching is with Nutanix NDB, and the advantages it offers in terms of database patching.

Template VM patching

I initially created a template VM with Oracle version 19.3.0.0.190416, named MAW-Oracle19cSource-19.3.0.0.190416, used only as a reference to create the software profile, and provisioned several Oracle databases on several VMs from it.

I then cloned that template to create a new VM that I called MAW-Oracle19cSource-19.21.0.0.231017.

I manually patched both the Oracle RDBMS and Oracle Grid Infrastructure homes on the new template VM MAW-Oracle19cSource-19.21.0.0.231017 from version 19.3 to the latest version, 19.21.
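As a rough, hedged sketch of what such a manual patching typically looks like (the staging paths and patch directory below are placeholders, not the exact ones used in this environment), OPatch is first updated in both homes and the 19.21 Release Update is then applied with opatchauto run as root:

# Update OPatch (patch p6880880) in the Grid and RDBMS homes, as their respective owners:
unzip -o p6880880_190000_Linux-x86-64.zip -d /u01/app/19.0.0/grid
unzip -o p6880880_190000_Linux-x86-64.zip -d $ORACLE_HOME

# Apply the unzipped 19.21 Release Update to both homes with opatchauto, as root:
export PATH=$PATH:/u01/app/19.0.0/grid/OPatch
opatchauto apply /u01/stage/19.21_GI_RU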

Version of grid before patching the template:

[grid@oel-19c ~]$ /u01/app/19.0.0/grid/OPatch/opatch lspatches
29585399;OCW RELEASE UPDATE 19.3.0.0.0 (29585399)
29517247;ACFS RELEASE UPDATE 19.3.0.0.0 (29517247)
29517242;Database Release Update : 19.3.0.0.190416 (29517242)

OPatch succeeded.

Version of oracle before patching the template:

[oracle@oel-19c ~]$ $ORACLE_HOME/OPatch/opatch lspatches
29585399;OCW RELEASE UPDATE 19.3.0.0.0 (29585399)
29517242;Database Release Update : 19.3.0.0.190416 (29517242)

OPatch succeeded.

Version of grid after patching the template:

[grid@oel-19c ~]$ $ORACLE_HOME/OPatch/opatch lspatches
35655527;OCW RELEASE UPDATE 19.21.0.0.0 (35655527)
35652062;ACFS RELEASE UPDATE 19.21.0.0.0 (35652062)
35643107;Database Release Update : 19.21.0.0.231017 (35643107)
35553096;TOMCAT RELEASE UPDATE 19.0.0.0.0 (35553096)
33575402;DBWLM RELEASE UPDATE 19.0.0.0.0 (33575402)

OPatch succeeded.

Version of oracle after patching the template:

[oracle@oel-19c ~]$ $ORACLE_HOME/OPatch/opatch lspatches
35655527;OCW RELEASE UPDATE 19.21.0.0.0 (35655527)
35643107;Database Release Update : 19.21.0.0.231017 (35643107)

OPatch succeeded.

Checking in Prism, we can see the two templates.

Register new template VM

We now need to register the new MAW-Oracle19cSource-19.21.0.0.231017 VM in Nutanix NDB. To do so, we go to the Database Server VMs menu and, from the List submenu, click the +Register button and choose Oracle.

We then fill in all the needed information.

We can ignore the missing dependencies reported next, knowing that sshpass is only required for Oracle RAC patching.

In the Operations menu, we can check the progression of the registration.

Finally, from the DB Server VMs menu, we can see our new VM template being registered.

Update existing software profile

We now need to update the existing software profile with the new 19.21 template. All previous databases and VMs have been provisioned with this software profile, named MAW-Oracle19c. Updating it will give us the ability to patch the Oracle databases deployed with this software profile to 19.21.

Going to the Profiles menu, Software submenu, we can see and select the software profile in question, named MAW-Oracle19c.

Choose the +Create button to update the software profile by creating a new version.

Provide a name for the version and select the new VM template which has been recently patched to version 19.21.

Give some notes for the new OS version, Grid version and Oracle Home version. Operating system version? Yes, because we could also use Nutanix NDB to easily patch the OS of the VM at the same time!

From the Operations menu, we can follow the progress of creating the Software Profile version.

Checking Software Profile Versions in use

From the same Profiles menu, Software submenu, clicking on our profile we can see the number of database servers using each version. In our case, we have 5 database servers (initially deployed) using the 19.3 version and 0 so far using the new 19.21 version.

Publish the new Software Profile version

Entering the Software Profile in question, we see the same information with more details.

We will click on the update button of the profile version 2.0 (the one we just created) to publish it and make it available for patching recommendations.
Do not forget to select the check box: By publishing this version of the software profile, I understand that NDB will recommend that all databases using an earlier version of this software profile should update to this new version. The recommendation will appear on the Database Server VM home page.

The new 2.0 version is now the published version.

Check Database Server VMs

In Database Server VMs Menu, List submenu, we see the list of all VMs.

Selecting and entering one of the initially deployed VMs, we see that there is now a recommended new version that can be applied, the new 19.21 version we just created.

Let’s SSH to TEST1-VM, check the Oracle dbhome version and connect to the TEST1 database to check its current version.

maw@DBI-LT-MAW2 Nutanix % ssh -i Nutanix_oracle_private_key.pem oracle@10.42.67.48
Last login: Thu Jan 11 16:13:30 2024 from 10.42.202.71

-bash-4.2$ sudo su - oracle
Last login: Sat Jan 13 20:13:02 UTC 2024 from 10.42.202.71 on pts/0

-bash-4.2$ hostname
MAW-Brewtanix-TEST1-VM

-bash-4.2$ $ORACLE_HOME/OPatch/opatch lspatches
29585399;OCW RELEASE UPDATE 19.3.0.0.0 (29585399)
29517242;Database Release Update : 19.3.0.0.190416 (29517242)

OPatch succeeded.

-bash-4.2$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Sat Jan 13 20:14:34 2024
Version 19.3.0.0.0

Copyright (c) 1982, 2019, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.3.0.0.0

SQL> set lines 300 pages 500
SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME    HOST_NAME
---------------- ----------------------------------------------------------------
TEST1            MAW-Brewtanix-TEST1-VM

SQL> select patch_id, source_version, target_version, status, description from dba_registry_sqlpatch;

  PATCH_ID SOURCE_VERSION  TARGET_VERSION  STATUS                    DESCRIPTION
---------- --------------- --------------- ------------------------- ----------------------------------------------------------------------------------------------------
  29517242 19.1.0.0.0      19.3.0.0.0      SUCCESS                   Database Release Update : 19.3.0.0.190416 (29517242)

SQL>

Patch one Database Server VM

Let’s select TEST1-VM from Database Server VMs Menu and patch it by clicking the Update button.

Let’s choose our new Software Profile version, 19.21, and select whether we want to upgrade it now or later at a scheduled date and time. The patching can thus easily be scheduled to run in a maintenance window.

We will choose to patch it now. Confirmation is given by typing the name of the Database Server VM, here MAW-Brewtanix-TEST1-VM. Click the Update button.

From the Database Server VMs menu, we can see that the operation has been started.

From the Operations menu we can follow the progress of the patching.

And finally, from the Database Server VMs menu, clicking on the VM in question, we can see that it is now on the latest recommended version.

Check recent patched VM and database

We can connect to the recently patched VM.

And check the sqlpatch logs in cfgtoollogs.

-bash-4.2$ cd /u02/app/oracle/cfgtoollogs/sqlpatch/sqlpatch_120344_2024_01_13_20_42_09/

-bash-4.2$ ls -ltrh
total 33M
-rw------- 1 oracle oinstall  552 Jan 13 20:42 sqlpatch_catcon__catcon_120344.lst
-rw-r--r-- 1 oracle oinstall 1.5K Jan 13 20:42 bootstrap1.sql
-rw-r--r-- 1 oracle oinstall 1.6K Jan 13 20:42 bootstrap1_TEST1.log
-rw-r--r-- 1 oracle oinstall 7.3K Jan 13 20:42 install1.sql
-rw-r--r-- 1 oracle oinstall  326 Jan 13 20:50 sqlpatch_autorecomp_TEST1.log
-rw------- 1 oracle oinstall  23M Jan 13 20:51 sqlpatch_catcon_0.log
-rw-r--r-- 1 oracle oinstall 3.8K Jan 13 20:51 sqlpatch_summary.json
-rw-r--r-- 1 oracle oinstall 632K Jan 13 20:51 sqlpatch_invocation.log
-rw-r--r-- 1 oracle oinstall  133 Jan 13 20:51 sqlpatch_progress.json
-rw-r--r-- 1 oracle oinstall 9.1M Jan 13 20:51 sqlpatch_debug.log

The dbhome should now run Oracle version 19.21:

-bash-4.2$ $ORACLE_HOME/OPatch/opatch lspatches
35655527;OCW RELEASE UPDATE 19.21.0.0.0 (35655527)
35643107;Database Release Update : 19.21.0.0.231017 (35643107)

OPatch succeeded.

-bash-4.2$ hostname
MAW-Brewtanix-TEST1-VM

-bash-4.2$ echo $ORACLE_SID
TEST1

And finally the database version, which is now 19.21:

-bash-4.2$ sqlplus / as sysdba

SQL*Plus: Release 19.0.0.0.0 - Production on Sat Jan 13 20:59:50 2024
Version 19.21.0.0.0

Copyright (c) 1982, 2022, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Version 19.21.0.0.0

SQL> set lines 300 pages 500
SQL> select instance_name, host_name from v$instance;

INSTANCE_NAME    HOST_NAME
---------------- ----------------------------------------------------------------
TEST1            MAW-Brewtanix-TEST1-VM

SQL> select patch_id, source_version, target_version, status, description from dba_registry_sqlpatch;

  PATCH_ID SOURCE_VERSION  TARGET_VERSION  STATUS                    DESCRIPTION
---------- --------------- --------------- ------------------------- ----------------------------------------------------------------------------------------------------
  29517242 19.1.0.0.0      19.3.0.0.0      SUCCESS                   Database Release Update : 19.3.0.0.190416 (29517242)
  35643107 19.3.0.0.0      19.21.0.0.0     SUCCESS                   Database Release Update : 19.21.0.0.231017 (35643107)

SQL>

Patch all remaining Database Server VMs in one Maintenance Window

We can now schedule the patching of all 4 remaining Database Server VMs in the same Maintenance Window.

From the Profiles menu, Software submenu, clicking on the Software Profile in question, MAW-Oracle19c, we can see that we have 4 VMs still running on 19.3 and only 1 running the latest 19.21 Oracle version (TEST1, which was just patched).

Entering our Software Profile, we can have a look at the names of the VMs running each software version.

We need to create a Maintenance Window. We can do this from the Policies menu, Maintenance Window submenu.

Click +Create button.

Give a name (here Patching-DB), choose a schedule, weekly for example, providing the recurring day and time, and give a duration for the Maintenance Window.

The Maintenance Window has been created.

From the Database Server VMs menu, I selected the 4 VMs remaining on version 19.3 and chose Maintenance from the Actions menu.

I selected Database Patching only, as I did not have any new OS version. I chose the appropriate Maintenance Window, the one I had just created and named Patching-DB. Finally, I associated the VMs with the maintenance window by clicking the Associate button.

Selecting the VMs again from the Database Server VMs menu, we can see that they are associated with the Patching-DB maintenance window.

And the status becomes QUEUED FOR PATCHING once the maintenance window starts.

Selecting one VM, we see a status of In Progress in the Software Profile Version section.

The same can be seen from the Operations menu.

To wrap up

As we could see, patching several Oracle databases with Nutanix NDB is easy and very efficient.


Create a REST API from your database in minutes with Feathers.js

Wed, 2024-03-13 03:21

Creating REST APIs is a fairly repetitive task. You’re always rewriting the Create, Read, Update, Delete methods…
In my search for a framework capable of generating a REST API, I discovered Feathers, a simple and useful framework for Node.js.
In this article, we’ll show you how to create a REST API from a PostgreSQL database in just a few minutes.

Introduction to Feathers.js

Feathers is a framework for creating APIs in TypeScript or JavaScript. It can connect to various SQL (PostgreSQL, MySQL, Oracle, MSSQL…) or NoSQL (MongoDB) databases.
It can generate a REST or real-time (WebSocket) API, and a CLI can be used to speed up development.
Many packages provided by the community exist to add new features to Feathers.

Prerequisites

To create a REST API application with Feathers.js, you need Node.js.

First Step with the CLI

I want to create a REST API for my workshop project; the API should expose a table from my database to list all available workshops.
To start, I’m going to create a project using npm, which installs the CLI locally in the project at the same time.

Run the command:

npm create feathers@latest workshop-api

During the initialization, answer the questions:

? Do you want to use JavaScript or TypeScript? TypeScript
? Write a short description Workshop rest API
? Which HTTP framework do you want to use? KoaJS (recommended)
? What APIs do you want to offer? HTTP (REST)
? Which package manager are you using? npm
? Generate end-to-end typed client? Can be used with React, Angular, Vue, React Native, Node.js etc. Yes
? What is your preferred schema (model) definition format? Schemas allow to type, validate, secure and populate your data and configuration TypeBox (recommended)
? Which database are you connecting to? Databases can be added at any time PostgreSQL
? Enter your database connection string postgres://postgres:@localhost:5432/workshop

I chose:

  • TypeScript as language. In my opinion, typing can avoid many bugs
  • KoaJS as HTTP framework (KoaJS is a new framework designed by the team behind Express)
  • Only REST APIs; I unchecked the real-time API, as I don’t want to use WebSockets in my project
  • npm as package manager because it’s already installed with Node.js
  • Generate end-to-end typed client set to Yes, to be able to use the types in the front-end application
  • TypeBox as schema format (the default for Feathers)
  • PostgreSQL as database and the corresponding connection string

Now, initialization is complete. We have the skeleton of our application, but still no exposed service.

Add a service to the application

I use the CLI to generate a new service. This allows me to expose a table from my database via the REST API service.

To do this, in the application folder, simply run the npx command and answer the questions:

npx feathers generate service
? What is the name of your service? workshop
? Which path should the service be registered on? workshop
? Does this service require authentication? No
? What database is the service using? SQL
? Which schema definition format do you want to use? Schemas allow to type, validate, secure and populate data TypeBox  (recommended)

This command generates the files needed to expose a new service. Now I just need to define the structure of my table to have a functional service.
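For reference, the generator creates a handful of files under src/services/workshop/. Below is a hedged listing of what it typically looks like; the exact file names can vary between Feathers versions:

ls src/services/workshop/
# workshop.class.ts    - service class wired to the SQL (Knex) adapter
# workshop.schema.ts   - TypeBox schemas, edited in the next step
# workshop.shared.ts   - types shared with the generated typed client
# workshop.ts          - service registration and hooks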

Define the data schema

To add our data model, we edit the generated file src/services/workshop/workshop.schema.ts and update the definition in this block of code:

// Main data model schema
export const workshopSchema = Type.Object(
  {
    id: Type.Number(),
    text: Type.String()
  },
  { $id: 'Workshop', additionalProperties: false }
)

By default the generated schema contains only an id and a text field. We’re going to modify the schema to adapt it to the actual structure of the data in the database.

I used an existing database with a workshop table designed like this:

  • id: primary key, auto increment
  • name: varchar 255, required
  • description: text, required
  • image: varchar 255

I’m changing the schema to match the structure of my database like this:

// Main data model schema
export const workshopSchema = Type.Required(Type.Object(
  {
    id: Type.Number(),
    name: Type.String({
      maxLength: 255
    }),
    description: Type.String({
      maxLength: 4096
    }),
    image: Type.Optional(Type.String({
      format: 'uri'
    }))
  },
  { $id: 'Workshop', additionalProperties: false }
))

In addition, update the column lists in the schema for creating new entries and in the allowed query properties:

// Schema for creating new entries
export const workshopDataSchema = Type.Pick(workshopSchema, ['name', 'description', 'image'], {
  $id: 'WorkshopData'
})

.....

// Schema for allowed query properties
export const workshopQueryProperties = Type.Pick(workshopSchema, ['id', 'name', 'description', 'image'])

You can use the same process to generate as many services as you like. It’s quick and easy.

Run the Application

Now that the service is configured, I can run my freshly created REST API.

To do that, I run the command:

npm run dev

Once the application has started, simply open a browser and go to the URL http://localhost:3030.
This page displays only a Feathers.js logo.
To view the data of your database exposed by the service, go to http://localhost:3030/[my_service_name].
In my case, it’s http://localhost:3030/workshop, and the result of the call is a JSON document containing the data from my database.
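The endpoint can also be tested quickly from the command line. The following curl calls are a hedged example; the field values are invented and simply match the schema defined above:

# List workshops; with the default pagination settings Feathers wraps the rows in { total, limit, skip, data }:
curl "http://localhost:3030/workshop?\$limit=5"

# Create a new workshop entry:
curl -X POST "http://localhost:3030/workshop" \
  -H "Content-Type: application/json" \
  -d '{"name": "Intro to PostgreSQL", "description": "A half-day hands-on workshop", "image": "https://example.com/pg.png"}'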

Note about the documentation

The Feathers documentation about application creation gives these commands to start the application:

npm run compile 
npm run migrate 
npm start

The npm run compile and npm start commands have the same behavior as npm run dev, except that they don’t reload when the code changes. But be careful with npm run migrate.

npm run migrate executes Knex.js, a tool for modifying the data structure in the database. Running the migrate command deletes nothing, but it creates 2 tables in the database for internal use. If you already have a database with data, the migrate command is not needed.
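If you are curious about those internal tables, you can list the tables after running the migration. This is a hedged example; the table names are the Knex defaults and may differ depending on your configuration:

psql postgres://postgres:@localhost:5432/workshop -c "\dt"
# Expected to show the application tables plus the Knex bookkeeping tables,
# typically knex_migrations and knex_migrations_lock.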

Conclusion

Creating a REST API application with Feathers.js is quick and easy using its CLI. It can connect to many databases such as PostgreSQL, MSSQL, Oracle, MongoDB…
This article was a simple, basic example of an API implementation, but Feathers.js can do more, such as authentication, data structure migration, or adding a debugging user interface (Swagger) with an additional package.


Rancher RKE2 templates – Assign members to clusters

Tue, 2024-03-12 05:23

When testing RKE2 templates, I faced an issue with member assignments. When creating the cluster, a management cluster name is generated with the format c-m-xxxxxxxx, but the ClusterRoleTemplateBinding requires the cluster name to work. After digging into Rancher source code, I found out how to set the management cluster name. So let’s start!

Force the cluster name with RKE2 templates

Investigation

When searching for how the cluster name is generated during provisioning, I found the following code in the Rancher GitHub repository.

func mgmtClusterName() (string, error) {
	rand, err := randomtoken.Generate()
	if err != nil {
		return "", err
	}
	return name.SafeConcatName("c", "m", rand[:8]), nil
}

From the function mgmtClusterName I was able to find the following code.

if mgmtClusterNameAnnVal, ok := cluster.Annotations[mgmtClusterNameAnn]; ok && mgmtClusterNameAnnVal != "" && newCluster.Name == "" {
	// If the management cluster name annotation is set to a non-empty value, and the mgmt cluster name has not been set yet, set the cluster name to the mgmt cluster name.
	newCluster.Name = mgmtClusterNameAnnVal
} else if newCluster.Name == "" {
	// If the management cluster name annotation is not set and the cluster name has not yet been generated, generate and set a new mgmt cluster name.
	mgmtName, err := mgmtClusterName()
	if err != nil {
		return nil, status, err
	}
	newCluster.Name = mgmtName
}

Which finally leads to the following annotation.

mgmtClusterNameAnn    = "provisioning.cattle.io/management-cluster-name"
Forcing the management cluster name

To avoid using the generated cluster name given by mgmtClusterName(), we can add the annotation “provisioning.cattle.io/management-cluster-name” to the cluster.provisioning.cattle.io resource.

We can pick the same code from the Rancher template example in ClusterRoleTemplateBinding and do the following.

apiVersion: provisioning.cattle.io/v1
kind: Cluster
metadata:
  annotations:
    provisioning.cattle.io/management-cluster-name: c-m-{{ trunc 8 (sha256sum (printf "%s/%s" $.Release.Namespace $.Values.cluster.name)) }}
  {{- if .Values.cluster.annotations }}
{{ toYaml .Values.cluster.annotations | indent 4 }}
  {{- end }}

The template code above will ensure that the management cluster name will always be the one we generated ourselves.
Now let’s check the ClusterRoleTemplateBinding resource for automatically assigning users and groups.
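As a quick sanity check, the c-m-xxxxxxxx value produced by the template above can also be computed from a shell, since the Helm sha256sum and trunc functions map directly to standard tools. This is a hedged example; the namespace and cluster name are placeholders:

# Management cluster name for a cluster "my-cluster" deployed in the fleet-default namespace:
echo "c-m-$(printf '%s/%s' fleet-default my-cluster | sha256sum | cut -c1-8)"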

RKE2 ClusterRoleTemplateBinding

To predefine users and groups in the cluster, we can use the template clusterroletemplatebinding.yaml.

{{ $root := . }}
{{- range $index, $member := .Values.clusterMembers }}
apiVersion: management.cattle.io/v3
clusterName: c-m-{{ trunc 8 (sha256sum (printf "%s/%s" $root.Release.Namespace $root.Values.cluster.name)) }}
kind: ClusterRoleTemplateBinding
metadata:
  name: ctrb-{{ trunc 8 (sha256sum (printf "%s/%s/%s" $root.Release.Namespace $member.principalName $member.roleTemplateName )) }}
  namespace: c-m-{{ trunc 8 (sha256sum (printf "%s/%s" $root.Release.Namespace $root.Values.cluster.name)) }}
roleTemplateName: {{ $member.roleTemplateName }}
userPrincipalName: {{ $member.principalName }}
{{- end }}

For the metadata.name, I added the RoleTemplateName to avoid identical names if you add the same user with different roles.

In the values.yaml you will need to provide the following information:

clusterMembers:
- principalName: "local://u-xxxxx"
  roleTemplateName: "cluster-member"
- principalName: "local://u-xxxxx"
  roleTemplateName: "cluster-owner"

When using only local users, it is easier as you only specify local:// with the ID of the user. But if you use groups, you may not know what value to provide. The same applies to your custom roles. The easiest way is to manually assign your members, and read the YAML files created.

For this example, I am adding my GitHub group “teamA” as a cluster member, and a local user “autoscaler” as a cluster owner.

Now go to More Resources > RBAC > ClusterRoleBindings and sort by age.
You should find the values to specify for principalName and roleTemplateName.
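If you prefer the command line over the UI, the same values can be read with kubectl against the Rancher local cluster. This is a hedged example; the namespace and binding name are placeholders for the ones created in your environment:

# List the ClusterRoleTemplateBinding objects across namespaces, oldest first, then dump one
# of them to read the principalName/groupPrincipalName and roleTemplateName values:
kubectl get clusterroletemplatebindings.management.cattle.io -A --sort-by=.metadata.creationTimestamp
kubectl get clusterroletemplatebindings.management.cattle.io -n c-m-xxxxxxxx ctrb-xxxxxxxx -o yaml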

New issues

When the cluster is created for the first time, Rancher automatically creates the namespace. ClusterRoleTemplateBinding needs to be deployed into this namespace. Therefore it cannot be created at the creation of the cluster. In addition, it needs to wait for the Cluster resources to be provisioned by Rancher.

helm install --generate-name=true --namespace=fleet-default --timeout=10m0s --values=/home/shell/helm/values-dbiservices-template-ec2-0.0.1.yaml --version=0.0.1 --wait=true /home/shell/helm/dbiservices-template-ec2-0.0.1.tgz
2024-02-28T12:57:21.998195671Z Error: INSTALLATION FAILED: 1 error occurred:
2024-02-28T12:57:21.998226718Z 	* namespaces "c-m-dc91e1f4" not found

or 

2024-03-04T11:48:15.121066673Z 	* admission webhook "rancher.cattle.io.clusterroletemplatebindings.management.cattle.io" denied the request: clusterroletemplatebinding.clusterName: Invalid value: "c-m-dc91e1f4": specified cluster c-m-dc91e1f4 not found

Therefore, to ensure the assignment of users, the Helm chart must be upgraded after it has been deployed the first time.

In the local cluster, under Apps > Installed Apps, in the fleet-default namespace, edit/update the App and redeploy it.

You don’t need to change any values. It will correctly deploy the ClusterRoleTemplateBinding and assign the users/groups.

helm upgrade --history-max=5 --install=true --namespace=fleet-default --timeout=10m0s --values=/home/shell/helm/values-dbiservices-template-ec2-0.0.1.yaml --version=0.0.1 --wait=true dbiservices-template-ec2-0-1709125040 /home/shell/helm/dbiservices-template-ec2-0.0.1.tgz
2024-02-28T13:02:07.755854608Z checking 6 resources for changes
2024-02-28T13:02:07.770167607Z Patch Amazonec2Config "dbiservices-template-rke2-ec2-template-controlplane" in namespace fleet-default
2024-02-28T13:02:07.795076109Z Patch Amazonec2Config "dbiservices-template-rke2-ec2-template-workers" in namespace fleet-default
2024-02-28T13:02:07.831247897Z Patch Cluster "dbiservices-template-rke2-ec2" in namespace fleet-default
2024-02-28T13:02:07.931325393Z Created a new ClusterRoleTemplateBinding called "ctrb-d4063a0e" in c-m-dc91e1f4
2024-02-28T13:02:07.931347721Z 
2024-02-28T13:02:07.961962070Z Patch ManagedChart "monitoring-crd-dbiservices-template-rke2-ec2" in namespace fleet-default
2024-02-28T13:02:08.027434827Z Patch ManagedChart "monitoring-dbiservices-template-rke2-ec2" in namespace fleet-default
2024-02-28T13:02:08.065622593Z beginning wait for 6 resources with timeout of 10m0s
2024-02-28T13:02:08.126940145Z Release "dbiservices-template-ec2-0-1709125040" has been upgraded. Happy Helming!
2024-02-28T13:02:08.126959636Z NAME: dbiservices-template-ec2-0-1709125040
2024-02-28T13:02:08.126964700Z LAST DEPLOYED: Wed Feb 28 13:02:06 2024
2024-02-28T13:02:08.126969197Z NAMESPACE: fleet-default
2024-02-28T13:02:08.126973071Z STATUS: deployed
2024-02-28T13:02:08.126977312Z REVISION: 2
2024-02-28T13:02:08.126981410Z TEST SUITE: None
2024-02-28T13:02:08.150360499Z 
2024-02-28T13:02:08.150390675Z ---------------------------------------------------------------------
2024-02-28T13:02:08.156224976Z SUCCESS: helm upgrade --history-max=5 --install=true --namespace=fleet-default --timeout=10m0s --values=/home/shell/helm/values-dbiservices-template-ec2-0.0.1.yaml --version=0.0.1 --wait=true dbiservices-template-ec2-0-17091
/home/shell/helm/dbiservices-template-ec2-0.0.1.tgz
2024-02-28T13:02:08.157013523Z ---------------------------------------------------------------------

The following member roles have been created for the cluster and are shown in the cluster configuration.

- principalName: "local://u-g9bq8"
  roleTemplateName: "cluster-owner"
- principalName: "github_team://8426662"
  roleTemplateName: "cluster-member"
- principalName: "local://u-g9bq8"
  roleTemplateName: "rt-tz9xs"
Helm lookup

Due to that issue with the namespace, setting the management cluster name doesn’t bring much advantage, as we need to manually redeploy the App (cluster template) to assign the users/groups.

Therefore we can use the Helm lookup function, which can directly read the cluster name from the Kubernetes local cluster. As before, we need to redeploy the App (cluster template) after the first deployment, as it needs multiple resources to be provisioned by Rancher first.

Here is the code for the clusterroletemplatebinding.yaml

{{- $root := . }}
{{- $fetchedcluster :=  (lookup "provisioning.cattle.io/v1" "Cluster" "fleet-default" .Values.cluster.name) }}
{{- if ($fetchedcluster.status| default nil).clusterName | default nil }}
  {{- range $index, $member := .Values.clusterMembers }}
---
apiVersion: management.cattle.io/v3
clusterName: {{ $fetchedcluster.status.clusterName }}
kind: ClusterRoleTemplateBinding
metadata:
  name: ctrb-{{ trunc 8 (sha256sum (printf "%s/%s/%s" $root.Release.Namespace $member.principalName $member.roleTemplateName )) }}
  namespace: {{ $fetchedcluster.status.clusterName }}
roleTemplateName: {{ $member.roleTemplateName }}
userPrincipalName: {{ $member.principalName }}
  {{- end }}
{{- end }}

It will look into fleet-default for the cluster.provisioning.cattle.io/v1 resource that is created by the RKE2 template itself. On the first deployment of the RKE2 template, it doesn’t exist yet, therefore an “if” condition is added.
Once the RKE2 template is deployed, you can, as previously, edit the App to redeploy it, and it will then create the ClusterRoleTemplateBinding.

The helm install will show no errors as the ClusterRoleTemplateBinding resources are skipped.

helm install --generate-name=true --namespace=fleet-default --timeout=10m0s --values=/home/shell/helm/values-dbiservices-template-ec2-0.0.1.yaml --version=0.0.1 --wait=true /home/shell/helm/dbiservices-template-ec2-0.0.1.tgz
2024-02-28T13:18:48.886232477Z creating 5 resource(s)
beginning wait for 5 resources with timeout of 10m0s
2024-02-28T13:18:49.098629894Z NAME: dbiservices-template-ec2-0-1709126327
2024-02-28T13:18:49.098705783Z LAST DEPLOYED: Wed Feb 28 13:18:47 2024
NAMESPACE: fleet-default
STATUS: deployed
2024-02-28T13:18:49.098717154Z REVISION: 1
2024-02-28T13:18:49.098720727Z TEST SUITE: None
2024-02-28T13:18:49.126871035Z 
2024-02-28T13:18:49.126936065Z ---------------------------------------------------------------------
2024-02-28T13:18:49.135118662Z SUCCESS: helm install --generate-name=true --namespace=fleet-default --timeout=10m0s --values=/home/shell/helm/values-dbiservices-template-ec2-0.0.1.yaml --version=0.0.1 --wait=true /home/shell/helm/dbiservices-template-ec2-0.0.1.tgz
---------------------------------------------------------------------

And then, when redeploying the App, it does show the creation of the resources as the cluster name now exists.

helm upgrade --history-max=5 --install=true --namespace=fleet-default --timeout=10m0s --values=/home/shell/helm/values-dbiservices-template-ec2-0.0.1.yaml --version=0.0.1 --wait=true dbiservices-template-ec2-0-1709126327 /home/shell/helm/dbiservices-template-ec2-0.0.1.tgz
checking 8 resources for changes
Patch Amazonec2Config "kke-test-template-controlplane" in namespace fleet-default
Patch Amazonec2Config "kke-test-template-workers" in namespace fleet-default
Patch Cluster "kke-test" in namespace fleet-default
Created a new ClusterRoleTemplateBinding called "ctrb-96090621" in c-m-58wcfhnl
2024-02-28T13:24:13.652797054Z 
Created a new ClusterRoleTemplateBinding called "ctrb-2c866242" in c-m-58wcfhnl

Created a new ClusterRoleTemplateBinding called "ctrb-d4063a0e" in c-m-58wcfhnl
2024-02-28T13:24:13.689031588Z 
Patch ManagedChart "monitoring-crd-kke-test" in namespace fleet-default
Patch ManagedChart "monitoring-kke-test" in namespace fleet-default
beginning wait for 8 resources with timeout of 10m0s
Release "dbiservices-template-ec2-0-1709126327" has been upgraded. Happy Helming!
2024-02-28T13:24:13.872941208Z NAME: dbiservices-template-ec2-0-1709126327
2024-02-28T13:24:13.872946743Z LAST DEPLOYED: Wed Feb 28 13:24:11 2024
2024-02-28T13:24:13.872951269Z NAMESPACE: fleet-default
2024-02-28T13:24:13.872955002Z STATUS: deployed
2024-02-28T13:24:13.872958850Z REVISION: 4
2024-02-28T13:24:13.872962507Z TEST SUITE: None

---------------------------------------------------------------------
SUCCESS: helm upgrade --history-max=5 --install=true --namespace=fleet-default --timeout=10m0s --values=/home/shell/helm/values-dbiservices-template-ec2-0.0.1.yaml --version=0.0.1 --wait=true dbiservices-template-ec2-0-1709126327 /home/shell/helm/dbiservices-template-ec2-0.0.1.tgz
---------------------------------------------------------------------
Conclusion

The assignment of users/groups to a cluster through a template is not simple. My approach might be wrong, but it was the first solution I thought of when I encountered the problem.
Having to wait for the first Helm deployment is a little bit annoying, but it is not much of an issue once you are aware that the RKE2 template must be redeployed. Also, if you use Fleet to continuously deploy/update managed clusters, you can add the values for the member configuration after the first deployment to avoid managing it through the UI as above.

Below is the link to the GitHub Repository branch for the RKE2 templates using a fixed management cluster name and Helm function lookup.

Sources Blog


DevOps Demystified for Non-techies

Tue, 2024-03-12 02:19

I’ve recently written 2 blog posts (part-1 and part-2) about DevOps without using its name because it still scares some people. However these posts were meant mainly for a technical audience working in the Information Technology (IT) industry. But what about non-techies (in IT or outside of it) who still need to grasp the gist of DevOps without all the technical jibber-jabber?

Also, if you work as a DevOps consultant like me, have you ever tried to explain your job to your loved ones and only been met with blank stares? I’ve written this blog post for you as well, so read along!

A DevOps Analogy

The easiest way to demystify DevOps is to use an analogy that speaks to everyone: A restaurant! In any type of restaurant you have basically 2 teams: The ones who cook and the ones who serve the customers. First revelation here: DevOps is a way of working, so it could apply to everything! DevOps uses a methodology that is represented by an infinite loop as shown below. It follows several steps, and we can keep the same step names used in IT when applying it to a restaurant.

DevOps analogy with cooking

The Cook Team

The first step for the cook team is to plan the menu and all the dishes they intend to deliver to their customers for each meal. They also have to plan for those dishes to be ready in a timely manner after the customers place an order. There is the general plan to create a menu for the customers, and there is also a more dynamic plan to be able to cook each dish ordered by the customers through the serving team.

When this plan is settled, the next step in the DevOps methodology would be to code. Yes in the IT world that is what some people are doing behind their computer! Instead, in a restaurant, the cook team will have to buy the ingredients and prepare them. They will cut the vegetables in small chunks, prepare the sauces, put the meat and the fish into the fridge. During the meal service, they continuously prepare some ingredients in advance to be ready quickly when a new dish order arrives.

The ingredients being ready, they can now put everything together and prepare each dish of the menu with them. That is the build step. Each dish is made of the ingredients prepared previously. During the meal service, they receive an order and just have to build that dish.

Before the cook team hands over the dishes to the serving team, they test it. You may have seen a cook (often the chef) tasting the sauce before putting it on the plate, or checking the temperature of the meat. They want to make sure the dish is perfect before being served to the customer.

The Serving Team

When the cook team is happy with their work, the serving team takes over these dishes. That is the release step. The waitresses and waiters carefully hold these dishes with sometimes several plates on each arm!

The food is then served to the customers all over the restaurant, that is the deploy step. The dishes are carried to the customer according to what they’ve ordered.

That is also the work of the serving team to put the dish on the table of the customer, set the cutlery and fill the glasses. That is the operate step.

Finally, the serving team checks with the customers that everything is all right and that they are happy with their dishes. That is the monitor step, where they also take notes if there are some leftovers on the plate or if some customers were unhappy.

The serving team also monitors and takes care of new customers entering the restaurant and writes their order. They pass it on to the cook team and another iteration of the DevOps loop starts. So during a meal, the restaurant goes several times through the DevOps loop.

In a restaurant, this whole process starts again and again for each meal every day. The information gathered during the monitor step is used to plan the next meal and make some changes if required.

The DevOps Consultant Role

You’ve seen that the DevOps methodology can apply to running a restaurant! Would you have thought of that?

In a restaurant, we just don’t call this process DevOps, but the way it is organized is quite similar. A well managed restaurant is packed with customers, and the dishes are delicious and delivered on time. Overall the restaurant is profitable and has a good reputation. It can even earn Michelin stars and be recognized as a top restaurant by the industry. To reach that level, the DevOps loop needs to flow smoothly. It is the goal of a DevOps consultant to help you with this if you are not there yet.

What could possibly go wrong? Well, you may have watched the show “Hell’s Kitchen”! It is about dysfunctional restaurants that need help to make it all work. In that case there may be issues at any step, from planning to monitoring. Gordon Ramsay would act like a DevOps consultant here. Of course the DevOps consultant doesn’t swear and is always polite with his customers!

If you have been to a restaurant, you know that sometimes not everything goes right. The dish can be served cold, you wait for an hour before getting served, nobody seems available to take your order… There could also be a lack of collaboration between the cook team and the serving team. It could take too long to prepare the dishes or too long to bring a dish to the customer. Each team will point to the other when the customer complains.

All these issues occur when the DevOps loop doesn’t flow smoothly, so you hire a DevOps consultant for help. He will review and investigate each step of this loop and make suggestions on how to improve it.

DevOps in IT

Origin of DevOps

You are now ready to learn more about DevOps in IT! But first, did you know that DevOps was not born in the IT industry? It actually comes from the automotive industry. Yes, DevOps is inspired by the Toyota Production System and its continuous improvement of its manufacturing process to build cars.

DevOps is also seen as an evolution of the Agile methodology. Agile is a project management approach that involves breaking the project into phases and emphasizes continuous collaboration and improvement.

In IT we don’t deliver dishes or cars in the physical world; we deliver applications. An application could be a website, for example, where customers can buy goods. To create this website we need 2 teams: The developers (Dev) and the operations team (Ops). This is where the word DevOps comes from. It binds the development and operations teams through the DevOps loop to collaborate and work efficiently together. Traditionally these 2 teams worked in silos, and the delivery of the application (or just a simple update of it) was often slow. Applying the DevOps methodology can remediate that by accelerating this delivery time (called Time to Market).

DevOps loop steps

You have seen above we can easily translate the DevOps loop steps from IT to the restaurant business. What changes is what you deliver to your customers at the end. A dish for a restaurant or an application for IT.

Working in the IT industry means using computers (of course) and their digital tools. DevOps leverages that at every step of its loop. There are a lot of tools (too many actually) you can use. Some are quite popular and you may even have heard about them. Let’s have a look at these steps as they are used in IT.

The Dev Team

Planning is basically still done through the usual IT project tools. You list all the tasks to do and put deadlines for each of them. That’s a very quick summary but I don’t want to confuse you more. The Dev Team is involved in this step along with project managers.

Developers code (meaning they write sequences of instructions in a programming language) and often use a tool called Git to share their work with the other developers in the team. It is a great tool to put the work of all developers together and track the history of the modifications.

When the development team has gathered the final code from everyone, they are just happy to share it! This step is called build where they produce a single package of your application that contains all the code from everyone.

Before they give this package to the operations team, they prefer to run some tests against it to be sure all the features they have coded work well. It would be bad for a customer to click on a button that doesn’t work. This is the test step.

The developers may have fixed a few bugs found during this testing, but they are now confident that the package can be used by the customers. They hand that package over to the operations team.

The Ops Team

The operations team gets that release, deploys it on an infrastructure (a bunch of computers with some digital tools that host all of your applications) and makes it available to the customers. Usually they apply a deployment strategy, for example making the application available only to a subset of users first. If everything works well, it is then made available to all users.

The operations team also operates the infrastructure that hosts the application. This infrastructure could be virtual (yes, it is like a computer in another computer!) and even consist of little virtual pieces called containers. A container can be created by a tool called Docker, and if there are a lot of them we need an orchestration tool called Kubernetes. If this infrastructure is virtual, we can code it as well! We then use a tool like Ansible for this purpose.

When this application is used by the customers, it is carefully monitored to detect issues or receive feature requests from the customers. That will start another DevOps loop and if everything flows smoothly then the customers will quickly see the results and benefit from them. The infrastructure that hosts your applications is also continuously monitored to prevent (if possible) or react to any failures in it. It is important to keep this infrastructure up and running otherwise your applications can’t run properly.

What is this DevOps pipeline thing?

Even if you are non-technical, you may have heard the word pipeline. Is it related to plumbing or something? Know that a computer scientist (a more polished name than a geek!) only touches a computer. No real tools from the physical world. So where do you place a pipeline in this DevOps loop?

You also have to know that a good computer scientist is a lazy one. He likes to automate everything (meaning write code) to avoid doing tasks manually and to never do something twice. The coding part, he has to do manually for now. When his code is done, there are some tools to automate the build and test phases. This is what we call a pipeline. The pipeline is itself some code to automate and string several steps of the DevOps loop together. From the end of the code step to the test step, everything happens automatically, like a domino effect.

But that’s not all: when the operations team releases that new application, they can also use a pipeline. They are lazy too! So they use a pipeline that will get a release and deploy it automatically!

By putting the developers and the operations team together, they can be even lazier! They can build a mega pipeline! This pipeline then runs from the end of the code step to the end of the operate step! That is a strong acceleration inside the loop, but you will need to wear a helmet to not crash out of the DevOps loop! By using this mega pipeline, a developer could push his code directly to production… It is like a cook serving his dish directly to the customer!

Of course, in reality building such a pipeline takes a lot of time and effort.

Wrap up

This blog post was kind of a DevOps “Origin” to help non-techies understand what it is. From an analogy about managing a restaurant with DevOps we moved to how it is used in the IT industry.

If you want to go further, you can now confidently read my blog post about DevOps in IT for more details. You’ll better understand how an IT company can leverage DevOps.

Finally, to keep you well informed and on top of that topic: The name DevOps is increasingly being replaced by the term “Platform Engineering”. It focuses mainly on improving the developer experience, but the methodology is the same.

At dbi services we provide DevOps consulting and training courses (on Ansible, GitLab, and Docker and Kubernetes), and we give webinars and presentations at events. Check out our website and follow us to stay informed!


Getting started with Greenplum – 5 – Recovering from failed segment nodes

Thu, 2024-03-07 08:19

This is the next post in this little Greenplum series. This time we’ll look at how we can recover from a failed segment. If you are looking for the previous posts, they are here: Getting started with Greenplum – 1 – Installation, Getting started with Greenplum – 2 – Initializing and bringing up the cluster, Getting started with Greenplum – 3 – Behind the scenes, Getting started with Greenplum – 4 – Backup & Restore – databases.

Let’s quickly come back to what we’ve deployed currently:

                                        |-------------------|
                             |------6000|primary---------   |
                             |          |     Segment 1 |   |
                             |      7000|mirror<------| |   |
                             |          |-------------------|
                             |                        | |
            |-------------------|                     | |
            |                   |                     | |
        5432|   Coordinator     |                     | |
            |                   |                     | |
            |-------------------|                     | |
                             |                        | |
                             |          |-------------------|
                             |------6000|primary ------ |   |
                                        |     Segment 2 |   |
                                    7000|mirror<--------|   |
                                        |-------------------|

The coordinator host is the entry point for the application and requests are routed to the segment hosts. The idea behind this is that you can use the power of multiple (segment) hosts to deliver what you’ve asked for. The more segment hosts you add, the more compute resources you can use.

The question is: How can you recover from a failed segment node? With the current deployment this would reduce the compute resources by 50%, and you probably want to have this back online as soon as possible.

To get the current status of your segments you can use “gpstate”:

[gpadmin@cdw ~]$ gpstate
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: 
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-Greenplum instance status summary
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Coordinator instance                                      = Active
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Coordinator standby                                       = No coordinator standby configured
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total segment instance count from metadata                = 4
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Primary Segment Status
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total primary segments                                    = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total primary segment valid (at coordinator)              = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total primary segment failures (at coordinator)           = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number postmaster processes found                   = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Mirror Segment Status
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total mirror segments                                     = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total mirror segment valid (at coordinator)               = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total mirror segment failures (at coordinator)            = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number postmaster processes found                   = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number mirror segments acting as primary segments   = 0
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-   Total number mirror segments acting as mirror segments    = 2
20240307:11:03:16:001723 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------

This confirms that all is fine as of now. There are two primary and two mirror segments and all of them are up and running. You can also ask “gpstate” to only check for segments which have issues:

[gpadmin@cdw ~]$ gpstate -e
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:10:15:001862 gpstate:cdw:gpadmin-[INFO]:-All segments are running normally
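The same information is also kept in the gp_segment_configuration catalog on the coordinator, which can be handy for scripting. The following query is a hedged example, run as gpadmin on the coordinator host (status 'u' means up, 'd' means down; role 'p' is primary, 'm' is mirror):

psql -d postgres -c "select dbid, content, role, preferred_role, status, hostname, port, datadir from gp_segment_configuration order by content, role;"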

There are several scenarios which can happen to a segment: you can lose the full node, for whatever reason; a specific PostgreSQL instance, either a primary or a mirror segment, can stop running; or the PGDATA of a specific segment can get corrupted.

Let’s start with a segment instance which went down. To force this, let’s kill the mirror instance on the first segment node (sdw1):

[gpadmin@sdw1 ~]$ ps -ef | egrep "7000|mirror" | grep -v grep
gpadmin     1343       1  0 10:50 ?        00:00:00 /usr/local/greenplum-db-7.1.0/bin/postgres -D /data/mirror/gpseg1 -c gp_role=execute
gpadmin     1344    1343  0 10:50 ?        00:00:00 postgres:  7000, logger process   
gpadmin     1346    1343  0 10:50 ?        00:00:00 postgres:  7000, startup   recovering 000000010000000000000004
gpadmin     1348    1343  0 10:50 ?        00:00:00 postgres:  7000, checkpointer   
gpadmin     1349    1343  0 10:50 ?        00:00:00 postgres:  7000, background writer   
gpadmin     1372    1343  0 10:50 ?        00:00:01 postgres:  7000, walreceiver   streaming 0/127CD2A8
[gpadmin@sdw1 ~]$ kill -9 1372 1349 1348 1346 1344 1343
[gpadmin@sdw1 ~]$ ps -ef | egrep "7000|mirror" | grep -v grep
[gpadmin@sdw1 ~]$ 

The coordinator should become aware of this:

[gpadmin@cdw ~]$ gpstate -e
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[WARNING]:-pg_stat_replication shows no standby connections
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Unsynchronized Segment Pairs
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-   Current Primary   Port   WAL sync remaining bytes   Mirror   Port
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-   sdw2              6000   Unknown                    sdw1     7000
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-Downed Segments (may include segments where status could not be retrieved)
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-   Segment   Port   Config status   Status
20240307:11:44:25:002261 gpstate:cdw:gpadmin-[INFO]:-   sdw1      7000   Down            Down in configuration

This confirms that the segment is down and the coordinator is aware of it. In the simplest case you can just use “gprecoverseg” to recover any failed segments like this:

[gpadmin@cdw ~]$ gprecoverseg
20240307:11:45:33:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Starting gprecoverseg with args: 
20240307:11:45:33:002349 gprecoverseg:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:11:45:33:002349 gprecoverseg:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:11:45:33:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/primary/gpseg1 for dbid 3:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201631847636
Database cluster state:               in production
pg_control last modified:             Thu 07 Mar 2024 10:55:32 AM CET
Latest checkpoint location:           0/127CD1E8
Latest checkpoint's REDO location:    0/127CD1B0
Latest checkpoint's REDO WAL file:    000000010000000000000004
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  545
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 10:55:32 AM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            ae03e3dc891309211ede650a2e28c1b7dac2e510970a912d9f428761f243896e

stderr: 
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/mirror/gpseg1 for dbid 5:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201631847636
Database cluster state:               in archive recovery
pg_control last modified:             Thu 07 Mar 2024 11:00:32 AM CET
Latest checkpoint location:           0/127CD1E8
Latest checkpoint's REDO location:    0/127CD1B0
Latest checkpoint's REDO WAL file:    000000010000000000000004
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  545
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 10:55:32 AM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/127CD2A8
Min recovery ending loc's timeline:   1
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            ae03e3dc891309211ede650a2e28c1b7dac2e510970a912d9f428761f243896e

stderr: 
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Heap checksum setting is consistent between coordinator and the segments that are candidates for recoverseg
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery type              = Standard
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery 1 of 1
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Synchronization mode                 = Incremental
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance host                 = sdw1
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance address              = sdw1
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance directory            = /data/mirror/gpseg1
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance port                 = 7000
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance host        = sdw2
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance address     = sdw2
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance directory   = /data/primary/gpseg1
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance port        = 6000
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Target                      = in-place
20240307:11:45:34:002349 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------

Continue with segment recovery procedure Yy|Nn (default=N):

Once we confirm this, the failed mirror instance is recovered from its primary segment on the other host (sdw2):

Continue with segment recovery procedure Yy|Nn (default=N):
> y
20240307:11:48:44:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Starting to create new pg_hba.conf on primary segments
20240307:11:48:44:002349 gprecoverseg:cdw:gpadmin-[INFO]:-killing existing walsender process on primary sdw2:6000 to refresh replication connection
20240307:11:48:44:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully modified pg_hba.conf on primary segments to allow replication connections
20240307:11:48:44:002349 gprecoverseg:cdw:gpadmin-[INFO]:-1 segment(s) to recover
20240307:11:48:44:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20240307:11:48:45:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Setting up the required segments for recovery
20240307:11:48:45:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Updating configuration for mirrors
20240307:11:48:45:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Initiating segment recovery. Upon completion, will start the successfully recovered segments
20240307:11:48:45:002349 gprecoverseg:cdw:gpadmin-[INFO]:-era is 5519b53b4b2c1dab_240307105028
sdw1 (dbid 5): skipping pg_rewind on mirror as standby.signal is present
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering FTS probe
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Segments successfully recovered.
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Recovered mirror segments need to sync WAL with primary segments.
20240307:11:48:46:002349 gprecoverseg:cdw:gpadmin-[INFO]:-Use 'gpstate -e' to check progress of WAL sync remaining bytes
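
If you want to run this recovery non-interactively, for example from a script, the gp utilities accept the -a option to skip the confirmation prompt. A minimal sketch:

[gpadmin@cdw ~]$ gprecoverseg -a    # -a: do not prompt for confirmation
[gpadmin@cdw ~]$ gpstate -e         # check the WAL sync progress afterwards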

Asking for any failed segments again confirms that all went well and the failed instance is back online:

[gpadmin@cdw ~]$ gpstate -e
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:11:49:16:002438 gpstate:cdw:gpadmin-[INFO]:-All segments are running normally

This was the easy case. What happens if we remove the PGDATA of a primary segment? We’ll use the same node for this test and remove the PGDATA of the primary instance on node sdw1:

[gpadmin@sdw1 ~]$ ps -ef | egrep "6000|primary" | grep -v grep
gpadmin     1329       1  0 10:50 ?        00:00:00 /usr/local/greenplum-db-7.1.0/bin/postgres -D /data/primary/gpseg0 -c gp_role=execute
gpadmin     1345    1329  0 10:50 ?        00:00:00 postgres:  6000, logger process   
gpadmin     1352    1329  0 10:50 ?        00:00:00 postgres:  6000, checkpointer   
gpadmin     1353    1329  0 10:50 ?        00:00:00 postgres:  6000, background writer   
gpadmin     1354    1329  0 10:50 ?        00:00:00 postgres:  6000, walwriter   
gpadmin     1355    1329  0 10:50 ?        00:00:00 postgres:  6000, autovacuum launcher   
gpadmin     1356    1329  0 10:50 ?        00:00:00 postgres:  6000, stats collector   
gpadmin     1357    1329  0 10:50 ?        00:00:00 postgres:  6000, logical replication launcher   
gpadmin     1360    1329  0 10:50 ?        00:00:00 postgres:  6000, walsender gpadmin 192.168.122.202(40808) streaming 0/127E5FE8
[gpadmin@sdw1 ~]$ rm -rf /data/primary/gpseg0/*
[gpadmin@sdw1 ~]$ ps -ef | egrep "6000|primary" | grep -v grep
[gpadmin@sdw1 ~]$ 

Again, the coordinator node is of course aware of that:

[gpadmin@cdw ~]$ gpstate -e
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[WARNING]:-pg_stat_replication shows no standby connections
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Segments with Primary and Mirror Roles Switched
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   Current Primary   Port   Mirror   Port
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   sdw2              7000   sdw1     6000
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Unsynchronized Segment Pairs
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   Current Primary   Port   WAL sync remaining bytes   Mirror   Port
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   sdw2              7000   Unknown                    sdw1     6000
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-Downed Segments (may include segments where status could not be retrieved)
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   Segment   Port   Config status   Status
20240307:15:00:20:004247 gpstate:cdw:gpadmin-[INFO]:-   sdw1      6000   Down            Down in configuration

Trying to recover in the same way as before:

[gpadmin@cdw ~]$ gprecoverseg 
20240307:15:01:37:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Starting gprecoverseg with args: 
20240307:15:01:37:004431 gprecoverseg:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:01:37:004431 gprecoverseg:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:01:37:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/mirror/gpseg0 for dbid 4:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201624057424
Database cluster state:               in production
pg_control last modified:             Thu 07 Mar 2024 02:59:57 PM CET
Latest checkpoint location:           0/127E6088
Latest checkpoint's REDO location:    0/127E6018
Latest checkpoint's REDO WAL file:    000000020000000000000004
Latest checkpoint's TimeLineID:       2
Latest checkpoint's PrevTimeLineID:   2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  545
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 02:59:57 PM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            300975858d3712b8d6ecd6583814e3bc603e12304aa764d1c659721e205dc0ad

stderr: 
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[WARNING]:-cannot access pg_controldata for dbid 2 on host sdw1
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Heap checksum setting is consistent between coordinator and the segments that are candidates for recoverseg
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery type              = Standard
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery 1 of 1
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Synchronization mode                 = Incremental
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance host                 = sdw1
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance address              = sdw1
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance directory            = /data/primary/gpseg0
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance port                 = 6000
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance host        = sdw2
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance address     = sdw2
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance directory   = /data/mirror/gpseg0
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance port        = 7000
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Target                      = in-place
20240307:15:01:38:004431 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------

Continue with segment recovery procedure Yy|Nn (default=N):
> y
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Starting to create new pg_hba.conf on primary segments
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-killing existing walsender process on primary sdw2:7000 to refresh replication connection
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully modified pg_hba.conf on primary segments to allow replication connections
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-1 segment(s) to recover
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20240307:15:01:52:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Setting up the required segments for recovery
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Updating configuration for mirrors
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Initiating segment recovery. Upon completion, will start the successfully recovered segments
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-era is 5519b53b4b2c1dab_240307105028
sdw1 (dbid 2): pg_rewind: fatal: could not open file "/data/primary/gpseg0/global/pg_control" for reading: No such file or directory
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-----------------------------------------------------------
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Failed to recover the following segments. You must run either gprecoverseg --differential or gprecoverseg -F for all incremental failures
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:- hostname: sdw1; port: 6000; logfile: /home/gpadmin/gpAdminLogs/pg_rewind.20240307_150152.dbid2.out; recoverytype: incremental
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering FTS probe
20240307:15:01:53:004431 gprecoverseg:cdw:gpadmin-[ERROR]:-gprecoverseg failed. Please check the output for more details.

This fails because “pg_control” is not available anymore. By default, recovery does an incremental recovery, and this can no longer work. Now we need to do a full recovery:

[gpadmin@cdw ~]$ gprecoverseg -F
20240307:15:03:15:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -F
20240307:15:03:15:004491 gprecoverseg:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:03:15:004491 gprecoverseg:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:03:15:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/mirror/gpseg0 for dbid 4:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201624057424
Database cluster state:               in production
pg_control last modified:             Thu 07 Mar 2024 03:01:53 PM CET
Latest checkpoint location:           0/127E6180
Latest checkpoint's REDO location:    0/127E6148
Latest checkpoint's REDO WAL file:    000000020000000000000004
Latest checkpoint's TimeLineID:       2
Latest checkpoint's PrevTimeLineID:   2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  545
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 03:01:53 PM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            300975858d3712b8d6ecd6583814e3bc603e12304aa764d1c659721e205dc0ad

stderr: 
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[WARNING]:-cannot access pg_controldata for dbid 2 on host sdw1
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Heap checksum setting is consistent between coordinator and the segments that are candidates for recoverseg
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery type              = Standard
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery 1 of 1
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Synchronization mode                 = Full
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance host                 = sdw1
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance address              = sdw1
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance directory            = /data/primary/gpseg0
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance port                 = 6000
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance host        = sdw2
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance address     = sdw2
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance directory   = /data/mirror/gpseg0
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance port        = 7000
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Target                      = in-place
20240307:15:03:16:004491 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------

Continue with segment recovery procedure Yy|Nn (default=N):
> y
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Starting to create new pg_hba.conf on primary segments
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-killing existing walsender process on primary sdw2:7000 to refresh replication connection
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully modified pg_hba.conf on primary segments to allow replication connections
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-1 segment(s) to recover
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20240307:15:03:21:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Setting up the required segments for recovery
20240307:15:03:22:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Updating configuration for mirrors
20240307:15:03:22:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Initiating segment recovery. Upon completion, will start the successfully recovered segments
20240307:15:03:22:004491 gprecoverseg:cdw:gpadmin-[INFO]:-era is 5519b53b4b2c1dab_240307105028
sdw1 (dbid 2): pg_basebackup: base backup completed
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering FTS probe
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Segments successfully recovered.
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Recovered mirror segments need to sync WAL with primary segments.
20240307:15:03:25:004491 gprecoverseg:cdw:gpadmin-[INFO]:-Use 'gpstate -e' to check progress of WAL sync remaining bytes
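
As a side note, the error message from the failed incremental attempt above also mentioned gprecoverseg --differential as an alternative to -F. A minimal sketch, assuming the option behaves as advertised there (copying only what differs instead of taking a complete new base backup):

[gpadmin@cdw ~]$ gprecoverseg --differential    # differential recovery instead of a full one
[gpadmin@cdw ~]$ gpstate -e                     # verify the segment pair is synchronizing again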

This worked, but now we see the following when we ask for failed segments again:

[gpadmin@cdw ~]$ gpstate -e
20240307:15:04:35:004552 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-Segments with Primary and Mirror Roles Switched
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-   Current Primary   Port   Mirror   Port
20240307:15:04:36:004552 gpstate:cdw:gpadmin-[INFO]:-   sdw2              7000   sdw1     6000

The reason is that not all instances are in their preferred role anymore:

[gpadmin@cdw ~]$ psql -c "select * from gp_segment_configuration" postgres
 dbid | content | role | preferred_role | mode | status | port | hostname | address |          datadir          
------+---------+------+----------------+------+--------+------+----------+---------+---------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | cdw      | cdw     | /data/coordinator/gpseg-1
    3 |       1 | p    | p              | s    | u      | 6000 | sdw2     | sdw2    | /data/primary/gpseg1
    5 |       1 | m    | m              | s    | u      | 7000 | sdw1     | sdw1    | /data/mirror/gpseg1
    4 |       0 | p    | m              | s    | u      | 7000 | sdw2     | sdw2    | /data/mirror/gpseg0
    2 |       0 | m    | p              | s    | u      | 6000 | sdw1     | sdw1    | /data/primary/gpseg0

When you are in such a state, you should re-balance the segments:

[gpadmin@cdw ~]$ gprecoverseg -r
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Starting gprecoverseg with args: -r
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery type              = Rebalance
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Unbalanced segment 1 of 2
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance host        = sdw2
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance address     = sdw2
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance directory   = /data/mirror/gpseg0
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance port        = 7000
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Balanced role                   = Mirror
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Current role                    = Primary
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Unbalanced segment 2 of 2
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance host        = sdw1
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance address     = sdw1
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance directory   = /data/primary/gpseg0
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Unbalanced instance port        = 6000
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Balanced role                   = Primary
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Current role                    = Mirror
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[WARNING]:-This operation will cancel queries that are currently executing.
20240307:15:08:32:004692 gprecoverseg:cdw:gpadmin-[WARNING]:-Connections to the database however will not be interrupted.

Continue with segment rebalance procedure Yy|Nn (default=N):
> y
20240307:15:09:53:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Determining primary and mirror segment pairs to rebalance
20240307:15:09:53:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Allowed replay lag during rebalance is 10 GB
20240307:15:09:53:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Stopping unbalanced primary segments...
.
20240307:15:09:55:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering segment reconfiguration
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Starting segment synchronization
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-=============================START ANOTHER RECOVER=========================================
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/primary/gpseg0 for dbid 2:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201624057424
Database cluster state:               in production
pg_control last modified:             Thu 07 Mar 2024 03:10:00 PM CET
Latest checkpoint location:           0/18000280
Latest checkpoint's REDO location:    0/18000210
Latest checkpoint's REDO WAL file:    000000030000000000000006
Latest checkpoint's TimeLineID:       3
Latest checkpoint's PrevTimeLineID:   3
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  545
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 03:10:00 PM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            300975858d3712b8d6ecd6583814e3bc603e12304aa764d1c659721e205dc0ad

stderr: 
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully finished pg_controldata /data/mirror/gpseg0 for dbid 4:
stdout: pg_control version number:            12010700
Catalog version number:               302307241
Database system identifier:           7340990201624057424
Database cluster state:               shut down
pg_control last modified:             Thu 07 Mar 2024 03:09:54 PM CET
Latest checkpoint location:           0/18000158
Latest checkpoint's REDO location:    0/18000158
Latest checkpoint's REDO WAL file:    000000020000000000000006
Latest checkpoint's TimeLineID:       2
Latest checkpoint's PrevTimeLineID:   2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:545
Latest checkpoint's NextGxid:         25
Latest checkpoint's NextOID:          17451
Latest checkpoint's NextRelfilenode:  16392
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        529
Latest checkpoint's oldestXID's DB:   13719
Latest checkpoint's oldestActiveXID:  0
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 13720
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Thu 07 Mar 2024 03:09:54 PM CET
Fake LSN counter for unlogged rels:   0/3E8
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                off
max_connections setting:              750
max_worker_processes setting:         12
max_wal_senders setting:              10
max_prepared_xacts setting:           250
max_locks_per_xact setting:           128
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  32768
Blocks per segment of large relation: 32768
WAL block size:                       32768
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        8140
Size of a large-object chunk:         8192
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           1
Mock authentication nonce:            300975858d3712b8d6ecd6583814e3bc603e12304aa764d1c659721e205dc0ad

stderr: 
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Heap checksum setting is consistent between coordinator and the segments that are candidates for recoverseg
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Greenplum instance recovery parameters
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery type              = Standard
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Recovery 1 of 1
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Synchronization mode                 = Incremental
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance host                 = sdw2
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance address              = sdw2
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance directory            = /data/mirror/gpseg0
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Failed instance port                 = 7000
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance host        = sdw1
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance address     = sdw1
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance directory   = /data/primary/gpseg0
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Source instance port        = 6000
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-   Recovery Target                      = in-place
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:----------------------------------------------------------
20240307:15:10:02:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Starting to create new pg_hba.conf on primary segments
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-killing existing walsender process on primary sdw1:6000 to refresh replication connection
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Successfully modified pg_hba.conf on primary segments to allow replication connections
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-1 segment(s) to recover
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Setting up the required segments for recovery
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Updating configuration for mirrors
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Initiating segment recovery. Upon completion, will start the successfully recovered segments
20240307:15:10:03:004692 gprecoverseg:cdw:gpadmin-[INFO]:-era is 5519b53b4b2c1dab_240307105028
sdw2 (dbid 4): pg_rewind: no rewind required
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Triggering FTS probe
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Segments successfully recovered.
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-********************************
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Recovered mirror segments need to sync WAL with primary segments.
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-Use 'gpstate -e' to check progress of WAL sync remaining bytes
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-==============================END ANOTHER RECOVER==========================================
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-******************************************************************
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-The rebalance operation has completed successfully.
20240307:15:10:04:004692 gprecoverseg:cdw:gpadmin-[INFO]:-******************************************************************

Once this has completed, we are back to normal operations:

[gpadmin@cdw ~]$ gpstate -e
20240307:15:11:15:004815 gpstate:cdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20240307:15:11:15:004815 gpstate:cdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240307:15:11:15:004815 gpstate:cdw:gpadmin-[INFO]:-coordinator Greenplum Version: 'PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.'
20240307:15:11:15:004815 gpstate:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240307:15:11:15:004815 gpstate:cdw:gpadmin-[INFO]:-Gathering data from segments...
20240307:15:11:16:004815 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:11:16:004815 gpstate:cdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20240307:15:11:16:004815 gpstate:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240307:15:11:16:004815 gpstate:cdw:gpadmin-[INFO]:-All segments are running normally

[gpadmin@cdw ~]$ psql -c "select * from gp_segment_configuration" postgres
 dbid | content | role | preferred_role | mode | status | port | hostname | address |          datadir          
------+---------+------+----------------+------+--------+------+----------+---------+---------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | cdw      | cdw     | /data/coordinator/gpseg-1
    3 |       1 | p    | p              | s    | u      | 6000 | sdw2     | sdw2    | /data/primary/gpseg1
    5 |       1 | m    | m              | s    | u      | 7000 | sdw1     | sdw1    | /data/mirror/gpseg1
    2 |       0 | p    | p              | s    | u      | 6000 | sdw1     | sdw1    | /data/primary/gpseg0
    4 |       0 | m    | m              | s    | u      | 7000 | sdw2     | sdw2    | /data/mirror/gpseg0
(5 rows)

What we did here was recovering “in-place”, which means on the same node. Recovery onto another node is possible as well, as long as the new node comes with the same configuration as the current one (Greenplum release, OS version, …).

The important point is that you definitely should go with mirror segments. Of course you need double the space per node, but it makes recovery a lot easier.
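
If you want to check this regularly, the same catalog table we queried above can tell you whether anything needs attention. A minimal sketch, assuming the status column uses 'd' for a downed segment:

# anything other than 0 means a segment is down or running outside its preferred role
[gpadmin@cdw ~]$ psql -At -c "select count(*) from gp_segment_configuration where status = 'd' or role <> preferred_role" postgres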

L’article Getting started with Greenplum – 5 – Recovering from failed segment nodes est apparu en premier sur dbi Blog.

Rancher RKE2: Introduction on RKE2 cluster template for AWS EC2

Thu, 2024-03-07 04:42

In Rancher, you can preconfigure your clusters and node configuration so that your team can provision clusters consistently and without misconfiguration. For RKE, templates are easy to understand and seamlessly integrated into the Rancher UI, which makes them user-friendly. For RKE2, we need to make use of the RKE2 cluster template.

Get started with the RKE2 cluster template

For RKE2, you need to make use of cluster templates. A cluster template is a Helm chart that you deploy into your Rancher management cluster (local) under the fleet-default namespace. Once these resources are present, Rancher provisions your cluster.

To help you understand how it works, we will deploy one RKE2 cluster on AWS EC2 using a simplified RKE2 cluster template.

Cluster specifications

The Rancher version is 2.8.2.

For our example, you need to specify a VPC ID and subnet ID. I am using a private subnet and all machines will use private IP addresses.

Also, you need to set the region and availability zone. For our example, it is set to the region eu-central-1 and availability zone a.

Template files

The RKE2 cluster template is available at the following git repository: https://github.com/kkedbi/cluster-template-examples/tree/rke2-ec2

It is a classic Helm repository; for more information about Helm, see https://helm.sh/
Let’s look into some of the files.

.
├── charts 
│   ├── Chart.yaml # A YAML file containing information about the chart
│   ├── README.md
│   ├── questions.yaml
│   ├── templates
│   │   ├── _helpers.tpl
│   │   ├── cluster.yaml
│   │   ├── managedcharts.yaml
│   │   └── nodeconfig-aws.yaml
│   └── values.yaml # The default configuration values for this chart
├── dbiservices-template-ec2-0.0.1.tgz
└── index.yaml

The file Chart.yaml contains the information about the chart. You can define various metadata about your chart, like the name, description, version, etc. (https://helm.sh/docs/topics/charts/).

apiVersion: v1
name: dbiservices-template-ec2
description: Cluster template for amazon ec2 rke2 
version: 0.0.1
annotations:
  catalog.cattle.io/type: cluster-template
  catalog.cattle.io/namespace: fleet-default

Note that the following annotation is mandatory to make the RKE2 Cluster template visible in Rancher.

catalog.cattle.io/type: cluster-template

The directory <templates> contains templates of the Kubernetes resources.

  • cluster.yaml is for the resource cluster.provisioning.cattle.io/v1
    It contains the cluster configuration (kubernetes configuration, machine pools, update strategy, etc.)
  • managedcharts.yaml contains charts to install onto the provisioned cluster.
  • nodeconfig-aws.yaml contains the configuration for AWS EC2 machines.

Those files are forked from the official example GitHub repository of Rancher.
Feel free to create or edit those files to meet your needs.

An easy solution to create a template that matches your needs is to manually create the RKE2 cluster from Rancher and download the corresponding YAML resource files. This way, you will be sure which parameters are used and which values to set.

The resources for the cluster and machines are available in the UI.
Cluster Management > Cluster

local > More Resources > Amazonec2Config
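
If you prefer the command line over the UI, the same resources can be exported with kubectl. A minimal sketch, assuming a manually created cluster named my-test-cluster; the exact CRD names may differ slightly between Rancher versions:

# Export the cluster definition from the fleet-default namespace
kubectl get clusters.provisioning.cattle.io my-test-cluster -n fleet-default -o yaml > cluster.yaml
# Export the EC2 machine configuration objects used by the nodepools
kubectl get amazonec2configs.rke-machine-config.cattle.io -n fleet-default -o yaml > nodeconfig-aws.yaml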

Installation and deployment

For the following part, fork the GitHub repository, including the rke2-ec2 branch. That way you can modify the chart files to experiment and adapt them to your needs.

Let’s start by fixing the VPC configuration of our nodepools. In the values.yaml file, find the following parameters for both nodepools and configure them with your AWS configuration.

# AWS region
region: eu-central-1
# AWS zone for instance (i.e. a,b,c,d,e)
zone: a
# AWS VPC id
vpcId: "vpc-xxxxxxxxxxxx"
# AWS VPC subnet id
subnetId: "subnet-xxxxxxxxxxxx"

Do not hesitate to check the other parameters and modify them depending on your needs.
Once done, save the file and run the following commands to package the chart and push it to the repository.

helm package charts
helm repo index .

git add .
git commit -m 'update values.yaml and repackage'
git push
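
It can also help to validate and render the chart locally before committing. A minimal sketch, assuming you are at the repository root and the chart lives in the charts directory:

# Basic syntax and structure checks on the chart
helm lint charts
# Render the templates with the default values.yaml to review the generated
# cluster, nodeconfig and managed chart resources before pushing
helm template my-cluster charts > rendered.yaml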

As described in the official documentation, to use RKE2 templates, we need to add the repository to Rancher.

In Cluster Management > Advanced > Repositories > Create.

Name: dbiservices-rke2
Target: Git repository containing Helm chart or cluster template definitions
Git Repo URL: https://github.com/kkedbi/cluster-template-examples
Git Branch: rke2-ec2

Once the repository is added, the RKE2 template will automatically be available when you create a new cluster.

Click on dbiservices-template-ec2; it will bring you to the form to configure some of the cluster parameters.
This form is described as YAML code in charts/questions.yaml and is entirely customizable to fit your needs. Through the form, we can specify the Kubernetes version, CNI, VPC ID, region, etc.

In the General group, I use AWS credentials created in Rancher, choose RKE2 version 1.26, and Canal as the Container Network.

Then you can directly configure the nodepools for the control plane and the workers.

Once complete, click on install. It will install the Helm chart into the Rancher cluster (local), in the namespace fleet-default.

In case you need to remove the cluster, delete the App instead of the cluster. This ensures that all resources created by the Helm chart are deleted. For example, the ManagedChart is not deleted if you delete only the cluster.

Now, the cluster is provisioned with the predefined parameters from values.yaml, in addition to the customized parameters from the questions.yaml form.

Conclusion

The usage of templates is not complex.
In the example above, I allowed the user to choose the Kubernetes version, the CNI, the nodepool quantity from 1 to 3, and the instance type. You can predefine and fix those values directly in the values.yaml or in the template files.
A few fixed parameters were related to the AWS network (VPC, subnet, region, etc.). Therefore the user of the template does not need any visibility into, or information about, the AWS configuration.

The advantage of templates is to allow users who have the rights to create a cluster with a correct, standardized configuration that adheres to the internal IT guidelines of your company.

Additional information

A few modifications were made in the branch rke2-ec2 of the forked repository:

For cluster.yaml, I prefixed the nodepool name with the cluster name. The purpose is to avoid errors if a nodepool already exists with the same name.

      name: {{ $.Values.cluster.name }}-{{ $nodepool.name }}

For nodeconfig-aws.yaml, I added the quote function to the parameters retries, rootSize, and spotPrice to force them to the string type.

spotPrice: {{ quote $nodepool.spotPrice }}
retries: {{ quote $nodepool.retries }}
rootSize: {{ quote $nodepool.rootSize }}

To go further, check the next blog article about assigning members to the cluster from the template.
https://www.dbi-services.com/blog/rancher-rke2-templates-assign-members-to-clusters


L’article Rancher RKE2: Introduction on RKE2 cluster template for AWS EC2 est apparu en premier sur dbi Blog.

Kubernetes Networking by Using Cilium – Advanced Level – eBPF Routing

Tue, 2024-03-05 01:18

We have come a long way regarding networking in Kubernetes using Cilium. From a high-level picture in this post, we moved on to discovering the Kubernetes networking interfaces in this post and dived into Linux routing in Kubernetes in this post.

If you still want to know more, then you are in the right place. Fasten your seat belt, because in this blog post we will dive deep into eBPF routing. When using Cilium for networking in your Kubernetes cluster, you automatically use eBPF for routing between pods. I’ll stick to the same two examples of routing between pods on the same node and on different nodes. You’ll then see the continuity on the same drawing, which will hopefully help you understand this routing topic completely.

Discovering eBPF Routing

In my high-level blog and in the previous one about routing, I’ve talked about servants that are waiting at some network interfaces to help and direct you in the Kubernetes cluster. I’ve told you they were using a magical eBPF map to guide you through a secret passage toward your destination. Let’s now see what exactly these servants are in our Kubernetes cluster.

A servant is actually an eBPF program attached to a network interface at the kernel level. That is what eBPF allows you to do: modify the kernel dynamically. The advantage is that you can attach your own program (that is, C code compiled to machine code) to certain events and network interfaces in the kernel and trigger any action you have programmed. An action could be routing a packet, or reporting on and reacting to any interaction with the kernel. This is why eBPF fits nicely for observability: anything happening on a node interacts with the kernel, whether it is running a process, opening a file or routing a packet. Regarding routing, it is very fast because it shortcuts the traditional Linux routing process. That is the reason why we couldn’t see all the routing steps in the previous post with our traditional networking tools. We now need eBPF tools to inspect this routing in more detail.
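
You can already get a glimpse of these attachments with standard Linux tooling directly on a worker node. A minimal sketch, assuming bpftool and tc are available on the node; the interface name is just the example used further down in this post and will differ on your nodes:

# List all eBPF programs currently loaded in the kernel of this node
bpftool prog show
# Show the eBPF program attached to the tc ingress hook of a pod's veth interface
tc filter show dev lxc4a891387ff1a ingress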

Cilium Agent and eBPF routing

You know from the previous posts that there is a Cilium agent pod on each node in the cluster. This agent takes care of everything related to networking on that node, including eBPF routing. It comes packed with some eBPF tools that will allow us to explore it more deeply.

Let's now see what these servants really look like. As a reminder, and to gather all the information here, below are all the Cilium agents in our cluster:

$ kubectl get po -n kube-system -owide|grep cilium
cilium-9zh9s                                      1/1     Running   5 (65m ago)   113d   172.18.0.3    mycluster-control-plane   <none>           <none>
cilium-czffc                                      1/1     Running   5 (65m ago)   113d   172.18.0.4    mycluster-worker2         <none>           <none>
cilium-dprvh                                      1/1     Running   5 (65m ago)   113d   172.18.0.2    mycluster-worker          <none>           <none>
cilium-operator-6b865946df-24ljf                  1/1     Running   5 (65m ago)   113d   172.18.0.2    mycluster-worker          <none>           <none>

As before in this series, we will trace the routing from a pod on the node mycluster-worker, so we need to interact with the Cilium agent of that node: cilium-dprvh. We use the bpftool utility inside that agent pod to list all the network interfaces and the eBPF programs attached to them:

$ kubectl exec -it -n kube-system cilium-dprvh -- bpftool net show
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
xdp:

tc:
cilium_net(2) clsact/ingress cil_to_host-cilium_net id 1209
cilium_host(3) clsact/ingress cil_to_host-cilium_host id 1184
cilium_host(3) clsact/egress cil_from_host-cilium_host id 1197
cilium_vxlan(4) clsact/ingress cil_from_overlay-cilium_vxlan id 1154
cilium_vxlan(4) clsact/egress cil_to_overlay-cilium_vxlan id 1155
lxc_health(6) clsact/ingress cil_from_container-lxc_health id 1295
eth0(7) clsact/ingress cil_from_netdev-eth0 id 1223
lxc4a891387ff1a(9) clsact/ingress cil_from_container-lxc4a891387ff1a id 1285
lxc5b7b34955e61(11) clsact/ingress cil_from_container-lxc5b7b34955e61 id 1303
lxc73d2e1d7cf4(13) clsact/ingress cil_from_container-lxc73d2e1d7cf4 id 1294

Yes, these programs are our servants! Armed with the knowledge of the network interfaces from a previous post, you should recognize their names. The one used at the starting point of our journey is lxc4a891387ff1a. In parentheses you have the interface id 9, which is the node side of the link to this container. The id 1285 is the id of the eBPF program attached to this interface. That program is called cil_from_container, and from this name you know in which direction it operates: here it handles traffic coming from the container to the node; in the other direction there is no processing. As the Cilium community edition is open source, you can read its code directly here. In the file bpf_lxc.c just search for the program name and you will see all the details. Yes, this is advanced, but don't worry, I did the hard work for you, so you just have to read along!
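If you are curious, you can also ask bpftool for the details of that particular program by its id (a quick sketch; 1285 is the program id from the listing above, so adapt it to your own output):

$ kubectl exec -it -n kube-system cilium-dprvh -- bpftool prog show id 1285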

Before we move on to trace the eBPF routing, let’s update our drawing to show these eBPF servants:

Kubernetes networking routing with Cilium using eBPF

Don't be confused by the lxc network interface ids. Take lxc4a891387ff1a(9) from the command output above: id 9 is the node-side end of the link to this container, on the interface shown as lxc…@if8, while id 8 is the interface id inside the container (there you will see 8: eth0@if9).

Pod to pod routing on the same node

I don't want to lose you in all the details, so I'll only give you the key information needed to follow this eBPF routing. Yes, from here on it gets more complicated, but I'll try to make it smooth so you can enjoy it anyway.

As we did in the previous post, let's now see how eBPF routes the traffic, thanks to its servants, from 10.10.2.117 to 10.10.2.121. The Cilium agent gives information to the eBPF programs and receives information from them. For exchanging information, it uses what are called maps (yes, these are our magic maps!) in eBPF terminology. There are several of them, used for different purposes. These maps are stored in the folder /sys/fs/bpf/tc/globals inside the Cilium agent pod, and we can manually interact with them by using bpftool. We can then trace the eBPF routing that way. When the packet leaves the container, it reaches the lxc interface on the node and the program cil_from_container is triggered to route that packet. The program sees that the destination IP Address is in the same subnet as the source and will then just forward the traffic to the destination lxc network interface by using the map called cilium_lxc.
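Before looking anything up, you can simply list that folder inside the agent pod to see all the pinned maps available on the node (a quick sketch):

$ kubectl exec -it -n kube-system cilium-dprvh -- ls /sys/fs/bpf/tc/globals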

Below is how we can trace that eBPF routing:

$ kubectl exec -it -n kube-system cilium-dprvh -- bpftool map lookup pinned /sys/fs/bpf/tc/globals/cilium_lxc key hex 0a 0a 02 79 00 00 00 00  00 00 00 00 00 00 00 00 01 00 00 00
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
key:
0a 0a 02 79 00 00 00 00  00 00 00 00 00 00 00 00
01 00 00 00
value:
0b 00 00 00 00 00 33 06  00 00 00 00 00 00 00 00
ea c4 71 d6 4f a0 00 00  f6 87 b6 c3 a6 45 00 00
2c 0d 00 00 00 00 00 00  00 00 00 00 00 00 00 00

We have to give bpftool a hexadecimal string in the format it expects (the details of this format can be found in the source code). The destination IP Address we want to trace is 10.10.2.121, which is 0a 0a 02 79 in hexadecimal. This information is at the beginning of the string in our command.
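If you want to build that hexadecimal prefix yourself, a small shell one-liner does the conversion (a quick sketch, here for 10.10.2.121):

$ printf '%02x %02x %02x %02x\n' $(echo 10.10.2.121 | tr '.' ' ')
0a 0a 02 79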

In the output of the bpftool command you can see the key section, which is the hexadecimal string we provided, as well as the result in the value section.

In this result we get the MAC Address of the destination lxc interface on the pod side, which is ea c4 71 d6 4f a0. We also get the MAC Address of the destination lxc interface on the node side, which is f6 87 b6 c3 a6 45. Finally we get the lxc id of the destination pod, which is 33 06. A little trick here is that this value is stored in reverse byte order (you can search for big endian and little endian if you want to know more about this), so we have to read it as 06 33, which converted to decimal gives 1587. This value is the endpoint ID used by Cilium to identify the pods. You can get this information for the whole cluster with the following command:

$ kubectl get ciliumendpoint -n networking101
NAME                        ENDPOINT ID   IDENTITY ID   INGRESS ENFORCEMENT   EGRESS ENFORCEMENT   VISIBILITY POLICY   ENDPOINT STATE   IPV4          IPV6
busybox-c8bbbbb84-fmhwc     897           3372          <status disabled>     <status disabled>    <status disabled>   ready            10.10.1.164
busybox-c8bbbbb84-t6ggh     715           3372          <status disabled>     <status disabled>    <status disabled>   ready            10.10.2.117
netshoot-7d996d7884-fwt8z   1587          10388         <status disabled>     <status disabled>    <status disabled>   ready            10.10.2.121
netshoot-7d996d7884-gcxrm   3564          10388         <status disabled>     <status disabled>    <status disabled>   ready            10.10.1.155

So the destination pod is netshoot-7d996d7884-fwt8z and it is reached directly through its lxc interface. And that's it! Of course the routing occurs much faster than I can explain it, but more importantly it is faster than the traditional Linux routing.
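As a side note, the agent can also show its local endpoint ids directly with its own CLI (a quick sketch; depending on your Cilium version the binary inside the agent pod is called cilium or cilium-dbg):

$ kubectl exec -it -n kube-system cilium-dprvh -- cilium endpoint list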

Pod to pod routing on a different node

Let’s now check the eBPF routing from 10.10.2.117 to 10.10.1.155 which is then on a different node. The packet is also intercepted by the program cil_from_container but this time the destination IP Address belongs to another subnet so it is going to use another map called cilium_ipcache to route that packet.

As before, we have to provide a hexadecimal string in the expected format to trace the routing of this packet. Here, our destination IP Address is 10.10.1.155, so 0a 0a 01 9b in hexadecimal.

$ kubectl exec -it -n kube-system cilium-dprvh -- bpftool map lookup pinned /sys/fs/bpf/tc/globals/cilium_ipcache key hex 40 00 00 00 00 00 00 01 0a 0a 01 9b 00 00 00 00 00 00 00 00 00 00 00 00
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
key:
40 00 00 00 00 00 00 01  0a 0a 01 9b 00 00 00 00
00 00 00 00 00 00 00 00
value:
94 28 00 00 ac 12 00 04  00 00 00 00

In the value section we can see the destination IP Address of the node mycluster-worker2 which is ac 12 00 04 (172.18.0.4). From there, the packet is encapsulated and goes straight to the egress VXLAN interface to reach that node. Once there, it is now the eBPF programs of the Cilium agent pod on this node that will be used. Let’s check them as we did before for the other node:

$ kubectl exec -it -n kube-system cilium-czffc -- bpftool net show
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
xdp:

tc:
cilium_net(2) clsact/ingress cil_to_host-cilium_net id 1254
cilium_host(3) clsact/ingress cil_to_host-cilium_host id 1242
cilium_host(3) clsact/egress cil_from_host-cilium_host id 1246
cilium_vxlan(4) clsact/ingress cil_from_overlay-cilium_vxlan id 1163
cilium_vxlan(4) clsact/egress cil_to_overlay-cilium_vxlan id 1164
lxc_health(6) clsact/ingress cil_from_container-lxc_health id 1414
lxc0a97661d8043(8) clsact/ingress cil_from_container-lxc0a97661d8043 id 1387
eth0(9) clsact/ingress cil_from_netdev-eth0 id 1261
lxc174c023046ff(11) clsact/ingress cil_from_container-lxc174c023046ff id 1391
lxce84a702bb02c(13) clsact/ingress cil_from_container-lxce84a702bb02c id 1419

The output is similar to the other node. So, once the packet exits the VXLAN tunnel, it is caught by another program on the ingress VXLAN interface, called cil_from_overlay-cilium_vxlan. This program sees that the destination IP Address belongs to this node. It will then use the map cilium_lxc to forward the traffic to the lxc interface of the destination pod, as we have seen before. Note that there are no programs in the list above called to_container, so the packet is not processed further. We can then trace that eBPF routing part as before, using the hexadecimal value of our destination IP Address 10.10.1.155:

$ kubectl exec -it -n kube-system cilium-czffc -- bpftool map lookup pinned /sys/fs/bpf/tc/globals/cilium_lxc key hex 0a 0a 01 9b 00 00 00 00  00 00 00 00 00 00 00 00 01 00 00 00
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
key:
0a 0a 01 9b 00 00 00 00  00 00 00 00 00 00 00 00
01 00 00 00
value:
0b 00 00 00 00 00 ec 0d  00 00 00 00 00 00 00 00
be 57 3d 54 40 f1 00 00  92 65 df 09 dd 28 00 00
2c 0d 00 00 00 00 00 00  00 00 00 00 00 00 00 00

The result gives us the following information:

  • 92 65 df 09 dd 28: This is the MAC Address of the destination lxc interface on the node side
  • be 57 3d 54 40 f1: This is the MAC Address of the destination lxc interface on the pod side
  • ec 0d: We reverse it to 0d ec, which converts to 3564 in decimal. This is the Cilium endpoint of the destination pod netshoot-7d996d7884-gcxrm, as we have seen previously (you can check the output of the ciliumendpoint command above and find that pod). A shell one-liner for this conversion is shown right after this list.
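A quick way to do this byte-swap-and-convert step in a shell, for the two endpoint ids we have met so far (a small sketch):

$ printf '%d\n' 0x0633 0x0dec
1587
3564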

As before, from here the packet is directly forwarded to the destination pod. This is how an eBPF routing plan comes together!

Wrap up

Wow! Congratulations if you've reached this point, you are champions! You now have the complete and detailed picture of how basic networking between pods works with Cilium in a Kubernetes cluster. There is more to cover: I've mentioned network policies in a previous post, but we didn't talk about services, ingress or name resolution yet. Also, in addition to these basic Kubernetes networking topics, Cilium provides a lot of other features that enrich what can be done in a cluster. So stay tuned!

The article Kubernetes Networking by Using Cilium – Advanced Level – eBPF Routing appeared first on dbi Blog.

Getting started with Greenplum – 4 – Backup & Restore – databases

Mon, 2024-03-04 04:15

This is the fourth part of the Greenplum blog series, the previous ones are here: Getting started with Greenplum – 1 – Installation, Getting started with Greenplum – 2 – Initializing and bringing up the cluster, Getting started with Greenplum – 3 – Behind the scenes. In this blog we’ll look at how you are supposed to backup and restore a Greenplum cluster.

If you restarted the cluster nodes and logged on to the systems again, you'll notice that the instances are not running:

[gpadmin@cdw ~]$ ps -ef | grep postgres
gpadmin     1285    1233  0 09:43 pts/0    00:00:00 grep --color=auto potsgres
[gpadmin@cdw ~]$ psql postgres
psql: error: could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/.s.PGSQL.5432"?

Starting and stopping the cluster is done with “gpstart” and “gpstop”, so starting it up is just a matter of this:

[gpadmin@cdw ~]$ which gpstart
/usr/local/greenplum-db-7.1.0/bin/gpstart
[gpadmin@cdw ~]$ gpstart
20240301:09:45:37:001290 gpstart:cdw:gpadmin-[INFO]:-Starting gpstart with args: 
20240301:09:45:37:001290 gpstart:cdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20240301:09:45:37:001290 gpstart:cdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240301:09:45:37:001290 gpstart:cdw:gpadmin-[INFO]:-Greenplum Catalog Version: '302307241'
20240301:09:45:37:001290 gpstart:cdw:gpadmin-[INFO]:-Starting Coordinator instance in admin mode
20240301:09:45:37:001290 gpstart:cdw:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /data/coordinator/gpseg-1/ -l /data/coordinator/gpseg-1//log/startup.log -w -t 600 -o " -c gp_role=utility " start
20240301:09:45:37:001290 gpstart:cdw:gpadmin-[INFO]:-Obtaining Greenplum Coordinator catalog information
20240301:09:45:37:001290 gpstart:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240301:09:45:37:001290 gpstart:cdw:gpadmin-[INFO]:-Setting new coordinator era
20240301:09:45:37:001290 gpstart:cdw:gpadmin-[INFO]:-Coordinator Started...
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-Shutting down coordinator
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:---------------------------
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-Coordinator instance parameters
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:---------------------------
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-Database                 = template1
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-Coordinator Port              = 5432
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-Coordinator directory         = /data/coordinator/gpseg-1/
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-Timeout                  = 600 seconds
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-Coordinator standby           = Off 
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:---------------------------------------
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-Segment instances that will be started
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:---------------------------------------
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-   Host   Datadir                Port   Role
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-   sdw1   /data/primary/gpseg0   6000   Primary
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-   sdw2   /data/mirror/gpseg0    7000   Mirror
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-   sdw2   /data/primary/gpseg1   6000   Primary
20240301:09:45:38:001290 gpstart:cdw:gpadmin-[INFO]:-   sdw1   /data/mirror/gpseg1    7000   Mirror

Continue with Greenplum instance startup Yy|Nn (default=N):
> Y
20240301:09:45:49:001290 gpstart:cdw:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
.
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-Process results...
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-   Successful segment starts                                            = 4
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-   Failed segment starts                                                = 0
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-Successfully started 4 of 4 segment instances 
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-Starting Coordinator instance cdw directory /data/coordinator/gpseg-1/ 
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=5519b53b4b2c1dab_240301094537 $GPHOME/bin/pg_ctl -D /data/coordinator/gpseg-1/ -l /data/coordinator/gpseg-1//log/startup.log -w -t 600 -o " -c gp_role=dispatch " start
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-Command pg_ctl reports Coordinator cdw instance active
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-Connecting to db template1 on host localhost
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-No standby coordinator configured.  skipping...
20240301:09:45:50:001290 gpstart:cdw:gpadmin-[INFO]:-Database successfully started
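By the way, if you don't want to confirm the startup interactively (for example when scripting this), gpstart can be run without prompting; a small sketch, assuming the -a ("do not prompt") option:

[gpadmin@cdw ~]$ gpstart -a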

Before we look at how you can recover from failed coordinator or segment nodes, we'll look at how you are supposed to back up the database(s). The tool for backing up databases in Greenplum is called "gpbackup".

We'll use the database we created in the previous post and create a table containing some sample data in it. In a Greenplum system tables are distributed across the segments, and when you create a table you have three options for how you want this to happen:

  • DISTRIBUTED BY: You choose the column which will be used to distribute the data
  • DISTRIBUTED RANDOMLY: Use this if there is no unique column
  • DISTRIBUTED REPLICATED: Every row is distributed to all segments

As the system works best (performance-wise) when you have the same amount of data on all the segment nodes, we'll go with the first method and create the table like this:

[gpadmin@cdw ~]$ psql d
psql (12.12)
Type "help" for help.

d=# create table t1 ( id int primary key
                    , dummy text 
                    ) distributed by (id);
CREATE TABLE

For populating the table we will use the standard generate_series PostgreSQL function:

d=# insert into t1 
    select i, md5(i::text) 
      from generate_series(1,1000000) i;
INSERT 0 1000000

When we try to back up this database with "gpbackup", there is our first surprise. This utility is not available by default in the open source version of Greenplum:

[gpadmin@cdw ~]$ which gpbackup
/usr/bin/which: no gpbackup in (/usr/local/greenplum-db-7.1.0/bin:/home/gpadmin/.local/bin:/home/gpadmin/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin)
[gpadmin@cdw ~]$ find /usr/local/ -name gpbackup 

We could still use the standard pg_dump utility from PostgreSQL:

[gpadmin@cdw ~]$ pg_dump d > d.sql
[gpadmin@cdw ~]$ tail -10 d.sql 
--

ALTER TABLE ONLY public.t1
    ADD CONSTRAINT t1_pkey PRIMARY KEY (id);


--
-- Greenplum Database database dump complete
--

In the same way you can use pg_dumpall to get the global objects:

[gpadmin@cdw ~]$ pg_dumpall --globals-only > globals.sql
[gpadmin@cdw ~]$ cat globals.sql 
--
-- Greenplum Database cluster dump
--

SET default_transaction_read_only = off;

SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;

--
-- Roles
--

CREATE ROLE gpadmin;
ALTER ROLE gpadmin WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN REPLICATION BYPASSRLS PASSWORD 'md5b44a9b06d576a0b083cd60e5f875cf48';

--
-- PostgreSQL database cluster dump complete
--

As this is standard PostgreSQL stuff we'll not look into it any further. Re-loading such plain SQL dumps is done with "psql" as usual (pg_restore would only come into play for dumps taken in a non-plain format, e.g. custom or directory format).
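As a minimal sketch (assuming the database d was dropped beforehand; re-creating roles from globals.sql that already exist will just produce harmless errors), re-loading those plain SQL dumps could look like this:

[gpadmin@cdw ~]$ psql -f globals.sql postgres
[gpadmin@cdw ~]$ psql -c "create database d" postgres
[gpadmin@cdw ~]$ psql -f d.sql d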

For getting the gpbackup onto the system we need Go. The default version of Go which comes with Rocky Linux 9 is fine, so we can install it with dnf and fetch the latest release of gpbackup afterwards:

[gpadmin@cdw ~]$ sudo dnf install golang -y
[gpadmin@cdw ~]$ go version
go version go1.20.10 linux/amd64
[gpadmin@cdw ~]$ wget https://github.com/greenplum-db/gpbackup/releases/download/1.30.3/gpbackup_binaries_rhel9.tar.gz
[gpadmin@cdw ~]$ tar axf gpbackup_binaries_rhel9.tar.gz
[gpadmin@cdw ~]$ ./gpbackup --version
gpbackup version 1.30.3

In its simplest form, a backup can be taken like this:

[gpadmin@cdw ~]$ mkdir backup
[gpadmin@cdw ~]$ ./gpbackup --dbname d --backup-dir backup/
20240301:11:01:33 gpbackup:gpadmin:cdw:004933-[CRITICAL]:-backup/ is not an absolute path.
[gpadmin@cdw ~]$ ./gpbackup --dbname d --backup-dir ~/backup/
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-gpbackup version = 1.30.3
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Greenplum Database Version = 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Starting backup of database d
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Backup Timestamp = 20240301110143
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Backup Database = d
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Gathering table state information
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Acquiring ACCESS SHARE locks on tables
Locks acquired:  1 / 1 [================================================================] 100.00% 0s
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Gathering additional table metadata
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Getting storage information
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Metadata will be written to /home/gpadmin/backup/gpseg-1/backups/20240301/20240301110143/gpbackup_20240301110143_metadata.sql
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Writing global database metadata
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Global database metadata backup complete
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Writing pre-data metadata
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Pre-data metadata metadata backup complete
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Writing post-data metadata
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Post-data metadata backup complete
20240301:11:01:43 gpbackup:gpadmin:cdw:004942-[INFO]:-Writing data to file
Tables backed up:  1 / 1 [=================================================================] 100.00%
[-----------------------------------------------------------------------------------------=]   0.00%
20240301:11:01:44 gpbackup:gpadmin:cdw:004942-[INFO]:-Data backup complete
20240301:11:01:45 gpbackup:gpadmin:cdw:004942-[INFO]:-Found neither /usr/local/greenplum-db-7.1.0/bin/gp_email_contacts.yaml nor /home/gpadmin/gp_email_contacts.yaml
20240301:11:01:45 gpbackup:gpadmin:cdw:004942-[INFO]:-Email containing gpbackup report /home/gpadmin/backup/gpseg-1/backups/20240301/20240301110143/gpbackup_20240301110143_report will not be sent
20240301:11:01:45 gpbackup:gpadmin:cdw:004942-[INFO]:-Backup completed successfully

The global stuff went into “/home/gpadmin/backup/gpseg-1/backups/20240301/20240301110143/gpbackup_20240301110143_metadata.sql”:

[gpadmin@cdw ~]$ cat /home/gpadmin/backup/gpseg-1/backups/20240301/20240301110143/gpbackup_20240301110143_metadata.sql | egrep -v "^$|^#"
SET client_encoding = 'UTF8';
ALTER RESOURCE QUEUE pg_default WITH (ACTIVE_STATEMENTS=20);
ALTER RESOURCE GROUP admin_group SET CPU_MAX_PERCENT 1;
ALTER RESOURCE GROUP admin_group SET CPU_WEIGHT 100;
ALTER RESOURCE GROUP default_group SET CPU_MAX_PERCENT 1;
ALTER RESOURCE GROUP default_group SET CPU_WEIGHT 100;
ALTER RESOURCE GROUP system_group SET CPU_MAX_PERCENT 1;
ALTER RESOURCE GROUP system_group SET CPU_WEIGHT 100;
ALTER RESOURCE GROUP default_group SET CPU_WEIGHT 100;
ALTER RESOURCE GROUP default_group SET CONCURRENCY 20;
ALTER RESOURCE GROUP default_group SET CPU_MAX_PERCENT 20;
ALTER RESOURCE GROUP admin_group SET CPU_WEIGHT 100;
ALTER RESOURCE GROUP admin_group SET CONCURRENCY 10;
ALTER RESOURCE GROUP admin_group SET CPU_MAX_PERCENT 10;
ALTER RESOURCE GROUP system_group SET CPU_WEIGHT 100;
ALTER RESOURCE GROUP system_group SET CONCURRENCY 0;
ALTER RESOURCE GROUP system_group SET CPU_MAX_PERCENT 10;
CREATE ROLE gpadmin;
ALTER ROLE gpadmin WITH SUPERUSER INHERIT CREATEROLE CREATEDB LOGIN REPLICATION PASSWORD 'md5b44a9b06d576a0b083cd60e5f875cf48' RESOURCE QUEUE pg_default RESOURCE GROUP admin_group;
CREATE DATABASE d TEMPLATE template0;
ALTER DATABASE d OWNER TO gpadmin;
COMMENT ON SCHEMA public IS 'standard public schema';
ALTER SCHEMA public OWNER TO gpadmin;
REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM gpadmin;
GRANT ALL ON SCHEMA public TO PUBLIC;
GRANT ALL ON SCHEMA public TO gpadmin;
CREATE SCHEMA IF NOT EXISTS gp_toolkit;
SET search_path=gp_toolkit,pg_catalog;
CREATE EXTENSION IF NOT EXISTS gp_toolkit WITH SCHEMA gp_toolkit;
SET search_path=pg_catalog;
COMMENT ON EXTENSION gp_toolkit IS 'various GPDB administrative views/functions';
CREATE TABLE public.t1 (
        id integer NOT NULL,
        dummy text
) DISTRIBUTED BY (id);
ALTER TABLE public.t1 OWNER TO gpadmin;
ALTER TABLE ONLY public.t1 ADD CONSTRAINT t1_pkey PRIMARY KEY (id);

The local backup directory on the coordinator does not contain any user data, only metadata:

[gpadmin@cdw ~]$ ls -la backup/gpseg-1/backups/20240301/20240301110143/
total 16
drwxr-xr-x 2 gpadmin gpadmin  171 Mar  1 11:01 .
drwxr-xr-x 3 gpadmin gpadmin   28 Mar  1 11:01 ..
-r--r--r-- 1 gpadmin gpadmin  742 Mar  1 11:01 gpbackup_20240301110143_config.yaml
-r--r--r-- 1 gpadmin gpadmin 1935 Mar  1 11:01 gpbackup_20240301110143_metadata.sql
-r--r--r-- 1 gpadmin gpadmin 1965 Mar  1 11:01 gpbackup_20240301110143_report
-r--r--r-- 1 gpadmin gpadmin 4048 Mar  1 11:01 gpbackup_20240301110143_toc.yaml

The actual data is on the segment nodes:

[gpadmin@sdw1 ~]$ ls backup/gpseg0/backups/20240301/20240301110143/
gpbackup_0_20240301110143_17122.gz
[gpadmin@sdw2 ~]$ ls  backup/gpseg1/backups/20240301/20240301110143/
gpbackup_1_20240301110143_17122.gz

Restoring that is done with “gprestore” passing in the timestamp (the directory name) of the backup:

[gpadmin@cdw ~]$ psql -c "drop database d" postgres
DROP DATABASE
[gpadmin@cdw ~]$ ./gprestore --backup-dir ~/backup/ --timestamp 20240301110143
20240301:11:18:41 gprestore:gpadmin:cdw:005223-[INFO]:-Restore Key = 20240301110143
20240301:11:18:41 gprestore:gpadmin:cdw:005223-[INFO]:-gpbackup version = 1.30.3
20240301:11:18:41 gprestore:gpadmin:cdw:005223-[INFO]:-gprestore version = 1.30.3
20240301:11:18:41 gprestore:gpadmin:cdw:005223-[INFO]:-Greenplum Database Version = 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source
20240301:11:18:41 gprestore:gpadmin:cdw:005223-[CRITICAL]:-Database "d" does not exist. Use the --create-db flag to create "d" as part of the restore process.
20240301:11:18:41 gprestore:gpadmin:cdw:005223-[INFO]:-Found neither /usr/local/greenplum-db-7.1.0/bin/gp_email_contacts.yaml nor /home/gpadmin/gp_email_contacts.yaml
20240301:11:18:41 gprestore:gpadmin:cdw:005223-[INFO]:-Email containing gprestore report /home/gpadmin/backup/gpseg-1/backups/20240301/20240301110143/gprestore_20240301110143_20240301111841_report will not be sent
[gpadmin@cdw ~]$ ./gprestore --backup-dir ~/backup/ --timestamp 20240301110143 --create-db
20240301:11:18:47 gprestore:gpadmin:cdw:005245-[INFO]:-Restore Key = 20240301110143
20240301:11:18:47 gprestore:gpadmin:cdw:005245-[INFO]:-gpbackup version = 1.30.3
20240301:11:18:47 gprestore:gpadmin:cdw:005245-[INFO]:-gprestore version = 1.30.3
20240301:11:18:47 gprestore:gpadmin:cdw:005245-[INFO]:-Greenplum Database Version = 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source
20240301:11:18:48 gprestore:gpadmin:cdw:005245-[INFO]:-Creating database
20240301:11:18:52 gprestore:gpadmin:cdw:005245-[INFO]:-Database creation complete for: d
20240301:11:18:52 gprestore:gpadmin:cdw:005245-[INFO]:-Restoring pre-data metadata
Pre-data objects restored:  8 / 8 [=====================================================] 100.00% 0s
20240301:11:18:52 gprestore:gpadmin:cdw:005245-[INFO]:-Pre-data metadata restore complete
Tables restored:  1 / 1 [==================================================================] 100.00%
[-----------------------------------------------------------------------------------------=]   0.00%
20240301:11:18:53 gprestore:gpadmin:cdw:005245-[INFO]:-Data restore complete
20240301:11:18:53 gprestore:gpadmin:cdw:005245-[INFO]:-Restoring post-data metadata
Post-data objects restored:  1 / 1 [====================================================] 100.00% 0s
20240301:11:18:54 gprestore:gpadmin:cdw:005245-[INFO]:-Post-data metadata restore complete
20240301:11:18:54 gprestore:gpadmin:cdw:005245-[INFO]:-Found neither /usr/local/greenplum-db-7.1.0/bin/gp_email_contacts.yaml nor /home/gpadmin/gp_email_contacts.yaml
20240301:11:18:54 gprestore:gpadmin:cdw:005245-[INFO]:-Email containing gprestore report /home/gpadmin/backup/gpseg-1/backups/20240301/20240301110143/gprestore_20240301110143_20240301111847_report will not be sent
20240301:11:18:54 gprestore:gpadmin:cdw:005245-[INFO]:-Restore completed successfully

Of course you should make sure that the backup directories are separate mount points and are not local on the nodes. There are also some storage plugins you might want to consider.

According to the documentation you should not use pg_basebackup to back up segment instances, so doing physical backups and point-in-time recoveries is not an option.

In the next post we’ll look at how we can recover from a failed segment node.

The article Getting started with Greenplum – 4 – Backup & Restore – databases appeared first on dbi Blog.

Getting started with Greenplum – 3 – Behind the scenes

Fri, 2024-03-01 01:19

If you followed part 1 and part 2 of this little blog series you now have a running Greenplum system. There is one coordinator host and there are two segment hosts. In this post we’ll look at what really was initialized, how that looks on disk and how the PostgreSQL instances communicate with each other.

Let's start with a simple overview of what we have now:


                               |-------------------|
                               |                   |
                               |     Segment 1     |
                               |                   |
                               |-------------------|
                                        /
   |-------------------|               /
   |                   |              /
   |   Coordinator     |--------------
   |                   |              \
   |-------------------|               \
                                        \
                               |-------------------|
                               |                   |
                               |     Segment 2     |
                               |                   |
                               |-------------------|

We have the coordinator node on the left, somehow connected to the two segment nodes on the right. As Greenplum is based on PostgreSQL we should easily be able to find out on which port the coordinator instance is listening:

[gpadmin@cdw ~]$ psql -c "show port" postgres
 port 
------
 5432
(1 row)

Another bit of information we can get out of the Greenplum catalog is how the instances are distributed across the nodes:

postgres=# select * from gp_segment_configuration order by dbid;
 dbid | content | role | preferred_role | mode | status | port | hostname | address |          datadir          
------+---------+------+----------------+------+--------+------+----------+---------+---------------------------
    1 |      -1 | p    | p              | n    | u      | 5432 | cdw      | cdw     | /data/coordinator/gpseg-1
    2 |       0 | p    | p              | s    | u      | 6000 | sdw1     | sdw1    | /data/primary/gpseg0
    3 |       1 | p    | p              | s    | u      | 6000 | sdw2     | sdw2    | /data/primary/gpseg1
    4 |       0 | m    | m              | s    | u      | 7000 | sdw2     | sdw2    | /data/mirror/gpseg0
    5 |       1 | m    | m              | s    | u      | 7000 | sdw1     | sdw1    | /data/mirror/gpseg1
(5 rows)

(All the cluster views are documented here)

Putting this information into the picture from above, gives us this:


                                        |-------------------|
                                    6000|primary            |
                                        |     Segment 1     |
                                    7000|mirror             |
                                        |-------------------|
                                                 /
            |-------------------|               /
            |                   |              /
        5432|   Coordinator     |--------------
            |                   |              \
            |-------------------|               \
                                                 \
                                        |-------------------|
                                    6000|primary            |
                                        |     Segment 2     |
                                    7000|mirror             |
                                        |-------------------|

Looking at the PostgreSQL configuration of the two mirror instances, we can see that the mirror on segment 1 is replicating from the primary on segment 2, and the mirror on segment 2 is replicating from the primary on segment 1:

[gpadmin@sdw1 gpseg1]$ hostname
sdw1
[gpadmin@sdw1 gpseg1]$ pwd
/data/mirror/gpseg1
[gpadmin@sdw1 gpseg1]$ grep conninfo postgresql.*conf
postgresql.auto.conf:primary_conninfo = 'user=gpadmin passfile=''/home/gpadmin/.pgpass'' host=sdw2 port=6000 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any application_name=gp_walreceiver'
postgresql.conf:#primary_conninfo = ''                  # connection string to sending server

[gpadmin@sdw2 gpseg0]$ hostname
sdw2
[gpadmin@sdw2 gpseg0]$ pwd
/data/mirror/gpseg0
[gpadmin@sdw2 gpseg0]$ grep conninfo postgresql.*conf
postgresql.auto.conf:primary_conninfo = 'user=gpadmin passfile=''/home/gpadmin/.pgpass'' host=sdw1 port=6000 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any application_name=gp_walreceiver'
postgresql.conf:#primary_conninfo = ''                  # connection string to sending server

This means that the mirror instances are just normal PostgreSQL replicas:

                                        |-------------------|
                             |------6000|primary---------   |
                             |          |     Segment 1 |   |
                             |      7000|mirror<------| |   |
                             |          |-------------------|
                             |                        | |
            |-------------------|                     | |
            |                   |                     | |
        5432|   Coordinator     |                     | |
            |                   |                     | |
            |-------------------|                     | |
                             |                        | |
                             |          |-------------------|
                             |------6000|primary ------ |   |
                                        |     Segment 2 |   |
                                    7000|mirror<--------|   |
                                        |-------------------|

Both replicas run in synchronous mode, which can be seen in gp_stat_replication:

postgres=# select * from gp_stat_replication ;
 gp_segment_id | pid  | usesysid | usename | application_name |   client_addr   | client_hostname | client_port |         backend_start         | backend_xmin |   state   | sent_lsn  | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state |          reply_time           | sync_error 
---------------+------+----------+---------+------------------+-----------------+-----------------+-------------+-------------------------------+--------------+-----------+-----------+-----------+-----------+------------+-----------+-----------+------------+---------------+------------+-------------------------------+------------
             1 | 6156 |       10 | gpadmin | gp_walreceiver   | 192.168.122.201 |                 |       18462 | 2024-02-29 12:51:32.070682+01 |              | streaming | 0/C000158 | 0/C000158 | 0/C000158 | 0/C000158  |           |           |            |             1 | sync       | 2024-02-29 14:04:45.448297+01 | none
             0 | 6599 |       10 | gpadmin | gp_walreceiver   | 192.168.122.202 |                 |       58176 | 2024-02-29 12:51:32.612645+01 |              | streaming | 0/C000158 | 0/C000158 | 0/C000158 | 0/C000158  |           |           |            |             1 | sync       | 2024-02-29 14:04:44.243651+01 | none

When one of the primary segments fails, the attached mirror segment (which is a replica) will take over.
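A quick way to spot segments that are currently marked down is to filter the gp_segment_configuration view we queried above (a small sketch; in this view the status column is 'u' for up and 'd' for down):

[gpadmin@cdw ~]$ psql -c "select dbid, content, role, hostname, port, status from gp_segment_configuration where status <> 'u'" postgres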

By default direct connections to either a primary or a mirror segment are not allowed / possible:

[gpadmin@sdw1 gpseg1]$ psql -p 7000
psql: error: FATAL:  the database system is in recovery mode
DETAIL:  last replayed record at 0/C000158
- VERSION: PostgreSQL 12.12 (Greenplum Database 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source) on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit compiled on Jan 19 2024 06:51:45 Bhuvnesh C.
[gpadmin@sdw1 gpseg1]$ psql -p 6000 postgres
psql: error: FATAL:  connections to primary segments are not allowed
DETAIL:  This database instance is running as a primary segment in a Greenplum cluster and does not permit direct connections.
HINT:  To force a connection anyway (dangerous!), use utility mode.

The consequence is that the coordinator node is the only entry point into the Greenplum system. Nothing is supposed to happen directly on any of the segments.

Let's create a new database on the coordinator host and then retrieve the OID this database got:

postgres=# create database d;
CREATE DATABASE
postgres=# \l
                                List of databases
   Name    |  Owner  | Encoding |   Collate   |    Ctype    |  Access privileges  
-----------+---------+----------+-------------+-------------+---------------------
 d         | gpadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 postgres  | gpadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | gpadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/gpadmin         +
           |         |          |             |             | gpadmin=CTc/gpadmin
 template1 | gpadmin | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/gpadmin         +
           |         |          |             |             | gpadmin=CTc/gpadmin
(4 rows)

postgres=# select oid,datname from pg_database;
  oid  |  datname  
-------+-----------
 13720 | postgres
     1 | template1
 17121 | d
 13719 | template0
(4 rows)

We can't directly connect to the primary and mirror segments, but we can verify whether the database got replicated by looking at the file system on one of the segment nodes:

[gpadmin@sdw2 data]$ ls -la mirror/gpseg0/base/
total 52
drwx------  6 gpadmin gpadmin   54 Feb 29 14:26 .
drwx------ 21 gpadmin gpadmin 4096 Feb 29 13:01 ..
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 1
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 13719
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 13720
drwx------  2 gpadmin gpadmin 8192 Feb 29 14:26 17121
[gpadmin@sdw2 data]$ ls -la primary/gpseg1/base/
total 52
drwx------  6 gpadmin gpadmin   54 Feb 29 14:26 .
drwx------ 21 gpadmin gpadmin 4096 Feb 29 13:01 ..
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 1
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 13719
drwx------  2 gpadmin gpadmin 8192 Feb 29 12:51 13720
drwx------  2 gpadmin gpadmin 8192 Feb 29 14:26 17121

Not a big surprise, the new database is of course there on the segment, otherwise the whole setup would not make much sense.

In the next post we’ll look at backup and recovery of such a system.

The article Getting started with Greenplum – 3 – Behind the scenes appeared first on dbi Blog.

Migration from Non-CDB to Multitenant : CDB is using local undo, but no undo tablespace found in the PDB

Thu, 2024-02-29 10:34

During a past migration test from on-premises to ExaCC, I faced a PDB violation stating "CDB is using local undo, but no undo tablespace found in the PDB" after having run the noncdb_to_pdb.sql script.

Explanation

The problem comes from the fact that the non-CDB source is a single instance database (cluster_database=false), which is converted to a PDB hosted in a RAC CDB. The PDB therefore has 2 instances, but only one UNDO tablespace. Please see below how to quickly resolve this problem.

SQL> select status, message from pdb_plug_in_violations;

STATUS    MESSAGE
--------- ------------------------------------------------------------------------------------------------------------------------
PENDING   CDB is using local undo, but no undo tablespace found in the PDB.

1 rows selected.

Solution

I first checked and confirmed that the CDB$ROOT was set with local undo enabled. This means that each container (PDB) will use its own UNDO.

SQL> show con_name

CON_NAME
------------------------------
CDB$ROOT

SQL> select property_name, property_value
  2  from   database_properties
  3  where  property_name = 'LOCAL_UNDO_ENABLED';

PROPERTY_NAME                  PROPERTY_VALUE
------------------------------ ------------------------------
LOCAL_UNDO_ENABLED             TRUE

This CDB is a RAC database with 2 nodes, so each PDB should have 2 UNDO tablespaces.

SQL> select a.con_id, b.name, tablespace_name
  2  from   cdb_tablespaces a, v$pdbs b
  3  where  a.con_id=b.con_id and contents = 'UNDO'
  4  order by con_id;

    CON_ID NAME                      TABLESPACE_NAME
---------- ------------------------- ------------------------------
         3 PDB1           UNDOTBS2
         3 PDB1           UNDOTBS1
         4 PDB2           UNDO_2
         4 PDB2           UNDOTBS1
         5 PDB3           UNDOTBS1

This is what we can see for PDB1 and PDB2. But PDB3 has only one UNDO tablespace.

We can confirm this by connecting to PDB3 and listing its tablespaces.

SQL> alter session set container=PDB3;

Session altered.

SQL> @/u02/app/oracle/local/dmk_sql/sql/qdbstbssize.sql

PL/SQL procedure successfully completed.


                             Nb      Extent Segment    Alloc.      Space        Max. Percent Block
Name                      files Type Mgmnt  Mgmnt    Size (GB)  Free (GB)  Size (GB)  used % size  Log Encrypt Compress
------------------------- ----- ---- ------ ------- ---------- ---------- ---------- ------- ----- --- ------- --------
APP                           1 DATA LM-SYS MANUAL         .02        .02        .02    9.38 8 KB  YES YES     NO
APP1                          2 DATA LM-SYS AUTO         17.00      16.88      17.00     .68 8 KB  YES YES     NO
APP1_INDEX                    2 DATA LM-SYS AUTO         17.00      16.93      17.00     .41 8 KB  YES YES     NO
...
SYSAUX                        1 DATA LM-SYS AUTO          2.00        .89       2.00   55.71 8 KB  YES YES     NO
SYSTEM                        1 DATA LM-SYS MANUAL        3.00       2.16       3.00   27.94 8 KB  YES YES     NO
TEMP                          1 TEMP LM-UNI MANUAL         .49       2.44        .49 -399.00 8 KB  NO  YES     NO
UNDOTBS1                      1 UNDO LM-SYS MANUAL        1.00        .52       1.00   47.83 8 KB  YES YES     NO
USERS                         1 DATA LM-SYS AUTO           .01        .01        .01   34.38 8 KB  YES YES     NO
...
                          -----                     ---------- ---------- ----------
TOTAL                        19                         104.75      92.96     104.75

16 rows selected.

This is also confirmed by checking the undo_tablespace parameter for both instances.

SQL> select inst_id, name, value from gv$parameter where upper(name)='UNDO_TABLESPACE';

   INST_ID NAME                      VALUE
---------- ------------------------- ------------------------------------------------------------
         1 undo_tablespace           UNDOTBS1
         2 undo_tablespace

Same for UNDO datafile.

SQL> select file_name from dba_data_files where tablespace_name like '%UNDO%';

FILE_NAME
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+DATAC1/CDB_NAME/7314A6F9C85A0827E0538E08440AC21A/DATAFILE/undotbs1.536.1159283017

SQL>

Still connected to PDB3, let’s create a new UNDO tablespace.

SQL> create undo tablespace UNDOTBS2 datafile '+DATAC1' size 1G;

Tablespace created.

We have 2 UNDO datafiles now.

SQL>  select file_name from dba_data_files where tablespace_name like '%UNDO%';

FILE_NAME
----------------------------------------------------------------------------------------------------
+DATAC1/CDB_NAME/7314A6F9C85A0827E0538E08440AC21A/DATAFILE/undotbs1.536.1159283017
+DATAC1/CDB_NAME/7314A6F9C85A0827E0538E08440AC21A/DATAFILE/undotbs2.550.1159291013

SQL>

We will assign the new UNDO tablespace to Instance 2.

SQL> alter system set undo_tablespace=UNDOTBS2 container=current sid='CDB_NAME2' scope=both;

System altered.

We can check the UNDO_TABLESPACE parameter, which does not reflect the change yet.

SQL> select inst_id, name, value from gv$parameter where upper(name)='UNDO_TABLESPACE';

   INST_ID NAME                      VALUE
---------- ------------------------- ------------------------------------------------------------
         2 undo_tablespace
         1 undo_tablespace           UNDOTBS1

I restarted PDB3.

SQL> alter pluggable database PDB3 close instances=all;

Pluggable database altered.

SQL> alter pluggable database PDB3 open instances=all;

Pluggable database altered.

We can now check that both instances have an UNDO tablespace assigned.

SQL> select inst_id, name, value from gv$parameter where upper(name)='UNDO_TABLESPACE';

   INST_ID NAME                      VALUE
---------- ------------------------- ------------------------------------------------------------
         1 undo_tablespace           UNDOTBS1
         2 undo_tablespace           UNDOTBS2

Finally, I checked that the PDB violation is resolved.

SQL> select status, message from pdb_plug_in_violations;

STATUS    MESSAGE
--------- ------------------------------------------------------------------------------------------------------------------------
RESOLVED  CDB is using local undo, but no undo tablespace found in the PDB.

1 rows selected.

The article Migration from Non-CDB to Multitenant : CDB is using local undo, but no undo tablespace found in the PDB appeared first on dbi Blog.

Migration from Non-CDB to Multitenant : Wallet Key Needed

Thu, 2024-02-29 06:22

I recently had the opportunity to run several migration tests from On-Premises to Exadata, and during some of my tests I faced a PDB_PLUG_IN_VIOLATIONS entry whose cause was "Wallet Key Needed" when converting a migrated Non-CDB to a PDB.

Explanation

This violation comes from the fact that the source Non-CDB database migrated to the Exadata from On-Premises was encrypted. The message is clear: the source wallet needs to be exported and imported into the new PDB after running the noncdb_to_pdb.sql script.

SQL> select name, cause, type, message, status from PDB_PLUG_IN_VIOLATIONS where type = 'ERROR' and status <> 'RESOLVED';

NAME                 CAUSE                                                            TYPE      MESSAGE                                                                STATUS
-------------------- ---------------------------------------------------------------- --------- ---------------------------------------------------------------------- ---------
PDB_NAME       Wallet Key Needed                                                ERROR     PDB needs to import keys from source.                                  PENDING

Export wallet from the Non-CDB

Connecting to the Non-CDB source database, I can check the wallet configuration.

SQL> select WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

WRL_PARAMETER                                                WRL_TYPE             WALLET_TYPE          STATUS
------------------------------------------------------------ -------------------- -------------------- ------------------------------
/var/opt/oracle/dbaas_acfs/DB_NAME/wallet_root/tde/             FILE                 AUTOLOGIN            OPEN

So I tried to export my wallet.

SQL> administer key management
  2  export encryption keys with secret ""
  3  to '/var/opt/oracle/dbaas_acfs/DB_NAME/DB_NAME_to_PDB_wallet.p12'
  4  identified by "************"
  5  /
administer key management
*
ERROR at line 1:
ORA-28417: password-based keystore is not open

But I faced the following error:

ORA-28417: password-based keystore is not open

This is because my wallet is opened in autologin mode and not as a password-based keystore.

So let's close it and open it with the password.

To do so, it should be sufficient to run a keystore close before running a keystore open.

SQL> administer key management set keystore close;

keystore altered.

I also tried moving the autologin cwallet.sso file out of the way to see how that works.

I first closed my Source Non-CDB database.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

Then I renamed the cwallet.sso file.

oracle@ExaCC-cl01n1:/var/opt/oracle/dbaas_acfs/DB_NAME/wallet_root/tde/ [PDB_NAME (CDB$ROOT)] mv cwallet.sso cwallet.sso.no_auto

I started again the database.

SQL> startup
ORACLE instance started.

Total System Global Area 3.7572E+10 bytes
Fixed Size                 13653168 bytes
Variable Size            4697620480 bytes
Database Buffers         3.2749E+10 bytes
Redo Buffers              111153152 bytes
Database mounted.
Database opened.

I checked the wallet type and its status.

SQL> select WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

WRL_PARAMETER                                                WRL_TYPE             WALLET_TYPE          STATUS
------------------------------------------------------------ -------------------- -------------------- ------------------------------
/var/opt/oracle/dbaas_acfs/DB_NAME/wallet_root/tde/             FILE                 UNKNOWN              CLOSED

As expected it was closed. I opened it.

SQL> ADMINISTER KEY MANAGEMENT SET KEYSTORE OPEN IDENTIFIED BY "*****************";

keystore altered.

Checked the wallet status again.

SQL> select WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

WRL_PARAMETER                                                WRL_TYPE             WALLET_TYPE          STATUS
------------------------------------------------------------ -------------------- -------------------- ------------------------------
/var/opt/oracle/dbaas_acfs/DB_NAME/wallet_root/tde/             FILE                 PASSWORD             OPEN

Now the wallet is opened with the password as needed, so I tried to export the wallet again.

SQL> administer key management
  2  export encryption keys with secret ""
  3  to '/var/opt/oracle/dbaas_acfs/DB_NAME/bak_DB_NAME_to_PDB_wallet.p12'
  4  identified by "************"
  5  /
administer key management
*
ERROR at line 1:
ORA-46644: creation or open of file to store the exported keys failed

I got a new error:

ORA-46644: creation or open of file to store the exported keys failed

This might be due to missing permissions for the oracle user to write into that folder. Let's save it in /tmp instead.

SQL> administer key management
  2  export encryption keys with secret ""
  3  to '/tmp/DB_NAME.p12'
  4  identified by "************"
  5  /

keystore altered.

This time it worked.

I shut down the database again.

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

Moved the autologin wallet file back.

oracle@ExaCC-cl01n1:/var/opt/oracle/dbaas_acfs/DB_NAME/wallet_root/tde/ [PDB_NAME (CDB$ROOT)] mv cwallet.sso.no_auto cwallet.sso

Started the database again.

SQL> startup
ORACLE instance started.

Total System Global Area 3.7572E+10 bytes
Fixed Size                 13653168 bytes
Variable Size            4697620480 bytes
Database Buffers         3.2749E+10 bytes
Redo Buffers              111153152 bytes
Database mounted.
Database opened.

And checked that my wallet has been automatically opened again.

SQL> select WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

WRL_PARAMETER                                                WRL_TYPE             WALLET_TYPE          STATUS
------------------------------------------------------------ -------------------- -------------------- ------------------------------
/var/opt/oracle/dbaas_acfs/DB_NAME/wallet_root/tde/             FILE                 AUTOLOGIN            OPEN

Import the wallet in the PDB

Now we need to import the wallet into the PDB.

I connected to the PDB.

SQL> alter session set container=PDB_NAME;

Session altered.

SQL> show pdbs

    CON_ID CON_NAME                       OPEN MODE  RESTRICTED
---------- ------------------------------ ---------- ----------
         4 PDB_NAME                 READ WRITE YES

Check the wallet status.

SQL> select WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

WRL_PARAMETER                                                WRL_TYPE             WALLET_TYPE          STATUS
------------------------------------------------------------ -------------------- -------------------- ------------------------------
                                                             FILE                 AUTOLOGIN            OPEN_NO_MASTER_KEY

I tried to import the wallet, expecting it to fail as the wallet is of autologin type.

SQL> administer key management
  2  import encryption keys with secret ""
  3  from '/tmp/DB_NAME.p12'
  4  identified by "**************"
  5  with backup USING 'pre-import-PDB_NAME'
  6  /
administer key management
*
ERROR at line 1:
ORA-28417: password-based keystore is not open

So I need to connect to the CDB$ROOT to open the wallet with the password. I checked and could see that the wallet is opened automatically for the CDB$ROOT and all PDBs. The CDB$ROOT wallet is shared with the PDBs.

SQL> alter session set container=cdb$root;

Session altered.

SQL> select CON_ID, WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

    CON_ID WRL_PARAMETER                                                WRL_TYPE             WALLET_TYPE          STATUS
---------- ------------------------------------------------------------ -------------------- -------------------- ------------------------------
         1 /var/opt/oracle/dbaas_acfs/CDB_NAME/wallet_root/tde/         FILE                 AUTOLOGIN            OPEN
         2                                                              FILE                 AUTOLOGIN            OPEN
         3                                                              FILE                 AUTOLOGIN            OPEN
         4                                                              FILE                 AUTOLOGIN            OPEN_NO_MASTER_KEY

So I closed the wallet, opened it with the password and checked its status.

SQL> administer key management set keystore close;

keystore altered.

SQL> administer key management set keystore open identified by "*************";

keystore altered.

SQL> select CON_ID, WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

    CON_ID WRL_PARAMETER                                                WRL_TYPE             WALLET_TYPE          STATUS
---------- ------------------------------------------------------------ -------------------- -------------------- ------------------------------
         1 /var/opt/oracle/dbaas_acfs/CDB_NAME/wallet_root/tde/         FILE                 PASSWORD             OPEN
         2                                                              FILE                 UNKNOWN              CLOSED
         3                                                              FILE                 UNKNOWN              CLOSED
         4                                                              FILE                 UNKNOWN              CLOSED

I connected to the PDB and tried to import the wallet.

SQL> alter session set container=PDB_NAME;

Session altered.

SQL> administer key management
  2  import encryption keys with secret ""
  3  from '/tmp/DB_NAME.p12'
  4  identified by "**************"
  5  with backup USING 'pre-import-PDB_NAME'
  6  /
administer key management
*
ERROR at line 1:
ORA-46658: keystore not open in the container

This did not work either, as the wallet is still closed for the PDB. So I opened it with the password.

SQL> administer key management set keystore open identified by "***************";

keystore altered.

SQL> select CON_ID, WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

    CON_ID WRL_PARAMETER                                                WRL_TYPE             WALLET_TYPE          STATUS
---------- ------------------------------------------------------------ -------------------- -------------------- ------------------------------
         4                                                              FILE                 PASSWORD             OPEN_NO_MASTER_KEY

And could now successfully import it.

SQL> administer key management
  2  import encryption keys with secret ""
  3  from '/tmp/DB_NAME.p12'
  4  identified by "******************"
  5  with backup USING 'pre-import-PDB_NAME'
  6  /

keystore altered.

I checked the wallet status, and all was good: it changed from OPEN_NO_MASTER_KEY to OPEN.

SQL> select CON_ID, WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

    CON_ID WRL_PARAMETER                                                WRL_TYPE             WALLET_TYPE          STATUS
---------- ------------------------------------------------------------ -------------------- -------------------- ------------------------------
         4                                                              FILE                 PASSWORD             OPEN

I connected to the CDB$ROOT again and checked the wallet status: it is opened with the password for the CDB$ROOT and the new PDB, and closed for the others, which might impact them. Not a problem in my case, as everything here is only used for migration tests.

SQL> alter session set container=cdb$root;

Session altered.

SQL> select CON_ID, WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

    CON_ID WRL_PARAMETER                                                WRL_TYPE             WALLET_TYPE          STATUS
---------- ------------------------------------------------------------ -------------------- -------------------- ------------------------------
         1 /var/opt/oracle/dbaas_acfs/CDB_NAME/wallet_root/tde/         FILE                 PASSWORD             OPEN
         2                                                              FILE                 UNKNOWN              CLOSED
         3                                                              FILE                 UNKNOWN              CLOSED
         4                                                              FILE                 PASSWORD             OPEN

I closed the wallet.

SQL> administer key management set keystore close identified by "******************";

keystore altered.

And I could check that the wallet had been automatically opened again for the CDB$ROOT and all PDBs.

SQL> select CON_ID, WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

    CON_ID WRL_PARAMETER                                                WRL_TYPE             WALLET_TYPE          STATUS
---------- ------------------------------------------------------------ -------------------- -------------------- ------------------------------
         1 /var/opt/oracle/dbaas_acfs/CDB_NAME/wallet_root/tde/         FILE                 AUTOLOGIN            OPEN
         2                                                              FILE                 AUTOLOGIN            OPEN
         3                                                              FILE                 AUTOLOGIN            OPEN
         4                                                              FILE                 AUTOLOGIN            OPEN

I restarted the PDBs and could see that the "wallet key needed" plug-in violation had been resolved.
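
If you want to reproduce that check, a minimal sketch could look like this (run as sysdba against the CDB, using the same PDB_NAME placeholder as in the rest of this post):

sqlplus -S / as sysdba <<'EOF'
alter pluggable database PDB_NAME close immediate;
alter pluggable database PDB_NAME open;
-- remaining plug-in violations; the "wallet key needed" entry should now show up as RESOLVED
select name, cause, status, message
from   pdb_plug_in_violations
where  name = 'PDB_NAME';
EOF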

To wrap up

Converting an encrypted non-CDB to a PDB requires the wallet, containing the master key needed to encrypt and decrypt the data, to be exported from the source database and imported into the new PDB.

Last but not least, this was a lab for me, so there was no risk, but it is important to mention that you need to work carefully with the wallet and have a good backup: losing a wallet can be dramatic and can leave you with no way to access your data any more. Pay attention.
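
As an illustration only (the path is the one from this example, adjust it to your environment), a simple file-level backup of the wallet before touching the keystore could look like this:

# keep a copy of the complete wallet_root directory before any key operation
tar czf /tmp/DB_NAME_wallet_root_$(date +%Y%m%d).tar.gz \
    -C /var/opt/oracle/dbaas_acfs/DB_NAME wallet_root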

The article Migration from Non-CDB to Multitenant : Wallet Key Needed appeared first on dbi Blog.

Getting started with Greenplum – 2 – Initializing and bringing up the cluster

Thu, 2024-02-29 06:11

In the last post we’ve configured the operating system for Greenplum and completed the installation. In this post we’ll create the so-called “Data Storage Areas” (which are just mount points or directories) and initialize the cluster. All the work is performed on the “Coordinator Host” and “gpssh” is used to run the commands on the remote systems.

For this playground environment we’ll just use a directory for the storage area. In a real setup you should of course use a dedicated, separate mount point. We start on the coordinator node:

[gpadmin@rocky9-gp7-master ~]$ sudo mkdir -p /data/coordinator
[gpadmin@rocky9-gp7-master ~]$ sudo chown gpadmin:gpadmin /data/coordinator/

Using “gpssh” we do the same on the two segment hosts:

[gpadmin@rocky9-gp7-master ~]$ gpssh -h rocky9-gp7-segment1 -e "sudo mkdir -p /data/coordinator"
[rocky9-gp7-segment1] sudo mkdir -p /data/coordinator
[gpadmin@rocky9-gp7-master ~]$ gpssh -h rocky9-gp7-segment2 -e "sudo mkdir -p /data/coordinator"
[rocky9-gp7-segment2] sudo mkdir -p /data/coordinator
[gpadmin@rocky9-gp7-master ~]$ gpssh -h rocky9-gp7-segment1 -e "sudo chown gpadmin:gpadmin /data/coordinator/"
[rocky9-gp7-segment1] sudo chown gpadmin:gpadmin /data/coordinator/
[gpadmin@rocky9-gp7-master ~]$ gpssh -h rocky9-gp7-segment2 -e "sudo chown gpadmin:gpadmin /data/coordinator/"
[rocky9-gp7-segment2] sudo chown gpadmin:gpadmin /data/coordinator/

This storage area is used to store system catalog tables and metadata. It is not used to store any user data.

The storage areas on the segment hosts will store user data, so they need to be bigger. All of the segment nodes should provide a storage area for the so-called “primary segments”. Those segments are the active ones and will be used by default for serving client requests. In addition there should be a storage area for the so-called “mirror segments”. Those segments will be used in case the primary segment becomes unavailable. For that reason a mirror segment must always be on a different host than its primary segment (more on that later).

Before we use “gpssh” to do this, let’s create a file which only contains the host names of the segment hosts:

[gpadmin@rocky9-gp7-master ~]$ echo "rocky9-gp7-segment1
rocky9-gp7-segment2" > ~/hostfile_gpssh_segonly

Having this in place we can easily create the directories on the segment nodes:

[gpadmin@rocky9-gp7-master ~]$ gpssh -f hostfile_gpssh_segonly -e 'sudo mkdir -p /data/primary'
[rocky9-gp7-segment1] sudo mkdir -p /data/primary
[rocky9-gp7-segment2] sudo mkdir -p /data/primary
[gpadmin@rocky9-gp7-master ~]$ gpssh -f hostfile_gpssh_segonly -e 'sudo mkdir -p /data/mirror'
[rocky9-gp7-segment1] sudo mkdir -p /data/mirror
[rocky9-gp7-segment2] sudo mkdir -p /data/mirror
[gpadmin@rocky9-gp7-master ~]$ gpssh -f hostfile_gpssh_segonly -e 'sudo chown gpadmin:gpadmin /data/*'
[rocky9-gp7-segment2] sudo chown gpadmin:gpadmin /data/*
[rocky9-gp7-segment1] sudo chown gpadmin:gpadmin /data/*

Greenplum comes with a utility you can use to validate your systems when it comes to network, disk and memory performance. The utility is called “gpcheckperf” and this, e.g., will run a network performance test:

[gpadmin@rocky9-gp7-master ~]$ gpcheckperf -f hostfile_exkeys -r N -d /tmp
[INFO] --buffer-size value is not specified or invalid. Using default (8 kilobytes)
/usr/local/greenplum-db-7.1.0/bin/gpcheckperf -f hostfile_exkeys -r N -d /tmp
-------------------
--  NETPERF TEST
-------------------

====================
==  RESULT 2024-02-28T16:41:39.049314
====================
Netperf bisection bandwidth test
rocky9-gp7-master -> rocky9-gp7-segment1 = 1971.150000
rocky9-gp7-segment2 -> rocky9-gp7-master = 1688.660000
rocky9-gp7-segment1 -> rocky9-gp7-master = 1310.830000
rocky9-gp7-master -> rocky9-gp7-segment2 = 1377.070000

Summary:
sum = 6347.71 MB/sec
min = 1310.83 MB/sec
max = 1971.15 MB/sec
avg = 1586.93 MB/sec
median = 1688.66 MB/sec

[Warning] connection between rocky9-gp7-segment2 and rocky9-gp7-master is no good
[Warning] connection between rocky9-gp7-segment1 and rocky9-gp7-master is no good
[Warning] connection between rocky9-gp7-master and rocky9-gp7-segment2 is no good

I don’t care about these warnings because this is just a test; you should care if you do a real setup, of course. Running a disk I/O test can be done like this (it will run dd tests on all the segment nodes):

[gpadmin@rocky9-gp7-master ~]$ gpcheckperf -f hostfile_gpssh_segonly -r ds -D -d /data/primary -d /data/mirror
[INFO] --buffer-size value is not specified or invalid. Using default (8 kilobytes)
/usr/local/greenplum-db-7.1.0/bin/gpcheckperf -f hostfile_gpssh_segonly -r ds -D -d /data/primary -d /data/mirror
[Warning] Using 7650140160 bytes for disk performance test. This might take some time
--------------------
--  DISK WRITE TEST
--------------------
--------------------
--  DISK READ TEST
--------------------
--------------------
--  STREAM TEST
--------------------

====================
==  RESULT 2024-02-28T16:49:58.607351
====================

 disk write avg time (sec): 109.30
 disk write tot bytes: 15300296704
 disk write tot bandwidth (MB/s): 133.51
 disk write min bandwidth (MB/s): 66.31 [rocky9-gp7-segment1]
 disk write max bandwidth (MB/s): 67.19 [rocky9-gp7-segment2]
 -- per host bandwidth --
    disk write bandwidth (MB/s): 66.31 [rocky9-gp7-segment1]
    disk write bandwidth (MB/s): 67.19 [rocky9-gp7-segment2]


 disk read avg time (sec): 58.48
 disk read tot bytes: 15300296704
 disk read tot bandwidth (MB/s): 250.04
 disk read min bandwidth (MB/s): 119.41 [rocky9-gp7-segment1]
 disk read max bandwidth (MB/s): 130.63 [rocky9-gp7-segment2]
 -- per host bandwidth --
    disk read bandwidth (MB/s): 130.63 [rocky9-gp7-segment2]
    disk read bandwidth (MB/s): 119.41 [rocky9-gp7-segment1]


 stream tot bandwidth (MB/s): 66240.30
 stream min bandwidth (MB/s): 32732.80 [rocky9-gp7-segment1]
 stream max bandwidth (MB/s): 33507.50 [rocky9-gp7-segment2]
 -- per host bandwidth --
    stream bandwidth (MB/s): 32732.80 [rocky9-gp7-segment1]
    stream bandwidth (MB/s): 33507.50 [rocky9-gp7-segment2]

Assuming that we’re happy with the performance statistics, we can proceed and initialize the cluster. With a community PostgreSQL installation you would do this with initdb, and actually initdb and many other utilities you know from PostgreSQL are available on the system:

[gpadmin@rocky9-gp7-master ~]$ ls -la /usr/local/greenplum-db/bin/
total 81796
drwxr-xr-x  8 gpadmin gpadmin     4096 Feb 28 16:36 .
drwxr-xr-x 11 gpadmin gpadmin     4096 Feb 28 14:52 ..
-rwxr-xr-x  1 gpadmin gpadmin    66665 Feb  8 21:01 analyzedb
-rwxr-xr-x  1 gpadmin gpadmin   259104 Feb  8 21:01 clusterdb
-rwxr-xr-x  1 gpadmin gpadmin   254416 Feb  8 21:01 createdb
-rwxr-xr-x  1 gpadmin gpadmin   265176 Feb  8 21:01 createuser
-rwxr-xr-x  1 gpadmin gpadmin   238480 Feb  8 21:01 dropdb
-rwxr-xr-x  1 gpadmin gpadmin   238352 Feb  8 21:01 dropuser
-rwxr-xr-x  1 gpadmin gpadmin  2754648 Feb  8 21:01 ecpg
-rwxr-xr-x  1 gpadmin gpadmin    17248 Feb  8 21:01 gpactivatestandby
-rwxr-xr-x  1 gpadmin gpadmin      494 Feb  8 21:01 gpaddmirrors
-rwxr-xr-x  1 gpadmin gpadmin   137764 Feb  8 21:01 gpcheckcat
drwxr-xr-x  3 gpadmin gpadmin     4096 Feb 28 14:52 gpcheckcat_modules
-rwxr-xr-x  1 gpadmin gpadmin    29980 Feb  8 21:01 gpcheckperf
-rwxr-xr-x  1 gpadmin gpadmin     6682 Feb  8 21:01 gpcheckresgroupimpl
-rwxr-xr-x  1 gpadmin gpadmin     3230 Feb  8 21:01 gpcheckresgroupv2impl
-rwxr-xr-x  1 gpadmin gpadmin    23374 Feb  8 21:01 gpconfig
drwxr-xr-x  3 gpadmin gpadmin     4096 Feb 28 14:52 gpconfig_modules
-rwxr-xr-x  1 gpadmin gpadmin    13754 Feb  8 21:01 gpdeletesystem
-rwxr-xr-x  1 gpadmin gpadmin   114969 Feb  8 21:01 gpexpand
-rwxr-xr-x  1 gpadmin gpadmin   407208 Feb  8 21:01 gpfdist
-rwxr-xr-x  1 gpadmin gpadmin    34959 Feb  8 21:01 gpinitstandby
-rwxr-xr-x  1 gpadmin gpadmin    83564 Feb  8 21:01 gpinitsystem
-rwxr-xr-x  1 gpadmin gpadmin      189 Feb  8 21:01 gpload
-rw-r--r--  1 gpadmin gpadmin      202 Feb  8 21:01 gpload.bat
-rwxr-xr-x  1 gpadmin gpadmin   113900 Feb  8 21:01 gpload.py
-rwxr-xr-x  1 gpadmin gpadmin    21018 Feb  8 21:01 gplogfilter
-rwxr-xr-x  1 gpadmin gpadmin    15333 Feb  8 21:01 gpmemreport
-rwxr-xr-x  1 gpadmin gpadmin     8032 Feb  8 21:01 gpmemwatcher
-rwxr-xr-x  1 gpadmin gpadmin    21646 Feb  8 21:01 gpmovemirrors
-rwxr-xr-x  1 gpadmin gpadmin      548 Feb  8 21:01 gprecoverseg
-rwxr-xr-x  1 gpadmin gpadmin     1162 Feb  8 21:01 gpreload
-rwxr-xr-x  1 gpadmin gpadmin    10723 Feb  8 21:01 gpsd
-rwxr-xr-x  1 gpadmin gpadmin     9258 Feb  8 21:01 gpssh
-rwxr-xr-x  1 gpadmin gpadmin    32516 Feb  8 21:01 gpssh-exkeys
drwxr-xr-x  3 gpadmin gpadmin       70 Feb 28 14:52 gpssh_modules
-rwxr-xr-x  1 gpadmin gpadmin    37579 Feb  8 21:01 gpstart
-rwxr-xr-x  1 gpadmin gpadmin      422 Feb  8 21:01 gpstate
-rwxr-xr-x  1 gpadmin gpadmin    45588 Feb  8 21:01 gpstop
-rwxr-xr-x  1 gpadmin gpadmin     4074 Feb  8 21:01 gpsync
-rwxr-xr-x  1 gpadmin gpadmin   528656 Feb  8 21:01 initdb
drwxr-xr-x  4 gpadmin gpadmin     4096 Feb 28 14:52 lib
-rwxr-xr-x  1 gpadmin gpadmin    17611 Feb  8 21:01 minirepro
-rwxr-xr-x  1 gpadmin gpadmin   163568 Feb  8 21:01 pg_archivecleanup
-rwxr-xr-x  1 gpadmin gpadmin   459656 Feb  8 21:01 pg_basebackup
-rwxr-xr-x  1 gpadmin gpadmin   667784 Feb  8 21:01 pgbench
-rwxr-xr-x  1 gpadmin gpadmin   224176 Feb  8 21:01 pg_checksums
-rwxr-xr-x  1 gpadmin gpadmin   150736 Feb  8 21:01 pg_config
-rwxr-xr-x  1 gpadmin gpadmin   177072 Feb  8 21:01 pg_controldata
-rwxr-xr-x  1 gpadmin gpadmin   235296 Feb  8 21:01 pg_ctl
-rwxr-xr-x  1 gpadmin gpadmin  1591264 Feb  8 21:01 pg_dump
-rwxr-xr-x  1 gpadmin gpadmin   371784 Feb  8 21:01 pg_dumpall
-rwxr-xr-x  1 gpadmin gpadmin   239264 Feb  8 21:01 pg_isready
-rwxr-xr-x  1 gpadmin gpadmin   327200 Feb  8 21:01 pg_receivewal
-rwxr-xr-x  1 gpadmin gpadmin   331168 Feb  8 21:01 pg_recvlogical
-rwxr-xr-x  1 gpadmin gpadmin   211880 Feb  8 21:01 pg_resetwal
-rwxr-xr-x  1 gpadmin gpadmin   764392 Feb  8 21:01 pg_restore
-rwxr-xr-x  1 gpadmin gpadmin   480400 Feb  8 21:01 pg_rewind
-rwxr-xr-x  1 gpadmin gpadmin   171944 Feb  8 21:01 pg_test_fsync
-rwxr-xr-x  1 gpadmin gpadmin   144336 Feb  8 21:01 pg_test_timing
-rwxr-xr-x  1 gpadmin gpadmin   606048 Feb  8 21:01 pg_upgrade
-rwxr-xr-x  1 gpadmin gpadmin   454504 Feb  8 21:01 pg_waldump
-rwxr-xr-x  1 gpadmin gpadmin 67633848 Feb  8 21:01 postgres
lrwxrwxrwx  1 gpadmin gpadmin        8 Feb  8 21:01 postmaster -> postgres
-rwxr-xr-x  1 gpadmin gpadmin  1826136 Feb  8 21:01 psql
drwxr-xr-x  2 gpadmin gpadmin       35 Feb 28 14:52 __pycache__
-rwxr-xr-x  1 gpadmin gpadmin   267224 Feb  8 21:01 reindexdb
drwxr-xr-x  2 gpadmin gpadmin       20 Feb 28 14:52 stream
-rwxr-xr-x  1 gpadmin gpadmin   287832 Feb  8 21:01 vacuumdb

The Greenplum system will work across multiple nodes and all of them will host PostgreSQL instances (called segment and coordinator instances). To make this easier to set up, Greenplum comes with its own version of “initdb” which is called “gpinitsystem”:

[gpadmin@rocky9-gp7-master ~]$ gpinitsystem --version
gpinitsystem 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source

Before the system can be initialized we need to create the Greenplum database configuration file. There is a template we can use as a starting point:

[gpadmin@rocky9-gp7-master ~]$ echo $GPHOME
/usr/local/greenplum-db-7.1.0
[gpadmin@rocky9-gp7-master ~]$ mkdir /home/gpadmin/gpconfigs/
[gpadmin@rocky9-gp7-master ~]$ cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config
[gpadmin@rocky9-gp7-master ~]$ vi /home/gpadmin/gpconfigs/gpinitsystem_config

For the scope of this demo system, all that needs to be adjusted are the data and mirror directories:

[gpadmin@rocky9-gp7-master ~]$ egrep "DATA_DIRECTORY|MIRROR_DATA_DIRECTORY|MIRROR_PORT_BASE" /home/gpadmin/gpconfigs/gpinitsystem_config | egrep -v "^#"
declare -a DATA_DIRECTORY=(/data/primary)
MIRROR_PORT_BASE=7000
declare -a MIRROR_DATA_DIRECTORY=(/data/mirror)
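
For reference, the remaining parameters were left at the template defaults. Roughly (parameter names as in the GP7 template we copied, values matching what gpinitsystem reports further down; treat this as a sketch and verify against your own copy of the file):

SEG_PREFIX=gpseg
PORT_BASE=6000
COORDINATOR_HOSTNAME=cdw
COORDINATOR_DIRECTORY=/data/coordinator
COORDINATOR_PORT=5432
TRUSTED_SHELL=ssh
ENCODING=UNICODE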

This config and the host file containing the segment hosts need to be passed to “gpinitsystem”:

[gpadmin@rocky9-gp7-master ~]$ gpinitsystem -c gpconfigs/gpinitsystem_config -h hostfile_gpssh_segonly
20240229:11:39:11:001290 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20240229:11:39:11:001290 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Reading Greenplum configuration file gpconfigs/gpinitsystem_config
20240229:11:39:11:001290 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Locale has not been set in gpconfigs/gpinitsystem_config, will set to default value
20240229:11:39:11:001290 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:-Coordinator hostname cdw does not match hostname output
20240229:11:39:11:001290 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Checking to see if cdw can be resolved on this host
ssh: Could not resolve hostname cdw: Name or service not known
ssh: Could not resolve hostname cdw: Name or service not known
20240229:11:39:20:001290 gpinitsystem:rocky9-gp7-master:gpadmin-[FATAL]:-Coordinator hostname in configuration file is cdw
20240229:11:39:20:001290 gpinitsystem:rocky9-gp7-master:gpadmin-[FATAL]:-Operating system command returns rocky9-gp7-master.it.dbi-services.com
20240229:11:39:20:001290 gpinitsystem:rocky9-gp7-master:gpadmin-[FATAL]:-Unable to resolve cdw on this host
20240229:11:39:20:001290 gpinitsystem:rocky9-gp7-master:gpadmin-[FATAL]:-Coordinator hostname in gpinitsystem configuration file must be cdw Script Exiting!

It seems the hostname of the coordinator node needs to be “cdw”, so let’s add this to the hosts file on all nodes:

[gpadmin@rocky9-gp7-master ~]$ sudo vi /etc/hosts
[gpadmin@rocky9-gp7-master ~]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.122.200 rocky9-gp7-master rocky9-gp7-master.it.dbi-services.com cdw cdw.it.dbi-services.com
192.168.122.201 rocky9-gp7-segment1 rocky9-gp7-segment1.it.dbi-services.com
192.168.122.202 rocky9-gp7-segment2 rocky9-gp7-segment2.it.dbi-services.com

Running it once more, it looks much better:

[gpadmin@rocky9-gp7-master ~]$ gpinitsystem -c gpconfigs/gpinitsystem_config -h hostfile_gpssh_segonly
20240229:11:45:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20240229:11:45:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Reading Greenplum configuration file gpconfigs/gpinitsystem_config
20240229:11:45:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Locale has not been set in gpconfigs/gpinitsystem_config, will set to default value
20240229:11:45:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:-Coordinator hostname cdw does not match hostname output
20240229:11:45:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Checking to see if cdw can be resolved on this host
The authenticity of host 'cdw (192.168.122.200)' can't be established.
ED25519 key fingerprint is SHA256:Tdo3AwqH109Mgc30keTbDcusFii8PSft0FXWTUS0Tb0.
This host key is known by the following other names/addresses:
    ~/.ssh/known_hosts:1: rocky9-gp7-segment1
    ~/.ssh/known_hosts:4: rocky9-gp7-segment2
    ~/.ssh/known_hosts:5: rocky9-gp7-master
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added 'cdw' (ED25519) to the list of known hosts.
20240229:11:45:31:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Can resolve cdw to this host
20240229:11:45:31:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-No DATABASE_NAME set, will exit following template1 updates
20240229:11:45:31:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-COORDINATOR_MAX_CONNECT not set, will set to default value 250
20240229:11:45:31:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Checking configuration parameters, Completed
20240229:11:45:31:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Commencing multi-home checks, please wait...
..
20240229:11:45:31:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Configuring build for standard array
20240229:11:45:31:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Commencing multi-home checks, Completed
20240229:11:45:31:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Building primary segment instance array, please wait...
....
20240229:11:45:32:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Building group mirror array type , please wait...
....
20240229:11:45:34:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Checking Coordinator host
20240229:11:45:34:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Checking new segment hosts, please wait...
........
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Checking new segment hosts, Completed
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Greenplum Database Creation Parameters
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:---------------------------------------
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator Configuration
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:---------------------------------------
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator hostname       = cdw
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator port           = 5432
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator instance dir   = /data/coordinator/gpseg-1
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator LOCALE         = 
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Greenplum segment prefix   = gpseg
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator Database       = 
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator connections    = 250
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator buffers        = 128000kB
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Segment connections        = 750
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Segment buffers            = 128000kB
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Encoding                   = UNICODE
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Postgres param file        = Off
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Initdb to be used          = /usr/local/greenplum-db-7.1.0/bin/initdb
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-GP_LIBRARY_PATH is         = /usr/local/greenplum-db-7.1.0/lib
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-HEAP_CHECKSUM is           = on
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-HBA_HOSTNAMES is           = 0
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Ulimit check               = Passed
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Array host connect type    = Single hostname per node
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator IP address [1]      = ::1
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator IP address [2]      = 192.168.122.200
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator IP address [3]      = fe80::5054:ff:fe5d:fef7
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Standby Coordinator             = Not Configured
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Number of primary segments = 2
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Total Database segments    = 4
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Trusted shell              = ssh
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Number segment hosts       = 2
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Mirror port base           = 7000
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Number of mirror segments  = 2
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Mirroring config           = ON
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Mirroring type             = Group
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:----------------------------------------
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:----------------------------------------
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-rocky9-gp7-segment1.it.dbi-services.com     6000    rocky9-gp7-segment1     /data/primary/gpseg0        2
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-rocky9-gp7-segment1.it.dbi-services.com     6001    rocky9-gp7-segment1     /data/primary/gpseg1        3
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-rocky9-gp7-segment2.it.dbi-services.com     6000    rocky9-gp7-segment2     /data/primary/gpseg2        4
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-rocky9-gp7-segment2.it.dbi-services.com     6001    rocky9-gp7-segment2     /data/primary/gpseg3        5
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:---------------------------------------
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Greenplum Mirror Segment Configuration
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:---------------------------------------
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-rocky9-gp7-segment2.it.dbi-services.com     7000    rocky9-gp7-segment2     /data/mirror/gpseg0 6
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-rocky9-gp7-segment2.it.dbi-services.com     7001    rocky9-gp7-segment2     /data/mirror/gpseg1 7
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-rocky9-gp7-segment1.it.dbi-services.com     7000    rocky9-gp7-segment1     /data/mirror/gpseg2 8
20240229:11:45:39:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-rocky9-gp7-segment1.it.dbi-services.com     7001    rocky9-gp7-segment1     /data/mirror/gpseg3 9

Continue with Greenplum creation Yy|Nn (default=N):

Confirming the prompt leads to this:

> Y
20240229:11:48:12:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Building the Coordinator instance database, please wait...
20240229:11:48:14:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Starting the Coordinator in admin mode
20240229:11:48:14:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
20240229:11:48:14:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
....
20240229:11:48:14:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
.........
20240229:11:48:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:------------------------------------------------
20240229:11:48:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Parallel process exit status
20240229:11:48:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:------------------------------------------------
20240229:11:48:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Total processes marked as completed           = 4
20240229:11:48:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Total processes marked as killed              = 0
20240229:11:48:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Total processes marked as failed              = 0
20240229:11:48:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:------------------------------------------------
20240229:11:48:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Removing back out file
20240229:11:48:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-No errors generated from parallel processes
20240229:11:48:23:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Restarting the Greenplum instance in production mode
20240229:11:48:23:006091 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Starting gpstop with args: -a -l /home/gpadmin/gpAdminLogs -m -d /data/coordinator/gpseg-1
20240229:11:48:23:006091 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Gathering information and validating the environment...
20240229:11:48:23:006091 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Obtaining Greenplum Coordinator catalog information
20240229:11:48:23:006091 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240229:11:48:23:006091 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240229:11:48:23:006091 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Commencing Coordinator instance shutdown with mode='smart'
20240229:11:48:23:006091 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator segment instance directory=/data/coordinator/gpseg-1
20240229:11:48:24:006091 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Stopping coordinator segment and waiting for user connections to finish ...
server shutting down
20240229:11:48:25:006091 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Attempting forceful termination of any leftover coordinator process
20240229:11:48:25:006091 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Terminating processes for segment /data/coordinator/gpseg-1
20240229:11:48:26:006330 gpstart:rocky9-gp7-master:gpadmin-[INFO]:-Starting gpstart with args: -a -l /home/gpadmin/gpAdminLogs -d /data/coordinator/gpseg-1
20240229:11:48:26:006330 gpstart:rocky9-gp7-master:gpadmin-[INFO]:-Gathering information and validating the environment...
20240229:11:48:26:006330 gpstart:rocky9-gp7-master:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240229:11:48:26:006330 gpstart:rocky9-gp7-master:gpadmin-[INFO]:-Greenplum Catalog Version: '302307241'
20240229:11:48:26:006330 gpstart:rocky9-gp7-master:gpadmin-[INFO]:-Starting Coordinator instance in admin mode
20240229:11:48:26:006330 gpstart:rocky9-gp7-master:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /data/coordinator/gpseg-1 -l /data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -c gp_role=utility " start
20240229:11:48:26:006330 gpstart:rocky9-gp7-master:gpadmin-[INFO]:-Obtaining Greenplum Coordinator catalog information
20240229:11:48:26:006330 gpstart:rocky9-gp7-master:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240229:11:48:26:006330 gpstart:rocky9-gp7-master:gpadmin-[INFO]:-Setting new coordinator era
20240229:11:48:26:006330 gpstart:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator Started...
The authenticity of host 'rocky9-gp7-segment2.it.dbi-services.com (192.168.122.202)' can't be established.
ED25519 key fingerprint is SHA256:Tdo3AwqH109Mgc30keTbDcusFii8PSft0FXWTUS0Tb0.
This host key is known by the following other names/addresses:
    ~/.ssh/known_hosts:1: rocky9-gp7-segment1
    ~/.ssh/known_hosts:4: rocky9-gp7-segment2
    ~/.ssh/known_hosts:5: rocky9-gp7-master
    ~/.ssh/known_hosts:6: cdw
The authenticity of host 'rocky9-gp7-segment1.it.dbi-services.com (192.168.122.201)' can't be established.
ED25519 key fingerprint is SHA256:Tdo3AwqH109Mgc30keTbDcusFii8PSft0FXWTUS0Tb0.
This host key is known by the following other names/addresses:
    ~/.ssh/known_hosts:1: rocky9-gp7-segment1
    ~/.ssh/known_hosts:4: rocky9-gp7-segment2
    ~/.ssh/known_hosts:5: rocky9-gp7-master
    ~/.ssh/known_hosts:6: cdw
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes

20240229:11:51:06:006330 gpstart:rocky9-gp7-master:gpadmin-[WARNING]:-One or more hosts are not reachable via SSH.
20240229:11:51:06:006330 gpstart:rocky9-gp7-master:gpadmin-[WARNING]:-Host rocky9-gp7-segment1.it.dbi-services.com is unreachable
20240229:11:51:06:006330 gpstart:rocky9-gp7-master:gpadmin-[WARNING]:-Marking segment 2 down because rocky9-gp7-segment1.it.dbi-services.com is unreachable
20240229:11:51:06:006330 gpstart:rocky9-gp7-master:gpadmin-[CRITICAL]:-gpstart failed. (Reason=''NoneType' object has no attribute 'getSegmentHostName'') exiting...
20240229:11:51:06:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:
20240229:11:51:06:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:-Failed to start Greenplum instance; review gpstart output to
20240229:11:51:06:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:- determine why gpstart failed and reinitialize cluster after resolving
20240229:11:51:06:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:- issues.  Not all initialization tasks have completed so the cluster
20240229:11:51:06:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:- should not be used.
20240229:11:51:06:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:-gpinitsystem will now try to stop the cluster
20240229:11:51:06:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:
20240229:11:51:06:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Starting gpstop with args: -a -l /home/gpadmin/gpAdminLogs -i -d /data/coordinator/gpseg-1
20240229:11:51:06:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Gathering information and validating the environment...
20240229:11:51:06:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Obtaining Greenplum Coordinator catalog information
20240229:11:51:06:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240229:11:51:06:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240229:11:51:06:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Commencing Coordinator instance shutdown with mode='immediate'
20240229:11:51:06:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Coordinator segment instance directory=/data/coordinator/gpseg-1

20240229:11:51:07:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Attempting forceful termination of any leftover coordinator process
20240229:11:51:07:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Terminating processes for segment /data/coordinator/gpseg-1
20240229:11:51:08:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-No standby coordinator host configured
20240229:11:51:08:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Targeting dbid [2, 3, 4, 5] for shutdown
20240229:11:51:08:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Commencing parallel segment instance shutdown, please wait...
20240229:11:51:08:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-0.00% of jobs completed
20240229:11:51:09:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-100.00% of jobs completed
20240229:11:51:09:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-----------------------------------------------------
20240229:11:51:09:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-   Segments stopped successfully      = 4
20240229:11:51:09:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-   Segments with errors during stop   = 0
20240229:11:51:09:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-----------------------------------------------------
20240229:11:51:09:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Successfully shutdown 4 of 4 segment instances 
20240229:11:51:09:006412 gpstop:rocky9-gp7-master:gpadmin-[INFO]:-Database successfully shutdown with no errors reported
20240229:11:51:09:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[INFO]:-Successfully shutdown the Greenplum instance
20240229:11:51:09:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:
20240229:11:51:09:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:-Failed to start Greenplum instance; review gpstart output to
20240229:11:51:09:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:- determine why gpstart failed and reinitialize cluster after resolving
20240229:11:51:09:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:- issues.  Not all initialization tasks have completed so the cluster
20240229:11:51:09:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:- should not be used.
20240229:11:51:09:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[WARN]:
20240229:11:51:09:001611 gpinitsystem:rocky9-gp7-master:gpadmin-[FATAL]: starting new instance failed; Script Exiting!

This happens when you do not fully read the documentation: “The Greenplum Database host naming convention for the coordinator host is cdw and for the standby coordinator host is scdw.

The segment host naming convention is sdwN where sdw is a prefix and N is an integer. For example, segment host names would be sdw1, sdw2 and so on. NIC bonding is recommended for hosts with multiple interfaces, but when the interfaces are not bonded, the convention is to append a dash (-) and number to the host name. For example, sdw1-1 and sdw1-2 are the two interface names for host sdw1.”

So, let’s fix this (and also change the hostname on each node):

[gpadmin@rocky9-gp7-master ~]$ sudo vi /etc/hosts
[gpadmin@rocky9-gp7-master ~]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.122.200 cdw cdw.it.dbi-services.com
192.168.122.201 sdw1 sdw1.it.dbi-services.com
192.168.122.202 sdw2 sdw2.it.dbi-services.com

[gpadmin@rocky9-gp7-master ~]$ cat hostfile_gpssh_segonly 
sdw1
sdw2

[gpadmin@rocky9-gp7-master ~]$ cat hostfile_exkeys 
cdw
sdw1
sdw2
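
The hostname change itself on each node (not shown above) can be done with hostnamectl, for example:

# on the coordinator
sudo hostnamectl set-hostname cdw
# on the first segment host
sudo hostnamectl set-hostname sdw1
# on the second segment host
sudo hostnamectl set-hostname sdw2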

Before initializing the system again, let’s clean up what was already created:

[gpadmin@rocky9-gp7-master ~]$ gpssh -f hostfile_gpssh_segonly -e "rm -rf /data/primary/*; rm -rf /data/mirror/*; rm -rf /data/coordinator/*"
[sdw2] rm -rf /data/primary/*; rm -rf /data/mirror/*; rm -rf /data/coordinator/*
[sdw1] rm -rf /data/primary/*; rm -rf /data/mirror/*; rm -rf /data/coordinator/*
[gpadmin@rocky9-gp7-master ~]$ rm -rf /data/coordinator/*

Next try:

[gpadmin@cdw ~]$ gpinitsystem -c gpconfigs/gpinitsystem_config -h hostfile_gpssh_segonly
20240229:12:51:03:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Checking configuration parameters, please wait...
20240229:12:51:03:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Reading Greenplum configuration file gpconfigs/gpinitsystem_config
20240229:12:51:03:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Locale has not been set in gpconfigs/gpinitsystem_config, will set to default value
20240229:12:51:03:013410 gpinitsystem:cdw:gpadmin-[INFO]:-No DATABASE_NAME set, will exit following template1 updates
20240229:12:51:03:013410 gpinitsystem:cdw:gpadmin-[INFO]:-COORDINATOR_MAX_CONNECT not set, will set to default value 250
20240229:12:51:03:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Checking configuration parameters, Completed
20240229:12:51:03:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Commencing multi-home checks, please wait...
..
20240229:12:51:04:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Configuring build for standard array
20240229:12:51:04:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Commencing multi-home checks, Completed
20240229:12:51:04:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Building primary segment instance array, please wait...
..
20240229:12:51:04:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Building group mirror array type , please wait...
..
20240229:12:51:05:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Checking Coordinator host
20240229:12:51:05:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Checking new segment hosts, please wait...
....
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Checking new segment hosts, Completed
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Greenplum Database Creation Parameters
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:---------------------------------------
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Coordinator Configuration
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:---------------------------------------
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Coordinator hostname       = cdw
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Coordinator port           = 5432
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Coordinator instance dir   = /data/coordinator/gpseg-1
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Coordinator LOCALE         = 
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Greenplum segment prefix   = gpseg
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Coordinator Database       = 
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Coordinator connections    = 250
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Coordinator buffers        = 128000kB
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Segment connections        = 750
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Segment buffers            = 128000kB
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Encoding                   = UNICODE
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Postgres param file        = Off
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Initdb to be used          = /usr/local/greenplum-db-7.1.0/bin/initdb
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-GP_LIBRARY_PATH is         = /usr/local/greenplum-db-7.1.0/lib
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-HEAP_CHECKSUM is           = on
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-HBA_HOSTNAMES is           = 0
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Ulimit check               = Passed
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Array host connect type    = Single hostname per node
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Coordinator IP address [1]      = ::1
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Coordinator IP address [2]      = 192.168.122.200
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Coordinator IP address [3]      = fe80::5054:ff:fe5d:fef7
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Standby Coordinator             = Not Configured
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Number of primary segments = 1
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Total Database segments    = 2
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Trusted shell              = ssh
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Number segment hosts       = 2
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Mirror port base           = 7000
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Number of mirror segments  = 1
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Mirroring config           = ON
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Mirroring type             = Group
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:----------------------------------------
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:----------------------------------------
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-sdw1  6000    sdw1    /data/primary/gpseg0   2
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-sdw2  6000    sdw2    /data/primary/gpseg1   3
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:---------------------------------------
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Greenplum Mirror Segment Configuration
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:---------------------------------------
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-sdw2  7000    sdw2    /data/mirror/gpseg0    4
20240229:12:51:08:013410 gpinitsystem:cdw:gpadmin-[INFO]:-sdw1  7000    sdw1    /data/mirror/gpseg1    5

Continue with Greenplum creation Yy|Nn (default=N):
> y
20240229:12:51:12:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Building the Coordinator instance database, please wait...
20240229:12:51:13:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Starting the Coordinator in admin mode
20240229:12:51:13:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
20240229:12:51:13:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
..
20240229:12:51:13:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
.......
20240229:12:51:20:013410 gpinitsystem:cdw:gpadmin-[INFO]:------------------------------------------------
20240229:12:51:20:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Parallel process exit status
20240229:12:51:20:013410 gpinitsystem:cdw:gpadmin-[INFO]:------------------------------------------------
20240229:12:51:20:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Total processes marked as completed           = 2
20240229:12:51:20:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Total processes marked as killed              = 0
20240229:12:51:20:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Total processes marked as failed              = 0
20240229:12:51:20:013410 gpinitsystem:cdw:gpadmin-[INFO]:------------------------------------------------
20240229:12:51:20:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Removing back out file
20240229:12:51:20:013410 gpinitsystem:cdw:gpadmin-[INFO]:-No errors generated from parallel processes
20240229:12:51:20:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Restarting the Greenplum instance in production mode
20240229:12:51:20:016290 gpstop:cdw:gpadmin-[INFO]:-Starting gpstop with args: -a -l /home/gpadmin/gpAdminLogs -m -d /data/coordinator/gpseg-1
20240229:12:51:20:016290 gpstop:cdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20240229:12:51:20:016290 gpstop:cdw:gpadmin-[INFO]:-Obtaining Greenplum Coordinator catalog information
20240229:12:51:20:016290 gpstop:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240229:12:51:20:016290 gpstop:cdw:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240229:12:51:20:016290 gpstop:cdw:gpadmin-[INFO]:-Commencing Coordinator instance shutdown with mode='smart'
20240229:12:51:20:016290 gpstop:cdw:gpadmin-[INFO]:-Coordinator segment instance directory=/data/coordinator/gpseg-1
20240229:12:51:20:016290 gpstop:cdw:gpadmin-[INFO]:-Stopping coordinator segment and waiting for user connections to finish ...
server shutting down
20240229:12:51:21:016290 gpstop:cdw:gpadmin-[INFO]:-Attempting forceful termination of any leftover coordinator process
20240229:12:51:21:016290 gpstop:cdw:gpadmin-[INFO]:-Terminating processes for segment /data/coordinator/gpseg-1
20240229:12:51:23:016530 gpstart:cdw:gpadmin-[INFO]:-Starting gpstart with args: -a -l /home/gpadmin/gpAdminLogs -d /data/coordinator/gpseg-1
20240229:12:51:23:016530 gpstart:cdw:gpadmin-[INFO]:-Gathering information and validating the environment...
20240229:12:51:23:016530 gpstart:cdw:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 7.1.0 build commit:e7c2b1f14bb42a1018ac57d14f4436880e0a0515 Open Source'
20240229:12:51:23:016530 gpstart:cdw:gpadmin-[INFO]:-Greenplum Catalog Version: '302307241'
20240229:12:51:23:016530 gpstart:cdw:gpadmin-[INFO]:-Starting Coordinator instance in admin mode
20240229:12:51:23:016530 gpstart:cdw:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /data/coordinator/gpseg-1 -l /data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -c gp_role=utility " start
20240229:12:51:23:016530 gpstart:cdw:gpadmin-[INFO]:-Obtaining Greenplum Coordinator catalog information
20240229:12:51:23:016530 gpstart:cdw:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240229:12:51:23:016530 gpstart:cdw:gpadmin-[INFO]:-Setting new coordinator era
20240229:12:51:23:016530 gpstart:cdw:gpadmin-[INFO]:-Coordinator Started...
20240229:12:51:23:016530 gpstart:cdw:gpadmin-[INFO]:-Shutting down coordinator
20240229:12:51:24:016530 gpstart:cdw:gpadmin-[INFO]:-Commencing parallel segment instance startup, please wait...
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-Process results...
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-   Successful segment starts                                            = 2
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-   Failed segment starts                                                = 0
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-Successfully started 2 of 2 segment instances 
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-----------------------------------------------------
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-Starting Coordinator instance cdw directory /data/coordinator/gpseg-1 
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=b37f5ee82ead4186_240229125123 $GPHOME/bin/pg_ctl -D /data/coordinator/gpseg-1 -l /data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -c gp_role=dispatch " start
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-Command pg_ctl reports Coordinator cdw instance active
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-Connecting to db template1 on host localhost
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-No standby coordinator configured.  skipping...
20240229:12:51:25:016530 gpstart:cdw:gpadmin-[INFO]:-Database successfully started
20240229:12:51:25:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Completed restart of Greenplum instance in production mode
20240229:12:51:25:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Creating core GPDB extensions
20240229:12:51:25:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Importing system collations
20240229:12:51:27:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Commencing parallel build of mirror segment instances
20240229:12:51:27:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Spawning parallel processes    batch [1], please wait...
..
20240229:12:51:27:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
......
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:------------------------------------------------
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Parallel process exit status
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:------------------------------------------------
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Total processes marked as completed           = 2
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Total processes marked as killed              = 0
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Total processes marked as failed              = 0
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:------------------------------------------------
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[WARN]:-*******************************************************
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[WARN]:-Scan of log file indicates that some warnings or errors
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[WARN]:-were generated during the array creation
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Please review contents of log file
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-/home/gpadmin/gpAdminLogs/gpinitsystem_20240229.log
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-To determine level of criticality
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-These messages could be from a previous run of the utility
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-that was called today!
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[WARN]:-*******************************************************
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Greenplum Database instance successfully created
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-------------------------------------------------------
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-To complete the environment configuration, please 
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-2. Add "export COORDINATOR_DATA_DIRECTORY=/data/coordinator/gpseg-1"
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-   to access the Greenplum scripts for this instance:
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-   or, use -d /data/coordinator/gpseg-1 option for the Greenplum scripts
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-   Example gpstate -d /data/coordinator/gpseg-1
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20240229.log
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-To remove instance, run gpdeletesystem utility
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-To initialize a Standby Coordinator Segment for this Greenplum instance
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Review options for gpinitstandby
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-------------------------------------------------------
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-The Coordinator /data/coordinator/gpseg-1/pg_hba.conf post gpinitsystem
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-new array must be explicitly added to this file
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-located in the /usr/local/greenplum-db-7.1.0/docs directory
20240229:12:51:34:013410 gpinitsystem:cdw:gpadmin-[INFO]:-------------------------------------------------------

All fine. The last step from the documentation is to set the time zone with “gpconfig”. To list the current time zone which is used by the system:

[gpadmin@cdw ~]$ gpconfig -s TimeZone
Values on all segments are consistent
GUC              : TimeZone
Coordinator value: Europe/Zurich
Segment     value: Europe/Zurich

To set the time zone:

[gpadmin@cdw ~]$ gpconfig -c TimeZone -v 'Europe/Zurich'
Environment Variable COORDINATOR_DATA_DIRECTORY not set!
[gpadmin@cdw ~]$ export COORDINATOR_DATA_DIRECTORY=/data/coordinator/gpseg-1/
[gpadmin@cdw ~]$ gpconfig -c TimeZone -v 'Europe/Zurich'
20240229:13:01:03:017941 gpconfig:cdw:gpadmin-[INFO]:-completed successfully with parameters '-c TimeZone -v Europe/Zurich'

That’s it for the scope of this post. In the next post we’ll look in more detail what got created and how the PostgreSQL instances interact with each other.

L’article Getting started with Greenplum – 2 – Initializing and bringing up the cluster est apparu en premier sur dbi Blog.

Getting started with Greenplum – 1 – Installation

Wed, 2024-02-28 09:14

Because PostgreSQL is fully open source, there are many forks of it. One of them is called Greenplum, which describes itself as "an advanced, fully featured, open source data warehouse, based on PostgreSQL". Sounds interesting, so let's give it a try. This will be a series of blog posts, and in this first one we're going to prepare the operating system, install the software and verify the installation afterwards.

What follows is basically a short version of the installation guide which you can find here.

One of the requirements is either to disable SELinux or to configure it properly for the Greenplum installation. As this is only a playground, let’s do it the easy way and just disable it. This can be done by setting SELinux to “disabled” in /etc/sysconfig/selinux and reboot the system (I am using Rocky Linux 9 here):

[gpadmin@rocky9-gp7-master ~]$ grep -w SELINUX /etc/sysconfig/selinux 
# SELINUX= can take one of these three values:
# NOTE: Up to RHEL 8 release included, SELINUX=disabled would also
SELINUX=disabled
[root@rocky9-gp7-master ~]$ reboot
[root@rocky9-gp7-master ~]$ getenforce 
Disabled

The same for the local firewall, either disable it or configure it properly:

[root@rocky9-gp7-master ~]$ systemctl stop firewalld
[root@rocky9-gp7-master ~]$ systemctl disable firewalld
Removed "/etc/systemd/system/multi-user.target.wants/firewalld.service".
Removed "/etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service".

To avoid DNS lookups, the hosts file on all three of my nodes looks like this:

[root@rocky9-gp7-master ~]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.122.200 rocky9-gp7-master rocky9-gp7-master.it.dbi-services.com
192.168.122.201 rocky9-gp7-segment1 rocky9-gp7-segment1.it.dbi-services.com
192.168.122.202 rocky9-gp7-segment2 rocky9-gp7-segment2.it.dbi-services.com

The first node is the so-called "Coordinator Host". This one will receive all the client requests and route them to one of the so-called "Segment Hosts". In this case there are two segment nodes and those will host the actual data.

For the kernel & system requirements, these are the recommended settings:

[root@rocky9-gp7-master ~]$ cat /etc/sysctl.conf
# kernel.shmall = _PHYS_PAGES / 2 # See Shared Memory Pages
kernel.shmall = 197951838
# kernel.shmmax = kernel.shmall * PAGE_SIZE 
kernel.shmmax = 810810728448
kernel.shmmni = 4096
vm.overcommit_memory = 2 # See Segment Host Memory
vm.overcommit_ratio = 95 # See Segment Host Memory

net.ipv4.ip_local_port_range = 10000 65535 # See Port Settings
kernel.sem = 250 2048000 200 8192
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
kernel.core_pattern=/var/core/core.%h.%t
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.ipv4.ipfrag_high_thresh = 41943040
net.ipv4.ipfrag_low_thresh = 31457280
net.ipv4.ipfrag_time = 60
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
vm.dirty_background_ratio = 0 # See System Memory
vm.dirty_ratio = 0
vm.dirty_background_bytes = 1610612736
vm.dirty_bytes = 4294967296
[root@rocky9-gp7-master ~]$ sysctl -p
[root@rocky9-gp7-master ~]$ egrep "^\*" /etc/security/limits.conf
* soft nofile 524288
* hard nofile 524288
* soft nproc 131072
* hard nproc 131072
* soft  core unlimited

Another requirement is that rc.local needs to be enabled or, in other words, it needs to be executable when the systems start up:

[root@rocky9-gp7-master ~]$ chmod +x /etc/rc.d/rc.local
[root@rocky9-gp7-master ~]$ reboot

As usual on systems which host a database, it is recommended to disable transparent huge pages (this requires a reboot as well):

[root@rocky9-gp7-master ~]$ grubby --update-kernel=ALL --args="transparent_hugepage=never"

Deactivate systemd’s IPC object removal (this is already the default on Rocky Linux 9, but anyway):

[root@rocky9-gp7-master ~]$ sed -i 's/#RemoveIPC=no/RemoveIPC=no/g' /etc/systemd/logind.conf 
[root@rocky9-gp7-master ~]$ systemctl restart systemd-logind.service

As Greenplum should run under a dedicated user, let’s create it:

[root@rocky9-gp7-master ~]$ groupadd gpadmin
[root@rocky9-gp7-master ~]$ useradd -g gpadmin -m gpadmin
[root@rocky9-gp7-master ~]$ passwd gpadmin
Changing password for user gpadmin.
New password: 
BAD PASSWORD: The password fails the dictionary check - it is based on a dictionary word
Retype new password: 
passwd: all authentication tokens updated successfully.

sudo configuration is optional, but as it makes life a lot easier, let's configure this as well for the gpadmin user:

[root@rocky9-gp7-master ~]$ grep gpadmin /etc/sudoers
gpadmin ALL=(ALL)       NOPASSWD: ALL

The installation of Greenplum is just a matter of installing the RPM, which can be downloaded from the project's GitHub repository:

[root@rocky9-gp7-master ~]$ su - gpadmin
Last login: Wed Feb 28 14:48:01 CET 2024 on pts/0
[gpadmin@rocky9-gp7-master ~]$ ls -l
total 50320
-rw-r--r-- 1 gpadmin gpadmin 51527129 Feb 28 14:50 open-source-greenplum-db-7.1.0-el9-x86_64.rpm
[gpadmin@rocky9-gp7-master ~]$ sudo dnf localinstall ./open-source-greenplum-db-7.1.0-el9-x86_64.rpm 
Rocky Linux 9 - BaseOS                                                                                14 kB/s | 4.1 kB     00:00    
Rocky Linux 9 - BaseOS                                                                               5.6 MB/s | 2.2 MB     00:00    
Rocky Linux 9 - AppStream                                                                             22 kB/s | 4.5 kB     00:00    
Rocky Linux 9 - AppStream                                                                             12 MB/s | 7.4 MB     00:00    
Rocky Linux 9 - Extras                                                                               6.7 kB/s | 2.9 kB     00:00    
Rocky Linux 9 - Extras                                                                                24 kB/s |  14 kB     00:00    
Dependencies resolved.
=====================================================================================================================================
 Package                                  Architecture       Version                                  Repository                Size
=====================================================================================================================================
Installing:
 open-source-greenplum-db-7               x86_64             7.1.0-1.el9                              @commandline              49 M
Installing dependencies:
 annobin                                  x86_64             12.12-1.el9                              appstream                977 k
 apr                                      x86_64             1.7.0-12.el9_3                           appstream                122 k
 apr-util                                 x86_64             1.6.1-23.el9                             appstream                 94 k
 apr-util-bdb                             x86_64             1.6.1-23.el9                             appstream                 12 k
...
  tar-2:1.34-6.el9_1.x86_64                                       unzip-6.0-56.el9.x86_64                                           
  zip-3.0-35.el9.x86_64                                          

Complete!

[gpadmin@rocky9-gp7-master ~]$ sudo chown -R gpadmin:gpadmin /usr/local/greenplum-db*

(the last step could also be done automatically by the package, but as it is not and the documentation recommends doing it, let's do so)

As password-less ssh is a requirement as well, let’s generate ssh keys on the coordinator node, create the authorized_keys file and then copy over the whole “.ssh” directory to the other nodes. Once this is done, password-less SSH connections should already work between the nodes:

[gpadmin@rocky9-gp7-master ~]$ ssh-keygen 
Generating public/private rsa key pair.
Enter file in which to save the key (/home/gpadmin/.ssh/id_rsa): 
Created directory '/home/gpadmin/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/gpadmin/.ssh/id_rsa
Your public key has been saved in /home/gpadmin/.ssh/id_rsa.pub
The key fingerprint is:
SHA256:+8uzanzyzxWSPhvtEHQiZ3s5+qSM9/YdUshPjEz4ojg gpadmin@rocky9-gp7-master.it.dbi-services.com
The key's randomart image is:
+---[RSA 3072]----+
|                 |
|            .    |
|          ..=..  |
|           ===+. |
|        S  .=*=+ |
|        .....*+o |
|       E..  *.+o |
|        =oo+.@o o|
|       ..=B**o+.o|
+----[SHA256]-----+

[gpadmin@rocky9-gp7-master ~]$ cat .ssh/id_rsa.pub > .ssh/authorized_keys
[gpadmin@rocky9-gp7-master ~]$ scp -r .ssh/ rocky9-gp7-segment1:/home/gpadmin/
[gpadmin@rocky9-gp7-master ~]$ scp -r .ssh/ rocky9-gp7-segment2:/home/gpadmin/
[gpadmin@rocky9-gp7-master ~]$ ssh rocky9-gp7-segment1
Last login: Wed Feb 28 15:04:18 2024 from 192.168.122.200
[gpadmin@rocky9-gp7-segment1 ~]$ 
logout
Connection to rocky9-gp7-segment1 closed.
[gpadmin@rocky9-gp7-master ~]$ ssh rocky9-gp7-segment2
Last login: Wed Feb 28 14:50:50 2024
[gpadmin@rocky9-gp7-segment2 ~]$ 
logout
Connection to rocky9-gp7-segment2 closed.

To verify the SSH setup there is a utility called "gpssh". Before using this, create a file called "hostfile_exkeys" and add all the host names which will be part of the cluster:

[gpadmin@rocky9-gp7-master ~]$ echo "rocky9-gp7-master
rocky9-gp7-segment1
rocky9-gp7-segment2" > /home/gpadmin/hostfile_exkeys

Testing the SSH setup can then be done by asking “gpssh” to execute commands on all the nodes like this:

[gpadmin@rocky9-gp7-master ~]$ /usr/local/greenplum-db/bin/gpssh -f /home/gpadmin/hostfile_exkeys -e 'ls -l /usr/local/greenplum-db'
Traceback (most recent call last):
  File "/usr/local/greenplum-db/bin/gpssh", line 32, in <module>
    from gppylib.util import ssh_utils
ModuleNotFoundError: No module named 'gppylib'

… and this fails. The reason is that the Greenplum environment is not yet set properly. This can be done by sourcing “greenplum_path.sh” into the gpadmin user’s environment:

[gpadmin@rocky9-gp7-master ~]$ tail -1 .bash_profile 
. /usr/local/greenplum-db/greenplum_path.sh
[gpadmin@rocky9-gp7-master ~]$ /usr/local/greenplum-db/bin/gpssh -f /home/gpadmin/hostfile_exkeys -e 'ls -l /usr/local/greenplum-db'

This fails again with:

Traceback (most recent call last):
  File "/usr/local/greenplum-db/bin/gpssh", line 32, in <module>
    from gppylib.util import ssh_utils
  File "/usr/local/greenplum-db-7.1.0/lib/python/gppylib/util/ssh_utils.py", line 13, in <module>
    from gppylib.commands.unix import Hostname, Echo
  File "/usr/local/greenplum-db-7.1.0/lib/python/gppylib/commands/unix.py", line 18, in <module>
    from pkg_resources import parse_version
ModuleNotFoundError: No module named 'pkg_resources'

The reason is that the python3-setuptools package is not installed on the system. So, let's do this and try again:

[gpadmin@rocky9-gp7-master ~]$ sudo dnf install -y python3-setuptools
[gpadmin@rocky9-gp7-master ~]$ /usr/local/greenplum-db/bin/gpssh -f /home/gpadmin/hostfile_exkeys -e 'ls -l /usr/local/greenplum-db'
[rocky9-gp7-segment1] ls -l /usr/local/greenplum-db
[rocky9-gp7-segment1] lrwxrwxrwx 1 gpadmin gpadmin 29 Feb 28 14:53 /usr/local/greenplum-db -> /usr/local/greenplum-db-7.1.0
[  rocky9-gp7-master] ls -l /usr/local/greenplum-db
[  rocky9-gp7-master] lrwxrwxrwx 1 gpadmin gpadmin 29 Feb 28 14:52 /usr/local/greenplum-db -> /usr/local/greenplum-db-7.1.0
[rocky9-gp7-segment2] ls -l /usr/local/greenplum-db
[rocky9-gp7-segment2] lrwxrwxrwx 1 gpadmin gpadmin 29 Feb 28 14:53 /usr/local/greenplum-db -> /usr/local/greenplum-db-7.1.0

Now everything looks fine and we can proceed with creating the “Data Storage Areas”, but this will be the topic of the next post.

L’article Getting started with Greenplum – 1 – Installation est apparu en premier sur dbi Blog.

SQL Server: Manage large data ranges using partitioning

Mon, 2024-02-26 10:59
Introduction: 

When it comes to moving ranges of data with many rows across different tables in SQL-Server, the partitioning functionality of SQL-Server can provide a good solution for manageability and performance optimization. In this blog we will look at the different advantages and the concept of partitioning in SQL-Server.

Concept overview: 

In SQL-Server, partitioning can divide the data of an index or table into smaller units. These units are called partitions. For that purpose, every row is assigned to a range and every range in turn is assigned to a specific partition. Practically there are two main components: The partition function and the partition scheme.  

The partition function defines the range borders through boundary values and thus, together with the actual data values, the number of partitions. You can define a partition function either as "range right" or "range left". The main difference is how the boundary value is treated. In a range right partition function, the boundary value is the first value of the next partition, while in a range left partition function the boundary value is the last value of the previous partition. For example:

We want to partition a table by year, and the column on which we want to apply the partition function has the datatype "date". In total we have entries for the years 2023 and 2024, which means we want two partitions. In a range right function, the boundary value must be the first day of the year 2024, whereas in a range left function the boundary value must be the last day of the year 2023.

See example below: 

(Screenshots: partition right / partition left)
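
As a sketch of these two variants, the corresponding partition functions could look like this (the function names are made up for this illustration):

CREATE PARTITION FUNCTION pf_year_right (date)
AS RANGE RIGHT FOR VALUES ('2024-01-01');   -- 2024-01-01 is the first value of the 2024 partition

CREATE PARTITION FUNCTION pf_year_left (date)
AS RANGE LEFT FOR VALUES ('2023-12-31');    -- 2023-12-31 is the last value of the 2023 partition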

The partition scheme is used to map the different partitions, which are defined through the partition function, to one or more filegroups.

Main benefits of partitioning:

There are multiple scenarios where performance or manageability of a data model can be increased through partitioning. The main advantage of partitioning is that it reduces the contention on the whole table as a database object and restricts it to the partition level when performing operations on the corresponding data range. Partitioning also facilitates data transfer with the "switch partition" statement, which performs a switch-in or switch-out of a whole partition. Through that, a large amount of data can be transferred very quickly.

Demo Lab:

For demo purposes I created the following script, which will create three tables with 5 million rows of historical data from 11 years in the past until today:

USE [master] 
GO 
 
--Create Test Database 
CREATE DATABASE [TestPartition] 
GO 
 
--Change Recovery Model 
ALTER DATABASE [TestPartition] SET RECOVERY SIMPLE WITH NO_WAIT 
GO 
 
--Create Tables 
Use [TestPartition] 
GO 
 
CREATE TABLE [dbo].[Table01_HEAP]( 
[Entry_Datetime] [datetime] NOT NULL, 
[Entry_Text] [nvarchar](50) NULL 
) 
GO 
 
CREATE TABLE [dbo].[Table01_CLUSTEREDINDEX]( 
[Entry_Datetime] [datetime] NOT NULL, 
[Entry_Text] [nvarchar](50) NULL 
) 
GO 
 
CREATE TABLE [dbo].[Table01_PARTITIONED]( 
[Entry_Datetime] [datetime] NOT NULL, 
[Entry_Text] [nvarchar](50) NULL 
) 
GO 
 
--GENERATE DATA 
 
declare @date as datetime 
 
declare @YearSubtract int 
declare @DaySubtract int 
declare @HourSubtract int 
declare @MinuteSubtract int  
declare @SecondSubtract int  
declare @MilliSubtract int 
 
--Specifiy how many Years backwards data should be generated 
declare @YearsBackward int 
set @YearsBackward = 11 
 
--Specifiy how many rows of data should be generated 
declare @rows2generate int 
set @rows2generate = 5000000 
 
 
declare @counter int 
set @counter = 1 
 
--generate data entries 
while @counter <= @rows2generate  
begin 
 
--Year 
Set @YearSubtract = floor(rand() * (@YearsBackward - 0 + 1)) + 0 
--Day 
Set @DaySubtract = floor(rand() * (365 - 0 + 1)) + 0 
--Hour 
Set @HourSubtract = floor(rand() * (24 - 0 + 1)) + 0 
--Minute 
Set @MinuteSubtract = floor(rand() * (60 - 0 + 1)) + 0 
--Second 
Set @SecondSubtract = floor(rand() * (60 - 0 + 1)) + 0 
--Milisecond 
Set @MilliSubtract = floor(rand() * (1000 - 0 + 1)) + 0 
 
 
set @date = Dateadd(YEAR, -@YearSubtract , Getdate()) 
set @date = Dateadd(DAY, -@DaySubtract , @date) 
set @date = Dateadd(HOUR, -@HourSubtract , @date) 
set @date = Dateadd(MINUTE, -@MinuteSubtract , @date) 
set @date = Dateadd(SECOND, -@SecondSubtract , @date) 
set @date = Dateadd(MILLISECOND, @MilliSubtract , @date) 
 
insert into Table01_HEAP (Entry_Datetime, Entry_Text) 
Values (@date, 'This is a entry from ' + convert(nvarchar, @date, 29)) 
 
set @counter = @counter + 1 
 
end 
 
--COPY DATA TO OTHER TABLES 
 
INSERT INTO dbo.Table01_CLUSTEREDINDEX 
  (Entry_Datetime, Entry_Text) 
SELECT Entry_Datetime, Entry_Text 
  FROM Table01_HEAP 
 
INSERT INTO dbo.Table01_PARTITIONED 
  (Entry_Datetime, Entry_Text) 
SELECT Entry_Datetime, Entry_Text 
  FROM Table01_HEAP 
 
--Create Clustered Indexes for dbo.Table01_CLUSTEREDINDEX and dbo.Table01_PARTITIONED 
 
CREATE CLUSTERED INDEX [ClusteredIndex_Table01_CLUSTEREDINDEX] ON [dbo].[Table01_CLUSTEREDINDEX] 
( 
[Entry_Datetime] ASC 
 
) on [PRIMARY] 
GO 
 
CREATE CLUSTERED INDEX [ClusteredIndex_Table01_PARTITIONED] ON [dbo].[Table01_PARTITIONED] 
( 
[Entry_Datetime] ASC 
 
) on [PRIMARY] 
GO 

The tables contain the same data. The difference between them is that one is a heap, one has a clustered index, and one has a clustered index which will be partitioned in the next step:

(Screenshot: generated data)

After the tables are created with the corresponding data and indexes, the partition function and scheme must be created. This was done by the following script:

-- Create Partition Function as range right for every Year -10 Years 
Create Partition Function [myPF01_datetime] (datetime) 
AS Range Right for Values ( 
DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) + 0, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 1, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 2, 0), 
DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 3, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 4, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 5, 0),  
DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 6, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 7, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 8, 0),  
DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 9, 0), DATEADD(yy, DATEDIFF(yy, 0, GETDATE()) - 10, 0) 
); 
GO 
 
-- Create Partition Scheme for Partition Function myPF01_datetime 
CREATE PARTITION SCHEME [myPS01_datetime] 
AS PARTITION myPF01_datetime ALL TO ([PRIMARY]) 
GO 

I have used the DATEADD() function in combination with the DATEDIFF() function to retrieve the first millisecond of each year as a datetime value, going back 10 years, and used these as range right boundary values. It is of course also possible to hard-code the boundary values, like '2014-01-01 00:00:00.000', but I prefer to keep it as dynamic as possible. The end result is the same:

(Screenshot: SELECT with the DATEADD function)

After creating the partition function, I have created the partition scheme. The partition scheme is mapped to the partition function. In my case I assign every partition to the primary filegroup. It is also possible to split the partitions across multiple filegroups.

Once the partition function and scheme are created successfully, they can be applied to the existing table Table01_PARTITIONED. To achieve that, the clustered index of the table must be recreated on the partition scheme instead of the primary filegroup:

-- Apply partitiononing on Table: Table01_PARTITIONED through recreating the Tables Clustered Index ClusteredIndex_Table01_PARTITIONED on Partition Scheme myPS01_datetime 
CREATE CLUSTERED INDEX [ClusteredIndex_Table01_PARTITIONED] ON [dbo].[Table01_PARTITIONED] 
( 
[Entry_Datetime] ASC 
 
) with (DROP_EXISTING = ON) on myPS01_datetime(Entry_Datetime);  
GO 

After doing that, the table Table01_PARTITIONED has multiple partitions while the other tables still have only one partition:

(Screenshots: partitions of the partitioned table / partitions of the clustered index table)

There are 12 partitions in total: one for every entry with a datetime earlier than 2014-01-01 00:00:00.000, one for each year from 2014 to 2023, and one for every entry with a datetime of 2024-01-01 00:00:00.000 or later. Partition nr. 1 holds the earliest data and partition nr. 12 the latest. See below:

(Screenshots: content of partition 1 / content of partition 12)

DEMO Tests:

First, I want to compare the performance when moving outdated data, which is older than 2014-01-01 00:00:00.000, from the table itself to a history table. For that purpose, I created a history table with the same data structure as the table Table01_CLUSTEREDINDEX:

Use [TestPartition] 
GO 
 
--Create History Table 
CREATE TABLE [dbo].[Table01_HISTORY01]( 
[Entry_Datetime] [datetime] NOT NULL, 
[Entry_Text] [nvarchar](50) NULL 
) 
GO 
 
--Create Clustered Indexes for dbo.Table01_HISTORY01 
CREATE CLUSTERED INDEX [ClusteredIndex_Table01_HISTORY01] ON [dbo].[Table01_HISTORY01] 
( 
[Entry_Datetime] ASC 
 
) on [PRIMARY] 
GO 

I start with the table with the clustered index, using a classic "insert into select" statement:

(Screenshot: select insert data into history)
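
The screenshot shows the statement together with its statistics; as a sketch, such an insert could look like this (the filter value matches the cut-off date mentioned above, the exact statement on the screenshot may differ in details):

INSERT INTO dbo.Table01_HISTORY01 (Entry_Datetime, Entry_Text)
SELECT Entry_Datetime, Entry_Text
  FROM dbo.Table01_CLUSTEREDINDEX
 WHERE Entry_Datetime < '2014-01-01 00:00:00.000';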

We can see that we have 10932 reads in total and a total query run time of 761 milliseconds.

(Screenshot: execution plan of the select insert)

In the execution plan, we can see that a classic index seek operation occurred, which means the database engine seeked every row with a datetime value before 2014-01-01 00:00:00.000 and wrote it into the history table.

For the delete operation we can see similar results:

(Screenshots: delete rows)
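
As a sketch, the corresponding delete on the source table could look like this (same cut-off date as above):

DELETE FROM dbo.Table01_CLUSTEREDINDEX
 WHERE Entry_Datetime < '2014-01-01 00:00:00.000';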

In total, 785099 rows were moved, and the table Table01_CLUSTEREDINDEX no longer contains entries older than 2014-01-01 00:00:00.000:

(Screenshot: verify table content)

Next let us compare the data movement when using a “switch partition” statement. For switching a partition from a partitioned source table to a nonpartitioned destination table, we need to use the partition number of the source table. For that I run the following query:

(Screenshot: switch partition)
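
As a sketch, such a switch-out of partition 1 could look like the statement below; it assumes an empty, non-partitioned target table with the same structure and clustered index on the same filegroup (the table name Table01_HISTORY02 is made up for this illustration):

-- Switch the whole partition 1 (all data older than 2014) into an empty history table
ALTER TABLE dbo.Table01_PARTITIONED
SWITCH PARTITION 1 TO dbo.Table01_HISTORY02;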

We can see that the partition number 1 was moved within 2 milliseconds. Compared to the previous query, where it took 761 milliseconds to insert the data and an additional 596 milliseconds to delete it, the switch partition operation is obviously much faster. But why is this the case? That's because switching partitions is a metadata operation. It does not seek through an index (or, even worse, scan a table) and write every row one by one; instead it changes the metadata of the partition and remaps the partition to the target table.

And as we can see, we have the same result:

(Screenshot: verify table content)

Another big advantage is when it comes to deleting a whole data range. For example: Let us delete the entries of the year 2017 – we do not need them anymore.

For the table with the clustered Index, we must use a statement like this:

(Screenshot: delete operation)
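
As a sketch, such a delete of the whole year 2017 on the non-partitioned table could look like this:

DELETE FROM dbo.Table01_CLUSTEREDINDEX
 WHERE Entry_Datetime >= '2017-01-01 00:00:00.000'
   AND Entry_Datetime <  '2018-01-01 00:00:00.000';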

We can see that we have here a query runtime of 355 milliseconds and 68351 page reads in total for the delete operation with the clustered index. 

For the partitioned table instead, we can use a truncate operation on the specific partition. That's because the partition is treated as its own physical unit and can therefore be truncated.

And as we know, truncating is much faster, because this operation deallocates the pages and only logs the page deallocations, while a delete operation goes row by row and writes every row deletion to the transaction log.

So, let us try. The year 2017 is 7 years back, so let us verify that the right data range will be deleted:

(Screenshot: verify partition content)
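
As a sketch, such a verification query could use the $PARTITION function, which returns the partition number a given value maps to:

SELECT MIN(Entry_Datetime) AS min_date,
       MAX(Entry_Datetime) AS max_date
  FROM dbo.Table01_PARTITIONED
 WHERE $PARTITION.myPF01_datetime(Entry_Datetime) = 5;   -- partition nr. 5 should only contain 2017 data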

With the short query above we can see that 7 years back corresponds to partition nr. 5, and the data range seems to be right. So, let us truncate:

(Screenshot: truncate partition)
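
As a sketch, the partition-level truncate (available since SQL Server 2016) could look like this:

TRUNCATE TABLE dbo.Table01_PARTITIONED WITH (PARTITIONS (5));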

And we can see that to truncate all the entries from the year 2017, the database engine took 1 millisecond compared to the 355 milliseconds for the delete operation: again much faster.

Next: let’s see, how we can change the lock behavior of SQL-Server through partitioning. For that I ran the following update query for updating every entry text for dates which are younger than May 2018:

(Screenshot: update data entries)
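
As a sketch, such an update against the heap table could look like this (the new text value is made up for this illustration):

UPDATE dbo.Table01_HEAP
   SET Entry_Text = 'Updated entry'
 WHERE Entry_Datetime >= '2018-05-01 00:00:00.000';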

While the update operation above was running, I queried the DMV sys.dm_tran_locks in another session for checking the locks my update operation above is holding: 

(Screenshot: lock contention)
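
As a sketch, the query against the DMV in the second session could look like this (the session id is an example value):

SELECT resource_type,
       resource_associated_entity_id,
       request_mode,
       request_status
  FROM sys.dm_tran_locks
 WHERE request_session_id = 55;   -- session id of the running update (example value)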

And we can see that we have a lot of page locks and also an exclusive lock on the object itself (in this case Table01_HEAP). That is because of SQL-Server's lock escalation behavior.

I ran the same update operation on the partitioned table, but before that I changed the lock escalation setting of the table from the default value "TABLE" to "AUTO". This is necessary to enable locking at the partition level:

(Screenshot: update lock escalation)
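
As a sketch, the lock escalation change and the same update against the partitioned table could look like this:

ALTER TABLE dbo.Table01_PARTITIONED SET (LOCK_ESCALATION = AUTO);

UPDATE dbo.Table01_PARTITIONED
   SET Entry_Text = 'Updated entry'
 WHERE Entry_Datetime >= '2018-05-01 00:00:00.000';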

And when I’m querying the dmv again while the update operation above is running, I get the following result:

(Screenshot: lock contention)

We can see that we no longer have an exclusive lock at the object level; instead we have an intent exclusive lock, which will not prevent other transactions from accessing the data (as long as there is no conflicting lock at a more granular level). In addition, we have multiple exclusive locks on resources of type HOBT. And when we take the "resource_associated_entity_id" values and use them to query the sys.partitions table, we see the following information:

(Screenshot: locked partitions)
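
As a sketch, mapping the locked HOBT resources back to their partitions could be done like this (the hobt_id values are example values taken from resource_associated_entity_id):

SELECT OBJECT_NAME(object_id) AS table_name,
       index_id,
       partition_number,
       hobt_id
  FROM sys.partitions
 WHERE hobt_id IN (72057594043564032, 72057594043629568);   -- example values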

These resources locked through the update operation on the partitioned table are the partitions associated with the table. So, SQL-Server locked the partitions instead of locking the whole table. This has the advantage that locking happens in a more granular context which prevents lock contention on the table itself. 

Conclusion:

Partitioning can be a very powerful and useful functionality in SQL-Server when used in an appropriate situation. Especially when it comes to regular operations on whole data ranges, partitioning can be used to enhance performance and manageability. With partitioning, it's also possible to distribute the data of a table over multiple filegroups. Additionally, with splitting and merging partitions it's possible to maintain partitions for growing or shrinking data.

L’article SQL Server: Manage large data ranges using partitioning est apparu en premier sur dbi Blog.

Kubernetes Networking by Using Cilium – Intermediate Level – Traditional Linux Routing

Mon, 2024-02-26 02:34

Welcome back to this blog post series about Kubernetes Networking by using Cilium. In the previous post about network interfaces, we've looked at how we can identify all the interfaces that will be involved in the routing between pods. I've also explained the routing in a Kubernetes cluster with Cilium in non-technical language in this blog post. Let's now see it in action for the techies!

Below is the drawing of where we left off:

We will continue to use the same method: you are the packet that will travel from your apartment (pod) 10.10.2.117 on the top left to other pods in this Kubernetes cluster. But first, let's take this opportunity to talk about namespaces and enrich our drawing with a new analogy.

Routing between namespaces

A namespace is a logical group of objects that provides isolation in the cluster. However, by default, all pods can communicate with each other in a "vanilla" Kubernetes, whether they belong to the same namespace or not. So the isolation provided by a namespace doesn't mean the pods can't communicate. To allow or deny such communication, you will need to create network policies. That could be the topic for another blog post!

We can use the analogy of a namespace being the same floor number across all buildings of our cluster. All apartments on the same floor in each building will be logically grouped into the same namespace. This is what we can see below in our namespace called networking101:

$ kubectl get po -n networking101 -owide
NAME                        READY   STATUS    RESTARTS       AGE    IP            NODE                NOMINATED NODE   READINESS GATES
busybox-c8bbbbb84-fmhwc     1/1     Running   1 (125m ago)   4d1h   10.10.1.164   mycluster-worker2   <none>           <none>
busybox-c8bbbbb84-t6ggh     1/1     Running   1 (125m ago)   4d1h   10.10.2.117   mycluster-worker    <none>           <none>
netshoot-7d996d7884-fwt8z   1/1     Running   0              103m   10.10.2.121   mycluster-worker    <none>           <none>
netshoot-7d996d7884-gcxrm   1/1     Running   0              103m   10.10.1.155   mycluster-worker2   <none>           <none>

That’s our 4 apartments / pods on the same floor, grouped together in one namespace:

The routing process doesn’t care about the pod’s namespace, only its destination IP Address will be used. Let’s now see how we can go from the apartment 10.10.2.117 to the apartment 10.10.2.121 in the same building (node).

Pod to pod routing on the same node

From the pod 10.10.2.117, you’ve then decided to go to pay a visit to 10.10.2.121. You first look at the routing table in order to know how to reach this destination. But you can’t go out if you don’t also have the MAC Address of your destination. You need both destination information (IP Address and MAC Address) before you can start to travel. You then look at the ARP table to find out this information. The ARP table contains the known mapping of a MAC Address to an IP Address in your IP subnet. If it is not there, you send first a scout to knock at the door of each apartment in your community until you find the MAC Address of your destination. This is called the ARP request. When the scout comes back with that information, you write it into the ARP table. You thank the scout for his help and are now ready to start your travel by exiting the pod.

Let’s see how we can trace this in our source pod 10.10.2.117

$ kubectl exec -it -n networking101 busybox-c8bbbbb84-t6ggh -- ip route
default via 10.10.2.205 dev eth0
10.10.2.205 dev eth0 scope link

Very simple routing instruction! For every destination, you go through 10.10.2.205 by using your only network interface eth0 in the pod. You can see from the drawing above that 10.10.2.205 is the IP Address of the cilium_host. You then check your ARP table:

$ kubectl exec -it -n networking101 busybox-c8bbbbb84-t6ggh -- arp -a

The arp -a command lists the content of the ARP table and we can see there is nothing in there.

A way to send a scout out is by using the arping tool toward our destination. You may have noticed that for my pods I’m using busybox and netshoot images. Both provide networking tools that are useful for troubleshooting:

$ kubectl exec -it -n networking101 busybox-c8bbbbb84-t6ggh -- arping 10.10.2.121
ARPING 10.10.2.121 from 10.10.2.117 eth0
Unicast reply from 10.10.2.121 [d6:21:74:eb:67:6b] 0.028ms
Unicast reply from 10.10.2.121 [d6:21:74:eb:67:6b] 0.092ms
Unicast reply from 10.10.2.121 [d6:21:74:eb:67:6b] 0.123ms
^CSent 3 probe(s) (1 broadcast(s))
Received 3 response(s) (0 request(s), 0 broadcast(s))

We now have the piece of information that was missing, the MAC address of our destination. We can then just check it is written into our ARP table of our source pod:

$ kubectl exec -it -n networking101 busybox-c8bbbbb84-t6ggh -- arp -a
? (10.10.2.205) at d6:21:74:eb:67:6b [ether]  on eth0

Here it is! However, you may wonder why we don't see the IP Address of our destination 10.10.2.121 here. In traditional networking this is what you would see, but here we are in a Kubernetes cluster and Cilium is taking care of the networking. Also, we have seen above from the routing table of the source pod that for every destination we go to this cilium_host interface.

So the cilium_host on that node is attracting all the traffic even for communication between pods in the same IP subnet.

As a side note, below is a command where you can quickly display all the IP Addresses of the cilium_host and the nodes in your cluster in one shot:

$ kubectl get ciliumnodes
NAME                      CILIUMINTERNALIP   INTERNALIP   AGE
mycluster-control-plane   10.10.0.54         172.18.0.3   122d
mycluster-worker          10.10.2.205        172.18.0.2   122d
mycluster-worker2         10.10.1.55        172.18.0.4   122d

In traditional networking, doing L2 switching, the MAC Address of the destination is the one related to the destination IP Address. That is not the case here in Kubernetes networking. So which interface has the MAC Address d6:21:74:eb:67:6b ? Let’s respond to that question immediately:

$ sudo docker exec -it mycluster-worker ip a | grep -iB1 d6:21:74:eb:67:6b
9: lxc4a891387ff1a@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d6:21:74:eb:67:6b brd ff:ff:ff:ff:ff:ff link-netns cni-67a5da05-a221-ade5-08dc-64808339ad05

That is the LXC interface of the node, as it is indeed our next step from the source pod to reach our destination. You've learned from the first blog post of this networking series that there is a servant waiting here at the LXC interface to direct us toward our destination.

From there, we don’t see much of the travel to the destination from the traditional Linux routing point of view. This is because the routing is done by the Cilium agent using eBPF. As the destination is in the same IP subnet as the source, the Cilium agent just switch it directly to the destination LXC interface and then reach the destination pod.

When the destination pod responds to the source, the same process occurs and for the sake of completeness let’s look at the routing table and ARP table in the destination pod:

$ kubectl exec -it -n networking101 netshoot-7d996d7884-fwt8z -- ip route
default via 10.10.2.205 dev eth0 mtu 1450
10.10.2.205 dev eth0 scope link

$ kubectl exec -it -n networking101 netshoot-7d996d7884-fwt8z -- arp -a
? (10.10.2.205) at 92:65:df:09:dd:28 [ether]  on eth0

$ sudo docker exec -it mycluster-worker ip a | grep -iB1 92:65:df:09:dd:28
13: lxce84a702bb02c@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 92:65:df:09:dd:28 brd ff:ff:ff:ff:ff:ff link-netns cni-d259ef79-a81c-eba6-1255-6e46b8d1c779

So from the traditional Linux routing point of view, everything goes to the cilium_host and the destination MAC address is that of the LXC interface of the node that is linked to our pod. This is exactly the same as we have seen with our source pod.

Pod to pod routing on a different node

Let’s now have a look at how we could reach the pod 10.10.1.155 from the source pod 10.10.2.117 which is hosted in another node. The routing is the same at the beginning but when talking to the servant at the LXC interface, he sees that the destination IP Address doesn’t belong to the same IP subnet and so directs us to the cilium_host in the Lobby. From there we are routed to the cilium_vxlan interface to reach the node that host our destination pod.

Let’s now have a look at the routing table of the host:

$ sudo docker exec -it mycluster-worker ip route
default via 172.18.0.1 dev eth0
10.10.0.0/24 via 10.10.2.205 dev cilium_host proto kernel src 10.10.2.205 mtu 1450
10.10.1.0/24 via 10.10.2.205 dev cilium_host proto kernel src 10.10.2.205 mtu 1450
10.10.2.0/24 via 10.10.2.205 dev cilium_host proto kernel src 10.10.2.205
10.10.2.205 dev cilium_host proto kernel scope link
172.18.0.0/16 dev eth0 proto kernel scope link src 172.18.0.2

We don’t see much here as the routing is using eBPF and is managed by the Cilium agent as we’ve seen before.

As a side note and to share everything with you, the output of the network interfaces as well as the ip route in the Cilium agent pod is identical to that of the node. This is because at startup the Cilium agent provides this information to the node. You can check the Cilium agent with the following commands:

$ kubectl exec -it -n kube-system cilium-dprvh -- ip a
$ kubectl exec -it -n kube-system cilium-dprvh -- ip route

So you go through the VXLAN tunnel and you reach the node mycluster-worker2. Here is the routing table of this node:

$ sudo docker exec -it mycluster-worker2 ip route
default via 172.18.0.1 dev eth0
10.10.0.0/24 via 10.10.1.55 dev cilium_host proto kernel src 10.10.1.55 mtu 1450
10.10.1.0/24 via 10.10.1.55 dev cilium_host proto kernel src 10.10.1.55
10.10.1.55 dev cilium_host proto kernel scope link
10.10.2.0/24 via 10.10.1.55 dev cilium_host proto kernel src 10.10.1.55 mtu 1450
172.18.0.0/16 dev eth0 proto kernel scope link src 172.18.0.4

Again, from the traditional Linux routing point of view there isn't much to see, except that all the traffic for the pod subnets goes to the cilium_host that is managed by the Cilium agent. This is identical to what we've seen on the other node. When we reach the cilium_vxlan interface, a servant is waiting for us with his magical eBPF map and directs us through a secret passage to the LXC corridor interface of the top left pod where we can reach our destination.

Wrap up

We’ve explored all that can be seen in routing from the traditional Linux point of view by using the common networking tools.

Maybe you feel frustrated to not understand it completely because there are some gaps in this step-by-step packet routing? Cilium uses eBPF for routing the packets so it adds some complexity to the routing understanding. However it is much faster than the traditional Linux routing due to the secret passages opened by the eBPF servants.

If you want to know more about this, don’t miss my next blog post where I’ll dive deep into the meanders of eBPF routing. See you there!

L’article Kubernetes Networking by Using Cilium – Intermediate Level – Traditional Linux Routing est apparu en premier sur dbi Blog.

Physical Online Migration to ExaCC with Oracle Zero Downtime Migration (ZDM)

Sun, 2024-02-25 14:06

A while ago I had been testing and blogging about ZDM, see my previous articles. I finally had the chance to implement it at one of our customers to migrate on-premises databases to Exadata Cloud @Customer. After implementing a Logical Offline migration with ZDM, see my previous article, https://www.dbi-services.com/blog/logical-offline-migration-to-exacc-with-oracle-zero-downtime-migration-zdm/, I had the opportunity to implement a Physical Online Migration and to test it with one database taken as a pilot. In this article I would like to share the experience I gained from migrating an on-premises database to ExaCC using ZDM Physical Online Migration. This method uses Data Guard, so it can only be used to migrate Oracle Enterprise Edition databases. We call it Physical Online because ZDM creates a standby database, either from a backup or with Direct Data Transfer (active duplication with RMAN or restore from service), and synchronises it with the primary. During all the preparation the database is still available for the application, and the maintenance window is shorter: downtime is only needed for the switchover operation. Of course ZDM can include the non-CDB to PDB conversion, which makes the window a little bit longer. Physical Online is the only ZDM method that includes a fallback. We intended to use this method at the customer as the mandatory one for large Oracle EE databases and the preferred one for small Oracle EE databases.

Introduction

The on-premises databases are single-tenant (non-CDB) databases, all running version 19.10.

The target databases are Oracle RAC databases running on ExaCC with Oracle version 19.21.

The Oracle Net port used on the on-premise site is 13000 and the Oracle Net port used on the ExaCC is 1521.

We will use ZDM to migrate the on-premises single-tenant database to a PDB within a CDB. ZDM will be in charge of migrating the database to the ExaCC using Data Guard, running datapatch, converting the non-CDB database to a PDB within a target CDB, and upgrading the time zone. The creation of the standby database will be done through a direct connection, without any backup.

Of course I have anonymised all outputs to remove customer infrastructure names, so let's use the following convention.

ExaCC Cluster 01 node 01 : ExaCC-cl01n1
ExaCC Cluster 01 node 02 : ExaCC-cl01n2
On premises Source Host : vmonpr
Target db_unique_name on the ExaCC : ONPR_RZ2
Database Name to migrate : ONPR
ZDM Host : zdmhost
ZDM user : zdmuser
Domain : domain.com
ExaCC PDB to migrate to : ONPRZ_APP_001T

We will then migrate the on-premises single-tenant database, named ONPR, to a PDB on the ExaCC. The PDB will be named ONPRZ_APP_001T.

Ports

It is important to mention that the following ports are needed:

Source               Destination                              Port
ZDM Host             On-premise Host                          22
ZDM Host             ExaCC VM (both nodes)                    22
On-premise Host      ExaCC VM (scan listener and vip)         Oracle Net (1521)
ExaCC                On-premise Host                          Oracle Net

If, for example, the Oracle Net ports are not opened from the ExaCC to the on-premise host, the migration evaluation will immediately stop at one of the first steps, named ZDM_PRECHECKS_TGT, and the following errors will be found in the log file:

PRGZ-1132 : -eval failed for the phase ZDM_PRECHECKS_TGT with exception
PRGZ-3176 : a database connection cannot be established from target node ExaCC-cl01n1 to source node vmonpr
PRCC-1021 : One or more of the submitted commands did not execute successfully.
PRCZ-2103 : Failed to execute command "/u02/app/oracle/product/19.0.0.0/dbhome_2/bin/tnsping" on node "ExaCC-cl01n1" as user "root". Detailed error:
TNS Ping Utility for Linux: Version 19.0.0.0.0 - Production on 06-FEB-2024 10:06:37

Copyright (c) 1997, 2023, Oracle.  All rights reserved.

Used parameter files:

Used HOSTNAME adapter to resolve the alias
Attempting to contact (DESCRIPTION=(CONNECT_DATA=(SERVICE_NAME=))(ADDRESS=(PROTOCOL=tcp)(HOST=X.X.X.15)(PORT=13000)))
TNS-12535: TNS:operation timed out
PRCC-1025 : Command submitted on node ExaCC-cl01n2 timed out after 60 seconds..

If the Oracle Net ports are not opened from the on-premise host to the ExaCC, the migration evaluation will immediately stop at another step, named ZDM_PRECHECKS_SRC, and the following errors will be found in the log file:

PRGZ-1132 : -eval failed for the phase ZDM_PRECHECKS_SRC with exception
PRGZ-3130 : failed to establish connection to target listener from nodes [vmonpr]
PRCC-1021 : One or more of the submitted commands did not execute successfully.
PRCC-1025 : Command submitted on node vmonpr timed out after 15 seconds...

Requirements… To be known before starting

There are a few requirements that need to be met.

ZDM Host
  • SSH connection allowed between Source and Target host
  • SSH authentication key pairs without passphrase should be established and tested for the user between ZDM host and both source and target database
Source Database
  • Both source and target need to be on the same release version.
  • Transparent Data Encryption (TDE) wallet must be configured (even if source database is not encrypted)
  • WALLET_TYPE should be set to either AUTOLOGIN or PASSWORD
  • Wallet STATUS should be OPEN
  • Wallet should be opened on all PDB in case the source is a container Database
  • The master key must be set for all the PDB and the container database
  • In case the source database is a RAC database, SNAPSHOT CONTROLFILE must be configured on a shared location on all cluster nodes
  • SCAN listener/listener connections allowed on both source and target DB
  • DB_UNIQUE_NAME parameter must be different than target database
  • SYSPASSWORD must be the same on the source and target database
Target Database
  • Database must be created prior the migration
  • Database release version should match source version.
  • The target database patch level should also be the same or higher than the source database. In case the target database patch level is higher, ZDM can be configured to run datapatch on the target database. Target database patch level can not be lower than source database.
  • For Oracle RAC databases, SSH connectivity between nodes for the oracle user should be setup
  • Storage size should be sufficient (same as source database)
  • DB_NAME parameter must be the same than the source database
  • DB_UNIQUE_NAME parameter must be different than the one on the source database
  • Automatic backups should be disabled (for ExaC@C section configure backups, option backup destination, none should be selected)
  • TDE should be activated
  • Wallet should be open and WALLET_TYPE should be set to either AUTOLOGIN or PASSWORD
  • SYSPASSWORD must be the same on the source and target database
Others
  • Ensure that all ports have been opened.
  • Oracle NET Services should be configured and tested on both source and target database for Data Guard synchronisation and deployment of the standby database with active duplication
  • We will need to go through a temporary multitenant database on the ExaCC which will have the same DB_NAME as the source and a different DB_UNIQUE_NAME. This CDB will host the final PDB.
  • The final PDB can then be relocated to the appropriate final CDB on the ExaCC later on.
  • ZDM will create its own temporary database with DB_NAME as source database and DB_UNIQUE_NAME as final PDB name to build the Data Guard and will remove it during cleanup phase
Prepare ZDM Physical Online Response file

We will prepare the ZDM response file by copying the template provided by ZDM:

[zdmuser@zdmhost migration]$ cp -p /u01/app/oracle/product/zdm/rhp/zdm/template/zdm_template.rsp ./zdm_ONPR_physical_online.rsp

The main parameters to take care of are:

• TGT_DB_UNIQUE_NAME: Target database DB_UNIQUE_NAME. For cloud types Exadata Cloud at Customer (ExaCC) Gen2 and Exadata Cloud Service (ExaCS): db_name must be the same as the source database db_name, and db_unique_name must be unique so that Oracle Data Guard can identify the target as a different database from the source.
• MIGRATION_METHOD: Specifies whether the migration uses Oracle Data Guard (online) or backup and restore (offline). We are doing an online migration, so this parameter is set to ONLINE_PHYSICAL.
• DATA_TRANSFER_MEDIUM: Specifies the medium used to create the standby database, either a backup (using NFS or ZDLRA for example) or a direct connection where the standby is instantiated directly from the source over SQL*Net (duplicate from active or restore from service). We choose DIRECT as we are doing a physical direct data transfer.
• PLATFORM_TYPE: Target platform type. EXACC in our case.
• SRC_PDB_NAME: Source database PDB name. Not needed here as all our on-premises databases are single-tenant.
• SRC_DB_LISTENER_PORT: To be used for a standalone database (no Grid Infrastructure) configured with a non-default listener port other than 1521. 13000 in our case.
• NONCDBTOPDB_CONVERSION: Specifies whether to convert a non-CDB source to a PDB. TRUE as we want to convert our on-premises database to a PDB during the ZDM migration.
• NONCDBTOPDB_SWITCHOVER: For a physical migration using Data Guard switchover, indicates whether the switchover operations will be executed during a migration job with non-CDB to PDB conversion enabled. Default is TRUE, kept as TRUE in our case.
• SKIP_FALLBACK: If set to FALSE, redo logs will be shipped from the new primary (ExaCC) to the standby (on-premise database) once the switchover is completed, in case a fallback is needed. FALSE as we want fallback.
• TGT_RETAIN_DB_UNIQUE_NAME: Adds a new phase ZDM_RETAIN_DBUNIQUENAME_TGT (can also be ZDM_MODIFY_DBUNIQUENAME_TGT). Pause before this phase to keep the ZDM temporary database after the PDB conversion in case a fallback is needed; resume the job once all is fine and the fallback is no longer needed.
• TGT_SKIP_DATAPATCH: If set to FALSE, ZDM runs datapatch on the target database as part of the post-migration tasks. Useful when the target patch level is higher than the source. FALSE as our target version is higher than the source and we want ZDM to run datapatch.
• SHUTDOWN_SRC: Specifies whether to shut down the source database after the migration completes. FALSE in our case.
• SRC_RMAN_CHANNELS: Number of RMAN channels on the source.
• TGT_RMAN_CHANNELS: Number of RMAN channels on the destination.
• ZDM_SKIP_DG_CONFIG_CLEANUP: If FALSE, ZDM deconfigures the Data Guard parameters configured for the migration on the source and target databases at the end of the migration.
• ZDM_RMAN_DIRECT_METHOD: RMAN method to use for ONLINE_PHYSICAL direct data transfer, either RMAN active duplicate or restore from service. We kept the default, RESTORE FROM SERVICE.
• ZDM_USE_DG_BROKER: If TRUE, ZDM uses Data Guard Broker to manage the Data Guard configuration. TRUE in our case.
• ZDM_NONCDBTOPDB_PDB_NAME: When migrating a non-CDB source to a CDB target as a PDB, the PDB name to be used.
• ZDM_TGT_UPGRADE_TIMEZONE: Upgrade the target database time zone. This requires downtime for the database. TRUE in our case.
• ZDM_APPLY_LAG_MONITORING_INTERVAL: Apply lag monitoring interval used to verify that both source and target are ready for switchover. Kept at NONE.

Note that there is no parameter for the listener port on the target (ExaCC), so we assume it is hard-coded to the default port 1521.

Also note that, as we configured SHUTDOWN_SRC as FALSE, additional steps will be required to ensure that applications do not use the SOURCE (on-premise) database any more.

Updated ZDM response file compared to ZDM template for the migration we are going to run:

[zdmuser@zdmhost migration]$ diff zdm_ONPR_physical_online.rsp /u01/app/oracle/product/zdm/rhp/zdm/template/zdm_template.rsp
24c24
< TGT_DB_UNIQUE_NAME=ONPR_RZ2
---
> TGT_DB_UNIQUE_NAME=
32c32
< MIGRATION_METHOD=ONLINE_PHYSICAL
---
> MIGRATION_METHOD=
63c63
< DATA_TRANSFER_MEDIUM=DIRECT
---
> DATA_TRANSFER_MEDIUM=
75c75
< PLATFORM_TYPE=EXACC
---
> PLATFORM_TYPE=
119c119
< SRC_DB_LISTENER_PORT=13000
---
> SRC_DB_LISTENER_PORT=
230c230
< NONCDBTOPDB_CONVERSION=TRUE
---
> NONCDBTOPDB_CONVERSION=FALSE
252c252
< SKIP_FALLBACK=FALSE
---
> SKIP_FALLBACK=
268c268
< TGT_RETAIN_DB_UNIQUE_NAME=TRUE
---
> TGT_RETAIN_DB_UNIQUE_NAME=
312c312
< SHUTDOWN_SRC=FALSE
---
> SHUTDOWN_SRC=
333c333
< SRC_RMAN_CHANNELS=3
---
> SRC_RMAN_CHANNELS=
340c340
< TGT_RMAN_CHANNELS=6
---
> TGT_RMAN_CHANNELS=
526c526
< ZDM_USE_DG_BROKER=TRUE
---
> ZDM_USE_DG_BROKER=
574c574
< ZDM_NONCDBTOPDB_PDB_NAME=ONPRZ_APP_001T
---
> ZDM_NONCDBTOPDB_PDB_NAME=
595c595
< ZDM_TGT_UPGRADE_TIMEZONE=TRUE
---
> ZDM_TGT_UPGRADE_TIMEZONE=FALSE

ZDM Build Version

We are using ZDM build version 21.4:

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli -build
version: 21.0.0.0.0
full version: 21.4.0.0.0
patch version: 21.4.1.0.0
label date: 221207.25
ZDM kit build date: Jul 31 2023 14:24:25 UTC
CPAT build version: 23.7.0

Passwordless Login

Passwordless Login needs to be configured between ZDM Host, the Source Host and Target Host. See my previous blog : https://www.dbi-services.com/blog/oracle-zdm-migration-java-security-invalidkeyexception-invalid-key-format/

If Passwordless Login is not configured with one node, you will see such error in the log file during migration evaluation:

PRCZ-2006 : Unable to establish SSH connection to node "ExaCC-cl01n2" to execute command "/u02/app/oracle/product/19.0.0.0/dbhome_2/bin/tnsping vmonpr:13000"
Creation of the target database

As explained in the requirements, we must create a target CDB on the ExaCC with the same DB_NAME as the source database to be migrated but a different DB_UNIQUE_NAME. In our case it will be ONPR for the DB_NAME and ONPR_RZ2 for the DB_UNIQUE_NAME. This database must exist before the migration is started with ZDM. ZDM will create another temporary database taking the final PDB name and will use this target CDB as a template.

TDE (Transparent Data Encryption) configuration

The source database doesn’t need to be encrypted. The target database will be encrypted in any case. ZDM supports the migration of an encrypted and non-encrypted source database. The target database encryption will be taken in account during migration process. Even if the source database is not encrypted a TDE wallet still needs to be configured prior the migration as ZDM will use it to encrypt data to the target.

We need to note that downtime is required to restart the database when the wallet_root parameter needs to be configured.

Also until the migration is completed it is more than recommended that the wallet is part of the backup strategy.

Configure instance parameter

Check that the WALLET directory exits otherwise create it:

SQL> !ls /u00/app/oracle/admin/ONPR/wallet
ls: cannot access /u00/app/oracle/admin/ONPR/wallet: No such file or directory

SQL> !mkdir /u00/app/oracle/admin/ONPR/wallet

Configure instance parameter for the database wallet and restart the database:

SQL> alter system set WALLET_ROOT='/u00/app/oracle/admin/ONPR/wallet' scope=spfile;

SQL> shutdown immediate

SQL> startup

Check the wallet. No WRL_PARAMETER should be displayed. WALLET_TYPE should be unknown and STATUS not_available.

SQL> select WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

WRL_PARAMETER                  WRL_TYPE             WALLET_TYPE          STATUS
------------------------------ -------------------- -------------------- ---------------------------
                               FILE                 UNKNOWN              NOT_AVAILABLE

Configure TDE:

SQL> alter system set tde_configuration='keystore_configuration=FILE' scope=both;

System altered.

Check the Wallet. WRL_PARAMETER should be displayed with the wallet location. WALLET_TYPE should still be unknown and STATUS not_available.

SQL> select WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

WRL_PARAMETER                            WRL_TYPE             WALLET_TYPE          STATUS
---------------------------------------- -------------------- -------------------- -----------------
/u00/app/oracle/admin/ONPR/wallet/tde/   FILE                 UNKNOWN              NOT_AVAILABLE

Create keystore

Create the keystore using the appropriate ExaCC password. We recommend using the same one for source and target, although they can be different.

SQL> ADMINISTER KEY MANAGEMENT CREATE KEYSTORE IDENTIFIED BY "*********************";

keystore altered.

We now can see that we have a wallet file in the TDE wallet directory:

SQL> !ls -ltrh /u00/app/oracle/admin/ONPR/wallet/tde/
total 4.0K
-rw-------. 1 oracle dba 2.5K Feb  9 16:11 ewallet.p12

And if we check the wallet status, we can see it is still UNKNOWN for the WALLET_TYPE, but now STATUS is set to CLOSED.

SQL> select WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

WRL_PARAMETER                            WRL_TYPE             WALLET_TYPE          STATUS
---------------------------------------- -------------------- -------------------- ------------------------------
/u00/app/oracle/admin/ONPR/wallet/tde/   FILE                 UNKNOWN              CLOSED

Open the keystore

The keystore can now be opened.

SQL> ADMINISTER KEY MANAGEMENT SET KEYSTORE OPEN IDENTIFIED BY "***************";

keystore altered.

And the wallet type is now set to PASSWORD and status is OPEN_NO_MASTER_KEY.

SQL> select WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

WRL_PARAMETER                            WRL_TYPE             WALLET_TYPE          STATUS
---------------------------------------- -------------------- -------------------- ------------------------------
/u00/app/oracle/admin/ONPR/wallet/tde/   FILE                 PASSWORD             OPEN_NO_MASTER_KEY

Create and activate the master encryption key

Using the same password, we can now create and activate the master encryption key with the backup option. This will store the database master encryption key in the wallet.

SQL> ADMINISTER KEY MANAGEMENT SET KEY IDENTIFIED BY "*************" with backup;

keystore altered.

If you are running version 19.10, as we are here, you will face the following error:

ORA-28374: typed master key not found in wallet

This is related to the following bug:

Bug 31500699 – ORA-28374: typed master key not found in wallet after tablespace TDE Enabled ( Doc ID 31500699.8 )

This is not an issue and we can move forward, as the master encryption key has been created and added to the wallet anyway. The only consequence is that it will be impossible to encrypt any data. This does not matter here, as we are not encrypting the source on-premises database, and we should not, since we are not licensed for Oracle Advanced Security.

We now have a new wallet and a backup one:

SQL> !ls -ltrh /u00/app/oracle/admin/ONPR/wallet/tde/
total 8.0K
-rw-------. 1 oracle dba 2.5K Feb  9 16:16 ewallet_2024020915161059.p12
-rw-------. 1 oracle dba 4.0K Feb  9 16:16 ewallet.p12
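
As an additional sanity check (a hedged example, not part of the original procedure), we can confirm that the master key really exists in the wallet despite the ORA-28374 message:

SQL> select key_id, creation_time from v$encryption_keys;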

Set autologin Wallet

We will change the wallet type from password to autologin using the same password, in order for the wallet to be opened automatically.

SQL> ADMINISTER KEY MANAGEMENT CREATE AUTO_LOGIN KEYSTORE FROM KEYSTORE '/u00/app/oracle/admin/ONPR/wallet/tde/' IDENTIFIED BY "************";

keystore altered.

SQL> ADMINISTER KEY MANAGEMENT SET KEYSTORE CLOSE IDENTIFIED BY "******************";

keystore altered.

And we can check that all has been configured appropriately for the wallet:

SQL> select WRL_PARAMETER, WRL_TYPE,WALLET_TYPE, status from V$ENCRYPTION_WALLET;

WRL_PARAMETER                            WRL_TYPE             WALLET_TYPE          STATUS
---------------------------------------- -------------------- -------------------- ------------------------------
/u00/app/oracle/admin/ONPR/wallet/tde/   FILE                 AUTOLOGIN            OPEN

And we can see that we now have an autologin cwallet.sso file:

SQL> !ls -ltrh /u00/app/oracle/admin/ONPR/wallet/tde/
total 14K
-rw-------. 1 oracle dba 4.0K Feb  9 17:07 ewallet.p12
-rw-------. 1 oracle dba 5.7K Feb  9 17:07 cwallet.sso
-rw-------. 1 oracle dba 2.5K Feb  9 16:16 ewallet_2024020915161059.p12

Change tablespace_encryption instance parameter to decrypt_only

This is mandatory so that no new tablespace created on the source database gets encrypted once TDE is configured, otherwise the Oracle Advanced Security license would be activated. Also, once the ZDM switchover steps are completed, the primary database will be running on the ExaCC and its tablespaces will be encrypted there. The redo generated there and applied on the standby database running on-premises will therefore be encrypted as well. As we are not licensed for Oracle Advanced Security on-premises, we also need this parameter set to decrypt_only to be able to decrypt the redo before applying it on the source database.

Unfortunately this parameter was only introduced with Oracle 19.16. So if you are running an older version on the source on-premises database, as we are, you cannot use it.

This means that you will need to:

  • Deactivate the fallback possibility. In our situation we will only be able to use ZDM to migrate the database, but without any fallback option. Not a comfortable situation…
  • Ensure the parameter ENCRYPT_NEW_TABLESPACES is set to DDL and that no ENCRYPTION clause is specified in the CREATE TABLESPACE statement for any newly created tablespace (the corresponding parameter change is sketched after the tablespace_encryption change below).
SQL> show parameter encry

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
encrypt_new_tablespaces              string      CLOUD_ONLY
tablespace_encryption                string      MANUAL_ENABLE

SQL> alter system set tablespace_encryption='decrypt_only' scope=spfile;

System altered.
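
The output after the restart also shows encrypt_new_tablespaces changed from CLOUD_ONLY to DDL. That change is not shown above; assuming it was done with a similar alter system (a hedged example), it would look like this:

SQL> alter system set encrypt_new_tablespaces=DDL scope=spfile;

System altered.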

SQL> shutdown immediate
Database closed.
Database dismounted.
ORACLE instance shut down.

SQL> startup
ORACLE instance started.

Total System Global Area 3.7581E+10 bytes
Fixed Size                 23061704 bytes
Variable Size            5100273664 bytes
Database Buffers         3.2346E+10 bytes
Redo Buffers              111153152 bytes
Database mounted.
Database opened.

SQL> show parameter encry

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
encrypt_new_tablespaces              string      DDL
tablespace_encryption                string      DECRYPT_ONLY

Update SYS user password on the on-premise database

Both source and target SYS user passwords must match.

Update the source one to match the ExaCC one you are using:

SQL> alter user sys identified by "********";

Update source listener.ora with static entry

If the on-premises source database is not running on Oracle Restart (Grid Infrastructure), you will have to add a static entry for the DGMGRL service in the appropriate listener. Unfortunately ZDM does not do it for you.

SID_LIST_<listener_name> =
  (SID_LIST =
    (SID_DESC = 
      (GLOBAL_DBNAME = <dbname>_DGMGRL.<domain>)
      (ORACLE_HOME = <ORACLE_HOME>) 
      (SID_NAME = <SID>)
    )
  )
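
For our source database, the entry would look roughly like this (a sketch: the listener name and the ORACLE_HOME path are assumptions, adapt them to your environment; the service name matches the one visible in the DGMGRL error below):

SID_LIST_LISTENER_ONPR =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME = ONPR_DGMGRL.domain.com)
      (ORACLE_HOME = /u00/app/oracle/product/19.10.0)
      (SID_NAME = ONPR)
    )
  )

Do not forget to reload the listener afterwards so that the static service is registered.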

If you do not do so, when you resume the migration the ZDM switchover step, ZDM_SWITCHOVER_SRC, will fail with the following error:

PRGZ-3605 : Oracle Data Guard Broker switchover to database "ONPRZ_APP_001T" on database "ONPR" failed.
ONPRZ_APP_001T
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Thu Feb 22 09:46:57 2024
Version 19.10.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.
Connected to "ONPR"
Connected as SYSDG.
DGMGRL> Performing switchover NOW, please wait...
Operation requires a connection to database "onprz_app_001t"
Connecting ...
Connected to "ONPRZ_APP_001T"
Connected as SYSDBA.
New primary database "onprz_app_001t" is opening...
Operation requires start up of instance "ONPR" on database "onpr"
Starting instance "ONPR"...
Unable to connect to database using (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(Host=vmONPR.domain.com)(Port=13000))(CONNECT_DATA=(SERVICE_NAME=ONPR_DGMGRL.domain.com)(INSTANCE_NAME=ONPR)(SERVER=DEDICATED)))
ORA-12514: TNS:listener does not currently know of service requested in connect descriptor

Archived log backup on the source

You will have to ensure that the source database archived log deletion policy is set appropriately and that no archived log is removed before it has been applied on the standby. This guarantees that no source archived log needed by Data Guard is missing.
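
With RMAN this is typically controlled through the archived log deletion policy, for example (a hedged example, adapt it to your backup strategy):

RMAN> CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;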

Convert target database to single instance

I converted the target database on the ExaCC used during the ZDM migration (the one taken as a template by ZDM and where the final PDB will be hosted) from RAC to single instance, and this for two reasons:

  • First, as we will see later, ZDM will create the standby database as a new instance, using the final PDB name as ORACLE_SID. This temporary database is in any case a single instance.
  • If the target database is RAC, ZDM will create a second UNDO tablespace in the single-instance source database. I do not want to make any change to the source database. Also, as I am running version 19.10 on the source, that UNDO tablespace would be encrypted, and moreover I would hit bug 31500699 and the ZDM migration would fail.

Update cluster_database instance parameter
oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)] sqh

SQL> show parameter cluster_database

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
cluster_database                     boolean     TRUE
cluster_database_instances           integer     2

SQL> set line 300
SQL> col name for a30
SQL> col value for a30
SQL> select inst_id, name, value from gv$parameter where lower(name)='cluster_database';

   INST_ID NAME                           VALUE
---------- ------------------------------ ------------------------------
         2 cluster_database               TRUE
         1 cluster_database               TRUE

SQL> alter system set cluster_database=FALSE scope=spfile sid='*';

System altered.

Stop cluster database
oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)] srvctl status database -d ONPR_RZ2
Instance ONPR1 is running on node ExaCC-cl01n1
Instance ONPR2 is running on node ExaCC-cl01n2

oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)] srvctl stop database -d ONPR_RZ2

oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)] srvctl status database -d ONPR_RZ2
Instance ONPR1 is not running on node ExaCC-cl01n1
Instance ONPR2 is not running on node ExaCC-cl01n2

Change grid infrastructure configuration

We will remove the second instance.

oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)] srvctl config database -d ONPR_RZ2
Database unique name: ONPR_RZ2
Database name: ONPR
Oracle home: /u02/app/oracle/product/19.0.0.0/dbhome_2
Oracle user: oracle
Spfile: +DATAC1/ONPR_RZ2/PARAMETERFILE/spfile.634.1160214211
Password file: +DATAC1/ONPR_RZ2/PASSWORD/pwdonpr_rz2.562.1160213439
Domain: domain.com
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools:
Disk Groups: DATAC1
Mount point paths: /acfs01
Services:
Type: RAC
Start concurrency:
Stop concurrency:
OSDBA group: dba
OSOPER group: racoper
Database instances: ONPR1,ONPR2
Configured nodes: ExaCC-cl01n1,ExaCC-cl01n2
CSS critical: no
CPU count: 0
Memory target: 0
Maximum memory: 0
Default network number for database services:
Database is administrator managed
oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)]

oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)] srvctl remove instance -d ONPR_RZ2 -i ONPR2
Remove instance from the database ONPR_RZ2? (y/[n]) y
oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)]

oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)] srvctl config database -d ONPR_RZ2
Database unique name: ONPR_RZ2
Database name: ONPR
Oracle home: /u02/app/oracle/product/19.0.0.0/dbhome_2
Oracle user: oracle
Spfile: +DATAC1/ONPR_RZ2/PARAMETERFILE/spfile.634.1160214211
Password file: +DATAC1/ONPR_RZ2/PASSWORD/pwdonpr_rz2.562.1160213439
Domain: domain.com
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools:
Disk Groups: DATAC1
Mount point paths: /acfs01
Services:
Type: RAC
Start concurrency:
Stop concurrency:
OSDBA group: dba
OSOPER group: racoper
Database instances: ONPR1
Configured nodes: ExaCC-cl01n1
CSS critical: no
CPU count: 0
Memory target: 0
Maximum memory: 0
Default network number for database services:
Database is administrator managed

Start target database ONPR on ExaCC
oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)] srvctl start database -d ONPR_RZ2

oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)] srvctl status database -d ONPR_RZ2
Instance ONPR1 is running on node ExaCC-cl01n1

As we can see, only one instance is running. This can also be double checked with the instance parameter.

oracle@ExaCC-cl01n1:~/ [ONPR1 (CDB$ROOT)] sqh

SQL> set lines 300 pages 500
SQL> col name for a30
SQL> col value for a30
SQL> select inst_id, name, value from gv$parameter where lower(name)='cluster_database';

   INST_ID NAME                           VALUE
---------- ------------------------------ ------------------------------
         1 cluster_database               FALSE

Evaluating ZDM Migration

We are now all ready to evaluate ZDM Migration. We will first run zdmcli with the -eval option to evaluate the migration and test if all is ok.

We need to provide some arguments :

  • -sourcesid : database name of the source database, in case the source database is a single instance deployed on a non Grid Infrastructure environment
  • -rsp : ZDM response file
  • -sourcenode : source host
  • -srcauth (with the 3 sub-arguments -srcarg1, -srcarg2, -srcarg3) : name of the source authentication plug-in, with:
      1st argument: user, should be oracle
      2nd argument: ZDM private RSA key
      3rd argument: sudo location
  • -targetnode : target host
  • -tgtauth (with the 3 sub-arguments -tgtarg1, -tgtarg2, -tgtarg3) : name of the target authentication plug-in, with:
      1st argument: user, should be opc
      2nd argument: ZDM private RSA key
      3rd argument: sudo location
  • -tdekeystorepasswd : source database TDE keystore password
  • -tgttdekeystorepasswd : target container database TDE keystore password

All the evaluation steps have been completed successfully:

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli migrate database -sourcesid ONPR -rsp /home/zdmuser/migration/zdm_ONPR_physical_online.rsp -sourcenode vmonpr -srcauth zdmauth -srcarg1 user:oracle -srcarg2 identity_file:/home/zdmuser/.ssh/id_rsa -srcarg3 sudo_location:/usr/bin/sudo -targetnode ExaCC-cl01n1 -tgtauth zdmauth -tgtarg1 user:opc -tgtarg2 identity_file:/home/zdmuser/.ssh/id_rsa -tgtarg3 sudo_location:/usr/bin/sudo -tdekeystorepasswd -tgttdekeystorepasswd -eval
zdmhost.domain.com: Audit ID: 428
Enter source database ONPR SYS password:
Enter source database ONPR TDE keystore password:
Enter target container database TDE keystore password:
zdmhost: 2024-02-14T13:18:19.773Z : Processing response file ...
Operation "zdmcli migrate database" scheduled with the job ID "39".

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli query job -jobid 39
zdmhost.domain.com: Audit ID: 434
Job ID: 39
User: zdmuser
Client: zdmhost
Job Type: "EVAL"
Scheduled job command: "zdmcli migrate database -sourcesid ONPR -rsp /home/zdmuser/migration/zdm_ONPR_physical_online.rsp -sourcenode vmonpr -srcauth zdmauth -srcarg1 user:oracle -srcarg2 identity_file:/home/zdmuser/.ssh/id_rsa -srcarg3 sudo_location:/usr/bin/sudo -targetnode ExaCC-cl01n1 -tgtauth zdmauth -tgtarg1 user:opc -tgtarg2 identity_file:/home/zdmuser/.ssh/id_rsa -tgtarg3 sudo_location:/usr/bin/sudo -tdekeystorepasswd -tgttdekeystorepasswd -eval"
Scheduled job execution start time: 2024-02-14T14:18:19+01. Equivalent local time: 2024-02-14 14:18:19
Current status: SUCCEEDED
Result file path: "/u01/app/oracle/chkbase/scheduled/job-39-2024-02-14-14:18:29.log"
Metrics file path: "/u01/app/oracle/chkbase/scheduled/job-39-2024-02-14-14:18:29.json"
Job execution start time: 2024-02-14 14:18:29
Job execution end time: 2024-02-14 14:21:18
Job execution elapsed time: 2 minutes 48 seconds
ZDM_GET_SRC_INFO ........... PRECHECK_PASSED
ZDM_GET_TGT_INFO ........... PRECHECK_PASSED
ZDM_PRECHECKS_SRC .......... PRECHECK_PASSED
ZDM_PRECHECKS_TGT .......... PRECHECK_PASSED
ZDM_SETUP_SRC .............. PRECHECK_PASSED
ZDM_SETUP_TGT .............. PRECHECK_PASSED
ZDM_PREUSERACTIONS ......... PRECHECK_PASSED
ZDM_PREUSERACTIONS_TGT ..... PRECHECK_PASSED
ZDM_VALIDATE_SRC ........... PRECHECK_PASSED
ZDM_VALIDATE_TGT ........... PRECHECK_PASSED
ZDM_POSTUSERACTIONS ........ PRECHECK_PASSED
ZDM_POSTUSERACTIONS_TGT .... PRECHECK_PASSED
ZDM_CLEANUP_SRC ............ PRECHECK_PASSED
ZDM_CLEANUP_TGT ............ PRECHECK_PASSED

Run the Migration with ZDM

We will run the migration adding a pause after the ZDM step ZDM_CONFIGURE_DG_SRC. ZDM will then prepare the whole environment (setting up the environment, creating the standby and configuring Data Guard). All these steps can be done without any downtime.

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli migrate database -sourcesid ONPR -rsp /home/zdmuser/migration/zdm_ONPR_physical_online.rsp -sourcenode vmonpr -srcauth zdmauth -srcarg1 user:oracle -srcarg2 identity_file:/home/zdmuser/.ssh/id_rsa -srcarg3 sudo_location:/usr/bin/sudo -targetnode ExaCC-cl01n1 -tgtauth zdmauth -tgtarg1 user:opc -tgtarg2 identity_file:/home/zdmuser/.ssh/id_rsa -tgtarg3 sudo_location:/usr/bin/sudo -tdekeystorepasswd -tgttdekeystorepasswd -pauseafter ZDM_CONFIGURE_DG_SRC
zdmhost.domain.com: Audit ID: 543
Enter source database ONPR SYS password:
Enter source database ONPR TDE keystore password:
Enter target container database TDE keystore password:
zdmhost: 2024-02-22T09:27:17.864Z : Processing response file ...
Operation "zdmcli migrate database" scheduled with the job ID "44".
[zdmuser@zdmhost migration]$

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli query job -jobid 44
zdmhost.domain.com: Audit ID: 551
Job ID: 44
User: zdmuser
Client: zdmhost
Job Type: "MIGRATE"
Scheduled job command: "zdmcli migrate database -sourcesid ONPR -rsp /home/zdmuser/migration/zdm_ONPR_physical_online.rsp -sourcenode vmonpr -srcauth zdmauth -srcarg1 user:oracle -srcarg2 identity_file:/home/zdmuser/.ssh/id_rsa -srcarg3 sudo_location:/usr/bin/sudo -targetnode ExaCC-cl01n1 -tgtauth zdmauth -tgtarg1 user:opc -tgtarg2 identity_file:/home/zdmuser/.ssh/id_rsa -tgtarg3 sudo_location:/usr/bin/sudo -tdekeystorepasswd -tgttdekeystorepasswd -pauseafter ZDM_CONFIGURE_DG_SRC"
Scheduled job execution start time: 2024-02-22T10:27:17+01. Equivalent local time: 2024-02-22 10:27:17
Current status: PAUSED
Current Phase: "ZDM_CONFIGURE_DG_SRC"
Result file path: "/u01/app/oracle/chkbase/scheduled/job-44-2024-02-22-10:27:27.log"
Metrics file path: "/u01/app/oracle/chkbase/scheduled/job-44-2024-02-22-10:27:27.json"
Job execution start time: 2024-02-22 10:27:27
Job execution end time: 2024-02-22 10:39:38
Job execution elapsed time: 12 minutes 11 seconds
ZDM_GET_SRC_INFO ................ COMPLETED
ZDM_GET_TGT_INFO ................ COMPLETED
ZDM_PRECHECKS_SRC ............... COMPLETED
ZDM_PRECHECKS_TGT ............... COMPLETED
ZDM_SETUP_SRC ................... COMPLETED
ZDM_SETUP_TGT ................... COMPLETED
ZDM_PREUSERACTIONS .............. COMPLETED
ZDM_PREUSERACTIONS_TGT .......... COMPLETED
ZDM_VALIDATE_SRC ................ COMPLETED
ZDM_VALIDATE_TGT ................ COMPLETED
ZDM_DISCOVER_SRC ................ COMPLETED
ZDM_COPYFILES ................... COMPLETED
ZDM_PREPARE_TGT ................. COMPLETED
ZDM_SETUP_TDE_TGT ............... COMPLETED
ZDM_RESTORE_TGT ................. COMPLETED
ZDM_RECOVER_TGT ................. COMPLETED
ZDM_FINALIZE_TGT ................ COMPLETED
ZDM_CONFIGURE_DG_SRC ............ COMPLETED
ZDM_SWITCHOVER_SRC .............. PENDING
ZDM_SWITCHOVER_TGT .............. PENDING
ZDM_POST_DATABASE_OPEN_TGT ...... PENDING
ZDM_DATAPATCH_TGT ............... PENDING
ZDM_MODIFY_DBUNIQUENAME_TGT ..... PENDING
ZDM_NONCDBTOPDB_PRECHECK ........ PENDING
ZDM_NONCDBTOPDB_CONVERSION ...... PENDING
ZDM_POST_MIGRATE_TGT ............ PENDING
TIMEZONE_UPGRADE_PREPARE_TGT .... PENDING
TIMEZONE_UPGRADE_TGT ............ PENDING
ZDM_POSTUSERACTIONS ............. PENDING
ZDM_POSTUSERACTIONS_TGT ......... PENDING
ZDM_CLEANUP_SRC ................. PENDING
ZDM_CLEANUP_TGT ................. PENDING

Pause After Phase: "ZDM_CONFIGURE_DG_SRC"

We can see that all steps have been completed successfully, and ZDM has paused the migration after Data Guard has been configured.

I can review the full ZDM log:

[zdmuser@zdmhost ~]$ cat /u01/app/oracle/chkbase/scheduled/job-44-2024-02-22-10:27:27.log
zdmhost: 2024-02-22T09:27:27.341Z : Processing response file ...
zdmhost: 2024-02-22T09:27:27.345Z : Processing response file ...
zdmhost: 2024-02-22T09:27:32.418Z : Starting zero downtime migrate operation ...
zdmhost: 2024-02-22T09:27:34.498Z : Executing phase ZDM_GET_SRC_INFO
zdmhost: 2024-02-22T09:27:34.498Z : Retrieving information from source node "vmonpr" ...
zdmhost: 2024-02-22T09:27:34.498Z : retrieving information about database "ONPR" ...
zdmhost: 2024-02-22T09:27:38.793Z : Execution of phase ZDM_GET_SRC_INFO completed
zdmhost: 2024-02-22T09:27:38.819Z : Executing phase ZDM_GET_TGT_INFO
zdmhost: 2024-02-22T09:27:38.819Z : Retrieving information from target node "ExaCC-cl01n1" ...
zdmhost: 2024-02-22T09:27:46.173Z : Determined value for parameter TGT_DATADG is '+DATAC1'
zdmhost: 2024-02-22T09:27:46.173Z : Determined value for parameter TGT_REDODG is '+DATAC1'
zdmhost: 2024-02-22T09:27:46.173Z : Determined value for parameter TGT_RECODG is '+RECOC1'
zdmhost: 2024-02-22T09:27:46.284Z : Execution of phase ZDM_GET_TGT_INFO completed
zdmhost: 2024-02-22T09:27:46.821Z : Executing phase ZDM_PRECHECKS_SRC
zdmhost: 2024-02-22T09:27:46.821Z : Execution of phase ZDM_PRECHECKS_SRC completed
zdmhost: 2024-02-22T09:27:47.080Z : Executing phase ZDM_PRECHECKS_TGT
zdmhost: 2024-02-22T09:27:47.080Z : Execution of phase ZDM_PRECHECKS_TGT completed
zdmhost: 2024-02-22T09:27:47.118Z : Executing phase ZDM_SETUP_SRC
zdmhost: 2024-02-22T09:27:47.118Z : Setting up ZDM on the source node vmonpr ...
vmonpr: 2024-02-22T09:28:49.592Z : TNS aliases successfully setup on the source node vmonpr...
zdmhost: 2024-02-22T09:28:49.694Z : Execution of phase ZDM_SETUP_SRC completed
####################################################################
zdmhost: 2024-02-22T09:28:49.730Z : Executing phase ZDM_SETUP_TGT
zdmhost: 2024-02-22T09:28:49.730Z : Setting up ZDM on the target node ExaCC-cl01n1 ...
ExaCC-cl01n1: 2024-02-22T09:29:12.976Z : TNS aliases successfully setup on the target node ExaCC-cl01n1...
zdmhost: 2024-02-22T09:29:12.979Z : Execution of phase ZDM_SETUP_TGT completed
####################################################################
zdmhost: 2024-02-22T09:29:13.023Z : Executing phase ZDM_VALIDATE_SRC
zdmhost: 2024-02-22T09:29:13.024Z : Validating source environment on node vmonpr ...
vmonpr: 2024-02-22T09:29:23.649Z : Validating SYS account password specified..
vmonpr: 2024-02-22T09:29:34.470Z : Validating source environment...
vmonpr: 2024-02-22T09:29:34.470Z : Ensuring source database is running in ARCHIVELOG mode...
vmonpr: 2024-02-22T09:29:34.871Z : Validating Oracle TDE setup
vmonpr: 2024-02-22T09:29:37.474Z : Validating Oracle Password file
vmonpr: 2024-02-22T09:29:38.476Z : Validating database ONPR role is PRIMARY...
vmonpr: 2024-02-22T09:29:38.478Z : Source environment validated successfully
zdmhost: 2024-02-22T09:29:38.487Z : Execution of phase ZDM_VALIDATE_SRC completed
####################################################################
zdmhost: 2024-02-22T09:29:38.521Z : Executing phase ZDM_VALIDATE_TGT
zdmhost: 2024-02-22T09:29:38.521Z : Validating target environment on node ExaCC-cl01n1 ...
zdmhost: 2024-02-22T09:29:38.573Z : Source database timezone file version 32 is less than target database timezone file version 42. Timezone upgrade operation will be performed on target database after completion of database migration.
ExaCC-cl01n1: 2024-02-22T09:29:50.315Z : Validating specified Oracle ASM storage locations...
ExaCC-cl01n1: 2024-02-22T09:29:54.219Z : validating target database size allocation...
ExaCC-cl01n1: 2024-02-22T09:29:56.922Z : Verifying SQL*Net connectivity to source database ...
ExaCC-cl01n1: 2024-02-22T09:29:57.223Z : verifying passwordless connectivity between target nodes
ExaCC-cl01n1: 2024-02-22T09:29:58.425Z : Target environment validated successfully
zdmhost: 2024-02-22T09:29:58.433Z : Execution of phase ZDM_VALIDATE_TGT completed
####################################################################
zdmhost: 2024-02-22T09:29:58.455Z : Executing phase ZDM_DISCOVER_SRC
zdmhost: 2024-02-22T09:29:58.455Z : Setting up the source node vmonpr for creating standby on the target node ExaCC-cl01n1 ...
vmonpr: 2024-02-22T09:30:09.186Z : Enabling force logging on database ONPR...
vmonpr: 2024-02-22T09:30:09.287Z : Creating standby logs on database ONPR...
vmonpr: 2024-02-22T09:30:13.591Z : Source environment set up successfully
zdmhost: 2024-02-22T09:30:13.700Z : Execution of phase ZDM_DISCOVER_SRC completed
####################################################################
zdmhost: 2024-02-22T09:30:13.729Z : Executing phase ZDM_COPYFILES
zdmhost: 2024-02-22T09:30:13.729Z : Copying files from source node vmonpr to target node ExaCC-cl01n1 ...
vmonpr: 2024-02-22T09:30:24.849Z : Source database "ONPR" credentials exported successfully on node "vmonpr"
zdmhost: 2024-02-22T09:30:29.112Z : Execution of phase ZDM_COPYFILES completed
####################################################################
zdmhost: 2024-02-22T09:30:29.148Z : Executing phase ZDM_PREPARE_TGT
zdmhost: 2024-02-22T09:30:29.148Z : Setting up standby on the target node ExaCC-cl01n1 ...
ExaCC-cl01n1: 2024-02-22T09:31:03.106Z : Target environment set up successfully
zdmhost: 2024-02-22T09:31:03.115Z : Execution of phase ZDM_PREPARE_TGT completed
####################################################################
zdmhost: 2024-02-22T09:31:03.137Z : Executing phase ZDM_SETUP_TDE_TGT
zdmhost: 2024-02-22T09:31:03.137Z : Setting up Oracle Transparent Data Encryption (TDE) keystore on the target node ExaCC-cl01n1 ...
ExaCC-cl01n1: 2024-02-22T09:31:13.880Z : target environment Oracle Transparent Data Encryption (TDE) set up successfully
zdmhost: 2024-02-22T09:31:13.889Z : Execution of phase ZDM_SETUP_TDE_TGT completed
####################################################################
zdmhost: 2024-02-22T09:31:13.913Z : Executing phase ZDM_RESTORE_TGT
zdmhost: 2024-02-22T09:31:13.913Z : Restoring database on the target node ExaCC-cl01n1 ...
ExaCC-cl01n1: 2024-02-22T09:31:36.483Z : database ONPRZ_APP_001T dropped successfully
ExaCC-cl01n1: 2024-02-22T09:31:54.048Z : Target database "ONPRZ_APP_001T" credentials staged successfully on node "ExaCC-cl01n1"
ExaCC-cl01n1: 2024-02-22T09:32:06.392Z : Restoring SPFILE ...
ExaCC-cl01n1: 2024-02-22T09:32:45.923Z : SPFILE restored to /u02/app/oracle/product/19.0.0.0/dbhome_2/dbs/spfileONPRZ_APP_001T1.ora successfully
ExaCC-cl01n1: 2024-02-22T09:32:59.196Z : Restoring control files ...
ExaCC-cl01n1: 2024-02-22T09:33:20.313Z : Control files restored successfully
ExaCC-cl01n1: 2024-02-22T09:33:31.775Z : Restoring and encrypting data files ...
ExaCC-cl01n1: 2024-02-22T09:34:38.628Z : Data files restored and encrypted successfully
ExaCC-cl01n1: 2024-02-22T09:34:38.629Z : Cleaning up any orphaned data ...
ExaCC-cl01n1: 2024-02-22T09:34:38.730Z : Orphaned files clean up successful
ExaCC-cl01n1: 2024-02-22T09:34:39.034Z : Data files restored successfully
ExaCC-cl01n1: 2024-02-22T09:34:51.398Z : Renaming TEMP files and online redo log files ...
ExaCC-cl01n1: 2024-02-22T09:35:03.909Z : TEMP files and online redo log files renamed successfully
ExaCC-cl01n1: 2024-02-22T09:35:16.374Z : Recovering data files ...
ExaCC-cl01n1: 2024-02-22T09:35:20.080Z : Data files recovered successfully
zdmhost: 2024-02-22T09:35:20.094Z : Execution of phase ZDM_RESTORE_TGT completed
####################################################################
zdmhost: 2024-02-22T09:35:20.115Z : Executing phase ZDM_RECOVER_TGT
zdmhost: 2024-02-22T09:35:20.115Z : Recovering database on the target node ExaCC-cl01n1 ...
ExaCC-cl01n1: 2024-02-22T09:35:37.674Z : Target database "ONPRZ_APP_001T" credentials staged successfully on node "ExaCC-cl01n1"
ExaCC-cl01n1: 2024-02-22T09:35:50.018Z : Restoring control files ...
ExaCC-cl01n1: 2024-02-22T09:36:31.849Z : Running RMAN crosscheck on database "ONPRZ_APP_001T" ...
ExaCC-cl01n1: 2024-02-22T09:36:36.354Z : RMAN crosscheck on database "ONPRZ_APP_001T" ran successfully
ExaCC-cl01n1: 2024-02-22T09:36:36.354Z : Running RMAN catalog ...
ExaCC-cl01n1: 2024-02-22T09:36:38.357Z : RMAN catalog ran successfully
ExaCC-cl01n1: 2024-02-22T09:36:38.357Z : Control files restored successfully
ExaCC-cl01n1: 2024-02-22T09:36:50.818Z : Starting incremental restore of data files ...
ExaCC-cl01n1: 2024-02-22T09:36:55.122Z : Incremental restore of data files executed successfully
ExaCC-cl01n1: 2024-02-22T09:36:55.123Z : Cleaning up any orphaned data ...
ExaCC-cl01n1: 2024-02-22T09:36:55.224Z : Orphaned files clean up successful
ExaCC-cl01n1: 2024-02-22T09:36:55.327Z : Data files restored successfully
ExaCC-cl01n1: 2024-02-22T09:37:07.696Z : Renaming TEMP files and online redo log files ...
ExaCC-cl01n1: 2024-02-22T09:37:20.207Z : TEMP files and online redo log files renamed successfully
ExaCC-cl01n1: 2024-02-22T09:37:32.669Z : Recovering data files ...
ExaCC-cl01n1: 2024-02-22T09:37:36.575Z : Data files recovered successfully
zdmhost: 2024-02-22T09:37:36.590Z : Execution of phase ZDM_RECOVER_TGT completed
####################################################################
zdmhost: 2024-02-22T09:37:36.614Z : Executing phase ZDM_FINALIZE_TGT
zdmhost: 2024-02-22T09:37:36.614Z : Finalizing creation of standby database on the target node ExaCC-cl01n1 ...
ExaCC-cl01n1: 2024-02-22T09:37:47.554Z : Updating database cluster resource dependency ...
ExaCC-cl01n1: 2024-02-22T09:38:30.389Z : Creating standby redo logs on target database ONPRZ_APP_001T
ExaCC-cl01n1: 2024-02-22T09:38:30.490Z : Enabling Oracle Data Guard Broker on "ONPRZ_APP_001T" ...
ExaCC-cl01n1: 2024-02-22T09:38:33.493Z : Oracle Data Guard Broker enabled successfully on "ONPRZ_APP_001T"
ExaCC-cl01n1: 2024-02-22T09:38:33.694Z : Target database updated successfully
zdmhost: 2024-02-22T09:38:33.704Z : Execution of phase ZDM_FINALIZE_TGT completed
####################################################################
zdmhost: 2024-02-22T09:38:33.725Z : Executing phase ZDM_CONFIGURE_DG_SRC
zdmhost: 2024-02-22T09:38:33.726Z : Finalize steps done on the source node vmonpr for creating standby on the target node ExaCC-cl01n1 ...
vmonpr: 2024-02-22T09:38:44.648Z : Configuring Oracle Data Guard Broker on "ONPR" ...
vmonpr: 2024-02-22T09:39:38.289Z : Oracle Data Guard Broker configured successfully on "ONPR"
vmonpr: 2024-02-22T09:39:38.390Z : Source database updated successfully
zdmhost: 2024-02-22T09:39:38.398Z : Execution of phase ZDM_CONFIGURE_DG_SRC completed
####################################################################
zdmhost: 2024-02-22T09:39:38.403Z : Job execution paused after phase "ZDM_CONFIGURE_DG_SRC".

We can see that ZDM validates the source and target, checks the SYS password, checks and sets up TDE, validates the Oracle Net connections, validates the ASM storage, creates the standby redo logs, creates the standby database using the direct data transfer method (the application tablespaces will be encrypted), and configures Data Guard.

We can check additional, more detailed logs on the source host:

oracle@vmonpr:/home/oracle/ [ONPR] cd  /u00/app/oracle/zdm/zdm_ONPR_44/zdm/log/

oracle@vmonpr:/u00/app/oracle/zdm/zdm_ONPR_44/zdm/log/ [ONPR] ls -ltrh
total 228K
-rw-rw-rw-. 1 oracle dba  14K Feb 22 10:28 zdm_setup_tns_alias_src_12728.log
-rwxrwxrwx. 1 oracle dba    0 Feb 22 10:29 default.log
-rw-rw-rw-. 1 oracle dba  17K Feb 22 10:29 zdm_validate_sys_pass_src_13181.log
-rw-rw-rw-. 1 root   root 57K Feb 22 10:29 zdm_validate_src_13240.log
-rw-rw-rw-. 1 root   root 62K Feb 22 10:30 zdm_discover_src_13722.log
-rw-rw-rw-. 1 root   root 22K Feb 22 10:30 zdm_export_db_cred_src_15024.log
-rw-rw-rw-. 1 oracle dba  43K Feb 22 10:39 zdm_configure_dg_src_18159.log

And on the target (ExaCC):

[opc@ExaCC-cl01n1 ~]$ cd /u02/app/oracle/zdm/zdm_ONPR_RZ2_44/zdm/log/

[opc@ExaCC-cl01n1 log]$ ls -ltrh
total 516K
-rw-rw-rw- 1 oracle root  33K Feb 22 10:29 zdm_setup_tns_alias_tgt_224683.log
-rwxrwxrwx 1 oracle root    0 Feb 22 10:29 default.log
-rw-rw-rw- 1 oracle root  29K Feb 22 10:29 zdm_validate_tgt_231351.log
-rw-rw-rw- 1 root   root  59K Feb 22 10:31 zdm_prepare_tgt_239052.log
-rw-rw-rw- 1 root   root 7.9K Feb 22 10:31 zdm_setup_tde_tgt_242739.log
-rw-rw-rw- 1 root   root  16K Feb 22 10:31 zdm_oss_restore_tgt_dropdatabase_244302.log
-rw-rw-rw- 1 root   root  21K Feb 22 10:31 zdm_import_db_cred_tgt_250414.log
-rw-rw-rw- 1 root   root  24K Feb 22 10:32 zdm_oss_restore_tgt_restoreinit_252706.log
-rw-rw-rw- 1 root   root  19K Feb 22 10:33 zdm_oss_restore_tgt_restorecntrl_260310.log
-rw-rw-rw- 1 root   root  37K Feb 22 10:34 zdm_oss_restore_tgt_restoredb_263322.log
-rw-rw-rw- 1 root   root  36K Feb 22 10:35 zdm_oss_restore_tgt_renametemp_279405.log
-rw-rw-rw- 1 root   root  22K Feb 22 10:35 zdm_oss_restore_tgt_recoverdb_281523.log
-rw-rw-rw- 1 root   root  22K Feb 22 10:35 zdm_import_db_cred_tgt_283595.log
-rw-rw-rw- 1 root   root  28K Feb 22 10:36 zdm_oss_recover_tgt_restorecntrl_288735.log
-rw-rw-rw- 1 root   root  37K Feb 22 10:36 zdm_oss_recover_tgt_restoredb_297044.log
-rw-rw-rw- 1 root   root  36K Feb 22 10:37 zdm_oss_recover_tgt_renametemp_298270.log
-rw-rw-rw- 1 root   root  22K Feb 22 10:37 zdm_oss_recover_tgt_recoverdb_300955.log
-rw-rw-rw- 1 root   root  41K Feb 22 10:38 zdm_finalize_tgt_305026.log

Finally I can check Data Guard configuration and see that my standby is synchronized (no gap).

oracle@vmonpr:/home/oracle/ [ONPR] dgmgrl
DGMGRL for Linux: Release 19.0.0.0.0 - Production on Thu Feb 22 10:42:39 2024
Version 19.10.0.0.0

Copyright (c) 1982, 2019, Oracle and/or its affiliates.  All rights reserved.

Welcome to DGMGRL, type "help" for information.

DGMGRL> connect /
Connected to "ONPR"
Connected as SYSDG.

DGMGRL> show configuration lag

Configuration - ZDM_onpr

  Protection Mode: MaxPerformance
  Members:
  onpr           - Primary database
    onprz_app_001t - Physical standby database
                     Transport Lag:      0 seconds (computed 1 second ago)
                     Apply Lag:          0 seconds (computed 1 second ago)

Fast-Start Failover:  Disabled

Configuration Status:
SUCCESS   (status updated 36 seconds ago)

I can see that ZDM created a new instance on the ExaCC, named after the final PDB:

oracle@ExaCC-cl01n1:/u02/app/oracle/zdm/zdm_ONPR_RZ2_38/zdm/log/ [ONPR1 (CDB$ROOT)] ps -ef | grep [p]mon | grep -i ONPRZ_APP_001T1
oracle   236556      1  0 12:22 ?        00:00:00 ora_pmon_ONPRZ_APP_001T1

If we connect to it, we can see that the instance name matches the final PDB name and that the DB name matches the source database to migrate. The db_unique_name is the PDB name.

oracle@ExaCC-cl01n1:/u02/app/oracle/product/19.0.0.0/dbhome_2/ [ONPR1 (CDB$ROOT)] export ORACLE_SID=ONPRZ_APP_001T1

oracle@ExaCC-cl01n1:/u02/app/oracle/product/19.0.0.0/dbhome_2/ [ONPRZ_APP_001T1 (CDB$ROOT)] sqh

SQL*Plus: Release 19.0.0.0.0 - Production on Wed Feb 14 13:22:42 2024
Version 19.21.0.0.0

Copyright (c) 1982, 2022, Oracle.  All rights reserved.


Connected to:
Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
Version 19.21.0.0.0

SQL> set lines 300 pages 500
SQL> select instance_name from v$instance;

INSTANCE_NAME
------------------------------------------------
ONPRZ_APP_001T1

SQL> select name, db_unique_name, open_mode, database_role from v$database;

NAME                        DB_UNIQUE_NAME                                                                             OPEN_MODE                                                    DATABASE_ROLE
--------------------------- ------------------------------------------------------------------------------------------ ------------------------------------------------------------ ------------------------------------------------
ONPR                        ONPRZ_APP_001T                                                                             MOUNTED                                                      PHYSICAL STANDBY

And I could see that this temporary database was a single-instance database:

SQL> show parameter cluster_database;

NAME                                 TYPE                              VALUE
------------------------------------ --------------------------------- ------------------------------
cluster_database                     boolean                           FALSE
cluster_database_instances           integer                           1

Migration – Maintenance Window

Now comes the maintenance window during which we switch the database over to the ExaCC. After the switchover, ZDM will run datapatch (to patch the database to the new version 19.21), convert the non-CDB to a PDB, upgrade the timezone and run the other post-migration tasks.

For this we just need to resume the job. We could even resume it adding a new pause if we want to do each step separately (see the example after the resume command below).

To resume the job:

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli resume job -jobid 44
zdmhost.domain.com: Audit ID: 552
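
As an example of resuming with an additional pause (a hedged example, assuming the -pauseafter option is also accepted on resume, as in recent ZDM releases), we could have stopped again right after the switchover:

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli resume job -jobid 44 -pauseafter ZDM_SWITCHOVER_TGT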

Wrap up

ZDM is really a nice tool, and it will configure almost everything for you. Albeit I sometimes faced problems that I had to troubleshoot, I could always find a solution. The Oracle ZDM team is also very flexible and available to discuss. Knowing that all the on-premises databases are running 19.10, and as the customer does not want to go without a fallback, I unfortunately could not test the switchover and the subsequent conversion steps. I will blog about it later, once the customer databases have been patched to the latest 19c version and I am able to move forward with the test.

L’article Physical Online Migration to ExaCC with Oracle Zero Downtime Migration (ZDM) est apparu en premier sur dbi Blog.

PostgreSQL 17: transaction_timeout

Thu, 2024-02-22 02:21

PostgreSQL already comes with various timeout parameters when it comes to sessions and statements: there is idle_in_transaction_session_timeout, idle_session_timeout, and there is statement_timeout. All of them are disabled by default but can be turned on to prevent either long-running sessions or statements. Starting with PostgreSQL 17 there will be another timeout-related parameter: transaction_timeout. As the name implies, this one applies at the transaction level.

An easy test to see how this works is this:

postgres=# set transaction_timeout = '5s';
SET
postgres=# begin;
BEGIN
postgres=*# select pg_sleep(6);
FATAL:  terminating connection due to transaction timeout
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
postgres=# 

If idle_in_transaction_session_timeout and statement_timeout are set as well but transaction_timeout is set to a shorter time, then transaction_timeout will be the one which counts:

postgres=# set idle_in_transaction_session_timeout = '10s';
SET
postgres=# set statement_timeout = '10s';
SET
postgres=# set transaction_timeout = '5s';
SET
postgres=# begin;
BEGIN
postgres=*# select pg_sleep(6);
FATAL:  terminating connection due to transaction timeout
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
postgres=# 

Be aware that setting this at the instance level (postgresql.conf or postgresql.auto.conf) will make it active for all transactions, and this is probably not what you want. Use it with care and only where you really need it.
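
If you want to enforce it only for specific users rather than instance-wide, a more targeted option (a sketch; the role name is made up) is to set it at the role level:

postgres=# alter role batch_user set transaction_timeout = '60s';
ALTER ROLE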

L’article PostgreSQL 17: transaction_timeout est apparu en premier sur dbi Blog.

Logical Offline Migration to ExaCC with Oracle Zero Downtime Migration (ZDM)

Tue, 2024-02-20 17:47

A while ago I had been testing and blogging about ZDM, see my previous articles. And I finally had the chance to implement it at one of our customers to migrate on-premises databases to Exadata Cloud @Customer. In this article I would like to share with you my experience migrating an on-premises database to ExaCC using ZDM Logical Offline Migration with a backup location. We intended to use this method as the mandatory one for small Oracle SE2 databases, and as the preferred one for huge Oracle SE2 databases.

Naming convention

Of course I have anonymised all outputs to remove customer infrastructure names. So let’s take following convention.

ExaCC Cluster 01 node 01 : ExaCC-cl01n1
ExaCC Cluster 01 node 02 : ExaCC-cl01n2
On premises Source Host : vmonpr
Target db_unique_name on the ExaCC : ONPR_RZ2
Database Name to migrate : ONPR
ZDM Host : zdmhost
ZDM user : zdmuser
Domain : domain.com
ExaCC PDB to migrate to : ONPRZ_APP_001T

We will then migrate the on-premises single-tenant database, named ONPR, to a PDB on the ExaCC. The PDB will be named ONPRZ_APP_001T.

We will migrate 3 schemas : USER1, USER2 and USER3

Ports

It is important to mention that the following ports are needed:

Source              Destination               Port
ZDM Host            On-premises host          22
ZDM Host            On-premises host          Oracle Net
ZDM Host            ExaCC VM (both nodes)     22
ZDM Host            ExaCC (scan + VIP)        Oracle Net
On-premises host    NFS Server                111, 2049
ExaCC               NFS Server                111, 2049

If, for example, the Oracle Net ports are not opened between the ZDM host and the ExaCC, the migration evaluation will immediately stop at the first step, named ZDM_VALIDATE_TGT, and the following errors will be found in the log file:

PRGZ-3181 : Internal error: ValidateTargetDbLogicalZdm-5-PRGD-1059 : query to retrieve NLS database parameters failed
PRGD-1002 : SELECT statement "SELECT * FROM GLOBAL_NAME" execution as user "system" failed for database with Java Database Connectivity (JDBC) URL "jdbc:oracle:thin:@(description=(address=(protocol=tcp)(port=1521)(host=ExaCC-cl01-scan.domain.com))(connect_data=(service_name=ONPRZ_APP_001T_PRI.domain.com)))"
IO Error: The Network Adapter could not establish the connection (CONNECTION_ID=9/tZ9Bt5Q5q5VfqU7JC/xA==)

Requirements

There are a few requirements that need to be met.

streams_pool_size instance parameter on the source database

To have an initial pool allocated and optimal Data Pump performance, the streams_pool_size instance parameter on the source DB needs to be set to at least 256-300 MB for a Logical Offline Migration.
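
On the source database this can, for example, be set dynamically (a hedged example, size it according to your system):

SQL> alter system set streams_pool_size=300M scope=both;

System altered.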


Passwordless Login

Passwordless Login needs to be configured between ZDM Host, the Source Host and Target Host. See my previous blog : https://www.dbi-services.com/blog/oracle-zdm-migration-java-security-invalidkeyexception-invalid-key-format/

If passwordless login is not configured for one of the nodes, you will see this kind of error in the log file during the migration evaluation:

PRCZ-2006 : Unable to establish SSH connection to node "ExaCC-cl01n2" to execute command "<command_to_be_executed>"
No more authentication methods available.

Database Character Set

The ExaCC target CDB should use the same character set as the on-premises source DB. If the final CDB where you would like to host your new PDB has, for example, the AL32UTF8 character set (so this CDB can host PDBs with various character sets) and your source DB is not in AL32UTF8, you will need to go through a temporary CDB on the ExaCC before relocating the PDB to the final one.

To check the character set, run the following statement on the on-premises source DB:

SQL> select parameter, value from v$nls_parameters where parameter='NLS_CHARACTERSET';

If your ExaCC target CDB character set (AL32UTF8 in this example) does not match the on-premises source DB character set (WE8ISO8859P1 in this example), you will get the following ZDM error during the evaluation of the migration:

PRGZ-3549 : Source NLS character set WE8ISO8859P1 is different from target NLS character set AL32UTF8.

Create PDB on the ExaCC

The final PDB will have to be created in one of the ExaCC container databases, chosen according to the character set of the source database.
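
On ExaCC this is usually done through the OCI console or the cloud tooling; purely as a SQL sketch (the admin user name and the password are made up here), creating and opening the PDB in the chosen CDB would look like this:

SQL> create pluggable database ONPRZ_APP_001T admin user pdbadmin identified by "**********";

SQL> alter pluggable database ONPRZ_APP_001T open;

SQL> alter pluggable database ONPRZ_APP_001T save state;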


Create NFS directory

An NFS directory and Oracle directories need to be set up to store the Oracle dump files created automatically by ZDM. We will create the file system directory on the NFS mount point and a new Oracle directory named MIG_SOURCE_DEST in both databases (source and target). The NFS directory should be accessible and shared between both environments.

If there is no shared NFS between source and target, you will get the following kind of error when evaluating the migration:

zdmhost: 2024-02-06T14:14:17.001Z : Executing phase ZDM_VALIDATE_DATAPUMP_SETTINGS_TGT
zdmhost: 2024-02-06T14:14:19.583Z : validating Oracle Data Pump dump directory /u02/app/oracle/product/19.0.0.0/dbhome_2/rdbms/log/10B7A59DF2E82A9AE063021FA10ABD38 ...
zdmhost: 2024-02-06T14:14:19.587Z : listing directory path /u02/app/oracle/product/19.0.0.0/dbhome_2/rdbms/log/10B7A59DF2E82A9AE063021FA10ABD38 on node ExaCC-cl01n1.domain.com ...
PRGZ-1211 : failed to validate specified database directory object path "/u02/app/oracle/product/19.0.0.0/dbhome_2/rdbms/log/10B7A59DF2E82A9AE063021FA10ABD38"
PRGZ-1420 : specified database import directory object path "/u02/app/oracle/product/19.0.0.0/dbhome_2/rdbms/log/10B7A59DF2E82A9AE063021FA10ABD38" is not shared between source and target database server

After having created the directory on the shared NFS (visible from both the source and the target), you will need to create an Oracle directory (or use an existing one). I decided to create a new one, named MIG_SOURCE_DEST. The following has to be run on both the source and the target databases.

SQL> create directory MIG_SOURCE_DEST as '/mnt/nfs_share/ONPR/';

Directory created.

SQL> select directory_name, directory_path from dba_directories where upper(directory_name) like '%MIG%';

DIRECTORY_NAME                 DIRECTORY_PATH
------------------------------ ------------------------------------------------------------
MIG_SOURCE_DEST                /mnt/nfs_share/ONPR/

You will also need to set the correct permissions on the folder, knowing that the ExaCC OS user might not have the same uid as the source host OS user.
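
A pragmatic, if permissive, way to handle the uid mismatch (shown here only as an illustration) is to open up the dump directory on the NFS share:

oracle@vmonpr:/home/oracle/ [ONPR] chmod 777 /mnt/nfs_share/ONPR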


Source user password version

It is mandatory that the passwords for all user schemas being migrated use at least the 12c password version. For old password versions like 10G or 11G, the user password needs to be changed to avoid additional troubleshooting and actions during the ZDM migration.

To check the user password versions on the source, run the following SQL statement:

SQL> select username, account_status, lock_date, password_versions from dba_users where ORACLE_MAINTAINED='N';
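
For any user still showing only a 10G or 11G password version, resetting the password generates the newer verifier; a hedged example using one of the schemas to be migrated:

SQL> alter user USER1 identified by "*************";

User altered.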

Prepare ZDM response file

We will use ZDM response file template named zdm_logical_template.rsp and adapt it.

[zdmuser@zdmhost migration]$ cp /u01/app/oracle/product/zdm/rhp/zdm/template/zdm_logical_template.rsp ./zdm_ONPR_logical_offline.rsp

The main parameters to take care of are :

  • DATA_TRANSFER_MEDIUM : specifies how data will be transferred from the source database system to the target database system. To be NFS.
  • TARGETDATABASE_ADMINUSERNAME : user to be used on the target for the migration. To be SYSTEM.
  • SOURCEDATABASE_ADMINUSERNAME : user to be used on the source for the migration. To be SYSTEM.
  • SOURCEDATABASE_CONNECTIONDETAILS_HOST : source listener host.
  • SOURCEDATABASE_CONNECTIONDETAILS_PORT : source listener port.
  • SOURCEDATABASE_CONNECTIONDETAILS_SERVICENAME : source database service name.
  • TARGETDATABASE_CONNECTIONDETAILS_HOST : target listener host (ExaCC scan listener).
  • TARGETDATABASE_CONNECTIONDETAILS_PORT : target listener port. To be 1521.
  • TARGETDATABASE_CONNECTIONDETAILS_SERVICENAME : target database service name.
  • TARGETDATABASE_DBTYPE : target environment. To be EXADATA.
  • DATAPUMPSETTINGS_SCHEMABATCH-1 : comma separated list of database schemas to be migrated.
  • DATAPUMPSETTINGS_SCHEMABATCHCOUNT : exclusive with the schemaBatch option. If specified, user schemas are identified automatically.
  • DATAPUMPSETTINGS_DATAPUMPPARAMETERS_IMPORTPARALLELISMDEGREE : maximum number of worker processes that can be used for a Data Pump import job. For SE2 it needs to be configured to 1.
  • DATAPUMPSETTINGS_DATAPUMPPARAMETERS_EXPORTPARALLELISMDEGREE : maximum number of worker processes that can be used for a Data Pump export job. For SE2 it needs to be configured to 1.
  • DATAPUMPSETTINGS_DATAPUMPPARAMETERS_EXCLUDETYPELIST : comma separated list of object types to exclude.
  • DATAPUMPSETTINGS_EXPORTDIRECTORYOBJECT_NAME : Oracle DBA directory created on the source for the export.
  • DATAPUMPSETTINGS_EXPORTDIRECTORYOBJECT_PATH : NFS directory used for the export dump.
  • DATAPUMPSETTINGS_IMPORTDIRECTORYOBJECT_NAME : Oracle DBA directory created on the target for the import.
  • DATAPUMPSETTINGS_IMPORTDIRECTORYOBJECT_PATH : NFS directory used for the import dump.
  • TABLESPACEDETAILS_AUTOCREATE : if set to TRUE, ZDM will automatically create the tablespaces. To be TRUE.
Updated ZDM response file compared to ZDM template for the migration we are going to run:

[zdmuser@zdmhost migration]$ diff zdm_ONPR_logical_offline.rsp /u01/app/oracle/product/zdm/rhp/zdm/template/zdm_logical_template.rsp
30c30
< DATA_TRANSFER_MEDIUM=NFS
---
> DATA_TRANSFER_MEDIUM=OSS
47c47
< TARGETDATABASE_ADMINUSERNAME=system
---
> TARGETDATABASE_ADMINUSERNAME=
63c63
< SOURCEDATABASE_ADMINUSERNAME=system
---
> SOURCEDATABASE_ADMINUSERNAME=
80c80
< SOURCEDATABASE_CONNECTIONDETAILS_HOST=vmonpr
---
> SOURCEDATABASE_CONNECTIONDETAILS_HOST=
90c90
< SOURCEDATABASE_CONNECTIONDETAILS_PORT=13000
---
> SOURCEDATABASE_CONNECTIONDETAILS_PORT=
102c102
< SOURCEDATABASE_CONNECTIONDETAILS_SERVICENAME=ONPR.domain.com
---
> SOURCEDATABASE_CONNECTIONDETAILS_SERVICENAME=
153c153
< TARGETDATABASE_CONNECTIONDETAILS_HOST=ExaCC-cl01-scan.domain.com
---
> TARGETDATABASE_CONNECTIONDETAILS_HOST=
163c163
< TARGETDATABASE_CONNECTIONDETAILS_PORT=1521
---
> TARGETDATABASE_CONNECTIONDETAILS_PORT=
175c175
< TARGETDATABASE_CONNECTIONDETAILS_SERVICENAME=ONPRZ_APP_001T_PRI.domain.com
---
> TARGETDATABASE_CONNECTIONDETAILS_SERVICENAME=
307c307
< TARGETDATABASE_DBTYPE=EXADATA
---
> TARGETDATABASE_DBTYPE=
726c726
< DATAPUMPSETTINGS_SCHEMABATCH-1=USER1,USER2,USER3
---
> DATAPUMPSETTINGS_SCHEMABATCH-1=
947c947
< DATAPUMPSETTINGS_DATAPUMPPARAMETERS_IMPORTPARALLELISMDEGREE=1
---
> DATAPUMPSETTINGS_DATAPUMPPARAMETERS_IMPORTPARALLELISMDEGREE=
957c957
< DATAPUMPSETTINGS_DATAPUMPPARAMETERS_EXPORTPARALLELISMDEGREE=1
---
> DATAPUMPSETTINGS_DATAPUMPPARAMETERS_EXPORTPARALLELISMDEGREE=
969c969
< DATAPUMPSETTINGS_DATAPUMPPARAMETERS_EXCLUDETYPELIST=STATISTICS
---
> DATAPUMPSETTINGS_DATAPUMPPARAMETERS_EXCLUDETYPELIST=
1137c1137
< DATAPUMPSETTINGS_EXPORTDIRECTORYOBJECT_NAME=MIG_SOURCE_DEST
---
> DATAPUMPSETTINGS_EXPORTDIRECTORYOBJECT_NAME=
1146c1146
< DATAPUMPSETTINGS_EXPORTDIRECTORYOBJECT_PATH=/mnt/nfs_share/ONPR
---
> DATAPUMPSETTINGS_EXPORTDIRECTORYOBJECT_PATH=
1166c1166
< DATAPUMPSETTINGS_IMPORTDIRECTORYOBJECT_NAME=MIG_SOURCE_DEST
---
> DATAPUMPSETTINGS_IMPORTDIRECTORYOBJECT_NAME=
1175c1175
< DATAPUMPSETTINGS_IMPORTDIRECTORYOBJECT_PATH=/mnt/nfs_nfs_share/ONPR
---
> DATAPUMPSETTINGS_IMPORTDIRECTORYOBJECT_PATH=
2146c2146
< TABLESPACEDETAILS_AUTOCREATE=TRUE
---
> TABLESPACEDETAILS_AUTOCREATE=

ZDM Build Version

I’m using ZDM build 21.4.

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli -build
version: 21.0.0.0.0
full version: 21.4.0.0.0
patch version: 21.4.1.0.0
label date: 221207.25
ZDM kit build date: Jul 31 2023 14:24:25 UTC
CPAT build version: 23.7.0

The migration will be done using the ZDM CLI (zdmcli), which runs migrations through jobs. We can abort, query, modify, suspend or resume a running job.

Evaluate the migration

We will first run zdmcli with the -eval option to evaluate the migration and test if all is ok.

We need to provide some arguments :

  • -sourcesid : database name of the source database, in case the source database is a single instance deployed on a non Grid Infrastructure environment
  • -rsp : ZDM response file
  • -sourcenode : source host
  • -srcauth (with the 3 sub-arguments -srcarg1, -srcarg2, -srcarg3) : name of the source authentication plug-in, with:
      1st argument: user, should be oracle
      2nd argument: ZDM private RSA key
      3rd argument: sudo location
  • -targetnode : target host
  • -tgtauth (with the 3 sub-arguments -tgtarg1, -tgtarg2, -tgtarg3) : name of the target authentication plug-in, with:
      1st argument: user, should be opc
      2nd argument: ZDM private RSA key
      3rd argument: sudo location

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli migrate database -sourcesid ONPR -rsp /home/zdmuser/migration/zdm_ONPR_logical_offline.rsp -sourcenode vmonpr -srcauth zdmauth -srcarg1 user:oracle -srcarg2 identity_file:/home/zdmuser/.ssh/id_rsa -srcarg3 sudo_location:/usr/bin/sudo -targetnode ExaCC-cl01n1 -tgtauth zdmauth -tgtarg1 user:opc -tgtarg2 identity_file:/home/zdmuser/.ssh/id_rsa -tgtarg3 sudo_location:/usr/bin/sudo -eval
zdmhost.domain.com: Audit ID: 194
Enter source database administrative user "system" password:
Enter target database administrative user "system" password:
Operation "zdmcli migrate database" scheduled with the job ID "27".
[zdmuser@zdmhost migration]$

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli query job -jobid 27
zdmhost.domain.com: Audit ID: 197
Job ID: 27
User: zdmuser
Client: zdmhost
Job Type: "EVAL"
Scheduled job command: "zdmcli migrate database -sourcesid ONPR -rsp /home/zdmuser/migration/zdm_ONPR_logical_offline.rsp -sourcenode vmonpr -srcauth zdmauth -srcarg1 user:oracle -srcarg2 identity_file:/home/zdmuser/.ssh/id_rsa -srcarg3 sudo_location:/usr/bin/sudo -targetnode ExaCC-cl01n1 -tgtauth zdmauth -tgtarg1 user:opc -tgtarg2 identity_file:/home/zdmuser/.ssh/id_rsa -tgtarg3 sudo_location:/usr/bin/sudo -eval"
Scheduled job execution start time: 2024-02-06T16:03:49+01. Equivalent local time: 2024-02-06 16:03:49
Current status: SUCCEEDED
Result file path: "/u01/app/oracle/chkbase/scheduled/job-27-2024-02-06-16:04:01.log"
Metrics file path: "/u01/app/oracle/chkbase/scheduled/job-27-2024-02-06-16:04:01.json"
Excluded objects file path: "/u01/app/oracle/chkbase/scheduled/job-27-filtered-objects-2024-02-06T16:04:13.522.json"
Job execution start time: 2024-02-06 16:04:01
Job execution end time: 2024-02-06 16:05:55
Job execution elapsed time: 1 minutes 54 seconds
ZDM_VALIDATE_TGT ...................... COMPLETED
ZDM_VALIDATE_SRC ...................... COMPLETED
ZDM_SETUP_SRC ......................... COMPLETED
ZDM_PRE_MIGRATION_ADVISOR ............. COMPLETED
ZDM_VALIDATE_DATAPUMP_SETTINGS_SRC .... COMPLETED
ZDM_VALIDATE_DATAPUMP_SETTINGS_TGT .... COMPLETED
ZDM_PREPARE_DATAPUMP_SRC .............. COMPLETED
ZDM_DATAPUMP_ESTIMATE_SRC ............. COMPLETED
ZDM_CLEANUP_SRC ....................... COMPLETED

We can see that the Job Type is EVAL, and that the Current Status is SUCCEEDED with all precheck steps having a COMPLETED status.

We can also review the log file, which provides more information. We will see all the checks that the tool is doing. We can also review the output of the advisor, which is already warning us about old password versions for some users. Reviewing all the advisor outputs might help. We can also see that ZDM will ignore a few ORA errors as non-critical. This makes sense because the migration should still happen even if, for example, a user already exists with empty objects.

[zdmuser@zdmhost ~]$ cat /u01/app/oracle/chkbase/scheduled/job-27-2024-02-06-16:04:01.log
zdmhost: 2024-02-06T15:04:01.505Z : Starting zero downtime migrate operation ...
zdmhost: 2024-02-06T15:04:01.511Z : Executing phase ZDM_VALIDATE_TGT
zdmhost: 2024-02-06T15:04:04.952Z : Fetching details of on premises Exadata Database "ONPRZ_APP_001T_PRI.domain.com"
zdmhost: 2024-02-06T15:04:04.953Z : Type of database : "Exadata at Customer"
zdmhost: 2024-02-06T15:04:05.014Z : Verifying configuration and status of target database "ONPRZ_APP_001T_PRI.domain.com"
zdmhost: 2024-02-06T15:04:09.067Z : Global database name: ONPRZ_APP_001T.DOMAIN.COM
zdmhost: 2024-02-06T15:04:09.067Z : Target PDB name : ONPRZ_APP_001T
zdmhost: 2024-02-06T15:04:09.068Z : Database major version : 19
zdmhost: 2024-02-06T15:04:09.069Z : obtaining database ONPRZ_APP_001T.DOMAIN.COM tablespace configuration details...
zdmhost: 2024-02-06T15:04:09.585Z : Execution of phase ZDM_VALIDATE_TGT completed
zdmhost: 2024-02-06T15:04:09.670Z : Executing phase ZDM_VALIDATE_SRC
zdmhost: 2024-02-06T15:04:09.736Z : Verifying configuration and status of source database "ONPR.domain.com"
zdmhost: 2024-02-06T15:04:09.737Z : source database host vmonpr service ONPR.domain.com
zdmhost: 2024-02-06T15:04:13.464Z : Global database name: ONPR.DOMAIN.COM
zdmhost: 2024-02-06T15:04:13.465Z : Database major version : 19
zdmhost: 2024-02-06T15:04:13.466Z : Validating database time zone compatibility...
zdmhost: 2024-02-06T15:04:13.521Z : Database objects which will be migrated : [USER2, USER3]
zdmhost: 2024-02-06T15:04:13.530Z : Execution of phase ZDM_VALIDATE_SRC completed
zdmhost: 2024-02-06T15:04:13.554Z : Executing phase ZDM_SETUP_SRC
zdmhost: 2024-02-06T15:05:04.925Z : Execution of phase ZDM_SETUP_SRC completed
zdmhost: 2024-02-06T15:05:04.944Z : Executing phase ZDM_PRE_MIGRATION_ADVISOR
zdmhost: 2024-02-06T15:05:05.371Z : Running CPAT (Cloud Premigration Advisor Tool) on the source node vmonpr ...
zdmhost: 2024-02-06T15:05:07.894Z : Premigration advisor output:
Cloud Premigration Advisor Tool Version 23.7.0
CPAT-4007: Warning: the build date for this version of the Cloud Premigration Advisor Tool is over 216 days.  Please run "premigration.sh --updatecheck" to see if a more recent version of this tool is available.
Please download the latest available version of the CPAT application.

Cloud Premigration Advisor Tool completed with overall result: Review Required
Cloud Premigration Advisor Tool generated report location: /u00/app/oracle/zdm/zdm_ONPR_27/out/premigration_advisor_report.json
Cloud Premigration Advisor Tool generated report location: /u00/app/oracle/zdm/zdm_ONPR_27/out/premigration_advisor_report.txt

 CPAT exit code: 2
 RESULT: Review Required

Schemas Analyzed (2): USER3,USER2
A total of 17 checks were performed
There were 0 checks with Failed results
There were 0 checks with Action Required results
There were 2 checks with Review Required results: has_noexport_object_grants (8 relevant objects), has_users_with_10g_password_version (1 relevant objects)
There were 0 checks with Review Suggested results
has_noexport_object_grants
         RESULT: Review Required
         DESCRIPTION: Not all object grants are exported by Data Pump.
         ACTION: Recreate any required grants on the target instance.  See Oracle Support Document ID 1911151.1 for more information. Note that any SELECT grants on system objects will need to be replaced with READ grants; SELECT is no longer allowed on system objects.
has_users_with_10g_password_version
         RESULT: Review Required
         DESCRIPTION: Case-sensitive passwords are required on ADB.
         ACTION: To avoid Data Pump migration warnings change the passwords for the listed users before migration. Alternatively, modify these users passwords after migration to avoid login failures. See Oracle Support Document ID 2289453.1 for more information.

zdmhost: 2024-02-06T15:05:07.894Z : Execution of phase ZDM_PRE_MIGRATION_ADVISOR completed
zdmhost: 2024-02-06T15:05:07.948Z : Executing phase ZDM_VALIDATE_DATAPUMP_SETTINGS_SRC
zdmhost: 2024-02-06T15:05:08.545Z : validating Oracle Data Pump dump directory /mnt/nfs_share/ONPR/ ...
zdmhost: 2024-02-06T15:05:08.545Z : validating Data Pump dump directory path /mnt/nfs_share/ONPR/ on node vmonpr.domain.com ...
zdmhost: 2024-02-06T15:05:08.975Z : validating if target database user can read files shared on medium NFS
zdmhost: 2024-02-06T15:05:08.976Z : setting Data Pump dump file permission at source node...
zdmhost: 2024-02-06T15:05:08.977Z : changing group of Data Pump dump files in directory path /mnt/nfs_share/ONPR/ on node vmonpr.domain.com ...
zdmhost: 2024-02-06T15:05:09.958Z : Execution of phase ZDM_VALIDATE_DATAPUMP_SETTINGS_SRC completed
zdmhost: 2024-02-06T15:05:10.005Z : Executing phase ZDM_VALIDATE_DATAPUMP_SETTINGS_TGT
zdmhost: 2024-02-06T15:05:13.307Z : validating Oracle Data Pump dump directory /mnt/nfs_nfs_share/ONPR ...
zdmhost: 2024-02-06T15:05:13.308Z : listing directory path /mnt/nfs_nfs_share/ONPR on node ExaCC-cl01n1.domain.com ...
zdmhost: 2024-02-06T15:05:14.008Z : Execution of phase ZDM_VALIDATE_DATAPUMP_SETTINGS_TGT completed
zdmhost: 2024-02-06T15:05:14.029Z : Executing phase ZDM_PREPARE_DATAPUMP_SRC
zdmhost: 2024-02-06T15:05:14.033Z : Execution of phase ZDM_PREPARE_DATAPUMP_SRC completed
zdmhost: 2024-02-06T15:05:14.058Z : Executing phase ZDM_DATAPUMP_ESTIMATE_SRC
zdmhost: 2024-02-06T15:05:14.059Z : starting Data Pump Dump estimate for database "ONPR.DOMAIN.COM"
zdmhost: 2024-02-06T15:05:14.060Z : running Oracle Data Pump job "ZDM_27_DP_ESTIMATE_6279" for database "ONPR.DOMAIN.COM"
zdmhost: 2024-02-06T15:05:14.071Z : applying Data Pump dump compression ALL algorithm MEDIUM
zdmhost: 2024-02-06T15:05:14.135Z : applying Data Pump dump encryption ALL algorithm AES128
zdmhost: 2024-02-06T15:05:14.135Z : Oracle Data Pump Export parallelism set to 1 ...
zdmhost: 2024-02-06T15:05:14.286Z : Oracle Data Pump errors to be ignored are ORA-31684,ORA-39111,ORA-39082...
zdmhost: 2024-02-06T15:05:23.515Z : Oracle Data Pump log located at /mnt/nfs_share/ONPR//ZDM_27_DP_ESTIMATE_6279.log in the Database Server node
zdmhost: 2024-02-06T15:05:53.643Z : Total estimation using BLOCKS method: 3.112 GB
zdmhost: 2024-02-06T15:05:53.644Z : Execution of phase ZDM_DATAPUMP_ESTIMATE_SRC completed
zdmhost: 2024-02-06T15:05:53.721Z : Executing phase ZDM_CLEANUP_SRC
zdmhost: 2024-02-06T15:05:54.261Z : Cleaning up ZDM on the source node vmonpr ...
zdmhost: 2024-02-06T15:05:55.506Z : Execution of phase ZDM_CLEANUP_SRC completed

Migrate Source database to ExaCC

Once the evaluation is all good, we can move forward and run the migration. It is exactly the same zdmcli command, just without the -eval option.

Let’s have a try and run it. We will have to provide both the source and target system passwords:

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli migrate database -sourcesid ONPR -rsp /home/zdmuser/migration/zdm_ONPR_logical_offline.rsp -sourcenode vmonpr -srcauth zdmauth -srcarg1 user:oracle -srcarg2 identity_file:/home/zdmuser/.ssh/id_rsa -srcarg3 sudo_location:/usr/bin/sudo -targetnode ExaCC-cl01n1 -tgtauth zdmauth -tgtarg1 user:opc -tgtarg2 identity_file:/home/zdmuser/.ssh/id_rsa -tgtarg3 sudo_location:/usr/bin/sudo
zdmhost.domain.com: Audit ID: 205
Enter source database administrative user "system" password:
Enter target database administrative user "system" password:
Operation "zdmcli migrate database" scheduled with the job ID "29".

We will query the job:

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli query job -jobid 29
zdmhost.domain.com: Audit ID: 211
Job ID: 29
User: zdmuser
Client: zdmhost
Job Type: "MIGRATE"
Scheduled job command: "zdmcli migrate database -sourcesid ONPR -rsp /home/zdmuser/migration/zdm_ONPR_logical_offline.rsp -sourcenode vmonpr -srcauth zdmauth -srcarg1 user:oracle -srcarg2 identity_file:/home/zdmuser/.ssh/id_rsa -srcarg3 sudo_location:/usr/bin/sudo -targetnode ExaCC-cl01n1 -tgtauth zdmauth -tgtarg1 user:opc -tgtarg2 identity_file:/home/zdmuser/.ssh/id_rsa -tgtarg3 sudo_location:/usr/bin/sudo"
Scheduled job execution start time: 2024-02-07T08:21:38+01. Equivalent local time: 2024-02-07 08:21:38
Current status: FAILED
Result file path: "/u01/app/oracle/chkbase/scheduled/job-29-2024-02-07-08:22:03.log"
Metrics file path: "/u01/app/oracle/chkbase/scheduled/job-29-2024-02-07-08:22:03.json"
Excluded objects file path: "/u01/app/oracle/chkbase/scheduled/job-29-filtered-objects-2024-02-07T08:22:16.074.json"
Job execution start time: 2024-02-07 08:22:03
Job execution end time: 2024-02-07 08:30:29
Job execution elapsed time: 8 minutes 25 seconds
ZDM_VALIDATE_TGT ...................... COMPLETED
ZDM_VALIDATE_SRC ...................... COMPLETED
ZDM_SETUP_SRC ......................... COMPLETED
ZDM_PRE_MIGRATION_ADVISOR ............. COMPLETED
ZDM_VALIDATE_DATAPUMP_SETTINGS_SRC .... COMPLETED
ZDM_VALIDATE_DATAPUMP_SETTINGS_TGT .... COMPLETED
ZDM_PREPARE_DATAPUMP_SRC .............. COMPLETED
ZDM_DATAPUMP_ESTIMATE_SRC ............. COMPLETED
ZDM_PREPARE_DATAPUMP_TGT .............. COMPLETED
ZDM_PARALLEL_EXPORT_IMPORT ............ FAILED
ZDM_POST_DATAPUMP_SRC ................. PENDING
ZDM_POST_DATAPUMP_TGT ................. PENDING
ZDM_POST_ACTIONS ...................... PENDING
ZDM_CLEANUP_SRC ....................... PENDING

As we can see, the job failed during the import of the data.

Checking the ZDM log file, I could see the following errors:

ORA-39384: Warning: User USER2 has been locked and the password expired.
ORA-39384: Warning: User USER1 has been locked and the password expired.
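
One simple way to hunt for such ORA messages in the ZDM or Data Pump log files is a grep like the following; this is only a convenience sketch, shown here against the job result file path reported by zdmcli:

[zdmuser@zdmhost ~]$ grep -E "ORA-[0-9]+" /u01/app/oracle/chkbase/scheduled/job-29-2024-02-07-08:22:03.log | sort | uniq -c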

Checking the users on the source, I could see that USER1 and USER2 only have a password in the old 10G version, which will definitely cause problems:

SQL> select username, account_status, lock_date, password_versions from dba_users where ORACLE_MAINTAINED='N';

USERNAME                       ACCOUNT_STATUS                   LOCK_DATE            PASSWORD_VERSIONS
------------------------------ -------------------------------- -------------------- -----------------
USER1                          OPEN                                                  10G
USER2                          OPEN                                                  10G
USER3                          OPEN                                                  10G 11G 12C

3 rows selected.

Checking the target PDB on the ExaCC, I could see that, because these 2 users only had a 10G password version, ZDM locked them after importing the data:

SQL> select username, account_status, lock_date from dba_users where ORACLE_MAINTAINED='N';

USERNAME                       ACCOUNT_STATUS                   LOCK_DATE
------------------------------ -------------------------------- --------------------
USER1                          EXPIRED & LOCKED                 07-FEB-2024 08:26:10
ADMIN                          LOCKED                           06-FEB-2024 14:36:18
USER2                          EXPIRED & LOCKED                 07-FEB-2024 08:26:10
USER3                          OPEN

4 rows selected.

On the ExaCC target PDB, I unlocked the users and changed their passwords.

SQL> alter user USER2 account unlock;

User altered.

SQL> alter user user1 account unlock;

User altered.

SQL> alter user USER2 identified by ************;

User altered.

SQL> alter user user1 identified by ************;

User altered.

SQL> select username, account_status, lock_date from dba_users where ORACLE_MAINTAINED='N';

USERNAME                       ACCOUNT_STATUS                   LOCK_DATE
------------------------------ -------------------------------- --------------------
USER1                          OPEN
ADMIN                          LOCKED                           06-FEB-2024 14:36:18
USER2                          OPEN
USER3                          OPEN

4 rows selected.

And I resumed the zdmcli job so it would start again from where it was failing:

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli resume job -jobid 29
zdmhost.domain.com: Audit ID: 213

The job was still failing at the same step, and in the log file I could find several errors like:

BATCH1 : Non-ignorable errors found in Oracle Data Pump job ZDM_29_DP_IMPORT_5005_BATCH1 log are
ORA-39151: Table "USER3"."OPB_MAP_OPTIONS" exists. All dependent metadata and data will be skipped due to table_exists_action of skip
ORA-39151: Table "USER3"."OPB_USER_GROUPS" exists. All dependent metadata and data will be skipped due to table_exists_action of skip

In fact, as ZDM had previously failed on the import step, it tried to import the data again, but the tables were still there.

So I had to clean up the target PDB on the ExaCC for USER3 and USER2. USER1 had no objects.

As I did not want to change the user passwords on the on-premises source database, I checked how the users had been created on the ExaCC, then dropped them and created them again before resuming the job.

SQL> set long 99999999
SQL> select dbms_metadata.get_ddl('USER','USER2') from dual;

DBMS_METADATA.GET_DDL('USER','USER2')
--------------------------------------------------------------------------------

   CREATE USER "USER2" IDENTIFIED BY VALUES 'S:C5EF**********3F79'
      DEFAULT TABLESPACE "TSP******"
      TEMPORARY TABLESPACE "TEMP"

SQL> select dbms_metadata.get_ddl('USER','USER3') from dual;

DBMS_METADATA.GET_DDL('USER','USER3')
--------------------------------------------------------------------------------

   CREATE USER "USER3" IDENTIFIED BY VALUES 'S:EDD8**********FD44'
      DEFAULT TABLESPACE "TSP******"
      TEMPORARY TABLESPACE "TEMP"

SQL> drop user USER2 cascade;

User dropped.

SQL> drop user USER3 cascade;

User dropped.

SQL> CREATE USER "USER3" IDENTIFIED BY VALUES 'S:EDD86**********8FD44'
  2  DEFAULT TABLESPACE "TSP******"
  3  TEMPORARY TABLESPACE "TEMP";

User created.

SQL> CREATE USER "USER2" IDENTIFIED BY VALUES 'S:C5EF**********3F79'
  2  DEFAULT TABLESPACE "TSP******"
  3  TEMPORARY TABLESPACE "TEMP";

User created.

And I resumed the job once again:

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli resume job -jobid 29
zdmhost.domain.com: Audit ID: 219

And now the migration has completed successfully. The Job Type is MIGRATE (no longer EVAL) and the Current status is SUCCEEDED:

[zdmuser@zdmhost migration]$ /u01/app/oracle/product/zdm/bin/zdmcli query job -jobid 29
zdmhost.domain.com: Audit ID: 223
Job ID: 29
User: zdmuser
Client: zdmhost
Job Type: "MIGRATE"
Scheduled job command: "zdmcli migrate database -sourcesid ONPR -rsp /home/zdmuser/migration/zdm_ONPR_logical_offline.rsp -sourcenode vmonpr -srcauth zdmauth -srcarg1 user:oracle -srcarg2 identity_file:/home/zdmuser/.ssh/id_rsa -srcarg3 sudo_location:/usr/bin/sudo -targetnode ExaCC-cl01n1 -tgtauth zdmauth -tgtarg1 user:opc -tgtarg2 identity_file:/home/zdmuser/.ssh/id_rsa -tgtarg3 sudo_location:/usr/bin/sudo"
Scheduled job execution start time: 2024-02-07T08:21:38+01. Equivalent local time: 2024-02-07 08:21:38
Current status: SUCCEEDED
Result file path: "/u01/app/oracle/chkbase/scheduled/job-29-2024-02-07-08:22:03.log"
Metrics file path: "/u01/app/oracle/chkbase/scheduled/job-29-2024-02-07-08:22:03.json"
Excluded objects file path: "/u01/app/oracle/chkbase/scheduled/job-29-filtered-objects-2024-02-07T08:22:16.074.json"
Job execution start time: 2024-02-07 08:22:03
Job execution end time: 2024-02-07 09:01:21
Job execution elapsed time: 14 minutes 43 seconds
ZDM_VALIDATE_TGT ...................... COMPLETED
ZDM_VALIDATE_SRC ...................... COMPLETED
ZDM_SETUP_SRC ......................... COMPLETED
ZDM_PRE_MIGRATION_ADVISOR ............. COMPLETED
ZDM_VALIDATE_DATAPUMP_SETTINGS_SRC .... COMPLETED
ZDM_VALIDATE_DATAPUMP_SETTINGS_TGT .... COMPLETED
ZDM_PREPARE_DATAPUMP_SRC .............. COMPLETED
ZDM_DATAPUMP_ESTIMATE_SRC ............. COMPLETED
ZDM_PREPARE_DATAPUMP_TGT .............. COMPLETED
ZDM_PARALLEL_EXPORT_IMPORT ............ COMPLETED
ZDM_POST_DATAPUMP_SRC ................. COMPLETED
ZDM_POST_DATAPUMP_TGT ................. COMPLETED
ZDM_POST_ACTIONS ...................... COMPLETED
ZDM_CLEANUP_SRC ....................... COMPLETED

ZDM log file:

[zdmuser@zdmhost ~]$ tail -37 /u01/app/oracle/chkbase/scheduled/job-29-2024-02-07-08:22:03.log
####################################################################
zdmhost: 2024-02-07T07:56:33.580Z : Resuming zero downtime migrate operation ...
zdmhost: 2024-02-07T07:56:33.587Z : Starting zero downtime migrate operation ...
zdmhost: 2024-02-07T07:56:37.205Z : Fetching details of on premises Exadata Database "ONPRZ_APP_001T_PRI.domain.com"
zdmhost: 2024-02-07T07:56:37.205Z : Type of database : "Exadata at Customer"
zdmhost: 2024-02-07T07:56:37.283Z : Skipping phase ZDM_VALIDATE_SRC on resume
zdmhost: 2024-02-07T07:56:37.365Z : Skipping phase ZDM_SETUP_SRC on resume
zdmhost: 2024-02-07T07:56:37.377Z : Skipping phase ZDM_PRE_MIGRATION_ADVISOR on resume
zdmhost: 2024-02-07T07:56:37.391Z : Skipping phase ZDM_VALIDATE_DATAPUMP_SETTINGS_SRC on resume
zdmhost: 2024-02-07T07:56:37.406Z : Skipping phase ZDM_VALIDATE_DATAPUMP_SETTINGS_TGT on resume
zdmhost: 2024-02-07T07:56:37.422Z : Skipping phase ZDM_PREPARE_DATAPUMP_SRC on resume
zdmhost: 2024-02-07T07:56:37.437Z : Skipping phase ZDM_DATAPUMP_ESTIMATE_SRC on resume
zdmhost: 2024-02-07T07:56:37.455Z : Skipping phase ZDM_PREPARE_DATAPUMP_TGT on resume
zdmhost: 2024-02-07T07:56:37.471Z : Executing phase ZDM_PARALLEL_EXPORT_IMPORT
zdmhost: 2024-02-07T07:56:37.482Z : Skipping phase ZDM_DATAPUMP_EXPORT_SRC_BATCH1 on resume
zdmhost: 2024-02-07T07:56:37.485Z : Skipping phase ZDM_TRANSFER_DUMPS_SRC_BATCH1 on resume
zdmhost: 2024-02-07T07:56:37.487Z : Executing phase ZDM_DATAPUMP_IMPORT_TGT_BATCH1
zdmhost: 2024-02-07T07:56:38.368Z : listing directory path /mnt/nfs_nfs_share/ONPR on node ExaCC-cl01n1.domain.com ...
zdmhost: 2024-02-07T07:56:39.474Z : Oracle Data Pump Import parallelism set to 1 ...
zdmhost: 2024-02-07T07:56:39.481Z : Oracle Data Pump errors to be ignored are ORA-31684,ORA-39111,ORA-39082...
zdmhost: 2024-02-07T07:56:39.481Z : starting Data Pump Import for database "ONPRZ_APP_001T.DOMAIN.COM"
zdmhost: 2024-02-07T07:56:39.482Z : running Oracle Data Pump job "ZDM_29_DP_IMPORT_5005_BATCH1" for database "ONPRZ_APP_001T.DOMAIN.COM"
zdmhost: 2024-02-07T08:00:46.569Z : Oracle Data Pump job "ZDM_29_DP_IMPORT_5005_BATCH1" for database "ONPRZ_APP_001T.DOMAIN.COM" completed.
zdmhost: 2024-02-07T08:00:46.569Z : Oracle Data Pump log located at /mnt/nfs_nfs_share/ONPR/ZDM_29_DP_IMPORT_5005_BATCH1.log in the Database Server node
zdmhost: 2024-02-07T08:01:17.239Z : Execution of phase ZDM_DATAPUMP_IMPORT_TGT_BATCH1 completed
zdmhost: 2024-02-07T08:01:17.248Z : Execution of phase ZDM_PARALLEL_EXPORT_IMPORT completed
zdmhost: 2024-02-07T08:01:17.268Z : Executing phase ZDM_POST_DATAPUMP_SRC
zdmhost: 2024-02-07T08:01:17.272Z : listing directory path /mnt/nfs_share/ONPR/ on node vmonpr.domain.com ...
zdmhost: 2024-02-07T08:01:17.811Z : deleting Data Pump dump in directory path /mnt/nfs_share/ONPR/ on node vmonpr.domain.com ...
zdmhost: 2024-02-07T08:01:19.052Z : Execution of phase ZDM_POST_DATAPUMP_SRC completed
zdmhost: 2024-02-07T08:01:19.070Z : Executing phase ZDM_POST_DATAPUMP_TGT
zdmhost: 2024-02-07T08:01:19.665Z : Execution of phase ZDM_POST_DATAPUMP_TGT completed
zdmhost: 2024-02-07T08:01:19.689Z : Executing phase ZDM_POST_ACTIONS
zdmhost: 2024-02-07T08:01:19.693Z : Execution of phase ZDM_POST_ACTIONS completed
zdmhost: 2024-02-07T08:01:19.716Z : Executing phase ZDM_CLEANUP_SRC
zdmhost: 2024-02-07T08:01:20.213Z : Cleaning up ZDM on the source node vmonpr ...
zdmhost: 2024-02-07T08:01:21.458Z : Execution of phase ZDM_CLEANUP_SRC completed
[zdmuser@zdmhost ~]$

If we check the ZDM import log saved in the NFS shared folder, here named ZDM_32_DP_IMPORT_1847_BATCH1.log, we can see that the import completed successfully with 3 errors, displayed in the same log file:

09-FEB-24 10:00:22.534: W-1 Processing object type SCHEMA_EXPORT/USER
09-FEB-24 10:00:22.943: ORA-31684: Object type USER:"USER1" already exists
09-FEB-24 10:00:22.943: ORA-31684: Object type USER:"USER2" already exists
09-FEB-24 10:00:22.943: ORA-31684: Object type USER:"USER3" already exists

These errors appear because we created the users on the ExaCC target DB before resuming the zdmcli job, thus before the import ran again. Fortunately, these errors are part of the list that ZDM ignores, which makes sense.
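
To double-check that the import log does not contain anything outside this ignorable list (ORA-31684, ORA-39111, ORA-39082), a simple filter on the Data Pump log stored on the NFS share can be used from the target database node. This is only a sketch; adapt the path and file name to your own run:

$ grep "ORA-" /mnt/nfs_nfs_share/ONPR/ZDM_29_DP_IMPORT_5005_BATCH1.log | grep -vE "ORA-31684|ORA-39111|ORA-39082"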

Checks

We can then of course run some checks, such as comparing the number of objects for the migrated users on the source and the target, checking PDB violations, checking invalid objects, ensuring that the tablespaces are encrypted on the ExaCC target DB, and so on.

To compare the number of objects:

SQL> select owner, count(*) from dba_objects where owner in ('USER1','USER2','USER3') group by owner order by 1;

OWNER             COUNT(*)
--------------- ----------
USER3                758
USER2                760

To check that the tablespaces are encrypted:

SQL> select a.con_id, a.tablespace_name, nvl(b.ENCRYPTIONALG,'NOT ENCRYPTED') from  cdb_tablespaces a, (select x.con_id, y.ENCRYPTIONALG, x.name from V$TABLESPACE x,  V$ENCRYPTED_TABLESPACES y where x.ts#=y.ts# and x.con_id=y.con_id) b where a.con_id=b.con_id(+) and a.tablespace_name=b.name(+) order by 1,2;

To check PDB violations:

SQL> select status, message from pdb_plug_in_violations;

To check invalid objects:

SQL> select count(*) from dba_invalid_objects;
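
If some invalid objects show up, they can usually be recompiled with the standard utlrp.sql script shipped with the database; a minimal sketch, to be run while connected to the target PDB:

SQL> @?/rdbms/admin/utlrp.sql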

And we could, of course, if needed, relocate the PDB to another ExaCC CDB.
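
On ExaCC this would typically be driven through the cloud tooling, but to illustrate the idea, a PDB relocation over a database link looks roughly like the following. This is only a sketch with hypothetical names (the ONPR_CDB_LINK database link and the open step are assumptions, not commands from this migration):

SQL> create pluggable database ONPRZ_APP_001T from ONPRZ_APP_001T@ONPR_CDB_LINK relocate;
SQL> alter pluggable database ONPRZ_APP_001T open;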

Conclusion

That’s it. We could easily migrate a single-tenant on-premises database to an ExaCC PDB using ZDM Logical Offline. The tool really has advantages: we do not need to deal with any Oracle commands ourselves, such as running Data Pump manually.

In the next blog I will show you how we migrated an on-premises database to the ExaCC on our customer's system using ZDM Physical Online Migration.

L’article Logical Offline Migration to ExaCC with Oracle Zero Downtime Migration (ZDM) est apparu en premier sur dbi Blog.

Keep your data sorted with AI

Tue, 2024-02-20 09:08

As mentioned in my previous posts about Enterprise Content Management (like this one), a key point for storing your content efficiently is the use of metadata. It helps to sort and easily retrieve your company data.

But to be honest, this part is often boring for technical people like me. Imagine it for people for whom IT is just a tool to get the job done.

On the one hand, we need to be precise to have a fine-grained organization; on the other hand, the more metadata we require, the less likely it is that it will be filled in correctly.

Based on my experience, if a lot of properties are required before uploading or creating a document, it inevitably leads to partial adoption of the solution, or even worse, to wrong information being associated with the document, which can cause other issues and prevent the system from serving the business as designed.

The whole challenge is to find the right balance. But no worries, M-Files is there to help you!

Let’s introduce the M-Files Intelligent Metadata Layer module.

IML includes 2 main aspects.

The first one is a repository-neutral approach to unify your enterprise content irrespective of the sources (as long as a connector exists or you develop one), as seen below:

M-Files IML (Source: M-Files)

The second one, the one we are interested in today, is Intelligent Services. These are artificial intelligence components that provide metadata suggestions by analyzing content with natural language processing techniques and text analytics.

These Intelligent Services are again sub-divided into several services, and we will focus on two of them:

M-Files Matcher

This module analyzes the documents and searches for a matching value.

For example, this module can catch the name of a supplier present in a document and suggest setting it as the “supplier” property of the document.

Basic M-Files Matcher

But it can also be smarter: if the document contains an e-mail address, and this e-mail address is the one used to contact a supplier, then M-Files can make the link and offer to relate that supplier to the document.

Advanced M-Files Matcher

M-Files Text Analytics

This module is a bit different: with some configuration done by your favorite M-Files administrator (us), it can suggest property values detected in the document.

Of course, the capabilities shown in the example below are not exhaustive.

M-Files Text Analytics

Firstly, based on the document title, we can detect the type of document; in this case “SOP” or “Procedure” suggests the class “SOP Working copy”.

Then it detects:

  • the title
  • the ID of the document
  • the short description
  • the department concerned by the document

As mentioned before, these are only suggestions. If you do not agree with a value, feel free to put your own.

This is what I configured for this example, but as soon as you can write a regular expression to catch the data, you can automate it. Awesome!
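
For example, a document ID such as SOP-00123 appearing in the title could be caught with a pattern like (SOP|PRO)-\d{5}. This pattern is only an illustration, not the exact expression used in my configuration.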

To find more information about M-Files AI, click here.

L’article Keep your data sorted with AI est apparu en premier sur dbi Blog.

Kubernetes Networking by Using Cilium – Intermediate Level – Part-1

Tue, 2024-02-20 01:36

If you are new to or uneasy with networking in Kubernetes, you may benefit from my previous beginner-level blog. In this blog post I will show you what a building and its networking components look like in a real Kubernetes cluster. As a reminder, below is the picture I drew in my previous blog to illustrate the networking in a Kubernetes cluster with Cilium:

If you want to understand this networking in Kubernetes in more detail, read on, this blog post is for you! I'll assume you know the basics of Kubernetes and how to interact with it; otherwise you may find our training course on it very interesting (in English or in French)!

Diving into the IP Addresses configuration

Let’s start by checking our environment and updating our picture with real information from our Kubernetes cluster:

$ kubectl get no -owide
NAME                      STATUS   ROLES           AGE    VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION      CONTAINER-RUNTIME
mycluster-control-plane   Ready    control-plane   113d   v1.27.3   172.18.0.3    <none>        Debian GNU/Linux 11 (bullseye)   5.15.0-94-generic   containerd://1.7.1
mycluster-worker          Ready    <none>          113d   v1.27.3   172.18.0.2    <none>        Debian GNU/Linux 11 (bullseye)   5.15.0-94-generic   containerd://1.7.1
mycluster-worker2         Ready    <none>          113d   v1.27.3   172.18.0.4    <none>        Debian GNU/Linux 11 (bullseye)   5.15.0-94-generic   containerd://1.7.1

$ kubectl get po -n networking101 -owide
NAME                        READY   STATUS    RESTARTS      AGE     IP            NODE                NOMINATED NODE   READINESS GATES
busybox-c8bbbbb84-fmhwc     1/1     Running   1 (24m ago)   3d23h   10.10.1.164   mycluster-worker2   <none>           <none>
busybox-c8bbbbb84-t6ggh     1/1     Running   1 (24m ago)   3d23h   10.10.2.117   mycluster-worker    <none>           <none>
netshoot-7d996d7884-fwt8z   1/1     Running   0             79s     10.10.2.121   mycluster-worker    <none>           <none>
netshoot-7d996d7884-gcxrm   1/1     Running   0             80s     10.10.1.155   mycluster-worker2   <none>           <none>

You can now see for real that the IP subnets of the pods are different from the one of the nodes. Also, the pod IP subnet is different on each node. If you are not sure why, that is perfectly fine because it is not so clear at this stage. So let's clarify it by checking our Cilium configuration.

I told you in my previous blog that there is one Cilium Agent per building. This Agent is a pod itself and it takes care of the networking in the node. This is what they look like in our cluster:

$ kubectl get po -n kube-system -owide|grep cilium
cilium-9zh9s                                      1/1     Running   5 (65m ago)   113d   172.18.0.3    mycluster-control-plane   <none>           <none>
cilium-czffc                                      1/1     Running   5 (65m ago)   113d   172.18.0.4    mycluster-worker2         <none>           <none>
cilium-dprvh                                      1/1     Running   5 (65m ago)   113d   172.18.0.2    mycluster-worker          <none>           <none>
cilium-operator-6b865946df-24ljf                  1/1     Running   5 (65m ago)   113d   172.18.0.2    mycluster-worker          <none>           <none>

There are two things to notice here:

  • The Cilium Agent is a DaemonSet, which is how you make sure there is always one on each node of our cluster. As it is a pod, it also gets an IP address… but wait a minute… it is the same IP address as the node! Exactly! This is a special case of pod IP address assignment, usually used for system pods that need direct access to the node (host) network. If you look at the pods in the kube-system namespace, you'll see that most of them use the node IP address.
  • The Cilium Operator pod is responsible for IP address management in the cluster, and it gives each Cilium Agent its own range to use.

Now you want to see which IP range is used by each node, right? Let's just check the Cilium Agent on each node, as we found their names above:

$ kubectl exec -it -n kube-system cilium-dprvh -- cilium debuginfo | grep IPAM
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
IPAM:                   IPv4: 5/254 allocated from 10.10.2.0/24,

$ kubectl exec -it -n kube-system cilium-czffc -- cilium debuginfo | grep IPAM
Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
IPAM:                   IPv4: 5/254 allocated from 10.10.1.0/24,

You can now see the different IP subnet used on each node. In my previous blog I told you that an IP address belongs to a group defined by the subnet mask. Here this subnet mask is /24, which means that on the first node any address starting with 10.10.2 belongs to the same group. On the second node it is 10.10.1, so the two nodes each have a separate group, or IP subnet.
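
If you want to convince yourself that two addresses belong to the same /24 group, a quick one-liner with the Python ipaddress module (assuming python3 is available on your machine) does the check:

$ python3 -c "import ipaddress; net=ipaddress.ip_network('10.10.2.0/24'); print(ipaddress.ip_address('10.10.2.117') in net, ipaddress.ip_address('10.10.1.164') in net)"
True False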

What about now checking the interfaces, which are the doors in our drawing?

Diving into the interfaces configuration

Let’s explore our buildings and see what we can find out! We are going to start with our four pods:

$ kubectl get po -n networking101 -owide
NAME                        READY   STATUS    RESTARTS       AGE    IP            NODE                NOMINATED NODE   READINESS GATES
busybox-c8bbbbb84-fmhwc     1/1     Running   1 (125m ago)   4d1h   10.10.1.164   mycluster-worker2   <none>           <none>
busybox-c8bbbbb84-t6ggh     1/1     Running   1 (125m ago)   4d1h   10.10.2.117   mycluster-worker    <none>           <none>
netshoot-7d996d7884-fwt8z   1/1     Running   0              103m   10.10.2.121   mycluster-worker    <none>           <none>
netshoot-7d996d7884-gcxrm   1/1     Running   0              103m   10.10.1.155   mycluster-worker2   <none>           <none>

$ kubectl exec -it -n networking101 busybox-c8bbbbb84-t6ggh -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
8: eth0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue qlen 1000
    link/ether 9e:80:70:0d:d9:37 brd ff:ff:ff:ff:ff:ff
    inet 10.10.2.117/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::9c80:70ff:fe0d:d937/64 scope link
       valid_lft forever preferred_lft forever

$ kubectl exec -it -n networking101 netshoot-7d996d7884-fwt8z -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
12: eth0@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether ea:c4:71:d6:4f:a0 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.10.2.121/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::e8c4:71ff:fed6:4fa0/64 scope link
       valid_lft forever preferred_lft forever

$ kubectl exec -it -n networking101 netshoot-7d996d7884-gcxrm -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
12: eth0@if13: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether be:57:3d:54:40:f1 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.10.1.155/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::bc57:3dff:fe54:40f1/64 scope link
       valid_lft forever preferred_lft forever

$ kubectl exec -it -n networking101 busybox-c8bbbbb84-fmhwc -- ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
10: eth0@if11: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue qlen 1000
    link/ether 2a:7f:05:a0:69:db brd ff:ff:ff:ff:ff:ff
    inet 10.10.1.164/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::287f:5ff:fea0:69db/64 scope link
       valid_lft forever preferred_lft forever

You can see that each container has only one network interface in addition to its local loopback. The format is, for example, 8: eth0@if9, which means that this interface inside the container has the number 8 and is linked to its pair interface number 9 on the node it is hosted on. These are the 2 doors connected by a corridor in my drawing.
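
If you want to verify this pairing yourself, the kernel exposes both numbers under /sys/class/net: ifindex is the interface's own number and iflink is the number of its pair interface. A small check against the busybox pod above (a sketch, the values simply mirror what ip a already showed):

$ kubectl exec -it -n networking101 busybox-c8bbbbb84-t6ggh -- cat /sys/class/net/eth0/ifindex /sys/class/net/eth0/iflink
8
9

Interface number 9 on mycluster-worker is lxc4a891387ff1a@if8, which is indeed the other end of the corridor.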

Then let's check the nodes' network interfaces:

$ sudo docker exec -it mycluster-worker ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: cilium_net@cilium_host: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 5e:84:64:22:90:7f brd ff:ff:ff:ff:ff:ff
3: cilium_host@cilium_net: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether ca:7e:e1:cc:4e:74 brd ff:ff:ff:ff:ff:ff
    inet 10.10.2.205/32 scope global cilium_host
       valid_lft forever preferred_lft forever
4: cilium_vxlan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether f6:bf:81:9b:e2:c5 brd ff:ff:ff:ff:ff:ff
5: eth0@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 02:42:ac:12:00:02 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.18.0.2/16 brd 172.18.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fc00:f853:ccd:e793::2/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe12:2/64 scope link
       valid_lft forever preferred_lft forever
7: lxc_health@if6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 8a:de:c1:2c:f5:83 brd ff:ff:ff:ff:ff:ff link-netnsid 1
9: lxc4a891387ff1a@if8: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d6:21:74:eb:67:6b brd ff:ff:ff:ff:ff:ff link-netns cni-67a5da05-a221-ade5-08dc-64808339ad05
11: lxc5b7b34955e61@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether f2:80:da:a5:17:74 brd ff:ff:ff:ff:ff:ff link-netns cni-0b438679-e5d3-d429-85c0-b6e3c8914250
13: lxc73d2e1d7cf4f@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether f6:87:b6:c3:a6:45 brd ff:ff:ff:ff:ff:ff link-netns cni-f608f13c-1869-6134-3d6b-a0f76fd6d483

$ sudo docker exec -it mycluster-worker2 ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: cilium_net@cilium_host: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether f2:91:2b:31:1f:47 brd ff:ff:ff:ff:ff:ff
3: cilium_host@cilium_net: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether be:7f:e0:2b:d6:b1 brd ff:ff:ff:ff:ff:ff
    inet 10.10.1.55/32 scope global cilium_host
       valid_lft forever preferred_lft forever
4: cilium_vxlan: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether e6:c8:8d:5d:1e:2d brd ff:ff:ff:ff:ff:ff
6: lxc_health@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether d2:cf:ec:4c:51:b6 brd ff:ff:ff:ff:ff:ff link-netnsid 1
8: lxcdc5fb9751595@if7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether fe:b4:3a:e0:67:a3 brd ff:ff:ff:ff:ff:ff link-netns cni-c0d4bea2-92fd-03fb-ba61-3656864d8bd7
9: eth0@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 02:42:ac:12:00:04 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.18.0.4/16 brd 172.18.255.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fc00:f853:ccd:e793::4/64 scope global nodad
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe12:4/64 scope link
       valid_lft forever preferred_lft forever
11: lxc174c023046ff@if10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether ae:7a:b9:6b:b3:c1 brd ff:ff:ff:ff:ff:ff link-netns cni-4172177b-df75-61a8-884c-f9d556165df2
13: lxce84a702bb02c@if12: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 92:65:df:09:dd:28 brd ff:ff:ff:ff:ff:ff link-netns cni-d259ef79-a81c-eba6-1255-6e46b8d1c779

On each node there are several interfaces to notice. I’ll take the first node for example:

  • eth0@if6: As our Kubernetes cluster is created with Kind, a node is actually a container (and this interface opens a corridor to its pair interface on my laptop). If it feels like the movie Inception, well, it is a perfectly correct comparison! This interface is the main door of the building.
  • lxc4a891387ff1a@if8: This is interface number 9, the pair of interface number 8 in the left container above.
  • lxc73d2e1d7cf4f@if12: This is interface number 13, the pair of interface number 12 in the right container above.
  • cilium_host@cilium_net: This is the circle interface in my drawing that allows the routing to/from other nodes in our cluster.
  • cilium_vxlan: This is the rectangle in my drawing and is the tunnel interface that will transport you to/from the other nodes in our cluster.
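
The cilium_vxlan interface tells us that this cluster uses VXLAN tunneling between nodes. If you want to confirm it, you could ask the Cilium Agent for its routing mode; the exact wording of the output depends on the Cilium version, so take this as a sketch:

$ kubectl exec -it -n kube-system cilium-dprvh -- cilium status | grep -iE "routing|tunnel"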

Let’s now get the complete picture by updating our drawing with this information:

Wrap up

With this foundational knowledge, you now have all the key elements to understand the communication between pods on the same node or on different nodes. That is what we will look at in my next blog post. Stay tuned!

L’article Kubernetes Networking by Using Cilium – Intermediate Level – Part-1 est apparu en premier sur dbi Blog.
