Project:MariaDB-K8s: Difference between revisions

From MaRDI portal
No edit summary
No edit summary
Line 73: Line 73:
To restore or load a backup file once the DB is running.
To restore or load a backup file once the DB is running.


1) Copy the backup file inside the main DB pod. The main pod is the only one with write rights. The file can be copied into the /tmp directory with:
1) Copy the backup file inside the main DB pod. The main pod is the only one with write rights. The file can be copied into the volume attached to the mariadb-0 pod, which will have enough space for it. This is mounted to /var/lib/mysql:


<syntaxhighlight lang=bash>
<syntaxhighlight lang=bash>
kubectl cp ./portal_db_backup.gz mariadb-0:/tmp
kubectl cp ./portal_db_backup.gz (production|staging)/mariadb-0:/var/lib/mysql
</syntaxhighlight>
</syntaxhighlight>


2) Unzip and load the file into the database:
2) Unzip and load the file into the database:
Unzipping directly will probably fail with out of memory errors when the backup file is big.


<syntaxhighlight lang=bash>
<syntaxhighlight lang=bash>
cd /tmp
cd /var/lib/mysql
gzip -d portal_db_backup.gz
gzip -d portal_db_backup.gz
mariadb -u root -p [enter password]
</syntaxhighlight>
 
An alternative is to stream the compressed file into a new file:
 
<syntaxhighlight lang=bash>
gunzip -c portal_db_backup.gz > portal_backup
</syntaxhighlight>
</syntaxhighlight>


<syntaxhighlight lang=sql>
<syntaxhighlight lang=sql>
mariadb -u root -p [enter password]
SOURCE /tmp/portal_db_backup
SOURCE /tmp/portal_db_backup
</syntaxhighlight>
</syntaxhighlight>

Revision as of 10:12, 10 April 2025

MariaDB configuration

Set up the DB

The main database of the portal is based on MariaDB. Our MariaDB instance is deployed an managed using mariadb-operator, which allows to use Custom Resource Definitions (CRDs) in Kubernetes to declaratively manage our database. In simple terms, this allows us to define all our DB resources (databases, uses, grants...) using YAML files.

The MariaDB operator takes care of the DB replication to spin up a main node and several other read replicas.

The following definition in mariadb.yaml assigns an external IP to access the DB from outside the cluster.

primaryService:
  type: LoadBalancer

secondaryService:
  type: LoadBalancer

primaryService represents the write node whereas secondaryService refers to the read-only replicas. With

kubectl get svc

it is possible to see which specific IP was assigned to these services. These IPs are necessary to set up the connection between the MediaWiki container and the DB.

MediaWiki configuration

MediaWiki requires its SQL user to have the SLAVE MONITOR and BINLOG MONITOR privileges in order for replication to work. These privileges are thus defined in grant.yaml.

The connection between MediaWiki and the different DB nodes can be specified with the following parameter in LocalSettings.php, substituting primaryService-IP and secondaryService-IP by the right values:

$wgLBFactoryConf = array(

    'class' => 'LBFactoryMulti',

    'sectionsByDB' => array(
        'my_wiki' => 's1', 
    ),

    'sectionLoads' => array(
        's1' => array(
            '<primaryService-IP>'   => 0,
            '<secondaryService-IP>' => 50,
        ),
    ),


    'serverTemplate' => array(
        'dbname'     => $wgDBname,
        'user'       => $wgDBuser,
        'password'   => $wgDBpassword,
        'type'       => 'mysql',
        'flags'      => DBO_DEFAULT,
        'max lag'    => 30,
    ),
);

Backup

  • Backups for the database are scheduled every 24h. We simultaneously keep 7 copies for the last 7 days.
  • MariaDB-operator takes care of creating the backup and pushing them into an S3 bucket, as defined in backup.yaml.
  • Production backups are prefixed with production, staging backups use the prefix staging.
  • The files in the S3 bucket can examined using s3cmd:
s3cmd --host=hsm-test-09.zib.de:9001 --host-bucket=hsm-test-09.zib.de:9001 --region=us-east-1 --access_key=<access_key> --secret_key=<secret_key> ls s3://

Restore a database to the cluster

To restore or load a backup file once the DB is running.

1) Copy the backup file inside the main DB pod. The main pod is the only one with write rights. The file can be copied into the volume attached to the mariadb-0 pod, which will have enough space for it. This is mounted to /var/lib/mysql:

kubectl cp ./portal_db_backup.gz (production|staging)/mariadb-0:/var/lib/mysql

2) Unzip and load the file into the database:

Unzipping directly will probably fail with out of memory errors when the backup file is big.

cd /var/lib/mysql
gzip -d portal_db_backup.gz

An alternative is to stream the compressed file into a new file:

gunzip -c portal_db_backup.gz > portal_backup
mariadb -u root -p [enter password]
SOURCE /tmp/portal_db_backup