
Documentation for getting started with the HA cluster stack on Debian Jessie and beyond.

This page is under construction.

Overview

To begin, you should have two machines, both freshly loaded with your brand new copy of Debian 8+. We'll need to do some configuration on both machines, so here are some things to keep in mind as you follow along:

  • We'll assume a 2-node cluster throughout most of this guide; if you are building a larger cluster then increment the integer used at the end of the hostname so that it correctly identifies your additional host.
  • With the first point covered, let's also refer to the 2 hosts as node01 and node02; just replace node with your own name if desired.

  • Until we get to the more advanced tools that simplify the more advanced configuration later on, and unless it's specified to do something on only one host, please assume that each step, such as running a command or installing software, is to be done on both nodes.

We'll briefly cover getting the machines configured and ready, and then get into installing and configuring the Pacemaker/Corosync2.X HA cluster stack.

Configure the hosts

Plan the Installation

It's assumed that we have two nodes running Debian Jessie, and that both hosts are connected in some way, so each one can reach the other.

What we'll do in this article is configure a cluster made up of two nodes, on which we'll configure the following resources:

  • Nginx
  • Shared IP for Nginx

For this purpose, we chose two nodes, each of which has a single interface with a public IP:

  • node01 : 1.2.3.4
  • node02 : 1.2.3.5
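If DNS doesn't already resolve these names, the nodes can be declared in /etc/hosts on both machines, for example:

```
1.2.3.4    node01
1.2.3.5    node02
```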

# cat /etc/debian_version
8.1

An easy way to proceed is to execute the commands on one of the nodes and, in parallel, execute the same commands on the other node.

In order to do that we need to copy the ssh keys.

Prepare SSH between nodes

Generate a key in both nodes and copy over to the other node:

# ssh-keygen -t rsa
# ssh-copy-id root@node01
# ssh-copy-id root@node02

Now we can try to connect from node01 to node02 and vice versa.
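With the keys in place, a small helper can run the same command on both nodes from a single terminal (a sketch; on_both is just a name we made up, and it assumes the root SSH access set up above):

```shell
# Hypothetical helper: run the given command on both nodes over SSH.
# Assumes the root SSH keys set up above; node names are the examples used here.
on_both() {
    local n
    for n in node01 node02; do
        echo "== $n =="
        ssh "root@$n" "$@"
    done
}

# Example: on_both apt-get update
```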

Install the Pacemaker/Corosync2.X HA cluster stack

Since the Jessie branch doesn't have the packages for the new stack, we will use jessie-backports.

Among others, pacemaker will bring in openhpid as a dependency, which fails to install properly without a configuration file. To make the install process smooth, we should temporarily mask openhpid in advance:

# systemctl mask openhpid.service

Now we are ready to get the packages:

# apt-get update
# apt-get install -t jessie-backports pacemaker crmsh

That will install all the necessary dependencies, including corosync and fence-agents.

The current versions of the systemd unit files for pacemaker (1.1.14-2~bpo8+1) and corosync (2.3.5-3~bpo8+1) wrongly refer to the RedHat-specific /etc/sysconfig configuration directory, while the relevant configuration files are installed into /etc/default. So, to make them effective, we need to create a symlink:

# cd /etc; ln -nfs default sysconfig
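You can check that the link is in place:

```
# readlink /etc/sysconfig
default
```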

At this point we can install application(s) to run on a cluster:

# apt-get install nginx

We are ready to configure the cluster.

Configure the Pacemaker/Corosync2.X HA cluster stack

Corosync supports multicast and unicast, but here we'll show a unicast configuration.

Before we put our hands to it, we need to prepare the firewall, if there is one.

Corosync uses the configurable UDP port 5405 (and the one above it) to communicate with the other members, so we have to create a rule to allow this traffic.

For example:

Iptables:

iptables -A INPUT -p udp --dport 5405 -d [node] -j ACCEPT

Shorewall:

ACCEPT                  net:[node]                      $FW    udp         5405
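Since, as noted above, the port just above 5405 is used as well, the iptables rule can be extended to cover both ports (a sketch using the multiport match):

```
iptables -A INPUT -p udp -m multiport --dports 5405,5406 -d [node] -j ACCEPT
```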

Now, the corosync configuration could be something like:

totem {
        version: 2
        token: 3000
        token_retransmits_before_loss_const: 10
        clear_node_high_bit: yes
        crypto_cipher: none
        crypto_hash: none
        transport: udpu
        interface {
                ringnumber: 0
                bindnetaddr: [public ip]
        }
}

logging {
        to_logfile: yes
        logfile: /var/log/corosync/corosync.log
        debug: off
        timestamp: on
        logger_subsys {
                subsys: QUORUM
                debug: off
        }
}

quorum {
        provider: corosync_votequorum
        two_node: 1
        wait_for_all: 1
}

nodelist {
        node {
                ring0_addr: node01
        }
        node {
                ring0_addr: node02
        }
}

This is a very simple setup that works; all the available options can be found in:

man 5 corosync.conf

Now we can start the services:

# service corosync start
# service pacemaker start

# crm status
Last updated: Thu Jun 11 10:42:19 2015
Last change: Thu Jun 11 10:41:46 2015
Stack: corosync
Current DC: node02 (1053402613) - partition with quorum
Version: 1.1.12-561c4cf
2 Nodes configured
0 Resources configured


Online: [ node01 node02 ]

Adding Resources

Ok, now we can start configuring the resources.

We'll configure Nginx and a shared IP where Nginx will run. We'll also tell the cluster that we want both resources running on the same node, and since we need the IP up before Nginx starts, we'll set up a proper ordering.

Also, since both nodes are equal, we won't specify any location preference, and we'll specify that if one node crashes and the other takes over the resources, they will not be migrated back to the node that crashed.

# crm configure
crm(live)configure# property stonith-enabled=no
crm(live)configure# property no-quorum-policy=ignore
crm(live)configure# property default-resource-stickiness=100
crm(live)configure# primitive IP-rsc_nginx ocf:heartbeat:IPaddr2 params ip="xx.xx.xxx.xx" nic="eth0" cidr_netmask="xx.xx.xx.xy" meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
crm(live)configure# primitive Nginx-rsc ocf:heartbeat:nginx meta migration-threshold=2 op monitor interval=20 timeout=60 on-fail=restart
crm(live)configure# colocation lb-loc inf: IP-rsc_nginx Nginx-rsc
crm(live)configure# order lb-ord inf: IP-rsc_nginx Nginx-rsc
crm(live)configure# commit

We'll receive a warning related to the timeout of the start/stop operations on Nginx. You are free to tune it, but for our purposes the default is enough.

We also disabled stonith for now, but later we'll come back to it again.

# crm status
Last updated: Thu Jun 11 11:05:45 2015
Last change: Thu Jun 11 11:04:35 2015
Current DC: node02 (1053402613) - partition with quorum
2 Nodes configured
2 Resources configured


Online: [ node01 node02 ]

 IP-rsc_nginx   (ocf::heartbeat:IPaddr2):       Started node01
 Nginx-rsc      (ocf::heartbeat:nginx): Started node01

Fencing

After configuring the normal resources, we've seen that everything runs as expected. Now it's time to dig into fencing.

Fencing is used to put a node into a known state: if we're having a problem inside the cluster and one node is flapping or misbehaving, we'll put that node into a state that we know is safe. ( Fence-Clusterlabs )

To see all the stonith devices/agents that you have, you can run:

# stonith_admin -I

This command will return all the stonith devices you can use for fencing purposes.

* JFI: the fence-agents package stores all fence agents in /usr/sbin ( ls /usr/sbin/fence_* )

You have to choose your fence agent carefully, because the right one depends on the hardware/software your cluster relies on.

Once you have chosen it, you can inspect all its options with:

man [fence_agent]

For this article I created a cluster of two machines running on top of KVM, so I'll use the fence_virsh agent.

Since our cluster is already running, we'll create a new shadow configuration to hold our changes; this way we can make changes and examine them further without messing up the working configuration.

crm(live)# cib new fencing
INFO: cib.new: fencing shadow CIB created
crm(fencing)# configure
crm(fencing)configure# property stonith-enabled=yes
crm(fencing)configure# primitive fence_node01 stonith:fence_virsh \
   >         params ipaddr=virtnode01 port=node01 action=off login=root passwd=passwd pcmk_host_list=node01 \
   >         op monitor interval=60s
crm(fencing)configure# primitive fence_node02 stonith:fence_virsh \
   >         params ipaddr=virtnode02 port=node02 action=off login=root passwd=passwd delay=15 pcmk_host_list=node02 \
   >         op monitor interval=60s
crm(fencing)configure# location l_fence_node01 fence_node01 -inf: node01
crm(fencing)configure# location l_fence_node02 fence_node02 -inf: node02
crm(fencing)configure# end
There are changes pending. Do you want to commit them (y/n)? y
crm(fencing)#

Ok, we've written the changes, but the new configuration is not deployed yet.

Now we can simulate what will happen:

crm(fencing)# cib cibstatus simulate

Current cluster status:
Online: [ node01 node02 ]

 IP-rsc_nginx   (ocf::heartbeat:IPaddr2):       Started node01 
 Nginx-rsc      (ocf::heartbeat:nginx): Started node01 
 fence_node01   (stonith:fence_virsh):  Stopped 
 fence_node02   (stonith:fence_virsh):  Stopped 

Transition Summary:
 * Start   fence_node01 (node02)
 * Start   fence_node02 (node01)

Executing cluster transition:
 * Resource action: fence_node01 monitor on node02
 * Resource action: fence_node01 monitor on node01
 * Resource action: fence_node02 monitor on node02
 * Resource action: fence_node02 monitor on node01
 * Pseudo action:   probe_complete
 * Resource action: fence_node01 start on node02
 * Resource action: fence_node02 start on node01
 * Resource action: fence_node01 monitor=60000 on node02
 * Resource action: fence_node02 monitor=60000 on node01

Revised cluster status:
Online: [ node01 node02 ]

 IP-rsc_nginx   (ocf::heartbeat:IPaddr2):       Started node01 
 Nginx-rsc      (ocf::heartbeat:nginx): Started node01 
 fence_node01   (stonith:fence_virsh):  Started node02  
 fence_node02   (stonith:fence_virsh):  Started node01  

crm(fencing)#

This is a nice way to test configuration changes without breaking anything.
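If the simulation had shown something undesirable, the shadow copy could simply be discarded instead of committed (a sketch using crmsh's cib subcommands; "cib use" without an argument switches back to the live CIB):

```
crm(fencing)# cib use
crm(live)# cib delete fencing
```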

Once we've seen that everything will work as expected, we can commit the new changes:

crm(fencing)# cib commit
# crm status
Last updated: Tue Jun 13 16:05:26 2015
Last change: Tue Jun 13 16:05:00 2015
Current DC: node01 (1053402612) - partition with quorum
2 Nodes configured
4 Resources configured


Online: [ node01 node02 ] 

 IP-rsc_nginx   (ocf::heartbeat:IPaddr2):       Started node01 
 Nginx-rsc      (ocf::heartbeat:nginx): Started node01 
 fence_node01        (stonith:fence_virsh):  Started node02 
 fence_node02        (stonith:fence_virsh):  Started node01

We can see the stonith resources running, so our fencing is well configured.

Testing the Cluster

Now it's time to test this war machine, to see if it's reliable and behaves as we expect.

Possible tests

  • Migrate resources
  • Stop/Start Resources
  • Kill a Resource
  • Poweroff one node
  • Put a node as a standby

Migrate a resource

Let's say that for some reason we want to move one resource that is running in node01 to node02.

# crm resource status IP-rsc_nginx
resource IP-rsc_nginx is running on: node01

# crm resource migrate IP-rsc_nginx

Here we'll receive a warning telling us that a new location constraint was created, so the resource will no longer run on the node it was running on before the migration. This is done because it's assumed that if we migrated the resource, it was for a good reason, and we don't want it running there again until we explicitly allow that.

#  crm resource status IP-rsc_nginx
resource IP-rsc_nginx is running on: node02

The resource was migrated over to node02.

** We can remove the location constraint created by the migrate command with:

crm_resource -U --resource [resource]

Stop/Start Resources

crm(live)# status
Last updated: Wed Jun 16 18:30:56 2015
Last change: Wed Jun 16 18:23:59 2015
Stack: corosync
Current DC: node02 (1053402613) - partition with quorum
Version: 1.1.12-561c4cf
2 Nodes configured
4 Resources configured


Online: [ node01 node02 ]

 IP-rsc_nginx   (ocf::heartbeat:IPaddr2):       Started node02 
 Nginx-rsc      (ocf::heartbeat:nginx): Started node02 
 fence_node01   (stonith:fence_virsh):  Started node02  
 fence_node02   (stonith:fence_virsh):  Started node01

crm(live)# resource stop IP-rsc_nginx
crm(live)# resource show
 IP-rsc_nginx   (ocf::heartbeat:IPaddr2):       Stopped 
 Nginx-rsc      (ocf::heartbeat:nginx): Stopped 
 fence_node01   (stonith:fence_virsh):  Started 
 fence_node02   (stonith:fence_virsh):  Started

Since Nginx-rsc depends on IP-rsc_nginx, it was also stopped.

crm(live)# resource start IP-rsc_nginx
crm(live)# resource show
 IP-rsc_nginx   (ocf::heartbeat:IPaddr2):       Started 
 Nginx-rsc      (ocf::heartbeat:nginx): Started 
 fence_node01   (stonith:fence_virsh):  Started 
 fence_node02   (stonith:fence_virsh):  Started 

Kill a Resource

# killall -9 nginx
# pstree | grep nginx          <- no nginx process is listed

When the monitor operation hits again, it detects that Nginx is not running, and pacemaker starts the resource again:

# pstree | grep nginx &> /dev/null && echo "Running"
Running
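An equivalent check can be done with pgrep, which matches the exact process name without a pipeline (a sketch):

```shell
# Print whether an nginx process exists; pgrep exits non-zero when none matches
pgrep -x nginx > /dev/null && echo "Running" || echo "Not running"
```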

Poweroff one node

Here we'll simulate a power cut.

# crm status
Last updated: Tue Jun 16 17:07:37 2015
Last change: Tue Jun 16 17:03:46 2015
Current DC: node01 (1053402612) - partition with quorum
2 Nodes configured
4 Resources configured


Online: [ node01 node02 ]

 IP-rsc_nginx   (ocf::heartbeat:IPaddr2):       Started node02 
 Nginx-rsc      (ocf::heartbeat:nginx): Started node02 
 fence_node01   (stonith:fence_virsh):  Started node02  
 fence_node02   (stonith:fence_virsh):  Started node01

The resources are running on node02, so this is the node we'll poweroff.

Since I'm working in a virtual environment, I'll do a:

virsh destroy node02

crm(live)# status
Last updated: Tue Jun 16 17:10:52 2015
Last change: Tue Jun 16 17:03:47 2015
Current DC: node01 (1053402612) - partition with quorum
2 Nodes configured
4 Resources configured


Online: [ node01 ]
OFFLINE: [ node02 ]

 IP-rsc_nginx   (ocf::heartbeat:IPaddr2):       Started node01 
 Nginx-rsc      (ocf::heartbeat:nginx): Started node01 
 fence_node02   (stonith:fence_virsh):  Started node01

As we can see here, since node02 went away and was marked as OFFLINE, node01 took over the resources.

Put a node as a standby

When we put a node into standby mode, it becomes ineligible to hold resources, and the resources it is currently running will automatically be moved to the other node.

This can be very useful for doing maintenance work on the node, like a software upgrade, configuration changes, etc.

crm(live)# status
Last updated: Wed Jun 16 18:52:52 2015
Last change: Wed Jun 16 18:41:06 2015
Stack: corosync
Current DC: node02 (1053402613) - partition with quorum
Version: 1.1.12-561c4cf
2 Nodes configured
4 Resources configured


Online: [ node01 node02 ] 

 IP-rsc_nginx   (ocf::heartbeat:IPaddr2):       Started node01 
 Nginx-rsc      (ocf::heartbeat:nginx): Started node01 
 fence_node01   (stonith:fence_virsh):  Started node02  
 fence_node02   (stonith:fence_virsh):  Started node01

crm(live)# node standby node01
crm(live)# status
Last updated: Wed Jun 18 10:56:06 2015
Last change: Wed Jun 18 10:53:02 2015
Stack: corosync
Current DC: node02 (1053402613) - partition with quorum
Version: 1.1.12-561c4cf
2 Nodes configured
4 Resources configured


Node node01 (1053402612): standby
Online: [ node02 ]

 IP-rsc_nginx   (ocf::heartbeat:IPaddr2):       Started node02 
 Nginx-rsc      (ocf::heartbeat:nginx): Started node02 
 fence_node01   (stonith:fence_virsh):  Started node02

crm(live)# node show
node02(1053402613): normal
        standby=off
node01(1053402612): normal
        standby=on
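Once the maintenance is finished, the node can be brought back online; given the resource stickiness we configured, the resources will stay where they are:

```
crm(live)# node online node01
```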

Testing Stonith

We've tested the cluster. Now it's time to see if our fencing will run as expected.

Tests we'll do:

  • Kill Corosync
  • Loss-connection (network)

Kill Corosync

It's assumed that both nodes are running together again. We need to choose one of them and kill the corosync process.

node02

# killall -9 corosync

When Pacemaker detects that, we can see that the node gets fenced:

Peer node02 was terminated (reboot) by node01 for node01: OK

Loss-connection

To simulate a loss of connection, we can block the port through which Corosync is messaging, so every node will think it is alone and will take over the resources:

# fw-block

Then we'll see that the fence device without the delay option fires first, so its target node is fenced before the delayed device gets a chance to act. With that we keep consistency: we only shoot one node, and the resources end up safely running in a single place.