Initial commit of rcron python implementation
EvanK committed Mar 17, 2015
0 parents commit f81d1bf
Showing 16 changed files with 658 additions and 0 deletions.
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2015 Evan Kaufman

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
147 changes: 147 additions & 0 deletions README.md
@@ -0,0 +1,147 @@
# Rcron

Cron job redundancy and failover for a group of machines.

This is a Python reimplementation of [Benjamin Pineau's rcron tool](https://code.google.com/p/rcron/), intended as a drop-in replacement.

## Requirements

A POSIX system with a Python 2.4+ interpreter, and a separate tool to maintain state across machines such as [keepalived](http://www.keepalived.org/).

## Installation

The rcron script can live anywhere in the path. For our purposes, we'll expect it to live in `/bin`:

    cp ./src/rcron.py /bin/rcron
    chown root:root /bin/rcron
    chmod 0755 /bin/rcron

It expects a configuration file, either at `/etc/rcron/rcron.conf` or specified with the `--conf` flag:

    mkdir -p /etc/rcron
    cp ./etc/rcron.conf.example /etc/rcron/rcron.conf
    chown root:root /etc/rcron/rcron.conf
    chmod 0644 /etc/rcron/rcron.conf

For Debian and Ubuntu systems, there is an included init.d script to generate a state file upon boot:

    cp ./etc/debian.initd.sh /etc/init.d/rcron
    chown root:root /etc/init.d/rcron
    chmod 0755 /etc/init.d/rcron
    update-rc.d rcron defaults

## Configuration and Usage

The rcron config file supports the following:

* `cluster_name` - An arbitrary name for your group of machines
* `state_file` - A file indicating the current machine's state, defaults to `/var/run/rcron.state`
* `default_state` - Default state if the state file cannot be read, defaults to `active`
* `syslog_facility` - Which [syslog facility](http://en.wikipedia.org/wiki/Syslog#Facility_levels) rcron messages should be generated from, defaults to `LOG_CRON`
* `syslog_level` - Which [severity level](http://en.wikipedia.org/wiki/Syslog#Severity_levels) rcron messages should be generated as, defaults to `LOG_INFO`
* `nice_level` - Job niceness/priority, defaults to `19`
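
To illustrate how these defaults combine with a config file, here is a minimal Python 3 sketch of a `key = value` parser. It is not taken from rcron's source: the names `parse_conf` and `DEFAULTS` are hypothetical, and the default values are simply the ones listed above (`cluster_name` has no documented default, so it is left out).

```python
# Hypothetical sketch: not rcron's actual parser.
DEFAULTS = {
    "state_file": "/var/run/rcron.state",
    "default_state": "active",
    "syslog_facility": "LOG_CRON",
    "syslog_level": "LOG_INFO",
    "nice_level": "19",
}


def parse_conf(text, defaults=DEFAULTS):
    """Parse `key = value` lines, skipping blank lines and `#` comments."""
    conf = dict(defaults)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines without an `=` separator
        conf[key.strip()] = value.strip()
    return conf


example = """
# An arbitrary name
cluster_name = myredundant_jobs
default_state = passive
"""
conf = parse_conf(example)
print(conf["cluster_name"], conf["default_state"], conf["nice_level"])
```

Values are kept as strings here; a real implementation would presumably convert `nice_level` to an integer and validate the syslog names.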

To ensure a job is run only on the active machine in a group, prefix it with your rcron path:

    # Run daily at 3am
    0 3 * * * /bin/rcron my-job --flag=a arg1 arg2

## Maintaining State

An external tool (e.g., keepalived) maintains state across machines so that only one machine in the group has an "active" state and actually _runs_ the job(s).
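
The per-invocation decision can be sketched as follows. This is an illustrative Python 3 sketch, not the project's code: `read_state` and `should_run` are hypothetical names, and treating unrecognised file contents the same as an unreadable file is an assumption.

```python
# Hypothetical sketch of the active/passive check; not rcron's actual code.
import os
import tempfile


def read_state(state_file, default_state="active"):
    """Return "active" or "passive", falling back to default_state."""
    try:
        with open(state_file) as handle:
            state = handle.read().strip()
    except (IOError, OSError):
        return default_state  # state file missing or unreadable
    # Assumption: anything other than the two known words uses the default.
    return state if state in ("active", "passive") else default_state


def should_run(state_file, default_state="active"):
    """A job runs only when this machine's state is "active"."""
    return read_state(state_file, default_state) == "active"


# Simulate what an external tool would write to the state file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "rcron.state")
    missing = should_run(path, default_state="passive")
    with open(path, "w") as handle:
        handle.write("active\n")
    as_master = should_run(path, default_state="passive")
    with open(path, "w") as handle:
        handle.write("passive\n")
    as_backup = should_run(path)

print(missing, as_master, as_backup)  # prints: False True False
```

With a default state of "passive", a machine whose state file has not been written yet stays idle until the external tool promotes it.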

### Two machines, one active and one passive

Both machines can have an identical rcron configuration with a default state of "passive".

The first machine should have keepalived configured with a _higher_ priority, so that it can promote itself to a MASTER state:

    # /etc/keepalived/keepalived.conf on machine 1
    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 31
        priority 100 # higher than my siblings!
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        notify_backup "/bin/echo passive > /var/run/rcron.state"
        notify_master "/bin/echo active > /var/run/rcron.state"
        notify_fault "/bin/echo passive > /var/run/rcron.state"
        notify_stop "/bin/echo passive > /var/run/rcron.state"
    }

The second machine should have a _lower_ priority than the first, so that it defaults to a BACKUP state:

    # /etc/keepalived/keepalived.conf on machine 2
    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 31
        priority 99 # lower than my first sibling!
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        notify_backup "/bin/echo passive > /var/run/rcron.state"
        notify_master "/bin/echo active > /var/run/rcron.state"
        notify_fault "/bin/echo passive > /var/run/rcron.state"
        notify_stop "/bin/echo passive > /var/run/rcron.state"
    }

### Three or more machines

To ensure that only one machine is active, _every_ machine should be configured with a different priority.

For example, if we want to add a third machine to the example above, we would configure a _lower_ priority than the second so that it only promotes itself to a MASTER state when all higher priority machines are unavailable:

    # /etc/keepalived/keepalived.conf on machine 3
    vrrp_instance VI_1 {
        state BACKUP
        interface eth0
        virtual_router_id 31
        priority 98 # lower than my second sibling!
        advert_int 1
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        notify_backup "/bin/echo passive > /var/run/rcron.state"
        notify_master "/bin/echo active > /var/run/rcron.state"
        notify_fault "/bin/echo passive > /var/run/rcron.state"
        notify_stop "/bin/echo passive > /var/run/rcron.state"
    }
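
A toy model of the election shows why distinct priorities yield exactly one active machine. This is a deliberately simplified Python 3 sketch, not keepalived's VRRP implementation: `elect_master` is a hypothetical helper, and it assumes every available machine can see every other available machine.

```python
# Toy model of priority-based election; not keepalived's actual VRRP logic.
def elect_master(priorities, available):
    """Return the available machine with the highest priority, or None."""
    candidates = [name for name in priorities if name in available]
    if not candidates:
        return None
    return max(candidates, key=lambda name: priorities[name])


# Priorities from the three example configurations in this section.
priorities = {"machine1": 100, "machine2": 99, "machine3": 98}

print(elect_master(priorities, {"machine1", "machine2", "machine3"}))
print(elect_master(priorities, {"machine2", "machine3"}))
print(elect_master(priorities, {"machine3"}))
```

If two machines shared a priority, the `max` call above would need a tie-breaker, which is the situation the distinct priorities are meant to avoid.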

### Machines across disparate networks

By default, keepalived uses multicast for the machines to "broadcast" to one another. If machines in the same group live on two or more separate internal networks, they may not be able to reach each other this way. In that case, use _unicast_ (added in keepalived v1.2.8), where each machine in the group is configured with the external IP of every other machine:

    # /etc/keepalived/keepalived.conf on machine 1 via unicast
    vrrp_instance VI_1 {
        state BACKUP
        interface eth1
        virtual_router_id 31
        priority 100
        advert_int 1
        unicast_src_ip 30.100.200.101 # this is me!
        unicast_peer {
            30.100.200.102 # this is my first sibling
            30.100.200.103 # this is my second sibling
            # et al...
        }
        authentication {
            auth_type PASS
            auth_pass 1111
        }
        notify_backup "/bin/echo passive > /var/run/rcron.state"
        notify_master "/bin/echo active > /var/run/rcron.state"
        notify_fault "/bin/echo passive > /var/run/rcron.state"
        notify_stop "/bin/echo passive > /var/run/rcron.state"
    }
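
Since every machine needs its own source IP, its own priority, and the list of all other machines, the per-machine values can be derived from a single list of addresses. The following Python 3 sketch shows one way to do that; `unicast_settings` is a hypothetical helper, and assigning priorities by list position is an assumption made for illustration.

```python
# Hypothetical helper for illustration; not part of rcron or keepalived.
def unicast_settings(ips, top_priority=100):
    """Derive per-machine unicast values from an ordered list of IPs."""
    settings = {}
    for index, ip in enumerate(ips):
        settings[ip] = {
            "priority": top_priority - index,  # distinct per machine
            "unicast_src_ip": ip,  # this machine
            "unicast_peer": [peer for peer in ips if peer != ip],  # siblings
        }
    return settings


ips = ["30.100.200.101", "30.100.200.102", "30.100.200.103"]
for ip, node in unicast_settings(ips).items():
    print(ip, node["priority"], node["unicast_peer"])
```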

## License

As with the original C implementation, this is released under the MIT license.
24 changes: 24 additions & 0 deletions Vagrantfile
@@ -0,0 +1,24 @@
# -*- mode: ruby -*-
# vi: set ft=ruby :

INSTANCES = 3

Vagrant.configure("2") do |config|
  config.vm.box = "trusty64"
  config.vm.box_url = "https://vagrantcloud.com/ubuntu/trusty64/version/1/provider/virtualbox.box"

  (1..INSTANCES).each do |i|
    config.vm.define vm_name = "rcl#{i}" do |rcl|
      rcl.vm.hostname = "rcron-cluster-#{i}"
      ip = "50.50.50.#{i+100}"
      rcl.vm.network :private_network, ip: ip

      rcl.vm.provision "ansible" do |ansible|
        ansible.playbook = "provision/playbook.yml"
        ansible.inventory_path = "provision/local"
        ansible.limit = ip
        #ansible.verbose = "vvvv"
      end
    end
  end
end
46 changes: 46 additions & 0 deletions etc/debian.initd.sh
@@ -0,0 +1,46 @@
#!/bin/sh

### BEGIN INIT INFO
# Provides: rcron
# Required-Start: $remote_fs $syslog
# Default-Start: 2 3 4 5
# Short-Description: Ensure rcron is ready to run
### END INIT INFO

PATH=/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/bin/rcron
NAME=rcron
DESC="redundant cron middleware"
CONFIG=/etc/rcron/rcron.conf

#includes lsb functions
. /lib/lsb/init-functions

#includes service defaults, if any
[ -r /etc/default/rcron ] && . /etc/default/rcron

test -f $CONFIG || exit 0
test -f $DAEMON || exit 0

case "$1" in
    start|restart|reload|force-reload)
        log_daemon_msg "Bootstrapping $DESC" "$NAME"
        $DAEMON --conf=$CONFIG --generate > /dev/null 2>&1

        if [ $? -eq 0 ]; then
            log_end_msg 0
        else
            log_end_msg 1
        fi
        ;;
    stop)
        log_daemon_msg "No-op $DESC" "$NAME"
        log_end_msg 0
        ;;
    *)
        echo "Usage: /etc/init.d/$NAME {start|stop|restart|reload|force-reload}" >&2
        exit 1
        ;;
esac

exit 0
10 changes: 10 additions & 0 deletions etc/rcron.conf.example
@@ -0,0 +1,10 @@
# An arbitrary name
cluster_name = myredundant_jobs
# A file containing either the word "active" or the word "passive"
state_file = /var/run/rcron.state
# The default state in case state_file can't be read
default_state = passive
syslog_facility = LOG_CRON
syslog_level = LOG_INFO
# Tune job niceness/priority (see nice(1)).
nice_level = 19
4 changes: 4 additions & 0 deletions provision/local
@@ -0,0 +1,4 @@
[all]
50.50.50.101
50.50.50.102
50.50.50.103
6 changes: 6 additions & 0 deletions provision/playbook.yml
@@ -0,0 +1,6 @@
---
- hosts: all
  gather_facts: yes
  sudo: true
  roles:
    - rcron
6 changes: 6 additions & 0 deletions provision/roles/rcron/handlers/main.yml
@@ -0,0 +1,6 @@
---
- name: restart rcron
  service: name=rcron state=restarted

- name: restart keepalived
  service: name=keepalived state=restarted
82 changes: 82 additions & 0 deletions provision/roles/rcron/tasks/main.yml
@@ -0,0 +1,82 @@
---
- name: "Install rcron"
  copy: >
    src=../../../../src/rcron.py
    dest=/bin/rcron
    owner=root
    group=root
    mode=0755
- name: "Make rcron etc directory"
  file: >
    path=/etc/rcron
    owner=root
    group=root
    state=directory
- name: "Create rcron config"
  copy: >
    src=../../../../etc/rcron.conf.example
    dest=/etc/rcron/rcron.conf
    owner=root
    group=root
    mode=0644
- name: "Install rcron init script"
  copy: >
    src=../../../../etc/debian.initd.sh
    dest=/etc/init.d/rcron
    owner=root
    group=root
    mode=0755
  notify: restart rcron

- name: "Add keepalived stable ppa"
  apt_repository: repo='ppa:keepalived/stable'

- name: "Pin ppa keepalived package"
  copy: >
    content="Package: keepalived\nPin: release o=LP-PPA-keepalived-stable\nPin-Priority: 900\n"
    dest=/etc/apt/preferences.d/keepalived-stable-900
    owner=root
    group=root
    mode=0644
- name: "Install keepalived with unicast support"
  apt: >
    name=keepalived
    state=latest
    update_cache=yes
- name: "Ensure DAEMON_ARGS defined in keepalived init.d"
  lineinfile: >
    dest=/etc/init.d/keepalived
    regexp='^DAEMON_ARGS='
    line="DAEMON_ARGS=\"{{ '-x' if keepalived_snmp else '' }}\""
    insertafter='^CONFIG='
- name: "Ensure $DAEMON_ARGS used in keepalived init.d start/restart"
  replace: >
    dest=/etc/init.d/keepalived
    regexp='(start-stop-daemon(?:.*?\n?.*?)--start(?:.*?\n?.*?)--exec \$DAEMON);'
    replace='\1 -- $DAEMON_ARGS;'
- name: "Create keepalived config"
  template: >
    src=keepalived.conf.j2
    dest=/etc/keepalived/keepalived.conf
    owner=root
    group=root
    mode=0644
  notify: restart keepalived

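# The reboot entry needs special_time=reboot; it was attached to the minutely job.
- name: "Log every reboot (through rcron)"
  cron: >
    name="rcron reboot"
    special_time=reboot
    job="/bin/rcron echo Rebooted at $(date) >> /tmp/rcron.log"
# No schedule fields given: the cron module defaults each one to "*".
- name: "Run every minute (through rcron)"
  cron: >
    name="rcron minutely"
    job="/bin/rcron echo Ran at $(date) >> /tmp/rcron.log"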
40 changes: 40 additions & 0 deletions provision/roles/rcron/templates/keepalived.conf.j2
@@ -0,0 +1,40 @@
vrrp_instance VI_1 {
    # start all in BACKUP state
    state BACKUP

    # listen on internal/private network
    interface eth1

    # must be same for all nodes
    virtual_router_id 31

    {% for host in groups['all'] %}{% if host == inventory_hostname %}
    # must be *different* for all nodes, for master election purposes
    priority {{ 100 - loop.index0 }}
    {% endif %}{% endfor %}

    # heartbeat every second
    advert_int 1

    # this node
    unicast_src_ip {{ inventory_hostname }}

    # sibling nodes
    unicast_peer {
        {% for host in groups['all'] %}{% if host != inventory_hostname %}
        {{ host }}
        {% endif %}{% endfor %}
    }

    # must be same for all nodes
    authentication {
        auth_type PASS
        auth_pass 1111
    }

    # track state
    notify_backup "/bin/echo passive > /var/run/rcron.state"
    notify_master "/bin/echo active > /var/run/rcron.state"
    notify_fault "/bin/echo passive > /var/run/rcron.state"
    notify_stop "/bin/echo passive > /var/run/rcron.state"
}
3 changes: 3 additions & 0 deletions provision/roles/rcron/vars/main.yml
@@ -0,0 +1,3 @@
---
# off since ppa doesn't currently have snmp support
keepalived_snmp: Off