DevOps stands for development operations. The general concept deals with the ability for developers producing software to collaborate with IT professionals in deploying that software.
Historically, development and IT work were heavily siloed, with little communication between the two groups. Developers would hand their work off to the IT department, which was then responsible for deploying software it didn't know much about.
In modern development, DevOps is often referred to when a small team has no dedicated sysadmin and it's the developer's responsibility to launch their own product. Usually this developer splits their role between being part-time developer and part-time sysadmin. The idea is that through a better process and with some programming, the workflow can be much more efficient.
Continuous Deployment is one way to bring development closer to operations. Heroku is another approach to DevOps, one where the Ops part of the equation is mostly outsourced.
In this repo we'll learn to do sysadmin work both manually and with automation.
First, let's see how to get a simple static site up and running before we graduate to a full Django project.
There's a sample "Coming Soon" portfolio site on github here. Fork it, clone it, and edit it so that it's "Your" portfolio site. You could simply swap in your name, change some of the text, or change the colors. It's up to you. Spend about 15 minutes on that.
For the rest of the week we will be using Amazon's Web Services to learn how to deploy our websites ourselves and get an introduction to IT and DevOps work.
You'll need to create an AWS account to use for these exercises. Don't worry, there's a free usage tier valid for 12 months! The signup process does take some time and requires a verification phone call as well as a credit card.
RocketSpace also has credits for AWS that you can make use of. After today's lesson, if you don't want to use up all of your free hours for the month, shut down your EC2 instances or you may be charged.
Let's set up our first virtual server on Amazon. AWS calls these EC2 (Elastic Compute Cloud) Instances.
- Navigate to the EC2 Service and select Launch Instance
- Select the Ubuntu 14.04 AMI (Amazon Machine Image)
- Confirm you'd like the t2.micro instance type. This is the free tier type, but has the lowest hardware specs.
- Click 5. Tag Instance from the top. For the Value field, name this "portfolio" so we can more easily tell what this EC2 instance is for in the future.
- Click 6. Configure Security Group. We need to add a new rule and select HTTP. This will allow our EC2 instance to host a website and allow website traffic access to it.
- Select the Launch option. We'll need to create a new key pair. Once it's created, make sure you click the Download Key Pair button. A pem file will download (we'll get back to this in a minute).
- Select Launch Instances. Go back to your dashboard for EC2 and check that you now have a running EC2 instance.
If you're not already there, navigate to your running EC2 instance via the EC2 dashboard and the Running Instance link.
We need to find the public IP address of our EC2 instance in order to connect to it. Locate it in the bottom info panel.
Now that we know the IP address, we just need to set up our pem file, which we downloaded while we were setting up our EC2 instance. This file is our "key" that gets us into the server.
First, locate your pem file on your computer and move it to your user's .ssh folder for safekeeping. You can do this with a GUI or the command line. On a Mac, the command is likely `mv ~/Downloads/portfolio.pem ~/.ssh/`, assuming you named your key pair portfolio.
Now we need to change the permissions of the pem file so that we can use it. Change directory to your .ssh folder, then run `chmod 400 portfolio.pem`. This makes the pem file readable only by our currently logged-in user; ssh refuses to use a private key that other users can read.
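To make the permission change concrete, here is a minimal Python sketch of what mode 400 means on Linux or macOS. It works on a throwaway temporary file rather than a real key, and the helper name `describe_mode` is made up for this illustration.

```python
import os
import stat
import tempfile


def describe_mode(path):
    """Return the permission string for a file, e.g. '-r--------'."""
    return stat.filemode(os.stat(path).st_mode)


# Create a throwaway file standing in for portfolio.pem (not a real key).
handle, fake_key_path = tempfile.mkstemp(suffix=".pem")
os.close(handle)

# Equivalent of `chmod 400`: the owner may read, nobody else may do anything.
os.chmod(fake_key_path, 0o400)
mode_string = describe_mode(fake_key_path)
print(mode_string)

# Make the file writable again so it can be cleaned up.
os.chmod(fake_key_path, 0o600)
os.remove(fake_key_path)
```

The first character of the printed string is the file type, followed by three permission triplets for the owner, the group, and everyone else.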
Now that we have the IP address of our EC2 instance and our pem file, we can securely connect to our server. We can do this using SSH (Secure Shell), a protocol for connecting to a remote server.
Let's run `ssh -i portfolio.pem ubuntu@your-ip-address`. Make sure you use your IP address. This tells ssh to connect to our server at our public IP address as the ubuntu user, using our pem file. If we don't pass it the pem file, we won't be allowed in.
If it asks if you're sure you want to connect, say yes. It does this whenever you're sshing to a new IP address for the first time.
Now we're in our EC2 instance! And everything's ready for us to set up our website.
The first thing we want to do is install nginx so that we can start setting up our web server, which will be responsible for receiving HTTP requests and returning HTTP responses.
Ubuntu uses a tool called apt-get to manage installing software. This works by keeping a list of repositories where the software is stored. Over time, these repositories get updated as new versions of the software are released.
Before we can install nginx, we want to make sure Ubuntu has the most up-to-date package lists from those repositories, or we may install an older version of nginx or some other software. To do this we can run `sudo apt-get update`. Once that finishes we can run `sudo apt-get upgrade`, which will upgrade all of our installed packages.
We're currently logged in as the user `ubuntu`. Usually to run apt-get commands we need "root" permissions. By using `sudo` we're saying "SuperUser DO" this command.
Once Ubuntu finishes updating its package lists, we can install nginx by running `sudo apt-get install nginx`. This should look fairly similar to using pip with Python packages. Ultimately apt-get and pip are both tools for downloading and installing software, one for Ubuntu, the other for Python.
The stage is set. Let's get our portfolio site on our server and nginx serving it!
Let's get our portfolio site's code onto our EC2 instance. The easiest way to do this is to clone it onto our server. In the future, if we make an update to it, we can just pull it down via git.
This means we'll need to install git: `sudo apt-get install git`.
For our instance to have access to our github repository, it will need an authorized ssh key.
Last, but not least, we need to set up our web server so that it knows about the new website code we just placed onto our server.
Nginx has its own configuration file format, which we'll need to set up for our website. You can potentially have multiple websites running on the same server by using multiple configuration files.
The configuration files all live in a specific folder. Let's change directory to it so we can make our new one: `cd /etc/nginx/sites-enabled/`.
You'll notice there is already a `default` configuration file in here. Take a look at your dashboard again on EC2 for this running instance. Go to the public DNS url listed in the info pane. You should see a Welcome to nginx! page. This is controlled by this `default` nginx config.
To get up and running as quickly as possible, let's edit and use this config. Let's rename it to something more appropriate: `sudo mv default portfolio.conf`.
Since we're ssh'd into a remote server, we can't just open up this file in our favorite text editor. This means we have to use one of the universal text editors that is installed on most servers.
Our options are nano, vim, and emacs. If you are familiar with one of these, please use that one to start editing the file. If you're not, let's use emacs. The shortcut keys can be hard to learn, so let's walk through it step by step.
- Let's install emacs: `sudo apt-get install emacs`
- Next, let's open our file: `sudo emacs portfolio.conf`
- All we need to do is edit the location where the website files live: `root /usr/share/nginx/html;` becomes `root /home/ubuntu/portfolio;` (adjust the path if your cloned folder has a different name).
- To save in emacs we need to hit: Ctrl-x then Ctrl-s
- To exit emacs we need to hit: Ctrl-x then Ctrl-c
- Lastly, we need to restart nginx so our changes take effect: `sudo service nginx restart`. `service` is a useful tool that gives us shortcut commands to start, stop, and restart various tools on our server.
To give the instance the authorized ssh key it needs to clone from github:
- Create an ssh key on the server: `ssh-keygen -t rsa`
- Then print the public key with `cat ~/.ssh/id_rsa.pub` and paste it into the deploy key setting for our repository on github (Settings -> Deploy keys).
- Run `cd ~/` to make sure we're back in our home folder, and now let's clone our site: `git clone git@github.com:username/rocketu-portfolio.git`. Make sure you replace `username` with your github username and use the proper link to your repo.
Nginx is the web server and responsible for receiving and sending messages to our user's browser.
Since this first example is only a static website and not using python, we do not need to worry about the WSGI layer of our architecture.
Our `index.html` is just a static file, and nginx knows how to read the static file and send the contents to the browser. Essentially it turns our HTML into a string that gets sent in the data portion of the HTTP response.
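To picture what "HTML as a string in the data portion of the response" looks like, here is a minimal Python sketch that assembles a response by hand. It is only an illustration of the message format, not how nginx is implemented, and `build_response` is a hypothetical helper name.

```python
def build_response(html):
    """Wrap an HTML string in a bare-bones HTTP/1.1 response."""
    body = html.encode("utf-8")
    headers = [
        "HTTP/1.1 200 OK",
        "Content-Type: text/html; charset=utf-8",
        "Content-Length: {}".format(len(body)),
    ]
    # A blank line separates the headers from the body.
    head = "\r\n".join(headers) + "\r\n\r\n"
    return head.encode("ascii") + body


response = build_response("<h1>Coming Soon</h1>")
print(response.decode("utf-8"))
```

Everything after the blank line is the contents of our file, byte for byte. The lines before it are the metadata the browser uses to interpret those bytes.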
We'll see that things get a bit more complicated once we need to set up a Django application. We'll also dive a bit more into the nginx configuration file.
It can be annoying to have to constantly remember our IP addresses and pem files when sshing into our servers. It becomes especially annoying when you have many different ones you need to ssh into often.
On our computers we can edit a configuration file that lets us assign a nickname, so we can simply type `ssh portfolio` and the configuration file will do the rest of the work for us.
Let's set it up!
- Let's first create the file if it doesn't exist already: `touch ~/.ssh/config`.
- Since we're on our own computer, let's open the file in our preferred text editor: `open ~/.ssh/config`.
- Add the following lines, replacing the IP address with your EC2 instance's public IP address.

      Host portfolio
          HostName your_ec2_ip_address
          User ubuntu
          IdentityFile ~/.ssh/portfolio.pem

- This simply configures the different parts of our normal ssh command: `ssh -i ~/.ssh/portfolio.pem ubuntu@your_ec2_ip_address` -> `ssh -i IdentityFile User@HostName`
- Save the file and now try `ssh portfolio`. Now you don't have to look up your IP address every time!
- Make a change to your portfolio site locally on your computer. Commit and push the changes back to github.
- SSH back into your server, pull down the changes and restart nginx.
- Confirm you see your new changes in the browser!
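To see how the ssh config nickname maps back onto the long command, here is a small Python sketch that reads a `Host` block and rebuilds the equivalent `ssh -i ...` line. The parser is deliberately simplified (it ignores wildcards and most of what real ssh config files support), and `parse_ssh_config` and `expand` are hypothetical helper names.

```python
SSH_CONFIG = """
Host portfolio
    HostName your_ec2_ip_address
    User ubuntu
    IdentityFile ~/.ssh/portfolio.pem
"""


def parse_ssh_config(text):
    """Parse `Host` blocks into {nickname: {option: value}} (simplified)."""
    hosts = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        if key.lower() == "host":
            # A new Host line starts a fresh block of options.
            current = hosts.setdefault(value.strip(), {})
        elif current is not None:
            current[key] = value.strip()
    return hosts


def expand(nickname, hosts):
    """Build the long-form ssh command that the nickname stands for."""
    options = hosts[nickname]
    return "ssh -i {IdentityFile} {User}@{HostName}".format(**options)


command = expand("portfolio", parse_ssh_config(SSH_CONFIG))
print(command)
```

Each option in the block fills one slot of the command we were typing by hand, which is all the nickname really saves us.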
From this point forward, we will be using the rocketu_blog_analytics project for deployment, first with Fabric and then with Ansible. First, let's trace back through our steps and create a new EC2 instance for our Django application.
We're using a new EC2 instance to get more repetition in setting one up. Remember to shut down one or both of these instances when you're done with them.
Create a new key pair to practice chmoding it, moving it, and sshing with it.
Once you've sshed into your new server, follow the previous steps again: update apt-get, install git, set up a deploy ssh key, clone the project to /home/ubuntu, install nginx, and install emacs.
Wait to set up your nginx config. This will be more complicated for our Django application!
Just like all of the steps involved in starting a new Django project, we will need to do many of the same things on our server. This includes creating a virtual environment, installing packages, setting up postgreSQL, and running migrations.
Let's tackle these one by one.
Let's start with setting up our virtualenv.
- We need to install apt-get's Python libraries, which include pip: `sudo apt-get install python-pip python-dev build-essential`
- Then we need to upgrade pip and virtualenv, and install virtualenvwrapper: `sudo pip install pip --upgrade`, `sudo pip install virtualenv --upgrade`, `sudo pip install virtualenvwrapper`
- Now we need to set up the folder where our virtualenvs will be saved: `mkdir ~/.virtualenvs`
- Next let's edit our bashrc to point to that folder and to where we installed virtualenvwrapper: run `emacs ~/.bashrc`, then add `export WORKON_HOME=~/.virtualenvs` and `source "/usr/local/bin/virtualenvwrapper.sh"` to the bottom of the file.
- Reload our bashrc so the changes take effect: `source ~/.bashrc`
- Create the virtualenv that we'll be using: `mkvirtualenv blog_analytics`
Next up, let's set up PostgreSQL on our server.
- First we need to add a new user account on our server: `sudo adduser blog_analytics`. When prompted, set a secure password for this user. Remember it! You'll need it in a later step.
- We need to install the appropriate PostgreSQL packages from apt-get: `sudo apt-get install postgresql postgresql-contrib libpq-dev`.
- Once those finish installing we can switch to the postgres user account, which has access to the database: `sudo -i -u postgres`.
- We want to create a new postgres user for our application: `createuser --interactive`. Name the user `blog_analytics` when prompted. We don't want to give the user any extra permissions, so just say no to the questions.
- Now we can create the database for our user and Django application: `createdb blog_analytics`.
- Our user and db should be created, so let's switch to that user. If you're still the postgres user, you'll need to type `exit` first, then `sudo -i -u blog_analytics`.
- Now just type `psql`, and you should connect to the database.
- Let's set up a password for our postgres user blog_analytics by running: `ALTER USER blog_analytics WITH PASSWORD 'myawesomepassword';`. Make sure you replace myawesomepassword with your secure password.
Now that we have PostgreSQL and our virtualenv set up, we can set up our local settings file and install our Python dependencies.
- Let's make a copy of our local settings template: `cp local_settings.py.template local_settings.py`.
- Now let's open the file using `emacs local_settings.py` and fill in some new information about our server:

      import os

      # Connection details for the PostgreSQL user and database created above.
      DATABASES = {
          'default': {
              'ENGINE': 'django.db.backends.postgresql_psycopg2',
              'NAME': 'blog_analytics',
              'HOST': 'localhost',
              'USER': 'blog_analytics',
              'PASSWORD': 'myawesomepassword'
          }
      }

      # collectstatic gathers static files into STATIC_ROOT.
      BASE_DIR = os.path.dirname(os.path.dirname(__file__))
      STATIC_ROOT = os.path.join(BASE_DIR, 'static')
      STATIC_URL = '/static/'

- Next we can install all of the packages in our requirements file: `pip install -r requirements.txt`.
- Make sure you have all of your packages listed in your requirements file!
- Run `python manage.py runserver` to test that we have everything working and can talk to the database.
- Note we have a warning in red saying we should apply our migrations. Let's do that next: `python manage.py migrate`
- We also set up our static files directory in `local_settings.py`. Let's run `python manage.py collectstatic`, which will gather our static files into that directory for later.
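It's worth checking where `STATIC_ROOT` actually ends up, because nginx has to be pointed at the same folder when it serves `/static/`. Here is a small sketch of how the `BASE_DIR` expression resolves. The settings file location used below is an assumption for illustration; substitute wherever your `local_settings.py` really lives.

```python
import posixpath

# Hypothetical location of local_settings.py on the server; yours may differ.
settings_file = "/home/ubuntu/rocketu_blog_analytics/local_settings.py"

# Same expression as in local_settings.py, with __file__ spelled out.
# Each dirname() call strips one trailing path component.
base_dir = posixpath.dirname(posixpath.dirname(settings_file))
static_root = posixpath.join(base_dir, "static")

print(base_dir)
print(static_root)
```

If the settings file sits one folder deeper, both values gain an extra path component, so it pays to print `STATIC_ROOT` once and confirm it.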
We have now gone through all the familiar bits of getting our project working locally, except on Ubuntu. Next we need to get our Django application synced with nginx so we can actually receive and respond to web requests to our server.
Please follow along carefully. Any mistyped character or file placed in the wrong directory will lead to your setup not working.
You're likely not yet comfortable with the command line text editor or with all of the terminology, which is even more reason to pay careful attention.
Gunicorn is a WSGI server that we can use to allow our Django application to talk with nginx. Setting it up is very straightforward.
- First, let's install it in our virtualenv: `pip install gunicorn`.
- Next let's create our gunicorn config file: `touch rocketu_blog_analytics/gunicorn.conf.py`. Make sure it's in the root folder of our Django project.
- We'll need to open it in our text editor, `emacs rocketu_blog_analytics/gunicorn.conf.py`, and place the following code:

      proc_name = "rocketu_blog_analytics"
      bind = '127.0.0.1:8001'
      loglevel = "error"
      workers = 2

- There are a lot more configuration options than the ones we specified, but keeping it this simple works for now. `workers` specifies how many worker processes gunicorn should start. This means we could handle 2 incoming requests at the same time. Usually you want a number of workers that corresponds to the number of CPU cores on your machine. `bind` refers to localhost and the port gunicorn is running on. If we had multiple Django applications on this EC2 instance, we couldn't bind another one to port 8001; we'd choose a different port for it.
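Because the gunicorn config file is plain Python, the worker count can be computed instead of hard-coded. The sketch below is one possible variation on the file above, not a recommended production setting: it ties `workers` to the core count as described, while never dropping below the 2 we started with.

```python
import multiprocessing

proc_name = "rocketu_blog_analytics"
bind = "127.0.0.1:8001"
loglevel = "error"

# Roughly one worker per CPU core, but keep at least the 2 we configured by hand.
workers = max(2, multiprocessing.cpu_count())
```

The same file then works unchanged if the project later moves to a larger instance type.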
Supervisor is a tool that helps manage other tools. The idea is that we can use supervisor to stop, start, and restart gunicorn quickly and reliably.
- As usual, let's install it first: `sudo apt-get install supervisor`.
- To get it started and check that it was installed, let's restart the supervisor service: `sudo service supervisor restart`.
- Supervisor also works by having a configuration file. Let's create one for our project: `sudo touch /etc/supervisor/conf.d/rocketu_blog_analytics.conf`. Note, it belongs inside supervisor's conf.d directory.
- Now let's open it up in our editor, `sudo emacs /etc/supervisor/conf.d/rocketu_blog_analytics.conf`, and add the following:

      [group:rocketu_blog_analytics]
      programs=gunicorn_rocketu_blog_analytics

      [program:gunicorn_rocketu_blog_analytics]
      command=/home/ubuntu/.virtualenvs/blog_analytics/bin/gunicorn -c gunicorn.conf.py -p gunicorn.pid wsgi:application --pythonpath /home/ubuntu/rocketu_blog_analytics/rocketu_blog_analytics
      directory=/home/ubuntu/rocketu_blog_analytics
      user=ubuntu
      autostart=true
      autorestart=true
      redirect_stderr=true

- Let's restart supervisor to make sure it works: `sudo service supervisor restart`. This will run the gunicorn program we defined in our supervisor config file.
- Notice that we set the command to run gunicorn for our project and pointed it at the `gunicorn.conf.py` configuration file we set up.
- We specify that the command should be run by our ubuntu user, from our Django project directory.
To put it all together, we now need to set up nginx to point to our Django project and the gunicorn WSGI server we just set up.
- This time we're going to make the nginx config file from scratch: `sudo touch /etc/nginx/sites-available/rocketu_blog_analytics.conf`.
- Let's open it up for editing, `sudo emacs /etc/nginx/sites-available/rocketu_blog_analytics.conf`, and place in the following:

      server {
          server_name your_ec2_url;

          access_log off;

          location /static/ {
              alias /home/ubuntu/static/;
          }

          location / {
              proxy_pass http://127.0.0.1:8001;
              proxy_set_header X-Forwarded-Host $server_name;
              proxy_set_header X-Real-IP $remote_addr;
              proxy_set_header Host $host;
          }
      }

- Make sure you replace your_ec2_url with your EC2 public url in the nginx config.
- Notice we set up a special location for the static files. This helps us serve the css, js, and images for our application.
- We also point `proxy_pass` to 127.0.0.1:8001. This has to match the address we configured gunicorn to run on.
- The rest of these headers tell nginx to pass the appropriate information on to gunicorn and subsequently Django.
- You may not have noticed, but we put the file in the `sites-available` folder this time, instead of `sites-enabled`. You're supposed to put all available nginx configs in the available folder, and then symlink the ones you want active into the enabled folder.
- Let's symlink our new nginx config file: `sudo ln -s /etc/nginx/sites-available/rocketu_blog_analytics.conf /etc/nginx/sites-enabled/rocketu_blog_analytics.conf`.
- Now let's also remove the old symlinked default file: `sudo rm /etc/nginx/sites-enabled/default`.
- Let's restart nginx, `sudo service nginx restart`, and go to our browser!
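To see where the `proxy_set_header` values end up, here is a minimal WSGI sketch. Under the WSGI convention, each incoming request header appears in the `environ` dictionary as an `HTTP_`-prefixed key, which is how gunicorn hands them to Django. The tiny `application` below is a stand-in for illustration only, not our Django app, and the request is simulated rather than sent through nginx.

```python
def application(environ, start_response):
    """Tiny WSGI app that echoes the headers nginx forwarded."""
    lines = [
        "Host: {}".format(environ.get("HTTP_HOST", "")),
        "X-Real-IP: {}".format(environ.get("HTTP_X_REAL_IP", "")),
        "X-Forwarded-Host: {}".format(environ.get("HTTP_X_FORWARDED_HOST", "")),
    ]
    body = "\n".join(lines).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


# Simulate one proxied request without starting a server.
fake_environ = {
    "HTTP_HOST": "your_ec2_url",
    "HTTP_X_REAL_IP": "203.0.113.7",  # placeholder address for the example
    "HTTP_X_FORWARDED_HOST": "your_ec2_url",
}
statuses = []
chunks = application(fake_environ, lambda status, headers: statuses.append(status))
print(b"".join(chunks).decode("utf-8"))
```

Without the `X-Real-IP` header, the application would only ever see 127.0.0.1 as the client, since every request reaches gunicorn from nginx on the same machine.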
Our site's up! There are still a few things we should do before it's complete.
- Navigate to a url that you know doesn't exist. Oops, we're still in debug mode! Not good.
- Let's edit our local settings file, `emacs local_settings.py`, and add `DEBUG = False`
- We then need to restart supervisor: `sudo service supervisor restart`
- And we need to restart nginx: `sudo service nginx restart`
- Note, you'll need to do these every time you change some Python code, because the running gunicorn workers keep the old code loaded until they are restarted.
- Uh oh, we broke something!?

Something about our site does not work when we're not in DEBUG mode. This is something that will definitely happen to you again, so pay attention!
- We need to set the `ALLOWED_HOSTS` setting so that Django accepts requests addressed to our EC2 url.
- Edit our local settings file, `emacs local_settings.py`, and add the following: `ALLOWED_HOSTS = ['your_ec2_url']`. Note, you need to use your EC2 instance url!
- Again we need to restart supervisor: `sudo service supervisor restart`.
- And restart nginx: `sudo service nginx restart`.
- And our site works!
- Navigate around the site, and create a superuser to log into the admin to see your analytics.
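Why did turning off DEBUG break the site? With `DEBUG = False`, Django compares the `Host` header of every request against `ALLOWED_HOSTS` and answers with a 400 Bad Request when nothing matches. The sketch below is a simplified imitation of that check, written from scratch for illustration; `host_is_allowed` is a hypothetical name and this is not Django's own code.

```python
def host_is_allowed(host, allowed_hosts):
    """Simplified version of the check Django applies to the Host header."""
    host = host.split(":")[0].lower()  # ignore any port (IPv6 not handled)
    for pattern in allowed_hosts:
        pattern = pattern.lower()
        if pattern == "*" or pattern == host:
            return True
        # A leading dot matches the domain itself and any subdomain.
        if pattern.startswith(".") and (
            host == pattern[1:] or host.endswith(pattern)
        ):
            return True
    return False


ALLOWED_HOSTS = ["your_ec2_url"]
print(host_is_allowed("your_ec2_url", ALLOWED_HOSTS))
print(host_is_allowed("203.0.113.7", ALLOWED_HOSTS))
```

The second call models visiting the site by a bare IP address that is not in the list. That request would be rejected even though nginx happily forwarded it.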
Up to this point, we have seen what it takes to get a virtual server up and running with a web server and project on it.
Imagine if we were working with a large team of developers and we had to push new updates every day, maybe even multiple times a day.
Imagine if the website we were running actually received hundreds of thousands of users in traffic every day. We would need to have many servers running simultaneously.
Our entire job, every day, would just be sshing into servers and updating the code. Training someone else to also do our job or collaborating with a team of sysadmins would be a nightmare.
On a team, when the realities of the scenarios we just described set in, sysadmins and developers start looking for tools to help automate this process. After all, it is essentially just a repeatable list of steps.
Today we're going to take a walk through those automation tools and see how we can abstract away the need to manually ssh into a server.
This is going to make deploying much better, but we wouldn't have the ability to put together these automation tools unless we had done it by hand before.
It's also going to give us some insight into what Heroku is probably doing, when we run specific commands to launch our application.
Our tool of choice this morning will be Fabric. This is a python library that gives us handy functions for running commands on remote servers. It also allows us to parallelize these commands, so that we could potentially update multiple servers at once.
Fabric also has the ability to assign roles to different servers. This means we can run certain commands on different servers we may have running in our system.
For today's purposes though, we'll just be worrying about writing some commands to automate tasks on our one server.
Let's create our first Fabric task, which will just print to the console to prove we have it working.
- Let's first install fabric so we can use it: `pip install fabric`. The examples in this lesson use the Fabric 1.x API (`fabric.api`), which later major versions replaced, so if pip installs a newer release you may need `pip install "fabric<2"` instead.
- Next, we'll need to make a new file in our blog analytics project called `fabfile.py`. Make this file in your root folder, the same one that contains `manage.py`
- Let's define our first task, hello.

      from fabric.api import task

      @task
      def hello():
          print("I'm alive!")

- Note that we're using a decorator called `@task`. This comes from Fabric, and we have to put it on any function that we want to be able to call from the command line.
- Let's run our new task: `fab hello`. When we run a fabric command using fab, it looks for a file called `fabfile.py` or a directory called `fabfile` to import commands from.

We've now successfully run a fabric command and had something print to our console. We can run `fab -l` to see the list of all of our available commands at any given time.
Our message right now is trivial though. Let's spruce it up by adding some color. Add `from fabric.colors import green` to the top of your fabfile and change your print line to `print(green("I'm alive!"))`, then run your command again.
It's green! An important part of fabric scripts is making sure that appropriate output is displayed to the user running the script. Using different colors can help signify successes, warnings, and failures.
Now that we have our fabfile working, let's do something more than just printing. Fabric gives us a function, `local()`, which takes a command as a string and runs it locally on your computer.
Let's create a task that makes an empty file on our Desktop as an example.

    from fabric.api import local, task

    @task
    def create_file():
        # Runs on your own computer, not on the server.
        local("touch ~/Desktop/dummy_file.txt")

Please change the path to your Desktop as needed for your computer. Run this fab command and try it out: `fab create_file`. Check that the file was created on your Desktop.
We can do even better with our new task. Let's say we want the ability to specify the name of the file that our Fabric task should create. Our tasks can receive arguments just like normal functions, and we can pass in these arguments when we call the command.

    @task
    def create_file(file_name):
        local("touch ~/Desktop/{}.txt".format(file_name))

All we did was add an argument to our function and then use it to build a command that makes a file with that name.
Let's run it and pass it a file name: `fab create_file:test_arguments`. This should create an empty `test_arguments.txt` file on your Desktop! All we did was pass the arguments after the command name, separated by a colon.
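To make the `task:argument` syntax less magical, here is a rough Python sketch of how a string like `create_file:test_arguments` can be split into a function name and its arguments. This is an illustration under simplifying assumptions (no keyword arguments, no escaping), not Fabric's actual parser, and `parse_task_spec` is a made-up name.

```python
def parse_task_spec(spec):
    """Split 'task:arg1,arg2' into a task name and a list of string arguments."""
    name, _, raw_args = spec.partition(":")
    args = raw_args.split(",") if raw_args else []
    return name, args


def create_file(file_name):
    """Stand-in for the task above: returns the command instead of running it."""
    return "touch ~/Desktop/{}.txt".format(file_name)


name, args = parse_task_spec("create_file:test_arguments")
print(name, args)
print(create_file(*args))
```

One consequence worth remembering: arguments passed on the command line always arrive as strings, so a task expecting a number has to convert it itself.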
Let's try to quickly write our own tasks to do something locally on our computer, to make sure we understand what's happening.
- Write a task to create a new directory called "my_directory" on your Desktop.
- Write a task that takes in two arguments, one is the name of the directory you should create and the other is the path to the folder you should make that directory in.
Okay, the real reason for using fabric is to be able to ssh into our server and run commands remotely. Let's see how we can easily do that.
At the top of our fabfile, we're going to need to set some global env variables so our script knows how to ssh into our server. The way this works is a bit "unpythonic", but we have no other choice.
from fabric.api import *
env.hosts = ['your_public_ip_address']
env.user = 'ubuntu'
env.key_filename = '~/.ssh/your_pem_file.pem'
As usual, remember to substitute the appropriate public IP address and name of your pem file in your local version of the fabfile.
Now let's write a simple task that runs a remote command on our server. Fabric's `run` function can be used to accomplish this.

    @task
    def ubuntu_hello():
        run("lsb_release -a")

`lsb_release -a` is a command that has Ubuntu print version info about itself. This will prove that we actually made it onto our server. All we did was pass the command we wanted to run, as a string, to `run`.
Run our new command using `fab ubuntu_hello`, and test that it works.
If you look at the output of our command, the info we get from `lsb_release -a` is a bit hidden among the rest of the fab output.
By default, the stdout and stderr of the commands we run on our server are just printed to our console locally.
We can change this by choosing not to show the output of the command and instead storing it in a variable. We can then color and print the output so that it stands out more. This is a common way to make important output stand out in a script.

    from fabric.colors import yellow

    @task
    def ubuntu_hello():
        # Capture the remote output instead of echoing it straight away.
        with hide("stdout"):
            output = run("lsb_release -a")
        print(yellow(output))

Run the command again to see the change in the output.
Here, we've used the context manager `hide` to suppress any output to stdout that happens inside of its code block. Then, as we mentioned, we save the return value of `run` to a variable so we can print and style it as we see fit.
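If context managers that swallow output feel mysterious, the idea can be reproduced with the standard library alone. The sketch below is not how Fabric implements `hide`; `hide_stdout` and `fake_run` are hypothetical stand-ins that mimic the pattern of silencing a noisy call while keeping its return value.

```python
import contextlib
import io


@contextlib.contextmanager
def hide_stdout():
    """Swallow anything printed inside the block (illustration only)."""
    with contextlib.redirect_stdout(io.StringIO()):
        yield


def fake_run(command):
    """Pretend to run a remote command: echo noisily, then return the output."""
    result = "example output from {}".format(command)
    print("[server] out: " + result)
    return result


with hide_stdout():
    output = fake_run("lsb_release -a")  # the noisy echo is suppressed here

print(output.upper())  # we decide how the captured output is shown
```

The return value survives the block even though nothing was printed inside it, which is exactly what lets us restyle the output afterwards.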
We've played with a few of the basics of Fabric, let's see how this would really apply to us in DevOps.
A common task is to update our existing Django project on our server. That process has the following steps:
- Pull the latest from git
- pip install our requirements file in case there was a new library added
- Run migrations in case there have been any schema changes.
- Run collectstatic to get any new static files
- Restart supervisor
- Restart nginx

We've got our list of remote commands that need to be run. Let's set up a Fabric task that executes this list sequentially.
Let's work on pulling the latest from github first. In order to do this, we'll need to be in the directory that our Django project lives in. This is easy enough with the `cd` context manager that fabric defines for us.

    @task
    def deploy():
        with cd("/home/ubuntu/rocketu_blog_analytics"):
            run("git pull origin master")

This runs any command inside the context manager from within the `rocketu_blog_analytics` folder. It will effectively run `cd /home/ubuntu/rocketu_blog_analytics`, then run `git pull origin master`.
Try it out!
Next on our list is installing any changes from our requirements file. The command part is easy enough, but the trick here is that we need to be inside of our virtualenv when we run this command.
Again, fabric has another context manager, called `prefix`, that we can use to run `workon blog_analytics` before we use `pip`.

    @task
    def deploy():
        with prefix("workon blog_analytics"):
            with cd("/home/ubuntu/rocketu_blog_analytics"):
                run("git pull origin master")
                run("pip install -r requirements.txt")

Try running this. It still won't work, because we're using virtualenvwrapper, which requires that our `.bashrc` file is loaded. To fix this, we need to add a line to the top of our fabfile: `env.shell = "/bin/bash -l -i -c"`. This line, specifically the `-i` flag, tells the shell running our remote commands that it should source any bash profiles that exist for our user.
There are four commands left for you to automate. We now have all of the context managers set up so that we're inside the proper virtualenv and in the right folder to run them all.
Try to implement the rest yourself. Hint: You need to use sudo to restart supervisor and nginx. Check out the fabric documentation for it if you need help.
- Run migrations in case there have been any schema changes.
- Run collectstatic to get any new static files
- Restart supervisor
- Restart nginx
Since our examples this morning are so small, we can't see the immediate benefit, but splitting our fabfile functions into small reusable chunks will save time later.
In this exercise we could split out restarting supervisor and nginx into its own function called `restart_app`.

    def restart_app():
        sudo("service supervisor restart")
        sudo("service nginx restart")

Note that I don't have to use the @task decorator, since this isn't a standalone task I want to call via `fab restart_app`. It is just a helper function that will be called by the other functions.
Let's take a look at another example that's completed already.
    @task
    def setup_postgres(database_name, password):
        # Create the system user and install the PostgreSQL packages.
        sudo("adduser {}".format(database_name))
        sudo("apt-get install postgresql postgresql-contrib libpq-dev")
        # Everything in this block runs as the postgres user.
        with settings(sudo_user='postgres'):
            sudo("createuser {}".format(database_name))
            sudo("createdb {}".format(database_name))
            alter_user_statement = "ALTER USER {} WITH PASSWORD '{}';".format(database_name, password)
            sudo('psql -c "{}"'.format(alter_user_statement))

This fabric task helps automate setting up the different moving parts of a PostgreSQL database for our Django application.
Step by step, this runs the list of necessary commands that we ran by hand earlier. It also takes in variables, so we could configure this for any database name and password.
There is one new bit of functionality introduced here, which is the `settings` context manager. This specifies that any command run inside this code block uses the `postgres` user.
Fabric also comes with some built-in methods for manipulating files. Specifically we're going to take a look at upload_template
, which will help us set up files on our remote server, like our nginx configuration file.
As the name suggests, they are templates, and like Django templates they have some basic functionality to replace variables in them with a dictionary of data we pass to it.
Let's check out how we would upload our nginx configuration file.
We'll create a deploy folder in our Django project. It is best practice to designate a folder where you will keep these configuration templates.
Inside this folder, let's create an empty file called nginx.conf. In here, we want to put our basic nginx configuration.
Below are the contents of the nginx.conf file.
server {
    server_name %(server_name)s;
    access_log off;
    location /static/ {
        alias /home/ubuntu/static/;
    }
    location / {
        proxy_pass http://127.0.0.1:8001;
        proxy_set_header X-Forwarded-Host $server_name;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $host;
    }
}
Note that we've set up a variable for our server name. In a real life deployment scenario, we may want to use this script to not only deploy to production, but possibly to a staging or development server as well. By setting our fabfile up to work with variables, we're giving ourselves flexibility to deploy to different environments.
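The %(server_name)s placeholder is Python's dictionary-based string interpolation, which is what upload_template applies by default in Fabric 1.x (Jinja2 is opt-in there). A minimal local sketch of that substitution, using a shortened template and a made-up host name:

```python
template = (
    "server {\n"
    "    server_name %(server_name)s;\n"
    "    access_log off;\n"
    "}\n"
)

# The dictionary plays the role of the context passed to upload_template.
context = {"server_name": "staging.example.com"}

rendered = template % context
print(rendered)
```

Swapping the dictionary is all it takes to render the same file for another environment. A literal percent sign in such a template has to be written as %%.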
Now let's actually upload this template to our server.
@task
def setup_nginx(project_name, server_name):
    upload_template("./deploy/nginx.conf",
                    "/etc/nginx/sites-enabled/{}.conf".format(project_name),
                    {'server_name': server_name},
                    use_sudo=True,
                    backup=False)
    restart_app()
Here we've specified:
- the local path to the configuration file to be uploaded
- the remote path on the server where it should be uploaded
- a dictionary of data to be used to replace variables in the template
- that we should use sudo
- that we do not want to make a backup
Run it!
fab setup_nginx:rocketu_blog_analytics,your_ec2_url
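The part after the colon is how fab passes arguments to a task: values are comma-separated and always arrive as strings. The sketch below is a simplified imitation of that mapping, not Fabric's own parser (it ignores keyword arguments and escaping); call_like_fab and the returned string are hypothetical.

```python
def setup_nginx(project_name, server_name):
    # Stand-in for the real task: report what would be uploaded and where.
    destination = "/etc/nginx/sites-enabled/{}.conf".format(project_name)
    return "{} -> {}".format(server_name, destination)


def call_like_fab(spec, tasks):
    # Split "task:arg1,arg2" into a task name and positional string arguments.
    name, _, raw_args = spec.partition(":")
    args = raw_args.split(",") if raw_args else []
    return tasks[name](*args)


result = call_like_fab(
    "setup_nginx:rocketu_blog_analytics,your_ec2_url",
    {"setup_nginx": setup_nginx},
)
print(result)
```

The first value lands in project_name and the second in server_name, in the order the task's parameters are declared.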
Let's wrap up our lesson on Fabric by trying to implement the gunicorn and supervisor conf files we set up yesterday.
- Create the appropriate conf files in your deploy folder. Make them project agnostic, so that rocketu_blog_analytics or blog_analytics does not show up anywhere in them.
- Create a new task, which will upload those templates, insert the appropriate variables, and place them in the right directory on our server.
- Make sure you use our helper method, restart_app, to restart supervisor and nginx after you're finished uploading the templates.
Obviously we have just scratched the surface of what a large fabric deployment script would look like. Mezzanine, an open source Django blog project, has a built-in fabfile, which is very comprehensive and a good place to see examples.
Here are some lessons learned from using Fabric to keep in mind if you ever need to automate your deployment process:
- Try to keep your fabfile's methods in small reusable chunks.
- Splitting your fabfile.py out into a fabfile directory with multiple files can make it easier to read and maintain.
- First manually attempt whatever you're trying to automate and take careful notes of each step. This will save you both time and headaches.
- Find a good way to reliably test that your script is working. Rerunning the whole script from the beginning every time you make a small change can be a nightmare.
As we saw earlier, Fabric is great for automating our deployment process by scripting it in Python. In practice, this works great at first, and for small teams it can be perfect.
As the complexity and size of your system grows, along with the number of people who need to work with your fabfile, issues start to come up. The fabfile can quickly become a burdensome application to maintain, and it is hard to keep clean and organized. It is usually hard for anyone outside of the core maintainers to understand how it operates.
This is where Ansible comes in. Ansible is a Configuration Management and Deployment Automation tool, which at the core does a lot of the same things our fabfile does, but makes it easier to configure, read, and understand for the developer.
Popular competitors to Ansible are Chef, Puppet, and Salt Stack. Chef & Puppet have been around for much longer and are considered more mature options, but both are written in Ruby and have a steep learning curve. Ansible and Salt Stack are newer and are both implemented in Python.
We've chosen to go with Ansible, but there is still not a clear favorite yet between Ansible and Salt Stack. They both have their pros and cons. Read more about the debate here.
Ansible is an open source technology that has thousands of contributors, with all of the code up on github. The open source project is backed by a company called Ansible Works, which is (or was) a startup backed by investors.
Ansible Works has a few paid services that enhance their open source project. A major one is consulting on how to implement Ansible in the enterprise, and another is a product called Ansible Tower. Ansible Tower lets you manage your hundreds of servers from a GUI and lets you run your "playbooks" from a web application.
We'll get into what a playbook is shortly.
Every configuration management tool has its own vocabulary for what it calls its moving pieces. Let's have a quick rundown up front.
- Playbook - Each playbook is made up of multiple plays, which say for each server what list of commands should be run on that server.
- Inventory - Your inventory is a list of all of your different servers in your system and what roles each of them have.
- Role - A role defines a set of functionality your server should have. For example your server may be an application server, a database server, or could be both.
- Variables - Variables help define information that we may want to change. This could make our Playbooks reusable among different projects, or let us define a list of variables for our production server and a slightly different list for our staging server.
Ansible uses YAML as its configuration language of choice. YAML is a good choice since it is meant to be very human readable, which is necessary when the nature of what you're configuring can be very complicated.
At this point we've probably figured out the sports analogy that Ansible is trying to make use of. Let's make our first Playbook and get our first Ansible play running.
- The first thing we need to do is install Ansible. Since it is written in Python, we can just install it from pip: pip install ansible.
- Next we want to create our Playbook. Create a file called production.yml in our deploy folder.
- At first, we want to put the following code in that file:
---
- name: Provision a RocketU Blog Analytics server
  hosts: all
  sudo: yes
  sudo_user: root
- To keep it simple, we're going to use one Play, which is in charge of setting up one server with our Django project. We give it a name, specify that it should use all of our servers (which will just be 1), and it can use sudo as root.
In order to start listing out commands for our Play to run, we need to create our first role. Best practice is to have a base role, which we give to every server.
- Let's create a folder called roles in our deploy folder.
- Then let's make another folder called base inside of our roles folder.
- Roles come with their own list of vocabulary:
- Task - This is a single command to run on the server.
- Handlers - These are commonly reused tasks, such as restarting nginx.
- Templates - Like we saw with fabric, we can have jinja templates that we can upload to our server.
- Variables - These are variables that we set for this specific role.
Each one of these can have its own folder inside of the role. Let's make a folder called tasks inside of base.
And finally, let's make an empty file inside of tasks called main.yml.
Okay! We've got all of the pieces set up to write our first task. Our goal with Ansible is to create a Playbook that will set up a new EC2 instance from scratch for our website as well as be re-runable so that it will update our instance with our latest code, migrations, requirements, etc.
So for our first task, let's automate the first step, which is to update and upgrade apt-get.
In main.yml in our base role, let's put the following:
---
- name: Update and upgrade apt-get
  apt: update_cache=yes upgrade=yes
We declare the name of our task, then we use the Ansible apt module and specify that we want to update and upgrade.
Ansible comes with lots of core modules built-in. There is also a large community that creates 3rd party modules you can use as well for all common types of tasks you'd like to run.
Now that we created our role and our first task, we need to register our role with our play in our playbook:
- name: Provision a RocketU Blog Analytics server
  hosts: all
  sudo: yes
  sudo_user: root
  roles:
    - base
This tells our Play to run all of the tasks in the base role.
Let's check out the command which will run our first task, in our first play, of our first playbook.
ansible-playbook -i server_public_ip, --private-key ~/.ssh/blog-analytics.pem -u ubuntu -v deploy/production.yml
This is long, so let's break it down into its parts:
- ansible-playbook - this is the Ansible command that is available after we installed it
- -i server_public_ip, - this specifies our inventory, which is just our one server; the trailing comma tells Ansible to treat the value as a list of hosts rather than the path to an inventory file
- --private-key - this points to the identity file we need to use to ssh to our server
- -u ubuntu - we need to ssh in as ubuntu
- -v - this turns on verbose output
- deploy/production.yml - finally, we specify the playbook that we want to run
Run it!
Ansible has some fairly subtle, but powerful features in the way you can define tasks and variables. Next let's automate the apt-get installing of our different packages we'd like to use on every server.
We'll make a new task in main.yml for our base role. This will just go sequentially, below the last task we just made.
- name: Install base packages
  apt: name={{ item }} state=installed
  with_items:
    - nginx
    - emacs
    - git
    - python-pip
    - python-dev
    - build-essential
    - supervisor
  tags: packages
This task will actually get run for every item in the with_items list, meaning apt-get install {{ item }} will be called with every item.
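As a rough mental model of that loop, here is a small sketch that expands the same list into one command per package. It only mimics the substitution with a string replace; expand_items is a hypothetical helper, and Ansible's real apt module does more than shell out to apt-get.

```python
packages = [
    "nginx", "emacs", "git", "python-pip",
    "python-dev", "build-essential", "supervisor",
]


def expand_items(command_template, items):
    # Substitute each item into the template, one command per item.
    return [command_template.replace("{{ item }}", item) for item in items]


commands = expand_items("apt-get install {{ item }}", packages)
for command in commands:
    print(command)
```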
Let's run our playbook again and see that our packages get installed.
Following along with our setup slides, next we want to install the latest version of pip. Again this is a new task in our role's yaml file.
- name: Latest version of pip
  pip: name=pip state=latest
  tags: packages
Note that we're specifying that we want the latest version of pip and we're using the pip module, which here is the equivalent of running pip install --upgrade pip.
Again, run your playbook and verify it worked.
Lastly to finish our base role's task list, we need to install the latest version of virtualenv. Try this out on your own! It should look awfully similar to the last task we just created.
Next up, we're going to create the db role, which will handle installing PostgreSQL, creating the proper user, and creating the correct database.
Let's create the appropriate files and folders for our new role. This means a db folder under roles, a tasks folder under db, and a main.yml file in the tasks folder.
Our first task will be to install all of the necessary packages for PostgreSQL:
---
- name: Install PostgreSQL
  apt: name={{ item }} update_cache=yes state=installed
  with_items:
    - postgresql
    - postgresql-contrib
    - libpq-dev
    - python-psycopg2
  tags: packages
Place this in our new main.yml tasks file. We need to install psycopg2 globally in order for the Ansible PostgreSQL modules we use next to work.
We also need to register our new db role in production.yml by just adding - db under roles.
Now that we've installed postgresql, we want to make sure it's running:
- name: Ensure the PostgreSQL service is running
  service: name=postgresql state=started enabled=yes
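This task makes sure the postgresql service is started (the equivalent of service postgresql start when it is not already running) and enabled so that it starts on boot.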
Let's run our script and make sure we have installed and started postgres properly.
Now that we have postgresql installed, we can create the database. Again, Ansible has a handy postgresql_db module we can use to accomplish this:
- name: Ensure database is created
  sudo_user: postgres
  postgresql_db: name={{ db_name }}
                 encoding='UTF-8'
                 lc_collate='en_US.UTF-8'
                 lc_ctype='en_US.UTF-8'
                 template='template0'
                 state=present
But this time, we'd like to make the name of the database configurable. We're going to need this name later to put in local_settings.py.
Now it's time to set up some variables. We want to make a new folder in our deploy folder called env_vars. Inside this folder, let's make a file called production.yml. Here we can configure variables for our production environment. Put the following inside of it:
---
db_name: "blog_analytics"
We then need to tell the play in our playbook that we want to use this variables file. We do this with a vars_files entry that lists env_vars/production.yml, which can be defined right after our roles section.
Now run our playbook. We should now have postgresql set up with a new database for our application to use.
Lastly, we need to make sure our postgresql user is created with the proper name and password. This will again require a few more variables that we'd like to setup.
** db/tasks/main.yml
- name: Ensure user has access to the database
  sudo_user: postgres
  postgresql_user: db={{ db_name }}
                   name={{ db_user }}
                   password={{ db_password }}
                   priv=ALL
                   state=present
** env_vars/production.yml
db_user: "blog_analytics"
db_password: myawesomepassword
Let's run our playbook again. We should have postgresql entirely ready for our Django project.
Let's define our last role, web, by creating all of the appropriate files and folders: roles -> web -> tasks -> main.yml
We then need to make sure we register our role in production.yml.
Our web role is where the meat of our Ansible script is going to take place, so instead of placing all of our tasks in main.yml, we're going to split them out into separate files.
Let's create a new file in our tasks folder called setup_virtualenv.yml. In our main.yml, let's put the following, which will include this new file:
---
- include: setup_virtualenv.yml
In setup_virtualenv.yml, we only need to write one task. We don't have access to virtualenvwrapper, so we'll use the command for regular virtualenv.
---
- name: Create the virtualenv
  command: virtualenv {{ virtualenv_path }} --no-site-packages
           creates={{ virtualenv_path }}/bin/activate
Note the creates part of this command. It names a path that the command is expected to create. If we run this script multiple times, Ansible will skip the command whenever that path already exists, so it will not try to create the virtualenv again.
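The idea behind creates can be sketched in a few lines of plain Python: check for the path first and only act when it is missing. This is an illustration of the pattern rather than Ansible's code; run_if_missing and fake_virtualenv are hypothetical names, and a temporary directory stands in for the server.

```python
import os
import tempfile


def run_if_missing(creates, action):
    # Run action() only when the path named by creates does not exist yet.
    if os.path.exists(creates):
        return "skipped"
    action()
    return "changed"


workdir = tempfile.mkdtemp()
marker = os.path.join(workdir, "activate")


def fake_virtualenv():
    # Stand-in for the real virtualenv command: just creates the marker file.
    with open(marker, "w") as handle:
        handle.write("")


first = run_if_missing(marker, fake_virtualenv)
second = run_if_missing(marker, fake_virtualenv)
print(first, second)
```

The second call does nothing, which is what makes it safe to rerun the whole playbook.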
You'll notice we also used another variable here, since we're going to want to refer to the location of the virtualenv often and have that be configurable.
In our web role let's create a new folder vars with its own main.yml. Inside of it, let's define this variable:
---
virtualenv_path: "/home/ubuntu/.virtualenvs/blog_analytics"
Next we want to clone our github repository onto our server. We'll want to create a new file at tasks/setup_git_repo.yml and put - include: setup_git_repo.yml in tasks/main.yml.
We will want to define three new variables in vars/main.yml:
git_repo: https://github.com/rocketu/rocketu_blog_analytics.git
project_name: rocketu_blog_analytics
project_path: "/home/ubuntu/{{ project_name }}"
We will need to use git_repo and project_path in our task. Read Ansible's documentation on the git module and try to implement the task yourself!
After we've pulled down the repo from git, we need to ensure that our ubuntu user owns the repository folder.
Because our play runs with sudo, the root user may end up owning the folder, which will cause hiccups with running our application later.
Let's put the following task in our tasks/setup_git_repo.yml:
- name: Ensure user owns our project
  file: state=directory path={{ project_path }} owner=ubuntu
Now that we have our code cloned to our server we can start installing our packages and running our migrations.
Let's create a new file for our Django app setup tasks, tasks/setup_django_app.yml, and include it: - include: setup_django_app.yml.
The first thing we're going to do is create our local_settings.py file by using Ansible's templates. Just as Fabric and Django have templates, Ansible uses a templating library, Jinja2, to make it easy to upload files to our server and replace variables as needed.
Our local settings template task will look like this:
---
- name: Create the local settings file
  template: src=local_settings.j2
            dest={{ project_path }}/local_settings.py
            owner=ubuntu
            group=ubuntu
            mode=0755
We're using Ansible's template module and we're specifying where the template exists locally and where its destination is on our server. We need to make a new folder web/templates with the file local_settings.j2 in it.
Our local settings template should have the following in it:
import os
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'NAME': '{{ db_name }}',
        'HOST': 'localhost',
        'USER': '{{ db_user }}',
        'PASSWORD': '{{ db_password }}'
    }
}
BASE_DIR = os.path.dirname(os.path.dirname(__file__))
STATIC_ROOT = os.path.join(BASE_DIR, 'static')
STATIC_URL = '/static/'
DEBUG = False
ALLOWED_HOSTS = ['{{ nginx_server_name }}']
Ansible will replace all of these variables with the values we have already set. We have one new one that we need to define in env_vars/production.yml: nginx_server_name: "your_ec2_url".
Run it! Check that your local settings file looks correct on your server.
Next we need to install our requirements file. This task is fairly straightforward. We should put the following in setup_django_app.yml:
- name: Install packages required by the Django app inside virtualenv
  pip: virtualenv={{ virtualenv_path }} requirements={{ requirements_file }}
We do have one new variable we want to keep track of here, which is the location of our requirements file. In web/vars/main.yml let's define that:
requirements_file: "{{ project_path }}/requirements.txt"
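Notice that requirements_file is built from project_path, which is itself built from project_name. The sketch below imitates how such nested placeholders end up as one concrete path. It uses a small regular expression as a stand-in for Jinja2, so resolve and PLACEHOLDER are assumptions for illustration, not Ansible internals.

```python
import re

# Matches placeholders such as {{ project_name }}.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

variables = {
    "project_name": "rocketu_blog_analytics",
    "project_path": "/home/ubuntu/{{ project_name }}",
    "requirements_file": "{{ project_path }}/requirements.txt",
}


def resolve(value, variables):
    # Keep substituting until no placeholder changes the string any more.
    previous = None
    while previous != value:
        previous = value
        value = PLACEHOLDER.sub(lambda match: variables[match.group(1)], value)
    return value


requirements_file = resolve(variables["requirements_file"], variables)
print(requirements_file)
```

Changing project_name in one place is enough to move every path derived from it.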
Lastly, we need to run our migrations and collectstatic. First let's migrate our application. We'll use Ansible's django_manage module, which knows about Django's manage.py commands.
- name: Run Django migrations
  django_manage:
    command: migrate
    app_path: '{{ project_path }}'
    virtualenv: '{{ virtualenv_path }}'
    settings: '{{ django_settings_file }}'
  tags: django
Put this code in setup_django_app.yml. Again, we see one new variable we need to define. The django_manage module needs to know where to find our Django settings file.
Let's define this in env_vars/production.yml: django_settings_file: rocketu_blog_analytics.settings. In the future, if we wanted to use a different settings file per deployment environment, we could now do that easily by changing this setting.
Following the same template, write the last task, which will run collectstatic. Check out the Ansible documentation on the django_manage module.
Finally, we're on to configuring our WSGI and web server then we're done!
Let's create a new file for our supervisor setup tasks, tasks/setup_supervisor.yml, and include it: - include: setup_supervisor.yml.
First, we need to make sure we've installed gunicorn, so here's our first task, which should look familiar:
- name: Ensure gunicorn is installed
  pip: virtualenv={{ virtualenv_path }} name=gunicorn
Next we need to create our gunicorn config file. We'll use Ansible's template module again and create one that looks like what we manually created yesterday.
Put the following in templates/gunicorn.conf.j2
proc_name = "rocketu_blog_analytics"
bind = '127.0.0.1:8001'
loglevel = "error"
workers = 2
And here's our task to copy that file over to our server:
- name: Create the Gunicorn conf file
  template: src=gunicorn.conf.j2
            dest={{ project_path }}/gunicorn.conf.py
            owner=ubuntu
            group=ubuntu
            mode=0755
Now we need to do the same thing for our supervisor config file. Look above and emulate the template process we just did for the gunicorn conf file.
You'll need to make use of two variables in your template: {{ virtualenv_path }} and {{ project_path }}.
Lastly, we need to have supervisor pick up our new config files and then restart supervisor. Put these final tasks in our setup_supervisor.yml tasks file.
- name: Restart Supervisor
  supervisorctl: name={{ project_name }} state=restarted
- name: Restart Supervisor Service
  service: name=supervisor state=restarted sleep=1
Note the sleep=1 argument, which pauses for 1 second between stopping and starting supervisor. This is because the task can sometimes erroneously fail when supervisor takes a little while to restart.
And lastly, we need to set up our web server then our website should be up!
Let's create a new file for our nginx setup tasks: tasks/setup_nginx.yml and include it: - include: setup_nginx.yml.
The first task we'll implement is uploading our nginx config. Again, we'll use Ansible's template module to help us do this. Put the following as the first task in our file:
---
- name: Create the Nginx configuration file
  template: src=nginx_site_config.j2
            dest=/etc/nginx/sites-available/{{ project_name }}
  notify: reload nginx
Notice we have a new command here called notify. This is the last new Ansible concept we'll be introduced to.
At the beginning of this section we mentioned handlers are reusable tasks that we may want to call more than once, specifically after another task has been run. Reloading nginx's config files is a good example of this.
Let's create a new folder and file web/handlers/main.yml and put the following handler in it:
---
- name: reload nginx
  service: name=nginx state=reloaded
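A handler behaves differently from a normal task: it only fires when a task that notifies it reports a change, and it runs once after the tasks even if several tasks notified it. The sketch below models that behavior in a simplified form; run_play is a hypothetical function, and the changed flags are made up for the example.

```python
def run_play(tasks, handlers):
    log = []
    notified = set()
    for name, changed, notify in tasks:
        log.append("task: " + name)
        # Only a task that actually changed something triggers its handler.
        if changed and notify:
            notified.add(notify)
    # Handlers run once each, after the tasks, in the order they are defined.
    for handler_name in handlers:
        if handler_name in notified:
            log.append("handler: " + handler_name)
    return log


tasks = [
    ("Create the Nginx configuration file", True, "reload nginx"),
    ("Ensure that the default site is disabled", False, "reload nginx"),
    ("Ensure that the application site is enabled", True, "reload nginx"),
]
log = run_play(tasks, ["reload nginx"])
print(log)
```

Two tasks report a change here, yet nginx is reloaded a single time at the end.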
Now we just need to define the nginx config template file. Create the template file in web/templates/nginx_site_config.j2. Put the following template in our new file.
server {
    server_name {{ nginx_server_name }};
    access_log off;
    location /static/ {
        alias /home/ubuntu/static/;
    }
    location / {
        proxy_pass http://127.0.0.1:8001;
        proxy_set_header X-Forwarded-Host $server_name;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $host;
    }
}
We have already defined nginx_server_name. This would allow us to easily change the URL for our application in the future and have it update both in our nginx config and in ALLOWED_HOSTS when we redeploy.
Next we want to make sure we disable the default nginx config and enable ours, by deleting and creating symlinks. Here are the two tasks that we should put into setup_nginx.yml:
- name: Ensure that the default site is disabled
  command: rm /etc/nginx/sites-enabled/default
           removes=/etc/nginx/sites-enabled/default
  notify: reload nginx
- name: Ensure that the application site is enabled
  command: ln -s /etc/nginx/sites-available/{{ project_name }}
           /etc/nginx/sites-enabled/{{ project_name }}
           creates=/etc/nginx/sites-enabled/{{ project_name }}
  notify: reload nginx
These should match up with the commands from the slides from yesterday.
Lastly, we need to make sure that nginx is actually running, so we'll put in one last task:
- name: Ensure Nginx service is started
  service: name=nginx state=started enabled=yes
Run it! Then try to go to your URL in your browser. If we set up our script right, we should see our blog analytics application running in the browser.
Unfortunately when writing an Ansible script it is often hard and tedious to test every single change. There are also no great testing tools available and writing unit tests for DevOps tools is generally a hard thing to do.
Soon we'll see how other services abstract the entire process away. Still, it's important to have context for what's going on behind the scenes.