Bare-metal installation for Nginx and Jekyll

This blog is a bunch of Jekyll-created HTML which is served by the Ngix HTTP server. This post documents the process of getting Jekyll and Nginx setup from bare metal. It also shows a script being used to periodically pull and generate your blog from GitHub sources. The instructions that follow should work for RedHat 6 and derivatives (such as CentOS 6 which is what I’m using).

Create a user and setup ssh

With a new VM you’ll typically be given root access, but security 101 dictates that you avoid running commands as the root user as much as possible. Therefore the first thing you’ll want to do is to create a user, in this case bloguser:

shell$ useradd bloguser

Next, change the password for the user:

shell$ passwd bloguser

Now you’ll want to create a SSH public/private key set for your user. It’s recommended that you do this on your own machine, not your VM, since you don’t want your private key out there if it can be avoided.

shell$ ssh-keygen -t rsa

This will generate the following files on your local host:

.ssh/id_rsa
.ssh/id_rsa.pub

Once these files are generated, create the .ssh directory on your VM (these steps assume you’re logged-in as root):

shell$ su - bloguser
shell$ mkdir .ssh

Create .ssh/authorized_keys on your VM, and copy the contents of .ssh/id_rsa.pub from your local host:

shell$ vi .ssh/authorized_keys

Setup the permissions on the directory and file.

shell$ chmod 700 .ssh
shell$ chmod 600 .ssh/authorized_keys

Test out your ssh setup, by ssh-ing from your local host to your VM as the bloguser user:

shell$ ssh bloguser@<vm-host>

As root, allow the bloguser user to perform commands as root (if the /etc/sudoers file doesn’t exist, then you will need to install sudo with the yum install sudo command).

shell$ vi /etc/sudoers

Add the following line:

%bloguser       ALL=(ALL)       ALL

Setup some basic security

Next up is tightening-up the SSH configuration.

shell$ sudo vi /etc/ssh/sshd_config

Inside this file you will do three things:

Change the port from 22 to some other number (such as 52846 in the example below).
Disable password authentication, so that a private key must be used to login to the server.
Block the root user from ssh access to your host.

The file therefore needs to contain the following lines (make sure all other entries with these names are commented-out).

Port 52846
PasswordAuthentication no
PermitRootLogin no

Restart the ssh daemon to pick up the changes you just made.

shell$ sudo /sbin/service sshd restart

The next step is to setup a firewall to restrict incoming traffic to just ssh and HTTP. To do this create a file called vm-iptables.sh with the following content. You’ll be executing the following commands as root.


#!/bin/bash

# Flush all current rules from iptables
iptables -F

# Allow SSH and HTTP connections
iptables -A INPUT -p tcp --dport 52846 -j ACCEPT
iptables -A INPUT -p tcp --dport 80 -j ACCEPT

# Drop traffic on all other inbound ports
iptables -P INPUT DROP
iptables -P FORWARD DROP

# Allow all outbound traffic
iptables -P OUTPUT ACCEPT

# Accept any connection on the local port
iptables -A INPUT -i lo -j ACCEPT

# Accept packets belonging to established and related connections
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Save the iptables
/sbin/service iptables save

# List
iptables -L -v

After you’ve created the file, make it an executable and execute it to save your rules.


shell$ chmod +x ./vm-iptables.sh
shell$ sudo ./vm-iptables.sh
iptables: Saving firewall rules to /etc/sysconfig/iptables:[  OK  ]
Chain INPUT (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination
    2   104 ACCEPT     tcp  --  any    any     anywhere             anywhere            tcp dpt:ssh
    0     0 ACCEPT     tcp  --  any    any     anywhere             anywhere            tcp dpt:http
    0     0 ACCEPT     all  --  lo     any     anywhere             anywhere
    0     0 ACCEPT     all  --  any    any     anywhere             anywhere            state RELATED,ESTABLISHED

Chain FORWARD (policy DROP 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination

Chain OUTPUT (policy ACCEPT 2 packets, 264 bytes)
 pkts bytes target     prot opt in     out     source               destination

The output shows your new iptables configuration which reflects the rules we saved in myvm-iptables.sh.

Install and start Nginx

Add the EPEL yum repository into your configuration:

shell$ sudo rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-7.noarch.rpm

Install Nginx using yum:

shell$ sudo yum install nginx

Setup Nginx so that it auto-starts at system start time:

shell$ sudo chkconfig nginx on

Start Nginx:

shell$ sudo /sbin/service nginx start

You can test that Nginx is up and running by pointing your browser at your VM IP address - you should see a page confirming that all is good.

Nginx welcome screen

Install Jekyll

The following commands will install Jekyll on your VM:

shell$ sudo yum install gcc rubygems ruby-devel
shell$ sudo gem install jekyll

Install Pygments (for code syntax highlighting)

shell% sudo yum install python-setuptools
shell$ sudo easy_install Pygments

Create a crontab entry and script to generate the blog

We’re going to setup Jekyll to write to the Nginx HTML directory, and since we’re going to do this as the bloguser user, we’ll first need to wipe-out the contents of that directory, and chown it so that the bloguser can write to it:

shell$ sudo rm -rf /usr/share/nginx/html/*
shell$ sudo chown bloguser:bloguser /usr/share/nginx/html

We’ll assume that you have a GitHub repository that’s hosting your Jekyll sources. Therefore you need to install git.

shell$ sudo yum install git

Create a directory to contain your blog source

shell$ sudo mkdir -p /app/blog
shell$ sudo chown bloguser:bloguser /app/blog

The script will send out an email if an error is encountered, so you need to install mail:

shell$ sudo yum install mailx

Next on our list is creating a script which will do the following:

Pulls the latest blog sources from GitHub.
Uses Jekyll to generate the HTML for the blog.
Sends an email if Jekyll exits with an error, or if the home page can’t be retrieved

Create a shell script in /app/blog/gen.sh:

shell$ vi /app/blog/gen.sh

Copy the following content into this file, which clones your github repo for the first time if it doesn’t already exist, or updates the local copy via the pull command:

#!/bin/bash

send_email_and_exit() {
  recipient=$1
  message=$2

  echo "Sending email and exiting due to error"

  /bin/mail -s "Blog generation failure" "${recipient}" << EOF
${message}
EOF

  exit 1
}

echo "Running at "`date`

basedir=/app/blog
gitdir=${basedir}/blog
nginxdir=/usr/share/nginx/html
githubrepo=https://github.com/alexholmes/blog.git
emailto="grep.alex@gmail.com"

if [ ! -d ${gitdir} ]; then
  echo "Checking out repo for the first time"
  mkdir -p ${gitdir}
  cd ${basedir}
  git clone ${githubrepo}
else
  cd ${gitdir}
  git pull
fi

cd ${gitdir}

rm -rf ${nginxdir}/*
jekyll --no-auto . ${nginxdir}/

exitCode=$?

if [ ${exitCode} != "0" ]; then
  send_email_and_exit "${emailto}" "Jekyll failed with exit code ${exitCode}"
fi

curl http://0.0.0.0:80/ >/dev/null 2>&1

exitCode=$?

if [ ${exitCode} != "0" ]; then
  send_email_and_exit "${emailto}" "Curl failed with exit code ${exitCode}"
fi

Make the file executible:

shell$ chmod +x /app/blog/gen.sh

Now all you need is a crontab entry to refresh your blog every 5 minutes:

shell$ crontab -e
*/5 * * * * /app/blog/gen.sh &>> /app/blog/gen.out

To check your crontab settings use the -l option:

shell$ crontab -l
*/5 * * * * /app/blog/gen.sh &>> /app/blog/gen.out

Now you can either wait for up to 5 minutes for the cron to execute the script, or simply run it yourself:

shell$ /app/blog/gen.sh

Now when you refresh your browser you’ll see your Jekyll-generated website!

Nginx serving up Jekyll-generated content

About the author

Alex Holmes works on tough big-data problems. He is a software engineer, author, speaker, and blogger specializing in large-scale Hadoop projects. He is the author of Hadoop in Practice, a book published by Manning Publications. He has presented multiple times at JavaOne, and is a JavaOne Rock Star.

If you want to see what Alex is up to you can check out his work on GitHub, or follow him on Twitter or Google+.

comments powered by Disqus

RECENT BLOG POSTS

Configuring memory for MapReduce running on YARN

This post examines the various memory configuration settings for your MapReduce job.
Big data anti-patterns presentation

Details on the presentation I have at JavaOne in 2015 on big data antipatterns.
Understanding how Parquet integrates with Avro, Thrift and Protocol Buffers

Parquet offers integration with a number of object models, and this post shows how Parquet supports various object models.
Using Oozie 4.4.0 with Hadoop 2.2

Patching Oozie's build so that you can create a package targetting Hadoop 2.2.0.
Hadoop in Practice, Second Edition

A sneak peek at what's coming in the second edition of my book.

Full post archive