Wednesday, 13 February 2013

MySQL and ELSA - When Your Storage Runs Out


There has been recent discussion on the ELSA users mailing list about variations on the following scenario:

o SysAdmin (SA) has a single large drive
o SA installs Linux or BSD, MySQL and ELSA
o Something consumes all free space
o SA realises they need to move ELSA and MySQL to another drive

I didn't think a lot about it when it came up the first time. After it came up multiple times, though, I figured I would write up some instructions, starting with item three from above -- something fills up the hard disk.

The Environment


I already have a running ELSA installation on Ubuntu Server, and I configured it with a single drive, so all of the setup is already done. If you're following along with some of my earlier posts, the exact VM I'm going to work with is Debian_121_elsanode:


I know that I'm going to fill my drive, and I know I'm going to have to add a drive, so I'll go ahead and do that before I boot the VM. In this case I'll add an 8GB data drive called "Debian_121_elsanode_mysql_data". The process is outlined in screenshots below.

In the VM settings, choose "storage" and then select the SATA controller. Click the icon for "Add Hard Disk":


I want to "Create new disk":


I like to use VMDK files if I think I might export the VM as an appliance:


I don't need to waste time setting aside all of the space for the new drive; I won't have it long enough to use more than a couple of hundred MB, and that's a stretch; odds are it's less than 100 MB:


Give the new drive a unique name inside of VirtualBox. "Create" will finally create the new drive:


When I booted the VM I verified Ubuntu picked up the two hard disks with:
dmesg | grep -e sda -e sdb
Note the drive sizes in the output - new drives are added in lexicographical order, but it's always nice to verify you're working with the correct drive:
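Another quick check (assuming lsblk from util-linux is installed, as it is on stock Ubuntu) lists each disk with its size on one line:

```shell
# One line per whole disk (-d skips partitions), showing just name and size;
# the new 8GB disk should appear as sdb
lsblk -d -o NAME,SIZE
```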


Prepping the New Drive


In reality, the new drive would get added *after* realising there is a storage issue. For the purposes of this demo, though, that's okay: it doesn't matter whether the existing drive fills up and then I add the new one and move ELSA/MySQL, or I add the new drive first, fill the old one, and then move what I need.

So, let's go ahead and prep the new drive. I won't start moving data to it but I'll go ahead and partition and format it.

First, partition the drive with fdisk.
sudo /sbin/fdisk /dev/sdb


This starts fdisk. I can use 'p' inside of fdisk to show me the existing partitions for /dev/sdb:


To create a new partition that uses the entire drive, I'll hit 'n' for new, 'p' for primary, accept the default value of '1' for the partition number, accept the first and last cylinders, then use 'p' again to show the new partition:


Use 'w' to write the changes to the partition table and quit.

With the new partition ready, I need to format it before I can mount it. I prefer ext3; your mileage may vary. To format it in ext3, I'll use:
sudo /sbin/mkfs.ext3 /dev/sdb1
Note that when it runs, I get journal and superblock information. On physical drives, particularly large drives that aren't SSD, this can take a while to run.


One utility every Linux and Unix admin should use on a regular basis is 'df', for 'disk free'. I use the '-h' flag to get "human readable" output -- basically it just outputs all values in kilo-, mega- or gigabytes. It's great for seeing, in short order, the amount of free space on a partition:
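For example, to check just the filesystem holding the root partition (sizes will obviously differ on your system):

```shell
# Free space for the filesystem that / lives on, in human-readable units
df -h /
```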


Fill The Old Drive


So I have about 16GB free on my root partition (where both ELSA and MySQL live). To quickly fill that up, I'm going to use a utility called 'dd'. My input will be /dev/zero (so I quickly get values - /dev/random and /dev/urandom can be considerably slower) and I'll output to a file called "consume_drive". Since I'm not giving 'dd' a count, it will run to completion - in this case, until the drive fills up.
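A sketch of that dd invocation (bounded with count= here so it is safe to run anywhere; the version described above omits count and writes to the root filesystem, so it runs until the disk is full):

```shell
# Without count=, dd runs until the target filesystem is full (ENOSPC).
# Bounded to 16 MB here so the sketch doesn't actually consume a drive.
dd if=/dev/zero of=/tmp/consume_drive bs=1M count=16
ls -l /tmp/consume_drive
```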



Just to verify, I used the mysql command to connect to the locally running database instance, list the existing databases and try to create a new one (note it errors due to full disk):


Recovery Step One: Stop the Running Processes


At this point I would start receiving errors from Sphinx, MySQL and probably rsyslog saying I had disk issues. To kill all of them, I used killall with each process name and used the mysql init script to stop mysqld:
sudo killall -9 syslog-ng searchd perl
sudo /etc/init.d/mysql stop


Note that you DEFINITELY want to make sure searchd and elsa.pl aren't running, otherwise the system load can go through the roof when MySQL stops:



Recovery Step Two: Mount the New Drive, Move Data


With everything stopped, I mounted the new drive as /mnt using:
sudo mount /dev/sdb1 /mnt
I moved everything in /data to the new drive:
cd /data
sudo mv * /mnt/
Then I moved the MySQL database directory from /var/lib/mysql to the new drive:
sudo mv /var/lib/mysql /mnt/

Just for display purposes, the output of 'mount' and a directory listing of /mnt are included:


Recovery Step Three: Mount Point for the New Drive


Since I'm adding the new drive to house all of the MySQL and ELSA data, and I'd already decided to mount the drive as /data, I just need to add one line to /etc/fstab to reflect the new disk and mount point:
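The line would look something like the following (the mount options here are an assumption on my part; ext3 matches the mkfs step above):

```
/dev/sdb1    /data    ext3    defaults    0    2
```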


Recovery Step Four: MySQL Link


Since all of my MySQL data will live on /data, and I don't really want to fuss around with editing the MySQL configuration file to point to the new location, I'll create a symbolic link from /data/mysql to /var/lib/mysql (remember: hard links can't cross mountpoints, soft/symbolic links can) using:
sudo ln -sf /data/mysql /var/lib/mysql
Right now that location doesn't exist but it will on reboot, as long as the appropriate entry is in /etc/fstab and there are no filesystem issues.
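The symlink mechanics can be sketched with throwaway paths (the real paths in this post are /data/mysql and /var/lib/mysql):

```shell
# Stand-in for the new data directory on the freshly mounted drive
mkdir -p /tmp/data/mysql
# -s symbolic, -f replace an existing link, -n treat an existing symlink
# as a file rather than descending into its target
ln -sfn /tmp/data/mysql /tmp/var-lib-mysql
# readlink prints the link target
readlink /tmp/var-lib-mysql
```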

EDIT -- 16 February 2013

I heard from Mike Miller at Miller Twin Racing that on machines with SELinux enabled, there are additional steps that must be taken. It turns out that if you try to restart MySQL at this point, it fails even though the symlink is in place:


If you look at the security contexts for /var/lib and /var/lib/mysql, you'll see that /var/lib has a context of var_lib_t and /var/lib/mysql has a context of mysqld_db_t:


There are multiple ways to solve this, the cleanest probably being to add a custom data_dir context that the mysql user can access/write. Since I am treating /data as an extension of /var/lib, a reasonable compromise for me was to give /data the same context as /var/lib and set the context for both /data/mysql* and the /var/lib/mysql symlink to that of the original /var/lib/mysql. This is accomplished with:
semanage fcontext -a -t var_lib_t /data
semanage fcontext -a -t mysqld_db_t "/data/mysql(/.*)?"
semanage fcontext -a -f -l -t mysqld_db_t /var/lib/mysql
restorecon -Rv /


Again, thanks to Mike Miller at Miller Twin Racing for that heads up and the correction!

Recovery Step Five: Reboot


Yes, really, reboot. You can remount /dev/sdb1 as /data to test but this is a virtual, non-production environment and I've followed the steps in this blog post a half-dozen times before actually writing it so I'm pretty confident of the outcome. On reboot /data gets mounted, the symlink for /var/lib/mysql is live and MySQL should be able to restart. You can verify all of this after reboot with a simple 'ps' and 'grep':
ps aux | grep -e perl -e syslog-ng -e sphinx -e mysql
If everything works, you should see output similar to:


Recovery Step Six: ... Profit


Okay, so maybe no profit, but you can rest comfortably knowing that you now have quite a bit of disk space available for your database and syslog needs, that you can now migrate services from a smaller disk to a larger one with some basic Unix-fu, and that you are almost certainly a better system administrator because of it!
