10 December 2012

BIND part 2: DNS blackhole via DLZs


In my last post I detailed installing BIND with DLZ support -- in this post I'll actually USE that option.

Step One: Database setup

PostgreSQL, by default, now creates databases in UTF-8 encoding. The DLZ driver in BIND uses ASCII/LATIN1, which introduces some interesting peculiarities. Specifically, querying for 'zone="foo.com"' may not work. That's okay, you can still create LATIN1 databases using the template0 template. Since PostgreSQL replication is already configured, everything but the BIND configuration is done on the database master. First, add the new database user:
createuser -e -E -P dnsbh
Since this user won't have any extra permissions, just answer 'n' to the permissions questions.

Now add the new database, as latin1, with dnsbh as the owner:
createdb dnsbh -E 'latin1' -T template0 -O dnsbh
The database schema is pretty flexible; there are no required column names or types, as long as queries return the correct type of data. I like to use a schema that reflects the type of data the column holds so I'll use the following create statement:
create table dns_records (
    id serial,
    zone text,
    host text,
    ttl integer,
    type text,
    data text
);
create index zone_idx on dns_records(zone);
create index host_idx on dns_records(host);
create index type_idx on dns_records(type);
alter table dns_records owner to dnsbh; 
This can be modified, of course: the data, type, host and ttl fields can have size restrictions put in place, and the id field can be dropped altogether. The bind-dlz sourceforge page lists several more columns, but those are only necessary for a full DLZ deployment, where the DNS server is authoritative for a domain, not for a purely recursive caching server.
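As an illustration of restricting the field sizes, a variant schema might look like the following (a sketch only -- the limits shown are arbitrary, not requirements of the DLZ driver):

```sql
-- Size-restricted sketch of the same table; limits are arbitrary choices.
create table dns_records (
    zone varchar(255),
    host varchar(255),
    ttl  integer,
    type varchar(10),
    data varchar(255)
);
```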

Step Two: BIND setup

If you read the bind-dlz configuration page you'll find a set of database queries that get inserted into named.conf. You can modify these if you like, but it's much easier to use what's provided. At a minimum, four lines need to be included: the first names the database type and the number of concurrent connections to keep open (4); the second is the database connection string (host, username, password); the third returns whether the database knows about a given zone; and the fourth returns the actual record data for any lookup that isn't of type SOA or NS. Add the following to /etc/namedb/named.conf and restart BIND (I split the fourth "line" into multiple lines for readability - you can put it all on one line or leave it as below):
dlz "postgres" {
database "postgres 4
{host=localhost port=5432 dbname=dnsbh user=dnsbh password='password_you_used'}
{select zone from dns_records where zone = '$zone$'}
{select ttl, type,
  case when lower(type)='txt' then '\"' || data || '\"' else data end
  from dns_records
  where zone = '$zone$' and
  host = '$record$' and
  not (type = 'SOA' or type = 'NS')}";
}
At this point BIND should restart and it should continue to serve DNS as normal -- meaning it's time to test the blackhole. A quick and easy test is to dig for sf.net:
dig @127.0.0.1 sf.net
Then add sf.net to the dns_records table on the PostgreSQL master:
insert into dns_records(zone, type, host, data, ttl) values ('sf.net', 'A', '*', '127.0.0.1', '3600');
Dig for sf.net again:
dig @127.0.0.1 sf.net
Note the 3600 second TTL. If a downstream DNS server were querying our blackhole then the value of 127.0.0.1 would get cached for an hour. To remove the sf.net entry from the blackhole, go back to the PostgreSQL master:
delete from dns_records where zone = 'sf.net';
www.malwaredomains.com and projects like ZeusTracker keep lists of domains seen serving malware in the wild. The malwaredomains.com site has a great tutorial on doing DNS blackholes via BIND configuration files, and since they provide a plain list of domains it's pretty trivial to script pulling those domains out and jamming them into the blackhole table.
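As a sketch of that scripting, the function below turns a plain list of domains (one per line, '#' for comments -- an assumed format; adjust the parsing to match the feed you actually pull) into INSERT statements for the dns_records table:

```shell
#!/bin/sh
# Sketch: convert a plain domain list into blackhole INSERT statements.
# Assumes one domain per line with '#' comment lines; adjust for your feed.
blackhole_sql() {
    sed -e '/^#/d' -e '/^[[:space:]]*$/d' \
        -e "s/.*/insert into dns_records(zone, type, host, data, ttl) values ('&', 'A', '*', '127.0.0.1', 3600);/" \
        "$1"
}
```

The output can be piped straight into psql on the master, e.g. `blackhole_sql domains.txt | psql -h 10.10.10.102 -U dnsbh dnsbh` (the feed filename here is hypothetical).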

BIND part 1: Recursive resolver


One of the most commonly used services on the Internet is DNS. We use it every time we type something into Google, check our email or stream a video from YouTube. It is THE address book for nearly everything we do on the Internet.

It is also used nefariously. Malware authors will use DNS instead of static IPs so that they can constantly switch malware servers, making IP-based firewall rules useless. Groups like malwaredomains.com track the domains used by malware and provide a readily-accessible list so that DNS operators can blackhole those lookups. This works great, but any change means a BIND restart/reload, and making those changes manually can be a real issue if you have multiple DNS servers and you're not managing them via something like fabric/chef/puppet/etc.

I have taken it one step further: I take the readily-available information from groups like ZeusTracker and Malware Domains and use the DLZ functionality in BIND, so that I only have to enter a domain into a database table and it immediately becomes active -- since BIND checks the database before checking its local zones and its cache, no restart or reload is necessary.

In THIS post, I'm detailing the first step: just setting up a recursive resolver for general use with some groundwork laid for the second step, using DLZs for blackholing lookups.

I'm going to use FBSD_8_3_i386_102 as the database master (see the previous post), FBSD_8_3_i386_103 as a recursive resolver with DLZ support so that I can blackhole domains and FBSD_8_3_i386_104 as a test client. Because of the way networking works in VirtualBox, some changes need to be made to the virtual machine that is acting as the BIND server so that it can access the Internet but still serve DNS to the internal network.  Under Settings -> Network, I'm going to click "Adapter 2" and add a NAT interface. This lets me use Adapter 1 to communicate with the other virtual machines in the psql_test internal network and Adapter 2 to communicate with the world.

Now fire up the three FreeBSD virtual machines. On FBSD_8_3_i386_103, edit /etc/rc.conf. Even though VirtualBox says "Adapter 2" is the NAT interface, that adapter is really the new em0 and the old em0 (with a static IP) is now em1. The two lines in /etc/rc.conf should look like:
ifconfig_em0="dhcp"
ifconfig_em1="10.10.10.103 netmask 255.255.255.0"
The netif script doesn't restart dhclient so reboot and make sure networking works.

BIND is included by FreeBSD in the default install but it doesn't have DLZ support. That means installing BIND from ports:
sudo portmaster -t -g dns/bind99
There are a couple of options to toggle in the configuration dialog. I generally disable IPv6, enable DLZ, enable DLZ_POSTGRESQL and then have the ports BIND replace the system BIND:

There are some cons to replacing the system BIND with BIND from ports, namely that a freebsd-update affecting BIND will require a re-installation of the custom-built BIND.

The default BIND configuration is almost perfect for a general caching server so let's start with that. In /etc/namedb/named.conf, find the line that looks like:
listen-on        { 127.0.0.1; };
Comment out that line (lines starting with // are treated as comments); with no listen-on statement, BIND responds to DNS queries on any IP. I wouldn't use that configuration on a production resolver but it works fine for this scenario.
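On a production resolver you would instead list the specific addresses BIND should answer on. For this lab, a sketch might look like the following (assuming the resolver VM's internal address from this setup):

```
listen-on        { 127.0.0.1; 10.10.10.103; };
```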

If your ISP requires you to use their DNS servers then you can set your BIND instance to "forward only". In that configuration BIND will still serve from its cache, but it doesn't try to go out to the root and TLD servers - it just queries your ISP's DNS servers. To enable this configuration in /etc/namedb/named.conf, locate the following section:
/*
    forwarders {
        127.0.0.1;
    };
*/
Remove the /* and */ (those are block-comment delimiters) and change the IP to that of your ISP's DNS server. Then find and uncomment the line that looks like:
// forward only;
To enable and start BIND, edit /etc/rc.conf and add the line:
named_enable="YES"
Then run
sudo /etc/rc.d/named start
An easy test of whether it's answering requests is to run
dig @10.10.10.103 google.com 
Notice that it took a second or so to return anything - that was BIND loading the data into its cache. If you run that command again it should answer almost immediately. An authoritative server requires considerably more work, but for a basic recursive resolver that just serves DNS from its cache or makes lookups on behalf of clients, the setup is fairly trivial.

02 December 2012

Multi-slave PostgreSQL replication

I've recently had a rather large problem with an overloaded database server.  I had several machines with read-only access to a PostgreSQL instance but data was inserted via scripts on the local host and updated interactively via a web application. The scenario is fairly simple and a perfect use-case for multi-slave replication -- let the scripts and web application update the master, replicate to <n> slaves and have the servers with read-only access query the slaves.

This is incredibly easy to set up in a virtual environment, all the more so because the basic process applies regardless of the number of accounts that need access to the data, the number of replication slaves, the number of databases hosted on the master server, etc.

In the end I'll need at least two new virtual machines -- one to act as the database master and one to act as the slave. To keep it simple, I'll clone out the FreeBSD virtual machine. On Mac OS you can control-click the virtual machine and select "Clone":


When cloning, it's important to remember to reinitialise the MAC address of the NIC. This can always be done later but it's easiest to make sure that checkbox is selected:


A linked clone is a lot like deploying a new server that boots over the network from another server's hard drive. It's best here to select "Full clone"; this ensures a clean separation between the original and new virtual machines.


Once the new clone is ready you can spin it up and login (I used my 'demo' user from earlier). Since the master and slaves will need the same set of software installed, I can save a LOT of installation time by installing PostgreSQL and its dependencies *now* and then cloning that VM as the slaves. This can be done with
sudo portmaster -t -g databases/postgresql91-server
Note this will prompt you to install the PostgreSQL-9.1 client, server and their dependencies:

Remember that portmaster will show you both what it plans on installing, as above, and then what it installed:

To set PostgreSQL to start at boot, edit /etc/rc.conf and add the line
postgresql_enable="YES" 
This is also a convenient time to edit the hostname line and change it to reflect that this is a new virtual machine; I just changed the 101 to 102.

At this point I'm ready to shut down the virtual machine and make two clones that will be the slaves. To isolate all of the PostgreSQL traffic, I'm going to change the network type for the new VMs to "internal network". There is an option to name the network; I typed in "psql_test".


Note that now my virtual machine list has had FBSD_8_3_i386_102, FBSD_8_3_i386_103 and FBSD_8_3_i386_104 added to reflect the new master (102) and the two new slaves (103 and 104).

Start all three new virtual machines and login (again, I used my 'demo' user). Each VM will need an appropriate IP address added in /etc/rc.conf. The format of the entry is
ifconfig_<interface>="<ip_address> netmask <actual_netmask>"
In the case of FBSD_8_3_i386_102, the actual entry looks like this:
ifconfig_em0="10.10.10.102 netmask 255.255.255.0"
For the other two virtual machines I used 10.10.10.103 and 10.10.10.104. The entire /etc/rc.conf for the master looks like:
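As a sketch (the hostname and sshd lines are assumptions based on the earlier steps; yours may differ), the master's /etc/rc.conf contains entries along these lines:

```
hostname="FBSD_8_3_i386_102"
ifconfig_em0="10.10.10.102 netmask 255.255.255.0"
sshd_enable="YES"
postgresql_enable="YES"
```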
Go ahead and reboot all three virtual machines at this point, just to make sure they'll come back up with their correct IP addresses and that each can ping the other two.

PostgreSQL does not create the necessary data directory and configuration at installation; that requires running the rc.d script with the "initdb" option. To do this, use:
sudo /usr/local/etc/rc.d/postgresql oneinitdb
"oneinitdb" differs from "initdb" in that you don't need the "postgresql_enable" line in /etc/rc.conf for "oneinitdb" to work. Here it is unnecessary, but it is force of habit.

Now for the master configuration. Full PostgreSQL configuration is beyond the scope of this post; I just want to focus on the necessities to get replication working. First edit /usr/local/pgsql/data/postgresql.conf and add the following lines at the end of the file:
listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 2
This tells PostgreSQL to accept connections on any interface/IP and to allow up to two clients to pull the write-ahead log. max_wal_senders needs to reflect the number of slaves you are deploying.

Start PostgreSQL so it creates the necessary WAL data -- this is a critical step or replication *will not work*. To do this, use:
sudo /usr/local/etc/rc.d/postgresql start
With PostgreSQL running, create a local admin user:
sudo -u pgsql createuser -e -E -P 
I named mine "demo". Test the new user and add a replication user, "rep_user", with a password of "reppass" and then exit the PostgreSQL client:
psql postgres
create user rep_user replication password 'reppass';
\q
To ease permissions for the next step, I'm going to set the password for the pgsql user to something I know. This can be done on each slave or, using the steps below, only on the master.
sudo passwd pgsql
Now stop PostgreSQL:
sudo /usr/local/etc/rc.d/postgresql stop
On each slave, copy the entire PostgreSQL data directory from the master. This allows PostgreSQL to start making updates on each slave using the streamed WAL (write-ahead log).
sudo -u pgsql scp -r pgsql@10.10.10.102:/usr/local/pgsql/data /usr/local/pgsql/
On each slave, edit /usr/local/pgsql/data/postgresql.conf and change the following lines:
listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 2
To:
wal_level = hot_standby
hot_standby = on
Then, on each slave, add /usr/local/pgsql/data/recovery.conf, with the following two lines (note: from "primary_conninfo" to "reppass'" is all one line, in case it wraps):
standby_mode = on
primary_conninfo = 'host=10.10.10.102 port=5432 user=rep_user password=reppass'
The master needs to be configured to allow the rep_user to connect from each slave. I added the following lines to /usr/local/pgsql/data/pg_hba.conf on the master:
host replication rep_user 10.10.10.103/32 md5
host replication rep_user 10.10.10.104/32 md5
Now, start PostgreSQL on the master and each slave with:
sudo /usr/local/etc/rc.d/postgresql start 
To test that replication is now active, on all three systems issue:
psql -l
Then, on the master, create a new database:
createdb demo
And verify on each slave that the new database was created:
psql -l
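You can also ask the master directly which slaves are streaming; PostgreSQL 9.1 exposes this in the pg_stat_replication view (a quick sketch -- run it on the master):

```sql
-- Run on the master; expect one row per connected slave.
select client_addr, state, sent_location, replay_location
  from pg_stat_replication;
```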
From here you can add databases, users, etc., as necessary. If you need to add more slaves (or rebuild a slave) after you already have a database server in production, no problem: stop the PostgreSQL instance on the master, add the relevant IP to pg_hba.conf, configure the new slave as described above, then start PostgreSQL on the slave and restart it on the master.
