Backing up a Drupal Database

I was just dumping a database using mysqldump and I noticed that tables such as cache (and its cousins cache_page, cache_menu, etc), sessions and watchdog can be pretty big and are also not often essential for backing up. I mean, when you restore your web site do you really care about restoring people's logged in sessions from when the backup took place? I can understand maybe keeping watchdog; but then again should you lose your site you would probably lose it several hours after the backup so would miss out on any relevant watchdog notices.

Anywho… I did a mysqdump for a large database for a website I maintain and the dump came out at 400Mb. I then spent a few minutes cobling together a small script which would do a mysqldump but had some pre-programmed Regular Expressions to match specific groups of tables which it would only dump the structure for (ie, no data). After running this script, the SQL dump was only 220Mb. Much better! It also runs considerable quicker too and will cause less table locking.

So - the script?

#!/bin/bash

# Define DB Login
USER="username"
PASS="password"

# Get the database from the commandline
DB=$1

# Define the 'structure only' tables
STRUCTURE_ONLY="/^(prefix1_|prefix2_)?(watchdog|sessions|cache(_.+)?)$/"

# Get the tables from the database
TABLES=`mysql -u$USER -p$PASS -B -N -e 'show tables;' $DB`

# Create the SQL file
DBFILE="${DB}.sql"
 > $DBFILE

# Status message
echo "Starting dump of ${DB}"

# Loop over the tables
for t in $TABLES; do
  # Test if the table matches the 'structur only' regex
  RESULT=`echo "$t" | gawk "$STRUCTURE_ONLY"`

  # if a match...
  if [ $RESULT ]
  then
    # ... dump structure only onto the end of the SQL file
    mysqldump --opt --no-data --user=$USER --password=$PASS $DB $t >> $DBFILE
  else
    # dump full table onto the end of the SQL file
    mysqldump --opt --user=$USER --password=$PASS $DB $t >> $DBFILE
  fi
done

# Finish Message
echo "Done"
[adsense:336x280:9994499560]

There are a few lines in here you might want to configure… For example, the USER & PASS variables will need to be the ones for your database. This script also assumes you're running on localhost. The variable STRUCTURE_ONLY hold the Regular Expression to match the 'no data' tables. You might want to change this to match your database structure more accurately - for example you might want to change or remove the prefix option. In my case I have 1 database hosting 3 sites, 2 of which have a prefix but share several tables with no prefix (user, session, sequences, etc). This is how it would look for a database with no prefixes.

STRUCTURE_ONLY="/^(watchdog|sessions|cache(_.+)?)$/"

To run the script, copy the code into a file (eg, 'drupaldump') and save it. Make it executable by running something like this:

$ chmod +x drupaldump

The finally run the script as follows…

$ ./drupaldump [database]

… where [database] is the name of the database you'd like to dump.

Please leave a comment if you can think of any improvements! There are a few I can think of but don't need (ie, its not my itch!) such as option to pass username and password in from the command line, changing the host for the database… There is plenty of space for improvement.