Jul 23

Unidata Triggers, FTW!

Trying to reverse engineer a packaged application can be an exercise in frustration: you have to figure out what is being updated and how the tables/files relate to each other.

One surprisingly effective technique is to log every write. In recent versions of Unidata you can do that by creating an update trigger and a delete trigger on *every* table in your database, and then doing something in the application. This way you get to watch which tables are being updated, and take a peek at the before and after values to understand what’s changing.

X all the things!

TRIGGER.CREATE (github link) is an example of creating triggers on all the tables.

TRIGGER.LOG (github link) is an example of a routine to log the updates. In 8.x this gets easier/more efficient with “AFTER UPDATE” support.

Feb 22

Checking rsync or cp backups with md5sum

It’s nerve-wracking to move gigabytes of data around and not be absolutely sure that everything arrived intact. A neat (but time-consuming) way to check without creating huge intermediate files is to generate md5 checksums for each file on the local side, then send the results to the remote side and check those hashes against the checksums of the corresponding remote files.

# The full command:

$ find ./Ian -type f -print0 | xargs -0 md5sum | ssh www 'cd /home/owncloud/data/mcgowan/files ; md5sum --quiet -c -'

Let’s unpack that into the 3 components. It can be helpful to run each of these commands in isolation to see what they produce.

1. find ./Ian -type f -print0

This command searches the ./Ian directory for regular files (-type f, so no directories or other things), and then writes the list of files to stdout delimited with null characters (-print0). The -print0 is useful because the only characters that you can absolutely guarantee not to be in a filename are / and null, and / shows up in every path, so null is the only safe delimiter.
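
To see why the delimiter matters, here is a small sketch that counts what find produces with and without -print0. The scratch directory and file names are made up for illustration; one of the names deliberately contains a newline.

```shell
# Hypothetical scratch directory with three files; names are illustrative only.
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/with space.txt"
touch "$demo/$(printf 'with\nnewline.txt')"

# Newline-delimited: the name containing a newline is split across two lines.
lines=$(( $(find "$demo" -type f | wc -l) ))

# Null-delimited: count the NUL bytes, exactly one per file.
nulls=$(( $(find "$demo" -type f -print0 | tr -cd '\0' | wc -c) ))

echo "lines=$lines nulls=$nulls"
rm -rf "$demo"
```

Three files come out as four lines but three null-terminated records, which is why anything downstream that splits on newlines would be handed two bogus names.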

2. xargs -0 md5sum

This is the easy one. xargs reads the null-terminated (-0) names from the find command and passes them to md5sum as arguments, batching many names into each md5sum run. md5sum writes one checksum and filename per line to stdout. This looks something like “d41d8cd98f00b204e9800998ecf8427e  ./Ian/null.txt”
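
A quick way to get a feel for this step is to run it on a couple of throwaway files. This is only a sketch: the directory and file names are invented, and it assumes GNU md5sum is on the path. The empty file should come out with the same hash as the null.txt example above, since that is the md5 of zero bytes.

```shell
# Hypothetical scratch directory; file names are illustrative only.
demo=$(mktemp -d)
: > "$demo/null.txt"                  # an empty file
printf 'hello\n' > "$demo/hello.txt"  # a file with some content

# Both names are handed to md5sum as arguments; output is "hash  name" per line.
sums=$(cd "$demo" && find . -type f -print0 | xargs -0 md5sum | sort -k2)
echo "$sums"

# Pull out the hash reported for the empty file.
empty_hash=$(printf '%s\n' "$sums" | awk '$2 == "./null.txt" { print $1 }')
echo "empty file hash: $empty_hash"
rm -rf "$demo"
```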

3. ssh www 'cd /home/owncloud/data/mcgowan/files ; md5sum --quiet -c -'

There’s a lot going on in this command. We’re using ssh to run a command on another server (www), not to log in to it. A subtle point here is that the stdout from the previous command (md5sum) is connected to the stdin of whatever command is run on the remote side. This extends our pipeline of commands to a different machine (in this example, a machine in a different country!). Another easy-to-miss trick is that we are not getting prompted to log in to www, because we are using ssh-agent and password-less (key-based) ssh logins.

The command run by ssh is everything between the single quotes. Since the files are in a different directory on the remote side, the first sub-command is a “cd” to get us to the right place. Then we run “md5sum -c -” to check the lines of data that are piped into our ssh command; the trailing “-” tells md5sum to read the checksum list from stdin. Since we don’t want to see files that match, the “--quiet” option is added. Remove that when starting out to see a lot of “OK” messages that make you feel good about your pristine backup.
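
If you want to try the whole pipeline before pointing it at a real server, a rough local rehearsal is possible with a subshell standing in for ssh. Everything here is an assumption for illustration: the scratch directories, the file names, and the deliberately damaged copy. It also assumes GNU md5sum.

```shell
# Hypothetical layout: "local" is the source, "remote" stands in for the server.
work=$(mktemp -d)
mkdir -p "$work/local/Ian" "$work/remote"
printf 'one\n' > "$work/local/Ian/a.txt"
printf 'two\n' > "$work/local/Ian/b.txt"
cp -R "$work/local/Ian" "$work/remote/"

# Simulate a damaged copy on the "remote" side.
printf 'corrupted\n' > "$work/remote/Ian/b.txt"

# Same pipeline as the full command, with a subshell in place of ssh.
status=0
report=$( (cd "$work/local" && find ./Ian -type f -print0 | xargs -0 md5sum) |
          (cd "$work/remote" && md5sum --quiet -c - 2>&1) ) || status=$?

echo "$report"
echo "exit status: $status"
rm -rf "$work"
```

The report should flag ./Ian/b.txt as FAILED and say nothing about a.txt, and md5sum exits non-zero, which is handy in scripts. Keep in mind that the check only covers files that exist on the local side: a file missing on the remote side is reported as a failure, but an extra file on the remote side goes unnoticed.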