Hi there,
We had a problem with duplicate entries in the archives: every day some messages would show up twice, three times, or even more often. Tobit's answer was to upgrade to Zehn, which we'll do shortly. In the meantime, a little Perl script, run nightly from a cron job, de-duplicates everything; it gets through the entire server in under a minute. It leaves its backup files in the archive directories, but we haven't needed them yet, so I just delete them manually once a month.
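The cron entry itself is nothing special; something along these lines does the job (the script path, run time, and log file are only examples, adjust them to your setup):

30 3 * * * /usr/local/bin/david-dedupe.pl >> /var/log/david-dedupe.log 2>&1

If you ever want to automate the monthly cleanup of the backup files too, a find like this would work (the 30-day cutoff is only a suggestion):

find /usr/david/archive -name 'archive.dir.bak*' -mtime +30 -delete

Here's the script itself: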
Perl
#!/usr/bin/perl -w
use strict;
use POSIX qw(strftime);
# Timestamp used as the backup suffix, so each run keeps its own .bak files.
my $timestamp = strftime "%Y%m%d%H%M%S", localtime;
$| = 1;
# Stop the David service before touching the archive files.
system "/usr/david/util/linux/dvstop";
print "Tobit David DeDupe, (c) 2008 Convolution.\n";
print "Deduping...";
# For every archive.dir: edit in place (keeping a timestamped backup), read
# fixed 430-byte records, and print a record only the first time the key built
# from the three NUL-terminated strings at offset 113 has been seen.
system qq(find /usr/david/archive -name archive.dir -exec perl -i.bak$timestamp -ne'BEGIN { \$/ = \\430; } print unless \$seen{ join ":", sort unpack "x113 Z* Z* Z*", \$_ }++' {} \\;);
print " done.\n";
# Restart David once de-duplication is finished.
system "/usr/david/util/linux/dvstart";
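For anyone who wants to see what the one-liner actually does, here is the same logic expanded into a standalone sketch. It reads an archive.dir given on the command line and writes the de-duplicated records to STDOUT instead of editing in place; the record size (430 bytes), the 113-byte offset, and the three NUL-terminated fields are taken straight from the one-liner above.

#!/usr/bin/perl -w
use strict;

# Read the input in fixed 430-byte records instead of lines.
$/ = \430;

my %seen;
while ( my $record = <> ) {
    # Skip the first 113 bytes, then take the next three NUL-terminated
    # strings; sorted and joined they form the de-duplication key.
    my $key = join ":", sort unpack "x113 Z* Z* Z*", $record;
    print $record unless $seen{$key}++;
}

You would run it as something like: perl dedupe_readable.pl archive.dir > archive.dir.new. It's only meant for reading along; the one-liner above is what actually runs each night.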
Feel free to copy and modify it.