IMAP folders are really because you can have your e-mails synchronized on multiple devices, without losing e-mails across your devices when retrieving your new e-mails. IMAP folders actually also aren’t that cool because e-mails are usually never deleted or even archived. Having millions of e-mails can make some e-mail readers on certain devices really slow.
The below script is an example how to clean and archive e-mails. The script was written in Perl and tested on a Courier IMAP server. Remember the Perl devise: there are million ways to write Perl scripts.
The following modules were used:
use Net::IMAP::Simple::SSL; use Email::Simple; use Getopt::Long qw/:config bundling/; use DateTime; use YAML qw/LoadFile/; use Log::Log4perl; use Pod::Usage; use Data::Dumper;
The configuration file is written in YAML and the logging of the script is handled by Log4Perl.
log4perl.logger.shihai= DEBUG, shihaiLogfile log4perl.logger.hostlistdb= DEBUG, shihaiLogfile log4perl.appender.shihaiLogfile = Log::Log4perl::Appender::File log4perl.appender.shihaiLogfile.filename = /var/tmp/shihai.log log4perl.appender.shihaiLogfile.layout = Log::Log4perl::Layout::PatternLayout log4perl.appender.shihaiLogfile.layout.ConversionPattern = %d %p [%x][%r millis][%c][%F{1}:%L][%M] %m%n
The configuration file contains the login credentials and the threshold values:
imap: host: 'mail.example.com' user: 'bob@example.com' pass: 'myultrasecretpassword' threshold: archive: years: 3 delete: years: 8
The threshold values are in fact the same parameters that can be used in the DateTime method ‘subtract()’;
The main loop of the program will:
- Connect to the IMAP server
- Retrieve all IMAP folders and loop through them
- Loop through all messages in each mailbox
# Connect to the IMAP server my $imap = connect_imap( $config->{imap} ); # get all mailboxes and loop through them my @mailboxes = get_mailboxes($imap); foreach my $mailbox_name (@mailboxes){ # Skip all Archive boxes next if $mailbox_name =~ /Archive/; # select the mailbox and get the number of messages my $mb = $imap->select($mailbox_name); unless(defined $mb){ $logger->error("Mailbox [$mailbox_name] doesn't exist: ", $imap->errstr()); next; } $logger->info("Scanning $mailbox_name"); # loop through the messages foreach my $i (1 .. $mb){ my ($from, $subject, $date, $year) = get_mail_header($imap, $i); if(defined $date){ if($date < $delete_date ){ delete_mail($imap, $i); } elsif($date < $archive_date){ $logger->info("Archiving [$i][$from][$subject][$date]"); my $archive_box = get_archive_box($imap, $mailbox_name); move_mail($imap, $i, $archive_box) } } } $imap->expunge_mailbox($mailbox_name); } $imap->quit;
Let’s go through the different subroutines called in the mainloop.
connect_imap()
sub connect_imap { my ($cfg) = @_; my $logger = Log::Log4perl->get_logger('shihai.archive_mail'); my $imap = Net::IMAP::Simple::SSL->new($cfg->{host}) or $logger->logdie("Unable to connect to IMAP server: $Net::IMAP::Simple::errstr"); $logger->info("Connected to IMAP host $cfg->{host}"); unless( $imap->login($cfg->{user}, $cfg->{pass}) ){ $logger->logdie("Login failed: ", $imap->errstr); } $logger->info("Logged to IMAP host $cfg->{host} as user '$cfg->{user}'"); return $imap; }
The parameters expected in the ‘$cfg’ hash are:
- host
- user
- pass
get_mailboxes()
sub get_mailboxes { my ($imap) = @_; my @mailboxes = $imap->mailboxes; my $logger = Log::Log4perl->get_logger('shihai.archive_mail'); return @mailboxes; }
Ok, I admit, I shouldn’t have created an extra subroutine for it… but i was kind of in a flow!
get_mail_header()
sub get_mail_header { my ($imap, $i) = @_; my $logger = Log::Log4perl->get_logger('shihai.archive_mail'); my $header = $imap->top($i); unless( $header ){ $logger->error("No header found for message $i in ", $imap->current_box); } my $email = Email::Simple->new(join '', @{ $header }); unless( $email ){ $logger->error("No Email::Simple object, skipping..."); return } my ($subject) = $email->header('Subject'); my ($date) = $email->header('Date'); my ($from) = $email->header('From'); # $logger->debug("Got e-mail [$from] [$subject] [$date]"); unless(defined $date){ $logger->error("No date found: ", $email->header_obj->as_string); delete_mail($imap, $i); return; } my($junk, $day, $month, $year) = ( $date =~ m/(...,\s+)?([0-9]{1,2})\s+(...)\s+(\d{4})/ ); my $date_obj; if(defined $year && defined $month && defined $day){ $date_obj = DateTime->new( year => $year, month => $months{$month}, day => $day, ); } return ($from, $subject, $date_obj, $year, $month, $day); }
This subroutine takes the ‘$imap’ object and the message number as input parameters. It will then retrieve the mail header and convert it to an ‘Email::Simple’ object. I’ve chosen this module so I can easily extract header fields.
If no ‘Date:’ field was found in the e-mail, then the tool will just delete the email. I don’t like e-mails with wrong or missing headers 😉 (they’re usually spam anyway).
delete_mail()
sub delete_mail { my($imap, $i) = @_; my $logger = Log::Log4perl->get_logger('shihai.archive_mail'); if( $imap->delete($i) ){ $logger->info("Deleted message number $i from ", $imap->current_box); } }
Pretty straightforward.
get_archive_box()
sub get_archive_box{ my($imap, $mailbox_name) = @_; my $logger = Log::Log4perl->get_logger('shihai.archive_mail'); my ($archive_box) = ($mailbox_name); $archive_box =~ s/INBOX/INBOX.Archive/; if( not grep /^$archive_box$/, @mailboxes) { create_mailbox($imap, $archive_box); subscribe($imap, $archive_box); } return $archive_box; }
This subroutine will assemble the archive mailbox name. It will then check if the mailbox already exists and otherwise create it and subscribe to it.
create_mailbox()
sub create_mailbox { my($imap, $mb) = @_; my $logger = Log::Log4perl->get_logger('shihai.archive_mail'); $imap->create_mailbox($mb) or $logger->logdie("Mailbox creation '$mb' failed: ", $imap->errstr()); $logger->info("Created mailbox $mb"); }
It will basically just create the mailbox and log about it.
subscribe()
sub subscribe { my($imap, $box) =@_; my $logger = Log::Log4perl->get_logger('shihai.archive_mail'); $imap->folder_subscribe($box); $logger->info("Subscribed to mailbox $box"); }
move_mail()
sub move_mail { my($imap, $i, $new_box) = @_; my $logger = Log::Log4perl->get_logger('shihai.archive_mail'); if( $imap->copy($i, $new_box) ){ $logger->info("Copied message number [$i] from ", $imap->current_box ," to [$new_box]"); delete_mail($imap, $i); } }
Moving an e-mail consists of copying it first to the new mailbox and then afterwards removing it from the old mailbox.
And that’s basically it!
Hi, good job.
can you post all script for donwload?