Johnny Morano's Tech Articles


Watch a directory for uploaded files

Posted on March 2, 2012 by Johnny Morano

In some situations, you need to watch a directory for uploaded files and move or process them immediately. Many scripts do this by polling: they check for new files every few seconds. Nowadays this approach is completely out of fashion and generally not cool.

A real watcher script uses the OS’s built-in features, and Linux is therefore the number one choice for this kind of operation.
The Linux kernel has a feature called inotify. inotify is an inode-based file system notification mechanism, which can be used from almost every real programming language.

The following example uses the Perl programming language. The Perl module that will be used is called Linux::Inotify2, which is able to catch the inotify notifications.

The example script will watch a certain directory for files (which will be uploaded using HTTP, FTP, SFTP or which will be just moved into that specific directory).
It will also use other Perl modules, such as:
– AnyEvent: for event looping
– File::Copy: to move the uploaded file to another directory
– Digest::MD5: to check whether a file has already been uploaded; we don’t want duplicates
– Storable: to keep a state file of uploaded files
– feature: to enable some of the cool newer features of Perl (such as say)

The first part of the script loads all the required modules and configures some variables that will be used in the script.

#!/usr/bin/perl
use strict; use warnings;

# Requirements
use AnyEvent;
use Linux::Inotify2;
use File::Copy qw/move/;
use Digest::MD5 qw/md5_hex/;
use Storable qw/nstore retrieve/;
use feature qw/say switch/;

# Config
my $savedir   = '/tmp/savedir';
my $chroot    = '/tmp/upload_dir';
my $statefile = '/tmp/upload_statefile';
my $state;

Next the script will check if there is already a statefile on disk and if so, the script will load it.
It will also catch the SIGTERM and SIGINT signals (when the script is killed or stopped) and store the statefile when the signals are caught.

if(-f $statefile){
    $state = retrieve($statefile);
}

# Store state by 'murder murder murder, kill kill kill'
$SIG{TERM} = sub { $state && nstore( $state, $statefile ); exit 0 };
$SIG{INT}  = $SIG{TERM};
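
The load-and-store cycle of the state file can be tried out on its own. The following is a minimal sketch, not part of the original script: the helper names load_state and save_state, the temporary directory and the sample entry are assumptions made for illustration. It also writes to a temporary file first and renames it, a small precaution that the script above does not take.

```perl
#!/usr/bin/perl
use strict; use warnings;
use Storable qw/nstore retrieve/;
use File::Temp qw/tempdir/;

# Hypothetical helper: return the stored state, or an empty hash
# reference when no state file exists yet.
sub load_state {
    my ($file) = @_;
    return -f $file ? retrieve($file) : {};
}

# Hypothetical helper: write the state to a temporary file and rename it,
# so an interrupted write does not leave a truncated state file behind.
sub save_state {
    my ($state, $file) = @_;
    my $tmp = "$file.tmp";
    nstore($state, $tmp) or die "storing state in $tmp failed\n";
    rename $tmp, $file or die "rename of $tmp to $file failed: $!\n";
    return 1;
}

# Demo in a throw-away directory (an assumption, not the script's /tmp paths)
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/upload_statefile";

my $state = load_state($file);                 # empty on the first run
$state->{'report.csv'} = 'd41d8cd98f00b204e9800998ecf8427e';  # sample MD5
save_state($state, $file);

my $reloaded = load_state($file);
print "report.csv => $reloaded->{'report.csv'}\n";
```

nstore writes the data in network byte order, so the state file stays readable when it is moved to a machine with a different architecture.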

Now the fun part starts. The script will
– initialize the event loop
– create a watcher
– wait for events.

Only three notification events will be watched:
– IN_MOVED_TO
– IN_CLOSE_WRITE
– IN_DELETE
Other events are not important here. When IN_CLOSE_WRITE is received, that is, when a file that was opened for writing has been closed (for example because an upload has finished), our action will be called. IN_MOVED_TO and IN_DELETE are only logged. For the sake of the example, the script will just move the new file into a new directory if it passes some tests.

# enable event loop
my $cv = AnyEvent->condvar;

# watch dir
my $dir_inotify = Linux::Inotify2->new;
my $dir_w = $dir_inotify->watch(
    $chroot,
    IN_MOVED_TO|IN_CLOSE_WRITE|IN_DELETE,
    sub {
        my $e = shift;
        my $filename = $e->fullname;

        # Test the individual event bits: the mask may carry extra flags
        # (such as IN_ISDIR), so comparing the whole mask is not reliable
        if($e->IN_MOVED_TO)         { say scalar localtime() . " file:$filename IN_MOVED_TO" }
        elsif($e->IN_CLOSE_WRITE)   { say scalar localtime() . " file:$filename IN_CLOSE_WRITE";
                                      check_n_move($filename);
                                    }
        elsif($e->IN_DELETE)        { say scalar localtime() . " file:$filename IN_DELETE" }
        else                        { say scalar localtime() . " The end of the world!";
                                      unlink $filename;
                                    }
        
    }
);
my $inotify_w; $inotify_w = AnyEvent->io (
       fh => $dir_inotify->fileno, poll => 'r', cb => sub { $dir_inotify->poll }
);

# Wait for events
$cv->recv();
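
One detail about the event mask is worth a small illustration. An inotify mask is a bit field, and the kernel may set additional flags such as IN_ISDIR next to the event bit. The sketch below does not need Linux::Inotify2: it defines the constants itself, with the numeric values from the kernel header linux/inotify.h, and the helper describe_mask is a hypothetical name used only for this example.

```perl
#!/usr/bin/perl
use strict; use warnings;

# Numeric values as defined in the kernel header <linux/inotify.h>;
# Linux::Inotify2 exports constants with the same names.
use constant {
    IN_CLOSE_WRITE => 0x00000008,
    IN_MOVED_TO    => 0x00000080,
    IN_DELETE      => 0x00000200,
    IN_ISDIR       => 0x40000000,
};

# Hypothetical helper: list the names of the bits that are set in a mask.
sub describe_mask {
    my ($mask) = @_;
    my @names;
    push @names, 'IN_MOVED_TO'    if $mask & IN_MOVED_TO;
    push @names, 'IN_CLOSE_WRITE' if $mask & IN_CLOSE_WRITE;
    push @names, 'IN_DELETE'      if $mask & IN_DELETE;
    push @names, 'IN_ISDIR'       if $mask & IN_ISDIR;
    return join '|', @names;
}

# A directory that is moved into the watched directory yields this mask
my $mask = IN_MOVED_TO | IN_ISDIR;

print "equality test: ", ($mask == IN_MOVED_TO ? "match" : "no match"), "\n";
print "bit test:      ", describe_mask($mask), "\n";
```

The equality test reports "no match" for this mask, while the bit test still finds IN_MOVED_TO. Comparing the whole mask therefore only works as long as no additional flag is set.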

The action used for the IN_CLOSE_WRITE notification is defined below. It is pretty straightforward and is left for the reader to analyze.

#
# The Subs
#
sub check_n_move {
    my ($filename) = @_;
    return unless -f $filename;
    my $basename = (split /\//, $filename)[-1];

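    # MD5 sum (open the file explicitly: a failed read via <> only warns
    # and never sets $@, so check the result and report $! instead)
    my $filedata;
    if(open my $fh, '<:raw', $filename){
        local $/;
        $filedata = <$fh>;
    }
    if(!defined $filedata){
        say scalar localtime() . " MD5 of $filename failed: $!";
        unlink $filename;
        return 0
    }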
    my $md5 = md5_hex($filedata);
    say scalar localtime() . " $filename has MD5sum $md5";

    # Check state
    if(exists($state->{$basename}) && $state->{$basename} ne $md5) {
        say scalar localtime() . " $filename already uploaded, but has new content, updating...";
    }
    elsif(exists($state->{$basename}) && $state->{$basename} eq $md5){
        say scalar localtime() . " $filename already uploaded, skipping...";
        unlink $filename;
        return 0
    }

    # Move file (move() returns false and sets $! on failure; it does not die,
    # so an eval block would never catch the error)
    my $newfilename = "$savedir/$basename";
    unless(move($filename, $newfilename)){
        say scalar localtime() . " move of $filename to $newfilename failed: $!";
        say scalar localtime() . " $filename removed, skipping...";
        unlink $filename;
        return 0
    }
    say scalar localtime() . " moved $filename to $newfilename";

    # Update state
    $state->{$basename} = $md5;

    return 0;
}
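
For readers who want to experiment with the duplicate check in isolation, here is a small sketch that uses core modules only. The helper names file_md5 and classify_upload and the sample file are assumptions made for this example, not part of the script above. Unlike check_n_move, it lets Digest::MD5 read the file through addfile, so a large upload does not have to be loaded into memory in one piece.

```perl
#!/usr/bin/perl
use strict; use warnings;
use Digest::MD5;
use File::Temp qw/tempfile/;

# Hypothetical helper: compute the MD5 of a file via addfile(),
# which reads from the file handle until end of file.
sub file_md5 {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "cannot open $path: $!\n";
    my $md5 = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $md5;
}

# Hypothetical helper mirroring the three cases of the state check:
# unknown name, known name with the same content, known name with new content.
sub classify_upload {
    my ($state, $basename, $md5) = @_;
    return 'new'       unless exists $state->{$basename};
    return 'duplicate' if $state->{$basename} eq $md5;
    return 'updated';
}

# Demo with a throw-away file
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "hello\n";
close $fh;

my %state;
my $md5 = file_md5($path);
print classify_upload(\%state, 'hello.txt', $md5), "\n";   # first sighting
$state{'hello.txt'} = $md5;
print classify_upload(\%state, 'hello.txt', $md5), "\n";   # same content again
```

The first call prints "new" and the second one "duplicate", which corresponds to the "move" and "skip" paths of check_n_move.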
