Project: File Conveyor: better Unicode support + event merging

published on April 4, 2012

Technologies: File Conveyor, Python
Time range: August 2011
As a: freelancer

The problem at http://whitehouse.gov was that files were being synced too slowly from the web servers to the CDN, because rsync needs to scan the entire directory tree to detect changes. File Conveyor is faster, because it relies on inotify to detect changes.

Compatibility with Unicode filenames was improved (this can be quite painful in Python 2.x, especially in combination with pysqlite), and automatic restart after unhandled exceptions was added. There were also many minor improvements, such as better documentation, better logging and thread naming.
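To illustrate the idea of restarting after an unhandled exception, here is a minimal sketch. It is not File Conveyor's actual code: the function name `run_with_restart` and its `max_restarts` and `delay_seconds` parameters are hypothetical and chosen only for this example.

```python
import logging
import time


def run_with_restart(target, max_restarts=5, delay_seconds=1.0):
    """Call target() and call it again whenever it raises an exception.

    Hypothetical helper for illustration only. Returns target()'s result
    on a clean exit and re-raises the last exception once max_restarts
    restarts have been used up.
    """
    restarts = 0
    while True:
        try:
            return target()
        except Exception:
            # Log the full traceback so the failure can be investigated later.
            logging.exception(
                "unhandled exception (restarts used so far: %d of %d)",
                restarts, max_restarts)
            if restarts >= max_restarts:
                raise
            restarts += 1
            # Short pause so a persistent failure does not cause a busy loop.
            time.sleep(delay_seconds)
```

Capping the number of restarts is a design choice of this sketch: a failure that never goes away eventually surfaces instead of being retried forever.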

Also notable: so-called “event merging” was implemented. Depending on the file system usage patterns, this can greatly speed up syncing by merging queued file system events. For example, a queued “delete”, “create”, “modify” event sequence can be merged into a single “modify” event, so that only one file system operation needs to be synced instead of three.
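The merging idea can be sketched as follows. This is only an illustration, not File Conveyor's implementation: the rule table `MERGE_RULES`, the function `merge_events` and the representation of events as `(path, event)` pairs are all assumptions made for this example. Only the “delete”, “create”, “modify” case comes from the description above.

```python
from collections import OrderedDict

CREATE, MODIFY, DELETE = "create", "modify", "delete"

# (queued event, incoming event) -> merged event.
# None means the two events cancel each other out.
# These rules are assumptions made for this sketch.
MERGE_RULES = {
    (CREATE, MODIFY): CREATE,  # a new file that was then edited is still new
    (CREATE, DELETE): None,    # created and removed again: nothing to sync
    (MODIFY, MODIFY): MODIFY,
    (MODIFY, DELETE): DELETE,
    (DELETE, CREATE): MODIFY,  # replaced file: sync it as a modification
}


def merge_events(events):
    """Collapse (path, event) pairs to at most one event per path."""
    merged = OrderedDict()
    for path, event in events:
        if path not in merged:
            merged[path] = event
            continue
        key = (merged[path], event)
        if key not in MERGE_RULES:
            # Unexpected sequence: keep the most recent event.
            merged[path] = event
            continue
        result = MERGE_RULES[key]
        if result is None:
            del merged[path]
        else:
            merged[path] = result
    return list(merged.items())


queue = [("logo.png", DELETE), ("logo.png", CREATE), ("logo.png", MODIFY)]
print(merge_events(queue))  # [('logo.png', 'modify')]
```

With these rules, the three queued events for `logo.png` collapse into a single “modify”, so the file only has to be synced once.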