
imap2maildir: a tool for mirroring IMAP to maildir

07/04/09 09:25, by admin, Categories: Geekery, Howto

Link: http://github.com/rtucker/imap2maildir/

For a while now, I've had that paranoia kicking in about my online data. Almost all of us have a lot of useful information out there that is entirely under someone else's control: if someone messes up, or the wrong component fails, or the wrong business process fails, or a company goes out of business with your data as their asset, poof! It's gone. The only way to ensure that your data has a chance to survive is by properly backing it up.

I am a heavy user of Gmail, Google's gift to those who love e-mail. It's got a solid web interface, spam filtering that lets you forget spam exists, and (best of all) IMAP support. It's also got a huge-ass quota, so you can keep your mail around forever. This has resulted in me accumulating over 80,000 e-mails so far. That's a lot. In the unlikely event Gmail loses my mailbox, I'd be right miffed. However, I'd at least have my mail, thanks to... imap2maildir!

Faced with a lack of any quick and reliable way to do exactly what I wanted to do, I wrote a quick script to incrementally back up any IMAP mailbox (defaulting to Gmail) to a local maildir store. The maildir format stores each message in its own file, and is readable by pretty much any IMAP server one might need to deploy in a hurry.

This script kind of sucks, but that's what open source is all about, right? It's got a few limitations and is lacking some important stuff, and there's no companion maildir2imap tool yet, but I suppose if Gmail does eat my mailbox, I'll be motivated to create one.

So, give it a spin, and let me know what you think. It will definitely work fine under Linux, although you should, in theory, be able to use it with Windows (assuming the maildir filenames aren't eaten by NTFS). You can download it from http://github.com/rtucker/imap2maildir/, or you can clone the repository with git clone git://github.com/rtucker/imap2maildir.git.
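For the curious, here is a minimal sketch of the general idea of an incremental IMAP-to-maildir backup. It is not the actual imap2maildir code; it only assumes Python 3's standard imaplib and mailbox modules. The function names (load_seen_uids, backup_folder), the plain-text state file of already-fetched UIDs, and the file paths in the trailing comment are all made up for illustration. It also ignores UIDVALIDITY changes, which a real tool would probably want to handle.

```python
import imaplib  # only needed for the real connection shown at the bottom
import mailbox
import os


def load_seen_uids(state_path):
    """Return the set of UIDs recorded by earlier runs (empty on first run)."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path) as fh:
        return {line.strip() for line in fh if line.strip()}


def backup_folder(conn, maildir_path, state_path, folder="INBOX"):
    """Copy messages not yet listed in state_path into a local maildir.

    conn is an already-authenticated imaplib connection (or any object with
    compatible select() and uid() methods). Returns the number of messages
    added on this run.
    """
    seen = load_seen_uids(state_path)
    box = mailbox.Maildir(maildir_path, create=True)

    # Read-only select, so fetching does not mark anything as read.
    conn.select(folder, readonly=True)
    status, data = conn.uid("SEARCH", None, "ALL")
    if status != "OK":
        raise RuntimeError("UID SEARCH failed: %r" % (data,))
    uids = data[0].decode().split() if data and data[0] else []

    added = 0
    with open(state_path, "a") as state:
        for uid in uids:
            if uid in seen:
                continue  # already backed up on an earlier run
            status, parts = conn.uid("FETCH", uid, "(RFC822)")
            if status != "OK":
                continue
            # imaplib returns the message literal as a (header, body) tuple.
            raw = next((p[1] for p in parts if isinstance(p, tuple)), None)
            if raw is None:
                continue
            box.add(raw)            # one file per message in the maildir
            state.write(uid + "\n")  # remember it for next time
            added += 1
    return added


# Example wiring (hypothetical account details and file names):
#   conn = imaplib.IMAP4_SSL("imap.gmail.com")
#   conn.login("user@example.com", "password")
#   backup_folder(conn, "INBOX.maildir", "INBOX.uids")
```

Run it from cron and each pass only fetches whatever has arrived since the last one. Fetching one message at a time is slow on a big mailbox, but it keeps memory use bounded by the largest single message.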
5 comments

Comment from: jd [Visitor]
****-
Thanks for sharing this.

I was able to grab a few messages from Gmail, but eventually I get a MemoryError in Python's imaplib, on line 1150:

data = self.sslobj.read(size-read)

I understand this isn't your code but do you have any ideas on how to fix this? This is using Python 2.5 (on Windows, but that shouldn't matter).

Thanks...
08/07/09 @ 18:30
Comment from: Ryan [Member] Email · http://blog.hoopycat.com/
I have noticed that on my end, too. I've been working on a number of major changes in a new branch, which should help with memory performance... not yet ready for prime time (and it kinda took the back seat to some other projects recently), but I think it will be better in the long run.

Glad to know it at least tries to work under Windows! :-)

The new branch is at: http://github.com/rtucker/imap2maildir/tree/newiterator
08/08/09 @ 12:16
Comment from: jd [Visitor]
Looking forward to your changes...

08/08/09 @ 23:03
Comment from: Ryan [Member] Email · http://blog.hoopycat.com/
OK, I've reworked a lot of the IMAP handling stuff in the newiterator branch, and am able to process my mailbox of great girth (92,470 messages) with no abnormal memory issues.

It is slower, alas. It took about 10 hours on my mailbox (!), so I've added some additional caching and verification of the local files (aka "turbo mode"). This has reduced the time on my mailbox down to a less-unreasonable ~3 hours.

I think I might be able to do better, but this is a huge mailbox, and that's why we have cron, right?

On a 192-message mailbox:

master branch: 5.0 seconds
newiterator branch, no turbo mode: 33.0 seconds
newiterator branch, turbo mode: 2.1 seconds

But more importantly: no MemoryError crash!

I'll probably merge newiterator into master early next week, unless something bursts into flame... -rt
08/14/09 @ 12:38
Comment from: jd [Visitor]
Sent you a msg via gmail regarding your recent changes.
08/15/09 @ 13:43

Welcome to Ryan Tucker's standard output blog. Here, you'll find a variety of geeky projects, random prognostications, and other miscellany. Strive at all times to bend, fold, spindle, and mutilate.

© 1973-2010 by Ryan Tucker