1. Locking for Creating and Writing Files

    Posted January 05, 2009 at 10:00 AM

    Patrice Neff's recent blog post regarding atomic file overwriting raises some interesting questions. Coincidentally, I've just been dealing with similar issues while implementing the comment feature of this blog.

    I wanted to make sure that when a comment is posted, it gets a unique filename, and doesn't overwrite any existing comments. The filename format I chose for each comment is timestamp-based, but I wanted the resolution to only go down to the second. Since more than one person could post a comment in the same second, I also append a counter to the filename. The naïve way to do this would be something like this:

    for i in itertools.count():
        fn = '%s-%d' % (timestamp, i)
        if os.path.exists(fn):
            continue
        with open(fn, 'w') as f:
            f.write(comment)
        break
    

    The problem with the code above is that there is a race condition. Another thread or process could create the same file between the call to os.path.exists and the call to open, causing this code to overwrite the other comment. The solution I came up with, which is slightly different from Patrice's, is this:

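    import errno
    import itertools
    import os

    for i in itertools.count():
        fn = '%s-%d' % (timestamp, i)
        try:
            # Create the file, fail if it exists, and lock it, all in one call.
            # O_EXLOCK is a BSD extension (available on Mac OS X, not on Linux).
            fd = os.open(fn, os.O_CREAT|os.O_EXCL|os.O_EXLOCK|os.O_WRONLY)
            break
        except OSError as err:
            # EEXIST means the name is taken, so try the next counter value.
            if err.errno != errno.EEXIST:
                raise
    with os.fdopen(fd, 'w') as f:
        f.write(comment)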
    

    The way this works is by squeezing the check for existence, opening of the file, and acquisition of the lock all into one atomic system call. The check for existence is accomplished by using the O_CREAT and O_EXCL flags. Using these flags, if the file already exists, the open call will raise the OSError with the EEXIST code. If that happens, the loop continues.
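    The exclusive-create half of this trick works on any POSIX system, even where O_EXLOCK is unavailable. Here is a minimal sketch with the locking left out. The function name create_unique and its directory parameter are made up for illustration, and it relies on Python 3's FileExistsError, which is what an OSError carrying EEXIST becomes there:

```python
import itertools
import os


def create_unique(directory, timestamp, comment):
    """Write comment to the first free '<timestamp>-<n>' name; return the path.

    Hypothetical helper. O_CREAT|O_EXCL turns "check, then create" into a
    single atomic step, so two writers can never claim the same name.
    """
    for i in itertools.count():
        fn = os.path.join(directory, '%s-%d' % (timestamp, i))
        try:
            fd = os.open(fn, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
            break
        except FileExistsError:
            # Someone else already owns this name; try the next counter.
            continue
    with os.fdopen(fd, 'w') as f:
        f.write(comment)
    return fn
```

    Without the lock, a reader could still see a half-written file, so this only covers the "don't overwrite another comment" requirement.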

    The lock is acquired by specifying the O_EXLOCK flag. One thing we have to be careful of with this solution is that the lock is just an advisory lock, meaning that the OS doesn't enforce the lock—we have to explicitly check for it when reading. The corresponding code I use for reading comments is something like this:

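    # read_comment is an illustrative name for the enclosing function.
    def read_comment(fn):
        try:
            # Probe for the writer's lock without waiting for it to be released.
            # O_SHLOCK, like O_EXLOCK, is a BSD extension.
            fd = os.open(fn, os.O_SHLOCK|os.O_NONBLOCK)
            os.close(fd)
        except OSError as err:
            # EAGAIN means a writer still holds the exclusive lock.
            if err.errno == errno.EAGAIN:
                return None
            raise
        with open(fn, 'r') as f:
            comment = f.read()
        return comment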
    

    Note that this does not bother waiting for the file to be unlocked (O_NONBLOCK), and the lock check and the subsequent open for reading are two separate steps rather than one atomic operation. For my purposes this is sufficient, however: given the way the writing code above works, we know that if the file exists and is not locked, the full comment has been written to disk (barring power failures and the like). If the file is locked, we just return None, which we interpret as “this comment doesn't exist (yet).”
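    O_EXLOCK and O_SHLOCK are BSD extensions and are missing on Linux. A rough equivalent of the reader-side check, assuming the writer holds an fcntl.flock exclusive lock while it writes, might look like the sketch below. The name read_if_unlocked is made up. Unlike the O_EXLOCK version, a flock-based writer has to take its lock in a separate call after creating the file, so a small window remains in which the file exists but is not yet locked.

```python
import errno
import fcntl
import os


def read_if_unlocked(fn):
    """Return the file's contents, or None if a writer still holds a lock."""
    fd = os.open(fn, os.O_RDONLY)
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
    except OSError as err:
        os.close(fd)
        if err.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            # Someone holds an exclusive lock: treat the comment as absent.
            return None
        raise
    # The shared lock is held while reading and released when f is closed.
    with os.fdopen(fd, 'r') as f:
        return f.read()
```

    This variant keeps the shared lock for the duration of the read instead of probing and closing first, which is a design choice of the sketch and not something the code above does.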

    All this code exists to make sure we don't send a comment to a browsing user if it is only partially written to disk. My initial solution to this was to rely on the atomicity of os.rename, but that has a limitation: if the destination file exists, it will be silently overwritten (on POSIX anyway, which is all I'm concerned with supporting). This doesn't work for me, since I don't want to overwrite another comment, and if I check with os.path.exists first, I run into that race condition again. If we do want to atomically overwrite something, we can just write it under a temporary name and os.rename it into place.
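    For the overwrite case, the rename approach can be sketched roughly as follows. The helper name atomic_overwrite is hypothetical, and the sketch assumes a POSIX filesystem where rename within one directory is atomic:

```python
import os
import tempfile


def atomic_overwrite(path, data):
    """Replace path with data; readers see the old or the new file, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the destination,
    # so it is created in the destination's own directory.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.rename(tmp, path)
    except BaseException:
        # Writing or renaming failed, so the temporary file is still there.
        os.unlink(tmp)
        raise
```

    Note that mkstemp creates the file readable and writable only by its owner, so a real version might need to adjust the permissions before the rename.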

  2. py2app and Virtualenv

    Posted December 19, 2008 at 09:00 PM

    With the recent fervor over pip, I decided to give it and the very slick virtualenv a try. One of my current personal projects is built using PyObjC, so I created a new virtualenv and stuck PyObjC in there. I fired up an interpreter, tried an import objc, and all was well. Cool!

    The trouble began when I tried to build my project under my new virtualenv. I activated the virtualenv and ran python setup.py py2app -A to build the app bundle in alias mode. Then, when I tried to invoke my app, I got a nasty error on my console:

    A Python runtime could be located. You may need to install a framework build of Python, or edit the PyRuntimeLocations array in this application's Info.plist file.

    I assumed the message was trying to tell me that a runtime could not be located, so I tried adding a PyRuntimeLocations array to the plist, pointing to the .Python symlink that virtualenv creates, which in turn points to the real Python framework runtime. This took care of that error message, but then I got a new one. Progress.
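    Editing the plist by hand works, but the same change can be scripted. The following is only a sketch using Python 3's plistlib. The function name add_runtime_location is made up, and the paths passed to it depend on where the bundle and virtualenv live:

```python
import plistlib


def add_runtime_location(info_plist_path, runtime_path):
    """Put runtime_path first in the bundle's PyRuntimeLocations array."""
    with open(info_plist_path, 'rb') as f:
        info = plistlib.load(f)
    # Create the array if the plist does not have one yet.
    locations = info.setdefault('PyRuntimeLocations', [])
    if runtime_path not in locations:
        locations.insert(0, runtime_path)
    with open(info_plist_path, 'wb') as f:
        plistlib.dump(info, f)
```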

    I didn't much like the idea of changing the PyRuntimeLocations array in the application, since in a perfect world, an app built with py2app should just work out of the box, even in a virtualenv. So I did a quick search to see if anyone else had ever tried running py2app under a virtualenv. My search eventually led me to Gary Bernhardt's Halloween post on pythonmac-sig. Later in that thread Gary says, “If I copy libpython2.5.a to where py2app wants the .dylib to be, it will build an app.” Aha.

    This gave me the idea to open up the default plist that py2app generates for the bundle to see where it pointed. The obvious entry pointed to <prefix>/lib/libpython2.6.dylib. I added a symlink with that name in my virtualenv's lib directory, pointing to ../.Python, and tried building the app again with the default value for PyRuntimeLocations. This worked, and I was back at the same place as before, with that new error message, which was an ImportError like the one Gary was getting.

    That new error message was a scary-looking traceback with a bunch of \x00's and the like. The scary-looking string of 0's has to do with how py2app builds app bundles in alias mode. A closer look at the traceback revealed an ImportError: No module named Carbon.File. That's odd. The Carbon package is in the Python standard library, not even part of PyObjC. How could it be causing problems? I fired up the virtualenv's interpreter and did a quick import Carbon.File. No error. Hmm.

    I looked in the lib directory of my virtualenv to see what was there. There were a bunch of symlinks to the real python's lib dir, but no plat-mac, which is where the Carbon package lives. I didn't know exactly how virtualenv coaxed python into finding modules at this point, but I figured it would work if I added a symlink for plat-mac as well, so I tried it. It worked, but gave me a new ImportError. After repeating this process a few times, I finally got the same error Gary was getting, in distutils.

    At this point, I decided to dive into the virtualenv code to see just how it worked. Adding a bunch of symlinks didn't seem like a great solution anyway.

    I knew that virtualenv did some of its magic by manipulating sys.path, but didn't know just how it worked. It turns out virtualenv installs its own site.py in a new virtualenv, which is where sys.path is constructed. The virtualenv site.py figures out where the real Python is, and adds the real Python's lib, as well as the virtualenv's lib, to sys.path, so that when the virtualenv is activated, Python can find all the right modules. It's a pretty genius solution.
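    To make the idea concrete, here is a toy version of that path construction. It is not virtualenv's actual site.py: the function name, the set of directories, and their order are all assumptions chosen to illustrate the technique.

```python
import os


def combined_sys_path(venv_prefix, real_prefix, version='2.6'):
    """Return search-path entries covering the virtualenv and the real Python."""
    name = 'python' + version
    venv_lib = os.path.join(venv_prefix, 'lib', name)
    real_lib = os.path.join(real_prefix, 'lib', name)
    return [
        # Virtualenv entries come first so its packages win.
        venv_lib,
        os.path.join(venv_lib, 'site-packages'),
        # Then the real Python's standard library.
        real_lib,
        os.path.join(real_lib, 'plat-mac'),  # where the Carbon package lives
        os.path.join(real_lib, 'lib-dynload'),
    ]
```

    A site.py built on this idea would extend sys.path with the returned list at interpreter startup.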

    Finally, a lightbulb went off and I checked the bundle py2app built just to see. Yep, py2app also has a custom site.py to make sure your app's interpreter can find all the right modules too. That's it!

    I hacked around a bit and moved virtualenv's magic into the virtualenv's sitecustomize.py, which gets imported by site.py, even in a py2app bundle. I removed the virtualenv's site.py and symlinked it to the real python's version. I double-checked the interpreter to make sure I could import all the right things. That all worked, and, most satisfyingly, the py2app-built app worked as well! Woohoo!

    I stuck all this into a script to automate the process of making a new virtualenv py2app-friendly. You can download it here. Create your virtualenv and activate it, then run fix-pyobjc-venv.py. This will create the symlink to your .Python file, add sitecustomize.py to your virtualenv's lib dir, and change site.py into a symlink to the real python's version. Stick PyObjC in your virtualenv, and away you go!
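    The three steps can be sketched roughly as follows. This is not the actual fix-pyobjc-venv.py. The function name, its parameters, and the exact location of sitecustomize.py are assumptions, and the contents of sitecustomize.py are passed in as text instead of being generated.

```python
import os


def make_py2app_friendly(venv, real_lib, sitecustomize_source, version='2.6'):
    """Apply the three changes described above to the virtualenv at venv.

    Hypothetical helper: real_lib is the real Python's lib/pythonX.Y directory
    and sitecustomize_source is the text to write into sitecustomize.py.
    """
    lib = os.path.join(venv, 'lib')
    py_lib = os.path.join(lib, 'python' + version)

    # 1. Provide the dylib name py2app looks for, pointing at .Python.
    dylib = os.path.join(lib, 'libpython%s.dylib' % version)
    if not os.path.lexists(dylib):
        os.symlink(os.path.join('..', '.Python'), dylib)

    # 2. Install the sys.path setup as sitecustomize.py (location assumed).
    with open(os.path.join(py_lib, 'sitecustomize.py'), 'w') as f:
        f.write(sitecustomize_source)

    # 3. Swap virtualenv's site.py for a symlink to the real Python's copy.
    site_py = os.path.join(py_lib, 'site.py')
    if os.path.lexists(site_py):
        os.remove(site_py)
    os.symlink(os.path.join(real_lib, 'site.py'), site_py)
```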

Zognot.org