Locking for Creating and Writing Files
Patrice Neff's recent blog post regarding atomic file overwriting raises some interesting questions. Coincidentally, I've just been dealing with similar issues while implementing the comment feature of this blog.
I wanted to make sure that when a comment is posted, it gets a unique filename, and doesn't overwrite any existing comments. The filename format I chose for each comment is timestamp-based, but I wanted the resolution to only go down to the second. Since more than one person could post a comment in the same second, I also append a counter to the filename. The naïve way to do this would be something like this:
for i in itertools.count():
    fn = '%s-%d' % (timestamp, i)
    if os.path.exists(fn):
        continue
    with open(fn, 'w') as f:
        f.write(comment)
    break
The problem with the code above is that it contains a race condition. Another thread or process could create the same file between the call to os.path.exists and the call to open, causing this code to overwrite the other comment. The solution I came up with, which is slightly different from Patrice's, is this:
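for i in itertools.count():
    fn = '%s-%d' % (timestamp, i)
    try:
        # Check, create, and lock in a single atomic system call.
        fd = os.open(fn, os.O_CREAT|os.O_EXCL|os.O_EXLOCK|os.O_WRONLY)
        break
    except OSError as err:
        # EEXIST means the name is taken; try the next counter value.
        if err.errno != errno.EEXIST:
            raise
with os.fdopen(fd, 'w') as f:
    f.write(comment)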
The way this works is by squeezing the check for existence, the opening of the file, and the acquisition of the lock into one atomic system call. The check for existence is accomplished with the O_CREAT and O_EXCL flags: if the file already exists, os.open raises an OSError with the EEXIST code. If that happens, the loop moves on to the next counter value.
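To try the existence check in isolation, here is a minimal self-contained sketch. It assumes Python 3 and deliberately leaves out O_EXLOCK so that it also runs on platforms that do not define that flag; the function name create_unique, its directory parameter, and the 0o644 mode are my own choices for illustration, not part of the blog's code.

```python
import errno
import itertools
import os


def create_unique(directory, timestamp, comment):
    """Write comment to the first free '<timestamp>-<i>' file and return its path.

    Hypothetical helper for illustration only. O_EXLOCK is omitted on
    purpose, so this shows the atomic create but takes no lock.
    """
    for i in itertools.count():
        fn = os.path.join(directory, '%s-%d' % (timestamp, i))
        try:
            # O_CREAT|O_EXCL makes "check and create" one atomic step:
            # the call fails with EEXIST if the name is already taken.
            fd = os.open(fn, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except OSError as err:
            if err.errno != errno.EEXIST:
                raise
            continue  # name taken, try the next counter value
        with os.fdopen(fd, 'w') as f:
            f.write(comment)
        return fn
```

Calling it twice with the same timestamp should produce two files, ending in -0 and -1, with neither call overwriting the other.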
The lock is acquired by specifying the O_EXLOCK flag. One thing we have to be careful of with this solution is that the lock is just an advisory lock, meaning that the OS doesn't enforce it; we have to explicitly check for it when reading. (O_EXLOCK and O_SHLOCK are BSD extensions, so they are not defined on every platform; Linux, for one, lacks them.) The corresponding code I use for reading comments is something like this:
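try:
    # Probe for the writer's lock without waiting for it.
    fd = os.open(fn, os.O_SHLOCK|os.O_NONBLOCK)
    os.close(fd)
except OSError as err:
    if err.errno == errno.EAGAIN:
        # Still locked by the writer: treat the comment as not there yet.
        return None
    raise
with open(fn, 'r') as f:
    comment = f.read()
return comment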
Note that this does not wait for the file to be unlocked (O_NONBLOCK), and that the lock check and the subsequent open are not atomic. For my purposes this is sufficient, however: given the way the writing code above works, we know that if the file exists and is not locked, the full comment has been written (barring power failures and similar crashes). If the file is locked, we just return None, which we interpret as “this comment doesn't exist (yet).”
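On a system without O_SHLOCK, a rough equivalent of the reading side can be sketched with fcntl.flock. This is a sketch under assumptions rather than what this blog runs: it assumes Python 3, and it assumes the writer holds a flock-style lock (which is how the BSD open(2) manual describes O_EXLOCK). The name read_comment is hypothetical. Unlike the code above, it keeps the shared lock while reading.

```python
import errno
import fcntl


def read_comment(fn):
    """Return the file's contents, or None if a writer still holds its lock.

    Hypothetical helper: takes the shared lock with fcntl.flock after
    opening, instead of passing O_SHLOCK to os.open.
    """
    with open(fn, 'r') as f:
        try:
            # LOCK_NB: fail immediately instead of waiting for the writer.
            fcntl.flock(f.fileno(), fcntl.LOCK_SH | fcntl.LOCK_NB)
        except OSError as err:
            if err.errno in (errno.EAGAIN, errno.EACCES):
                return None  # still being written
            raise
        # Closing the file at the end of the with block releases the lock.
        return f.read()
```

As with the original, a None result just means "not there yet"; the caller can simply try again on the next page view.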
All this code exists to make sure we don't send a comment to a browsing user if it is only partially written to disk. My initial solution was to rely on the atomicity of os.rename, but that has a limitation: if the destination file exists, it is silently overwritten (on POSIX anyway, which is all I'm concerned with supporting). This doesn't work for me, since I don't want to overwrite another comment, and if I check with os.path.exists first, I run into that race condition again. If we do want to atomically overwrite something, we can just write the new contents to a temporary file and os.rename it into place.
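For the overwrite case, the rename approach might look roughly like the sketch below. The helper name atomic_overwrite is made up for illustration, and the fsync call is an extra precaution of mine rather than something the approach strictly requires.

```python
import os
import tempfile


def atomic_overwrite(path, data):
    """Replace path's contents so readers see the old file or the new one.

    Hypothetical helper for illustration, assuming a POSIX filesystem.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live on the same filesystem as the target,
    # otherwise the rename cannot be atomic.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before the rename
        # On POSIX, rename silently replaces an existing destination.
        os.rename(tmp, path)
    except BaseException:
        os.unlink(tmp)  # do not leave the temporary file behind
        raise
```

One thing to keep in mind is that mkstemp creates the file readable and writable only by its owner, so the permissions may need adjusting before the rename if other users have to read the result.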