Lessons learned: no unicode in filenames on my blog.
I had the great idea, a few days ago, to write a short blog post about my trip to Cologne, and to give the file a name with a UTF-8 character in it (the ö, to be exact). Except that I forgot how my blog actually works...
I use blosxom to manage my RSS feed and other parts of the blog; but to integrate it properly in my website, I wrote a whole lot of stuff around that. It's not just blosxom:
wouter@samba:/var/lib/svn/blog/hooks$ grep -v '^#' post-commit | wc -l
20
It takes care of updating the checkout in /var/local/blosxom, and then fixes the timestamps on those files based on the timestamp of the original commit in the subversion repository. It's an ugly amalgam of svnlook date, svnlook history, touch and things, but it works. Sort of. Except if I remove a file, or if the svnlook history thing doesn't find the file itself. Finally, it calls blosxom itself, in the mode in which it creates files on disk rather than trying to do CGI output.
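To make the timestamp step a bit more concrete, here is a minimal sketch of the idea, not the actual hook. The function names and arguments are made up for illustration, and it assumes GNU touch, which accepts a "date time offset" string for -d. It relies on svnlook date printing something like "2003-02-22 17:44:49 -0600 (Sat, 22 Feb 2003)".

```shell
#!/bin/sh
# Hypothetical helper (not taken from the real hook): reduce the output
# of `svnlook date` to the "date time offset" part, which GNU touch -d
# can parse.
svn_date_for_touch() {
    # Strip the human-readable suffix that starts at " (".
    printf '%s\n' "${1%% \(*}"
}

# Hypothetical helper: give one checked-out file the date of a revision.
# $1 = repository path, $2 = revision number, $3 = file in the checkout
fix_timestamp() {
    stamp=$(svn_date_for_touch "$(svnlook date -r "$2" "$1")")
    # Only touch files that svn up really put there; a plain touch on a
    # missing file would create an empty one.
    if [ -f "$3" ]; then
        touch -d "$stamp" "$3"
    fi
}
```

Something like fix_timestamp /var/lib/svn/blog 42 /var/local/blosxom/somepost.txt would then set the date of that file (revision number and filename are placeholders). The check for an existing file is one possible guard against touch creating files that svn up never delivered.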
Apparently, however, subversion handles UTF-8 filenames differently depending on whether you check out from a repository through an http:// or a file:// URL. As a result, there were some issues in the post-commit subversion hook that I wrote, resulting in the svn up part of the post-commit not entirely succeeding. Or some such. Then the touch was done anyway, and after that the svn up didn't work at all anymore. In short, things started to break horribly, resulting in empty posts (because the files were created by touch rather than svn up), the comment thing being confused about filenames (and, as a result, postgres complaining about incorrect UTF-8 encodings), and other, similar ugliness.
So, I just renamed the file, and did a cleanup of the subversion checkout in /var/local/blosxom. I'll just have to cope with the fact that my setup doesn't like unicode, I guess. Or, perhaps, finally switch to ikiwiki some day, which I've been thinking about ever since Joey first blogged about it a few years ago. But that's not urgent...
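In the meantime, a small check could at least warn about such filenames before they cause trouble. This is only a sketch: has_non_ascii is a made-up helper, and it assumes a grep that compares bytes when LC_ALL=C is set.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given name contains any byte
# outside printable ASCII. The two UTF-8 bytes of "ö" count, and so do
# control characters such as tabs.
has_non_ascii() {
    printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}
```

Running has_non_ascii "$name" && echo "rename me: $name" over the names in the checkout would then flag the offending post.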