It all started because I wasn't feeling well and had to stay at home for a few days. But since it can be pretty boring, I decided to sort out my limited knowledge of Hebrew by putting it into a kind of a web book or something like that. Maybe, just maybe, if I really learn Hebrew someday, it will turn into a nice language course for nerds like me. And if I just get bored of it before I actually finish, then I'll at least waste enough time doing it so I won't be bored for a while. It's a win-win idea.
One would think that writing a web book is something that is easily done in XXI century, provided, of course, that you know what to actually write there. I mean, there is Unicode, there is appropriate markup for bi-directional text in HTML, and it seems to be well supported by the mainstream browsers. Just pick up the right tools and write! Or so I thought.
The first issue is to find the right format. Writing directly in HTML isn't the best idea (although it can be done too) because HTML lacks internal structure. You'll have to struggle with chapter numbering, table of contents and splitting the thing into the right number of HTML pages. HTML is the best output format for web publishing, but it's not quite right to actually write in it.
The first format I thought of was Lyx, as its WYSIWYM (what you see is what you mean) idiom gives the most content with the least effort required to format everything. I already used it for my "Endgame: Singularity" Impossible Guide and found it quite nice, especially when there is math involved. I knew there were some problems with Unicode support in TeX/LaTeX, but I thought that such a nice piece of mature software should have all those solved already… Boy, was I wrong!
There is a whole bunch of UTF-8 encodings, using XeTeX or whatever-TeX, but none of them seemed to actually work. After struggling with it for a while, I realized that using TeX in any way cripples the very idea of using Unicode to get rid of all multi-language troubles even before they appear.
OK, so I thought I needed some decent format with native support of Unicode. Something like XML. Of course, raw XML is kind of useless unless I wanted to write my own XSLT sheet, which I didn't. So what I needed was an XML-based format that is designed for writing structured documents… Looks like Docbook is the way to go! It is designed for technical documentation primarily, but nothing stops from using it for anything else, as long as nothing special is required of it, and my requirements were quite simple indeed.
Now the only thing that I needed was a sort of WYSIWYG/WYSIWYM XML editor with Docbook support because I didn't want to write raw XML with Vim or something like that. Given Docbook's popularity, there must be plenty of them, right?
The horror story