Monthly Archives: September 2013

How to write a web book in the XXI century

It all started because I wasn't feeling well and had to stay at home for a few days. But since it can be pretty boring, I decided to sort out my limited knowledge of Hebrew by putting it into a kind of a web book or something like that. Maybe, just maybe, if I really learn Hebrew someday, it will turn into a nice language course for nerds like me. And if I just get bored of it before I actually finish, then I'll at least waste enough time doing it so I won't be bored for a while. It's a win-win idea.

One would think that writing a web book is something that is easily done in the XXI century, provided, of course, that you know what to actually write there. I mean, there is Unicode, there is appropriate markup for bi-directional text in HTML, and it seems to be well supported by the mainstream browsers. Just pick the right tools and write! Or so I thought.

The first issue is to find the right format. Writing directly in HTML isn't the best idea (although it can be done too) because HTML lacks internal structure. You'll have to struggle with chapter numbering, table of contents and splitting the thing into the right number of HTML pages. HTML is the best output format for web publishing, but it's not quite right to actually write in it.

The first format I thought of was LyX, as its WYSIWYM (what you see is what you mean) idiom gives the most content with the least effort required to format everything. I had already used it for my "Endgame: Singularity" Impossible Guide and found it quite nice, especially when there is math involved. I knew there were some problems with Unicode support in TeX/LaTeX, but I thought that such a nice piece of mature software should have all those solved already… Boy, was I wrong!

There is a whole bunch of options for UTF-8 encodings, using XeTeX or whatever-TeX, but none of them seemed to actually work. After struggling with it for a while, I realized that using TeX in any way cripples the very idea of using Unicode to get rid of all multi-language troubles even before they appear.

OK, so I thought I needed some decent format with native support for Unicode. Something like XML. Of course, raw XML is kind of useless unless I wanted to write my own XSLT sheet, which I didn't. So what I needed was an XML-based format that is designed for writing structured documents… Looks like DocBook is the way to go! It is designed primarily for technical documentation, but nothing stops you from using it for anything else, as long as nothing special is required of it, and my requirements were quite simple indeed.

Now the only thing that I needed was a sort of WYSIWYG/WYSIWYM XML editor with DocBook support because I didn't want to write raw XML with Vim or something like that. Given DocBook's popularity, there must be plenty of them, right?
The horror story

Don’t set the clock to match the local time zone!

There are rumors that Russian time zones are going to change yet again. Hopefully we'll get rid of daylight saving time once and for all. This made me think of people who set their clocks to match the new time zone each time they discover that the clock on their PC or phone shows the wrong time. Well, don't do it!

Setting a clock to match the new time zone works fine for mechanical clocks and primitive electronic clocks. So why is it so wrong to do the same thing with a computer or a phone? Because almost any such device runs its clock in UTC. That is, Greenwich time with no daylight corrections whatsoever.

Hey, but I don't see Greenwich time on my clock! I see the right time for my area!

That's right, if you live, say, in Houston, your clock will show (in summer) 8 AM when it's 1 PM UTC in the internal clock of your OS. Why? The answer is time zones.

Before displaying the time, your computer converts it to the local time zone. That's why you have to choose it when you set up your OS. So at any given time your OS knows two things: what the UTC time is right now and how it differs from the local time. This makes all internal time processing incredibly simple: the OS doesn't have to take time zones into account. If it has two times, it always knows which one is later and what the exact difference is: it just has to subtract one time from the other.
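To make this a bit more concrete, here is a minimal Python sketch of the idea. It is only an illustration: the date is made up, and the fixed UTC-5 offset named `CDT` is my stand-in for Houston in summer, not a real time zone database entry.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset standing in for Houston in summer (CDT).
CDT = timezone(timedelta(hours=-5), "CDT")

# The "internal clock": two moments kept in UTC (the date is arbitrary).
now_utc = datetime(2013, 9, 15, 13, 0, tzinfo=timezone.utc)
earlier_utc = datetime(2013, 9, 15, 11, 30, tzinfo=timezone.utc)

# Conversion to the local zone happens only when the time is displayed.
local = now_utc.astimezone(CDT)
print(local.strftime("%H:%M %Z"))  # 08:00 CDT

# Comparing and subtracting needs no time zone logic at all.
print(now_utc > earlier_utc)   # True
print(now_utc - earlier_utc)   # 1:30:00
```

Note that the stored value never changes; only its presentation depends on the zone.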

Now suppose your local authorities decide to change time zones in your area, as they are wont to do in Russia lately. Say, Texas decides that it no longer needs daylight saving time, so 8 AM becomes 7 AM now. But your clock still shows 8 AM, so you just go and set it to 7 AM. You now see the "correct" time, but then all kinds of strange things begin to happen. Mail shows the wrong time, internet forums show the wrong time, files on USB sticks have incorrect time stamps, and sometimes the antivirus gives you vague alerts which not only do you not understand, but you get the feeling that the antivirus itself can't quite figure out what it doesn't like.

So what's wrong with just setting the clock to match the new time?

The reason all this crazy stuff is happening is that you didn't actually change 8 AM to 7 AM. While you might think you did something like this:

8 AM => 7 AM,

you actually did it like this:

8 AM CDT => 7 AM CDT = 6 AM CST

or, actually,

1 PM UTC => 12 PM UTC,

which is the same thing. Now when a mail arrives, your OS knows its time and its time zone, converts it to UTC, and then to your local time zone. Say, I wrote an e-mail to a guy in Texas. I am not too far away from Moscow, so my time is 5 PM MSK, which is 1 PM UTC, which is 8 AM CDT or 7 AM CST. Now that guy has the time set to 6 AM CST, so here's what his OS thinks: "OK, so there's this mail that was written at 17:00 UTC+4, which is not a very convenient way to write it, so it would be much better for me to represent it as 13:00 UTC, but then again, while it's perfect for me, it's not the best way for my user, so I better convert it to the local time zone, which would be, well, let me think… 08:00 CDT (UTC-5). But wait! What's that? That's one hour into the future! Oh, that must be junk mail or some sort of virus! I better warn my user!"
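The same arithmetic can be written down as a small Python sketch. Everything in it is illustrative: the date is arbitrary, `MSK` and `CDT` are fixed offsets I picked to match the example (UTC+4 and UTC-5), and `seconds_in_future` is a made-up helper, not something a real mail client exposes.

```python
from datetime import datetime, timedelta, timezone

# Assumptions: fixed offsets matching the example above.
MSK = timezone(timedelta(hours=4), "MSK")
CDT = timezone(timedelta(hours=-5), "CDT")

# The mail is stamped 17:00 MSK, which is 13:00 UTC.
sent = datetime(2013, 9, 15, 17, 0, tzinfo=MSK)

# What the recipient's internal clock should read when the mail arrives.
true_now_utc = datetime(2013, 9, 15, 13, 0, tzinfo=timezone.utc)

# Moving the wall clock from 8 AM to 7 AM while the zone stays CDT
# drags the internal UTC clock one hour back.
skewed_now_utc = true_now_utc - timedelta(hours=1)

def seconds_in_future(message_time, system_now):
    """Hypothetical helper: how far ahead of the system clock a time stamp is."""
    return (message_time - system_now).total_seconds()

print(sent.astimezone(CDT).strftime("%H:%M %Z"))  # 08:00 CDT
print(seconds_in_future(sent, true_now_utc))      # 0.0
print(seconds_in_future(sent, skewed_now_utc))    # 3600.0
```

With a correct internal clock the mail arrives "now"; with the clock dragged back, the very same time stamp lands an hour in the future.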

OK, what is the right way to do it then?

What you really need to do is not to fix the time, but to fix the time zone. Unfortunately, this is made incredibly hard for some reason. You can't just say, well, let's just set the time zone to UTC+4 and change it to UTC+3 at 3:00 AM on the last Sunday of October. At best, you can just change it to UTC+4 or UTC+3, but then you have to adjust it manually unless your local authorities abolish daylight adjustments altogether. But even this strategy only works on some phones, and doesn't work in Windows 7, for example. It just lets you choose from a huge list of various places.
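Here is a rough Python sketch of what "fixing the time zone" means, using the Texas example again. The two fixed offsets `CDT` and `CST` are my assumptions for illustration; a real OS would use its time zone database instead.

```python
from datetime import datetime, timedelta, timezone

# Assumptions: fixed offsets for the old and the new local rules.
CDT = timezone(timedelta(hours=-5), "CDT")  # old rule: daylight time
CST = timezone(timedelta(hours=-6), "CST")  # new rule: no daylight time

# The internal clock is left alone.
now_utc = datetime(2013, 9, 15, 13, 0, tzinfo=timezone.utc)

before = now_utc.astimezone(CDT)
after = now_utc.astimezone(CST)

print(before.strftime("%H:%M %Z"))  # 08:00 CDT
print(after.strftime("%H:%M %Z"))   # 07:00 CST

# Both are still the very same instant, so nothing ends up "in the future".
print(before == after == now_utc)   # True
```

The displayed time changes from 8 AM to 7 AM, which is exactly what you wanted, while the UTC clock stays correct.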

So what can be done if the OS on my PC or in my phone doesn't let me set the right zone?

There are three ways.

The first way is to update the OS, hoping that the developers have already incorporated the new time zone rules. This has the nice effect of the OS always showing the correct time, even for dates far in the past, because it actually remembers what all those rules were like long ago. This is actually the best way, and it can be as simple as just running Windows Update, or as hard as having to upload new firmware into your phone. In the worst case, you could be using an OS that is so outdated that no updates are available.

Here comes the second way. A geeky way. If there are no updates, you can make one yourself. It can involve registry edits or some really bizarre tricks like hex editing your phone's firmware. Someone could have done it for you already, so you may want to look for a ready recipe on the Internet. But there's an even easier way.

The third way is to just find a time zone identical to what you need. You need to switch to UTC+4, but your old OS doesn't know that Moscow is there already? Just choose Yerevan or whatever! Just don't forget to turn off daylight saving or you'll be in for a huge surprise: Yerevan's rules for daylight adjustments might not be exactly what you need. The disadvantage is that times in the past will probably be wrong because you have now applied the whole history of that foreign place to your system's time zone rules. But at least the present will be present.
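To see why the past goes wrong with a look-alike zone, here is a toy Python sketch. The two rule tables, `HOME` and `LOOKALIKE`, are entirely hypothetical and hugely simplified (one offset change, no daylight rules); real time zone databases are far messier.

```python
from bisect import bisect_right
from datetime import datetime, timedelta, timezone

UTC = timezone.utc

# Hypothetical rule tables: sorted (effective_from_utc, offset_hours) pairs.
HOME = [
    (datetime(2000, 1, 1, tzinfo=UTC), 3),   # used to be UTC+3
    (datetime(2011, 3, 27, tzinfo=UTC), 4),  # later moved to UTC+4
]
LOOKALIKE = [
    (datetime(2000, 1, 1, tzinfo=UTC), 4),   # has been UTC+4 all along
]

def local_time(rules, instant_utc):
    """Apply the rule that was in effect at the given UTC instant."""
    starts = [start for start, _ in rules]
    index = bisect_right(starts, instant_utc) - 1
    if index < 0:
        raise ValueError("instant predates the first rule")
    offset = timezone(timedelta(hours=rules[index][1]))
    return instant_utc.astimezone(offset)

present = datetime(2013, 9, 15, 13, 0, tzinfo=UTC)
past = datetime(2010, 1, 15, 13, 0, tzinfo=UTC)

# Today both zones agree...
print(local_time(HOME, present).hour, local_time(LOOKALIKE, present).hour)  # 17 17
# ...but an old time stamp is displayed an hour off with the look-alike zone.
print(local_time(HOME, past).hour, local_time(LOOKALIKE, past).hour)        # 16 17
```

So the borrowed zone is good enough for today's clock, but old files and mails will show the other place's history.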