Tag Archives: it

How to write a web book in the XXI century

It all started because I wasn't feeling well and had to stay at home for a few days. Since that can be pretty boring, I decided to sort out my limited knowledge of Hebrew by putting it into a kind of web book or something like that. Maybe, just maybe, if I really learn Hebrew someday, it will turn into a nice language course for nerds like me. And if I get bored of it before I actually finish, I'll at least have wasted enough time on it not to be bored for a while. It's a win-win idea.

One would think that writing a web book is something that is easily done in the XXI century, provided, of course, that you know what to actually write there. I mean, there is Unicode, there is appropriate markup for bi-directional text in HTML, and it seems to be well supported by the mainstream browsers. Just pick the right tools and write! Or so I thought.

The first issue is to find the right format. Writing directly in HTML isn't the best idea (although it can be done too) because HTML lacks internal structure. You'll have to struggle with chapter numbering, table of contents and splitting the thing into the right number of HTML pages. HTML is the best output format for web publishing, but it's not quite right to actually write in it.

The first format I thought of was LyX, as its WYSIWYM (what you see is what you mean) idiom gives the most content with the least effort spent on formatting. I had already used it for my "Endgame: Singularity" Impossible Guide and found it quite nice, especially when there is math involved. I knew there were some problems with Unicode support in TeX/LaTeX, but I thought that such a nice piece of mature software should have solved all of them already… Boy, was I wrong!

There is a whole bunch of UTF-8 encoding options, for XeTeX or whatever-TeX, but none of them seemed to actually work. After struggling with it for a while, I realized that using TeX in any way cripples the very idea of using Unicode to get rid of all multi-language troubles even before they appear.

OK, so I thought I needed some decent format with native support for Unicode. Something like XML. Of course, raw XML is kind of useless unless I wanted to write my own XSLT sheet, which I didn't. So what I needed was an XML-based format that is designed for writing structured documents… Looks like DocBook is the way to go! It is designed primarily for technical documentation, but nothing stops you from using it for anything else, as long as nothing special is required of it, and my requirements were quite simple indeed.

Now the only thing that I needed was a sort of WYSIWYG/WYSIWYM XML editor with DocBook support, because I didn't want to write raw XML with Vim or something like that. Given DocBook's popularity, there must be plenty of them, right?
The horror story

Don’t set the clock to match the local time zone!

There are rumors that Russian time zones are going to change yet again. Hopefully we'll get rid of daylight saving time once and for all. This made me think of people who set their clocks to match the new time zone each time they discover that the clock on their PC or phone shows the wrong time. Well, don't do it!

Setting a clock to match the new time zone works fine for mechanical clocks and primitive electronic clocks. So why is it so wrong to do the same thing with a computer or a phone? Because almost any such device runs its clock in UTC. That is, Greenwich time with no daylight corrections whatsoever.

Hey, but I don't see Greenwich time on my clock! I see the right time for my area!

That's right, if you live, say, in Houston, your clock will show (in summer) 8 AM when it's 1 PM UTC in the internal clock of your OS. Why? The answer is time zones.

Before displaying the time, your computer converts it to the local time zone. That's why you have to choose it when you set up your OS. So at any given time your OS knows two things: what the UTC time is right now and how it differs from the local time. This makes all internal time processing incredibly simple: the OS doesn't have to take time zones into account. If it has two times, it always knows which one is later and what the exact difference is: it just has to subtract one time from the other.
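To make that concrete, here is a minimal Python sketch of the idea. It is not how any particular OS implements its clock: the date is made up, and the fixed UTC-5 offset named `CDT` is a stand-in for the time zone database a real system would consult.

```python
from datetime import datetime, timedelta, timezone

# Assumption for illustration: a fixed offset instead of a real time zone database.
UTC = timezone.utc
CDT = timezone(timedelta(hours=-5), "CDT")  # Houston in summer

# The internal clock is kept in UTC (hypothetical date).
internal_now = datetime(2013, 7, 1, 13, 0, tzinfo=UTC)

# Conversion to local time happens only when the time is displayed.
shown = internal_now.astimezone(CDT)

# Comparing two moments is plain subtraction; no time zone logic is involved.
earlier = datetime(2013, 7, 1, 11, 30, tzinfo=UTC)
elapsed = internal_now - earlier
```

Here `shown` is 8 AM on the same day and `elapsed` is an hour and a half, whatever zone the user happens to live in.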

Now suppose your local authorities decide to change time zones in your area, as they are wont to do in Russia lately. Say, Texas decides that it no longer needs daylight saving time, so 8 AM becomes 7 AM now. But your clock still shows 8 AM, so you just go and set it to 7 AM. You now see the "correct" time, but then all kinds of strange things begin to happen. Mail shows the wrong time, internet forums show the wrong time, files on USB sticks have incorrect time stamps, and sometimes the antivirus gives you vague alerts that you don't understand, and you get the feeling that the antivirus itself can't quite figure out what it doesn't like.

So what's wrong with just setting the clock to match the new time?

The reason all this crazy stuff is happening is that you didn't actually change 8 AM to 7 AM. While you might think you did something like this:

8 AM => 7 AM,

you actually did it like this:

8 AM CDT => 7 AM CDT = 6 AM CST

or, actually,

1 PM UTC => 12 PM UTC,

which is the same thing. Now when a mail arrives, your OS knows its time and its time zone, converts it to UTC, and then to your local time zone. Say I wrote an e-mail to a guy in Texas. I am not too far away from Moscow, so my time is 5 PM MSK, which is 1 PM UTC, which is 8 AM CDT or 7 AM CST. Now that guy has his clock set to 6 AM CST (his OS still treats it as 7 AM CDT), so here's what his OS thinks: "OK, so there's this mail that was written at 17:00 UTC+4, which is not a very convenient way to write it, so it would be much better for me to represent it as 13:00 UTC, but then again, while it's perfect for me, it's not the best way for my user, so I'd better convert it to the local time zone, which would be, well, let me think… 08:00 CDT (UTC-5). But wait! What's that? That's one hour into the future! Oh, that must be junk mail or some sort of virus! I'd better warn my user!"
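The arithmetic of that scenario can be replayed in a few lines of Python. This is only a sketch under assumptions: the date is invented, and `MSK`, `CDT` and `CST` are fixed offsets (UTC+4, UTC-5, UTC-6) rather than real time zone rules.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets chosen to match the example; the date itself is hypothetical.
UTC = timezone.utc
MSK = timezone(timedelta(hours=4), "MSK")
CDT = timezone(timedelta(hours=-5), "CDT")
CST = timezone(timedelta(hours=-6), "CST")

true_now = datetime(2013, 7, 1, 13, 0, tzinfo=UTC)

# Wrong fix: the user turns the wall clock back to 7 AM but the zone stays CDT,
# so the internal UTC clock is dragged back by an hour as well.
wall_clock = datetime(2013, 7, 1, 7, 0, tzinfo=CDT)
internal_now = wall_clock.astimezone(UTC)

# A mail written at 5 PM Moscow time, as the receiving OS sees it.
mail_sent = datetime(2013, 7, 1, 17, 0, tzinfo=MSK)
mail_local = mail_sent.astimezone(CDT)
lead = mail_sent - internal_now  # how far "in the future" the mail appears

# Right fix: leave the internal clock alone and change only the zone.
fixed_display = true_now.astimezone(CST)
```

With the wrong fix `internal_now` is 12:00 UTC and the mail appears to come from one hour in the future; with the right fix the display reads 7 AM while the internal clock stays at 13:00 UTC.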

OK, what is the right way to do it then?

What you really need to do is not to fix the time, but to fix the time zone. Unfortunately, this is made incredibly hard for some reason. You can't just say, well, let's set the time zone to UTC+4 and change it to UTC+3 at 03:00 AM on the last Sunday of October. At best, you can change it to UTC+4 or UTC+3, but then you have to adjust it manually unless your local authorities abolish daylight adjustments altogether. But even this strategy only works on some phones, and doesn't work in Windows 7, for example. It just lets you choose from a huge list of places.
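A rule like "the last Sunday of October" is easy to express in code, which makes the lack of such a setting all the more annoying. The helper below is a hypothetical illustration, not part of any OS interface.

```python
from datetime import date, timedelta

def last_sunday_of_october(year):
    """Return the date of the last Sunday in October of the given year."""
    d = date(year, 10, 31)
    # date.weekday() counts Monday as 0 and Sunday as 6,
    # so (weekday + 1) % 7 is the number of days since the latest Sunday.
    return d - timedelta(days=(d.weekday() + 1) % 7)
```

A full rule would pair this date with the 03:00 switch time and the two offsets; the sketch covers only the date part.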

So what can be done if the OS on my PC or in my phone doesn't let me set the right zone?

There are three ways.

The first way is to update the OS, hoping that the developers have already incorporated the new time zone rules. This has the nice effect of the OS always showing the correct time, even for dates far in the past, because it actually remembers what all those rules were like long ago. This is actually the best way, and it can be as simple as just running Windows Update, or as hard as having to upload new firmware into your phone. In the worst case, you could be using an OS that is so outdated that no updates are available.

Here comes the second way. A geeky way. If there are no updates, you can make one yourself. It can involve registry edits or some really bizarre tricks like hex editing your phone's firmware. Someone could have done it for you already, so you may want to look for a ready recipe on the Internet. But there's an even easier way.

The third way is to just find a time zone identical to what you need. You need to switch to UTC+4, but your old OS doesn't know that Moscow is there already? Just choose Yerevan or whatever! Just don't forget to turn off daylight saving or you'll be in for a huge surprise: Yerevan's rules for daylight adjustments might not be exactly what you need. The disadvantage is that times in the past will probably be wrong, because you have now applied the whole history of that foreign place to your system's time zone rules. But at least the present will be present.

Reverse stereo – hardware solution

It all started when I decided to replay the Thief series games. Having finished Gold and Metal Age, I went on to Deadly Shadows, but then suddenly noticed one terrible thing: there is no option to reverse stereo in Deadly Shadows. Well, technically, there are three options in the config file, but none of them works. Well, it's the third game in the series, it must be worse than the previous two, right? At least they didn't make it as bad as Doom 3 or Fallout 3.

So I thought, hey, it's just a matter of swapping stereo channels, right? There ought to be something like that in the driver settings or in the Windows 7 control panel, or the registry, or somewhere! As it turned out, Windows used some built-in drivers for my on-board Realtek HD Audio card, and Windows 7 itself doesn't have a "reverse stereo" option. Well then, I must install the Realtek drivers and there I'll find what I need. Or so I thought.

After installing the Realtek drivers, I ended up with some crappy Manager in my tray, which had ridiculously many settings, including the type of material the walls in my room are made of. But no "reverse stereo" option, of course. I should've guessed. There is no way a hardware manufacturer will provide anything useful in their software except maybe drivers. If it were Linux, I know I could've done it with some ALSA settings, but I didn't want to install Linux just to play a single Windows game, even though it's supposed to run there with no problems.

Now stupid people (or people assuming that I'm stupid) would say, "Hey, how about just moving your speakers?" Duh! If it was so easy, I would've done it when I was placing them there in the first place. The reason I couldn't do it was that the AC cable is too short to reach the place where the right speaker is, so I just had to place it on the left side. And no, using an AC extension cord is not an option: I don't want to increase the number of 220V cables and plugs around my PC more than I have to.
The solution

Continuing the battle with HTML5 local storage

So, on closer inspection, HTML5 local storage turned out to be the webappsstore.sqlite file in the Firefox profile. Fortunately, they didn't reinvent the wheel and used a standard SQLite database, for which there are plenty of applications, including even a plugin for FF, although I don't quite see the point of a plugin that works with local files essentially independently of the browser. But now you can run, for example, a query like this:

SELECT * FROM webappsstore2 WHERE key = 'placeholders';

And get roughly the following:

scope key value secure owner
moc.lanruojevil.oaya.:http:80 placeholders {"http://s42.radikal.ru/i097/1302/04/b553dec9adc2t.jpg":true} 0

I never figured out what the last two columns mean, but I didn't dig very hard. The key and value columns are clear. The first column, though, is some kind of circus. The http:80 part is clear, and moc.lanruojevil.oaya is, obviously, ayao.livejournal.com. Why it's done this way is not quite clear, but there seem to be hints that it simplifies working with strings. For example, you can delete everything from livejournal:

DELETE FROM webappsstore2 WHERE scope LIKE 'moc.lanruojevil.%';
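A rough Python sketch of why the reversed host name helps: with the host reversed, "every subdomain of a site" becomes a simple prefix match. The `scope_for` helper and the column types are my guesses based on the row shown above, not Firefox's actual code or schema, and the table here lives in memory rather than in a real profile.

```python
import sqlite3

def scope_for(host, scheme="http", port=80):
    """Hypothetical helper: reversed host, a dot, then ':scheme:port'."""
    return f"{host[::-1]}.:{scheme}:{port}"

db = sqlite3.connect(":memory:")
# Column names come from the query result above; the types are assumptions.
db.execute(
    "CREATE TABLE webappsstore2 "
    "(scope TEXT, key TEXT, value TEXT, secure INTEGER, owner TEXT)"
)
rows = [
    (scope_for("ayao.livejournal.com"), "placeholders", "{}", 0, ""),
    (scope_for("example.org"), "placeholders", "{}", 0, ""),
]
db.executemany("INSERT INTO webappsstore2 VALUES (?, ?, ?, ?, ?)", rows)

# One prefix pattern covers livejournal.com and all of its subdomains.
db.execute("DELETE FROM webappsstore2 WHERE scope LIKE 'moc.lanruojevil.%'")
remaining = [row[0] for row in db.execute("SELECT scope FROM webappsstore2")]
```

After the DELETE only the example.org row is left. Try this on a copy of webappsstore.sqlite, not on the live file of a running browser.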

Now I understand how Jews feel when they run a command-line application that prints its messages in Hebrew. Left to right, naturally, because the command line doesn't know any other direction.

Image placeholders, or fuck HTML5 storage

So, given: LJ, a friends page, and in it pictures so large that they don't fit on the monitor. Result: an unreadable friends page. Find: how to fight this?

Solution one, the diplomatic one. Convince all your friends to hide large pictures behind a cut. Not the most ideal option, but it may work for some. Not my way, since I believe that everyone should post whatever they want, however they want, in their own blog.

Solution two, the technical one. Turn on image placeholders in the LJ settings. Now all pictures have become small buttons. Need the picture? Click the button. Don't need it? Don't click. Victory? Not so fast. This mechanism was made by either a Satanist, or a sadist, or just a normal person. Either way, once you load a picture, you will never get rid of it. What to do?

Consideration one. If a picture isn't needed, it shouldn't be loaded. How do you tell whether it's needed if its size and content aren't known in advance? Very simply, as it turns out. The placeholder is not only a button but also a link. If you open it in a separate tab, the original picture simply opens there, while the placeholder stays in place and isn't replaced by the picture.

It would seem the problem is solved. But a question arises: what if you clicked a placeholder by accident? There must be some way to hide the picture again, right? Observations: clearing the livejournal.com cookies doesn't help. Switching browsers does help, so this information isn't stored on the server. Not in cookies, not on the server, so where the hell is it?

Allow me to introduce a new HTML5 technology: local storage! Now web developers have one more way to inflict suffering on users. I naively thought that all such technologies were born of the chaos at the dawn of the web… No way, the principle is still the same: do harm whenever you can. The browser developers are, naturally, in cahoots with the terrorists: there is no "delete the damn local storage" function in Firefox. The plugin developers seem to be in on it too: there are plugins for viewing the contents of local storage (and the cursed placeholders are very clearly visible there), but they can't delete anything. One did turn up after all, though: ClearConsole. True, it deletes everything without exception; deleting only the placeholders, or at least only LJ's local storage, is impossible, these aren't cookies after all. Well, never mind, this local storage is of no use anyway, so you can safely delete it all. A victory of reason over intellect!

Yandex.Maps and NoScript

I got sick of fiddling with Yandex.Maps. Panoramas don't work, no matter what I do. They work in IE but not in FF. I started disabling plugins, and it turned out NoScript was the culprit. I switched to another profile, and there it works for some reason, even though NoScript is installed there too. And then it dawned on me that the NoScript version there is ancient. I started looking for the version where this began. It turned out to be 2.5.8. I rolled back to 2.5.7, but that didn't feel right. I decided to write a bug report, but the instructions say you have to provide all the messages from the JavaScript error console. I started digging there, and on the "Messages" tab NoScript was complaining about a script with an incorrect type included by Yandex…

In short, it turned out that some script at Yandex has the type application/octet-stream, which is, to put it mildly, strange for a script. NoScript took a dislike to it and blocked it (even though Yandex is on the whitelist). Fortunately, there is a separate whitelist for such perversions in about:config, named noscript.inclusionTypeChecking.exceptions: I added yandex.st (separated by a space) to the end, and everything started working in the latest version of NoScript as well.

Google Drive file synchronization algorithm

I’ve just figured out another huge disadvantage of Google Drive. If a file is changed, Google Drive re-uploads the whole file. Yes, that means that if you change 1 byte inside a 4 GB file, Google Drive will re-upload the whole 4 GB! I didn’t bother to check this when I first tried it because I didn’t even think about that possibility! Hasn’t Google heard of rsync? That’s just ridiculous!
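For comparison, here is a toy Python sketch of the block-level idea. It is deliberately simpler than what rsync really does (rsync uses rolling checksums and also copes with inserted or removed bytes); this version only compares fixed-position blocks, and the names and the tiny block size are made up for the demo.

```python
import hashlib

BLOCK = 4  # demo-sized block; a real tool would use kilobytes or more

def block_hashes(data, size=BLOCK):
    """Hash every fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    ]

def changed_blocks(old, new, size=BLOCK):
    """Indices of blocks in `new` that differ from the same block in `old`."""
    old_h = block_hashes(old, size)
    new_h = block_hashes(new, size)
    return [i for i, h in enumerate(new_h) if i >= len(old_h) or old_h[i] != h]

old = b"aaaabbbbccccdddd"
new = b"aaaabbXbccccdddd"  # a single byte changed inside the second block
to_upload = changed_blocks(old, new)
```

Only block 1 ends up in `to_upload`, so one block would travel over the wire instead of the whole file.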

So using large TrueCrypt volumes in Google Drive is probably a bad idea. Its main use should be to store some random non-sensitive data like crappy photos that nobody wants to see anyway. For anything sensitive, something like Wuala should probably be used instead.

Google Drive

Google Drive is probably one of the most awesome Google services, and yet it demonstrates how stupid Google has become.

One important thing about cloud storage is encryption. With Google Drive you get no encryption, except for secure transmission. Well, that means that we don’t have to worry about man-in-the-middle attacks, but that’s it. If some asshole at Google wants to look at your files, he can do that. If Google wants to mine your data for anything, they can do that. In fact, they can do about anything with your files as long as nobody catches them.

Another thing is not that awful, but much, much more stupid. In fact, this is about the silliest thing I’ve ever encountered in widely used software. I’ve already mentioned numerous ergonomic failures of Google, such as Instant Preview or whatever that’s called, or heavy (ab)use of JavaScript and Ajax. But this one beats everything in the way of stupidity, although thankfully it’s not even remotely as annoying as those previews. Now listen carefully.

Say, there is a program that asks you to select a folder to store something. Doesn’t sound like a ridiculously difficult feature to implement, does it? Well, for Google it apparently is! Google Drive asks you to select a folder to store the contents of your Drive. Suppose you selected the “d:\data\google” folder. Does that mean Google Drive will store its data in the “d:\data\google” folder? No, of course not! That would be distasteful at least! Google Drive is smart enough to guess that you actually want to store your data in the “d:\data\google\Google Drive” folder. What? That’s not what you meant? You’re kidding!

Can you change your Google Drive location to a folder you actually want your data to be in? The answer is that you can. It is even pretty easy, although not even remotely obvious. Here’s the procedure:

1. Choose any random location for your Google Drive.
2. Exit Google Drive.
3. Move/rename the Google Drive folder to whatever location you want.
4. Start Google Drive.
5. Click the Google Drive icon and select the error message about the folder not being found.
6. Click “Locate folder” in the opened window and select your new folder.
7. Profit!

Now about encryption. The easiest way to deal with it is to use TrueCrypt (now dead, but there are alternatives) to create an encrypted volume in the Google Drive folder, then sync it. There are some issues with this approach, though:

1. You’ll have to disable timestamp preservation for volumes, otherwise Google Drive won’t sync the modified volume. This can be done either by disabling the feature globally or by using “/m ts” when mounting from the command line.
2. There’s no option to save the password. TrueCrypt is meant to secure locally stored data so it kind of makes sense. But if you don’t care about anyone with physical access to your computer accessing your data, you can just create a BAT file to mount the volume from the command line, specifying the password there. Protecting the BAT file with NTFS permissions is a nice idea anyway, though.
3. The biggest issue is that the TrueCrypt volume is fixed in size. You can use a sparsely allocated NTFS file to store it, but Google Drive won’t know anything about it and will try to upload the whole volume anyway, including unallocated pieces. That means that if you have a 4 GB volume with 100 MB of data in it, it will still occupy the whole 4 GB in Google Drive. There’s no elegant way around it. One workaround is to choose the size of the volume sparingly and increase it as needed, which involves creating a new, bigger volume and copying your files there. In this case you should only store really sensitive data there. Another workaround is to create a volume as big as possible and to store everything there, even the files you don’t really need to encrypt. In this case you won’t be able to share those files with anyone or to access them through the web interface.

Right now, something like Wuala is much easier to use if you want an encrypted drive. On the other hand, TrueCrypt is much more flexible and Google Drive is way cheaper. A nice idea is to use a TrueCrypt volume stored on Google Drive and sync some folders to Wuala to provide additional backup and convenient access.

One more thought about how stupid Google is: with Google Drive you can create folders for your Google Docs. Why is that stupid? Well, think about it once more: it took a few years of development to finally introduce a way to group Google Docs in folders, which should have been possible from the very beginning! Google definitely seems to have some grudge against folders. After all, they still haven’t implemented them in GMail, nor do they intend to. Instead we’re stuck with these idiotic “labels” which force IMAP users to download all mail twice: once from the “All Mail” folder, and once from the “Inbox” folder. And I haven’t even mentioned additional labels, if you have them.

So what are the hardest tasks for Google to implement? Apparently these:

1. Introduce folders anywhere. It was hard in Google Docs, it is impossible in GMail.
2. Make the Google Earth map always have north at the top. I have no idea why Google Maps behaves like a regular map; perhaps some browser limitations prevented them from introducing such an important usability failure. But that is compensated by the lack of many simple and useful tools there, like a ruler.
3. Introduce a way to disable Instant Previews. That feature is certainly useless enough to be forced on everyone.
4. Provide a way to attach regular files to e-mail. By “regular files” here I mean regular files, not “files that are considered safe to send by Google’s twisted rules that have nothing to do with real security”.

More? Probably. I don’t even want to know.

QObject auto-disconnect in destructor

It is pretty nice that QObject::~QObject() automatically disconnects everything connected to the object's slots. However, it would be pretty unwise to always rely on this useful feature. Here's one good example of how it can screw up seemingly harmless code.

Suppose you have some kind of networking class, say, MyProtocol. This class is a descendant of QObject that incorporates several fields, including a custom MyStream object. MyProtocol is also connected to the disconnected() signal of the socket used for network communication. Now MyStream has a destructor that sends some sort of "goodbye message" to the other end using the same socket. When this happens, the socket could realize that the connection is broken and emit disconnected(), which would call the appropriate slot in the half-destructed MyProtocol.

This could be a source of rather subtle bugs. Since the MyStream destructor is called from the MyProtocol destructor, it can’t be guaranteed that the MyProtocol object is still usable. And ~MyStream() doesn’t use it directly! Instead, it works with the socket, which in turn can emit some signals, and some slots will get called, some of them possibly in the half-destructed MyProtocol!

How to deal with this? Easy. Disconnect all the signals as soon as you don’t need them any more. Writing a custom destructor for MyProtocol that does it would work too. Too bad Qt doesn’t provide a way to disconnect every signal connected to a particular object, only the other way around (all slots connected to the object's signals). The hardest part of this kind of bug is figuring out what happens, as the program could break at any point later due to memory corruption.
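The ordering problem can be imitated without Qt at all. The Python sketch below is only an analogy: `Signal` is a homemade stand-in for Qt's signal/slot mechanism, `close()` plays the role of the C++ destructors, and the `disconnect_first` flag and the counter are invented for the demo. Python won't corrupt memory here; the counter merely records calls that in C++ would land in a half-destructed object.

```python
class Signal:
    """Bare-bones stand-in for a Qt signal: a list of callables."""
    def __init__(self):
        self._slots = []
    def connect(self, slot):
        self._slots.append(slot)
    def disconnect(self, slot):
        self._slots.remove(slot)
    def emit(self):
        for slot in list(self._slots):
            slot()

class Socket:
    def __init__(self):
        self.disconnected = Signal()
    def write(self, data):
        # Pretend the peer is gone: the write reveals the broken connection.
        self.disconnected.emit()

class MyStream:
    def __init__(self, socket):
        self.socket = socket
    def close(self):
        # Plays the role of ~MyStream(): say goodbye over the shared socket.
        self.socket.write(b"goodbye")

class MyProtocol:
    def __init__(self, socket, disconnect_first):
        self.socket = socket
        self.stream = MyStream(socket)
        self.disconnect_first = disconnect_first
        self.tearing_down = False
        self.late_slot_calls = 0
        socket.disconnected.connect(self.on_disconnected)
    def on_disconnected(self):
        if self.tearing_down:
            self.late_slot_calls += 1  # slot ran during teardown
    def close(self):
        # Plays the role of ~MyProtocol() followed by member destruction.
        if self.disconnect_first:
            self.socket.disconnected.disconnect(self.on_disconnected)
        self.tearing_down = True
        self.stream.close()

unsafe = MyProtocol(Socket(), disconnect_first=False)
unsafe.close()
safe = MyProtocol(Socket(), disconnect_first=True)
safe.close()
```

Without the early disconnect the slot fires once in the middle of teardown; with it, the slot is never reached.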