Monthly Archives: April 2006

More about web development: do not try to be cool

Just one more little thing about web development: the less you try to be cool, the better the result you get.

Take an example. I am looking now at the Doxygen documentation. It has some examples, marked with \code…\endcode. These commands are intended to produce beautifully marked up code, with syntax highlighting, a fixed-width font and so on. And indeed it does a very nice job, but for some reason the HTML output of Doxygen sets “font-family: Fixed, monospace; font-size: 95%;” for it. I do not understand this “font-size: 95%” even a little bit! What the hell is it doing here? Font sizes are a very unstable thing, they tend to differ on almost any platform, and a difference of 5% is unlikely to make things significantly better. In fact, it will more often make them worse, triggering some kind of badly implemented font scaling mechanism.

In my browser it looks kind of crappy, but it is not the fault of those 5%. Actually, these 5% do not seem to do anything in my browser on my platform, and I am quite happy with that. The problem is caused by that “Fixed” font. For some reason it does not look good here. Removing this “Fixed” and leaving just “monospace” (note, a standard CSS font family!) makes things look just fine.
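
For illustration, here is roughly what that rule boils down to, next to the version I would leave instead. The selector name is made up for the example; the declarations are the ones quoted above:

    /* Roughly what the generated stylesheet does; the selector here is only
       illustrative, the declarations are the ones quoted above. */
    .code-fragment {
        font-family: Fixed, monospace;
        font-size: 95%;
    }

    /* What I would leave instead: just the standard generic family,
       and no font-size fiddling at all. */
    .code-fragment {
        font-family: monospace;
    }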

And what exactly is the reason for this “Fixed” to be there? It is code, so it should be displayed in a fixed-width font, yes. So “monospace” is necessary. But why the hell “Fixed”?! Not even the “standard” “Courier New” or something. Probably because it was looking very cool on that guy’s platform. So he tried to make things look cool for everyone and inevitably failed.

The Web is not the place for excessive artwork. If you are not sure whether you need something or not, better leave it out. This may make things look not-so-cool on just one platform, but it will greatly decrease the probability of them looking awful on a lot of other platforms.

Avoiding adhocity

I am thinking about how to avoid adhocity in code. By “adhoc” I mean something that should not be there in the first place but was added later for some reason: to fix a bug, to quickly add a feature or the like.

One way to deal with adhocity is simply to bear with it. And sometimes it works surprisingly well, especially if you never look at that place in the code again. What is wrong here is that you never actually know when and where you will look, so it is a kind of lottery – you may find yourself lucky later, or you may get stuck in your own adhocs when you finally need to make some serious adjustments to the code.

I can think of only two other ways. One is to avoid adhocity altogether, the other is to fix it later. The first way definitely sounds better, but it will not help if the adhocity is already there. So if your code is already adhocish, well, you are stuck. You may try to fix it by refactoring the code, but that may take a lot of time, which will be a waste if you never have to deal with that code again. Or you may choose the risky path and live with it, but then you are risking ending up with a mess.

To avoid this choice, it is better to go the first way and avoid adhocity altogether. What is left is to understand how. And here I have got some kind of philosophical idea. The idea is to believe that every problem has a proper solution (well, at least one) and that this solution is always better in the long term than any kind of adhoc. I am sure I am not the first one who came up with this idea, but I have not tried to live by it yet, so I can say nothing about whether it really works or not. What I mean is that if you are making a change to your software that looks like it does not fit in well, even slightly, you had better stop and carefully think about it. The goal is to find a way to implement the same thing in the most elegant manner possible, even if that means significantly more coding.

For example, let us suppose we have some kind of function that processes one byte. It was implemented using some kind of table with 256 entries. Since one byte can never (and never will) hold more than 256 values, the constant “256” was used literally in many places. Now suppose that we have to make this function handle a special case when the input is empty. Now it accepts “int c” instead of “char c”, with the value c == -1 meaning “empty input”. It is the 257th case in our function, which we did not think of. An adhoc solution would be adding a lot of “if”s for empty input. Or globally replacing 256 with 257, adding one more entry to the beginning of the table and replacing “c” with “c + 1” in the function. A more proper solution (see the sketch after this list) would be:
* to define a constant like CASES_COUNT = 257 and replace those 256s with this constant;
* to define a mapping from the input to the index in the table. Impossible input should be mapped to something like “-1”, so we will know right away that something is wrong (for example, if we get -5 or 257 on input).
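
A minimal sketch of what I mean, in plain C; the names here are made up for the example and are not taken from any real code:

    #include <stdio.h>

    #define EMPTY_INPUT  (-1)   /* special "empty input" value passed by the caller */
    #define CASES_COUNT  257    /* 256 byte values plus the empty-input case */

    static int table[CASES_COUNT];  /* one entry per case, including "empty input" */

    /* Map the input to an index in the table. Anything that is neither a byte
       value nor EMPTY_INPUT maps to -1, so we know right away something is wrong. */
    static int index_for(int c)
    {
        if (c == EMPTY_INPUT)
            return CASES_COUNT - 1;   /* the extra, 257th entry */
        if (c >= 0 && c <= 255)
            return c;
        return -1;                    /* impossible input, e.g. -5 or 257 */
    }

    int process(int c)
    {
        int i = index_for(c);
        if (i < 0) {
            fprintf(stderr, "impossible input: %d\n", c);
            return -1;
        }
        return table[i];
    }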

Of course, this example is too simple. In this case I would probably go the right way anyway. But the whole point of all this blabbing is that it is always better to do it right, even if it seems like overkill. I shall try it in QuaWeb development and see how it works. I am too tired of all these adhocs in my software: they look like nothing at the early stages, but then one day I find that my nice application has turned into an adhoc monster where it is harder and harder to change anything.

The most important things about Web technologies

One of the most popular areas in software development nowadays is undoubtedly the Web. And at the same time it is the area with the most incompetence. Now I am going to say a few things that every Web developer must know. No excuses!

First thing. There is The World Wide Web Consortium.

Second thing. I do not know what the difference between a URI and a URL is, and I think it is not worth knowing anyway, but what is worth knowing about them is that a URL is not a “network path”. And the other thing is that “cool URIs don't change”. The most important part of all that is the very concept of a resource: a resource is something with fixed semantics that has a globally unique identifier (a URI), while a file path is just a way to identify a bunch of bytes on a particular piece of hardware. They may look similar and there may be a mapping between them, but they are conceptually different things. Using something that is not related to the resource concept as part of a URI is just like writing “please deliver this to the big blue house standing next to the crossroads at the east part of the town” instead of the street address on a letter. What will you do when the house is repainted green? Or when there are more crossroads in the east part?

Third thing. The “L” in HTML stands for “language”. And this is true for XML and XHTML too. Basically this means that you can make lexical, syntactic and semantic errors in an XHTML page just as well as in C++, for example. An error is a bad thing. And the fact that a page with an error may look nice in your Internet Explorer, while even a small error in C++ fails the compilation, does not make the error any better. Only worse, actually, because it makes the error harder to spot and fix. The same goes even if your page looks fine in Firefox, Opera, Lynx or whatever. Therefore I conclude: every Web developer absolutely must read at least the HTML 4.01 specification. No, he is not going to learn it by heart. He is going to understand the basic concepts (for example, what the hell block and inline elements are) and understand the language better. Oh, and there is no letter “L” in CSS, but that is no excuse.

Fourth thing. Surprisingly enough, technologies do not exist just to be used. They exist to solve particular problems. (X)HTML exists to mark up hypertext, PNG and JPG exist to store graphics data, CSS exists to describe the page appearance, PHP and CGI to generate content dynamically. Therefore, if you find yourself using something just because you can, stop and think first about what you are trying to do, and only then about what you should use to do it. For example, an entire Web page could be made out of PNG files included in an (X)HTML file without a single English word in it. This will surely do the job, but not only will it be harder to do, it will also increase bandwidth usage, scare away search engines, make it impossible to read the page on a text terminal, and Eternity knows what else. This goes not only for Web development, and not even only for development or IT, but it is in Web development that people tend to abuse technologies just too often. Take Macromedia Flash, for example. Surfing a Web site made entirely of it is just like walking in the forest blindfolded and without a map – you know neither where you are nor where you are going.

If somebody just does not know about these things, that is okay as long as he or she wishes to learn. But as for people who know but do not care, I would kill them all right away if I could.

Unix philosophy: reuse! Popt example

One of the concepts of the Unix philosophy is code reuse. Never write anything that is already written. Never reinvent the wheel. If you are not happy with any of the existing wheels, try to modify one of them instead of making your own.

So I was writing a class QuaArg for parsing command-line arguments, knowing that such things as getopt and getopt_long already exist. I was doing that because I needed something more portable – I needed it to work on Windows, for example. And I am not even sure whether getopt_long would work on HP-UX.

But then I realized that I was still doing something wrong. The very idea of reimplementing such a common thing as command-line argument parsing by myself started to seem more and more ridiculous to me. So I stopped and thought a bit. Okay, so I am in need of a portable command-line argument parsing library. Is it hard to implement? No. Plain standard C is more than enough. So, am I the only one who needs it? Probably not. Therefore, there must be something like a portable getopt_long implementation! There is no way it would not exist!

So I started to look for it and quickly found the GnuWin32 project. There I looked for getopt_long, did not find it, but found popt instead. It is a library for command-line option parsing in C. It has a few external dependencies, but they are all portable C libraries too. Everything is available for Windows in that project. And it is of course available on any Unix, too. It is easy to use and has all the features I need. I tried to use it, and it worked fine. Then I tried to compile on Windows – it worked fine too, I only had to put the DLLs in PATH and properly set up the paths to the header files and libraries.
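
To give an idea of how it looks, here is a minimal sketch of a popt-based program; the options themselves are made up for the example and have nothing to do with my application:

    #include <stdio.h>
    #include <popt.h>

    int main(int argc, const char **argv)
    {
        int verbose = 0;
        char *output = NULL;

        /* Option table: long name, short name, type, destination, val, help, arg hint. */
        struct poptOption options[] = {
            { "verbose", 'v', POPT_ARG_NONE,   &verbose, 0, "print more messages",  NULL   },
            { "output",  'o', POPT_ARG_STRING, &output,  0, "write result to FILE", "FILE" },
            POPT_AUTOHELP
            POPT_TABLEEND
        };

        poptContext ctx = poptGetContext(NULL, argc, argv, options, 0);

        /* With val == 0 for every option, this call parses them all in one go. */
        int rc = poptGetNextOpt(ctx);
        if (rc < -1) {
            fprintf(stderr, "%s: %s\n",
                    poptBadOption(ctx, POPT_BADOPTION_NOALIAS), poptStrerror(rc));
            poptFreeContext(ctx);
            return 1;
        }

        const char *arg;
        while ((arg = poptGetArg(ctx)) != NULL)   /* remaining non-option arguments */
            printf("argument: %s\n", arg);

        printf("verbose=%d, output=%s\n", verbose, output ? output : "(none)");
        poptFreeContext(ctx);
        return 0;
    }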

By now I have already implemented half of my application that needs command-line arguments. If not for the Unix philosophy, I would probably still be coding that QuaArg class, finding and fixing a lot of bugs and watching my code grow far beyond two simple functions.

Emacs and Vim

One of the most religious things in the world of Unix operating systems is probably the choice between Emacs and Vim – the two most powerful text editors in the world. I am not going to go into yet another round of “Emacs vs Vim” philosophy; instead I have something to say about Emacs and Vim together.

I have been using Vim for a long time already, and I do not feel like tossing it away. It is very good, if not the best. In fact, I have never seen an editor which lets you type so fast, and I probably never will – well, at least while the keyboard remains the fastest input device. In addition to typing speed, Vim has very good basic editing primitives, syntax highlighting, the ability to compile programs from within the editor, folding, and tags support. Using folding you can easily edit large texts with structure, such as programs and LaTeX sources. Very nice indeed. Syntax highlighting is not even worth mentioning, because every good editor simply must support it; what is worth mentioning is that Vim supports about 400 different languages and dialects. Compiling programs from the editor would be useless by itself, but if the compilation fails, Vim moves you to the first error – very useful. Tags support gives you a quick hypertext-like navigation mechanism – just press Ctrl+] on a function name, help topic or the like – and you are there.

Having said all that, Vim still has some major flaws. The most important of them is the lack of a flexible interactive process integration mechanism. It does not work with debuggers, version control systems, spell checking software and everything else that requires non-batch interaction with external processes. It does support spell checking, actually, but not very well, because it is based on periodic calls to the spell checker rather than interactive communication. As for everything else mentioned, I have seen nothing good for these tasks in Vim.

And here is where Emacs comes in. While Vim is just a powerful text editor and nothing more, Emacs is more like a generic template for some kind of IDE. It has very good version control system and debugger integration bundled right in. As for spell checking, it has a much better engine, though I could not make it spell check UTF-8. Still, that does not change the fact that Emacs has a very powerful interactive process integration mechanism, which Vim lacks. As for version control and debugging – it just rocks: setting breakpoints from the editor, tracing execution within it, syntax highlighting for interactive debugger output, basic version control commands bound to convenient keys, the ability to view arbitrary file revisions from the editor, and so on.

Still, Emacs is much slower than Vim. All these “C-a C-u 3 C-k ESC S-<” instead of Vim’s “3ddgg” can drive a Vim user crazy. Well, I am pretty sure that once I get used to it, find out which operations I need the most and bind them to convenient keys, Emacs will become much friendlier to me, but I seriously doubt that it will give me typing speed comparable to Vim.

Therefore I conclude that it is no wonder people keep arguing about these two editors, forgetting everything else. They are really good, each in its own way, and I see no other choice except using them both. What is left to decide is when to use which. As far as I can see now, Vim is better when you need more typing and fewer other operations, and Emacs is better when you find yourself simultaneously fixing, debugging, committing, updating and so on. Therefore, for mail Vim is definitely better; for editing configuration files too (I even use Vim to edit the Emacs configuration file, and find it very handy); for the initial stages of development, when you think about design and coding more than about committing and debugging – Vim again. For debugging and quick fixing, Emacs only.

And what really sucks is that even basic operations are done in Emacs and Vim with completely different key bindings. For example, basic cursor movement in Vim is “h j k l” while in Emacs it is “C-b C-n C-p C-f”. I admit that the latter is easier to remember (back-next-previous-forward), but if you look at your keyboard, you will have to agree that Vim’s keys are much easier to use. Not to mention that in Vim you do not have to press Ctrl. I have to admit, though, that I have only begun learning Emacs. My knowledge of Vim is much better, although I am an expert in neither. One thing that is clear to me is that I have to use both to achieve maximum productivity.

Emacs Psychiatrist

I have slowly started learning Emacs since, as we all know, a programmer does not live by Vim alone. Right away I ran into a crazy thing called “Emacs Psychiatrist”. I decided to have a little chat with it.

I am the psychotherapist. Please, describe your problems. Each time
you are finished talking, type RET twice.

Hell?

Can you elaborate on that?

I can.

Why do you say you can?

Ee?

I would appreciate it if you would continue.
more nonsense

FreeBSD 2.2.9 Released!

A letter arrived from freebsd-announce:

Amazing. RFC 1149 with its implementation was, of course, even cooler, but this is not bad either.