Programming language evolution

It’s been a long time. I got married, and now have even less time to write stuff on the net than I had before. But here I am, sharing another thought while I have a bit of free time.

This is inspired by Uncle Bob’s post Living on the Plateau. He speaks of “the metal”, and how the metal drove the evolution of the programming languages. Indeed, we jumped from assembly to C because we could afford that level of abstraction and get rid of the low-level dependency on the particular hardware. Then we could afford garbage collection and virtual machines, so we did just that. Then Robert states that functional languages were driven by the need for more CPU cores, which is probably true (I’m not an expert on functional languages).

But there’s one thing that was always nagging me a bit. If we can afford to think of our resources as infinite (unless we’re trying to solve NP-complete problems of huge sizes or load petabytes of data into RAM), then why on earth should the evolution of our languages be driven by the metal? Robert speaks of language evolution changing into craftsmanship evolution. While craftsmanship evolution is definitely a good thing, I still think we could use better tools to become better craftsmen. And, surprisingly, we’re still too close to the metal for that.

Think of it. We have very powerful hardware, we have virtual machines and interpreted languages. But take Java or C# for example. In Java, we still have these bare-metal types, such as int or double. They are even called “primitive”! In C#, they are no longer that primitive, in the sense that they at least inherit from Object, which is definitely a good thing (just look at the API to understand what I’m talking about). But they are still too close to the metal. Why on earth, from the user’s perspective, should 2147483647+1 equal −2147483648? OK, Python goes further and makes integer types virtually infinite, but… if you think of it, these are just very tiny steps toward what I think should be the next step in language evolution.

Look at this code:
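
Something along these lines will do (the fragment itself is made up; it’s the types that matter):

```java
import java.util.List;

class UserSearch {
    // Find a user by name and return where it was found
    static int findUser(List<String> users, String name) {
        for (int userIndex = 0; userIndex < users.size(); userIndex++) {
            if (users.get(userIndex).equals(name)) {
                return userIndex;
            }
        }
        return -1;
    }
}
```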

Makes sense? Not to me. Why on earth does userIndex have the int type? The type should reflect what it is, and a user index is definitely not just a random integer. Indeed, this would compile just fine:
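
For instance (again, a made-up fragment; any two unrelated ints would do):

```java
class Nonsense {
    static int demo() {
        int userIndex = 42;      // supposedly a position in a list of users
        int temperature = -5;    // supposedly degrees outside
        userIndex = userIndex + temperature;  // utter nonsense, yet perfectly legal
        return userIndex;
    }
}
```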

It compiles, but it feels so terribly wrong that one might wonder if there’s something wrong with our language. And I think there is. The root of this problem is that we’re still thinking too close to the metal. Much closer than we can afford to, given the sheer computing power available to us.

Indeed, I think that low-level types, limited in size and capabilities, should still be available, but they must be very rare and exotic, not ubiquitous like they are in today’s software. For example, the code above might look like this:
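
Say, with a dedicated Index type (a hypothetical one; no such type exists in Java):

```
Index userIndex = users.find(name);   // an Index, not a bare int
String user = users.get(userIndex);   // fine
int x = userIndex;                    // compile error: an Index is not an int
```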

Now that’s better, but still feels wrong. Why on earth should all lists use the same type for indexing?
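
Imagine instead that every list carried its own index type (pure fantasy syntax, of course):

```
users.Index userIndex = users.find(name);  // an index into this particular list
users.get(userIndex);                      // fine
orders.get(userIndex);                     // compile error: an index of the wrong list
```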

Now that’s something that makes sense to me.

OK, but what if we’re working with a pure math problem? Shouldn’t we use raw types then, provided they’re not prone to low-level effects like overflow, as in Python? No. Even in math, there are all kinds of integers that are not exactly interchangeable. You wouldn’t assign an X coordinate to a variable holding a Y coordinate unless you’re performing some kind of transpose operation, in which case your code should probably explicitly convert these types. Working with doubles, you wouldn’t assign an angle to a coordinate. Come to think of it, the very word double comes from the metal. It just reflects hardware precision.

I was recently explaining a simple LeetCode problem to my wife (she isn’t a programmer at all). I was able to explain this code to her pretty well:
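
I don’t remember the exact problem anymore, but it boiled down to binary exponentiation, roughly like this (the code below is a reconstruction, not the original):

```java
class BitTricks {
    // Compute base^count by walking over the bits of the exponent
    static long power(long base, int count) {
        long result = 1;
        int m = count;            // the part of the exponent left to process
        while (m > 0) {
            if ((m & 1) == 1) {   // the lowest bit of m is set
                result *= base;
            }
            base *= base;
            m /= 2;               // move on to the next bit
        }
        return result;
    }
}
```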

But she was stuck at this bit: (m & 1) == 1. And indeed, it just doesn’t look right. If m is just an integer, in the math sense, then why on earth am I fiddling with bits instead of just saying m.isOdd()? Now that would look much cleaner. Indeed, I’d expect this code to look more like this:
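
Something like this, in the same fantasy dialect as above (the temporary bit-juggling copy is gone, and the exponent is halved directly):

```
result = 1
while count > 0:
    if count.isOdd():     // no bit fiddling
        result *= base
    base *= base
    count /= 2
```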

I’m still using some literals here, and I’m not sure what to do with them. Perhaps our language should be able to implicitly convert them to the needed types, as long as it makes sense (if that’s even possible to detect at compile time). One thing that still feels terribly wrong about this code is count /= 2. Why on earth should we assume that division rounds down by default? That is another low-level bit that comes to bite us. There’s probably more, but my mindset is probably still too low-level to figure it all out.

Nowadays some libraries, especially those with so-called fluent APIs, go along the route I’m talking about, to some extent. For example, in AssertJ, we often write:
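
Something like this (the assertions use the real AssertJ API; the names and values being asserted on are just illustrative):

```java
assertThat(user.getName())
    .isEqualTo("Alice");

assertThat(users)
    .hasSize(3)
    .contains("Alice");
```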

Not only does it help make code more readable and resolve possible signature clashes, it also prevents wrong code from compiling.

But these are again very small steps. These tricks exist only at the API level, whereas even the standard library continues to use ubiquitous ints, longs and whatnot. In the next stage of programming language evolution, I’m eager to see higher-level concepts introduced at the language level, in the standard library. We must stop thinking in bits and bytes and start thinking in concepts. Even at the low level, wouldn’t it be better to see a copyOfRange that takes an explicit offset and length, each with its own type, rather than three bare ints? If anything, it would stop us from making stupid bugs when we accidentally mix up the order of the arguments or get confused between from, to and from, length kinds of arguments. How often did you write Arrays.copyOfRange(array, index, size) instead of Arrays.copyOfRange(array, index, index + size)? I know I did that many times. Unit tests catch these pretty quickly, but wouldn’t it be wonderful if it was plain impossible to even make this kind of mistake?
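
Even in today’s Java you can get most of the way there with tiny wrapper types. From and Length below are made up for this sketch; nothing like them exists in the JDK:

```java
import java.util.Arrays;

class Ranges {
    // From and Length are hypothetical wrapper types, invented for this sketch
    static final class From {
        final int value;
        From(int value) { this.value = value; }
    }

    static final class Length {
        final int value;
        Length(int value) { this.value = value; }
    }

    // Now the two kinds of arguments cannot be swapped or confused:
    // passing a Length where a From is expected simply does not compile.
    static int[] copyOfRange(int[] array, From from, Length length) {
        return Arrays.copyOfRange(array, from.value, from.value + length.value);
    }
}
```

Calling `Ranges.copyOfRange(array, new Length(3), new From(1))` is now a compile error rather than a unit-test failure.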

And while we’re not quite there, I encourage all developers of higher-level software to employ tricks like the one above to introduce more custom types, to make code look cleaner, to make it easier to understand and harder to get wrong. Don’t think like a programmer unless you have to! Think like a business area specialist! Defaulting to low-level types is a kind of premature optimization. Unless you have proven that using a custom type slows your software down terribly, don’t go for raw int and string types! You don’t work with separate bits when you can work with whole bytes, and you don’t work with whole bytes when you can work with integers, right? So don’t work with integers when you can work with list indexes, coordinates, object counts and other high-level stuff that really makes sense in the application area you’re working with.

The answer to the first-grader problem

The unknown numbers are, of course, 6, 5 and 3. What’s the difference between them? Well, you can’t find the difference between three numbers. You can find the difference between the unknown numbers on the right side, though. So you do just that and get 2. That’s another unknown number we found out. Now just find the difference between 6 and 2. The final answer is 4. Makes sense? What, it doesn’t? I agree. Figures.

Find the difference between the unknown numbers

This is a really stupid problem for Russian first-graders (7 years old or so). Given the picture below, find the difference between the unknown numbers. That’s it, really. I’m not joking.

Find the difference between the unknown numbers


  1. The difference means...

    “The difference” means you have to subtract something from something. Answers like “the difference is that there is one unknown number to the left and two to the right” sound OK, but unfortunately are wrong.

  2. The unknown numbers are...

    If you think that the unknown numbers are 6, 5 and 3, you’re correct. But how on earth do you find the difference between three numbers?

  3. Almost the answer

    Don’t forget that this problem is for 7 year olds and it’s all about subtraction.


Why I love Hebrew

After the last post I feel obliged to say a few words about Hebrew as well. And as it will probably turn out, there will be more than a few by the time I finish writing this.

Japanese is like art. You look at it, you’re fascinated with it, and you may see as many meanings in it as your imagination will allow you. Hebrew, on the other hand, is like science.

In many ways, Hebrew is very similar to Western languages. It has a proper alphabet (no vowel letters or capital ones, though), it has spaces (unlike Japanese), it has grammar and spelling rules and all that stuff. What makes it special (along with Arabic, probably) is the structure of a typical Hebrew word, by which I mean mostly nouns, adjectives and verbs.

A typical Hebrew word consists of a pattern and a stem. In most cases the stem is either three or four letters. The pattern is whatever is left after you take out the stem, at least in the simplest cases. Let’s take the word “shalom”, for example, which literally means “peace” or “well-being”, but often just “hi”. There are different ways to write it. The most commonly used one is

שלום

(Read right-to-left.)

However, when learning Hebrew, it makes much more sense to write it as

שָׁלוֹם

Note the three differences: a tiny dot above ש, a T-shaped mark under it, and a tiny dot above ו. The first dot is not very interesting: it’s just that there are two different letters that look almost the same: shin (שׁ) and sin (שׂ). These dots just mark the difference, and normally they are omitted, just like you often omit diacritics in English loanwords like “fiancee”.

The other two marks are much more interesting. They are vowels. As it turns out, the Hebrew script predates vowel letters, and for religious reasons it was not feasible to add them later on. So diacritic marks known as nekudot ([neh-koo-DOT], meaning just “dots”) were invented. The T-shaped one is “a”, and the dot above ו is “o”. A funny thing: even though the letter ו (vav) is normally pronounced as “v”, the presence of that dot makes it silent, therefore turning it into a vowel letter (the letter itself is silent, but the dot is still pronounced as a vowel). There is a good reason for this: before nekudot were invented, some consonant letters were used as both consonants and vowels. The most commonly used are ו (v/o/u), י (y/i) and ה (h/a/e, but it can only be silent at the end of a word, where it typically is silent). But sometimes י stands for “a”, or ה stands for “o” at the end of a word, or other weird things happen. This weirdness comes from religious texts too, which can’t be changed, so the language had to adapt to reflect the correct pronunciation.

But all that is annoying and not really fun. What’s important is that this word has a stem, namely ש-ל-מ. Never mind that ם turns into מ and vice versa: ם is just the right way to write מ at the end of a word, and that’s purely a graphical feature specific to just five letters (of 23). But since ש-ל-מ is not exactly a word, the standard forms of the letters are used. So both שלום and שלומך have the same stem.

What’s the pattern then? In Latin letters, it could be written as 1-a-2-o-3. In Hebrew, it’s kind of hard to write, because you can’t write nekudot without letters, and all the nekudot are part of the pattern (except the dot of the letter shin, which is actually part of the letter and therefore belongs to the stem together with it). The most common way to write patterns is to put some stand-in stem into them, preferably one that doesn’t distort the pattern (such distortion is a pretty common occurrence known as gizra [geez-RAH]). The usual stem is ק-ט-ל, which is a bit uncomfortable because its meanings are associated with killing, but it works best, as the letters are very gizra-neutral and don’t wreak havoc on the pattern.

The order of the letters is important. It’s also very rare for the pattern letters to get intermixed with the stem letters. In fact, there is only one pattern where that happens, and there are strict rules for that one. Note, however, that when I say “letters” here, I mean full-fledged consonants. וֹ in שלום is no longer a letter in that sense, and therefore it jumps in between ל and ם quite easily.

The whole stem-pattern business has an enormous impact on the learner. In a random Western language, when you see an unfamiliar word, you can sort of guess its meaning if you know its stem. Say, if you know what “white” means, you can probably guess the meanings of “whiting”, “whiteness” and similar words. Hebrew goes much further, because sometimes even apparently unrelated words share a common stem or pattern. Say, you know that פיגוע [pee-GOO-ah] is a terrorist act. And then you see פגיעה somewhere. You instantly recognize the stem, and knowing that the pattern קטילה [ktee-LAH] usually means some action, but not as intense as the pattern קיטול [kee-TOOL], you deduce that it must be some other bad action, but not as bad as a terrorist act. An assault perhaps, or some other kind of offense. Then you look at the context:

מה לעשות במקרה של פגיעה משריפה?

Even with my poor Hebrew, I can read that roughly as “What to do in a mikra of pgia from fire?” Now I know, as I’m reading this on a news website, that there are a lot of fires in Israel now, probably due to arson and dry weather combined. Given all that, I’d guess that “mikra” is something like “situation” and “pgia from fire” is something like arson, and it’s probably an article explaining what to do if you suspect someone of starting a fire or if you witnessed it, or something along those lines.

Now, I’m fairly sure that פגיעה is pronounced [pgee-AH], but I’m not sure at all about מקרה. It could be [mee-KRAH] or [mah-KREH] or anything. The lack of nekudot doesn’t help to figure out the correct vowels, but knowledge of the patterns does. If I knew what pattern that is, I could probably guess pronunciation with enough confidence, just like I did with פגיעה because it’s a pretty distinguishable pattern, thanks to the way how י and ה are placed.

I really did just make all this up by looking at a random news website just now. Now let me consult my dictionary and see what I got right.

First, מקרה is pronounced [mee-KREH]. Almost got it right. I guessed the meaning right from the context: it’s really an occurrence, or a case. “מה לעשות במקרה של” [MAH lah-ah-SOT beh-mee-KREH SHEL] is literally “what to do in case of…” And I would have been even more sure if I knew how the phrase [MAH kah-RAH] (what happened?) is spelled (that’s מה קרה, but I’ve never seen it before, only heard it).

Second, פגיעה is a blow (a physical or a moral one), a hit, damage, an injury. And looking at the further text, it seems I was wrong about arson after all. A more appropriate translation would be “What to do when fire hits?” or “What to do in case of fire injury?”, which kind of makes sense, I think, because any reasonable person probably already knows that in case of a suspected arson the only right thing to do is call the cops, but many people don’t know everything that should be done in case of fire or fire injury.

Still, as you can see, even as I encountered two totally unfamiliar words, I was able to guess their meanings to some extent.

Thanks to gzarot (plural of gizra), though, this exercise becomes much more complicated. In this case, their effect was minimal (that “e” at the end that I got wrong). But there are cases when the mere presence of a certain letter in a certain position breaks everything. Say, words like מדבר [meed-BAR], מגדל [meeg-DAL] and מערב [mah-ah-RAV] belong to the same pattern, but see how the last one is pronounced? All thanks to the letter ע, which doesn’t like “i” before it or the lack of a vowel after it, but absolutely loves the “a” vowel on both sides. Gzarot get even worse with words like שומר [shoh-MER] and גר [GAR], which also belong to the same pattern… or, rather, would belong to the same pattern if it were possible to squeeze the stem ג-ו-ר into it. But since it’s impossible (for no good reason), the middle letter of the stem just disappears, and the whole pattern is changed so drastically that it’s easier to say it’s replaced by another pattern altogether. Thankfully, gzarot are rules, not exceptions. Meaning, if it works that way with this particular stem and this particular pattern, it will work the same way with another stem/pattern pair that satisfies the same condition. If we take the same “shomer” pattern and the ד-ו-ר stem, we know it’ll be דר [DAR] and nothing else. Sometimes the rules are ambiguous, though, but they still usually limit the possibilities to two or three.

And yet, with all this weirdness, most of the language’s vocabulary suddenly turns from a list of seemingly random words into a nice table of stems (about 4000 of them) and patterns (about 200), plus a set of gzarot to learn. That’s an order of magnitude less than the tens of thousands of words of a typical random language. What’s especially good about it is that once you know a word, it’s easy to restore both its spelling and pronunciation. If you remember the word קרה from the phrase מה קרה, and then you remember the pattern מגדל, then you may misread the word מקרה as [mee-KRAH], as I did above, but even if you don’t remember how it’s pronounced, at least you’ll get the spelling right (without the vowels), so you won’t get [mee-KHRAH] or [mee-GRAH], because you know the stem. And once you’ve seen that type of word, you’ll figure out the gizra, and then you can apply it to different words as well. Now when I see מפנה, I know it’s [mee-FNEH], and not [mee-FNAH]. Ditto for משנה [mee-SHNEH]. Apparently the gizra here is that when a stem with the third letter י (which often turns into a silent ה at the end, just like here) goes into the pattern of מגדל, the last vowel changes to “e” (before the silent ה):

מִגְדָּל ← מִקְרֶה

(The one dot is “i”, the three-dots triangle is “e”, the two dots are silent in this case, and never mind the dot in the middle of ד).

See how learning separate words turns into learning whole groups of words. What’s even better is that different gzarot often don’t interfere with each other. If I know that ע likes “a” around it and that the silent ה likes “e” before it in this pattern, I can easily read מערה as [mah-ah-REH].

Then there are various suffixes and endings that can be attached to a word, and those can change the pattern as well, but that’s another story. It’s enough to say that there are rules for that too. And learning rules is a much more pleasant process than learning seemingly random sequences of characters that are called meaningful words for some reason that escapes me completely.

Why I love Japanese

I absolutely love foreign languages, and I’m also exceptionally bad at them. It took me, what, like 20 years to get my English to this level, and that’s mostly thanks to practice. But I just had to. I used to play a lot of computer games, and the Russian translations were so awful that it was easier to play in English even when I didn’t speak it. I used to read a lot of texts about programming, and they were naturally in English too. Then I had to talk to my European colleagues on the phone with no interpreter. Then I watched anime with English subtitles, because Russian ones were either unavailable or, again, ridiculously awful. So, yeah, I had to pick up some English skills along the way.

Other languages are different. Like most programmers, I’m very lazy. I’d rather make my computer do my work for me, but, naturally, it can’t learn languages for me, and even if it could, it wouldn’t help me either. So… my usual way of learning a language is to find some learning material, ponder over it for a few minutes (or hours, if I feel really stubborn) and then just forget about that particular language for half a year or so. Naturally, it doesn’t work very well.

But… even when things are that way with German, French and Irish, they are just a little bit better with Japanese and Hebrew. Hebrew aside for the moment, I’d like to say a few things about Japanese.

In case you didn’t know, Japanese has two writing systems: kana and kanji. Kana is divided into two subsystems: hiragana and katakana. Kana is a phonetic system with pretty much one-to-one correspondence between syllables and characters. There are a few exceptions, but they are pretty well-defined too. Note that I say “syllables”, not “sounds”. That’s because Japanese doesn’t have standalone consonant sounds, so even if you really wanted to, you couldn’t make up a word like the famous Russian “взбзднуть”, which roughly reads as “vzbzdnut’”. That’s right, six consonants at the beginning (and as far as I know, some Slavic languages even have words with no vowels at all). In Japanese you only have syllables that sound like consonant-vowel, or just a vowel. And there’s also “n/m”, which is a separate syllable that is pronounced as “n” in most contexts and as “m” in the few remaining ones. This feature makes Japanese sound very beautiful. It also makes it rather hard to adapt loanwords (the lack of certain sounds doesn’t help either). Now you know why “the world” sounds like “za waarudo”. It goes like this:

  • Change “th” to “z” because it’s the closest sound.
  • Change “e” to “a” because the English sound isn’t like the Japanese “e” at all, and “a” is again the closest sound.
  • Change “or” to “a” for the same reason. Japanese does have long vowels, though, hence the “aa”.
  • There is no “L”, but the Japanese “r” sounds very much like it. There are no standalone consonants, though, so it turns into “ru”. That’s because the “u” sound is considered the least noticeable vowel (sometimes it isn’t pronounced at all).
  • There is a “d” sound, but, alas, the syllable “du” sounds more like “zu”. That’s why “o” is used as the extra vowel here instead of “u”.

In case you wonder, hiragana and katakana differ only in their appearance, pretty much like printed letters differ from written ones. Hiragana is mostly used for regular writing (mixed with kanji), and katakana is typically reserved for loanwords and design purposes (banners, ads and so on).

And now we’re getting to my favorite part. Kanji. They are originally Chinese, and most of them still look identical to their original counterparts (I know of only one exception). They typically have several different readings, mostly falling into two groups: the original Chinese pronunciation, adapted to Japanese, and the Japanese pronunciation, corresponding to the same word as it existed in Japanese before there were any kanji at all. That’s why 道 can be pronounced as “dou” (read as “door”, because “u” after “o” makes it long) or “michi”. To make things even worse, kanji in names often have totally different readings, which makes it next to impossible to read a Japanese name unless it comes with a hiragana transcription known as furigana.

But that was the bad part. The good one is that, unlike words in most other languages in the world (except Chinese, naturally), Japanese words actually do make sense. Think about it. Does the word “two” make any sense to you? Why “two”? In Russian it’s “два”, which suggests a common origin. Without doing etymological research, we can hardly say what that origin is. But even if we knew it, we still couldn’t answer the basic question: why on earth does it sound like that? Why not “boom” or “swin”? Surely such a common word must be short, but there is no saying why it should sound or be written this way or that. In Japanese, there is still no reason why the word sounds like it does, but there is a good reason why it should be written like this:
二

It shouldn’t surprise you much that one, two, three are written as 一、二、三. Of course, things are not always that simple. For example, I have no idea why four is written as 四. There is certainly an etymological reason, but who needs one when you can clearly see two strokes in a square, and two squared is certainly four! So unlike Western languages, which lose original word meanings over time, kanji actually gain new meanings, and it’s up to your imagination how many explanations you can find.

It gets even better. Most kanji are actually composed of elements, and every element has its own meaning. There are just a couple hundred elements, as compared to about two thousand kanji in Japanese alone (many more in Chinese).

It makes learning Japanese (well, the kanji part at least) a very unusual thing. Instead of the usual cramming, you may ponder a lot over every character, trying to find new meanings behind it and see its internal beauty. A lot of work has already been done on this topic, for example the Henshall mnemonics, or a very good book in Russian by Adyl Talyshkhanov. But there’s one thing about them: you gain much more by doing the same work yourself than by reading explanations written by someone else. And even if you do read them, it’s better to rediscover the meanings again and again, rather than write them down once and refer to them each time. I’ve written down a lot of them (in Russian), but hardly remember any. For example, I encountered this word recently:


超能力

The characters correspond roughly to “super”, “ability” and “strength”. The most literal (and pretty appropriate) translation would be “superpower”, as in ESP and all that kind of stuff.

I have no problems whatsoever remembering the 力 part. It’s purely graphical: just imagine a strong man bending his arm to show off his muscles. However, the first two characters often leave me puzzled.

Let’s take 能 for example. It consists of 厶 (I, myself), 月 (month, moon, but Talyshkhanov sometimes refers to it as “meat”) and two of 匕 (which my dictionary lists as “spoon”, but it looks very much like a sitting person as well). Now, Talyshkhanov explains this riddle as “some people have the ability to see plain meat in this kanji, others have the skill to see a romantic crescent moon”. Ability, skill, talent—these are the meanings of this kanji. I don’t like that explanation much, and not just because it lacks “myself”—it can be easily reworded along the lines of “I may have the ability to see a moon here, but those people are just sitting there seeing nothing more than just meat”. But it just doesn’t click either way.

But I seem to have difficulty finding other explanations. The good part is, the more difficulty you have and the more time you spend on it, the better you remember it in the end. This process may get very enlightening. Say, forget about sitting people and remember the original “spoon” meaning. Now I can say that I have the ability to eat the Moon with two spoons. This is ridiculous enough to remember easily, but isn’t very enlightening. Or I could think of how I worked for a whole month with two spoons to develop some ability. Or maybe I worked on my abilities so hard that I only ate two spoonfuls in a whole month. Repeat that until just looking at this character makes you think of various abilities, even if it takes you spoon-feeding yourself such nonsense for a whole month.

The first character, 超, is even more ridiculous. It consists of 土 (soil, earth, ground), something that isn’t considered a separate element but looks suspiciously like a shoe, or like the road from the 道 kanji, and then 刀 (sword) and 口 (mouth). If that is of any help, the ground combined with the “shoe” makes 走 (to run), and the sword combined with the mouth makes 召, which has ridiculously many meanings, such as seduce, call, send for, wear, put on, ride in, buy, eat, drink and even catch a cold. For a laugh, try to imagine how you would do all those things with a sword and a mouth, like seducing someone with your mouth while holding a sword in your hand to make sure that your words come through.

That, however, doesn’t help us understand why 超 stands for transcend or super. Of course, one may think that running is walking super-fast and that putting a sword in your mouth transcends all boundaries of the reasonable, but it still lacks something. It’s a bit more reasonable to imagine a samurai with a sword, running and shouting something. Such a picture certainly suggests that he has transcended a certain level of samurai-ness.

The whole word put together leads us to thinking of a strong superman with a sword running on the ground in moonlight, holding a couple of spoons in his mouth, because apparently it’s his special skill, or something like that.

MVC, MVP and MVVM, pt. 1: The Ideas

There is a lot of confusion going on about GUI design patterns such as Model–View–Controller, Model–View–Presenter and Model–View–View Model. I’m starting this series of blog posts to share my own knowledge and experience with these patterns, hoping to clear things up a bit. I’m not going to dive deep into the history behind these patterns. Instead, I’m going to concentrate on things as they are today.

I’ll start with the ideas behind these patterns. There is one single idea behind them all: separation of concerns. It’s a well-known principle, closely related to the single responsibility principle, the S part of SOLID. Its clearest form says: there should be only one reason for a class to change. Separation of concerns takes that to the architecture level: there should be only one reason for a layer to change. The granularity of that reason is different, though. One may say: the Money class should only change if the logic of working with money changes. On the architecture level one would say instead: the view layer should only change when the appearance should change (for example, money should now be displayed using a fixed-width font). In particular, the view layer should not change if the business logic changes (money should now be calculated to the 2nd digit after the decimal point) or if the presentation logic changes (money should now be formatted with 1 digit after the decimal point).


With these ideas in mind, let’s go over the three patterns, starting with MVC. It’s probably the most confusing of them all, and I think it’s mainly because separation of concerns is not complete in MVC.


To add to the confusion, there are many variations of MVC, and there is no single agreement on what exactly the components do. The view is the easiest part: its job is to display things and receive interactions from the user. You can’t really separate these two concerns: how would you separate displaying the text that the user is editing from actually editing that text? There has to be a single graphical component that does both. You can do the next best thing, though: delegate user interactions to another component. And this is where the controller comes in.

The controller receives user interactions from the view and processes them. Depending on the interface between the view and the controller, you may be able to reduce the coupling between them, and that’s a good thing. Suppose your view is implemented with Swing, and there is an Apply button. Instead of making the controller implement ActionListener, implement the listener inside the view and delegate the Apply button click to the apply method of the controller, which is UI-agnostic (it doesn’t depend on Swing at all).
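
A minimal sketch of that delegation (all the class names here are made up):

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.JButton;

// The controller knows nothing about Swing
class SettingsController {
    boolean applied = false;

    void apply() {
        applied = true;   // act on the model, validate input, etc.
    }
}

// The view owns all the Swing-specific wiring
class SettingsView {
    final JButton applyButton = new JButton("Apply");

    SettingsView(SettingsController controller) {
        applyButton.addActionListener(new ActionListener() {
            @Override
            public void actionPerformed(ActionEvent e) {
                controller.apply();   // delegate to the UI-agnostic method
            }
        });
    }
}
```

Swapping Swing for another toolkit now only touches the view; the controller stays as it is.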

That was the easy part. But what happens next? The controller acts on the model, which contains the actual data the application works with. Then, at some point, it may be needed to display the updated data back to the user. Here is where the confusion starts. One possibility is that there is the observer pattern acting between the view and the model. In this case, the view subscribes to certain events of the model, and the model either sends the updated data to the view (the push model of the observer pattern) or just events (the pull model). In the latter case, the view needs to pull the necessary data whenever it receives the appropriate event.
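
Here is a sketch of the pull variant (the names are mine, not from any framework): the model only announces that something changed, and the view pulls whatever data it needs:

```java
import java.util.ArrayList;
import java.util.List;

// The listener interface belongs to the model's side of the boundary
interface ModelListener {
    void modelChanged();
}

class AccountModel {
    private final List<ModelListener> listeners = new ArrayList<>();
    private long balance;

    void addListener(ModelListener listener) {
        listeners.add(listener);
    }

    void deposit(long amount) {
        balance += amount;
        // Pull model: notify only, send no data
        for (ModelListener listener : listeners) {
            listener.modelChanged();
        }
    }

    long balance() {
        return balance;
    }
}

class BalanceView implements ModelListener {
    private final AccountModel model;
    String displayedText = "";

    BalanceView(AccountModel model) {
        this.model = model;
        model.addListener(this);   // subscribe; the model sees only ModelListener
    }

    @Override
    public void modelChanged() {
        displayedText = "Balance: " + model.balance();   // pull on event
    }
}
```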

Note that even though the model sends data to the view, it has no idea of its existence because of the observer pattern. This is especially important if the model is in fact the domain model, which should be isolated as much as possible. It should only communicate to the outside world through clean interfaces that belong to the model itself.

Another variation of the MVC pattern is often seen in web frameworks, such as Spring MVC. In this case, the model is a simple DTO (data transfer object), basically a hash map, easily serialized into JSON or something. The controller prepares the model and sends it to the view. Sometimes it’s just a matter of passing the object inside a single process, but sometimes the model is literally sent over the wire. This is different from a typical desktop observer approach where the controller doesn’t even know anything about the view. To keep this coupling loose, the controller often doesn’t send the model directly to the view, but rather sends it to the framework which then picks up an appropriate view and passes the model to it.


What makes this pattern especially confusing is that the model is no longer the domain model. Rather, we have two models now: the M part of the MVC pattern is the data transfer model, whereas the controller acts on the domain model (maybe indirectly through a service layer), gets back the updated data, packs that data into a DTO and passes it to the view to display. This very idea of a data transfer model is exactly what makes the pattern so suitable for web applications, where the controller may not even know in advance what to do with the data: you may have to wrap it into HTML and send it to the browser, or maybe you serialize it into JSON and send it as a REST response.

Either way, one problem with MVC is that the view is too complicated. UI tends to be both heavyweight and volatile, so you usually want to keep it as thin as possible. In MVC, the view not only displays data but also has the responsibility of pulling that data from the model. That means the view has two reasons to change: either the requirements for displaying data change or the representation of that data changes. If the model is the domain model, that's especially bad: the UI should not depend on how data is organized in the domain model. If the model is a DTO model, it's not that bad, but the DTO can still change, for example, to accommodate the needs of a new view (or a REST client). Still, MVC is often the best choice for web applications, and is therefore the primary pattern of many web frameworks.

One major disadvantage of MVC is that the view is not completely devoid of logic, and therefore it can be hard to test, especially when it comes to unit tests. Another disadvantage is that you have to reimplement all that logic if you’re porting your view to another tech (say, Swing to JavaFX).


One natural way to improve MVC is to reduce the coupling between the view and the model. If we make a rule that all interactions between the view and the model must go through the controller, then the controller becomes the single place for implementing presentation logic. That's what we call a presenter. The term presentation logic refers to any kind of logic that is directly related to the UI but not to the way the components actually look (that's the view's responsibility). For example, we may have a rule that if a certain value exceeds a certain threshold, it should be displayed in red. We can split this rule into three parts:

  1. If a value exceeds a certain threshold, then it’s too high.
  2. If it’s too high, then it should be displayed in a special way.
  3. If it’s too high, then it should be displayed in red.

The first part is the domain logic. It could be implemented, say, with an isTooHigh method, but that really depends on the domain. The second part is the presentation logic, and if it looks like a generalization of the third part, that's exactly what it is. The presenter learns from the model that the value is too high and therefore passes it to the view with some kind of Status.TOO_HIGH enum constant. Note that it has no idea of colors yet. It's the job of the view to actually map that constant to a color. Or it could be something other than a color, like a warning sign next to the value.
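Here is one way this three-part split could look in code; Status, Measurement and the rest are names I made up for illustration:

```java
// Part 3 maps an abstract status to a concrete look; parts 1 and 2 know nothing about it
enum Status { OK, TOO_HIGH }

class Measurement {                                   // domain layer
    final double value;
    final double threshold;

    Measurement(double value, double threshold) {
        this.value = value;
        this.threshold = threshold;
    }

    boolean isTooHigh() { return value > threshold; } // part 1: domain logic
}

class Presenter {                                     // part 2: presentation logic
    Status statusFor(Measurement m) {
        return m.isTooHigh() ? Status.TOO_HIGH : Status.OK;
    }
}

class View {                                          // part 3: the concrete look
    String colorFor(Status status) {
        return status == Status.TOO_HIGH ? "red" : "black";
    }
}

public class StatusDemo {
    public static void main(String[] args) {
        Status status = new Presenter().statusFor(new Measurement(120, 100));
        System.out.println(new View().colorFor(status)); // red
    }
}
```

Swapping "red" for a warning icon only touches the View class; the presenter and the domain stay untouched.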


In the MVP pattern, the view is completely decoupled from the model. The presenter is something similar to the mediator pattern. In fact, if the view is implemented as a set of independent graphical components (like multiple windows), and the model also consists of multiple objects (as is almost always the case), it would be exactly the mediator pattern.

Unlike MVC, there is no observer pattern between the presenter and the view. The reason is that the view contains so little logic that there is no place for any event handling there, except for view-specific UI events (button clicks etc.). It's the presenter's job to figure out when to update the view, and it does so by calling appropriate methods. These methods typically belong to an interface fully defined by the presenter, which is an excellent example of the dependency inversion principle (the D in SOLID). The presenter doesn't depend on any technologies the view uses. Well, in theory at least. For example, it would be very tricky to implement exactly the same interface with Swing, JavaFX and HTML. How do you call methods on HTML? You could have some server-side object that sends the data to the browser using AJAX or even a WebSocket, I suppose, but it would be very tricky and at the same time not as powerful as MVC, where the controller is free of presentation logic and therefore can be shared between views with different presentation needs (such as HTML and JSON).
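A tiny sketch of this arrangement, using hypothetical method names; note that the fake view in main is all it takes to unit-test the presenter:

```java
import java.util.ArrayList;
import java.util.List;

// The presenter owns the view interface (dependency inversion): the presenter
// depends on this abstraction, and each GUI technology provides its own
// implementation. All names here are illustrative.
interface FilesView {
    void displayFilesList(List<String> files);
    void setApplyEnabled(boolean enabled);
}

class FilesPresenter {
    private final FilesView view;

    FilesPresenter(FilesView view) { this.view = view; }

    // Called by the view when the user asks for a refresh; in a real
    // application the file list would come from the model
    void onRefreshRequested() {
        view.displayFilesList(List.of("a.txt", "b.txt"));
        view.setApplyEnabled(true);
    }
}

public class MvpDemo {
    public static void main(String[] args) {
        List<String> shown = new ArrayList<>();
        FilesPresenter presenter = new FilesPresenter(new FilesView() {
            public void displayFilesList(List<String> files) { shown.addAll(files); }
            public void setApplyEnabled(boolean enabled) { }
        });
        presenter.onRefreshRequested();
        System.out.println(shown); // [a.txt, b.txt]
    }
}
```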

The positive side is that since all presentation logic is in one place, porting to another view technology is a breeze. You just reimplement your view interface with the new technology, and you're done. Well, in theory at least. In practice you may run into various problems. Threading, for example: who makes sure that view methods are only called on the appropriate threads? Should the view enforce that? Probably yes, because the presenter has no way of figuring out which thread is right if it has no idea which GUI framework is used in the first place. But that imposes an additional burden on the view. Still, MVP is probably as close as you can get to total independence from the GUI framework.

The bad news is that the presenter now contains a lot of boilerplate code. It was part of the view before, so it's not like things became any worse than they were with MVC, but it's still a nice idea to get rid of as much boilerplate code as possible. That's where MVVM enters the picture.

Model–View–View Model


MVVM is basically the same thing as MVP, except for one major difference. In MVP, the view only delegates user interactions to the presenter. Whenever feedback is needed, it's the presenter who takes action, literally calling methods on the view such as displayFilesList(files), setApplyEnabled(true), setConnectionStatus(ConnectionStatus.GOOD) and so on. That's boilerplate code. With MVVM, the presenter becomes the view model, that is, a model that provides access to ready-to-display data through the observer pattern, much like in desktop MVC. Except that now the view model can really prepare that data for display by filtering, sorting, formatting etc. Whatever presentation logic lived in the view in MVC is now in the view model. And while in MVP the presenter pushed that data to the view, in MVVM the view pulls it from the view model. This sounds like we're adding responsibility to the view, and that's a Bad Thing, right? Well, to a certain extent, yes. But the point is, this responsibility is typically almost entirely implemented by the framework. This is done through data binding: you just specify that this component should display that property of the view model, and that's basically it.
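To illustrate the mechanism without committing to any particular framework, here is a toy hand-rolled property; real frameworks (WPF bindings, JavaFX properties) provide much richer machinery, so treat this purely as a sketch:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// A toy observable property, standing in for what a binding framework provides
class Property<T> {
    private T value;
    private final List<Consumer<T>> observers = new ArrayList<>();

    Property(T initial) { value = initial; }

    // "Binding" a component means subscribing it and pushing the current value
    void bind(Consumer<T> observer) {
        observers.add(observer);
        observer.accept(value);
    }

    void set(T newValue) {
        value = newValue;
        observers.forEach(observer -> observer.accept(newValue));
    }
}

// The view model prepares ready-to-display data and exposes it as properties
class PersonViewModel {
    final Property<String> displayName = new Property<>("");

    void load(String firstName, String lastName) {
        displayName.set(lastName + ", " + firstName); // formatting is presentation logic
    }
}

public class BindingDemo {
    public static void main(String[] args) {
        PersonViewModel viewModel = new PersonViewModel();
        StringBuilder textField = new StringBuilder(); // stands in for a text component
        viewModel.displayName.bind(text -> textField.replace(0, textField.length(), text));
        viewModel.load("John", "Doe");
        System.out.println(textField); // Doe, John
    }
}
```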

When your framework doesn't support data binding, it's usually a bad idea to use MVVM because you'll essentially be moving the boilerplate code from the presenter to the view, which is indeed a Bad Thing. And even with data binding, it's not always simple. Sometimes you have complicated structures to bind. Sometimes the order of updates is important and you get race conditions in your UI. Sometimes you have values of custom types that are tricky to display directly, so you need to employ some sort of converter for that.

The good part is that with MVVM you typically only have problems in non-standard situations. For most cases, it really decreases the amount of boilerplate code, and displaying a person's name in a text field becomes as simple as writing Text="{Binding Person.Name}" in XAML.

Moreover, delegating user interactions is often implemented with data binding too. Well, as I say “often”, I really can’t think of any other MVVM implementation than .Net/WPF, so I guess it’s 100% of all cases, even though there is only one case in total! Nevertheless, using the command pattern, we can expose possible interactions as properties of the view model. The view then binds to them and executes appropriate commands when the user does something. One big advantage of it is that we can easily change these commands dynamically and the view will automatically update its interactions.

When choosing between MVVM and MVP, it’s important to consider several factors:

  • If your framework doesn’t support data binding, MVVM is probably a bad idea.
  • If it does, how likely is it that you'll want to switch UIs? How painful is that likely to be?
  • How difficult would it be to port your application from MVVM to MVP or MVC, or vice versa?

All things being equal, reimplementing the view interface for a new framework with MVP is often about as hard as switching from MVVM to MVP or whatever. In that case, it's probably worth using MVVM if that's what your framework favors. The same goes for MVC. When your framework offers you MVC, you probably don't want to force yourself to use MVP instead unless you really plan to switch frameworks and design for it beforehand. Say, you're using Swing right now, but you know you'll have to move to JavaFX within 5 years.

One last thing to note is that it is possible to combine these patterns, although in most cases it's likely to lead to over-engineering. For example, if your framework forces you to use MVC, you can turn your controller into a view model, and then consider the whole view–view model pair to be just the view of the MVP pattern. So when the user does something, the view delegates it to the controller, which immediately delegates to the presenter. When the presenter gets updated data from the domain model, it sends that data to the controller (through a clean interface), which then stores it locally and fires an event to the actual view, which pulls that data to actually display it. Sounds crazy enough as it is, doesn't it? But sometimes it may be worth it; only experience can tell. It's probably best to start with the simplest thing possible and complicate things only when you actually need to.

That’s it for now. I plan later to demonstrate all three patterns with a simple application. I’ll probably use Java for that, even though implementing MVVM would be tricky. But there is some limited data binding in Java, so it could actually work, if only for demonstration purposes.

Hide and show JTable columns, pt. 2

Now that we’re done with the menu, let’s make it work. The simplest test I can think of clicks on a menu item and checks that the number of visible columns has decreased:

I’ve changed getMenuItems to accept JTable instead of JPopupMenu to make the code less verbose. This test obviously fails as the menu does nothing yet. On one hand, the fix is not so simple. We need to add appropriate listeners to the items. On the other hand, something as silly as this will do the trick:

Okay, this is really silly. But our test passes now, and that means the problem is with our test. Why not use a data provider to parameterize our test?

This is better. Now it fails again. If you’re wondering about DP_COLUMN_INDEXES, it’s just a constant set to “columnIndexes”. Why bother? Because I don’t like hardcoding function names. What if I need to rename it later? Like this, I can name the provider function anything I want. Of course, it’s better to rename and change the constant as well, just to keep it consistent. But that’s just three changes: function name, constant name and the constant literal itself. And even if I forget to touch the constant, nothing breaks.

OK, so now the fix looks like this:

But it's still obviously wrong. Well, maybe not so obviously, but remember that Swing has two coordinate systems: the view and the model. Columns can be rearranged in the view, so the mapping can change, even though it's the identity mapping by default. So we need yet another test. But first, we need to decide whether we want the menu items to rearrange when the columns are rearranged. I think not. First of all, it's too much of a hassle, and it could confuse the user too. Besides, the menu contains all columns, while the view contains only some of them. Where would we put the hidden columns in the menu if they are not part of the view order? So let's keep them in the model order.
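The two coordinate systems are easy to demonstrate with a headless snippet:

```java
import javax.swing.JTable;
import javax.swing.table.DefaultTableModel;

// Moving a column changes the view order but not the model order,
// so the two index spaces drift apart
public class CoordinatesDemo {
    public static void main(String[] args) {
        JTable table = new JTable(new DefaultTableModel(new Object[]{"A", "B", "C"}, 0));
        table.moveColumn(0, 2); // the user drags column "A" to the last position
        System.out.println(table.convertColumnIndexToModel(2)); // 0: view 2 is still model 0
        System.out.println(table.convertColumnIndexToView(0));  // 2
    }
}
```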

Now I have refactored setup code into a separate method. I haven’t annotated it with @BeforeMethod because it’s not a universal test fixture: the install method checks that the menu is not null, so obviously we can’t use it even before the test is started. I’ve introduced an @AfterMethod cleanup method, though (not shown), that just sets everything to null, just to be 100% sure that one test can’t possibly use something created by another.

Now to the fix:

I actually got it wrong on the first try—I put the conversion call outside the listener. That froze vIndex forever, which is obviously not what we want. Note that this is one of the cases where Hungarian notation is tremendously useful: model and view indexes have the same type, so Hungarian notation lets us see immediately what kind of index we're looking at.

Now I don’t really like the looks of the install method. And we haven’t even got to showing hidden columns. So let’s refactor it a little bit.

This is much better. More methods, the code is much longer, but each method is pretty clean and readable. Note that I’ve created a field for the JTable. It was captured by the lambda anyway, so I haven’t actually introduced any new state here. Just moved it from the lambda to the top-level class.

Now let’s check that the columns are properly shown (which, of course, they aren’t).

This fails, but with a very obscure message: java.lang.ArrayIndexOutOfBoundsException: -1. Why? Because the second click is trying to hide an already hidden column. Can we do something about this message? Well, not by modifying the testing code. But we can modify the hideColumn method! What should it do when the column is already hidden? Nothing? Or throw an exception? And should we even bother at all? Probably not at this point. Later on, we may want to make that method public or add some other API to hide and show columns programmatically. Then we’ll have to solve this problem. For now, let’s bear with the obscure message and fix the class.

The change is not trivial. I had to add a hash map of removed columns so we can show them later. The key is the model index.
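A sketch of that bookkeeping could look like this (my illustration, not the actual code from the project):

```java
import java.util.HashMap;
import java.util.Map;
import javax.swing.JTable;
import javax.swing.table.DefaultTableModel;
import javax.swing.table.TableColumn;

// A removed TableColumn is remembered in a map keyed by its model index,
// so it can be re-added later when the user wants it back
public class HideSketch {
    static final Map<Integer, TableColumn> hiddenColumns = new HashMap<>();

    static void hideColumn(JTable table, int modelIndex) {
        int vIndex = table.convertColumnIndexToView(modelIndex);
        TableColumn column = table.getColumnModel().getColumn(vIndex);
        table.getColumnModel().removeColumn(column);
        hiddenColumns.put(modelIndex, column);
    }

    public static void main(String[] args) {
        JTable table = new JTable(new DefaultTableModel(new Object[]{"A", "B", "C"}, 0));
        hideColumn(table, 1);
        System.out.println(table.getColumnCount());       // 2
        System.out.println(hiddenColumns.containsKey(1)); // true
    }
}
```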

The action listener became a bit large, so I should either refactor it into an inner class or make it smaller. It’s not large enough to justify a class, so I’ll opt for another private method:

This is better.

We got it almost working! One last obvious thing is that we append the column to the end when we show it. This is not good. But where should we put it? The order of columns could have changed since we hid it. There is no perfect solution, so I’ll choose a simple one: put it where it belongs if the columns are not rearranged, otherwise put it somewhere reasonable. With this vague requirement we only need to test that the column reappears in its place if we keep the model order, and that it reappears at all otherwise.

Let’s start with the simple case.

And the fix is:

Now this can easily break if the model index is not a valid view index. How? Well, imagine that we removed some other columns and the column count is now very low. If the removed column was somewhere near the end, its model index may very well be out of bounds now. Let's test it.

The fix:
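A self-contained sketch of what such a clamped fix could look like (again my illustration, not the tutorial's actual code): re-add the column at the end, then move it to its model position unless that position is out of bounds.

```java
import java.util.HashMap;
import java.util.Map;
import javax.swing.JTable;
import javax.swing.table.DefaultTableModel;
import javax.swing.table.TableColumn;
import javax.swing.table.TableColumnModel;

public class ShowColumnSketch {
    static final Map<Integer, TableColumn> hidden = new HashMap<>();

    static void hideColumn(JTable table, int modelIndex) {
        TableColumn column = table.getColumnModel()
                .getColumn(table.convertColumnIndexToView(modelIndex));
        table.getColumnModel().removeColumn(column);
        hidden.put(modelIndex, column);
    }

    static void showColumn(JTable table, int modelIndex) {
        TableColumn column = hidden.remove(modelIndex);
        if (column == null) {
            return;
        }
        TableColumnModel columns = table.getColumnModel();
        columns.addColumn(column);                            // appended at the end
        int last = columns.getColumnCount() - 1;
        columns.moveColumn(last, Math.min(modelIndex, last)); // clamp the target index
    }

    public static void main(String[] args) {
        JTable table = new JTable(new DefaultTableModel(new Object[]{"A", "B", "C", "D"}, 0));
        hideColumn(table, 3); // hide "D"
        hideColumn(table, 1); // hide "B": only "A" and "C" remain
        showColumn(table, 3); // model index 3 is out of bounds now; clamped to the end
        System.out.println(table.getColumnName(2)); // D
    }
}
```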

At this point I felt like this thing is going to work now. So I ran the demo I created last time. But as I tried to hide a column I got this:

The funny thing is, none of our methods appear in this trace! That shows clearly enough that TDD is not a magic silver bullet. Even though our tests led us to expect that the code should work fine at least under normal circumstances, it crashes immediately. Why? After investigating a little, I think it's because of a subtle bug in Swing. When we right-click on a column to show the menu, Swing thinks we're about to drag that column. Indeed, dragging with the right mouse button works, sort of. Sometimes it leaves the column floating in mid-drag. Then, after the column is removed, dragging breaks because the column has no valid view index (hence the index out of bounds exception).

There are different possible workarounds. We could install a custom table header that overrides the getDraggedColumn method and returns null if the column is hidden. But that would prevent users from installing their own header. Of course, we could wrap the existing header in our own. But that would require delegating all of its methods to the wrapped instance, and there are a lot of them.

Another possible way is to consume the right click to prevent it from dragging anything. Alas, the default event handler is first in line. By the time we consume the event, it's too late.

A really silly way is to just set dragged column to null whenever we hide it. It’s so simple and stupid it might actually work. Let’s try it.
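As a sketch of this workaround (illustrative, not the project's exact code): when hiding a column, also clear the header's dragged-column state in case Swing thinks a right-button drag is in progress.

```java
import javax.swing.JTable;
import javax.swing.table.DefaultTableModel;
import javax.swing.table.JTableHeader;
import javax.swing.table.TableColumn;

public class DragWorkaroundSketch {
    static void hideColumn(JTable table, int viewIndex) {
        TableColumn column = table.getColumnModel().getColumn(viewIndex);
        JTableHeader header = table.getTableHeader();
        if (header.getDraggedColumn() == column) {
            header.setDraggedColumn(null); // cancel the bogus drag
        }
        table.getColumnModel().removeColumn(column);
    }

    public static void main(String[] args) {
        JTable table = new JTable(new DefaultTableModel(new Object[]{"A", "B"}, 0));
        JTableHeader header = table.getTableHeader();
        header.setDraggedColumn(table.getColumnModel().getColumn(0)); // simulate mid-drag
        hideColumn(table, 0);
        System.out.println(header.getDraggedColumn()); // null
    }
}
```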

Yay! It works, and with no visible glitches too.

At this point I'd like to conclude this self-educating tutorial. Of course, there are a lot of things still to be done: provide a way to uninstall the selector, check what happens if we install it twice on the same table or try to reinstall it on another, provide an API to show and hide columns from code, prevent the user from hiding the last visible column (or the header will disappear and it will be impossible to get the columns back). And these are just the ones I can think of right off the bat.

For those interested to improve it or study the full history of its evolution, the code is available at

The point at which this tutorial ends is tagged tutorial-pt2.

Hide and show JTable columns

Swing is getting old but is still widely used. To my surprise, it turns out that its JTable doesn't support hiding and showing columns at the user's whim. Well, it's time to fix that!

Note that similar work has already been done, so the purpose of this post is mainly educational. I'm going to do it using TDD and keeping the code as clean as I can. But first, it's time for some design.

What we need is a menu. So it looks reasonable to extend JPopupMenu. However, we need more than that. We also need some boilerplate logic that will bind that menu together with JTable. We can extend JTable to do that, but that doesn’t sound like a good idea because that would prevent anyone with their own derivatives of JTable from using our code.

So it looks like ideally we would like to have a class that we could instantiate and install on a JTable to handle all that logic. Perhaps it will use a separate class for a menu, perhaps some other classes. Let’s not bother with these details for now. Instead, we should think of a name. JTableColumnSelector sounds fine: the J hints that it’s a Swing class, and it openly tells us that it is used to select columns. Maybe it is not very clear that it hides or shows columns, but JTableColumnHiderShower just doesn’t sound right, and besides, shower is something entirely different.

Before I begin, I should mention that for TDD I’m using TestNG, AssertJ and Mockito. That’s my usual set of tools.

The first TDD iteration looks rather stupid: write a single line test with new JTableColumnSelector(), then make it compile by creating an empty class. At this point I’m making an important design decision: by choosing to use a no-args constructor, I’m making life easier for anyone willing to extend my class. Because I’m going to have a separate install method instead of passing a JTable directly to the constructor, it is guaranteed that install will only be called after the object is fully initialized.
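A skeleton reflecting that decision might look like this; the body is just a sketch of the API shape, not the real implementation that grows throughout the tutorial:

```java
import javax.swing.JPopupMenu;
import javax.swing.JTable;

// No-args constructor plus a separate install method: the object is
// guaranteed to be fully initialized before install is ever called
public class JTableColumnSelector {

    public JTableColumnSelector() {
        // nothing to do here: no JTable is needed yet
    }

    public void install(JTable table) {
        // the real class builds the menu items here; attaching an empty
        // popup menu to the header is enough to show the shape of the API
        table.getTableHeader().setComponentPopupMenu(new JPopupMenu());
    }

    public static void main(String[] args) {
        JTable table = new JTable();
        new JTableColumnSelector().install(table);
        System.out.println(table.getTableHeader().getComponentPopupMenu() != null); // true
    }
}
```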

Speaking of install method, we need another test:

What I like about Swing here is that it doesn’t really care that we’re calling its methods from a random testing thread. As long as it’s just one random thread, it runs just fine. What I don’t like about it, though, is that this very same feature makes it fail mysteriously at random moments when you call its methods from a wrong thread, thus violating the rule of repair terribly. Useful for testing, dangerous in production, as it often happens. Ideally, there should be a way to control this behavior, something like -Djavax.swing.allowCallsFromAnyThread.

But let’s get back to TDD. To make the test above pass, we need the appropriate method. And I also correct the constructor javadoc while I’m at it:

Now we need to test that it does what it should do. What should it do? Well, for one thing, it must create a popup menu on the table header, so let’s test it:

It fails. Good! Now let’s fix it:

Now we need to check that the menu contains… what? Obviously, a list of items. There should be as many of them as there are columns in the model. Wait, our table doesn’t have a model yet. So maybe here is where we should start using Mockito:

Here, I set A_REASONABLE_COLUMN_COUNT to 10. The test fails, but isn’t terribly readable, so I’m going to refactor it a bit first.

This looks a bit better. Now we need to make it pass.

OK, what next? I’m worried about two things now. First, the model might be null. Will getColumnCount() properly return zero or will it just throw a NullPointerException? And is it even a good idea to ask the table about column count? Shouldn’t we ask the model instead? What if some columns are already hidden by some other code? Should we display them in our menu? Let’s assume for now that we want to list all model columns. But then the code is wrong and we need a test that shows it.

Another thing I’m worried about is that we incorrectly created JMenuItems, while we should have used JCheckBoxMenuItem or whatever it’s really called. But that should become apparent later, when we start selecting menu items. So let’s deal with column counts now.

It fails. Cool. Let’s fix:

Now we have a real problem if model is null. We need another test for that:

Hmm… It passes! Why? Oh, I forgot that the model can’t be null! The table creates a default empty model for that case. Good. I hate nulls. But then we need to rename our test. installsProperlyWhenTableHasDefaultEmptyModel is a bit too long, but descriptive enough, so I’ll keep it.

Cool. Now let's get back to the install test. We need to check that all menu items have the right labels. But first we need to make our mock return those labels. Unfortunately, it isn't terribly easy to do with Mockito. No, wait, it's actually easy, but not very elegant:

Maybe I should have used a real model instead of the mock. But it doesn't look that bad, so I'll keep it like this for now and only refactor this ugly class into a nested static class.

Now we need a couple of helper methods to extract column names from both the model and the menu. The lists should be the same. I'm feeling functional, so these methods turned out like this:
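In the same spirit, here is what such stream-based helpers might look like; the method names and the demo model are my own illustration:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import javax.swing.JCheckBoxMenuItem;
import javax.swing.JMenuItem;
import javax.swing.JPopupMenu;
import javax.swing.table.DefaultTableModel;
import javax.swing.table.TableModel;

// Collect column names from the model and item labels from the menu,
// so a test can simply compare the two lists
public class NameHelpers {
    static List<String> modelColumnNames(TableModel model) {
        return IntStream.range(0, model.getColumnCount())
                .mapToObj(model::getColumnName)
                .collect(Collectors.toList());
    }

    static List<String> menuItemLabels(JPopupMenu menu) {
        return Arrays.stream(menu.getComponents())
                .map(component -> ((JMenuItem) component).getText())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        TableModel model = new DefaultTableModel(new Object[]{"Name", "Size"}, 0);
        JPopupMenu menu = new JPopupMenu();
        menu.add(new JCheckBoxMenuItem("Name"));
        menu.add(new JCheckBoxMenuItem("Size"));
        System.out.println(modelColumnNames(model).equals(menuItemLabels(menu))); // true
    }
}
```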

And the new test is:

This is appended to the end of the install test, but I'm not repeating everything again and again. The proponents of the one-assert-per-test idiom are probably cursing me now, but I think I'm doing the right thing here: I'm still testing that this thing installs properly. If I need three asserts for that, so be it!

Of course the test fails, and with a clear message too, except that it's too long. So I've changed A_REASONABLE_COLUMN_COUNT to 3. Now we have to fix it; just one line has to be changed:

And now for the last piece of installation. We need to check that all of the menu items are selected. We need another helper method for that. Or maybe I’ll refactor this one:

And the test is now:

Oops! Looks like JMenuItem has an isSelected method too, just like JCheckBoxMenuItem or whatever. Well, for now let's just fix the test:

This goes into the loop of the install method. OK, what about the wrong class? We could just continue and then let it surface later, perhaps during manual testing. But I find it rather silly. Since I’ve noticed it already, why not fix it now? Changing JMenuItem to JCheckBoxMenuItem everywhere in the test seems to do the trick. Now the test fails with a clear message: javax.swing.JMenuItem cannot be cast to javax.swing.JCheckBoxMenuItem. Cool.

That’s it for today, except that I want to see how it looks on the screen, so I create a very simple demo:

Aaaaaand… it works!


Next time we will add some logic to make it really work and do what it’s supposed to.

Getting started with JavaFX 8 custom controls

I need to develop a custom control for JavaFX 8. Unfortunately, most of the tutorials concentrate on the FXML way to do it, but I need to code in some custom painting.

How would I do it in Swing? Extend some base class and override paint. That’s it. In JavaFX, the right way seems to be overriding two classes: the control itself and the skin. OK, this actually looks like a good idea: the control is responsible for behavior, and the skin is responsible for the painting. So let’s look at the skin API:

What? Where is the paint method? According to the docs, getSkinnable() simply returns the associated control, dispose() detaches the skin from the control and getNode() “Gets the Node which represents this Skin”. What the…? So we have one node that is the control itself and another node which is the skin? I hope we don’t need to skin the skin, considering that it’s a kind of node itself!

After looking at some examples, I got the general idea. The skin is just a bunch of nodes, and getNode() just returns the root node. If you want to really customize your painting, you can always use a canvas as a skin. But I decided to try some shape nodes instead.

OK, I can create some shapes, put them into a Group, for example, and then what? The skin obviously needs to handle resizing. But how does it know when to resize exactly? I could just subscribe to the control's width and height properties (and unsubscribe in dispose). That feels ugly, though. Still, Han Solo himself does exactly that, so maybe it's the right way after all?

After trying a lot of various things, I still couldn’t get it right:

  • If I just put my shapes into a Group, the control doesn’t resize properly.
  • If I put my shapes into a Group and inherit from SkinBase instead of implementing Skin, the control does resize, but…
  • All shapes are centered and I can't position them. Looking at the SkinBase sources, it turns out that's hardcoded.
  • If I draw a vertical line of length exactly equal to the control’s height, the control automatically increases its size by one pixel at each repaint. So if I keep resizing it horizontally, for example, it keeps growing vertically forever.

None of that made any sense. After further study of the SkinBase sources, I got the feeling that a skin acts like a layout manager. That is, it's responsible for managing the relative positions of its children. It does so by applying the appropriate transformations, the results of which can be queried by calling getLayoutX() and getLayoutY() on the components.

Another thing is that SkinBase cheats around getChildren() being protected in the Control class. That allows it to directly manipulate the children of the control—no Group needed.

So in the end I concluded that:

  • A skin is best implemented by inheriting SkinBase.
  • To add components, just call getChildren().addAll(children).
  • To position the components needed to draw the skin, override layoutChildren. From it, call layoutInArea for every child that needs to be positioned.
  • All shapes should be drawn in an imaginary coordinate system that is tied to the shape itself. If you need a line, you might as well start it from (0, 0). layoutInArea will move it to the required position anyway, so the lines (0, 0)–(10, 10) and (10, 10)–(20, 20) will look exactly the same in the end.

The resulting control prototype is this:

The resulting graphics:


As you can see, it resizes nicely and the lines are positioned exactly as I want them.

P. S. Further prototyping revealed that it still resizes randomly sometimes, especially as I update values and/or resize a window with lots of controls in it. The reason is that by default, SkinBase calculates its preferred width/height based on the preferred widths/heights of its children. The problem is that the preferred width/height of a primitive equals its actual size (since it's not directly resizable). Therefore, once a control is resized, its preferred size is now different. If it was the same size as the other controls before, not only is that no longer the case, but the preferred sizes now differ, so the layout gives different sizes to different controls. This repeats on each resize, which leads to a funny “rich get richer” scenario where bigger controls are given more and more space because their preferred size is greater. The issue is fixed by overriding computePrefWidth/Height to return something sensible.

Replacing javadoc for a Maven artifact

I was playing around with a library depending on the JMS API. It downloaded geronimo-jms_1.1_spec-1.1.1.jar as a dependency. Unfortunately, this JAR goes with a javadoc JAR that is virtually empty! It is actually there, but I couldn’t find anything useful in it. And NetBeans IDE insists on displaying javadoc from nowhere else.

Turns out it is quite easy to replace this abomination with docs from the Java EE SDK:

  1. Download the Java EE SDK.
  2. Pack the contents of the glassfish4\docs\api dir into a ZIP (the root must contain the package-list file).
  3. Rename it to geronimo-jms_1.1_spec-1.1.1-javadoc.jar.
  4. Move it to %USERPROFILE%\.m2\repository\org\apache\geronimo\specs\geronimo-jms_1.1_spec\1.1.1\.
  5. Nuke the geronimo-jms_1.1_spec-1.1.1-javadoc.jar.sha1 file there, just in case somebody checks the hash and finds out it’s wrong. Or recompute it and edit the file if you feel like it, but it seemed to work fine for me without the file.
  6. Restart the IDE and enjoy the well-written docs.

Of course, this should work just as well with any random JAR file. Maven can re-download the file, of course, but why would it? Unless you move to another version, everything should be fine.