Monthly Archives: January 2016

Closest Binary Search Tree Value II

Continuing on to the next interesting problem that took me a while to solve because I never actually done iterative tree traversals before, although I was aware that there is such a thing and knew the general idea how to do it.

The problem is to find k values in a BST that are closest to the given target. The target is a double, and the nodes are integers so the tree may or may not contain the target itself.

The linear solution is so trivial that I didn’t even try to do it. Indeed, just perform an in-order traversal, keep the last k values in some sort of ring buffer array and terminate when the next value is worse than the worst so far.

What was interesting is how to do that faster. Or not really faster because the test cases seem to be tailored for the linear solution, but still, how to get better time complexity?

The idea is pretty obvious. We need to find the target or some value close to the target (previous or next, doesn’t matter) and then look in the neighborhood for k closest values. But how do we look in the neighborhood. If we do recursive binary search, then we’ll lose all information about where we find it once the recursion returns. The best we can get is a reference to the found node, but no way to get back up. So it looks like we need iterative binary search.

The iterative binary search itself is very, very easy. But we also need some way to keep track of where we are, so we need a stack. The first part would look somewhat like this then:

This locates either the target or the next value before or after the target. So now what? And here is where I got lost. A typical iterative in-order traversal would look like this, if starting from the root:

However, the stack here and the stack I got from the binary search above is not the same stack! Here we save on stack space by only pushing previous elements if we know we’ll have to get back to them eventually. And we know that it’s only when we go left, because when we go right, we won’t have to return to the already processed elements. So when we hit a dead end when going right and pop the next element from the stack, it magically takes us to the parent of the current subtree, not the parent of the current element.

For example, consider the following tree.

tree

If we traverse this tree using the algorithm above, we first push left children into the stack until we hit null. That gives us the following pictures:

tree1tree2

tree3

And then we start popping elements and processing them (the second branch). The first element is a leaf, so it doesn’t have a right child, and therefore on the next iteration we pop another one, thus processing -1, -2:

tree4tree5

The current element (-2) now has a right child, so we go there instead and push it onto stack before going left. However, since there is nowhere to go, we immediately pop it back and process it:

tree6tree7

Now look at this! The current element is -1, but the stack only contains the root! So the next thing we do is pop it and jump all the way back up, just right to the next element in the sequence.

However, when performing a binary search and pushing elements to the stack, we never get a stack that looks like this. In fact, the stack always contains every element in the sequence leading up to the root, node-by-node. So at first glance, the stack I got by performing binary search, was not very suitable for this in-order traversal algorithm. Indeed, if I were looking for, say, -1.5, then I’d end up with something like this:

treebHere the green element is the current element at which the search stops. In fact, I now have the two closest elements to the target: one is at the top of the stack, another one is the leaf I’m at. However, if I was to perform an in-order traversal using the algorithm above, I’d quickly end up in trouble. What would the algorithm do? It would first push -1 to the stack. Then it’d pop and process it. Then it’d try to go right, but there is nowhere to go. So it’d pop another element from the stack. But that’s -2, which doesn’t come in the right order!

Now at this point, what I was supposed to do, and what most people at LeetCode did, is to create two stacks instead of one when performing the binary search. Then the whole thing would look like this:

Now why does this work? It works because the two stacks follow the same rule as the stacks used during the classic iterative traversal: when we do in-order traversal, we only push elements to the stack if we are going left, and that’s exactly what stackGT follows here, so it can be later used to perform an in-order traversal starting at the point where we stopped. The same goes for stackLE, but for reverse in-order.

But as I’ve said, I didn’t quite get it at the time, so I invented a rather unorthodox approach. I noticed that even though the stack I got was unsuitable for the typical space-saving concise iterative algorithm, it was just the type of the stack used for recursive solutions! Indeed, recursion always unavoidably stores everything in the stack simply because there’s no way around that. Consider a typical recursive algorithm:

How does it work for the tree above? It pushes 0 first. Then it goes left. There it pushes -2. Goes left again. Pushes -3. There is no left, so it processes -3 and tries to go right. But there is no right, so it pops -3 and goes up. So far it’s no different from the iterative approach.

But then things change. When it returns to -2, it processes it without actually popping it and then pushes -1. So at that point we end up with exactly the same stack as in the binary search algorithm. Why is that? How is that it works and the iterative algorithm doesn’t?

The answer is that the iterative algorithm lacks one thing: the return address. Remember that the computer doesn’t only push function arguments to the stack. It also pushes the return address, so when it pops the stack, it instantly knows what to do next: process the current value (if returning from the left subtree) or exit (if returning from the right one).

So it looks like I could add some imaginary return address to the stack in order to fully emulate the recursive algorithm. But I could do even better than that. When I pop the stack, I can compare left and right references of the popped node to the current node. If the left reference equals to the current one, that means I am returning from the left subtree, otherwise I am returning from the right one.

Note that I still need to perform two traversals: in-order and reverse in-order to locate the closest elements. So even if I have just one stack during the binary search, I need to make a copy of it to proceed further. The resulting solution was this:

It isn’t as beautiful as the “right”, but has the same time complexity and is a fun one. Runs in the same time (6 ms). I have no idea why most solutions only run for 5 ms, but then again it’s probably within the margin of error.

Either way, now I’m more than familiar with both recursive and iterative traversal algorithms and can even emulate one through the other, so that’s one step further towards becoming a better programmer.