Lets assume we have a set of points and we want to find all of the points that are contained in a query rectangle. And we also want to find the closest point to a query point. These are called "range search" and "nearest-neighbour search". I'll show the usage of a 2d-tree for this problem.
Kd-trees have many applications including computer animation and image retrieval.
Lets start with a brute-force approach and look at all the points in the input without any efficient data structure for the specific problem:
public class PointSET {
private TreeSet set = new TreeSet<>();
public void insert(Point2D point) {
if (point == null) {
throw new IllegalArgumentException();
}
set.add(point);
}
public boolean contains(Point2D point) {
if (point == null) {
throw new IllegalArgumentException();
}
return set.contains(point);
}
public Iterable range(RectHV rect) {
if (rect == null) {
throw new IllegalArgumentException();
}
List result = new ArrayList<>();
Iterator iterator = set.iterator();
while (iterator.hasNext()) {
Point2D current = iterator.next();
if (rect.contains(current)) {
result.add(current);
}
}
return result;
}
//..
}
Since we use a search tree, insertion and contains operation has logarithmic complexity. This is good but what about the "range" operation? It's comlexity is O(n) and we need to use a faster way.
I'll show the usage of
2d-tree which is an extension of a binary search tree. It helps us to perform efficient operations on two dimensional keys.
public class KdTree {
private Node root;
private class Node {
private Point2D point;
private Node left;
private Node right;
private boolean horizontal;
public Node(Point2D point2D, boolean horizontal) {
this.point = point2D;
this.horizontal = horizontal;
}
}
public boolean contains(Point2D point) {
if (point == null) {
throw new IllegalArgumentException();
}
Node current = root;
while (current != null) {
if (current.point.equals(point)) {
return true;
}
if (current.horizontal) {
if (point.y() >= current.point.y()) {
current = current.right;
}
else if (point.y() < current.point.y()) {
current = current.left;
}
}
else { // vertical cut
if (point.x() >= current.point.x()) {
current = current.right;
}
else if (point.x() < current.point.x()) {
current = current.left;
}
}
}
return false;
}
As we see, each node in the tree cuts the plane in two halves. But if a parent key cuts horizontally, the child node cuts the plane vertically. This is a very nice storage of the points in the tree structure. The contains operation gives an idea of the organization of the nodes in the tree. Lets look at the insert operation:
public void insert(Point2D point) {
if (point == null) {
throw new IllegalArgumentException();
}
if (root == null) {
Node node = new Node(point, false, null);
++size;
root = node;
return;
}
doInsert(point, root);
}
private void doInsert(Point2D point, Node current) {
if (point.equals(current.point)) {
return;
}
if (current.horizontal) {
if (point.y() >= current.point.y() && current.right == null) {
Node newNode = new Node(point, false, current);
current.right = newNode;
++size;
return;
}
if (point.y() < current.point.y() && current.left == null) {
Node newNode = new Node(point, false, current);
current.left = newNode;
++size;
return;
}
if (point.y() >= current.point.y()) {
doInsert(point, current.right);
}
else {
doInsert(point, current.left);
}
}
else {
if (point.x() >= current.point.x() && current.right == null) {
Node newNode = new Node(point, true, current);
current.right = newNode;
++size;
return;
}
if (point.x() < current.point.x() && current.left == null) {
Node newNode = new Node(point, true, current);
current.left = newNode;
++size;
return;
}
if (point.x() >= current.point.x()) {
doInsert(point, current.right);
}
else {
doInsert(point, current.left);
}
}
}
If the current node cuts the plane horizontally, we look at the y coordinate and call the recursive method with the right or the left node. Now lets have a look at the range operation. Following image shows the tree structure and the query rectangle:
We start with looking at the point 1 and see that the rectangle is on the left. Then we look at the point 3 and since it cuts the plane horizontally we see that the rectange intersects the up and down of this point. We need to look at both sides. We look at points 4 and 6, then the point 5. When we look at point 4, the rectangle is on the left so we look at point 5.
It is obvious that we haven't consider the right side of the point 1, because the rectange cannot be there. But in the brute force approach, we were looking at all the points!
public Iterable range(RectHV rect) {
if (rect == null) {
throw new IllegalArgumentException();
}
rangePoints = new ArrayList<>();
doFindRange(rect, root);
return rangePoints;
}
private void doFindRange(RectHV rect, Node current) {
if (current == null) {
return;
}
if (rect.contains(current.point)) {
rangePoints.add(current.point);
}
if (current.horizontal) {
// between
if (current.point.y() >= rect.ymin() && current.point.y() <= rect.ymax()) {
doFindRange(rect, current.left);
doFindRange(rect, current.right);
}
else if (rect.ymin() > current.point.y() && rect.ymax() > current.point.y()) {
doFindRange(rect, current.right);
}
else {
doFindRange(rect, current.left);
}
}
else {
// between
if (current.point.x() >= rect.xmin() && current.point.x() <= rect.xmax()) {
doFindRange(rect, current.left);
doFindRange(rect, current.right);
}
else if (rect.xmin() > current.point.x() && rect.xmax() > current.point.x()) {
doFindRange(rect, current.right);
}
else {
doFindRange(rect, current.left);
}
}
}
Finding the nearest point is similar but sometimes we need to look at both left and right subtrees. If we are getting closer, there is no need to look at the other subtree. If we don't utilize pruning, there is no point in using a tree because then we'll be looking at all the points like the brute-force approach. Below are two example unit tests:
@Test
public void shouldFindPointsInRange() {
KdTree kdTree = new KdTree();
kdTree.insert(new Point2D(0.3,0.3));
kdTree.insert(new Point2D(0.1,0.6));
kdTree.insert(new Point2D(0.2,0.8));
kdTree.insert(new Point2D(0.7,0.3));
kdTree.insert(new Point2D(0.8,0.5));
kdTree.insert(new Point2D(0.31,0.4));
RectHV rectHV = new RectHV(0.4,0.4, 1, 1);
Iterable result = kdTree.range(rectHV);
assertEquals(new Point2D(0.8, 0.5), result.iterator().next());
}
@Test
public void shouldCalculateNearestPoint() {
KdTree kdTree = new KdTree();
kdTree.insert(new Point2D(0.7, 0.2));
kdTree.insert(new Point2D(0.5, 0.4));
kdTree.insert(new Point2D(0.2, 0.3));
kdTree.insert(new Point2D(0.4, 0.7));
kdTree.insert(new Point2D(0.9, 0.6));
Point2D result = kdTree.nearest(new Point2D(0.35, 0.87));
assertEquals(new Point2D(0.4, 0.7), result);
}
SUMMARY:
Binary search trees help us to organize data for efficient search operations. But a general purpose binary tree may not be very heplful in some situations. Sometimes we need special binary search trees that is organized for our special queries. We can organize 2d points in a 2d-tree and perform efficient range search and nearest point search operations.
Thanks to Princeton University, Robert Sedgewick and Kevin Wayne for the wonderful programming questions.
Comments