SR Sofware Overview

Acquire frames

Several low resolution (LR) frames are needed to reconstruct the (HR) high resolution frame. There are two main scenarios in which SuperResolution (SR) is commonly applied:

  1. A moving foreground object is tracked and an SR representation of the object is obtained.
  2. The camera view changes while photographing frames of a static scene. The consecutive frames are then registered and an SR representation of the scene is obtained.

I will be looking in more detail at the second scenario.

Register frames

The relative translations/rotations (or, if we want to be more general, the attributes of a certain geometric transformation) between the LR frames must be estimated (TODO: give more information on this H-matrix).

I hoped to do this using the KanadeLucasTomassi (KLT) tracker, applying an outlier-rejection algorithm to remove feature points that do not correspond well to the motion model.

It turns out that, for a process as sensitive as SR, the KLT is simply not accurate enough. I am therefore very interested to find a robust SubPixelRegistration algorithm that can handle translation as well as rotation.

Several routines have been implemented, including the one suggested by Zokai and Wolberg. Their algorithm makes use of the LogPolarTransform, which provides registration parameters which seed an interative Levenberg-Marquardt optimisation routine. The LogPolarTransform unfortunately is too computationally intensive to my liking, so I intend to use three methods of increasingly accurate registration, as described in FusionRegistration.

Translation-Rotation warp model

Assume that the coordinate translation consists of a translation and a rotation, i.e.:

`[[x'],[y'],[1]] = [[cos(theta),-sin(theta),t_x],[sin(theta),cos(theta),t_y],[0,0,1]]` `[[x],[y],[1]]`

After tracking, we have two sets of coordinates: `(x_i,y_i)` and `(x'_i,y'_i)` for `i = 0 ... N`.

We can rewrite the above matrix representation as a set of equations of the form

`a cos(theta) + b sin(theta) + c t_x + d t_y = q`

where `c` and `d` are alternately 0 or 1, selecting the appropriate translation `t_x` or `t_y`, depending on whether `q` represents an `x` or a `y` value.

These equations can be solved for the parameter matrix

`bb{p} = [[t_x],[t_y],[theta]]`,

using a linear least-squares method.

Is there a way to formulate this problem so that the solution is non-iterative?
Other methods (?) mentioned in article on registration

Solve the SuperResolution problem

Least-squares Super-Resolution

The classic method, as described in the literature.

Knowing the parameters of the transformation, the warp matrix can be estimated. I do not implement this method using matrices, though — uses too much memory, so I rather work on a more intuitive, per-pixel basis.

Results look good, since the noise is averaged out, but is slightly blurred.

Geometric Super-Resolution

Results look very similar to those of the least-squares method, but are slightly sharper (execution time unfortunately much longer). I should generate some data with known transformations, and use RMS-errors to quantify this observation "sharpness".


While doing these experiments, it became clear that the real problem with SuperResolution is that it is very difficult to do sub-pixel accurate registration. I will be focusing my efforts on finding such a method, that can provide me with a highly accurate estimate of affine transformations.

David Lowe has a brilliant keypoint algorithm, but it has a patent pending. I would therefore prefer to find my own solution to this (much more specific) problem.

1. Previously I referred to the external LevMarLibrary, but it proved too complex to use for simple function fitting. I then started using the widely available GNU Scientific Library (GSL) instead, but since I'm doing everything using Scipy these days, I don't have need for that either.

sub-sahara africa
sahara trip IanKnot rainbow