Image stitching

As I said in the previous post, the first assignment in that computer vision course I took was to write an image stitching program. The basic idea is to take a series of pictures by rotating around the point where the camera is located. Then you find how each “maps” to the others and “stitch” them into a single coherent panorama.

At first I thought this couldn’t make sense for a first foray into CV algorithms and/or would take a gargantuan amount of time. Turns out only the second part was right, but mostly because I had to learn to use Mathematica and program in OpenCV along the way. And the hardest part was already done for us by either of these tools, anyway: finding correspondences between pictures.

The basic principle, as I said, is to find corresponding points between pictures. To do that you need to identify “keypoints”: points distinctive enough that they can be found again in another picture, even if they look slightly different there. There’s a whole world of research on how to do that, but in our case it boiled down to using either SIFT or SURF features in Mathematica and OpenCV. Then some “distance” is used to find points that match between images. It measures how much the regions around two candidate keypoints differ.

Once you’ve got these matches between pictures, and supposing for the moment they’re all correct, you can find a transform matrix H (a homography), provided the pictures were taken “right”. If x is the coordinate of a point in the first image, written in homogeneous coordinates, then Hx gives the coordinate of the same point in the second image, up to a scale factor.
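To make the Hx notation concrete, here is a minimal Python sketch of applying a 3×3 homography to a pixel coordinate. The function name `apply_homography` and the example matrix `H_shift` are illustrative assumptions, not taken from the assignment code (which was written in C with OpenCV).

```python
def apply_homography(H, point):
    """Map (x, y) through the 3x3 homography H (a list of 3 rows)."""
    x, y = point
    # Lift to homogeneous coordinates (x, y, 1) and multiply by H.
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by w to get back to ordinary pixel coordinates.
    return (u / w, v / w)


# Hypothetical example: a pure translation by (100, 20) pixels.
H_shift = [[1.0, 0.0, 100.0],
           [0.0, 1.0, 20.0],
           [0.0, 0.0, 1.0]]
```

The division by w is what makes H meaningful only up to scale: multiplying every entry of H by the same non-zero number gives the same mapping.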

So, skipping some details about weeding out bad matches, what I did was find a homography between pairs of consecutive images, and bring everything back into the coordinates of the central image. Starting from these, say:

I get this:

Cropping to keep only the central part:

Now I will concede my C program probably doesn’t have all the gizmos of commercial options (notably, if you look at the full size version above, you’ll see problems stemming from auto lighting on the camera), but hey, for a first assignment I was pretty impressed with the result. (And it turns out it was actually a pretty good first assignment, as those transformations and keypoint finding operations were fundamental for the rest of the course.)
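For the curious, bringing everything back into one image’s coordinates amounts to multiplying the pairwise homographies together. The sketch below is only an illustration under simplifying assumptions: `pairwise[i]` is assumed to map image i into image i+1, and the last image plays the role of the reference. Images on the other side of the central image would need the inverse homographies, which are left out here. The names `matmul3` and `chain_to_reference` are made up for this example.

```python
def matmul3(A, B):
    """Product of two 3x3 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def chain_to_reference(pairwise):
    """Return one homography per image, mapping it into the last image.

    pairwise[i] is assumed to map image i into image i + 1.
    """
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    to_ref = [identity]  # the reference image maps onto itself
    # Walk backwards: H(i -> ref) = H(i+1 -> ref) * H(i -> i+1).
    for H in reversed(pairwise):
        to_ref.insert(0, matmul3(to_ref[0], H))
    return to_ref
```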

(As a last technical point: to filter out bad matches, we used RANSAC and other constraints on distances (in my case the first match must be much better than the second one).)
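The “first match must be much better than the second one” constraint can be sketched in a few lines of Python. This is a hedged illustration, not the assignment’s code: descriptors are assumed to be plain sequences of numbers compared with Euclidean distance, and the `max_ratio=0.7` threshold is an arbitrary value chosen for the example. RANSAC is not shown.

```python
import math


def ratio_test_matches(desc_a, desc_b, max_ratio=0.7):
    """Return (i, j) index pairs where desc_a[i] clearly matches desc_b[j]."""
    matches = []
    for i, da in enumerate(desc_a):
        # Distance from this descriptor to every descriptor in the other image.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # a runner-up is needed for the comparison
        (best, j), (second, _) = dists[0], dists[1]
        # Keep the match only if the best is much closer than the second best.
        if best < max_ratio * second:
            matches.append((i, j))
    return matches
```

A keypoint whose two closest candidates are almost equally good is ambiguous, so it is dropped rather than risk a wrong correspondence.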

Computer vision project: overlaying 3D reconstruction from webcam on the original scene

I’ve been taking a few graduate courses at Université de Montréal in the past two years, but the last one, the computer vision course, was by far the one with the most “showable” (mighty Google says that’s a word) projects. I’ll be posting later on the first project, a basic image stitching app. For now, I’ll just leave this here:

It’s an example result from my term project. In a nutshell, skipping the technical:

  • I take a couple of pictures with a camera for which I know some parameters, notably focal length;
  • In the scene there’s this augmented reality tag which helps me find out where the camera is in each pose;
  • I find corresponding points between images to create a (sparse) 3D point cloud of the object;
  • I then use MeshLab to get a mesh (I only have points, I need triangles) and then find some “good enough” textures for the triangles;
  • I then use the 3D model that comes out of all this and overlay it on the original scene, so the model ends up being … well where it should be in the first place.

I wrote this using ARToolkit+ and OpenCV. Reprojecting onto the original image is done with OpenGL.
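In the project OpenGL takes care of the reprojection, but the underlying idea is the pinhole camera model. As a rough Python sketch, assuming the 3D point is already expressed in camera coordinates, that the focal length is given in pixels, and ignoring lens distortion (the name `project_point` and its parameters are invented for this example):

```python
def project_point(point_cam, focal_px, center_px):
    """Project a 3D point (camera coordinates) to pixel coordinates."""
    X, Y, Z = point_cam
    if Z <= 0:
        raise ValueError("point is behind the camera")
    cx, cy = center_px
    # Perspective division, then scale by focal length and shift to the
    # image center (the principal point).
    return (focal_px * X / Z + cx, focal_px * Y / Z + cy)
```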

Some technical details for those who are into this kind of thing:

  • I pre-calibrate the camera model with OpenCV calibration routines and a “chessboard”;
  • Getting the camera position from the detected tag is simply decomposing a homography between the tag plane and its image, which happened to be something we did for another piece of coursework;
  • I find corresponding points with SURF, and filter correspondences with epipolar line constraints found from camera poses, among other filters;
  • I first find correspondences between image pairs, and then merge the correspondences from all pairs using an algorithm I cooked up;
  • Finding a texture for each triangle involves finding a “good” image to project the triangle onto, then building a huge map of the textures for OpenGL from the extracted bits of the original images.
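The epipolar line filter mentioned in the list can be illustrated with a small Python sketch. It assumes a fundamental matrix F relating the two images is already available (in the project it would come from the camera poses and calibration, which is not shown), and the 2-pixel threshold is an arbitrary example value. The function names are hypothetical.

```python
import math


def epipolar_distance(F, p1, p2):
    """Distance in pixels from p2 to the epipolar line of p1 (line = F * p1)."""
    x, y = p1
    # Line coefficients (a, b, c) such that a*u + b*v + c = 0 in image 2.
    a = F[0][0] * x + F[0][1] * y + F[0][2]
    b = F[1][0] * x + F[1][1] * y + F[1][2]
    c = F[2][0] * x + F[2][1] * y + F[2][2]
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("degenerate epipolar line")
    u, v = p2
    return abs(a * u + b * v + c) / norm


def filter_by_epipolar(F, matches, max_dist=2.0):
    """Keep (p1, p2) pairs whose p2 lies within max_dist pixels of its line."""
    return [(p1, p2) for p1, p2 in matches
            if epipolar_distance(F, p1, p2) <= max_dist]
```

A correct match must lie on (or, with noise, near) the epipolar line of its partner, so anything far from the line can be discarded before triangulating.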

I know the tracking’s pretty bad (I had to hack ARToolkit+ which would return false positives at times) and the model is far from the reconstruction quality you’d get with less sparse techniques (another student in the course tackled reconstruction with structured light, for example, and had ~2 million points… I get ~2000), but hey, it (kinda) works and uses only a webcam (for the example shown above).

Maybe at some point I’ll get around to making the tracking smoother, at least. And to posting a more detailed demo of the app itself. It shows a bunch of fun stuff about the reconstruction process, like epipolar lines and camera positions for each pose.

Small updates to Javascript speed reading app

Just a note concerning two small new features I added to my Javascript speed reading (RSVP) app:

  • You can now change the speed using your keyboard’s up/down arrow keys
  • Text and background colors may now be selected using a color picker (based on JsColor)

These were features some users asked for either on the blog or in the comment form in the app. Thanks for the feedback!

Simple Javascript memory game

Here’s a little memory game I just finished, using jQuery. It’s very bare bones, and I might add features to it, but it works, doesn’t have a bunch of ads floating around (like most do on the Web), and the board size can be changed (up to 60 total cards for the moment).

For the context: when we launched Clusterify, one of the early projects I proposed was a simple “matching pairs” game. Some almost-complete code I wrote up has been sitting on my computer ever since, just needing a few last fixes, and the addition of actual pictures. So I did those last fixes, adapted stock photos for it, and now here’s the game.

Changelog

  • 2010.02.22: as per a commenter’s (Jebadiah’s) suggestion, added a score and a timer. Also, images are now shuffled so the last ones (cats and birds) show up in the smaller grid.

Plyn: Yet another textfile-and-scripts based ToDo system

Warning: this post is

  1. out of the normal scope of this blog (it’s about personal _information_ management, not personal _knowledge_ management).
  2. mostly for geeks/programmers who will never be fully satisfied by any planning system, ever.

Over the years I’ve tried different ways of handling my ToDo, planning and work logging. This is my Xth iteration. I wonder if anyone but a programmer could use this, but hey, programmers are a non-negligible fraction of society (which I happen to be part of)!

I’ve long wanted to create a program which would do precisely what I want in terms of planning, but was always put off by the “*knocks head on wall* the GUI is so long to code!” aspect. Well to hell with the GUI! Let’s deal with raw information, rarrr.

Err, sooo… Plyn (ie. this system) is inspired by the todo.txt scripts of Gina Trapani (LifeHacker author). Basically it allows you to have a very simple yet powerful todo.txt file, and the file is meant to be read directly (in contrast to other programs which use databases only the program can read). The difference is that my version:

  1. is written in Python (in contrast to Bash for Gina’s todo.txt)
  2. allows for hierarchy, empty lines, comment lines, etc. in the file, so the file can really be structured and read by itself, and a good deal of everyday tasks can be done without ever using the scripts
  3. includes a work log aspect, ie. you can record how much time you spent on tasks to keep stats.
  4. includes time estimates, but for the moment it’s not very developed.

So I’d say it gives up some of todo.txt’s simplicity in order to be expandable and to support other dimensions of planning and logging.

Google Code link for the project & code: http://plyn.googlecode.com

The todo.txt format is pretty simple. Here’s an example of content:

12 Elephant in refrigerator project ||| Yeah, I shouldn't try myself at humor.
	# Open refrigerator door
	# Put elephant in refrigerator
	# Close refrigerator door

	-- This line is just a comment

A few observations:

  • You see a task may be nested in another one (which you can see as a project), simply using tabs.
  • Each line begins either with an ID (number) or with #. The # is replaced by a proper ID by cleanchanges.py (more on this later).
  • The ID is followed by a title, then |||, which indicates the start of parameters/comments.
  • You can have blank lines, and comment lines (starting with --).
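To illustrate these rules, here is a small Python sketch that classifies one line of such a file. It is not Plyn’s actual code: the function name `classify_line` and the kind labels it returns are invented for this example, and it assumes one tab per nesting level.

```python
def classify_line(line):
    """Return (depth, kind, text) for one line of a todo.txt-style file."""
    stripped = line.lstrip("\t")
    depth = len(line) - len(stripped)  # one tab per nesting level
    text = stripped.strip()
    if not text:
        return (depth, "blank", "")
    if text.startswith("--"):
        return (depth, "comment", text[2:].strip())
    first = text.split(None, 1)[0]
    if first == "#":
        # New item that has not been given a numeric ID yet.
        return (depth, "new_task", text[1:].strip())
    if first.isdigit():
        return (depth, "task", text)
    return (depth, "unknown", text)
```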

After the ‘|||’ characters, you can place different parameters. In more detail, the format of a line is:

(INDENT) ID TITLE ||| {PRIORITY} <MINUTES_DONE/MINUTES_TODO> [START_DATETIME-END_DATETIME] COMMENTS

As you can see, many more options may be specified (see the “format.txt” file for detailed information about each of these parameters), and of course this can be expanded (it all relies on a huge regex). But everything following ||| is optional.
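To give an idea of what such a regex could look like, here is a simplified Python sketch. It is not the regex Plyn actually uses, and it makes several assumptions that “format.txt” may contradict: the priority is any text between braces, the minutes are plain integers, and the start and end dates contain no “-” of their own. The names `LINE_RE` and `parse_line` are made up for the example.

```python
import re

LINE_RE = re.compile(r"""
    ^(?P<indent>\t*)                      # nesting level, one tab each
    (?P<id>\d+|\#)\s+                     # numeric ID, or a lone hash for a new item
    (?P<title>.*?)                        # title, up to the separator
    (?:\s*\|\|\|\s*                       # the separator; all that follows is optional
        (?:\{(?P<priority>[^}]*)\}\s*)?
        (?:<(?P<done>\d*)/(?P<todo>\d*)>\s*)?
        (?:\[(?P<start>[^\]-]*)-(?P<end>[^\]]*)\]\s*)?
        (?P<comments>.*)
    )?$
""", re.VERBOSE)


def parse_line(line):
    """Return a dict of fields, or None if the line is not a task line."""
    match = LINE_RE.match(line.rstrip("\n"))
    return match.groupdict() if match else None
```

Fields that are absent from a line come back as None, so a caller can tell “no priority given” apart from an empty one.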

So you can edit the todo.txt file manually, but there are, of course, helper scripts to automate certain tasks. The one you’d use the most is today.py. It gives you a list of all high-priority tasks, late/coming up tasks, and tasks awaiting feedback (“+feedback” tag in title). By editing the script you could add whatever other list you need.

You can also, of course, filter tasks by text using grep. So you could have tags or contexts, for example, if you’re into GTD.

The cleanchanges.py script will replace the # at the beginning of the line with an ID which can then be used to refer to the ToDo item in other scripts. cleanchanges.py will also transform dates, so you can write:

-- Today is 2009/03/14
# Clean refrigerator ||| [-+15]

and the item will be changed into

15 Clean refrigerator ||| [-2009/03/29]

ie. the date can be specified as the number of days in the future, which saves finger mana.
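As an illustration of the two transformations, here is a minimal Python sketch that reproduces the example above. It is not the real cleanchanges.py: how the script picks the next free ID and how it learns today’s date are not covered, so both are simply passed in as arguments, and the function names are hypothetical.

```python
import re
from datetime import date, timedelta


def assign_id(line, next_id):
    """Replace a leading # (after any tabs) with the given numeric ID."""
    return re.sub(r"^(\t*)#", lambda m: m.group(1) + str(next_id), line, count=1)


def expand_relative_dates(line, today):
    """Replace +N inside [...] with the date N days after `today`."""
    def to_date(m):
        target = today + timedelta(days=int(m.group(1)))
        return target.strftime("%Y/%m/%d")

    def expand_bracket(m):
        # Only touch +N occurrences that sit inside a bracketed date field.
        return re.sub(r"\+(\d+)", to_date, m.group(0))

    return re.sub(r"\[[^\]]*\]", expand_bracket, line)


line = "# Clean refrigerator ||| [-+15]"
cleaned = expand_relative_dates(assign_id(line, 15), date(2009, 3, 14))
print(cleaned)
```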

The work log is also simple. To say you’ve just spent the last 3 hours cleaning the refrigerator, you would do:

./log.py 15 180 "Some comment to add to the log"

(where 15 is the task ID and 180 is 3 hours expressed in minutes). This will add a line to the log.txt file, and will change the MINUTES_DONE field of the item in todo.txt.
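The todo.txt half of that operation could look roughly like the Python sketch below. This is a guess at the behaviour, not log.py itself: it only handles items that already carry a <MINUTES_DONE/MINUTES_TODO> field, it assumes the title contains no “<…/…>” text of its own, and it leaves out the log.txt line, whose format is not described here. The name `add_minutes` is invented.

```python
import re


def add_minutes(todo_line, minutes):
    """Add `minutes` to the MINUTES_DONE part of a <done/todo> field."""
    def bump(m):
        # An empty MINUTES_DONE is treated as zero.
        done = int(m.group(1) or 0) + minutes
        return "<%d/%s>" % (done, m.group(2))

    new_line, count = re.subn(r"<(\d*)/(\d*)>", bump, todo_line, count=1)
    if count == 0:
        raise ValueError("line has no <done/todo> field")
    return new_line
```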

Scripts are meant to be called from a command line you keep open somewhere in the scripts directory, so you can use autocompletion. Paths for todo.txt and other files are configured in cfg.py.

And, of course, the whole thing can be extended as you please. My ultimate goal is to have a script with which I can truly estimate the free time I have, ie. to determine if I can engage in a new task or not.

If anyone ever uses this, be sure to let me know! I’m especially interested in hearing of other must-script-the-procrastination-away coders who expand this thing in whichever direction their urges take them.