Weekly log: December 12, 2009
This is a “weekly report” for the lab I study in, mostly intended for other lab members. See the first one for further explanations.
Lab-related projects
- As I had written I might do in an earlier post, I reimplemented the convolutional net for COIL-100 based on Torch, by “filling the easy parts/blanks” in code Hossein Mobahi had sent me. I learned some Lua along the way. I’m quite surprised (but then should I be?) to see that it seems to work for 30 objects: it reached ~30% before I killed it to try on 100 objects. ~9 hours later, tho, error is still around 100%… I’ll need to relaunch that test for a longer period.
- Last week, I implemented a very (very) lilliputian version of the similarity-based cost, with a custom one-neuron-followed-by-tanh “net”. Following Hossein’s suggestion, I had used two classes based on overlapping Gaussians. I first used only two “incorrect” (off-center and very misleading) points for supervised training, then “corrected” using the similarity-based cost. It indeed reduced the error by ~3% (35% vs 38%, mean over many, many tests).
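For readers who want something concrete, here is a minimal sketch of what a one-neuron-plus-tanh “net” with a pairwise similarity cost could look like. The post does not give the exact form of the cost, so this assumes a contrastive form (L1 distance for similar pairs, a hinge with a margin for dissimilar ones); the names `forward`, `similarity_cost`, `sgd_step` and the `margin` and `lr` parameters are hypothetical, and how the pairs get picked is left to the caller.

```python
import numpy as np

def forward(w, b, x):
    """One linear neuron followed by tanh; x has shape (n, d)."""
    return np.tanh(x @ w + b)

def similarity_cost(z1, z2, similar, margin=1.0):
    """ASSUMED cost: pull similar pairs together (L1 distance), push
    dissimilar pairs apart until they are at least `margin` apart."""
    dist = np.abs(z1 - z2)
    return np.where(similar, dist, np.maximum(0.0, margin - dist))

def sgd_step(w, b, x1, x2, similar, lr=0.1, margin=1.0):
    """One (sub)gradient step on the mean similarity cost of a batch of pairs."""
    z1, z2 = forward(w, b, x1), forward(w, b, x2)
    diff = z1 - z2
    # Dissimilar pairs only contribute while they are closer than the margin.
    active = similar | (np.abs(diff) < margin)
    # Subgradient w.r.t. z1; the one w.r.t. z2 is its negative.
    g = np.where(similar, 1.0, -1.0) * np.sign(diff) * active
    # Chain rule through tanh: dz/da = 1 - z**2.
    g1 = g * (1 - z1 ** 2)
    g2 = -g * (1 - z2 ** 2)
    grad_w = (g1[:, None] * x1 + g2[:, None] * x2).mean(axis=0)
    grad_b = (g1 + g2).mean()
    return w - lr * grad_w, b - lr * grad_b
```

In the toy setup, `w` and `b` would first come from supervised training on the two labeled points, and `sgd_step` would then be applied repeatedly to unlabeled pairs.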
Misc
- I went to a Montreal Python meeting on Wednesday, and someone named Jeremy Barnes happened to be (very quickly) presenting a library he’s coding for machine learning tasks, which is using a Python-C bridge (boost::python).
Readings
I’m trying to read more on recurrent networks. To start off, I’ve finished reading:
- D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal representations by error propagation,” in Parallel Distributed Processing, D. E. Rumelhart and J. L. McClelland, Eds. Cambridge MA: MIT Press, Bradford Books, vol. 1, 1986, pp. 318-362
and I’ve read about the first half of this paper:
- J. L. Elman, “Finding structure in time,” Cognitive Science, vol. 14, pp. 179-211, 1990
For a class project, I’ve read a bit of:
- P. Simard, D. Steinkraus, J. C. Platt, “Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis”. ICDAR 2003: 958-962
Course work
- I’ve implemented the elastic distortion method proposed in Simard et al. 2003 (see above) for the 4th assignment for the ML class. Using their settings (800 hidden units, about the same learning rate schedule), I seem to obtain ~1.1% validation error on MNIST. I still haven’t reached the minimum, so maybe I’ll let it continue (it requires very little effort on my part, at this point!).
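A rough sketch of the elastic distortion idea from Simard et al. 2003, for the curious: draw a random displacement for each pixel, smooth the displacement fields with a Gaussian, scale them, and resample the image with bilinear interpolation. This uses NumPy and SciPy rather than the code written for the assignment; the function name `elastic_distort` and the default `alpha` and `sigma` values are illustrative placeholders, not necessarily the settings used for the result above.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, map_coordinates

def elastic_distort(image, alpha=34.0, sigma=4.0, rng=None):
    """Return an elastically distorted copy of a 2-D float image.

    alpha scales the displacement field (in pixels) and sigma is the
    standard deviation of the Gaussian used to smooth it; both defaults
    are placeholders chosen for illustration.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = image.shape
    # Random per-pixel displacements in [-1, 1], smoothed, then scaled.
    dx = gaussian_filter(rng.uniform(-1, 1, size=(h, w)), sigma, mode="constant") * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, size=(h, w)), sigma, mode="constant") * alpha
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    coords = np.array([rows + dy, cols + dx])
    # order=1 is bilinear interpolation; samples outside the image read as 0.
    return map_coordinates(image, coords, order=1, mode="constant", cval=0.0)
```

Calling it with a fresh random field for every training example at every epoch gives an effectively unlimited stream of distorted digits.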
Plan for next few weeks
- I need to do (none of it done yet) the last assignment for my NLP class, which involves classifying documents according to author (should be short, which is the intention of the teacher).
- I’ll continue reading on recurrent nets for a while, when I have time.
- I need to relaunch the Torch-based test for 100 objects, make sure what I’m doing is all right, and if it indeed works, figure out what’s the difference with my Python/PyLearn implementation.

Hossein Mobahi:
Glad to hear that the toy problem was helpful. I did not get what you mean by “it seems to work for 30 objects: it reached ~30% before I killed it to try on 100 objects”. So for 30 objects, what does 30% mean exactly?
16 December 2009, 10:44 pm
Francois:
Hi again. By 30%, I meant 30% error on the test set (i.e. all images). When I say I use 30 objects, I mean I use 30 objects (4 angles) for supervised training and all the angles of the very same objects for the similarity-based cost (vs using other objects, as in videoCNN V:COIL“70”).
Of course 30% is not very impressive (with Standard CNN you (and I, IIRC) could get ~15%), but the behavior in training was still much better than anything I had when using the similarity-based cost with my Python version. It gave me hope it could work with 100 objects, so I stopped that test and tried on 100, but maybe I didn’t run it for long enough. Also, I still need to integrate the normalizing of inputs as you suggested in your mail.
I’ve been rather busy with other stuff (got a final today, etc.) for the last few days, but that’s what I’ll try next when I get some time.
17 December 2009, 8:38 am