Archive for October 2008

Cleaning up web pages with Aardvark Firefox extension

Browsing the web, we see tons of different layouts: each site has its own. Though that makes for a more diverse experience, it’s not the best when you want to sit down and take the time to read a long article.

Those who use Firefox have certainly encountered extensions such as AdBlock Plus and Flashblock, which help make web pages look less like a stress test for epilepsy. More general (cross-browser) solutions, such as Privoxy, use a proxy to filter incoming content.

Yet one can go even further to isolate the text of an article. Some sites offer a “print” version of their articles (usually a single, clean page), but that’s not the general case. That’s where the Aardvark extension comes in. It allows you to delete elements from a page and rearrange it quickly so you only keep the part you want.

Overview of Aardvark’s modification commands

Once it is installed, navigate to a page you want to clean up and launch Aardvark (Tools -> Start Aardvark). A red rectangle then appears over each element as your mouse pointer hovers over it. Press keys to apply different editing operations to the selected element (press ‘h’ to get the list of commands).

Aardvark’s help (list of commands)

It helps here to understand how web pages are coded (HTML), but in essence a page is made of rectangular zones inside bigger zones (e.g., an image in a paragraph), forming a hierarchy. As your mouse pointer hovers over a given rectangle (say a paragraph title), you may want to select its parent in the hierarchy (the paragraph itself). To do so, press ‘w’ to ‘widen’ the selection. The inverse operation is ‘n’ for ‘narrow’.

Example of Aardvark’s rectangle selection

You can delete elements in essentially two ways. The first is the straightforward one: you select an element and press ‘r’ to remove it. The other is the opposite: you press ‘i’ to isolate the selected element, i.e., keep only this one and remove all the rest. ‘i’ is very useful for selecting the page’s main element, the one that contains the whole text; you can then work out the details with ‘r’.
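To make the hierarchy idea concrete, here is a toy Python model of the operations described above: widening the selection, removing an element, and isolating one. It is only a sketch. The `Element` class and the function names are invented for illustration and are not Aardvark's code, which works on the browser's real page structure.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare elements by identity, like real page nodes
class Element:
    """Hypothetical stand-in for one rectangular zone of a page."""
    tag: str
    children: list["Element"] = field(default_factory=list)
    parent: Optional["Element"] = field(default=None, repr=False)

    def add(self, child: "Element") -> "Element":
        """Attach child under this element and return it."""
        child.parent = self
        self.children.append(child)
        return child


def widen(selected: Element) -> Element:
    """'w': move the selection up to the parent, if there is one."""
    return selected.parent if selected.parent is not None else selected


def remove(selected: Element) -> None:
    """'r': detach the selected element from its parent."""
    if selected.parent is not None:
        selected.parent.children.remove(selected)
        selected.parent = None


def isolate(root: Element, selected: Element) -> None:
    """'i': keep only the selected element directly under the page root."""
    if selected is root:
        return  # the whole page is already "isolated"
    remove(selected)
    for child in root.children:
        child.parent = None
    root.children = []
    root.add(selected)


# Build a tiny page: a sidebar next to an article that contains an ad.
page = Element("body")
sidebar = page.add(Element("div#sidebar"))
article = page.add(Element("div#article"))
title = article.add(Element("h1"))
ad = article.add(Element("div.ad"))

remove(ad)                    # drop the ad
isolate(page, widen(title))   # hover the title, widen to the article, isolate it
```

After these two calls, the page contains only the article, and the article contains only its title.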

If the isolated text is too narrow (doesn’t fill the page horizontally), you can press ‘d’ to ‘de-widthify’, which means that the ‘width’ attribute (which prevents the block from filling the page) is removed from it. You may have to fiddle a bit until you find the element on which the ‘width’ is applied, though.

Saving the result with ScrapBook

When the modifications are done, I save the page in its modified state using ScrapBook (which I covered in another blog post). I can then read it in the format I want, and add notes and highlights. (The ScrapBook extension does have a “delete” feature, but it’s not as featureful as Aardvark’s.)

If an article is spread over multiple pages, you can use ScrapBook’s “Combine Wizard” feature (in the SB sidebar: Tools -> Combine Wizard) to merge them into a single page.

Repetition and my WikidPad dynamic search extension

Digression on repetition

Information overload has numerous causes, and one of them is plain old repetition, e.g., two sources delivering the same information with superficial differences. It’s natural to repeat information, for various reasons.

As an example, when students take notes on a teacher’s lecture, they all duplicate basically the same information. If they all decide to put their notes online, bam, 30 new versions of “Notes on Heisenberg uncertainty principle”. Same goes for journals and bloggers reporting on a given event.

Of course there might be additional value in each version, with different points being made, but someone doing research on recent events still has to read the same basic facts again and again.

Clearly there’s no simple solution. In fact I might mention here that discussion in the blogosphere does create repetition, but it also makes that information evolve. Something similar happens when students exchange notes. In this light, repetition appears as a necessary evil.

If we really want to get philosophical, let’s just say repetition is unavoidable from the very start, as the production of repetitive information is just the consequence of information flowing through the social graph and of different human beings going through similar experiences and trains of thought. And clearly it’s not because one of them has eaten apple pie that humanity can move on and experience other stuff.

Gratuitous picture of humanity’s bane (source)

(Ah, of course, the irony here is that this very article is just some remix of ideas told a zillion times over).

My WikidPad extension

Yet, being aware of the problem, you can at least work on making your own set of notes as repetition-free as possible. That’s another core reason why I love personal wikis. Instead of rewriting information on two pages, as you’d do in paper notes because you don’t have your old notebooks handy, you simply link to the other page and voilà! You just avoided adding a little more repetition to this world (why not add some grandiosity here? 🙂 ).

Yet there are cases where linking is not enough. Say I’m taking notes on the differences between two programming languages, C# and Java. I have a page on C# and a page on Java. Where do I put the notes? I could create a page dedicated to that topic, but I don’t have enough material for the moment to justify that. So say I put them in the page about Java. Consequence: when I’m on the C# page, I have to navigate to the other page to read the info.

Diagram explaining the extension

What my extension does is grab the info on the Java page (and any other page) and dynamically bring the relevant sections into the C# page. Technically, you give the extension a keyword, and it searches your whole wiki to find the pages that contain it. Then, in those pages, it picks out precisely the lines that contain your keyword, plus some context around them (“sections”). It then prints a list of those sections.
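The search described above can be sketched in a few lines of Python. This is not the extension's code: the wiki is modelled here as a plain dictionary of page names to text, "context" is assumed to mean a fixed number of lines before and after each match, and matching is assumed to be case-insensitive. The function name `find_sections` is made up for the example.

```python
def find_sections(pages: dict[str, str], keyword: str,
                  context: int = 1) -> list[tuple[str, str]]:
    """Return (page name, section text) for every line that contains keyword.

    A "section" here is the matching line plus `context` lines on each side
    (an assumption; the real extension may delimit sections differently).
    """
    needle = keyword.lower()
    results = []
    for name, text in pages.items():
        lines = text.splitlines()
        for i, line in enumerate(lines):
            if needle in line.lower():
                start = max(0, i - context)
                end = min(len(lines), i + context + 1)
                results.append((name, "\n".join(lines[start:end])))
    return results


# Hypothetical wiki content, for illustration only.
wiki = {
    "Java": "Intro\nUnlike C#, Java has checked exceptions.\nMore notes",
    "Cooking": "Nothing relevant here",
}

for page, section in find_sections(wiki, "C#"):
    print(f"[{page}]")
    print(section)
```

Running the search from the C# page with the keyword "C#" would list the section stored on the Java page, without copying it by hand.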

Now it doesn’t matter as much where I put the notes. As long as I label the sections correctly, I can centralize them in the relevant pages when needed, and I no longer need to copy anything manually.

Grab the code & read details here:

Speed reading and my RSVP web application

A few days ago I published a JavaScript-based web program, which takes a text as input and flashes groups of words successively. It’s inspired by many other similar programs available on the Web, some free, some not. The technique is called RSVP, for Rapid Serial Visual Presentation. I baptized the program “Faster!” (well, I had to pick a name 🙂 )

A screenshot of Faster!

Go ahead and try it out. Nothing to download, just click “Play that text”.

In the rest of the post I elaborate on similar software and the effectiveness of speed reading in general.


Small speed reading JavaScript app

This is a work in progress, but I’ve developed a small speed reading application in JavaScript. It’s definitely not the first of its kind, but I wanted a web version with more options. I’m also planning to open source it, and I’ve yet to see an open source web app of this kind.

The app:

You simply click “Play that text” and you should grasp the principle real quick.

This is thought by some to increase your reading speed if used consistently. It teaches you not to go back while reading, not to “subvocalize” (i.e., hear the words in your head, or even whisper them as you read) and, by displaying more than one word at a time, to read more in one glance.
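For the curious, the principle can be sketched as follows. This is a Python illustration, not the app's JavaScript: the function name `rsvp_frames`, the default group size and the default speed are all assumptions. The text is cut into groups of a few words, and each group is given a display time derived from a target words-per-minute rate.

```python
def rsvp_frames(text: str, words_per_group: int = 3,
                wpm: int = 300) -> list[tuple[str, float]]:
    """Split text into word groups, each paired with its display time in seconds."""
    if words_per_group < 1 or wpm < 1:
        raise ValueError("words_per_group and wpm must be positive")
    words = text.split()
    seconds_per_word = 60.0 / wpm
    frames = []
    for i in range(0, len(words), words_per_group):
        group = words[i:i + words_per_group]
        # A shorter final group is shown for a proportionally shorter time.
        frames.append((" ".join(group), len(group) * seconds_per_word))
    return frames


# A player would show each group in turn, then wait `seconds` before the next one.
for group, seconds in rsvp_frames("the quick brown fox jumps over the lazy dog"):
    print(f"{seconds:.2f}s  {group}")
```

Raising `wpm` shortens every frame, and raising `words_per_group` trains you to take in more words per glance at the same overall speed.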

If you like it and want to come back to it, there’s a bookmarklet in the “About & download” tab that will allow you to select text in any web page and use it as input. Or you can simply copy & paste the text in the text area.

I’ll post something more lengthy on speed reading at some point.

Memorization: optimizing flashcard review with spaced repetition

We often hear about how important it is to understand, to not just memorize information blindly. But there are situations where long-term memorization is an essential part of the study process, for example if you’re trying to learn a new language. A common tool to help with this is flashcards, i.e., cards each with a precise question on one side and its answer on the other.

When digging deeper into how to maximize the efficiency of flashcard review, the “when should I review a given card?” question naturally comes to mind. It turns out some people have been researching that very question and have come up with interesting tools and theories on how to best use your time when reviewing.

SuperMemo and spaced repetition

One such person is Piotr Wozniak, a rather eccentric Polish professor who, since the early 1980s, has been studying and perfecting what he calls “spaced repetition”. His software, SuperMemo, implements this technique.

Essentially, once a flashcard is made and reviewed for the first time, it is then scheduled to be reviewed some time in the future, say in 2 days. Then, depending on how well you remember it in 2 days, it is rescheduled for a next review, but this time further in the future, say in a week. The process is then repeated, spacing repetitions further and further until a point where you won’t forget about it anymore.
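The scheduling idea above can be illustrated with a deliberately simplified Python sketch. It is not SuperMemo's algorithm, which is considerably more refined: the 2-day first interval, the growth factor of 3 and the "start over when forgotten" rule are all assumptions chosen to keep the example small.

```python
from datetime import date, timedelta

FIRST_INTERVAL_DAYS = 2  # assumed gap before the first scheduled review
GROWTH_FACTOR = 3        # assumed multiplier applied after each success


def next_interval(previous_days: int, remembered: bool) -> int:
    """Days to wait before the next review of one card."""
    if not remembered:
        return FIRST_INTERVAL_DAYS        # forgotten: start over with a short gap
    return previous_days * GROWTH_FACTOR  # remembered: push the next review further out


def review_dates(created: date, outcomes: list[bool]) -> list[date]:
    """Dates on which the card is reviewed, one per recorded outcome.

    outcomes[i] says whether the card was remembered at the i-th review.
    """
    dates = []
    interval = FIRST_INTERVAL_DAYS
    current = created + timedelta(days=interval)
    for remembered in outcomes:
        dates.append(current)
        interval = next_interval(interval, remembered)
        current += timedelta(days=interval)
    return dates


# A card created on 1 October, remembered twice, forgotten once, then remembered.
for day in review_dates(date(2008, 10, 1), [True, True, False, True]):
    print(day.isoformat())
```

With these assumed numbers, the gaps between reviews grow from 2 to 6 to 18 days while the card is remembered, and drop back to 2 days after a lapse.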

SuperMemo 2006 screenshot

The core idea behind spaced repetition in SuperMemo is to present the flashcard just when you’re about to forget it. Dr Wozniak has developed and refined models of how memorization and forgetting happen over time. The end goal, of course, is to repeat a given flashcard as few times as possible, so you can review more flashcards in total.

The SuperMemo website details his findings and tips concerning memorization, and can make for hours of reading. In particular, there are techniques to make better flashcards (how to formulate the question, etc.) and to use the SuperMemo software, of course. There is also an affiliated site where you can find ready-made flashcard sets available for purchase.

EDIT: On the website, you can notably find more detailed information on the mathematics of the algorithm used by the program and on its evolution. As mentioned in a comment below by the creator of FlashcardDB, the Leitner system is another spaced repetition system, one which doesn’t rely on computers (it was created in the ’70s) but on moving real cards between real decks, as explained on the FlashcardDB site itself.
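As a rough illustration of the Leitner system in its commonly described form, here is a small Python sketch: a card answered correctly moves to the next box, which is reviewed less often, and a card answered incorrectly goes back to the first box. The number of boxes and the function name `leitner_review` are assumptions made for the example, and how often each box is reviewed is left out.

```python
def leitner_review(boxes: list[list[str]], box_index: int,
                   card: str, correct: bool) -> int:
    """Move card out of boxes[box_index] according to the answer.

    Returns the index of the box the card ends up in.
    """
    boxes[box_index].remove(card)
    if correct:
        # Promote the card, but never past the last box.
        target = min(box_index + 1, len(boxes) - 1)
    else:
        # A mistake sends the card back to the most frequently reviewed box.
        target = 0
    boxes[target].append(card)
    return target


# Three boxes (an assumed number); every new card starts in the first one.
boxes = [["chat", "chien"], [], []]
where = leitner_review(boxes, 0, "chat", correct=True)       # promoted to box 1
where = leitner_review(boxes, where, "chat", correct=True)   # promoted to box 2
where = leitner_review(boxes, where, "chat", correct=False)  # back to box 0
```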

Alternative software

Although it seems Piotr Wozniak has been a pioneer in creating spaced repetition software, other programs have recently appeared that follow a similar model.

If you want something cross-platform (I run Linux so that’s my case), you can go for Mnemosyne. It’s much simpler than SuperMemo (far fewer options, less formatting in flashcards, etc.), but it’s open source and free.

Screenshot of Mnemosyne

There are also Web versions. SuperMemo itself has an online version. Very recently, the websites SpicyElephant, Mind Picnic and Flashcard DB (mentioned above) have appeared, all following the spaced repetition model. Being online communities, they allow you to share flashcards and reuse those made by others.

Update January 2010: a very good reference for alternatives is this comprehensive database of flashcard software.