Meta: change in approach for the blog

This blog seems to suffer from severe parental attention deficit, lately. The cause, I believe, is in the format: I’ve (consciously or not) chosen a limiting post format. I attacked broad personal knowledge management questions with what I could find in my own experience (mostly).

Problem is, that experience is, err, ever expanding, but also limited. There are topics I didn’t want to cover because I felt my understanding of them wasn’t broad enough.

Generally speaking, here’s the new approach:

  • I’ll now mostly have shorter posts focusing more specifically on, say, a given software tool, instead of a whole category at a time. There’ll be more “here’s something interesting” posts.
    • Ergo, I’ll touch topics which wouldn’t otherwise appear here, due to aforementioned lack of broad experience with said topics :)
  • Long posts will start to evolve, and significant changes to a past post will be mentioned in the feed in the form of a short new entry.

Transformation to something akin to a wiki

Those changes don’t mean I won’t aggregate information in the old “overview” format, which I believe is better in the long run as it gives a bird’s eye view of a topic, allowing you to see the forest then the trees (top-down), not the other way around. That’s why I’ll start modifying old posts as relevant new tools appear and new points come to my mind, similar in principle to a “normal web site” (ie. a non-blog set of pages) or a wiki. The timeline aspect isn’t as important for these articles, anyway.

A consequence of changing old posts is that these changes won’t directly appear in the RSS feed. Instead, short blog posts (in RSS) will now simply draw attention to significant changes on pages: new tools that appeared, new explanations, etc.

Each “evolving” post will also be equipped with a Changelog section, in which I manually enter a description of the change. That way, if you’ve already read the page you’ll know what’s new. Also, context for comments prior to a change will still be available (I’ll try to add an “editor’s note” to previous comments if context changes real bad).

Side note (technical): I’ve yet to find a WordPress plugin that allows me to actually show the differences (à la UNIX diff) between two post versions.
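To make concrete what “à la UNIX diff” means here, below is a minimal sketch using Python’s standard difflib module. It is not a WordPress plugin and has nothing to do with how WordPress stores revisions; the function name post_diff and the sample texts are made up for illustration.

```python
import difflib

def post_diff(old_text, new_text):
    """Return a unified diff between two versions of a post, as a list of lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="previous version",
        tofile="current version",
        lineterm="",
    ))

# Hypothetical post contents, before and after an update.
old = "Tools covered:\nSnippely\nSnipplr"
new = "Tools covered:\nSnippely\nSnipplr\nWikidPad"

for line in post_diff(old, new):
    print(line)
```

Added lines come out prefixed with “+” and removed lines with “-”, which is the kind of output a Changelog section could link to.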

*  *  *

My goal with this reorientation is to lower the “barrier to entry” for a post, meaning I won’t have to write for a few hours at a time, consecutive blocks of which are hard to come by :) Furthermore, incremental changes make more sense for broad posts.

Using a personal wiki for bookmark organization

Bookmarks have been around since the earliest browsers. Over the years I’ve accumulated thousands of them, and I’ve heard of similar numbers from other people. As the collection grows, it becomes obvious some organization is needed.

The organization scheme that comes naturally, at first, is folders. Those have been there since the early beginnings with Adam and Eve when the Web was young and domain names were aplenty in the tree of TLDs. That’s what I relied on for about six thousand years, and it became a huge mess. I still have old folders from my antique “classification system” I never look at anymore except for quick-and-easy nostalgia.

Tags and multiple axes for classification

So it quickly becomes obvious more axes are needed to classify. The most self-evident case is when doing a project: you’ll quickly accumulate a bunch of links which are contextually related because of the project, but otherwise would end up in different categories.

For example, if you’re creating a web site on video games, you accumulate links on, say, Nintendo, HTML, marketing and Ramen noodles, but in the Grand Scheme of Things (ie. Dewey classification, or some directory like dmoz.org), these are not usually put together.

So you end up trying to set up some keyword/tag system, hack together for 20 hours some frail Firefox extension, and then Firefox 3 comes along and does it for you anyway. The end result is you can create a tag for your project, yet also tag with proper general categories.

With a personal wiki

But in my experience that still doesn’t work, based on the fact that I never look at the bookmarks again, except when I have a very precise idea of what I’m looking for. And then there’s Google anyway.

In fact, currently, I only use local bookmarks for

  1. temporary storage until I put them in my wiki;
  2. pages and sites I use all the time, so I need quick access (online tools etc.).

Why is it that my old bookmarks were still condemned to live unfulfillingly in the dark for eons? There are many reasons, but the main one, for me, was that bookmarks don’t offer any formatting options and their context is not rich enough, even with tags or folders.

When you write a blog post or wiki entry, you can add context and explanations, and the links make sense. They’re part of the text: when you look back at the entry, you don’t just see a list of equally-created bookmarks. Each link has its place, and the content you summarize and the description you write form a whole. And of course it’s text, so you can have sections, bullet points, images and whatnot.

So my current system is one where I put my bookmarks in wiki entries related to their topic, with some summary explaining why it’s there and what I extracted from it, if I read it. If I need more axes, I’ve developed a WikidPad extension to tag a part of a page.

It seems to work: I reuse the links much more often, and it actually creates value for me as the content slowly grows with the links and knowledge instead of just being an anonymous bunch of pointers.

Of course it requires a bit more work for each link, but in the end if you’re not willing to spend 30 seconds writing a quick note, perhaps it wasn’t worth bookmarking anyway.

Organizing code snippets and programming knowledge

(This post is geared towards programmers.)

This blog is about structuring your personal knowledge. Code snippets and, more generally, programming language information, are interesting in that everyone and their cubicle neighbor seem to have their own approach to organizing them. Here I survey some interesting software and approaches I’ve read about, their features, and present my own method based on my personal wiki.

UPDATE July 29, 2011: there’s a good discussion about the wide range of options people use over at StackOverflow.

This post is an example where wiki features come in handy (as opposed to a thorough survey of Code Snippet Management as, err, an academic field of study).

Software and approaches

A code snippet manager is a piece of software which allows you to organize short pieces of code to reuse later. Yet I’m also seeking the ability to integrate general information about the language (explanations, elements of theory, etc.): in my experience, snippets are often examples of a notion I’m learning.

In researching a bit on existing systems, I’ve found a few feature families:

  • Code features
    • Syntax highlighting
    • Management of multiple files (a plus if you want to add entire libs to your snippet database)
    • More specific:
      • automatic indentation on insertion
      • dependency management
      • IDE integration
      • (other noteworthy?)
  • Organization and retrieval features
    • Hierarchical: by language, by functionality/algorithm
    • Tags
      • Tags are particularly useful here (vs pure hierarchical) because I’ll often stumble on situations like:
        • I need a snippet in whatever language for a quick sort algorithm.
        • I need a C++ snippet with an iterator loop.
    • Full-text/regular expression search
      • Regular expressions are especially useful since you often seek specific constructs and regular text indexing won’t cut it.
    • Hyperlinks (well, hallmark of wikis here)
    • Date and other general fields
  • Sharing
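The two retrieval situations mentioned under Tags, plus the regular expression search, can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the Snippet class, the helpers by_tags and by_regex, and the sample library are all hypothetical and don’t correspond to any of the tools discussed in this post.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Snippet:
    title: str
    code: str
    tags: set = field(default_factory=set)

def by_tags(snippets, *wanted):
    """Return the snippets carrying every requested tag."""
    wanted = set(wanted)
    return [s for s in snippets if wanted <= s.tags]

def by_regex(snippets, pattern):
    """Return the snippets whose code matches a regular expression."""
    rx = re.compile(pattern)
    return [s for s in snippets if rx.search(s.code)]

# Hypothetical snippet library.
library = [
    Snippet("Quick sort", "def quicksort(xs): ...",
            {"python", "quicksort", "sorting"}),
    Snippet("Quick sort", "void quicksort(int* a, int n);",
            {"c++", "quicksort", "sorting"}),
    Snippet("Iterator loop",
            "for (auto it = v.begin(); it != v.end(); ++it) {}",
            {"c++", "iterator", "loop"}),
]

# "A quick sort in whatever language": one tag, any language.
print([s.tags for s in by_tags(library, "quicksort")])

# "A C++ snippet with an iterator loop": two tags combined.
print([s.title for s in by_tags(library, "c++", "iterator")])

# Looking for a construct rather than a word: a for loop calling .begin().
print([s.title for s in by_regex(library, r"for\s*\(.*\.begin\(\)")])
```

A pure hierarchy would force the quick sort snippets under either “by language” or “by algorithm”; with tags, both questions are answered from the same flat list.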

There are lots of different approaches and systems. Specialized software exists that allows you to organize your snippet library in a standalone and dedicated manner. Google Snippely is an example:


Screenshot of Google Snippely

A whole bunch of shareware programs exist that do similar jobs. Some IDEs come with a snippet manager integrated, as is the case with Visual Studio. Most of these local programs offer a basic outline for organization with more or less search capabilities. If you’re looking for an online version with tagging, check out Snipplr, which, being online, also allows you to share and search others’ submissions.


Snipplr homepage screenshot

In the homebrew solution department, this thread is interesting. Some people talk of filesystem based solutions. A few even use a custom database. Personal wikis (as I use, see below) and general outliner software clearly need mention too. For example, this blogger says she uses Microsoft OneNote to organize her snippets.

Getting a bit less personal, it should be noted that quite a few bloggers describe their blog as being a “repository for them to search later”. Therefore blogs and websites somehow count as personal snippet libraries (I did a bit of this with my old me-me-me blog over yonder). These score high on integrating other information (ie. free-form formatted text) with the snippets, and of course on the sharing aspect. Community wikis (ie. not personal) are also a great way to organize and share snippets and knowledge (examples here, here).

As a side note, it’s pretty clear we won’t only rely on our own snippets when coding. “The Web + Google” describes my most often used “system” when searching for coding solutions. Yet there are specialized search engines for this job: Google Code Search (you can use regexps on the whole DB!), Koders, and quite a few more.

My approach

Given earlier posts, this doesn’t need a drum roll introduction: I use my personal wiki to organize my snippets and my programming language learning. Of course, this solution allows for inclusion of formatted text. I admit I have a strong tendency to use my snippets for learning more than for reuse, so that factor might weigh more than usual in my choice.

A wiki will allow for many different types of retrieval. For example, using the right combination of plugins, with WikidPad I have hierarchical organization, tags/keywords, full text and regular expression search, and, of course, linking. Most popular wiki systems will have plugins to allow for syntax highlighting, and WikidPad is no exception.


Code snippet screenshot in WikidPad
(using the PrettyCode extension)

Where that solution might be lacking is in the IDE integration department, and in the management of multiple files. For the latter, I have a separate personal code directory (on the file system) to which I may refer using file:// links.

Cleaning up web pages with Aardvark Firefox extension

Browsing the web, we see tons of different layouts: each site has its own. Though that makes for a more diverse experience, it’s not the best when you want to sit down and take the time to read a long article.

Those that use Firefox have certainly encountered extensions such as AdBlock Plus and Flashblock, which help in making web pages look less like a stress test for epilepsy. More general (cross-browser) solutions exist by using a proxy mechanism to filter incoming content, such as Privoxy.

Yet one can go even further to isolate the text of an article. Some sites offer a “print” version of their articles (usually a single, clean page), but that’s not the general case. That’s where the Aardvark extension comes in. It allows you to delete elements from a page and rearrange it quickly so you only keep the part you want.

Overview of Aardvark’s modification commands

Once it’s installed, you navigate to a page you want to clean up and you launch Aardvark (Tools -> Start Aardvark). You then see a red rectangle over elements when your mouse pointer hovers over them. You press keys to activate different editing operations for the selected element (press ‘h’ to get the list of commands).


Aardvark’s help (list of commands)

It helps here to understand how web pages are coded (HTML), but in essence a page is made of rectangular zones inside bigger zones (ex: an image in a paragraph), forming a hierarchy. As your mouse pointer hovers over a given rectangle (say a paragraph title), you may want to select its parent in the hierarchy (the paragraph itself). To do it, you press ‘w’ to ‘widen’ the selection. The inverse operation is ‘n’ for ‘narrow’.


Example of Aardvark’s rectangle selection

You can delete elements in essentially two ways. The first is the straightforward one: you select an element and press ‘r’ to remove it. The other is the opposite: you press ‘i’ to isolate the selected element, ie. keep only this one and remove all the rest. ‘i’ is very useful to select the page’s main element that contains the whole text, and then you can work out the details with ‘r’.

If the isolated text is too narrow (doesn’t fill the page horizontally), you can press ‘d’ to ‘de-widthify’, which means that the ‘width’ attribute (which prevents the block from filling the page) is removed from it. You may have to fiddle a bit until you find the element on which the ‘width’ is applied, though.
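For those who prefer to see the hierarchy idea spelled out, here is a rough Python model of the ‘w’, ‘r’, ‘i’ and ‘d’ commands on a tiny page. It is only a sketch of the concept, not Aardvark’s actual code: the sample page is made up, the function names are mine, and it assumes well-formed markup so that the standard xml.etree.ElementTree parser can read it (real web pages often aren’t).

```python
import xml.etree.ElementTree as ET

# Hypothetical page: a sidebar, the article block, and a footer.
PAGE = """<html><body>
<div id="sidebar">ads</div>
<div id="main" width="400"><h1>Title</h1><p>Long article text</p></div>
<div id="footer">links</div>
</body></html>"""

root = ET.fromstring(PAGE)
body = root.find("body")
# ElementTree keeps no parent pointers, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

def widen(element):
    """'w': move the selection to the enclosing element."""
    return parent_of.get(element, element)

def remove(element):
    """'r': delete the selected element from the page."""
    parent_of[element].remove(element)

def isolate(element):
    """'i': keep only the selected element in the page body."""
    for child in list(body):
        body.remove(child)
    element.tail = None
    body.append(element)
    parent_of[element] = body

def dewidthify(element):
    """'d': drop the width attribute so the block can fill the page."""
    element.attrib.pop("width", None)

title = root.find(".//h1")
main = widen(title)   # from the title up to the block that contains it
isolate(main)         # sidebar and footer are gone
dewidthify(main)      # the article block is no longer width-limited
print(ET.tostring(root, encoding="unicode"))
```

The point of the model is that “widen” is just “go to the parent”, and “isolate” is “remove” applied to everything else at once.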

Saving the result with ScrapBook

When the modifications are over, I save the page in its modified state using ScrapBook (which I covered in another blog post). I can then read it in the format I want, and add notes and highlights. (The ScrapBook extension does have a “delete” feature, but it’s not as featureful as Aardvark’s.)

If an article is spread over multiple pages, you can use ScrapBook’s “Combine Wizard” feature (in the SB sidebar: Tools -> Combine Wizard) to merge them into a single page.

Repetition and my WikidPad dynamic search extension

Digression on repetition

Information overload has numerous causes, and one of them is plain old repetition, e.g.: two sources delivering the same information, with superficial differences. It’s natural to repeat information for various reasons.

As an example, when students take notes on a teacher’s lecture, they all duplicate basically the same information. If they all decide to put their notes online, bam, 30 new versions of “Notes on Heisenberg uncertainty principle”. Same goes for journals and bloggers reporting on a given event.

Of course there might be additional value to each version, different points being made, but someone doing research on recent events still gets to read the same basic facts again and again.

Clearly there’s no simple solution. In fact I might mention here that discussion in the blogosphere does create repetition, but also makes that information evolve. Something similar happens for students exchanging notes. In this light, repetition appears as a necessary evil.

If we really want to get philosophical, let’s just say repetition is unavoidable from the very start, as production of repetitive information is just the consequence of information flowing in the social graph and of different human beings going through similar experiences and trains of thought. And clearly it’s not because one of them has eaten apple pie that humanity can move on and experience other stuff.


Gratuitous picture of humanity’s bane (source)

(Ah, of course, the irony here is that this very article is just some remix of ideas told a zillion times over).

My WikidPad extension

Yet, being aware of the problem, you can at least work on making your own set of notes as repetition-free as possible. That’s another core reason why I love personal wikis. Instead of rewriting information on two pages, as you’d do in paper notes because you don’t have your old notebooks handy, you simply link to the other page and voilà! you just avoided adding a little more repetition to this world (why not add some grandiose here? :) ).

Yet there are cases where linking is not enough. Say I’m taking notes on the differences between two programming languages, C# and Java. I have a page on C#, a page on Java. Where do I put the notes? I could create a page dedicated to that topic, but I don’t have enough material for the moment to justify that. So say I put them in the page about Java. Consequence: when on the C# page I have to navigate to the other page to read the info.


Diagram explaining the extension

What my extension does is grab the info on the Java page (and any other page) and dynamically bring the relevant sections into the C# page. Technically, you give the extension a keyword, and it will search your whole wiki to find pages that contain it. Then, in those pages, it searches for precisely the lines that contain your keyword and some context around them (“sections”). It then prints a list of those sections.
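The idea can be sketched in plain Python. To be clear, this is not the extension’s code and uses nothing from WikidPad’s API: the wiki is reduced to a dictionary of page texts, find_sections is a name I made up, and a “section” is simplified to the matching line plus one line of context on each side.

```python
def find_sections(pages, keyword, context=1):
    """Return (page name, section text) pairs for every line mentioning keyword.

    pages maps a page name to its text. A "section" is taken to be the
    matching line plus `context` lines on each side (an assumption of
    this sketch, not necessarily what the real extension does).
    """
    results = []
    needle = keyword.lower()
    for name, text in pages.items():
        lines = text.splitlines()
        for i, line in enumerate(lines):
            if needle in line.lower():
                start = max(0, i - context)
                end = min(len(lines), i + context + 1)
                results.append((name, "\n".join(lines[start:end])))
    return results

# Hypothetical wiki contents.
wiki = {
    "Java": ("Generics\n"
             "Java erases generic types at runtime, unlike C#.\n"
             "Collections\n"
             "ArrayList is the usual list type."),
    "CSharp": "Properties\nGetters and setters have dedicated syntax.",
    "Cooking": "Ramen\nBoil for three minutes.",
}

# What the C# page would pull in dynamically.
for page, section in find_sections(wiki, "C#"):
    print(f"[{page}]\n{section}\n")
```

Here the note written on the Java page shows up when searching for “C#”, without having been copied anywhere.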

Now it doesn’t matter as much where I put the notes. As long as I label the sections correctly, I can centralize them in the relevant pages when needed, and I no longer need to copy anything manually.

Grab the code & read details here: http://www.fsavard.com/flow/wikidpad-dynamic-search-results/