WikidPad dynamic search results

NOTES

  • The second version, using insertion, is available below.
  • 2009.12.07: made a quick fix to make this work with WP 2.0 (it should work, but I’ve only tested it very superficially)
  • 2008.10.22: added the “dynsearch_re” syntax to use regular expressions.
  • 2008.10.18: correction for version 1.8 of WikidPad (the second version of the extension only worked in 1.9 otherwise).
  • I’ve only done basic manual testing in WikidPad 1.8rc9 and 1.9beta17. Please send me an email if you have reports of it (not?) working in other versions.
  • This is not yet linked to from other parts of the blog, as it’s subject to many active changes;
  • This is still a beta version, so be very careful (have backups!);
  • The content here is an extension on a post I made on the WikidPad Yahoo! group.

I’ve just programmed a small WikidPad extension that displays search results in a dynamic section of a page. It’s an extension on the “Copy search results to clipboard” concept.

What it does

  • It does a wiki-wide search for a given keyword, triggered by an insertion syntax, e.g.: [:dynsearch: my keyphrase] (you may also use regular expressions, see Usage)
  • It then grabs what I call “sections”. This can simply be the line where the keyword occurs, or it can be the lines from there until a blank line is met, or all the child bullet points in a list (currently only the last two are implemented).
  • It outputs the resulting section list when the page preview is generated.
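
The section-grabbing step can be sketched outside WikidPad roughly as follows. This is a simplified, standalone illustration and not the extension's actual code: grab_sections, indent_of and SAMPLE are hypothetical names made up for the example, and the indentation rule is cruder than the one used in the real script further down.

```python
import re

BULLET_RE = re.compile(r"^\s*\*")


def indent_of(line):
    """Return the column of the first non-blank character, or None for a blank line."""
    stripped = line.lstrip()
    return len(line) - len(stripped) if stripped else None


def grab_sections(text, keyword):
    """Collect the section that starts at each line containing keyword.

    A bullet section runs until a line indented no deeper than the matching
    bullet; a paragraph section runs until the next blank line.
    """
    matcher = re.compile(re.escape(keyword), re.I)
    sections = []
    current = None   # lines of the section being collected, if any
    limit = None     # indent of the matching bullet; None means "paragraph"

    for line in text.splitlines():
        indent = indent_of(line)
        if current is not None:
            if limit is None:
                ended = indent is None
            else:
                ended = indent is not None and indent <= limit
            if ended:
                sections.append(current)
                current = None
            else:
                current.append(line)
        # A line that closes one section may itself open the next one.
        if current is None and matcher.search(line):
            current = [line]
            limit = indent if BULLET_RE.match(line) else None

    if current is not None:
        sections.append(current)
    return sections


# Hypothetical page text, only for illustration.
SAMPLE = "\n".join([
    "Georgia notes",
    "* Conflict with Russia in 2008",
    "    * ceasefire in August",
    "* Unrelated bullet",
    "",
    "Trade with Russia matters",
    "for the economy.",
    "",
    "Other paragraph.",
])

for section in grab_sections(SAMPLE, "russia"):
    print("\n".join(section))
    print("----")
```

With this sample, the first section is the matching bullet plus its child bullet, and the second is the two-line paragraph that mentions the keyword.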


Diagram explaining the idea

Why would this be useful?

  • The goal is this: if you have information that concerns two (or more) pages, you can simply put it in one, “tag” it with the other using a syntax you determine, and on the other page you collect it by doing a search.
  • For example, say you take notes on the recent Russian-Georgian conflict. You put the info on the Georgia page, but also mention “Russia”. Then in the page on Russia you create a dynamic search section with the keyword “Russia”. The script will collect “sections” from other pages that contain it, including the Georgia one.
  • The high-level goal is to decrease manual repetition of information.
  • Of course there are other ways around this problem, say creating a Russia-Georgia page and linking there from both pages. But sometimes you just want a short blurb, yet it concerns many pages.
  • See my blog post extending on these ideas and repetition in general.
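
To make the Russia/Georgia example concrete, here is a toy sketch of the "tag in one page, collect in the other" idea. It assumes a hypothetical in-memory wiki (a plain dict named PAGES) and a made-up helper collect_mentions; WikidPad's real storage and search API are not used here, and only single lines are collected rather than whole sections.

```python
import re

# Hypothetical in-memory wiki: page name -> page text (not WikidPad's storage).
PAGES = {
    "Georgia": "* 2008 conflict with Russia over South Ossetia\n* Capital: Tbilisi",
    "Russia": "[:dynsearch: Russia]\n* Capital: Moscow",
    "France": "* Capital: Paris",
}


def collect_mentions(pages, keyword, current_page):
    """Return {page name: [matching lines]} for every page except current_page."""
    matcher = re.compile(re.escape(keyword), re.IGNORECASE)
    found = {}
    for name, text in sorted(pages.items()):
        if name == current_page:
            continue  # the page holding the search should not list itself
        hits = [line for line in text.splitlines() if matcher.search(line)]
        if hits:
            found[name] = hits
    return found


mentions = collect_mentions(PAGES, "Russia", current_page="Russia")
for name, hits in mentions.items():
    print("From [%s]:" % name)
    for hit in hits:
        print("    " + hit)
```

The note lives only on the Georgia page, yet the Russia page can display it, which is the repetition-saving effect described above.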

Comparing with existing functionality

  • It’s an extension of the “Copy search results to clipboard” concept, as said above. But that feature doesn’t extract sections, and its results don’t end up in an updateable area of a page.
  • I’ve looked at ToDo extensions (esp. Christian Ziemsky’s), but they’re all geared towards, well, todos (nothing wrong with that :P ). In particular, they produce whole ToDo pages, not subareas in a page, and they’re based on the getTodos() mechanism (from what I could see). Wiki-wide searches are pretty fast (for me at least), and I think they’ll be a bit more flexible for the purpose of the extension.
  • I also looked at the TagView extension, but it focuses on whole pages. I really wanted something that grabs small chunks of info in a flexible manner.
  • Same thing applies for “attributes” and dynamic pages/views (I get a warning when I use the same attribute twice on a page).
  • There’s an old (July 2006) thread on the Yahoo! WikidPad group that seems to say this was not possible.

The code

Usage: Just copy this code into a file named DynSearchResults.py in the “extensions” directory of WikidPad. Then when you insert a dynamic search in a wiki page, e.g. “[:dynsearch: avocados]”, the preview page should contain the results of that search. If you know regular expressions, you can also use the [:dynsearch_re: "regex_here"] syntax (be very careful here, as you may hang WikidPad if your expression ends up being too general).
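
The difference between the two syntaxes can be sketched with plain Python, independent of WikidPad. As in the script below, the keyword form is passed through re.escape before searching, while the dynsearch_re form is compiled as given; both are matched case-insensitively. The sample lines and variable names here are made up for illustration.

```python
import re

# Hypothetical lines standing in for wiki text.
lines = ["Avocados (ripe) are green", "avocados ripe", "Bananas are yellow", ""]

# Keyword form: the text is escaped, so "(" and ")" are matched literally.
literal = re.compile(re.escape("avocados (ripe)"), re.I)

# Regular expression form: the text is compiled as a pattern, unchanged.
pattern = re.compile(r"avocados|bananas", re.I)

# An overly general expression matches every line, even the empty one,
# so every line of every page would start a section.
too_general = re.compile(r".*", re.I)

literal_hits = [l for l in lines if literal.search(l)]
pattern_hits = [l for l in lines if pattern.search(l)]
general_hits = [l for l in lines if too_general.search(l)]

print(literal_hits)
print(pattern_hits)
print(len(general_hits), "of", len(lines), "lines matched by the general pattern")
```

The last pattern shows why the warning matters: an expression that matches almost anything makes the extension process every line of the whole wiki.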

'''
I'm reusing the license used by WikidPad. I guess it'll make things simpler (?).

---

BSD License

Copyright (c) 2008, Francois Savard (francois@fsavard.com, http://www.fsavard.com)
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

    * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
    * Neither the name of the  nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

---

Dynamic search results for WikidPad

2008.10.14: first version, which was using direct insertion in the source text
2008.10.16: now using Insertion mechanism ([:dynsearch: ...]) instead of direct source insertion
2008.10.18: correction for the extension to work with WP 1.8 too, and for results from current page to be excluded
2008.10.22: added support for regular expressions with the "dynsearch_re" syntax
'''

from pwiki.SearchAndReplace import SearchReplaceOperation
import re
import wx
from pwiki.StringOps import uniToGui

RESULTS_TITLE = "Dynamic search results: "
RESULTS_TITLE_REGEXP = "Dynamic search results (regexp): "
RESULTS_END_STRING = "*(end of results)*"
RESULT_INDENT = "\t"

# ----------------------------

WIKIDPAD_PLUGIN = (("InsertionByKey", 1),) 

def describeInsertionKeys(ver, app):

    return (

            (u"dynsearch", ("wikidpad_language",), DynSearchInsertionHandler_NoRegexp),
            (u"dynsearch_re", ("wikidpad_language",), DynSearchInsertionHandler_Regexp),

            )

# ----------------------------

def wikiEscape(name):
	'''
	Escape characters of a Regexp which might not show in Preview.
	'''
	result = []
	for c in name:
		if c == "[" or c == "]" or c == "\\":
			result.append("\\" + c)
		else:
			result.append(c)
	return u"".join(result)

# ----------------------------

indentFindingPattern = re.compile(r"[^\-_\s]")
def findIndent(line):
	global indentFindingPattern

	res = indentFindingPattern.search(line)

	if not res is None:
		left = res.span()[0]
		return left

	# blank-only line
	return None

# ----------------------------

bulletLineRe = re.compile(r"^\s*\*")

def processPage(text, lineMatcher):
	'''
	Pass the whole file and aggregate sections based on lines matched
	by lineMatcher.

	I do it this way since it's simpler to get title levels and
	grab "backward" info (ie. info preceding the keyword). It'll be
	more flexible if other types of sections are added.
	'''
	lines = text.splitlines(False)

	printing = False

	# types of sections: constants
	BULLETS = 1
	PARAGRAPH = 2

	sectionType = 0

	indentLowerLimit = 999

	titleLevel = 1

	resultsArray = []
	curResult = ""

	for l in lines:
		thisLineIndent = -1

		if printing:
			thisLineIndent = findIndent(l)

			if( (sectionType == BULLETS and not thisLineIndent is None
				and thisLineIndent <= indentLowerLimit)
			   or (sectionType == PARAGRAPH and thisLineIndent is None)):
				resultsArray.append(curResult)
				curResult = ""
				printing = False

			if printing:
				curResult += RESULT_INDENT + l + "\n"

		if not printing and lineMatcher.search(l):
			if thisLineIndent is None or thisLineIndent < 0:
				thisLineIndent = findIndent(l)

			curResult = RESULT_INDENT + l + "\n"

			if bulletLineRe.match(l):
				sectionType = BULLETS
				indentLowerLimit = thisLineIndent
			else:
				sectionType = PARAGRAPH

			printing = True

	if printing:
		resultsArray.append(curResult)

	return resultsArray

# ----------------

def doWikiWideSearch(wikiDocument, regexpString):
	sarOp = SearchReplaceOperation()

	sarOp.wikiWide = True
	sarOp.wildCard = 'regex'
	sarOp.caseSensitive = False

	sarOp.searchStr = regexpString

	return wikiDocument.searchWiki(sarOp)

# ----------------

def extractSections(wikiDocument, regexpString, curWikiWord):
	'''
	Collect the matching sections from all other pages and return them as wiki text.

	Inspired by the code used to replace a WikiWord.
	'''

	wikiWideResults = doWikiWideSearch(wikiDocument, regexpString)

	searchRe = re.compile(regexpString, re.I)

	returnStr = ""

	for resultWord in wikiWideResults:
		# Don't search the current page
		if not curWikiWord is None and resultWord == curWikiWord:
			continue

		wikiPage = wikiDocument.getWikiPage(resultWord)

		text = wikiPage.getLiveTextNoTemplate()
		if text is None:
			continue

		results = processPage(text, searchRe)

		# it is possible we found a page that doesn't contain actual results
		# since the text was contained in a dynamic search results area, so
		# we only print when we're sure we've got results
		if len(results) > 0:
			returnStr += "\n*From *[" + resultWord + "]:\n" \
					+ "----\n".join(results) \
					+ "----\n"

	return returnStr

# ---------------------------

# Matches both [:dynsearch: ...] and [:dynsearch_re: ...] insertions
dynsearchReplacerRe = re.compile(r"\[:dynsearch(?:_re)?: (.*?)\]")

def createContentBase(exporter, regexpString, title):

	wikiDataManager = None

	if hasattr(exporter, "wikiDataManager"):
		wikiDataManager = exporter.wikiDataManager
	elif hasattr(exporter, "getWikiDataManager"):
		wikiDataManager = exporter.getWikiDataManager()
	else:
		pwf = wx.GetApp().GetTopWindow()
		if hasattr(pwf, 'getWikiDataManager'):
			wikiDataManager = pwf.getWikiDataManager()
		else:
			return "Error: dynsearch could not be performed (no pointer to WikiDataManager)."

	exportedWikiWord = None

	if hasattr(exporter, "wikiWord"):
		exportedWikiWord = exporter.wikiWord

	results = extractSections(wikiDataManager, regexpString, exportedWikiWord)

	# We must replace [:dynsearch: ...] strings, otherwise we end up
	# in an infinite loop, called again and again by the exporter
	def replaceFunc(match):
		return "--Escaped dynsearch for \"" + match.group(1) + "\"--"

	results = dynsearchReplacerRe.sub(replaceFunc, results)

	return "+++ " + title + "\n----" \
			+ results \
			+ RESULTS_END_STRING + "\n----\n"

class DynSearchInsertionHandler_Base:
	def __init__(self, app):
		self.app = app

	def taskStart(self, exporter, exportType):
		pass

	def taskEnd(self):
		pass

	def createContent(self, exporter, exportType, insToken):
		pass

	def getExtraFeatures(self):
		return ()

class DynSearchInsertionHandler_NoRegexp(DynSearchInsertionHandler_Base):

	def createContent(self, exporter, exportType, insToken):
		return createContentBase(exporter, re.escape(insToken.value), RESULTS_TITLE + wikiEscape(insToken.value))

class DynSearchInsertionHandler_Regexp(DynSearchInsertionHandler_Base):

	def createContent(self, exporter, exportType, insToken):
		return createContentBase(exporter, insToken.value, RESULTS_TITLE_REGEXP + wikiEscape(insToken.value))

Personal motivation and request for comments

I’ve been meaning to write this extension for some time (I have a ~450 pages/2MB wiki… info architecture becomes an issue :) ). It’d be great if it turns out to be useful for others. I’ll add it on the ListOfUserScripts page when it’s a bit more tested (I’m on Linux, using WP 1.9beta17). Please send comments, questions, bugs, etc.

12 Comments

  1. Mangesh:

    Thanks – This plugin is pretty useful.

  2. joon:

    Thanks for the great plugin!

    I think I found a typo in your text: “If you know regular expressions, you can also use the [:dynsearch: "regex_here"] syntax (be very careful here, as you may hang WikidPad if your expression ends up being too general).”

    It seems [:dynsearch: "regex_here"] should be [:dynsearch_re: "regex_here"]. Please correct me if I’m wrong.

  3. Francois:

    Hi joon,

    yes, my mistake, thanks for pointing it out (and for stopping by)!

    François

  4. Andreas:

    Great and useful function.
    I use it a lot in my pages!

    Big Thanks!

  5. zenchan:

    Hi I’ve been trying to get this to work, as I totally get what you’re trying to do, and I definitely need a similar function.

    I’m working on Linux, WikidPad 2.1

    So first to get the stupid question out of the way:

    You write: “Usage: Just copy this code in a file named DynSearchResults.py in the “extensions” directory of WikidPad.”

    Usually the scripts go into “user_extension” directory, so do you really want us to copy this to the “extensions” directory instead? That is as far as I can make out only for the stuff distributed with WP.

    I have enabled insertion scripts, and checked with other insertion scripts like [:rel: children], [:toc:],etc. Those are all working fine. But “savedsearch” and “dynsearch” I can’t get to work. The insertion text greys out as it should, but the preview page shows no difference.

    Would be wonderful if you or another reader can give a hint of what might be going wrong.

  6. Francois:

    Hi Zenchan,

    I must admit I’m still using WikidPad 1.9 regularly… I still haven’t switched to 2.0-2.2.

    However I clearly see an “extensions” directory even in the source trees of either 2.1 or 2.2 (beta versions, though), and this is where I place my extension in WP 1.9. I don’t know if scripts under “user_extension” are treated differently.

    I just tested it under WP 2.2 though (placed the script under “extensions”), and it seems to work, from a quick test.

    Maybe there’s a difference between the Windows and Linux versions of WikidPad?

    Maybe try copying it to “extensions” if you haven’t.

    Hope this helps.

  7. zenchan:

    ok, figured it out!

    I ran it in the terminal, and saw this error related to DynSearch:

    Traceback (most recent call last):
      File "lib/pwiki/PluginManager.py", line 326, in loadPlugins
        mbcsEnc(fullname)[0], (".py", "r", imp.PY_SOURCE))
      File "/home/moi/Random/WikidPad/extensions/DynSearchResults.py", line 259
        return createContentBase(exporter, insToken.value, RESULTS_TITLE_REGEXP + wikiEscape(insToken.value))
        ^
    IndentationError: expected an indented block

    So the obvious problem was that when I made the DynSearchResults.py file, I had copy-pasted the code, and somehow the indentation on the last line got lost. After correcting the indentation, it was all sunshine and happiness :) It’s a super plugin, keep up the good work.

    As an aside, I stuck the script in the user_extension (created by the user) directory, and it works fine from there. I guess it’s better to have a customised script directory separate from the original extension directory, in case of upgrades, or perhaps to sync customisations across systems.
    cheers!

  8. Francois:

    Oh OK good then :) I should probably add a link to a downloadable file; in retrospect I should have noticed this copy-pasting business would be error-prone.

    I’m glad you find the plugin useful!

  9. zenchan:

    Hi, I get an error when I use two dynsearch entries on the same page with multiple tabs open in 2.3_10_03 beta.

    Nothing happens on switching the view from editor. When I remove one entry then it works again. If the same entry is used on two different tabs, then again the same problem occurs. Sometimes I get empty results on switching the view with the following line at the bottom

    Error: dynsearch could not be performed (no pointer to WikiDataManager)

    However like I said on using a single entry instead of multiple on a page, it works fine. On running from the terminal I keep seeing runtime errors associated with the extension HtmlExporter.py

    I don’t know whether this is related though

  10. Francois:

    Hi Zenchan,

    Do you know if you got this error with earlier WP versions? I don’t know if I tested that case before. The problem with this extension is that it relies on internal WikidPad structures, and every time they change something in the structures I used, it’ll probably break the extension. I don’t have time to fix it right now, sorry :-/ But thanks for reporting the problem.

  11. CKolumbus:

    found the problem: the DataManager is accessed differently now. I’ve added the code

    The code will soon be available on github: https://github.com/ckolumbus/WikidPadUX
    ————————–
    wikiDataManager = exporter.wikiDataManager
    elif hasattr(exporter, "getWikiDataManager"):
    wikiDataManager = exporter.getWikiDataManager()
    - elif hasattr(exporter, "getWikiDocument"):
    - wikiDataManager = exporter.getWikiDocument()
    else:
    - return "Error: dynsearch could not be performed (no pointer to WikiDataManager)."
    + pwf = wx.GetApp().GetTopWindow()
    + if hasattr(pwf, 'getWikiDataManager'):
    + wikiDataManager = pwf.getWikiDataManager()
    + else:
    + return "Error: dynsearch could not be performed (no pointer to WikiDataManager)."

    exportedWikiWord = None

  12. Francois:

    Thanks for the bugfix, Chris.
