Differences between revisions 1 and 2
Revision 1 as of 2010-03-29 14:32:32
Size: 1404
Editor: ?OllyBetts
Comment: new gsoc project idea
Revision 2 as of 2010-03-29 15:08:12
Size: 1400
Editor: ?OllyBetts
Line 11:
- powered using [[http://xapian.org/|Xapian]] and its Omega CGI front-end.
+ powered using [[http://xapian.org/|Xapian]] and its CGI front-end, Omega.
Line 23:
-  * CJK (Chinese, Japanese, Korean) could be better supported, though this would
+  * CJK (Chinese, Japanese, Korean) could be better supported, which would
Line 30: Line 30:

Improvements to Debian Search

  • Mentor: Olly Betts (ol on irc.oftc.net)

  • Summary: Improve Debian Search

  • Required skills:

    • C++
    • HTML
  • Description: Debian has free-text searches for its website (http://search.debian.org/) and for its mailing lists (http://lists.debian.org/search.html). Both were being developed by a Debian Developer who has since retired from Debian. Both searches are powered by Xapian and its CGI front-end, Omega. Some areas that need work:

    • Debian's web pages are available translated into many languages, and there are Debian mailing lists for discussion in particular languages, but the language filtering in the searches needs more work.
    • It would be useful to be able to perform a combined search over the website, mailing list, and potentially other sources of information such as the Debian wiki and packages.

    • CJK (Chinese, Japanese, Korean) could be better supported, which would require writing a custom tokeniser (either external to Xapian or as an enhancement to it).
    • Several patches to Xapian are currently in use; these should be cleaned up and submitted upstream. Ideally we want to use the standard Debian packages of Xapian and Omega.
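
To make the CJK item above more concrete, here is a minimal sketch of one common approach: splitting runs of CJK characters into overlapping bigrams, while keeping ordinary words whole. This is only an illustration under stated assumptions. The function names (decode_utf8, is_cjk, is_word_char, tokenise) are hypothetical and not part of Xapian or Omega, the character ranges are a deliberately small subset, and the project does not prescribe bigrams as the technique to use.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Decode one UTF-8 sequence starting at s[i] and advance i past it.
// Malformed or truncated input yields U+FFFD and consumes a single byte.
static char32_t decode_utf8(const std::string& s, std::size_t& i) {
    const unsigned char c = static_cast<unsigned char>(s[i]);
    std::size_t len = 1;
    char32_t cp = c;
    if (c < 0x80) { len = 1; }
    else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
    else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
    else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
    else { ++i; return 0xFFFD; }
    if (i + len > s.size()) { ++i; return 0xFFFD; }
    for (std::size_t k = 1; k < len; ++k) {
        const unsigned char cc = static_cast<unsigned char>(s[i + k]);
        if ((cc & 0xC0) != 0x80) { ++i; return 0xFFFD; }
        cp = (cp << 6) | (cc & 0x3F);
    }
    i += len;
    return cp;
}

// Assumption: a small subset of the relevant Unicode blocks, for illustration.
static bool is_cjk(char32_t cp) {
    return (cp >= 0x3040 && cp <= 0x30FF)    // Hiragana and Katakana
        || (cp >= 0x4E00 && cp <= 0x9FFF)    // CJK Unified Ideographs
        || (cp >= 0xAC00 && cp <= 0xD7A3);   // Hangul syllables
}

// Crude non-CJK word test: ASCII letters and digits plus accented Latin.
static bool is_word_char(char32_t cp) {
    if (cp < 0x80) return std::isalnum(static_cast<unsigned char>(cp)) != 0;
    return cp >= 0x00C0 && cp <= 0x024F;
}

// Split UTF-8 text into terms: whole words for non-CJK text, overlapping
// bigrams for runs of CJK characters (a lone CJK character becomes a unigram).
std::vector<std::string> tokenise(const std::string& text) {
    std::vector<std::string> terms;
    std::string word;            // non-CJK word currently being built
    std::size_t run_prev = 0;    // byte offset of the previous CJK character
    std::size_t run_len = 0;     // CJK characters seen in the current run

    auto flush_word = [&] {
        if (!word.empty()) { terms.push_back(word); word.clear(); }
    };
    auto end_run = [&](std::size_t end) {
        // A run of exactly one character produced no bigram, so emit it alone.
        if (run_len == 1) terms.push_back(text.substr(run_prev, end - run_prev));
        run_len = 0;
    };

    std::size_t i = 0;
    while (i < text.size()) {
        const std::size_t start = i;
        const char32_t cp = decode_utf8(text, i);  // advances i
        if (is_cjk(cp)) {
            flush_word();
            // Pair the previous CJK character with this one.
            if (run_len > 0) terms.push_back(text.substr(run_prev, i - run_prev));
            run_prev = start;
            ++run_len;
        } else {
            end_run(start);
            if (is_word_char(cp)) word.append(text, start, i - start);
            else flush_word();
        }
    }
    end_run(text.size());
    flush_word();
    return terms;
}
```

With this sketch, tokenise("日本語 search") yields the terms "日本", "本語" and "search". Bigrams avoid the need for a dictionary-based word segmenter, at the cost of a larger index and some false matches; a real tokeniser would also need to cover more Unicode blocks and decide how the terms are fed to the indexer and the query parser.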