Improvements to Debian Search
Mentor: Olly Betts (ol on irc.oftc.net)
Summary: Improve the free-text search for the Debian website and mailing lists
Required skills:
- C++
- HTML
Description: Debian has a free-text search for its website (http://search.debian.org/) and for the mailing lists (http://lists.debian.org/search.html), both of which were developed by a Debian Developer who has now retired from Debian. Both searches are powered by Xapian and its CGI front-end, Omega. Some areas that need work:
- Debian's web pages are available translated into many languages, and there are Debian mailing lists for discussion in particular languages, but the language filtering in the searches needs more work.
- It would be useful to be able to perform a combined search over the website, mailing list, and potentially other sources of information such as the Debian wiki and packages.
- Support for CJK (Chinese, Japanese, Korean) text could be improved, which would require writing a custom tokeniser (either external to Xapian or as an enhancement to it).
- Several patches to Xapian are currently in use; these should be cleaned up and submitted upstream. Ideally, we want to use the standard Debian packages of Xapian and Omega.
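
To give a rough idea of what the CJK tokeniser item might involve, here is a minimal sketch of one common approach: emitting overlapping character bigrams for runs of CJK text, since such text is usually written without spaces between words. This is only an illustration and not the design the project has settled on. The names decode_utf8, is_cjk, is_separator and tokenise are hypothetical, the code point ranges are an illustrative subset, the input is assumed to be well-formed UTF-8, and nothing here uses the Xapian API.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Decode one UTF-8 code point starting at s[i] and advance i past it.
// Assumes well-formed input; stray or truncated bytes are returned one at a time.
static char32_t decode_utf8(const std::string& s, std::size_t& i) {
    unsigned char c = static_cast<unsigned char>(s[i]);
    std::size_t len = 1;
    char32_t cp = c;
    if (c >= 0xF0) { len = 4; cp = c & 0x07; }
    else if (c >= 0xE0) { len = 3; cp = c & 0x0F; }
    else if (c >= 0xC0) { len = 2; cp = c & 0x1F; }
    if (i + len > s.size()) { len = 1; cp = c; }  // truncated sequence
    for (std::size_t k = 1; k < len; ++k)
        cp = (cp << 6) | (static_cast<unsigned char>(s[i + k]) & 0x3F);
    i += len;
    return cp;
}

// Illustrative subset of CJK ranges (an assumption, not a complete list).
static bool is_cjk(char32_t cp) {
    return (cp >= 0x3040 && cp <= 0x30FF)    // Hiragana and Katakana
        || (cp >= 0x4E00 && cp <= 0x9FFF)    // CJK Unified Ideographs
        || (cp >= 0xAC00 && cp <= 0xD7AF);   // Hangul syllables
}

// ASCII characters that are not letters or digits separate words.
static bool is_separator(char32_t cp) {
    return cp < 0x80 && !std::isalnum(static_cast<unsigned char>(cp));
}

struct CodePoint { char32_t value; std::size_t start; std::size_t length; };

// Split text into tokens: overlapping bigrams for CJK runs (a lone CJK
// character becomes a single-character token), whole words otherwise.
std::vector<std::string> tokenise(const std::string& text) {
    std::vector<CodePoint> cps;
    for (std::size_t i = 0; i < text.size();) {
        std::size_t start = i;
        char32_t cp = decode_utf8(text, i);
        cps.push_back({cp, start, i - start});
    }

    std::vector<std::string> tokens;
    std::size_t n = 0;
    while (n < cps.size()) {
        if (is_cjk(cps[n].value)) {
            std::size_t end = n;
            while (end < cps.size() && is_cjk(cps[end].value)) ++end;
            if (end - n == 1) {
                tokens.push_back(text.substr(cps[n].start, cps[n].length));
            } else {
                for (std::size_t k = n; k + 1 < end; ++k)
                    tokens.push_back(text.substr(
                        cps[k].start, cps[k].length + cps[k + 1].length));
            }
            n = end;
        } else if (is_separator(cps[n].value)) {
            ++n;
        } else {
            std::size_t end = n;
            while (end < cps.size() && !is_cjk(cps[end].value)
                   && !is_separator(cps[end].value)) ++end;
            std::size_t first = cps[n].start;
            std::size_t last = cps[end - 1].start + cps[end - 1].length;
            tokens.push_back(text.substr(first, last - first));
            n = end;
        }
    }
    return tokens;
}
```

With this sketch, a run of three CJK characters yields two overlapping two-character tokens, while a Latin-script word is kept whole. The same tokenisation would need to be applied to both documents and queries for matching to work, and how the resulting tokens are handed to Xapian (externally or as an enhancement to it) is left open here.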