mhonarc replacement for lists.debian.org
Description of the project: lists.debian.org uses mhonarc to format its archive for the web. mhonarc has a number of drawbacks: it is not very actively developed, and its code base has become dated. The purely date-based layout of the mboxes is also not ideal; an additional thread-based view and a smarter way to search the archive would be nice features. So what we are looking for:
- An mbox-based parser that writes out nicely formatted HTML pages.
- A way to remove mails from the archive after they have been classified as spam.
- Harmful content such as HTML should be stripped so that XSS and other attacks aren't possible.
- A new, thread-based view.
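To make the requirements above more concrete, here is a minimal sketch of two of them: grouping mails into threads and escaping untrusted content before it reaches the HTML output. It is written in Python purely for illustration; the project itself is expected to be done in Perl. The function names (parent_id, build_threads, render_thread) are hypothetical and not part of any existing tool, and the threading is a simplification that only follows In-Reply-To and the last References entry.

```python
import html
from collections import defaultdict
from email import message_from_string


def parent_id(msg):
    """Return the Message-ID this mail replies to, or None for a thread root."""
    in_reply_to = (msg.get("In-Reply-To") or "").strip()
    if in_reply_to:
        return in_reply_to
    references = (msg.get("References") or "").split()
    return references[-1] if references else None


def build_threads(messages):
    """Map each parent Message-ID (None for roots) to its direct replies."""
    known = {(m.get("Message-ID") or "").strip() for m in messages}
    children = defaultdict(list)
    for m in messages:
        parent = parent_id(m)
        # Assumption: a reply whose parent is not in the archive becomes a root.
        children[parent if parent in known else None].append(m)
    return children


def render_thread(children, parent=None, depth=0):
    """Return indented <li> lines; subjects are escaped so markup stays inert."""
    lines = []
    for m in children.get(parent, []):
        subject = html.escape(m.get("Subject") or "(no subject)")
        lines.append("  " * depth + "<li>" + subject + "</li>")
        own_id = (m.get("Message-ID") or "").strip()
        lines.extend(render_thread(children, own_id, depth + 1))
    return lines


if __name__ == "__main__":
    # Two hand-written sample mails; a real run would read them from an mbox.
    first = message_from_string(
        "Message-ID: <1@example.org>\nSubject: Hello <script>\n\nbody\n")
    reply = message_from_string(
        "Message-ID: <2@example.org>\nIn-Reply-To: <1@example.org>\n"
        "Subject: Re: Hello\n\nbody\n")
    for line in render_thread(build_threads([first, reply])):
        print(line)
```

For a real archive, the messages could be read with mailbox.mbox from the Python standard library, which yields objects with the same header interface. A production implementation would also need to handle reference cycles, duplicate Message-IDs, MIME decoding and full HTML sanitizing of message bodies, none of which this sketch attempts.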
Confirmed Mentor: formorer
How to contact the mentor: formorer@debian.org, IRC: formorer
Confirmed co-mentors: zobel@debian.org, IRC: zobel
Deliverables of the project: a git repository with the code
Desirable skills: Perl (the lingua franca of our team), e-mail structure (RFC 2822, RFC 2045, ...), HTML/CSS; some experience with Modern Perl tools such as Moose would be nice
What the student will learn: You will gain experience working with and processing e-mail, and you will see numerous implementation bugs in real-world e-mail. You will also learn effective, Modern Perl programming.