scan-build on the Debian archive
Leo Cavaille GSoC application
Name: Léo Cavaillé
Contact/Email: leo+debian@cavaille.net, Chiron on OFTC, Chir0n on freenode
Background: I am currently studying at Ecole Centrale Paris, one of the top French engineering schools. I have been involved in free software, and especially Debian, for the last few years through VIA Centrale Réseaux, an IT club that manages the campus network on its own. All of our servers run Debian, for many purposes: mail, network storage, virtualization, redundancy, software development... I have a good knowledge of C and C++ (I had the chance to do some "real", advanced and fun C/C++ work during a previous internship in Australia), as well as Perl and shell scripting. Since the last minidebconf in Paris, where Sylvestre talked about http://clang.debian.net, I have been following the clang/LLVM project (mailing lists and reading some commits), and it has been very interesting and inspiring for my future projects. I have also done some web development projects using several technologies such as pure OO PHP, CGI, Symfony and obviously HTML/CSS.
Project title: scan-build on the Debian archive
Project details: The project consists of running clang's static analyzer (C, C++, Objective-C) on Debian packages to help developers find bugs that compilers are not able to find. scan-build can detect a wide range of oddities in the code, from dead assignments (a clean-up is always worthwhile) to null pointer dereferences or suspicious malloc/free scenarios. This process could be integrated into the Debian quality assurance workflow, as other daca tools and lintian checks already are, to ensure stability for end users and remove bugs even before packages are uploaded to the archive.
Code snippet: I have built Debian packages from source in a cowbuilder chroot, producing scan-build reports: here. Then, I built all the packages installed on my workstation using this script.
Synopsis: Rebuild the Debian archive with static analysis by scan-build.
Benefits to Debian:
- Avoid buggy packages by providing static analysis of source code, upholding Debian's commitment to quality.
- Reinforce the Debian spirit of helping upstream developers with great tools and reports about their packages.
- An opportunity to improve and add material to Debian Automated Code Analysis (daca).
Deliverables: The software and deployment plans required to run scan-build on the Debian archive, and a neat reporting interface on daca that helps developers squash bugs in Debian packages proactively.
Project schedule:
Start until end of June: Understand and use the Debian packaging and building software in order to add scan-build smoothly into the loop.
Details: Write/finish scripts that can set up a proper build environment for scan-build. The analysis should take place in a chroot, and cowbuilder is an efficient way to speed things up because the same chroot, once set up, can be reused for multiple builds. See code snippet.
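As a rough illustration of what such a script might do, here is a minimal Python sketch that only assembles the command line wrapping a package build with scan-build. It assumes the build is driven by dpkg-buildpackage from inside the chroot; the report directory layout and the function names are hypothetical and are not taken from the scripts linked above.

```python
from pathlib import PurePosixPath

# Assumed build command: unsigned, binary-only package build.
DEFAULT_BUILD = ("dpkg-buildpackage", "-us", "-uc", "-b")


def report_dir_for(base, source, version):
    """Hypothetical layout: one report directory per source package and version."""
    return PurePosixPath(base) / source / version


def scan_build_command(report_dir, build_cmd=DEFAULT_BUILD):
    """Return the argv that runs build_cmd under scan-build.

    scan-build's -o option selects the directory where HTML reports are written.
    """
    return ["scan-build", "-o", str(report_dir), *build_cmd]


if __name__ == "__main__":
    # Illustrative package name and version only.
    out = report_dir_for("/srv/scan-build", "hello", "2.8-1")
    print(" ".join(scan_build_command(out)))
```

The sketch returns the argument list instead of executing it, so the same helper could be reused whether the command is run directly or handed to whatever mechanism enters the chroot.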
Early July until mid-July: Design the archive-wide scan-build run and size the hardware/software required (CPU time, report storage); think about how reports will appear on daca and which indicators are useful.
Details: Evaluate the load of rebuilding the whole Debian archive, but also the load of running scan-build on every new version pushed to the archive. Optimize things for distributed architectures to speed up the build. Consider storage problems if all the reports are stored in plain HTML. Determine the most appropriate report format and how scan-build could be tweaked to produce it (also check the main changes pending for scan-build in LLVM). Find some servers ready to run scan-build on the Debian archive, get access to them, and seek advice from the maintainers of daca's other checks.
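The sizing exercise above boils down to simple arithmetic once a few averages have been measured. The Python sketch below shows the shape of that calculation; every input (package count, mean build time, number of workers, mean report size) is a placeholder to be replaced by real measurements, and it assumes builds are spread perfectly evenly across workers, which is optimistic.

```python
def estimate_rebuild(n_packages, mean_build_minutes, workers, mean_report_mb):
    """Back-of-the-envelope estimate for one full archive run.

    All parameters are assumptions to be measured, not known figures.
    """
    if workers < 1:
        raise ValueError("at least one worker is required")
    cpu_hours = n_packages * mean_build_minutes / 60
    return {
        "cpu_hours": cpu_hours,
        # Ideal case: work divides evenly across workers.
        "wall_hours": cpu_hours / workers,
        "storage_gb": n_packages * mean_report_mb / 1024,
    }


if __name__ == "__main__":
    # Made-up inputs, for illustration only.
    print(estimate_rebuild(n_packages=1024, mean_build_minutes=15,
                           workers=8, mean_report_mb=2))
```

Running the same function with the daily number of new uploads instead of the full package count gives an estimate of the steady-state load.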
For mid-term evaluation: The implementation and the Debian archive scan-(re)build will be under way! Documentation about the ongoing build and the technical choices. Ideas to review for analysing and displaying scan-build reports.
Details: All check results could also be exported in a standard format such as RDF to be used elsewhere (packages.qa.d.o or any metrics portal), and many more ideas to come!
Early August until mid-August: Implement a way to provide reports and useful indicators to developers (different scales: archive/package/…, by error, comparing {un,}stable and testing, nice graphs); I could talk with Zack's student about the metrics portal. (Debconf will take place during this period, which could be the occasion to share and talk IRL with the community, answer questions and ask for advice.)
Details: Graphical representations of the checks could be implemented with nice data-driven libraries such as d3.js (may be packaged soon!). A section could be added to the Debian PTS to remind developers of failed scan-build checks.
Until the end: Make it work, communicate with the community via mailing lists, and write thorough documentation about everything, so that users and developers can run scan-build on Debian packages. Keep some time to write a full-fledged final report.
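To make the idea of indicators at different scales more concrete, here is a small Python sketch that aggregates findings per package and per error type. It assumes the reports have already been parsed into (package, bug type) pairs; the parsing step, the package names and the bug type labels below are purely illustrative.

```python
from collections import Counter


def summarize(findings):
    """Aggregate (package, bug_type) pairs into archive-level indicators."""
    findings = list(findings)
    return {
        "total": len(findings),
        "by_package": Counter(pkg for pkg, _ in findings),
        "by_type": Counter(bug for _, bug in findings),
    }


if __name__ == "__main__":
    # Made-up sample data; real input would come from parsed reports.
    sample = [
        ("foo", "Dead assignment"),
        ("foo", "Null dereference"),
        ("bar", "Dead assignment"),
    ]
    summary = summarize(sample)
    print(summary["total"], summary["by_package"].most_common(1))
```

Counters of this kind map directly onto the per-package and per-error views mentioned above, and can be serialised as input for a charting library.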
Exams and other commitments: None, I will dive into the deep Debian ocean.
Other summer plans: Hardly any… Maybe 1 or 2 weekends with limited access to my workstation and to the Internet, and I am also considering registering for Debconf in Switzerland.
Why Debian?: I have been using Debian extensively for the past three years: on my personal workstation, for school projects, and on servers both for myself (personal storage/mail…) and for VIA (virtualization, DHCP, DNS, RADIUS, storage, backups, website…). Debian is my favorite distro (but so is everyone's, isn't it?) and I have seen many aspects, both technical and political, of the Debian community, which I follow very closely. I also took part in the Paris minidebconf, where I met a bunch of developers who reinforced my (urgent) wish to contribute to this great project. Finally, the social contract and the DFSG deal with topics that really matter to me: open source and free software.
Are you applying for other projects in SoC? Not for now, but I am currently considering applying for a scan-build improvement project that has been posted on the LLVM mailing lists.
Bonus sections:
Benefits to me: Learn a lot about the Debian archive, build systems and architectures, and also get to know the clang project better. It will also be a real start to contributing to Debian and a way to introduce myself to the community for the years to come, as I want to keep on contributing.
Why GSOC?: I have been surrounded by free software and the open-source philosophy at VIA for the past three years, and I am loving it. This is why it seems logical to me to be part of this and to work hard to promote those ideas. Former members of VIA are involved in many open source projects (Debian, videolan, libcaca, …), and talking to them made me realize that without free software, the world just wouldn't be as awesome. That's why GSOC is a very good opportunity to kill two birds with one stone: make something useful of my summer and support Debian.