Name: Joseph Bisch
Contact/Email: firstname.lastname@example.org, IRC: jbisch
Timezone: UTC-05:00 (East Coast US)
Background: Currently a junior in college studying Electrical Engineering at the University of New Haven in West Haven, CT, US. My main distro is currently CrunchBang. I prefer to use Vim and Git. Some of my other interests are stamp collecting and Atari 8-bit computers.
Project title: Debian metrics portal
Project details: Create a Debian Metrics Portal to view, add, and maintain metrics in a uniform way.
Benefits to Debian: Make it easier to improve Debian by providing a single source of metrics to evaluate changes.
- database structure to store all historical data points of all metrics
- standardized declarative interface to add/remove metrics to be graphed. The interface should allow for both "local" metrics (e.g. data generated by scripts run on the machines hosting the metrics portal) and "remote" metrics (e.g. data generated by remote data sources which are then periodically gathered by the metrics portal)
- cron jobs to periodically fetch new data and generate graphs
- proof of concept: integration of (some of the) existing graphs in the metrics infrastructure
- web interface to show updated graphs of the various metrics
- client-side dynamic web interface to graph, on demand, specific metrics (possibly more than one at a time to look for correlations) over the desired time periods
- (optional) produce a Debian Package of the portal code to ease deployment on Debian-based machines
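As a rough illustration of the first two items above (storage of historical data points and a declarative way to register local and remote metrics), here is a minimal sketch. It uses Python's built-in sqlite3 module only to stay self-contained; it is not the planned SQLAlchemy design. The table names, the METRICS dictionary, the script path, and the example.org URL are all hypothetical.

```python
import sqlite3
from datetime import date

# Hypothetical declarative registry: a "local" metric names a script run on
# the portal host, a "remote" metric names a URL that is polled periodically.
METRICS = {
    "rc_bugs": {"kind": "remote", "source": "https://example.org/rc-bugs.json"},
    "vcs_usage": {"kind": "local", "source": "scripts/vcs_usage.py"},
}

# Assumed schema: one row per metric, one row per (metric, day, series) value.
SCHEMA = """
CREATE TABLE IF NOT EXISTS metric (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    kind TEXT NOT NULL CHECK (kind IN ('local', 'remote')),
    source TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS data_point (
    metric_id INTEGER NOT NULL REFERENCES metric(id),
    day TEXT NOT NULL,
    series TEXT NOT NULL,
    value REAL NOT NULL,
    PRIMARY KEY (metric_id, day, series)
);
"""


def init_db(conn, metrics):
    """Create the tables and register every declared metric once."""
    conn.executescript(SCHEMA)
    for name, spec in metrics.items():
        conn.execute(
            "INSERT OR IGNORE INTO metric (name, kind, source) VALUES (?, ?, ?)",
            (name, spec["kind"], spec["source"]),
        )
    conn.commit()


def record(conn, name, day, series, value):
    """Store one data point; re-recording the same day and series overwrites it."""
    row = conn.execute("SELECT id FROM metric WHERE name = ?", (name,)).fetchone()
    if row is None:
        raise KeyError(name)
    conn.execute(
        "INSERT OR REPLACE INTO data_point (metric_id, day, series, value) "
        "VALUES (?, ?, ?, ?)",
        (row[0], day.isoformat(), series, value),
    )
    conn.commit()


def history(conn, name, series, start, end):
    """Return (day, value) rows for one series between two dates, inclusive."""
    return conn.execute(
        "SELECT d.day, d.value FROM data_point d "
        "JOIN metric m ON m.id = d.metric_id "
        "WHERE m.name = ? AND d.series = ? AND d.day BETWEEN ? AND ? "
        "ORDER BY d.day",
        (name, series, start.isoformat(), end.isoformat()),
    ).fetchall()
```

A cron job could call record() once per fetch, and the web interface could call history() with the time period the user selects. Keying data points by (metric, day, series) is one possible choice that lets a single metric such as VCS usage hold several lines on one graph.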
Project schedule: I created a Gantt chart to illustrate my anticipated breakdown of the allotted time. I scheduled two days off, the Fourth of July and my birthday. I left one week at the end of Coding Period 1 to allow time to finish any unfinished tasks. I can use the week I allotted for producing the Debian package at the end of Coding Period 2 if I have any unfinished tasks at that point. I don't anticipate needing it and plan on finishing all tasks, including the optional packaging task. You can view the Gantt chart at http://josephbisch.com/debian-metrics-portal.html. The only other commitment, besides the two days off, is one online summer class. It will be taken the first half of summer (May 19-June 30) and will require no more than 4 hours per week of my time. Those 4 hours include lectures and studying/homework. I don't believe the class will significantly affect my ability to complete this project. I can still dedicate at least 40 hours per week towards this project.
https://wiki.debian.org/Statistics appears to have a complete list of existing sources. I will implement metrics for the following as part of this project. In the future more metrics will be implemented.
- BTS stats including important bugs and old bugs. Important bugs are those that have a major effect on the usability of a package without rendering it completely unusable. Old bugs are bugs older than 2 years. Old bugs may simply have been neglected, or their reports might not be detailed enough. It is important to display these in a way that makes the data accessible.
- RC bugs - Release-critical means that a bug affects the release of the package with the stable release of Debian. It is important to graph both the total number of RC bugs and the packages with the most RC bugs, to show how close a release is to being free of RC bugs and which packages need the most attention.
- Dpkg-formats - List and graph the total number of packages that use each format. This can be used to investigate why there is so much 3.0 (native) and 1.0 format usage. It may be possible to correlate format usage with packages that have missing maintainers or are undermaintained.
- Source code stats - Display statistics about the number of lines and the size of releases and of various packages. It is important to track changes in line count and size between releases so we can identify where increased size comes from.
- VCS-usage - List and graph the total number of packages that use each VCS. Shows that git is the most popular VCS in Debian. Allows us to look into trends in VCS usage and see how tools can be improved to encourage use of git over other VCSs.
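To make the counting metrics above concrete, here is a small sketch of how per-format or per-VCS totals and the "old bugs" set might be computed once the raw records are available. The record layout (dictionaries with "vcs", "format", and "opened" keys) and the function names are hypothetical, and "older than 2 years" is approximated as 730 days.

```python
from collections import Counter
from datetime import date, timedelta


def count_by_field(packages, field, missing="unknown"):
    """Count packages per value of `field`, e.g. 'vcs' or 'format'.

    Packages where the field is absent or empty are counted under `missing`.
    """
    return Counter(pkg.get(field) or missing for pkg in packages)


def old_bugs(bugs, today, years=2):
    """Return the bugs opened more than `years` years (365-day years) before `today`."""
    cutoff = today - timedelta(days=365 * years)
    return [bug for bug in bugs if bug["opened"] < cutoff]
```

Calling count_by_field(packages, "vcs") once per day and storing each count as a data point would give the time series needed to graph VCS trends; the same function with "format" would cover the Dpkg-formats metric.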
Why am I right for this project:
I am experienced with Python. For example, while working for my campus' tutoring center, I wrote a program that parsed the bookstore's website and generated a CSV file of all the books. I wrote it using urllib2 to get the webpages and BeautifulSoup to scrape the pages. I saved the center time by automating a task that had been done manually in past years.
I am willing to ask questions and learn. Prior to creating this application I had never used TaskJuggler, but I learned it to create the Gantt chart. I already emailed the mentors to ask a question about the sources.history script. However, I will reserve asking questions for when sources such as the Debian Wiki and Google do not suffice.
- I have experience working as a team. I am on my school's robotics team, and was also on my high school's robotics team. I frequently have to work as a team as part of my engineering education.
- I took an introductory C course my freshman year of college. I also took a microcontroller course that used C. I went on to serve as the teaching assistant (TA) for the microcontroller course for one semester.
- I have experience with LabVIEW from high school robotics and college classes. I have experience with MATLAB from college classes. I took a technical writing class in college.
- After graduating I mentored my high school's robotics team. I gave a seminar on LabVIEW.
- I don't have prior experience contributing to Debian. I am already familiarizing myself with the Debian wiki, IRC, mailing lists, etc. to minimize the impact of this.
- I don't have prior experience with SQLAlchemy or the templating engine. I have scheduled plenty of time to familiarize myself with those.
Why Debian?: I run CrunchBang, which is heavily based on Debian. So my work will have a large impact on the distro I run day-to-day and on many other people.
Am I applying for other projects?: No