
Tim Menzies

Director, RAISE lab, "Real-world AI for SE"


  
tim@menzies.us

Winner, 2017 Inaugural MSR Award

Mar 15, 2017

This evening I learned that I was the winner of the inaugural Mining Software Repositories Foundational Contribution Award.

According to the award web site, the award is a recognition of fundamental contributions in the field of mining software repositories, which helped others to advance the state of the art. 

I was nominated for my work on the PROMISE repository http://openscience.us/repo.

I want to thank the committee for the award and I'd like to dedicate the award to the many people whose hard work made the PROMISE repo possible:

Why PROMISE?

In this era of GitHub, GHTorrent, et al., it is hard to recall that only a decade ago it was difficult to access project data. Nevertheless, that was the case.

Back in 2005, many people in the MSR field were analyzing large amounts of (public) open source data but kept their tools and processed datasets to themselves, since these were often considered a competitive advantage. In fact, it was not until 2013 that the MSR community started its Data Showcase track to encourage sharing of data.

Meanwhile, back in 2005, I started the PROMISE workshop with Jelber Sayyad that "encouraged" data sharing. I put "encouraged" in quotes because it was actually a very explicit requirement. Here is part of the 2005 call for papers for PROMISE, which put the following text in all caps:

This emphasis on shared and repeatable results was unthinkable at that time, and many people predicted that PROMISE would not last long. Tee hee. We proved them wrong. The PROMISE workshop soon grew into its own stand-alone conference. Due to some cosmic quirk of scheduling, the PROMISE and MSR conferences often met at the same time, in the same corridor, sometimes even in the next room. But both events had full schedules, so we rarely made it to each other's sessions. Hence, the conferences evolved differently. The following is Prem Devanbu's attempt to capture the differences (and to misquote George Box, he hopes his model is more useful than it is wrong):

Now it is true that most MSR people analyzed their data with statistics and ML, and many PROMISE people did spend time on data collection. But where the PROMISE conference was different and unique was in its analysis of the analysis of data. According to the MSR'10 paper by Robles et al., most MSR papers were not concerned with repeating the analysis of data explored by a prior paper. On the other hand, the PROMISE people routinely posted all their data on a public repository, and their new papers would re-analyze old data in an attempt to improve that analysis.

Since 2011, PROMISE has stopped scheduling itself at the same time as MSR. This has led to richer interactions between MSR and PROMISE people. Hence, as time passes, the directions of these two conferences grow less distinct. Today, MSR meets at ICSE and PROMISE meets at ESEM, and both events draw international leaders in the field of software data science.

So just to be clear, the "PROMISE project" has two parts:

This award was given to me for my work on the repo. Initially, the conference was tightly connected to the repo (which stored the data from the conference's papers). Since then, the scope of the repo has extended to include data from many sources.

As to the PROMISE conference, I was its steering committee chair until 2012, when Stefan Wagner was kind enough to take on those duties. These days, the PROMISE conference is guided by its dedicated and talented steering committee: Leandro Minku, Andriy Miranskyy, Massimiliano Di Penta, Burak Turhan, and Hongyu Zhang.

Results from PROMISE

Here's a sample of what was achieved with PROMISE (and if anyone wants to add to this list, just email me at tim@menzies.us):

Reviewer Comments

I also need to thank my nominators for their kind words about PROMISE. The following are quotes from their letters.

The Future of the PROMISE Repository

Now that the PROMISE repository has achieved international recognition, it is strange to report that the repo is being decommissioned.

The ZENODO repo at the CERN Large Hadron Collider offers many services that significantly extend what PROMISE can offer:

Accordingly, we have nearly finished moving all the PROMISE data over to the ZENODO repo called SEACRAFT (Software Engineering Artifacts Can Really Assist Future Tasks). In the future, if anyone wants a long-term storage facility for data, or for scripts in GitHub, please submit to https://zenodo.org/deposit/new?c=seacraft.