When we decided to spin out the Cilk project from MIT, I think we had a good sense of what was needed by software developers going multicore. After all, we'd been working on the technology in the lab for 15 years and had won numerous awards for our research.
Before committing to a detailed, prioritized feature set and schedule, however, we wanted to really delve into the key issues facing software developers going multicore. To that end, we conducted over 70 in-depth interviews with customer prospects.
Over the course of the interviews, we heard a triad of key themes repeatedly:
- Development time
- Software reliability
- Application performance
Moreover, when we asked these folks to prioritize these three properties in the context of a multicore software solution and name the one property they could most easily live without, ninety percent of them said, "All three are essential. Two out of three won't cut it." Let's look one by one at each issue in this multicore software triad and the challenges it poses for software developers.
Application performance
At some level, performance is what multicore is all about, but for many applications, it's really more about minimizing response time, not just maximizing throughput. Running two copies of the app isn't good enough - you want one copy to run twice as fast, or, as in the games market, you have a "time box" and need to do as much as you can within a given time budget. Application performance is also about scaling. Do you solve the problem once and for all, or are you reimplementing your software as every new processor generation produces chips with more cores? And, does your multicore solution scale down, as well as up, meaning that your multicore-enabled software runs just as fast on one core as your original code, or is there a substantial overhead?
Development time
Every company worries about getting its product out on time. That's particularly hard with multicore, because parallel-programming talent is hard to find. One $300M company we talked to had 200 programmers and a C++ code base of over 7 million lines. We asked how many of them had any experience with multithreading or other parallel-programming technologies. They answered, "You're looking at them." There were five software developers in the room. A good multicore software solution has to serve all 200 of their programmers, not just five. That means a complete redesign of a major app is out of the question, and maintaining multiple sources - the original and a parallel one - is problematic.
Software reliability
Race conditions: the bane of parallel programming! How will you debug your parallel application? How can you test it effectively before release? Unless you can find race conditions, it doesn't matter how fast the execution time or how easy the coding process: a single race bug can bring it all down. Famous race bugs include the Therac-25 radiation therapy machine, which killed three people and injured several others, and the North American Blackout of 2003, which left over 50 million people without power. These pernicious bugs are notoriously hard to find. You can run regression tests in the lab for days without a failure, only to discover that your software crashes in the field with regularity. If you're going to multicore-enable your application, you need a reliable way to find and eliminate race conditions.
Cilk++
In building the Cilk++ platform, the Cilk Arts engineers have worked hard to address all three components of the multicore software triad:
- Application performance: The runtime library provides linear speed-up on applications, while exhibiting virtually no overhead on a single processor.
- Development time: Cilk++ requires programmers to learn only a handful of simple keywords, enabling even junior programmers to develop multicore software.
- Software reliability: The Cilkscreen race detector finds races - guaranteed - localizing them to file, line number, and variable reference, including stack traces.
Share YOUR perspective!
We would love to hear from you:
- What do you think - does this perception reflect your needs when going multicore? What requirements might the triad miss?
- How would you prioritize the three aspects?
- Which of the three attributes would you give up, if you had to?