We recently asked developers thinking about multicore a series of questions:
- How many processor cores are in the most common computer configuration of your software's installed base today? 12 months ago? 12 months from now?
- What multithreading approaches have you explored? What stage are you at in exploring/deploying that approach?
- Do you target Windows? Linux/Unix?
We deployed several surveys over the past 3 months, collecting a total of over 300 responses (though not all respondents answered every question). The questions were posed to developers in our database and/or visiting our site, so there may be some self-selection going on (e.g., folks who are at least thinking about the multicore programming challenge). That said, their answers reveal a good deal about where we are in terms of multicore adoption.
Dual- and quad-core systems dominate today
The median system has 2 cores, with just a few percent using 8 or more cores. This scenario is playing out in a similar manner in the data center, where increasingly "fatter" nodes are featuring more and more cores. Indeed, just a couple months ago, Intel announced a 6-core Xeon processor.
It is revealing to plot the developers' expectation for how the median system evolves over time, along with the chip introductions and public forecasts from Intel regarding core counts:
Transistor counts are indeed rising per Moore's Law - though now, this trend is no longer accompanied by a rising clock rate. Instead, it is the number of cores that is growing exponentially. And, we see the familiar ~18-month gap between when a processor first ships and when it makes its way into the installed base.
So, multicore is here, yet the vast majority of applications have not been multicore-enabled. Why? Well, I am guessing that this is the rough calculus done today by software development teams: On a dual-core system (the typical configuration today), one can typically expect a two-fold improvement at best. This performance improvement is just not compelling enough to make the (potentially considerable) development investment. Once the 8- and 16-core systems become mainstream, however, a single-threaded application may be leaving an order of magnitude in performance and throughput on the table. That's when multicore enablement moves from "nice to have" to a "must have."
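To put rough numbers on that calculus, here is a minimal sketch of the best-case speedup at different core counts. It uses Amdahl's law, which is our assumption here rather than something the survey measured, and the function name `amdahl_speedup` and the 95% parallel fraction are purely illustrative.

```python
def amdahl_speedup(cores: int, parallel_fraction: float = 1.0) -> float:
    """Upper bound on speedup when `parallel_fraction` of the work scales across cores.

    Assumption: Amdahl's law, ignoring scheduling, memory, and synchronization costs.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Compare a perfectly parallel app with one that is 95% parallel (illustrative value).
    for cores in (2, 4, 8, 16):
        ideal = amdahl_speedup(cores)
        realistic = amdahl_speedup(cores, 0.95)
        print(f"{cores:2d} cores: ideal {ideal:.1f}x, 95% parallel {realistic:.1f}x")
```

Even under this optimistic model, a dual-core system caps out at 2x, while 8- and 16-core systems put roughly an order of magnitude within reach, which is the gap a single-threaded application leaves unclaimed.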
When and what to do?
Within 12-24 months, the 8- and 16-core systems will be common. At that point, apps that are not effectively multithreaded will suffer an order-of-magnitude performance disadvantage relative to their competitors. So, if you have not yet picked a concurrency platform (a layer of software that coordinates, schedules, and manages the multicore resources), here are a couple of considerations:
- It'll take some time - at least months - to explore the leading contenders.
- Once chosen, it'll take one or more release cycles to bring the multicore-enabled software to market.
If you are not already exploring multicore enablement for your performance-sensitive apps, here's how you can estimate when your organization needs to start multicore-enabling your code base:
- Pick the number of cores you intend to first target;
- Look at the silicon roadmap and determine when that number of cores will be available;
- Add the time it takes for new processor chips to penetrate the installed base of your customers' systems;
- Subtract roughly two development cycles to give you enough time to multicore-enable your code base;
- Then, look at how long your organization takes to make software adoption decisions, and subtract that amount of time as well, because you'll need it to decide which concurrency platform to base your system on.
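The steps above amount to simple date arithmetic, which can be sketched as follows. Every input in the example is a placeholder rather than roadmap data, and the names `add_months` and `latest_start` are hypothetical helpers invented for this illustration.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date by whole months (negative allowed), pinning the day to the 1st."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def latest_start(
    silicon_available: date,
    penetration_months: int,
    release_cycle_months: int,
    decision_months: int,
    cycles_needed: int = 2,
) -> date:
    """Work backward from when the target core count is mainstream among customers."""
    # Silicon availability plus the installed-base lag gives the mainstream date.
    mainstream = add_months(silicon_available, penetration_months)
    # Subtract the development cycles needed to multicore-enable the code base.
    begin_development = add_months(mainstream, -cycles_needed * release_cycle_months)
    # Subtract the time needed to choose a concurrency platform.
    return add_months(begin_development, -decision_months)


if __name__ == "__main__":
    # Placeholder inputs: an 18-month installed-base lag, 9-month release cycles,
    # and a 3-month platform decision.
    start = latest_start(date(2009, 1, 1), 18, 9, 3)
    print("Start evaluating concurrency platforms by:", start.isoformat())
```

Plug in your own roadmap date, release cadence, and decision time; if the result is already in the past, that is a signal in itself.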
Concurrency Platforms: A Menu of Choices
The good news is that the set of concurrency platforms has been growing stronger. Contenders include Intel's Threading Building Blocks (TBB), OpenMP, Microsoft's Concurrency Runtime, and Cilk++, to name a few. Each platform has its pros and cons. For example, a library approach such as TBB may be less intrusive; it works best for applications that have relatively distinct, long-running functions that don't interact much with the rest of the app, and at the "leaves" of the computation. On the other hand, a language extension such as Cilk++ has cleaner, simpler syntax; typically requires less restructuring of legacy code; works well for unbalanced problems; and makes race detection easier.
"What multithreading approaches have you explored?"

While the answers may not be representative of the software development community at large (for example, half of the respondents were evaluating our product, Cilk++), several observations emerged:
- Native threads and thread pools are mature approaches. Applications employing this type of concurrency platform have been widely deployed (originating well before multicore CPUs were even shipped), though they are not as actively considered for new evaluations or pilot projects.
- OpenMP appears to be doing well in terms of both new evaluations and actual deployments - likely a credit to the fact that it has been out there for a while, is an open standard, and seems to work well for parallelizing "fat" loops, which are a common case in scientific codes.
- Intel's Threading Building Blocks, which arrived on the scene within the last two years, appears to be widely evaluated, but deployments are still few.
- Cilk++ (our product) is brand new, and although it hasn't been deployed at all yet in customer applications, we are seeing a growing set of evaluations worldwide.
- Microsoft's Parallel FX Library does not yet appear to be widely explored or deployed, although in October Microsoft announced their new Concurrency Runtime and Visual Studio tools which will surely put Microsoft on the multicore map in a big way.
Operating Systems Targeted by Multicore Developers
Finally, the "what O/S" data suggests that the majority of developers need a cross-platform solution - a Windows- or Linux-only solution likely won't cut it. Something for us tool vendors to keep in mind.

I hear the train a comin'...
The distant rumblings we hear - the customer demands for increased application performance, the increasing core counts in PCs, the need to crunch more data with existing servers - are the multicore train. Depending on your station, it may not reach you at the same time as your neighbor, but we'd all be well-advised to prepare for it, because it's comin' through!