What Multicore Artisans Say...

Without Cilk++, multicore enablement will require a drastic rewrite of our code, which can only be done by a small minority of our most experienced software developers. With Cilk++, we believe a team of largely junior developers can multicore-enable our code base.

No one provided a solution that is as crisp and simple, as easy to test and debug, and as high performing. I believe that Cilk Arts has a solution to a real and pressing problem in our industry.

Director of Research,
$300M Application Software Vendor

What Analysts Say...

Every software maker out there has got to learn how to program parallel code to remain competitive.

There's going to be a huge learning curve for developers to take on multi-threading in such a big way.

Dan Olds, Principal Analyst
Gabriel Consulting Group

Addressing The Multicore Programming Challenge


Duncan McCallum, CEO of Cilk Arts, discusses the multicore programming challenge facing the industry, and the mission of Cilk Arts

Cilk Arts serves companies that develop performance-sensitive, CPU-bound applications for multicore processors. For our customers, it is a competitive imperative to exploit all of the performance available in multicore processors. They face three challenges in doing so: development time, software reliability, and performance.

Development Time

The Multicore Challenge 

Developing multi-threaded software is dramatically more complex than developing serial code. This complexity requires organizations to acquire new programming skills - forcing retraining or retooling of development teams. With any of the alternatives to Cilk++, a legacy application must be redesigned before it can be multicore-enabled. These factors put enormous pressure on development schedules and introduce risk.

The Cilk++ Solution

The Cilk++ keywords are simple enough to learn in less than a day. As a result, all of your programmers can quickly become "multicore" developers using Cilk++. With Cilk++ you don't need to recruit new programmers or train existing programmers in a complicated new parallel programming model.

Use of Cilk++ requires little or no redesign of the original serial code, saving months to years of development time and dramatically reducing schedule risk. A Cilk++ program retains the serial semantics of the original code. The keywords can also easily be compiled out, allowing you to debug your application with the serial tools your programmers already know. Furthermore, customers can apply Cilk++ incrementally to their application, achieving rapid proof-of-benefit.
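To make "compiling out the keywords" concrete, here is a minimal sketch in C++. It assumes the Cilk++ keywords are spelled cilk_spawn and cilk_sync; the CILK_ENABLED flag and the fib function are hypothetical names chosen for illustration, not part of any Cilk Arts product. When the keywords are defined as empty macros, what remains is ordinary serial C++ with the same meaning.

```cpp
#include <cstdint>

// Assumption: cilk_spawn and cilk_sync are the Cilk++ keyword spellings.
// CILK_ENABLED is a hypothetical build flag used only in this sketch.
// Without it, the keywords are defined away, leaving plain serial C++
// that any standard compiler and serial debugger can handle.
#ifndef CILK_ENABLED
#define cilk_spawn
#define cilk_sync
#endif

// Recursive Fibonacci, a hypothetical example function.
std::uint64_t fib(unsigned n) {
    if (n < 2) return n;
    // In a parallel build, the spawned call may run alongside the next line.
    std::uint64_t a = cilk_spawn fib(n - 1);
    std::uint64_t b = fib(n - 2);
    // In a parallel build, wait here for the spawned call to finish.
    cilk_sync;
    return a + b;
}
```

In the serial build shown here, calling fib behaves exactly like the original unannotated function, so the existing serial regression tests apply unchanged.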

Software Reliability

The Multicore Challenge  

When parallelism is introduced into an application, that application becomes vulnerable to "race conditions". A race condition occurs when concurrent software tasks access a shared memory location and at least one of the tasks stores a value into the location. Depending on the scheduling of the tasks, the software may behave differently. The result is software flaws that are nondeterministic and very difficult to detect during testing.
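The definition above can be illustrated with a small sketch in standard C++ using std::thread rather than Cilk++. The function name count_with_two_tasks is hypothetical. Two tasks update one shared counter; with a plain variable the unsynchronized updates would be a race condition, so the sketch shows one common remedy, an atomic counter.

```cpp
#include <atomic>
#include <thread>

// Two concurrent tasks each add `per_task` increments to one shared counter.
// If `counter` were a plain `long`, both tasks would read and store the same
// memory location without synchronization: a race condition (undefined
// behavior in C++), and the total could come out short on some runs.
// std::atomic makes each increment indivisible, which removes the race.
long count_with_two_tasks(long per_task) {
    std::atomic<long> counter{0};
    auto work = [&counter, per_task] {
        for (long i = 0; i < per_task; ++i) {
            counter.fetch_add(1, std::memory_order_relaxed);
        }
    };
    std::thread t1(work);
    std::thread t2(work);
    t1.join();  // join() ensures all increments are visible before the read
    t2.join();
    return counter.load();
}
```

Because the racy variant fails only under certain schedules, it may pass many test runs before it misbehaves, which is what makes such flaws hard to catch without a dedicated tool.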

The Cilk++ Solution

Because a Cilk++ program retains the serial semantics of the original program, the debug/test infrastructure already in place to test the serial version of an application remains unchanged. Since both the serial code and the serial regression tests are identical to the original, the serial correctness of a program is unchanged as well.

The Cilk++ race detector flags race conditions, helping to assure the parallel correctness of a program. This allows you to build multicore-enabled code that is as reliable as the original serial application.

Performance

The Multicore Challenge

Building applications that fully utilize all the cores in a chip multiprocessor (CMP) is difficult with existing approaches. An application must be tuned to run well on a predetermined number of processor cores.

This dependence on the number of processors means that applications must be modified for each successive processor generation. It also requires development organizations to support multiple versions of an application so that it can run on a heterogeneous collection of hardware platforms.

The Cilk++ Solution

Best-in-class Performance: The Cilk++ runtime library delivers performance equal to or better than the best hand-tuned codes in a fraction of the development time.

Linear Scaling as cores are added: The Cilk++ scheduler delivers near-perfect linear speedup as cores are added. On programs with sufficient parallelism, speedup has been measured at 3.98x on 4 cores. In addition, the scheduler is indifferent to the number of processors being scheduled - it adapts dynamically.

Minimal Overhead on a Single Core: Cilk++ overheads on a single-core processor are negligible - often less than 2%. The result is that a Cilk++ application is not dependent on the number of cores available, and will run well even on a single-core machine. A customer can write one Cilk++ application and it will run optimally on any platform. Applications are future-proofed against the increasing core counts expected in future microprocessors.
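To put the 3.98x figure quoted above in perspective, a common yardstick is parallel efficiency: measured speedup divided by the number of cores. The helper below is a hypothetical illustration in C++, not part of Cilk++.

```cpp
// Parallel efficiency: measured speedup divided by the number of cores.
// A value of 1.0 would mean perfectly linear scaling.
double parallel_efficiency(double speedup, unsigned cores) {
    return speedup / static_cast<double>(cores);
}
```

For the quoted measurement, parallel_efficiency(3.98, 4) works out to 3.98 / 4 = 0.995, or 99.5% of ideal linear scaling on four cores.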