According to Hoyle...
Mac OS X 10.6 Snow Leopard: Xcode 3.2 Compilers
September 2009
by Jonathan Hoyle
jonhoyle@mac.com
macCompanion
http://www.jonhoyle.com
In an industry that is perpetually bogged down in
delays and late deliveries, Apple beat its own expectations and released Mac OS
X 10.6 Snow Leopard at the end of August. This being a developer column, we will dive into the new developer
tools. In particular, this month we
examine the new compiler options available with Xcode 3.2: gcc
4.2, LLVM and Clang.
gcc 4.2
We Mac developers tend to think
in terms of versions of Xcode (the complete development environment provided by
Apple), rather than versions of gcc (the open source compiler used by
Xcode). Xcode 1.x was used to
develop for Mac OS X 10.3 Panther, and it used the gcc 3.3 compiler, a very
popular compiler used amongst Unix programmers. Despite its ubiquitous use, it lacked the full ANSI/ISO C++
compliance that was available in the gcc 4 line. So it was only natural that when Xcode 2.0 was released (as
part of Mac OS X 10.4 Tiger), the default compiler was moved up to gcc 4.0. Although Xcode continued to evolve,
with versions 2.1, 2.2, 2.3, 2.4, 2.5, 3.0 and 3.1 spanning over four years
(incorporating universal binaries, Leopard support, 64-bit capabilities, etc.),
the default version of gcc they used never exceeded version 4.0.1.
Four years is a long time in
this industry. It is high time for
a change. With Xcode 3.2, the
default compiler has been updated to gcc 4.2 (although 4.0 is still available
for users to select if need be). gcc 4.2 was available as an option in Xcode 3.0/3.1, but it becomes the
default in 3.2. gcc 4.2 brings a
number of nice features, in addition to tightening its ANSI/ISO
compliance even further. Some of these features involve additional TR1 library support for the upcoming C++0x standard. These include TR1's <complex> and <random> headers, C compatibility header files and even a lock-free version of the shared_ptr<> class.
But the biggest benefit that
gcc 4.2 brings to the table is OpenMP, a multi-processor API which will take
full advantage of those multi-core Macs that are selling so well now. And unlike Grand Central
Dispatch, OpenMP is not specific to Snow Leopard or even Cocoa, so
it can be used in code you build for other platforms. An example explains it best:
Consider
a simple, yet common, task of summing the values of two arrays and placing
their sum in a third array:
for (i = 0; i < numItems; i++)
    z[i] = x[i] + y[i];
Suppose
this snippet lives in a critical area of code which you would like to
optimize. Machines with multiple processors could be used to parallelize this
loop, by dividing the work across each processor core, dramatically improving
performance. Unfortunately, writing the threading code and implementing the
required mutexes to support this is quite complex, and would necessarily
require a great deal of code debugging to get it right. If such a procedure is
needed throughout different areas of the code, it gets even worse.
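To give a feel for the boilerplate involved, here is a minimal hand-written sketch using POSIX threads. The names threaded_sum, sum_slice, SumSlice and NUM_THREADS are hypothetical, chosen only for this illustration. The sketch sidesteps mutexes entirely by giving each thread a fixed slice of the arrays; handing out work dynamically would require a shared, lock-protected counter on top of this.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { NUM_THREADS = 4 };   /* assumed worker count, fixed for simplicity */

/* Work description handed to one thread: a contiguous slice of the arrays. */
typedef struct {
    const double *x, *y;
    double *z;
    size_t begin, end;      /* half-open range [begin, end) */
} SumSlice;

/* Thread entry point: sums only the slice it was given. */
static void *sum_slice(void *arg)
{
    SumSlice *s = arg;
    for (size_t i = s->begin; i < s->end; i++)
        s->z[i] = s->x[i] + s->y[i];
    return NULL;
}

/* Returns 0 on success, -1 if a thread could not be created or joined. */
static int threaded_sum(const double *x, const double *y, double *z,
                        size_t numItems)
{
    pthread_t threads[NUM_THREADS];
    SumSlice slices[NUM_THREADS];
    int started = 0, status = 0;

    assert(x != NULL && y != NULL && z != NULL);

    for (int t = 0; t < NUM_THREADS; t++) {
        slices[t].x = x;
        slices[t].y = y;
        slices[t].z = z;
        slices[t].begin = numItems * t / NUM_THREADS;
        slices[t].end = numItems * (t + 1) / NUM_THREADS;
        if (pthread_create(&threads[t], NULL, sum_slice, &slices[t]) != 0) {
            status = -1;    /* stop launching; join what already started */
            break;
        }
        started++;
    }
    for (int t = 0; t < started; t++)
        if (pthread_join(threads[t], NULL) != 0)
            status = -1;
    return status;
}
```

That is roughly forty lines to parallelize a two-line loop, and it still has no load balancing.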
With OpenMP, however,
only two directives need be added:
// parallelize this loop
#pragma omp parallel shared(x,y,z,chunk) private(i)
{
    #pragma omp for schedule(dynamic, chunk) nowait
    for (i = 0; i < numItems; i++)
        z[i] = x[i] + y[i];
}
The
first #pragma informs the compiler which data objects are being shared
across threads, and which are private. The second #pragma handles
the for loop chunking, handing out chunk iterations at a time to whichever thread is free. This code is an order of magnitude easier to write, as
it simply directs the compiler to perform the parallelization. gcc implements OpenMP on top of POSIX threads, so the generated threading code rests on a
solid, well-tested foundation.
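For readers who want to try the loop, here is a minimal sketch that wraps the snippet in a complete function. The function name parallel_sum and the choice of double arrays are assumptions made for illustration; the column does not say what types x, y and z have. With gcc, OpenMP is switched on with the -fopenmp flag; without it the pragmas are ignored and the loop simply runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Sums x and y element by element into z. 'chunk' is the number of
   iterations handed to a thread at a time (must be positive). */
static void parallel_sum(const double *x, const double *y, double *z,
                         int numItems, int chunk)
{
    int i;

    assert(chunk > 0);

    /* parallelize this loop */
    #pragma omp parallel shared(x, y, z, chunk) private(i)
    {
        #pragma omp for schedule(dynamic, chunk) nowait
        for (i = 0; i < numItems; i++)
            z[i] = x[i] + y[i];
    }
}
```

A call such as parallel_sum(x, y, z, numItems, 100) produces the same z whether or not OpenMP is enabled, since each iteration writes a different element.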
As
mentioned above, OpenMP is a platform-independent standard and thus can be used in
cross-platform code. So in
addition to Mac OS X, you can compile this same code snippet in Visual
C++ 2005 Professional and Sun
Studio.
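One practical detail for cross-platform code: compilers that have OpenMP enabled define the _OPENMP macro, so the same source can still build where OpenMP is unavailable or switched off. The sketch below assumes hypothetical helper names (max_worker_threads, dot_product) and uses a reduction clause, which the column does not cover, to combine the per-thread partial sums safely.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>            /* only present when OpenMP is enabled */
#endif

/* Upper bound on the threads a parallel region may use; 1 without OpenMP. */
static int max_worker_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Dot product of two int arrays. Each thread keeps a private 'total'
   and OpenMP adds the partial results together at the end. */
static long dot_product(const int *x, const int *y, int numItems)
{
    long total = 0;
    int i;

    assert(numItems >= 0);

    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < numItems; i++)
        total += (long) x[i] * y[i];

    return total;
}
```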
LLVM
One of Apple's newest compiler
strategies is LLVM,
which stands for Low Level Virtual Machine. LLVM is an open source compiler
infrastructure designed for compile-time and run-time optimizations. Although LLVM targets itself nicely as
a replacement for gcc, its internal philosophy is a bit different. In LLVM, compilers are built as a set
of reusable libraries that applications can share as components. Furthermore, it is designed for
performance, with much faster compile times and more optimized code generation
than found in gcc. (Ask any former
Metrowerks CodeWarrior user who had to move to Xcode, what they thought of
gcc's performance.)
A compiler can be thought of as
containing three parts: the front end (the language parser), optimizer, and the
back end (code generator). The
front end (obviously) differs from language to language, the optimizer may or
may not, and the back end (if the compiler is written well) should be
completely language independent. LLVM provides only these last two pieces, and relies on other
(compatible) front ends. gcc 4.2's
front end is written to be compatible with LLVM (which is why you'll sometimes
see Apple documentation refer to its LLVM use as "LLVM-GCC 4.2").
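The three-part split can be sketched with a toy pipeline. Everything here is hypothetical and bears no relation to LLVM's real intermediate representation or API: a front end parses a tiny postfix language (digits, x, + and *) into a shared instruction list, an optimizer folds constant arithmetic, and a back end emits text for an imaginary stack machine. The point is only that the last two stages never see the source language.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy intermediate representation shared by all three stages. */
typedef enum { IR_PUSH, IR_LOAD_X, IR_ADD, IR_MUL } IrOp;
typedef struct { IrOp op; int value; } IrInst;

/* Front end: parse postfix source into IR. Returns the instruction
   count, or -1 on a bad character or when 'out' is full. */
static int front_end(const char *src, IrInst *out, int max)
{
    int n = 0;
    for (; *src != '\0'; src++) {
        if (*src == ' ')
            continue;
        if (n == max)
            return -1;
        out[n].value = 0;
        if (*src >= '0' && *src <= '9') {
            out[n].op = IR_PUSH;
            out[n].value = *src - '0';
        } else if (*src == 'x') {
            out[n].op = IR_LOAD_X;
        } else if (*src == '+') {
            out[n].op = IR_ADD;
        } else if (*src == '*') {
            out[n].op = IR_MUL;
        } else {
            return -1;
        }
        n++;
    }
    return n;
}

/* Optimizer: fold "PUSH a, PUSH b, ADD|MUL" into one PUSH, in place.
   Returns the new instruction count. */
static int optimize(IrInst *ir, int n)
{
    int out = 0;
    for (int i = 0; i < n; i++) {
        ir[out++] = ir[i];
        if (out >= 3 &&
            (ir[out - 1].op == IR_ADD || ir[out - 1].op == IR_MUL) &&
            ir[out - 2].op == IR_PUSH && ir[out - 3].op == IR_PUSH) {
            int a = ir[out - 3].value, b = ir[out - 2].value;
            ir[out - 3].value = (ir[out - 1].op == IR_ADD) ? a + b : a * b;
            out -= 2;       /* the folded PUSH replaces three instructions */
        }
    }
    return out;
}

/* Back end: emit one line of pseudo-assembly per IR instruction. */
static void back_end(const IrInst *ir, int n, char *buf, size_t size)
{
    size_t used = 0;

    assert(size > 0);
    buf[0] = '\0';
    for (int i = 0; i < n && used < size; i++) {
        int w = 0;
        switch (ir[i].op) {
        case IR_PUSH:
            w = snprintf(buf + used, size - used, "push %d\n", ir[i].value);
            break;
        case IR_LOAD_X:
            w = snprintf(buf + used, size - used, "load x\n");
            break;
        case IR_ADD:
            w = snprintf(buf + used, size - used, "add\n");
            break;
        case IR_MUL:
            w = snprintf(buf + used, size - used, "mul\n");
            break;
        }
        if (w < 0)
            break;
        used += (size_t) w;
    }
}
```

Feeding "2 3 + x *" through all three stages yields three lines: push 5, load x, mul. Swapping in a different front end (say, one that parses infix notation) would leave optimize and back_end untouched, which is the arrangement that lets gcc 4.2's parser sit on top of LLVM.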
It should be noted that
although Apple is concerned only with three language front ends in Xcode (C,
C++ and Objective-C), the open source LLVM will work with many other gcc front
ends, including Fortran, Ada and D.
LLVM has nearly all of the major
gcc 4.2 features, including blocks, stack canaries, OpenMP and the like. LLVM's code generation is significantly
better for 32-bit compilation (particularly for Intel). Its 64-bit Intel generated code is
about the same quality as gcc's, maybe marginally better. LLVM does not support 64-bit PowerPC
compilation.
Clang
Like LLVM,
Clang is a non-traditional compiler
technology. Clang, however, is only
a language front end, designed to be used with LLVM's optimizer and back end. With Clang, all remaining vestiges of
gcc can finally be removed. However, only Clang's C front end is fully complete. The Clang Objective-C compiler is
usable, but not quite finished. Sadly, its C++ (and by extension its Objective-C++) front end
will not be completed before 2011, so C++ is not supported in Xcode 3.2. However, if you can use Clang, you will
see compile times improve nearly threefold, a major
improvement over gcc. (I can hear
old CodeWarrior users now sighing "Finally!")
With a new front end, there are
always concerns about language syntax changes in existing code. Many moving from Xcode 1.5 to Xcode 2.x
(gcc 3.3 to gcc 4.0) a few years ago will remember having to make numerous code
changes. These changes were, for
the most part, good because gcc became tighter in its ISO compliance, and thus
changes were made to essentially fix "badly written" code. However, moving from gcc 4.x to Clang
should not involve nearly as many code changes, although some may be required.
One change is simply in
defaults. In gcc 4.x, the default
C dialect is based on the original (ancient) C89 standard. Users had to manually switch to C99 to get more modern
features. Clang sensibly uses C99
as the default. Therefore, if you
have some old code which is not C99 compatible (and you don't have time to fix
it), you will have to change your Clang settings to use C89.
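To make the difference concrete, here is a small sketch of C99-only constructs. The names sum_first, make_point and is_c99_or_later are hypothetical. A compiler held to strict C89 rejects or warns about each marked line, although gcc's default mode accepts some of them as GNU extensions; the dialect is normally chosen with the -std= option (for example -std=c89 or -std=c99).

```c
#include <assert.h>

typedef struct { int x; int y; } Point;

/* Sums the first n values. */
static int sum_first(const int *values, int n)
{
    assert(n >= 0);
    int total = 0;                  // C99: declaration after a statement
    for (int i = 0; i < n; i++)     // C99: variable declared inside 'for'
        total += values[i];
    return total;                   // C99: the '//' comments themselves
}

/* Builds a Point, naming the members explicitly. */
static Point make_point(int x, int y)
{
    Point p = { .y = y, .x = x };   // C99: designated initializers
    return p;
}

/* Reports which dialect the compiler says it is using. */
static int is_c99_or_later(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    return 1;
#else
    return 0;
#endif
}
```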
In Clang's Objective-C front
end, a number of incorrect and deprecated constructs are not supported. These include various "bad"
casting calls and sizeof() being used on NSArray. Here are some Objective-C examples which compile in gcc 4.x
but fail in Clang:
[(myInterface *) super add: 4]   // should just be [super add: 4]
(int *) addr = val;              // casting an l-value is bad. Use: addr = (float *) val;
sizeof(NSArray)                  // use: class_getInstanceSize([NSArray class]);
Clang is also command-line
compatible with gcc, so build scripts can be changed with little effort. For example:
/Developer/usr/bin/clang hello.c -o hello
Perhaps the best thing Clang
brings to the table is better error and warning messages. This is yet another complaint that
former CodeWarrior users have (rightly) squawked about over the years. Obscure and arcane gcc messages, which
are often more deceiving than helpful, have been replaced in Clang with more
appropriate and helpful ones. An
example Apple itself has used to demonstrate this improvement is a simple one
in which a missing header causes NSString not to be defined. In that case, the following line of
code will fail to compile:
NSString *s = @"I like Clang";
In gcc, that compiler error
looks like this:
Expected '=', ',', ';', 'asm' or '__attribute__' before '*' token
My God, we put a man on the
moon 40 years ago, but we can't get a better error message than that? A definite WTF. With Clang, however, the error message
reads:
Unknown type name 'NSString'
A simple, but definite,
improvement.
Conclusion
Things are really looking up in the Mac development world. Simply put, gcc 4.2 is better than gcc 4.0.1, LLVM-GCC 4.2
is better than gcc 4.2, and LLVM-Clang is better than LLVM-GCC 4.2. It's all good. Well, it may be all good, but sadly
it's not all finished. Clang still
does not have C++ enabled, making the most promising of the compiler
technologies the least useful. However, unless you have some very fragile code, or are still compiling
64-bit PowerPC (and why would you be needing to do that?), you should be able
to move to LLVM today.
Coming Up Next Month: More on Apple's development tools!
To see a list of all the According to Hoyle columns,
visit:
http://www.jonhoyle.com/maccompanion
http://www.maccompanion.com/macc/archives/September2009/Columns/AccordingtoHoyle.htm