According to Hoyle...
Mac OS X 10.6 Snow Leopard: Xcode 3.2 Compilers
September 2009
by Jonathan Hoyle
jonhoyle@mac.com
macCompanion
http://www.jonhoyle.com
In an industry that is perpetually bogged down in
delays and late deliveries,
Apple
beat its own expectations and released
Mac OS X 10.6 Snow Leopard
at the end of August. This being a developer column, we will dive into the
new developer tools. In particular, this month we examine
the new compiler options available with
Xcode 3.2:
gcc 4.2,
LLVM and
Clang.
gcc 4.2
We Mac developers tend to think in terms of versions
of Xcode (the complete development environment provided by Apple), rather than
versions of
gcc
(the open source compiler used by
Xcode). Xcode 1.x
was used to develop for
Mac OS X 10.3 Panther,
and it used
gcc 3.3,
a very popular compiler used amongst Unix programmers. Despite its ubiquitous
use, it lacked the full
ANSI/ISO C++
compliance that was available in the
gcc 4
line. So it was only natural that when
Xcode 2.0
was released (as part of
Mac OS X 10.4 Tiger),
the default compiler was moved up to gcc 4.0. Although Xcode continued to evolve,
with versions 2.1, 2.2, 2.3, 2.4, 2.5, 3.0 and 3.1 spanning over four years (incorporating
universal binaries,
Leopard support,
64-bit
capabilities, etc.), the default version of gcc they used never exceeded version 4.0.1.
Four years is a long time in this industry. It
is high time for a change. With
Xcode 3.2,
the default compiler has been updated to gcc 4.2 (although 4.0 is still available
for users to select if need be). gcc 4.2 was available as an option in
Xcode 3.0/3.1, but it becomes the default in 3.2. Beyond tightening
ANSI/ISO
compliance even further, 4.2 brings a number of nice features. Some of these involve additional
TR1 library support
for the upcoming
C++0x
standard. These include
the <complex> and <random> headers,
C compatibility header files and even a lock-free version of
the shared_ptr<> class.
But the biggest benefit that gcc 4.2 brings to the
table is
OpenMP,
a multiprocessing API that can take full advantage of those
multi-core
Macs that are selling so well now. And unlike
Grand Central Dispatch,
OpenMP is not specific to Snow Leopard or even
Cocoa,
so it can be used in code you build for other platforms. An example
explains it best:
Consider the simple yet common task of summing the
values of two arrays and placing each pairwise sum into a third array:
for (i = 0; i < numItems; i++)
    z[i] = x[i] + y[i];
Suppose this snippet lives in a critical area of
code which you would like to optimize. Machines with multiple processors
could be used to
parallelize
this loop, by dividing the work across each processor core, dramatically improving
performance. Unfortunately, writing
threading code
and implementing the required
mutexes
to support it can be quite complex, and can take a great deal
of debugging to get right. If such a procedure is needed throughout
different areas of the code, it can get even worse.
With OpenMP, however, only two directives need
be added:
// parallelize this loop
#pragma omp parallel shared(x,y,z,chunk) private(i)
{
    #pragma omp for schedule(dynamic,chunk) nowait
    for (i = 0; i < numItems; i++)
        z[i] = x[i] + y[i];
}
The
first #pragma opens a parallel region and informs
the compiler which data objects are shared across threads and which are private. The
second #pragma divides
the for loop
into blocks whose size is given by chunk, handed out to threads on demand. This code is an order of magnitude easier to write, as it simply directs
the compiler to perform the parallelization. In gcc, OpenMP is implemented on top of
POSIX threads,
which makes it quite solid and secure.
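For context, here is a minimal sketch of how the snippet above might sit inside a complete function. The function name add_arrays, the float element type and the chunk parameter are assumptions made for illustration; the column does not say how x, y, z or chunk are declared.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wrapper around the loop from the column.
 * Stores x[i] + y[i] into z[i] for every i in [0, numItems).
 * `chunk` is the number of iterations handed to a thread at a time. */
void add_arrays(const float *x, const float *y, float *z,
                int numItems, int chunk)
{
    int i;

    assert(x != NULL && y != NULL && z != NULL);
    assert(chunk > 0);

    /* Open a parallel region: the arrays and chunk are shared by all
     * threads, while each thread gets its own copy of the index i. */
    #pragma omp parallel shared(x, y, z, chunk) private(i)
    {
        /* Split the iterations into blocks of `chunk` and hand them out
         * to threads as they become free. */
        #pragma omp for schedule(dynamic, chunk) nowait
        for (i = 0; i < numItems; i++)
            z[i] = x[i] + y[i];
    }
}
```

With gcc, the directives only take effect when the -fopenmp flag is passed; without it the pragmas are ignored and the loop simply runs serially, producing the same result.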
As mentioned above, OpenMP is a platform-independent
standard and thus can be used in cross-platform code. So in addition to
Mac OS X, you can compile
this same code snippet in
Visual C++ 2005 Professional and
Sun Studio.
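One practical consequence of OpenMP being a standard is that the same source can also be built by a compiler that has OpenMP switched off. The sketch below relies on the _OPENMP macro, which the OpenMP specification says is defined when OpenMP is enabled; the function names max_worker_threads and scale_array are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _OPENMP
#include <omp.h>   /* only pulled in when the compiler has OpenMP enabled */
#endif

/* Returns how many threads OpenMP may use, or 1 when the code was
 * compiled without OpenMP support. */
int max_worker_threads(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Multiplies every element of v[] by factor. The pragma is ignored by a
 * compiler without OpenMP, so the loop falls back to running serially. */
void scale_array(double *v, int n, double factor)
{
    int i;

    assert(v != NULL || n == 0);

    #pragma omp parallel for
    for (i = 0; i < n; i++)
        v[i] *= factor;
}
```

The design choice here is to keep every OpenMP-specific call behind the macro, so the file needs no separate code path for each platform.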
LLVM
One of Apple's newest compiler strategies is
LLVM,
which stands for Low Level Virtual Machine. LLVM is an open source
compiler infrastructure designed for compile-time and run-time optimizations. Although
LLVM is positioned as a replacement for gcc, its internal philosophy is
a bit different. In LLVM, a compiler is built from a set of reusable libraries,
so that different tools can share the same components. Furthermore, it is designed
for performance, with much faster compile times and more optimized code generation
than found in gcc. (Ask any former
Metrowerks
CodeWarrior
user who had to move to Xcode what they thought of gcc's performance.)
A compiler can be thought of as containing three parts:
the front end (the language parser), the optimizer, and the back end (code generator). The
front end (obviously) differs from language to language, the optimizer may or may
not, and the back end (if the compiler is written well) should be completely language
independent. LLVM provides only these last two pieces, and relies on other
(compatible) front ends. gcc 4.2's front end is written to be compatible with
LLVM (which is why you'll sometimes see Apple documentation refer to its LLVM use as
"LLVM-GCC 4.2").
It should be noted that although Apple is concerned
only with three language front ends in Xcode
(C,
C++ and
Objective-C),
the open source LLVM will work with many other gcc front ends, including
Fortran,
Ada and
D.
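The three-part split described above can be illustrated with a toy pipeline. This is only a sketch of the general idea and has nothing to do with LLVM's real libraries or intermediate representation: every name in it (Op, front_end, optimize, back_end) is hypothetical, the "language" is just sums of integers and the variable x, and the "machine code" is plain text.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy intermediate form shared by all three stages: stack-machine ops. */
typedef enum { OP_PUSH_CONST, OP_PUSH_VAR, OP_ADD } OpKind;
typedef struct { OpKind kind; int value; } Op;
#define MAX_OPS 64

/* Front end: parses "term + term + ..." where a term is an integer or x.
 * Returns the number of ops written to ir[], or -1 on a syntax error. */
int front_end(const char *src, Op *ir)
{
    const char *p = src;
    int n = 0, terms = 0;

    for (;;) {
        while (isspace((unsigned char)*p)) p++;
        if (n + 2 > MAX_OPS) return -1;        /* room for term + add */
        if (*p == 'x') {
            ir[n].kind = OP_PUSH_VAR; ir[n].value = 0; n++; p++;
        } else if (isdigit((unsigned char)*p)) {
            char *end;
            ir[n].kind = OP_PUSH_CONST;
            ir[n].value = (int)strtol(p, &end, 10);
            n++; p = end;
        } else {
            return -1;
        }
        if (++terms > 1) { ir[n].kind = OP_ADD; ir[n].value = 0; n++; }
        while (isspace((unsigned char)*p)) p++;
        if (*p == '\0') return n;
        if (*p != '+') return -1;
        p++;
    }
}

/* Optimizer: language independent. Folds "push a, push b, add" into a
 * single "push a+b". Returns the new op count. */
int optimize(const Op *in, int n, Op *out)
{
    int m = 0, i;

    for (i = 0; i < n; i++) {
        out[m++] = in[i];
        if (m >= 3 && out[m - 1].kind == OP_ADD &&
            out[m - 2].kind == OP_PUSH_CONST &&
            out[m - 3].kind == OP_PUSH_CONST) {
            out[m - 3].value += out[m - 2].value;
            m -= 2;
        }
    }
    return m;
}

/* Back end: turns ops into text for an imaginary stack machine.
 * Returns 0 on success, -1 if the buffer is too small. */
int back_end(const Op *ir, int n, char *out, size_t cap)
{
    size_t used = 0;
    int i;

    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        int w;
        switch (ir[i].kind) {
        case OP_PUSH_CONST:
            w = snprintf(out + used, cap - used, "push %d\n", ir[i].value);
            break;
        case OP_PUSH_VAR:
            w = snprintf(out + used, cap - used, "load x\n");
            break;
        default:
            w = snprintf(out + used, cap - used, "add\n");
            break;
        }
        if (w < 0 || (size_t)w >= cap - used) return -1;
        used += (size_t)w;
    }
    return 0;
}
```

Only front_end knows anything about the source syntax; optimize and back_end see nothing but the op list, which is why a different front end could reuse them unchanged.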
LLVM has nearly all of the major gcc 4.2 features,
including
blocks,
stack canaries,
OpenMP and the like. LLVM's code generation is significantly better for
32-bit compilation
(particularly for
Intel). Its
64-bit Intel generated code is about the same quality as gcc's, maybe marginally
better. LLVM does not support 64-bit PowerPC compilation.
Clang
Like LLVM,
Clang
is a non-traditional compiler technology. Clang, however, is only a language
front-end, designed to be used with LLVM's optimizer and back-end. With
Clang, all remaining vestiges of gcc can finally be removed. However, only
Clang's C front end is fully complete at this time. The Clang Objective-C
compiler is usable, but not quite finished. Sadly, its C++ (and by extension, its
Objective-C++)
front ends are not expected to be completed before 2011, so they are not supported in Xcode 3.2. However,
if you can use Clang, you will see a nearly threefold improvement in compile
times, a major advance over gcc. (I can hear old CodeWarrior users now
sighing "Finally!")
With a new front end, there are always concerns about
language syntax changes in existing code. Many who moved from Xcode 1.5 to
Xcode 2.x (gcc 3.3 to gcc 4.0) a few years ago will remember having to make numerous
code changes. These changes were, for the most part, good because gcc became
tighter in its ISO compliance, and thus changes were made to essentially fix "badly
written" code. However, moving from gcc 4.x to Clang should not involve
nearly as many code changes, although some may be required.
One change is simply in defaults. In gcc 4.x,
the default C dialect is the original (and ancient)
C89 (strictly speaking gnu89, which is C89 plus GNU extensions). Users
had to manually change to
C99
to get more modern features. Clang correctly uses C99 as the default. Therefore,
if you have some old code which is not C99 compatible (and you don't have time to fix it),
you will have to change your Clang settings to use C89.
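To make the dialect difference concrete, here is a small sketch that uses syntax C99 added to the language. The names point, sum_c99, make_point and is_c99_or_later are hypothetical and chosen only for this illustration.

```c
#include <assert.h>

/* A small struct used to show C99 designated initializers. */
struct point { int x; int y; };

/* Sums the first n values of a[] using C99-only syntax. */
int sum_c99(const int *a, int n)
{
    assert(n >= 0);
    int total = 0;                 // C99: declaration after a statement, // comment
    for (int i = 0; i < n; i++)    // C99: loop variable declared in the for statement
        total += a[i];
    return total;
}

/* Builds a point with C99 designated initializers (order does not matter). */
struct point make_point(void)
{
    struct point p = { .y = 2, .x = 1 };
    return p;
}

/* Returns 1 when compiled in C99 (or later) mode, 0 otherwise. */
int is_c99_or_later(void)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    return 1;
#else
    return 0;
#endif
}
```

Both gcc and Clang accept -std=c99 and -std=c89 on the command line to select the dialect explicitly. Code going the other way, such as old C89 sources that rely on implicit int or use restrict or inline as identifiers, is the kind that would need the C89 setting under Clang.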
In Clang's Objective-C front end, a number of incorrect and
deprecated constructs
are not supported. These include various "bad" casts
and sizeof() applied
to Objective-C classes such as NSArray.
Here are some Objective-C examples which compile in gcc 4.x, but
fail in Clang:
[(MyInterface *)super add:4]; // use: [super add:4];
(int *) addr = val; // use: addr = (float *) val;
sizeof(NSArray) // use: class_getInstanceSize([NSArray class])
Clang is also command-line compatible with gcc, so
build scripts can be switched over simply by changing the compiler name. For example:
/Developer/usr/bin/clang hello.c -o hello
Perhaps the best thing Clang brings to the table is
better error and warning messages. This is yet another sore point that former
CodeWarrior users have (rightly) squawked about over the years. Obscure
and arcane gcc messages, which are often more misleading than helpful, have been
replaced in Clang with clearer and more useful ones. An example which
Apple itself has used to demonstrate this improvement is a simple one in which
a missing header
causes NSString not
to be defined, so that the following line of code fails to compile:
NSString *s = @"I like Clang";
In gcc, that compiler error looks like this:
Expected '=', ',', ';', 'asm' or '__attribute__' before '*' token
My God,
we put a man on the moon 40 years ago,
but we can't get a better error message than that?? A definite WTF. With
Clang, however, the error message reads:
Unknown type name 'NSString'
A simple, but definite, improvement.
Conclusion
Things are really looking up in the Mac development
world. Simply put, gcc 4.2 is better than gcc 4.0.1, LLVM-GCC 4.2 is better
than gcc 4.2, and LLVM-Clang is better than LLVM-GCC 4.2. It's all good. Well,
it may be all good, but sadly it's not all finished. Clang still does not
have C++ enabled, making the most promising of the compiler technologies the least
useful at the moment. Fortunately, unless you have some very fragile code,
or are still compiling 64-bit PowerPC (and why would you be needing to do that?),
you should be able to move to LLVM today.
Coming Up Soon: More on Apple's development tools!
To see a list of all the According to Hoyle columns, visit:
http://www.jonhoyle.com/maccompanion
http://www.maccompanion.com/macc/archives/September2009/Columns/AccordingtoHoyle.htm