<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.1//EN' 'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'>
<html xmlns:og='https://ogp.me/ns#' lang='en'>
<head>
<title>ongoing by Tim Bray &#xb7; On Threads</title>
<meta name='viewport' content='width=device-width, initial-scale=1.0, shrink-to-fit=no'/>
<meta property='og:site_name' content='ongoing by Tim Bray'/>
<meta property='og:title' content='On Threads'/>
<meta property='og:image' content='/ongoing/misc/podcast-default.jpg'/>
<meta property='og:type' content='website'/>
<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>
<link rel='stylesheet' type='text/css' media='screen' title='serif' href='/ongoing/serif.css' />
<script type='text/javascript' src='//use.typekit.net/ugm7uwx.js'></script>
<script type='text/javascript'>try{Typekit.load();}catch(e){}</script>
<script type='text/javascript' src='/ongoing/ongoing.js'></script>
<link rel='alternate' type='application/atom+xml' title='Atom (full content)' href='/ongoing/ongoing.atom' />
<!-- Generated from XML source code using Perl, Expat, Emacs, Mysql, Ruby, Java, and ImageMagick.  Industrial-strength technology, baby. -->
</head><body itemscope='' itemtype='http://schema.org/Blog'>
<div id='payload'>
<div id='banner'><h1 itemprop='name'>On Threads</h1><div id='search'><form action="https://www.google.com/search" target="_parent">Search <input size="20" name="as_q" /><input type="hidden" name="hl" value="en" /><input type="hidden" name="ie" value="UTF-8" /><input type="hidden" name="btnG" value="Google+Search" /><input type="hidden" name="as_qdr" value="all" /><input type="hidden" name="as_occt" value="any" /><input type="hidden" name="as_dt" value="i" /><input type="hidden" name="as_sitesearch" value="tbray.org" /></form></div></div>
<div id='center-and-right'><div id='centercontent'>
<p itemprop='description'>Last week I attended a Sun “CMT Summit”, where CMT stands for “Chip
Multi-Threading”; a roomful of really senior Sun people talking about the
next wave of CPUs and what they mean.
While much of the content was stuff I can’t talk about, I was left with a
powerful feeling that there are some 
real important issues that the whole IT community needs to start thinking
about now.
I’ve
<a href='/ongoing/When/200x/2004/12/13/Multicore'>written about this
before</a>, and of the many others who have too, I’m particularly impressed by
<a href='http://www.aceshardware.com/read.jsp?id=65000333'>Chris Rijk’s
work</a>.
But I think it’s worthwhile to pull all this together into one place and do
some calls to action, so here goes.
<i>[Ed. Note: Too long and too geeky for most.]
[Update: This got slashdotted and I got some really smart feedback, thus
<a href='/ongoing/When/200x/2005/06/20/Threads'>this
follow-up</a>.]</i></p>
 
<p id='p-1' class='p1'><span class='h2'>Where We Are Now</span> &#xb7; 
It’s no secret at all that we’re shipping
<a href='http://www.sun.com/processors/throughput/'>Niagara</a> before too long
(pictures
<a href='http://blogs.sun.com/jonathan/20040910'>here</a>).
Niagara has eight cores each with hardware support for four threads;
and bear in mind that we’re talking about Niagara <strong>1</strong>.
I am totally not privy to clock-rate numbers, but I see that Paul
Murphy is
<a href='http://blogs.zdnet.com/open-source/?p=316&amp;part=rss&amp;tag=feed&amp;subj=zdblog'>claiming over on ZDNet</a> that 
it runs at 1.4GHz.</p>

<p>Whatever the clock rate, multiply it by eight and it’s pretty obvious that
this puppy is going to be able to pump through a whole lot of instructions in
aggregate.</p>

<p>It’s not just us.  Both AMD and Intel are sorta kinda shipping dual-core
parts, and just this week AMD was
<a href='http://www.extremetech.com/article2/0,1558,1826663,00.asp'>making
quad-core noises</a>.  I thought the key line in that story was the AMD people
talking about “throughput per watt per dollar”; a lot of really smart people
all over the industry are deciding that throughput per watt is getting more
important by the day, and as for throughput per dollar, that’s not new
news.</p>

<p>IBM’s much-touted “Cell” work is highly parallel; having said
that, they seem to still be bearing down harder than anyone else on cranking
up the clock.  I wonder if anyone will hit 5GHz in the foreseeable future, and
if so I wonder what kind of cooling-system rocket science will be required to
keep the sucker from doubling as a small local nuclear-fusion powerplant?</p>

<p>So, while Sun will probably be the first
player slapping big money down on the multithreading horse in the
high-stakes CPU race, you still need to pay attention even if you’re not a Sun
customer.
Because a few years from now, you’re going to need a lot more CPU cycles
than you do now, and unless you’re willing to bet on that 5GHz fusion reactor,
multithreading is how you’re probably going to get them.</p>

<p id='p-2' class='p1'><span class='h2'>What Scales and What Doesn’t?</span> &#xb7; 
At one point during the CMT summit, I stuck my hand up and asked: is there
anything that <em>in principle</em> doesn’t scale with multithreading?
There wasn’t a lot that leapt to the minds’ eyes, except for compiler code.
(Bear in mind that while an individual compile doesn’t parallelize that well,
what <i>make</i> and <i>Ant</i> do can be, and has been.)</p>

<p>Now of course, the room was full of Sun infrastructure weenies, so if there’s
something terribly obvious in records management or airline reservations or
payroll processing that doesn’t parallelize, we might not know about it.
Having said that, it’s fair to conclude that multithreading will help with a
pretty fair proportion of the things that computers do.</p>

<p>And of course there are lots of workloads where multithreading is already
known to work beautifully, and that includes a whole lot of Web workloads and
other server-side apps.</p>

<p id='p-6' class='p1'><span class='h2'>The Programmer Community</span> &#xb7; 
The conversation at the summit spent quite a bit of time on what we need to
do to help the developer community get the most out of these weird chips; the
days of putting everything in a big loop and counting on the clock rate to
make your program run faster every year are so, so, over.</p>

<p>First off, we agreed pretty quickly that there isn’t one “developer
community”; there are infrastructure developers who totally have to think
about threading and concurrency, and there are application developers who
would totally rather not.</p>

<p>Which led to a real interesting discussion: are mainstream enterprise
programmers, who today live somewhere in the spectrum from Visual Basic to
J2EE, ever going to do concurrent programming, consciously?
The conventional wisdom is that they’re just not up to it, but as
<a href='http://today.java.net/pub/au/189'>Graham Hamilton</a> pointed out,
there was a long period when the bleeding edge knew all about
Object-Orientation and garbage collection and so on, but despaired of the
mainstream developer ever getting there.
But then the mainstream moved, and
pretty quickly too.
So I don’t think any of us are comfortable with asserting that “The mainstream
will never grok concurrency.”
Maybe it’s just the tools that are missing?</p>

<p>At this point the 
<a href='http://www.erlang.org/'>Erlang</a> community will jump up and down
and shout “We have the answer!”  Maybe, but I’m dubious: if I understand
Erlang correctly, it abjures the use of global data, which simplifies the
problems immensely.
I’ve done a lot of concurrent work, and my biggest programming wins 
have been all about a bunch of threads running around a big shared data
structure.</p>
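
<p>For readers who want something concrete, here is a minimal sketch of the
“bunch of threads running around a big shared data structure” pattern.
It is an illustration, not code from any of the projects mentioned here: the
class name <code>SharedCounts</code>, the word-counting task, and the inputs are
all made up, and it leans on <code>ConcurrentHashMap.merge</code>, which
arrived in Java 8.</p>

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example: several worker threads update one shared map.
public class SharedCounts {

    // Count words from every input into a single shared map, one task per input.
    static ConcurrentHashMap<String, Integer> countWords(String[] inputs, int threads)
            throws InterruptedException {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String input : inputs) {
            pool.execute(() -> {
                for (String word : input.split("\\s+")) {
                    if (!word.isEmpty()) {
                        // merge() updates one key atomically, so no explicit lock is needed.
                        counts.merge(word, 1, Integer::sum);
                    }
                }
            });
        }
        pool.shutdown();
        // Wait for the workers; their writes are visible once this returns true.
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("workers did not finish in time");
        }
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        String[] inputs = { "a b a", "b c", "a c c" };
        System.out.println(countWords(inputs, 3));
    }
}
```

<p>The design choice worth noticing is that the threads never coordinate with
each other directly; all the coordination lives inside the shared map.</p>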

<p id='p-4' class='p1'><span class='h2'>Java Basics</span> &#xb7; 
Java and CMT are a really good fit for each other.
For people who have to actually deal with threads and locks and that kind of
stuff, Java provides the best programming infrastructure I’ve ever used.
This doesn’t mean it’s easy, but it is tractable.
(Can I assume that .NET, since it came after Java, learned the lessons and is
also decent in this respect?)</p>

<p>At a higher level, people in J2EE-land live in a world of containers of
various kinds, and these things are all thread-savvy because they’ve been
carefully built that way by experts. 
So a lot of the big enterprise apps should be CMT-turbocharged just fine, for
free.</p>

<p id='p-5' class='p1'><span class='h2'>Where are the Problems?</span> &#xb7; 
The most useful part of the summit was identifying the places where the
industry as a whole and Sun in particular need to get to work to get ready for
ubiquitous CMT.
Because we’re facing some real problems.</p>

<p id='p-8' class='p1'><span class='h3'>Problem: Legacy Apps</span> &#xb7; 
You’d be surprised how many cycles the world’s Sun boxes spend running
decades-old FORTRAN, COBOL, C, and C++ code in monster legacy apps that work
just fine and aren’t getting thrown away any time soon.
There aren’t enough people and time in the world to re-write these suckers,
plus it took person-centuries in the first place to make them correct.</p>

<p>Obviously it’s not just Sun, I bet every kind of computer you can think of
carries its share of this kind of good old code.
I guarantee that whoever wrote that code wasn’t thinking about threads or
concurrency or lock-free algorithms or any of that stuff.
So if we’re going to get some real CMT juice out of these things, it’s going
to have to be done automatically down in the infrastructure.
I’d think the legacy-language compiler teams have lots of opportunities for
innovation in an area where you might not have expected it.</p>

<p id='p-7' class='p1'><span class='h3'>Problem: Observability</span> &#xb7; 
One of the lessons of Solaris 10 is that being able to tell what your
system is doing is a <em>really big deal</em>.
Of all the Solaris goodies, DTrace has been the big attention-getter.
If we drop a big complicated app onto a CMT box and it runs real fast, that’s
good; but what if it doesn’t?  We’re going to need DTrace or equivalent right
up through all the levels of the application stack, and building that’s going
to be a big job.</p>

<p id='p-12' class='p1'><span class='h3'>Problem: Java Mutexes</span> &#xb7; 
The standard APIs that came with the first few versions of Java were thread
safe; some might say fanatically, obsessively, thread-safe.
Stories abound of I/O calls that plunge down through six layers of stack, with
each layer posting a mutex on the way; and venerable standbys like
<a href='http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuffer.html'>StringBuffer</a>
and
<a href='http://java.sun.com/j2se/1.5.0/docs/api/java/util/Vector.html'>Vector</a>
are mutexed-to-the-max.
That means if your app is running on next year’s hot chip with a couple of
dozen threads, if you’ve got a routine that’s doing a lot of string-appending
or vector-loading, only one thread is gonna be in there at a time.</p>

<p>One thing the Java people need to do is put big loud blinking messages in
all that Javadoc saying <strong>Using this class may impair performance in
multi-threaded environments!</strong> You can drop in
<a href='http://java.sun.com/j2se/1.5.0/docs/api/java/util/ArrayList.html'>ArrayList</a>
for Vector and
<a href='http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuilder.html'>StringBuilder</a>
for StringBuffer.
Hey, I just noted that the StringBuffer Javadoc does have such a warning; good
stuff, but we need to be doing more evangelism on this front.</p>

<p>On the other hand, those mutexes were there for a reason.
Nobody’s saying “Ignore thread-safety” but rather “Thread-safety is expensive,
don’t do it unless you need to.”</p>
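
<p>One way to act on that advice, sketched here under assumptions: keep each
<code>StringBuilder</code> confined to the thread that created it, so the
unsynchronized class is safe and nobody queues up on a mutex while appending.
The class <code>ConfinedBuilders</code> and its <code>render</code> method are
hypothetical names invented for this illustration; only
<code>StringBuilder</code>, <code>ArrayList</code>, and the
<code>java.util.concurrent</code> classes are real APIs.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: unsynchronized builders, one per task, never shared.
public class ConfinedBuilders {

    // The StringBuilder lives and dies inside this call, so no lock is needed.
    static String render(int row, int columns) {
        StringBuilder sb = new StringBuilder();
        for (int c = 0; c < columns; c++) {
            sb.append(row).append(':').append(c).append(' ');
        }
        return sb.toString().trim();
    }

    // Render every row on a worker pool and collect results in row order.
    static List<String> renderAll(int rows, int columns, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (int r = 0; r < rows; r++) {
                final int row = r;
                Callable<String> task = () -> render(row, columns);
                pending.add(pool.submit(task));
            }
            // This ArrayList is only ever touched by the calling thread.
            List<String> lines = new ArrayList<>();
            for (Future<String> f : pending) {
                lines.add(f.get()); // blocks until that row is done
            }
            return lines;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        for (String line : renderAll(3, 4, 3)) {
            System.out.println(line);
        }
    }
}
```

<p>If the builder or the list really did have to be shared between threads,
the synchronized classes (or an explicit lock) would be back on the table;
that is the “unless you need to” part.</p>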

<p id='p-10' class='p1'><span class='h3'>Problem: LAMP</span> &#xb7; 
An increasing proportion of enterprise computing is being done not in J2EE
nor WebSphere nor .NET, but in PHP and Python and MySQL and this or
that Apache module.
While Iâ€™m a big fan of dynamic languages, when it comes to
parallelism they’re pretty primitive compared to Java.
So that community is going to need to put some cycles into becoming
CMT-friendlier, and they’re starting from behind.</p>

<p>With the exception of Apache, which has been thread-savvy and sensibly
concurrent for a long time.
I’m pretty intimate with the guts of Apache, and based only on what we’ve said
publicly about Niagara, here’s a fearless prediction: a workload that is
Apache-dominated is going to run like a bat out of hell on that kind of box.
Wait and see.
Having said that, a lot of the Apache world is still running 1.3, and Apache 2
changed the process/threading model quite a lot, so I suspect there’s some
useful work to be done (probably at the APR level) in making sure it takes
advantage of what the hardware can do.</p>

<p id='p-11' class='p1'><span class='h3'>Problem: Testing and Debugging</span> &#xb7; 
I am right now, in the Zeppelin context, grinding away on a highly
concurrent multi-threaded application.
Debugging it is a complete mindfuck, 
and I’m spending too much time debugging
it because I have no idea how to write the unit tests.
Consider a method that gets a
network request for more resources, discovers which other computers in the
cluster are advertising cycles to spare, pings them to see if they’re really
there, asks them to handle the request, and reports back to the requester; how
do you unit-test that?  I have no idea.</p>

<p>This is hard low-level Computer Science and we in the industry trenches
could sure use some help from the researchers; are the researchers looking in
this direction?</p>

<p id='p-9' class='p1'><span class='h3'>Problem: How Many Is Enough?</span> &#xb7; 
Right at the moment, CMT is the low-hanging fruit in CPU performance; it’s
a lot cheaper and more tractable (and power-efficient) to double the number of
threads than to double the clock rate.
But this trend can only go on so long; my intuition is that most modern server
workloads will have no trouble using 32 hardware threads.  How about 64?  How
about a thousand?
At some point the return on investment gets lousy, and we’re going to have to
go back to grinding away at the clock rate, or whatever the next trick is that
we haven’t thought of yet.</p>
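
<p>A back-of-the-envelope way to see where the return gets lousy is Amdahl’s
law, which says the speedup on <i>n</i> threads is 1 / ((1 &#x2212; <i>p</i>) +
<i>p</i>/<i>n</i>) when a fraction <i>p</i> of the work can run in parallel.
That framing is an addition to the argument above, not something from the
summit, and the 95% parallel fraction in the sketch below is a made-up number
rather than a measurement of any real workload.</p>

```java
// Hypothetical example: how speedup flattens out as hardware threads are added.
public class ThreadScaling {

    // Amdahl's law: speedup on n threads when a fraction p of the work
    // can run in parallel and the remaining (1 - p) stays serial.
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        double p = 0.95; // assumed parallel fraction, purely illustrative
        for (int n : new int[] { 32, 64, 1000 }) {
            System.out.printf("%4d threads: %.1fx%n", n, speedup(p, n));
        }
    }
}
```

<p>Under that assumption the ceiling is 1 / (1 &#x2212; <i>p</i>), or 20x, no
matter how many threads you throw at it, so doubling from 32 to 64 buys a lot
less than doubling from 2 to 4 did.</p>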

<p id='p-3' class='p1'><span class='h2'>Conclusion</span> &#xb7; 
I’d like to end on a positive note, because actually I’m pretty optimistic:
I think that we’re going to get a few years’ good mileage out of cranking up
the parallelism, and enough benefits will fall out of the architecture for
free to make it worthwhile.
I also agree with Paul Murphy (see the link above) that these chips are
going to be well-suited for laptops.
I’m currently sitting in front of a 1.25GHz PowerPC; it’s got an Altivec for
graphics, but everything else is single-threaded.  Pretty well everything runs
pretty well fast enough, except maybe Photoshop and video processing.  And
those things are already parallelized.</p>

<p>So, given that CMT chips use less watts per unit of computing, why aren’t
they being designed into the next generation of laptops?
If I were a hot young EE looking for an opportunity, I’d be thinking
startup.</p>

<hr />
<div id='commentHere'></div>
<div id='footer'><p class='footer'><b>Updated: 2005/06/20</b></p>
</div>
</div>

<div id='rightcontent'><div class='oo'><a id='to-home' href='https://www.tbray.org/ongoing/'><span id='home'>ongoing</span></a></div>
<div>
<div class='principles'>
<a href='/ongoing/WhatItIs'>What this is</a> &#xb7;
<a href='/ongoing/ongoing.atom'><img title="Subscribe to ongoing" alt="Subscribe to ongoing" src="/ongoing/Feed.png"/></a><br/>
<a href='/ongoing/Truth'>Truth</a> &#xb7;
<a href='/ongoing/Biz'>Biz</a> &#xb7;
<a href='/ongoing/Tech'>Tech</a></div>
<a href='/ongoing/misc/Tim'>author</a> &#xb7;
<a href='http://www.textuality.com/BillBray/'>Dad</a><br/>
<a href='/ongoing/misc/Colophon'>colophon</a> &#xb7;
<a href='/ongoing/misc/Copyright'>rights</a>
</div>
<div id='potd'><a id='tnA' href='/ongoing/goto-potd/'><img id='tnI' src='/ongoing/potd.png' alt='picture of the day' /></a></div>
<div id='cats'>
<a href='/ongoing/When/200x/2005/06/'>June</a> <a href='/ongoing/When/200x/2005/06/12/'>12</a>, <a href='/ongoing/When/200x/2005/'>2005</a><br/> &#xb7; <a href='/ongoing/What/Technology'>Technology</a><span class='more'> (90 fragments)</span>
<br/> &#xb7; &#xb7; <a href='/ongoing/What/Technology/Concurrency'>Concurrency</a><span class='more'> (76 more)</span>
<br/> &#xb7; &#xb7; <a href='/ongoing/What/Technology/Software'>Software</a><span class='more'> (86 more)</span>
</div>

<div class="employ">
<p>By <a rel="author" href="/ongoing/misc/Tim">Tim Bray</a>.</p>
<p>The opinions expressed here <br/>
are my own, and no other party<br/>
necessarily agrees with them.</p>
<p>A full disclosure of my<br/>
professional interests is<br/> 
on the <a href='/ongoing/misc/Tim'>author</a> page.</p>
<p>I’m on <a rel="me" href="https://cosocial.ca/@timbray">Mastodon</a>!</p>
</div>



</div>
</div>
</div>

</body>
</html>