Distributed Computing Archive: July 2008
« Previous | Main | Next »
July 2, 2008
Apache Hadoop Wins Terabyte Sort Benchmark
One of Yahoo's Hadoop clusters sorted 1 terabyte of data in 209 seconds, which beat the previous record of 297 seconds in the annual general purpose (daytona) terabyte sort benchmark. The sort benchmark, which was created in 1998 by Jim Gray, specifies the input data (10 billion 100 byte records), which must be completely sorted and written to disk. This is the first time that either a Java or an open source program has won. Yahoo is both the largest user of Hadoop with 13,000+ nodes running hundreds of thousands of jobs a month and the largest contributor, although non-Yahoo usage and contributions are increasing rapidly.
The cluster statistics were:
- 910 nodes
- 2 quad core Xeons @ 2.0ghz per a node
- 4 SATA disks per a node
- 8G RAM per a node
- 1 gigabit ethernet on each node
- 40 nodes per a rack
- 8 gigabit ethernet uplinks from each rack to the core
- Red Hat Enterprise Linux Server Release 5.1 (kernel 2.6.18)
- Sun Java JDK 1.6.0_05-b13
The benchmark was run with Hadoop trunk (pre-0.18) with a couple of optimization patches to remove intermediate writes to disk. The sort used 1800 maps and 1800 reduces and allocated enough memory to buffers to hold the intermediate data in memory. All of the code for the benchmark has been checked in as a Hadoop example.
Owen O'Malley
Yahoo! Grid Computing Team
Posted by aanand at 3:42 PM | Comments (23) | TrackBack | Permalink
Subscribe
Recent Blog Articles
view all
Slides from Hadoop World and University Talks
Wed, 28 Oct 2009
Hadoop User Group (HUG) – Oct 21st at Yahoo!
Fri, 23 Oct 2009
M45 Enables Web-Scale Information Extraction Research
Fri, 23 Oct 2009
Slides of September 23rd Bay Area Hadoop User Group
Mon, 05 Oct 2009
New Update: Yahoo! Distribution of Hadoop
Thu, 01 Oct 2009
Recent Links
Web addresses may adopt non-English characters | Digital Media - CNET News
Mon, 26 Oct 2009
Yahoo Open Hack NYC - Open Blog - NYTimes.com
Thu, 15 Oct 2009
Music Hack Day - Boston - Nov 20-21
Sun, 11 Oct 2009
A List Apart: Articles: Discovering Magic
Tue, 06 Oct 2009
Building iPhone Apps with HTML, CSS, and JavaScript
Sun, 04 Oct 2009
Archives

