Yahoo! Hadoop Blog

June 29, 2010

Hadoop 0.20.S Virtual Machine Appliance

At Yahoo!, we recently implemented a stronger notion of security for the Hadoop platform, using Kerberos as the underlying authentication system. We have also successfully enabled this feature on our internal data processing clusters. I am sure many Hadoop developers and enterprise users are looking forward to getting hands-on experience with this enterprise-class Hadoop Security feature.

In the past, we've helped developers and users get started with Hadoop by hosting a comprehensive Hadoop tutorial on YDN, along with a pre-configured single-node Hadoop (0.18.0) Virtual Machine appliance.

This time, we decided to upgrade this Hadoop VM with a pre-configured single-node Hadoop 0.20.S cluster, along with the required Kerberos system components. We have also included Pig (version 0.7.0), a high-level, SQL-like data processing language used at Yahoo!.

This blog post describes how to get started with the Hadoop 0.20.S VM appliance. The basic information about downloading, setting up VMware Player, and using the Hadoop VM is the same as described in module 3 of the tutorial, except that you should use the information and links below to download the latest VMware Player and the Hadoop 0.20.S VM image. You should also review the security-specific commands below, which need to be run before running M/R or Pig jobs.

For more details on deploying and configuring the Yahoo! Hadoop 0.20.S security distribution, look for continuing announcements and details on Hadoop-YDN.

Installing and Running the Hadoop 0.20.S Virtual Machine:

  • Virtual Machine and Hadoop environment: See details here.
  • Install VMware Player: See details here. To download the latest VMware Player for Windows/Linux, go to the VMware site.
  • Setting up the Virtual Environment for Hadoop 0.20.S:
Copy the [Hadoop 0.20.S Virtual Machine] into a location on your hard drive.
It is a zipped VMware folder (hadoop-vm-appliance-0-20-S, approximately 400MB) that includes a few files: a .vmdk file that is a snapshot of the virtual machine's hard drive, and a .vmx file that contains the configuration information to start the virtual machine. After unzipping the folder, double-click the hadoop-appliance-0.20.S.vmx file to start the virtual machine. Note: The uncompressed size of the hadoop-vm-appliance-0-20-S folder is ~2GB. Also, depending on the data you upload for testing, the VM disk is configured to grow up to 20GB.

When you start the virtual machine for the first time, VMware Player will recognize that the virtual machine image is not in the same location it used to be. You should inform VMware Player that you copied this virtual machine image (choose "I copied it"). VMware Player will then generate new session identifiers for this instance of the virtual machine. If you later move the VM image to a different location on your own hard drive, you should tell VMware Player that you have moved the image.

After you select this option and click OK, the virtual machine should begin booting normally. You will see it perform the standard boot procedure for a Linux system. It will bind itself to an IP address on an unused network segment, and then display a prompt allowing a user to log in.

Note: The IP address displayed on the login screen can be used to connect to the VM instance over SSH. The login screen also displays information about starting/stopping Hadoop daemons, users/passwords, and how to shut down the VM.

Note: It is much more convenient to access the VM via SSH. See details here.
  • Virtual Machine User Accounts:
The virtual machine comes pre-configured with two user accounts: "root" and  "hadoop-user". The hadoop-user account has sudo permissions to perform system-management functions, such as shutting down the virtual machine. The vast majority of your interaction with the virtual machine will be as hadoop-user. To log in as hadoop-user, first click inside the virtual machine's display. The virtual machine will take control of your keyboard and mouse. To escape back into Windows at any time, press CTRL+ALT at the same time. The hadoop-user user's password is hadoop. To log in as root, the password is root.
  • Hadoop Environment:
Linux  : Ubuntu 8.04
Java   : JRE 6 Update 7 (see license info at /usr/jre16/)
Hadoop : 0.20.S (installed at /usr/local/hadoop; /home/hadoop-user/hadoop is a symlink to the install directory)
Pig    : 0.7.0 (pig jar is installed at /usr/local/pig; /home/hadoop-user/pig-tutorial/pig.jar is a symlink to the one in the install directory)

Login: hadoop-user, Passwd: hadoop (sudo privileges are granted for hadoop-user). The other users are hdfs and mapred (passwd: hadoop).

The Hadoop VM starts all the required Hadoop and Kerberos daemons during the boot-up process. If you need to stop or restart them:
  • To start/stop/restart Hadoop: log in as hadoop-user and run 'sudo /etc/init.d/hadoop [start | stop | restart]' ('sudo /etc/init.d/hadoop' gives the usage)
  • To format the HDFS & clean all state/logs: log in as hadoop-user and run 'sudo reinit-hadoop'
  • To start/stop/restart the Kerberos KDC Server: log in as hadoop-user and run 'sudo /etc/init.d/krb5-kdc [start | stop | restart]'
  • To start/stop/restart the Kerberos ADMIN Server: log in as hadoop-user and run 'sudo /etc/init.d/krb5-admin-server [start | stop | restart]'
  • To shut down the Virtual Machine: log in as hadoop-user and run 'sudo poweroff'

Environment for 'hadoop-user' (set in /home/hadoop-user/.profile)
  $HADOOP_HOME=/usr/local/hadoop
  $HADOOP_CONF_DIR=/usr/local/etc/hadoop-conf
  $PATH=/usr/local/hadoop/bin:$PATH
  • Running M/R Jobs:
Running M/R jobs in Hadoop 0.20.S is much the same as running them in the non-secure version of Hadoop, except that before running any Hadoop jobs or HDFS commands, the hadoop-user needs to obtain a Kerberos ticket using the command 'kinit'; the password is hadoopYahoo1234.

For example:
hadoop-user@hadoop-desk:~$ cd hadoop
hadoop-user@hadoop-desk:~/hadoop$ kinit
Password for hadoop-user@LOCALDOMAIN:  hadoopYahoo1234
hadoop-user@hadoop-desk:~/hadoop$ bin/hadoop jar hadoop-examples-0.20.104.1.1006042001.jar pi 10 1000000

For automated runs of Hadoop jobs, a keytab file is provided in the hadoop-user's home directory (/home/hadoop-user/hadoop-user.keytab). This allows the user to run 'kinit' without manually entering the password. So for automated runs of Hadoop commands or M/R and Pig jobs through the cron daemon, users can invoke the following command to get the Kerberos ticket. Use the command 'klist' to view the Kerberos ticket and its validity.

For example:
hadoop-user@hadoop-desk:~$ cd hadoop
hadoop-user@hadoop-desk:~/hadoop$ kinit -k -t /home/hadoop-user/hadoop-user.keytab hadoop-user/localhost@LOCALDOMAIN
hadoop-user@hadoop-desk:~/hadoop$ bin/hadoop jar hadoop-examples-0.20.104.1.1006042001.jar pi 10 1000000
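
If you schedule jobs from cron, the keytab login and the job submission can be wrapped together. What follows is only a minimal Python sketch of that idea, not something shipped on the VM: the function names are hypothetical, while the keytab path, principal, and working directory are the ones shown above.

```python
import subprocess

# Values taken from the example above.
KEYTAB = "/home/hadoop-user/hadoop-user.keytab"
PRINCIPAL = "hadoop-user/localhost@LOCALDOMAIN"


def kinit_command(keytab=KEYTAB, principal=PRINCIPAL):
    """Build the non-interactive kinit command (hypothetical helper)."""
    return ["kinit", "-k", "-t", keytab, principal]


def run_with_ticket(job_command, cwd="/home/hadoop-user/hadoop"):
    """Obtain a ticket from the keytab, then run the given Hadoop command.

    check=True makes either step raise CalledProcessError on a non-zero
    exit status, so a failed login never goes on to submit the job.
    """
    subprocess.run(kinit_command(), check=True)
    return subprocess.run(job_command, cwd=cwd, check=True)
```

A cron-driven script could then call run_with_ticket(["bin/hadoop", "jar", ...]) with the same arguments used in the interactive example.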

  • Running Pig Tutorial:
The Pig tutorial is installed at "/home/hadoop-user/pig-tutorial". Example commands to run the Pig scripts are given in "example.run.cmd.sh". The data needed for the Pig scripts has already been copied to HDFS. See more details about the Pig Tutorial at Pig@Apache.

For example:
hadoop-user@hadoop-desk:~$ cd pig-tutorial
hadoop-user@hadoop-desk:~/pig-tutorial$ sh example.run.cmd.sh
  • Shutting down the VM:
When you are done with the virtual machine, you can turn it off by logging in as the hadoop-user and running the command 'sudo poweroff'. The virtual machine will shut itself down in an orderly fashion and the window it runs in will disappear.

Last but not least, I would like to thank Devaraj Das and Jianyong Dai from the Yahoo! Hadoop & Pig Development team for their help in setting up and configuring Hadoop 0.20.S and Pig respectively.

Notice: Yahoo! does not offer any support for the Hadoop Virtual Machine.
The software includes cryptographic software that is subject to U.S. export control laws and applicable export and import laws of other countries. BEFORE using any software made available from this site, it is your responsibility to understand and comply with these laws. This software is being exported in accordance with the Export Administration Regulations. As of June 2009, you are prohibited from exporting and re-exporting this software to Cuba, Iran, North Korea, Sudan, Syria and any other countries specified by regulatory update to the U.S. export control laws and regulations. Diversion contrary to U.S. law is prohibited.

 

Suhas Gogate
Technical Yahoo!, Cloud Solutions Team, Yahoo!

June 24, 2010

Managing Big Data: Architectural Approaches for making batch data available online

This is the beginning of an ongoing series of blog posts on “Managing Big Data”. This series will focus on techniques that Yahoo uses to process large volumes of data, ranging from initial collection of data to the end usage of that data.

Introduction

Over the last several years, two important trends have emerged that require additional thought when putting together an architecture for a hosted service. At Yahoo!, the ability to analyze and process enormous amounts of data is increasingly important. It’s a foundational layer for improving our consumer experiences and for sharing audience insights with advertisers.

From a technology perspective, the two trends I'd like to focus on are:

1. Batch processing: the increasing awareness of batch processing and the recent uptick in use of the map/reduce paradigm for that purpose.

2. NoSQL stores: the rise of so-called "NoSQL" stores and their use to serve data to online users (typically inside the user's request/response cycle).

Both of these trends represent significant advances in the way that hosted systems are developed. But in order to derive the most value for an entire system, developers must think about how these two areas will work together in some holistic manner.

Let's look at a specific scenario to make this more concrete:

Making batch data available to the online system

Data Available Online

Let's assume for the moment that you're building a new e-commerce site. And let's assume that one of the significant features of this e-commerce site is to provide user recommendations for which items a user may be interested in purchasing. This overall feature will decompose into a batch component (to determine the recommendations to give to a user) and an online component (that will present the recommendations to the end user).

For the batch system we may choose to use a map/reduce framework like Hadoop. The batch recommendation component will rely on various types of raw event data. Some examples include the products the end user has viewed, the products they have purchased, and the types of searches the user has performed. We will create a model using this data, perhaps based on a simple behavior model (e.g. if the user looks at a significant amount of sports equipment, then recommend products in the sports equipment category) or a collaborative filtering model (e.g. if other users purchased the same products as this end user, recommend other products that they purchased).
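
To make the collaborative filtering idea concrete, here is a small Python sketch. It is an illustration only: it runs in memory in a single process rather than as a map/reduce job, and the `recommend` function, its overlap-based scoring, and the sample data are all assumptions made up for this example.

```python
from collections import Counter


def recommend(purchases, user, top_n=3):
    """Recommend products bought by users who share a purchase with `user`.

    `purchases` maps a user id to the set of product ids that user bought.
    """
    owned = purchases.get(user, set())
    scores = Counter()
    for other, items in purchases.items():
        if other == user:
            continue
        overlap = len(owned & items)
        if overlap == 0:
            continue
        # Weight each candidate product by how much the two users overlap.
        for item in items - owned:
            scores[item] += overlap
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


purchases = {
    "alice": {"tent", "stove"},
    "bob": {"tent", "stove", "lantern"},
    "carol": {"tent", "boots"},
    "dave": {"novel"},
}
print(recommend(purchases, "alice"))  # ['lantern', 'boots']
```

In a real batch system the per-user loop would be distributed across the cluster, but the output has the same shape: a list of recommended products per user.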

Once we have decided on the data inputs and the model for making recommendations, we'll likely produce the output of the batch processing as a set of recommendations for each user on a Hadoop cluster.

There are two approaches we can then take to make the batch data available online in a NoSQL store.

1. Full updates

In this approach, as part of the batch processing, we recreate the set of recommendations for all users. It may also be possible to create a native version of the online store format in the batch system. This is generally only possible if there is no scenario where the data (user recommendations for products in this case) is updated in the online store. There must also exist a library to write the data into the online store's file format (not all NoSQL stores have such a library).

This approach is attractive for a couple of reasons:

  • There should not be any consistency issues between your offline and online representation of the data as you are always creating an entirely new copy of the online data at some interval (for instance once a day).
  • This approach should have the least impact on your online store's performance if you can perform a "swap" with the new data set. The newly created set of recommendations is pushed, in the native NoSQL store format, to the online store's physical boxes. Once it is there, each of the NoSQL stores is "upgraded" to start using the new copy of the recommendations and stop using the old version.
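
The "swap" can be sketched as follows. This is a simplified stand-in, not how any particular NoSQL store does it: versioned directories and a `current` symlink play the role of the store's native files, and the function names and file layout are hypothetical. It assumes a POSIX file system, where renaming over an existing symlink is atomic.

```python
import os
import tempfile


def publish_full_update(store_root, version, records):
    """Write a complete new data set, then atomically switch readers to it."""
    version_dir = os.path.join(store_root, f"recs-{version}")
    os.makedirs(version_dir)
    with open(os.path.join(version_dir, "data.tsv"), "w") as out:
        for user, items in sorted(records.items()):
            out.write(f"{user}\t{','.join(items)}\n")

    # Build the new symlink under a temporary name, then rename it over
    # "current". Readers see either the old version or the new one,
    # never a mix of the two.
    current = os.path.join(store_root, "current")
    tmp_link = current + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(version_dir, tmp_link)
    os.replace(tmp_link, current)
    return current


def read_recommendations(store_root, user):
    """Look up one user's recommendations in whichever version is current."""
    with open(os.path.join(store_root, "current", "data.tsv")) as f:
        for line in f:
            key, _, items = line.rstrip("\n").partition("\t")
            if key == user:
                return items.split(",")
    return []


root = tempfile.mkdtemp()
publish_full_update(root, "2010-06-23", {"alice": ["lantern"]})
publish_full_update(root, "2010-06-24", {"alice": ["boots", "stove"]})
print(read_recommendations(root, "alice"))  # ['boots', 'stove']
```

Because the old version's directory is left in place, rolling back is just another swap of the link.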

2. Incremental/Delta updates

Taking an incremental update approach between your offline and online store entails creating a new set of incremental data changes in your batch system. For instance, you might create a new set of user recommendations for newly joined users or for users that have had some recent activity (which would change the recommendations). This incremental data must then be pushed to the online store.

This approach is attractive for a couple of reasons:

  • Latency. By processing just the incremental updates of your recommendations, it is possible to update the online stores on a frequent basis. For instance, it may be possible to run a batch job on Hadoop every 30 minutes that produces a new set of product recommendations. These product recommendations can then be pushed to the online store at a 30-minute interval. The full update approach to moving the offline data online will more likely have an update frequency of several hours or even once a day (due to the large amount of data that needs to be processed and transferred to the online stores).
  • Size of updates. Depending on the size of the data, it's also possible that incremental updates may be the only viable solution. For instance, if the number of users we would like to create recommendations for is in the tens of millions and the number of recommendations for each user is large, the data set may be too large to recalculate and push in its entirety.

One weakness of the incremental update approach is that it can have a performance impact on your online store. Therefore, when you apply updates to the online store you may need to consider some form of throttling of the updates as they are applied to the online store.
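
One simple form of throttling is to cap the rate at which updates are written. The Python sketch below illustrates that idea under stated assumptions: a plain dict stands in for the online store, and the `ThrottledWriter` class and its parameters are hypothetical names chosen for this example.

```python
import time


class ThrottledWriter:
    """Apply updates to a store at no more than `max_per_second`."""

    def __init__(self, store, max_per_second,
                 clock=time.monotonic, sleep=time.sleep):
        self.store = store
        self.interval = 1.0 / max_per_second
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = clock()

    def apply(self, key, value):
        now = self.clock()
        if now < self.next_allowed:
            # Too early: wait until the next write slot opens.
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.store[key] = value
        # Schedule the following slot one interval after this one.
        self.next_allowed = now + self.interval


online_store = {}
writer = ThrottledWriter(online_store, max_per_second=100)
for user, recs in [("alice", ["lantern"]), ("bob", ["boots"]), ("carol", ["stove"])]:
    writer.apply(user, recs)
print(len(online_store))  # 3
```

The right rate depends on how much headroom the online store has while serving live traffic, so in practice it would be tuned against the store's latency rather than fixed in code.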

Key takeaways

  • Consider the type of data you have and whether pushing full updates or delta updates is more appropriate for your type of data as this fundamentally affects the architecture for making your batch data available online. At Yahoo we use both approaches depending on the specific scenario.
  • Throttling your updates to your online store is an important consideration for maintaining your online store's latency and availability.

In a future post I'll also address taking data that's available in an online NoSQL store and making that data available in a batch system like Hadoop.

 

Dirk Reinshagen
Cloud Architect
Cloud Computing at Yahoo!

June 15, 2010

Hadoop and the fight against shape-shifting spam

At a recent Hadoop User Group meeting, I made a presentation on how we leverage hadoop for spam mitigation in Yahoo! Mail. A number of people followed up requesting additional details of our architecture and engineering strategy.
In this post, I am going to try and capture our antispam engineering story, how it came to be hadoop centric and how well the new architecture has worked. I will also highlight the results we have been able to achieve. Finally, I will provide an update on when we will be releasing these updates to wide production.

At the Hadoop User Group presentation, I had delved into the details of two interesting antispam algorithms. The first was "frequent itemset mining"; the second was what we called the "connected components" algorithm. Both these algorithms are implemented as part of our tools portfolio. They are used by engineers, product managers and operations analysts to get a compact summary of the major trends in spam. Both these tools were implemented as part of a new engineering strategy we put in place in the second quarter of last year.

Our new strategy called for pointed improvements in the ability of systems to digest massive amounts of data. The first portion of the strategy, implemented by the end of 2009, targeted our reporting systems, tool chains and existing abuse reputation algorithms. The proposal was to increase the granularity of the data being handled, improve the response time to detect an attack, and do more early detection of spam attacks. In our analysis, we quickly realized that even the small changes we were proposing to our reporting systems, tools and algorithms required us to scale our existing systems well beyond the limits that they were meant to scale.

Also, we found that engineering, product and customer advocacy teams were all hungry for data and it would be great to support additional requirements around ad hoc joins across data streams and support a general "slice" and "dice" approach to data engineering. Our first revisions to the sender and content classifiers also made it abundantly clear that we needed massive storage and massive compute.

We took the simple approach of putting ALL the data that we would possibly want to query or develop algorithms on onto a Hadoop grid and letting the grid scale to the storage and compute requirements. To give you some idea of the scale involved here, let me provide some ballpark metrics.
By the end of this quarter, we will be loading close to 4TB of antispam data on our hadoop grids every day and we will be querying several days of data for report generation and running automated classifiers and algorithms at a frequency of a few minutes. We have not run into any scalability problems so far. In general we have found that with proper data organization, hadoop is able to scale linearly with data and compute requirements.

I will complete this section by saying that this strategy has had a huge impact on spam complaints. See for yourself; I am enclosing the graph of our spam complaints from last year. The big dip corresponds to when we shifted our reports, algorithms and filters to the Hadoop grid. Need I say more?

While making changes to how an existing system works is interesting and clearly the first step, the second and more interesting step is the development of brand new, distributed reputation algorithms using Hadoop. Once again, our new engineering strategy called for the rewiring of all algorithms to run in parallel and to increase the level of feature engineering across heuristic, statistical and machine learning systems. We needed to do this across reputation algorithms for IPs, domains, from addresses (senders), receivers (users) and content. Once again, we realized that much of the complexity was in massive data engineering. We needed to ensure that we used every bit of data that would help us make a spam/not-spam decision. We also had to choose an appropriate model that could interpret this vast amount of data without getting overwhelmed.

The term "massive feature engineering" should be familiar to people in the area of machine learning. In more common engineering terms, we needed to associate several pieces of metadata with every entity we needed to classify, and we needed to choose algorithms that would parallelize well. We have been hard at work for the last 5 months doing this new "Hadoop engineering". By the end of this month, we hope to release our first Hadoop-based, massively feature-engineered, distributed sender classification algorithm, code-named zeroB. Our initial tests make it a very compelling replacement for our current sender management system: it is 25% more accurate while being faster and cheaper to run and maintain than the current version.

Indeed, I have now come to believe that Hadoop has tremendous applicability to the abuse and security domains as a whole. Both these domains have the proverbial problem of finding the needle in the haystack, and Hadoop is well equipped for this task. With the amount of spam that large mail systems like Yahoo! see, it is truly important to employ powerful frameworks like Hadoop to ensure the problem remains tractable.

Yahoo! Mail was recently voted by the renowned Fraunhofer Institute as the best free mail service for spam management. This recognition clearly demonstrates that our new Hadoop-based strategy is working and working very well. This is just the tip of the success though. In the next 3 months, we are rolling out many of these new systems to our wide install base in the United States and I am eagerly waiting to see the effect this has on spam. It has been fun and exciting building these systems on top of Hadoop, but it has been even more exciting to see us winning the war on spam.

Join us next week for the Yahoo! keynote at Hadoop Summit 2010 to hear more about Hadoop and Mail Antispam.

 

Vishwanath Ramarao
Director of Anti-Spam Engineering
Yahoo! Mail

June 9, 2010

Enabling Hadoop Batch Processing Systems to Consume Streaming Data

At Yahoo!, the ability to analyze and process enormous amounts of data is increasingly important. It’s a foundational layer for improving our consumer experiences and for sharing audience insights with advertisers.

In the last few years, I have been a part of a project to design, build, and run a low-latency, large-scale, distributed event data collection system at Yahoo!. When we started off, the goal seemed relatively unambitious: to collect web-access event data from all of the web servers across all the data centers and bring it to a central location for processing. This perception soon changed once we realized that this involved around 20,000 machines in over 20 data centers across the world, amounting to over 40 billion events per day that filled up over 10 TB of disk space. To add to the mix, the data had to be available within 15 minutes with an expected completeness of 99% across trans-oceanic fiber optic cable.

We decided to collect the data in a streaming fashion. This enabled us to feed the data at very low latencies to stream processing applications. However, there was an existing batch processing application that required all the data for the entire day to be available with near 100% completeness.

In order to achieve this, the data was collected in a streaming fashion and put into files that contained events belonging to a particular time period, the default being a minute; hence they were called minute files. Once the data was collected for the minute, the minute files were closed and the data was made available to the consuming application using a queue. This worked reasonably well when there was one application, but had problems when the consuming application wanted to reprocess or perform partial updates. In addition, the queue essentially kept the consuming application's state, making the collection and processing systems tightly coupled. This made it increasingly hard to support multiple applications, because the state for each batch for each application needed to be kept.

This became even more interesting when the Hadoop initiative at Yahoo! began. All batch processing applications were now running on the grid, while the data was still consumed by legacy applications. How would we feed the old and the new systems with the same data without duplication?

What we needed was a loosely coupled or completely decoupled method of communicating the files to be processed to downstream batch processing applications. The solution we came up with was a simple but elegant one called a List of Files (LoF) repository.

The LoF repository contains an entry for each minute file collected and its associated attributes such as the start time, the end time, the size in bytes, the number of events, the collection pipeline instance name, and other relevant data. An API to access this data, called the LoF API, was provided to query the repository for a set of files that satisfied certain attribute constraints. For example, a query might request “all the files that belong to the period 12:00 to 12:05 collected from the sports web servers”. The repository did not need to keep state about which files an application had processed, or maintain any queue. This allowed the application to maintain its own state, and supporting multiple applications was as simple as the single-application case. To simplify usage, the API was made available in the form of a RESTful web service.
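
The shape of such a repository can be sketched in a few lines of Python. This is an in-memory illustration only, not the actual LoF implementation or its web-service interface: the class names, field names, and sample entries are hypothetical, and times are given as seconds since midnight to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinuteFile:
    collector: str   # host that collected the file
    pipeline: str    # collection pipeline instance, e.g. "sports"
    start: int       # start of the minute, in seconds since midnight
    size_bytes: int
    path: str


class LoFRepository:
    """Holds one entry per minute file and no per-application state."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def query(self, pipeline, start, end):
        """Return files for `pipeline` whose minute starts in [start, end)."""
        return [e for e in self._entries
                if e.pipeline == pipeline and start <= e.start < end]


repo = LoFRepository()
repo.register(MinuteFile("collector1", "sports", 43200, 1234, "/col1/1200.gz"))
repo.register(MinuteFile("collector2", "sports", 43260, 3232, "/col2/1201.gz"))
repo.register(MinuteFile("collector3", "finance", 43200, 500, "/col3/1200.gz"))
repo.register(MinuteFile("collector1", "sports", 43500, 800, "/col1/1205.gz"))

# "All the sports files for the period 12:00 to 12:05."
files = repo.query("sports", 12 * 3600, 12 * 3600 + 5 * 60)
print([f.path for f in files])  # ['/col1/1200.gz', '/col2/1201.gz']
```

Since a query does not change the repository, any number of applications can ask the same question and each keeps track of its own progress.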

Different applications had different completeness requirements from the data collected. For example, a low-latency behavioral targeting application would typically be happy with 95% of the data within 1 minute of the data, while a revenue realization or tracking application would need 100% of the data within 15 minutes. In order to support this, the API returned a completeness metric along with the list of files returned to indicate the percentage of data the list represented. The application could use this information to commence processing based on its own completeness requirements.

Given the distributed nature of the web servers, data was often delayed or unavailable due to network outages or temporary host unavailability. This meant that applications requiring higher levels of completeness were routinely delayed beyond their SLAs. To solve this, we provided a simple timestamp-based cursor facility to enable incremental processing. The cursor was returned with the list to indicate the timestamp at which the list was generated. A subsequent query would supply the previously returned cursor to indicate the time of the last fetch, and would return all the files later than that timestamp.
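
The completeness metric and the cursor can be illustrated together with a short Python sketch. It is an assumption-laden toy, not the LoF API itself: the `query_files` function, the `registered_at` field, and the way completeness is computed (files seen so far divided by an expected file count) are all invented for this example.

```python
def query_files(entries, expected_files, now, cursor=None):
    """Return (paths, completeness_percent, new_cursor).

    `entries` is a list of dicts with "path" and "registered_at" keys.
    With no cursor, every file known at time `now` is returned; with a
    cursor, only files that arrived after the previous fetch are returned.
    """
    visible = [e for e in entries if e["registered_at"] <= now]
    if cursor is None:
        new_files = visible
    else:
        new_files = [e for e in visible if e["registered_at"] > cursor]
    completeness = 100.0 * len(visible) / expected_files
    return [e["path"] for e in new_files], completeness, now


entries = [
    {"path": "/col1/1200.gz", "registered_at": 100},
    {"path": "/col2/1200.gz", "registered_at": 105},
    {"path": "/col3/1200.gz", "registered_at": 170},  # delayed collector
]

# First fetch: two of the three expected files have arrived.
paths, completeness, cursor = query_files(entries, expected_files=3, now=120)
print(paths, round(completeness, 1))  # ['/col1/1200.gz', '/col2/1200.gz'] 66.7

# Later fetch with the cursor: only the late file comes back.
paths, completeness, cursor = query_files(entries, 3, now=180, cursor=cursor)
print(paths, completeness)  # ['/col3/1200.gz'] 100.0
```

An application with a 95% requirement would wait after the first call, while one that tolerates partial data could start on the two files immediately and pick up the third on its next fetch.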

This is what the web-service request looks like:

The response to this is of the form:
collector1.yahoo.com 1234388520 1234 /col1/1200.gz
collector2.yahoo.com 1234388520 3232 /col2/1200.gz

A subsequent request to get incremental data would use the cursor timestamp returned to fetch additional files as follows:

which would get a response similar to:
collector1.yahoo.com 1234388580 3232 /col1/1201.gz

I would like to conclude by saying that the LoF API has enabled the same data to be made available to different applications with varying completeness and latency requirements in a simple and elegant manner. Moreover, it has enabled the collection system, which uses a stream-based paradigm, to easily feed multiple largely batch-oriented systems in a relatively seamless manner. Keeping the design simple enabled us to solve a reasonably complex problem.

Akon Dey
Architect, Event Data Collection System at Yahoo!

May 27, 2010

Hadoop Summit 2010 - Agenda is available!

I’m happy to share the agenda for the upcoming Hadoop Summit – June 29th, Hyatt, Santa Clara.

We received over 70 great submissions for talks. It was a very impressive combination of development tool overviews, application case studies and innovative research.

We had the difficult task of selecting just a handful of presentations from this overwhelming collection of great quality abstracts and speakers. The variety of topics across numerous industry verticals served as clear evidence of how far this technology has evolved over the past year. Hadoop is really going mainstream!

Our goal was to create a diverse agenda that covers topics for experienced Hadoop users as well as people who recently began to explore this technology. We wanted to focus on the Hadoop ecosystem of tools and solutions as well as “real life” user experience.

I want to thank all the people who submitted talks and encourage speakers that were not selected to submit their great presentations to our monthly Hadoop User Groups.

Detailed agenda and abstracts available at http://www.hadoopsummit.org/agenda.html

At Yahoo!, we are embracing Hadoop at the very core of our business. As you will hear at the summit, we are continuing to invest heavily in both the technology and the community to make it even better. We love being at the center of the discussion and debates around Hadoop and learning from others’ experiences.

We are planning to go bigger next year, with a broader event that will allow more opportunities for speakers as well as sponsors.

We hope you all join us at the summit. If you have not registered yet, please REGISTER TODAY! Space is limited and we don’t want you to miss the opportunity to see the great variety of talks outlined above first-hand.

Special thanks to our Sponsors:

 

Dekel Tankel
Director, Product Management
Cloud Computing at Yahoo!

May 21, 2010

Pig, Cascalog & HBase Among Highlights of May Hadoop Meet-Up

Hi Hadoopers

Thanks to the nearly 300 developers who came to Yahoo! this week for our monthly Hadoop User Group meeting. The energy in the packed room was phenomenal and conversations continued long after the formal sessions.

Hundreds of Hadoop Fans Flock to Yahoo! for the May Hadoop User Group

A few lucky winners received free tickets to the upcoming Hadoop Summit 2010 (June 29th, at the Hyatt Regency, Santa Clara). Congratulations to those winners. Everyone else, please register here.

The event started with Alan Gates from Yahoo!, who described the new features and work done in Pig 0.6 and 0.7, including the Hadoop compatibility plan, described in more detail in this post.

 

Nathan Marz from BackType presented a cool demo of how easy it is to query existing data stores using Cascalog, a query language for Hadoop. Nathan described how queries can be written as regular Clojure code and combined with Cascading. Be sure to watch the demo as part of the video below.

 

Next was Dmitriy Ryaboy, an engineer at Twitter and a Pig committer. Dmitriy walked us through the extensive use of the Hadoop ecosystem at Twitter. He explained the challenges they face in processing 55 million tweets a day and why they chose to use Hadoop, Pig and HBase. Dmitriy introduced the Elephant Bird libraries and shared interesting tips for dealing with Big Data.

 

We concluded with Tom White from Cloudera, who walked us through the release plans for Apache Hadoop 0.21, including the Source Compatibility project described in the Yahoo! Hadoop blog.

 

We at Yahoo! are embracing Hadoop – we share the challenges presented by Twitter for processing massive data sets and continue to invest heavily in the technology and the community. We love to hear about the growing ecosystem and solutions like Cascalog.

Please join us at the Hadoop Summit to continue the conversation.

As always, we are looking for exciting technologies and experiences you want to share. Please contact me via the Hadoop Bay Area User Group Meetup page.

Note that we will not have a meetup in June due to the Hadoop Summit. See you all on July 21st, 2010. Registration is available here; the agenda will be published soon.

 

Dekel Tankel
Director, Product Management
Cloud Computing at Yahoo!

Copyright © 2010 Yahoo! Inc. All rights reserved.