
In case you were looking for a little light reading on a Monday, our own architect Peter Mika has just published an article in the latest print issue of Nodalities. Peter's article, titled Anatomy of a SearchMonkey, dives into the architecture and thinking behind the SearchMonkey project, and how SearchMonkey applies to semantic data on the Web. The entire magazine is available as a free PDF download (5 MB), so take a look and let us know what you think!
Evan Goer
Yahoo! SearchMonkey Team
Posted at 10:51 AM | Comments (0)
Yahoo! Search is now extracting RDFa data across the World Wide Web and making this information available to the public via SearchMonkey. RDFa is an open standard for embedding structured data directly in HTML. Along with our previous support for eRDF and a number of popular microformats, SearchMonkey now supports a wide variety of popular semantic technologies.
What is structured data, and why is structured data good for search? Traditional search engines crawl the web and extract what metadata they can: the page title, an autogenerated summary, the file size, the MIME-type, the last-updated date, and so on. However, this sort of analysis pales in comparison to what a human being can do simply by glancing at the page. A human can look at the words "Joe's Home Page" and infer, "ah, this page probably belongs to Joe," or look at an image and infer, "ah, that's probably a picture of Joe, the owner of the page." That's easy enough for humans... but what if the search engine could pick out this info and display it directly in the search result?
RDFa relies on using attributes to embed structured data in XHTML. These attributes are not valid in HTML 4, but the W3C has provided an XHTML DTD to validate against. The following example illustrates a simple home page marked up with RDFa data (in bold):
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
"http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
lang="en" xml:lang="en">
<head>
<title>The Amazing Home Page of Joe Smith</title>
</head>
<body>
<h1 property="dc:title">Joe's Home Page</h1>
<div rel="foaf:maker">
<h2 property="foaf:name">Joe Smith</h2>
<div rel="foaf:depiction" resource="http://joesmith.org/images/jsmith.png">
<img src="/images/jsmith.png" alt="Smiling headshot of Joe" />
<p property="dc:rights">Creative Commons Attribution 3.0 Unported</p>
</div>
</div>
</body>
</html>
In this page, the designer has explicitly stated that the image is a "depiction of the person who made the web page." Adding this information as RDFa can potentially benefit many applications. In the case of Yahoo!, we've designed our search index to extract and store this information.
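To make "extract and store this information" a little more concrete, here is a minimal Python sketch that walks a trimmed copy of the page above and lists its property and rel annotations. It is only an illustration: the function name is made up, it does not resolve prefixes or subjects the way a real RDFa processor would, and it is not how Yahoo!'s index works.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example page (DOCTYPE and <head> left out for brevity).
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <body>
    <h1 property="dc:title">Joe's Home Page</h1>
    <div rel="foaf:maker">
      <h2 property="foaf:name">Joe Smith</h2>
      <div rel="foaf:depiction" resource="http://joesmith.org/images/jsmith.png">
        <img src="/images/jsmith.png" alt="Smiling headshot of Joe" />
        <p property="dc:rights">Creative Commons Attribution 3.0 Unported</p>
      </div>
    </div>
  </body>
</html>"""


def list_rdfa_annotations(xhtml):
    """Return (attribute, CURIE, value) tuples in document order.

    Hypothetical helper: it only reads the raw attributes and does not
    build real RDF triples.
    """
    found = []
    for element in ET.fromstring(xhtml).iter():
        if "property" in element.attrib:
            # A property's value is the text content of the element.
            text = "".join(element.itertext()).strip()
            found.append(("property", element.get("property"), text))
        if "rel" in element.attrib:
            # A rel points at another resource (None when none is given).
            found.append(("rel", element.get("rel"), element.get("resource")))
    return found


annotations = list_rdfa_annotations(PAGE)
for kind, curie, value in annotations:
    print(kind, curie, value)
```

Even this naive pass recovers the title, the maker's name, and the image URL that the designer labeled as a depiction.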
RDFa support has already enabled some interesting new SearchMonkey applications. For instance, Creative Commons has recently started to deploy RDFa across the web in the form of copyright and licensing information. Every time a Creative Commons user selects a CC license, the generated HTML badge contains RDFa markup indicating the nature of the license. The Creative Commons Infobar uses this data to selectively trigger on pages that declare their license using structured markup:

To get started with RDFa:
searchmonkeyid:com.yahoo.rdf.rdfa parameter
Evan Goer
Yahoo! SearchMonkey Team
Posted at 2:00 PM | Comments (5)
In July, we bucket tested a new template design for Enhanced Result applications. Developers do not have to take any actions to use this new redesign; all Enhanced Result applications will upgrade automatically. Important: Infobar applications will be unaffected.
The original SearchMonkey post contains examples of both the old and the new SearchMonkey Enhanced Result templates. In this posting, we'll just show examples of the new design. First, an Enhanced Result for HowStuffWorks.com that displays a result with a full abstract:

The first design change you'll notice is that images appear to the right, rather than the left. Our user testing has shown that moving images to the right improves the usability of the overall result. As for links, they now appear in a horizontal row, rather than as a vertical column next to the image. Not only are the deep links more discoverable when presented separately from the image, but moving the links into a horizontal row provides more space for the summary and key/value pairs.
Now let's take a look at another real-world example from CitySearch, this one displaying a result with key/value pairs:

As in the example above, the image is shifted to the right. The key/value pairs have replaced the abstract, lying just below the deep links.
Careful observers will note that the key and value are now the same color — a minor change from the old template, where the keys were a lighter shade of grey. We found that keys were actually more discoverable when displayed in the same shade as the value.
The SearchMonkey team is confident that the new design improves performance, engagement, and discoverability across the board for users and developers. If you have feedback on the new templates, please let us know here, or leave us a comment in the SearchMonkey forum.
Evan Goer
Yahoo! SearchMonkey Team
Posted at 9:50 AM | Comments (5)
For those of you who are interested in learning more about how structured data fits in with SearchMonkey and Yahoo! Search strategy, please tune in to the latest Semantic Web Gang podcast to hear our very own Peter Mika discussing topics such as:
This podcast is available as a stream or MP3 download. Enjoy!
Evan Goer
Yahoo! SearchMonkey Team
Posted at 4:01 PM | Comments (0)
If you've been hanging around the YDN recently, you've probably heard a thing or two about SearchMonkey.
And why not? SearchMonkey is pretty darn cool.
It lets you enhance the appearance of search results for your favorite sites. So the next time you need to look up, say, restaurants, a SearchMonkey app can distill all of the important information, like location, price range, and rating all into one place, right there in your search results.
A ton of people have been tinkering with SearchMonkey since it launched in May. One of the main reasons for this (aside from how cool it is...and never mind the $10,000 contest they held recently), is how easy it is to pick up and start developing with.
In this article, I'll go over XSLT and RDF--two of the fundamental concepts that power SearchMonkey. If you're looking to build your first app or you've built a few and want to get more out of it, you'll definitely want to read on.
RDF stands for the Resource Description Framework. It provides a standard way to organize information into semantic units. Structuring information with RDF allows authors the ability to preserve relational and meta information. This is a very good thing.
For instance, here's a snippet of a newspaper article in unformatted plaintext:
Link By Link This Is Funny Only if You Know Unix By Noam Cohen Published: May 26, 2008 For a certain subset of Internet users, “Sudo make me a sandwich” may as well be “Take my wife ... please.”
Sure, it's easy for us to identify all the different parts just by context. You could pick out the author's name, the title, and the date it was published without breaking a sweat. For computers, on the other hand, figuring all this out is tedious at best and, in the worst case, pretty much impossible. With a trillion+ webpages out there, being able to programmatically index meta-information is more important than ever.
So let's try this again, this time with RDF:
<rdf:Description>
<dc:identifier>http://www.nytimes.com/2008/05/26/business/media/26link.html?_r=1&amp;oref=slogin</dc:identifier>
<dc:relation>Link By Link</dc:relation>
<dc:title>This Is Funny Only if You Know Unix</dc:title>
<dc:creator>Noam Cohen</dc:creator>
<dc:date>Mon May 26 00:00:00 -0700 2008</dc:date>
<dc:rights>Copyright 2008 The New York Times Company</dc:rights>
For a certain subset of Internet users, “Sudo make me a sandwich” may as well be “Take my wife ... please.”
</rdf:Description>
Now that's what I'm talking about! It may look weird with its tags showing, but it looks great when rendered properly by a browser. By adding a little context, programs like SearchMonkey can extract this information and organize it in meaningful ways.
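Here's a rough idea of what "extract this information" can look like in code. This is a small Python sketch, not anything SearchMonkey itself runs. It assumes the description is wrapped in an rdf:RDF element that declares the rdf and dc namespaces (the snippet above leaves those declarations out), and read_fields is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map used for lookups below.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Shortened copy of the article record, with the namespace declarations
# added as an assumption so that the XML parses on its own.
RECORD = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description>
    <dc:relation>Link By Link</dc:relation>
    <dc:title>This Is Funny Only if You Know Unix</dc:title>
    <dc:creator>Noam Cohen</dc:creator>
    <dc:rights>Copyright 2008 The New York Times Company</dc:rights>
  </rdf:Description>
</rdf:RDF>"""


def read_fields(rdf_xml, names=("title", "creator", "relation", "rights")):
    """Return a dict of Dublin Core fields from the first rdf:Description."""
    description = ET.fromstring(rdf_xml).find("rdf:Description", NS)
    return {name: description.findtext("dc:" + name, namespaces=NS)
            for name in names}


fields = read_fields(RECORD)
print(fields["title"], "by", fields["creator"])
```

No guessing from context required: the author is whatever sits in dc:creator.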
One caveat that I'll get to later in this article: SearchMonkey uses a standard similar to RDF called DataRSS. The same general principles apply, so it's good to have an understanding of RDF first coming into it.
When you go to the SearchMonkey application builder, you have the choice to build either a Custom Data Service, which translates a web page or API call into DataRSS, or a Presentation Application, which builds on data services to build what you see when it's in use on a search result page. Presentation Applications are as simple as filling in the blanks in a template, so most of the work in developing with SearchMonkey is parsing data sources into the right format. That's where XSLT comes in.
Now let's say we have this kind of information available to us. How do we actually get at it?
Well, lucky for you, SearchMonkey is built on a standard called XSLT, which was designed for just such a task.
XSLT, or eXtensible Stylesheet Language Transformations, is a W3C specification [1] for how to manipulate XML.
[1] XSLT and RDF are both specifications drafted by the World Wide Web Consortium, or W3C. Once drafted, a specification is handed down to vendors, who are _supposed_ to write software that follows the spec. XSLT is actually a pretty good example of this in practice, as opposed to CSS, which, as any web developer will tell you, isn't exactly standard across browsers.
XSLT is itself XML, which makes it familiar, if a bit verbose. Using XPath, a query language that XSLT relies on [2], you pick out nodes from the XML tree using selectors, and process them according to different templates that you specify.
[2] There's a long story behind this. Basically, XPath was motivated by XSLT and something called XPointer. XSLT itself was developed alongside another similarly-featured technology called XQuery. Wikipedia tells me there's something else called XLink, which is summed up eloquently by the title of the 2002 O'Reilly article: "XLink: Who Cares?"
It boggles the mind that a topic with so many Xs could be so unbelievably boring.
Let's say you're compiling a list of your favorite cities. Because it seemed like a good idea at the time, you decided to format it in XML, along with additional information, like each city's country, population and area. You want to put this online, but don't want to do any more work as the list grows over time.
What's that about a database?
PHP?
Bah!
Might I suggest XSLT for this heavily rhetorical situation?
Anyway, here's the list of cities:
<?xml version="1.0" encoding="UTF-8"?>
<cities>
<city name="Kyoto">
<country>Japan</country>
<population>1464990</population>
<area units="km²">1779</area>
</city>
<city name="San Francisco">
<country>United States</country>
<population>764976</population>
<area units="km²">600</area>
</city>
<city name="Portland">
<country>United States</country>
<population>568380</population>
<area units="km²">376</area>
</city>
<city name="Bremen">
<country>Germany</country>
<population>548477</population>
<area units="km²">1679</area>
</city>
<city name="Doha">
<country>Qatar</country>
<population>339847</population>
<area units="km²">2574</area>
</city>
</cities>
Looks good. Now how do we turn this into HTML?
Even if you're not too familiar with XML, you'll notice that it bears a striking similarity to what you see when you view a webpage's source code. That's because HTML (or rather XHTML, to be precise) is XML. Since XSLT can transform an XML document into another, that means any XML document can be turned into XHTML and vice-versa.
Let's make a simple XSLT document to list the names of the cities in the list:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="/">
<html>
<head></head>
<body>
<h1>Cities</h1>
<ul>
<xsl:for-each select="//city">
<li><xsl:value-of select="@name"/></li>
</xsl:for-each>
</ul>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
This should give you a good idea of the basic way XSLT works. First, as with any XML document, we start with the XML declaration. Once we've done that, we open up the main part of the document, the <xsl:stylesheet> element. The xmlns:xsl="http://www.w3.org/1999/XSL/Transform" attribute binds the xsl: prefix to the XSLT namespace, which marks every element carrying that prefix as a transformation instruction.
Within the stylesheet, we can define a number of <xsl:template> elements, which trigger when the specified pattern is matched. This <xsl:template> matches on "/", the document root, meaning that it triggers at the start of a document. Processing always begins at the root, so this template triggers first, which makes it a good place to start constructing our XHTML.
xsl:for-each allows us to iterate through all elements that match a particular query. As I mentioned before, these matcher strings are XPath selectors. The two slashes in //city look for every instance of a <city> element, no matter where it appears in the document.
You can get a more comprehensive guide to XPath at W3 Schools.
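If XPath selectors are new to you, it can help to see the same selection done in a general-purpose language. The Python sketch below mirrors the stylesheet's loop using the small XPath subset in the standard library (.//city plays the role of //city). It is not an XSLT processor, the city list is trimmed to names and countries, and the function names are my own.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Trimmed copy of the city list: names and countries only.
CITIES = """<cities>
  <city name="Kyoto"><country>Japan</country></city>
  <city name="San Francisco"><country>United States</country></city>
  <city name="Portland"><country>United States</country></city>
  <city name="Bremen"><country>Germany</country></city>
  <city name="Doha"><country>Qatar</country></city>
</cities>"""


def city_names(xml_text):
    """Collect the name attribute of every <city>, in document order."""
    root = ET.fromstring(xml_text)
    # ".//city" selects every <city> descendant, like "//city" in the stylesheet.
    return [city.get("name") for city in root.findall(".//city")]


def to_html_list(names):
    """Build the same heading and list the stylesheet emits inside <body>."""
    items = "".join("<li>%s</li>" % escape(name) for name in names)
    return "<h1>Cities</h1><ul>" + items + "</ul>"


names = city_names(CITIES)
html = to_html_list(names)
print(html)
```

The shape is the same as the stylesheet: select a set of nodes, then emit one piece of output for each of them.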
An unordered list of cities is alright, but what if we want to display all of our information?
Here's how we could output our information into a table:
<table>
<xsl:for-each select="//city">
<tr>
<td><strong><xsl:value-of select="./@name"/></strong></td>
<td><xsl:value-of select="./population"/></td>
<td><xsl:value-of select="./area"/> km<sup>2</sup></td>
<td><xsl:value-of select="./country"/></td>
</tr>
</xsl:for-each>
</table>
Just as before, we iterate through all of the cities using <xsl:for-each>, this time wrapping everything in a tr tag. Notice that the population, area, and country are not attributes, but rather the actual contents of their respective elements. ./ is just a more explicit convention that I prefer, meaning "this element" (just like a filepath in Unix).
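The attribute-versus-contents distinction trips people up, so here is a tiny Python sketch of the same lookups on a single city. It only illustrates the idea with the standard library and is not what an XSLT processor does internally.

```python
import xml.etree.ElementTree as ET

# One <city> element from the list above.
CITY = """<city name="Kyoto">
  <country>Japan</country>
  <population>1464990</population>
  <area units="km²">1779</area>
</city>"""

city = ET.fromstring(CITY)

name = city.get("name")                      # attribute, like ./@name
population = city.findtext("population")     # element contents, like ./population
area_units = city.find("area").get("units")  # attribute on a child, like ./area/@units

row = (name, population, city.findtext("area"), city.findtext("country"))
print(row, area_units)
```

Attributes come from the element itself, while contents come from the text inside a child element.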
As a really quick example to throw out there, this final iteration shows off some advanced features that you might find useful.
<h1>My <xsl:value-of select="count(//city)"/> Favorite Cities</h1>
<table>
<thead>
<tr>
<th>Name</th>
<th>Population</th>
<th>Area (sq miles)</th>
<th>Country</th>
</tr>
</thead>
<tbody>
<xsl:for-each select="//city">
<xsl:sort select="./@name"/>
<tr>
<td><strong><xsl:value-of select="./@name"/></strong></td>
<td><xsl:value-of select="format-number(./population, '#,###')"/></td>
<td><xsl:value-of select="format-number(./area * 0.386102159, '#,###')"/></td>
<td><xsl:value-of select="./country"/></td>
</tr>
</xsl:for-each>
</tbody>
</table>
In our <h1>, we count the number of elements from a particular query result by using the function count(). There are a lot of useful built-in functions like this. Again, check out the W3 Schools reference for a good listing.
Another thing you can do is sort the results of an <xsl:for-each> by using the <xsl:sort> element. Its select attribute points to the value you want to order on for each element. In our case, we're ordering our list alphabetically by the name of the city.
Finally, you might want to format or operate on a value using XSLT. In this example, we're doing both: we convert the area from km² into square miles by multiplying by the conversion ratio, and then format the result to use comma separators. format-number(), like count(), is a built-in XSLT function.
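To sanity-check what that stylesheet should produce, here is a Python sketch of the same three steps: count, sort by name, then convert and format. The conversion ratio is the one from the stylesheet; build_table is a made-up name, and Python's plain string sort stands in for xsl:sort, which happens to give the same order for these five names.

```python
import xml.etree.ElementTree as ET

KM2_TO_SQ_MILES = 0.386102159  # same conversion ratio as the stylesheet

# Copy of the city list with the units attribute left out for brevity.
CITIES = """<cities>
  <city name="Kyoto"><country>Japan</country><population>1464990</population><area>1779</area></city>
  <city name="San Francisco"><country>United States</country><population>764976</population><area>600</area></city>
  <city name="Portland"><country>United States</country><population>568380</population><area>376</area></city>
  <city name="Bremen"><country>Germany</country><population>548477</population><area>1679</area></city>
  <city name="Doha"><country>Qatar</country><population>339847</population><area>2574</area></city>
</cities>"""


def build_table(xml_text):
    """Return the heading text and the formatted rows, sorted by city name."""
    cities = ET.fromstring(xml_text).findall(".//city")
    heading = "My %d Favorite Cities" % len(cities)       # count(//city)
    rows = []
    for city in sorted(cities, key=lambda c: c.get("name")):  # xsl:sort
        population = int(city.findtext("population"))
        area_km2 = float(city.findtext("area"))
        rows.append((
            city.get("name"),
            format(population, ","),                          # thousands separators
            format(round(area_km2 * KM2_TO_SQ_MILES), ","),   # km² to square miles
            city.findtext("country"),
        ))
    return heading, rows


heading, rows = build_table(CITIES)
print(heading)
for row in rows:
    print(row)
```

Bremen sorts first, and its 1,679 km² comes out as roughly 648 square miles.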
Now that we're using XSLT with a fair degree of confidence, we're ready to go back to our NY Times article. Our example showed how the article might be marked up using RDF, but since we want to use it with SearchMonkey, we'll want to translate that into DataRSS.
Unfortunately, in the real world, markup is wildly inconsistent across different websites. You can never tell how easy it will be to get to the information you want without looking under the hood at the page source code. If you're lucky, your website of interest gives special classes or ids to things you want to get at. If you're not so lucky, there may actually be no consistent solution for how to access information across the site.
In our case, nytimes.com doesn't just have good markup; it even has some custom tags, like <NYT_BYLINE>, that we can take advantage of. Here's what I came up with:
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:template match="/">
<adjunctcontainer>
<adjunct id="smid:{$smid}" version="1.0">
<item rel="rel:Posting">
<meta property="dc:identifier"><xsl:value-of select="//meta[@name='articleid']/@content"/></meta>
<meta property="dc:type">News Story</meta>
<meta property="dc:title"><xsl:value-of select="//NYT_HEADLINE"/></meta>
<meta property="dc:creator"><xsl:value-of select="//NYT_BYLINE"/></meta>
<meta property="dc:date"><xsl:value-of select="//meta[@name='pdate']/@content"/></meta>
<meta property="dc:summary"><xsl:value-of select="//NYT_TEXT/p"/></meta>
<item rel="rel:Photo" resource="{//div[@class='image']//img/@src}">
<meta property="media:width"><xsl:value-of select="//div[@class='image']//img/@width"/></meta>
<meta property="media:height"><xsl:value-of select="//div[@class='image']//img/@height"/></meta>
</item>
</item>
</adjunct>
</adjunctcontainer>
</xsl:template>
</xsl:stylesheet>
This should give you a good idea of what to expect with DataRSS. For the most part, you'll be worrying about items and their contents. Items have a rel property that specifies exactly what's represented, whether it's a person, article, photo, etc. Here are the docs for rel properties.
Items have meta elements, with property attributes that work kind of like regular RDF tags. For example, dc:title is a property now, whereas in RDF it was its own tag. Check out the docs to find all of the available properties in DataRSS.
Once you've built a data service in SearchMonkey, you can start writing a presentation layer to pull from it. Since you'll be referencing values pulled from DataRSS, there's a big incentive for diligence in how you name elements. Don't obsess over it, but use what's available with rel's and properties as best as you can.
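To show why naming matters, here is a small Python sketch that reads a DataRSS-style fragment back into a nested dictionary keyed by rel and property. The sample fragment, its values, the example.com URL, and the item_to_dict helper are all invented for illustration. A real presentation application reads these values through SearchMonkey's own tools, not through code like this.

```python
import xml.etree.ElementTree as ET

# Hypothetical output of a data service, shaped like the stylesheet above.
SAMPLE = """<adjunct id="smid:example" version="1.0">
  <item rel="rel:Posting">
    <meta property="dc:title">This Is Funny Only if You Know Unix</meta>
    <meta property="dc:creator">Noam Cohen</meta>
    <item rel="rel:Photo" resource="http://example.com/photo.jpg">
      <meta property="media:width">190</meta>
    </item>
  </item>
</adjunct>"""


def item_to_dict(item):
    """Turn one <item> into a dict, recursing into nested items."""
    return {
        "rel": item.get("rel"),
        "resource": item.get("resource"),
        # Direct <meta> children only; nested items keep their own metas.
        "meta": {m.get("property"): (m.text or "").strip()
                 for m in item.findall("meta")},
        "items": [item_to_dict(child) for child in item.findall("item")],
    }


posting = item_to_dict(ET.fromstring(SAMPLE).find("item"))
print(posting["meta"]["dc:title"])
print(posting["items"][0]["rel"], posting["items"][0]["resource"])
```

Every lookup goes through a rel or property name, so a carelessly chosen name in the data service becomes a lookup you have to remember later.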
XSLT and RDF are pretty simple on their own. Scraping sites, however, is pretty tedious. Although we didn't build a SearchMonkey Application in this article, we covered all of the basics of how you might go about it.
Once you've gotten the hang of it, it starts to be pretty fun, so get out there and start hacking with SearchMonkey!
Mattt Thompson
YDN Tech Evangelist
Editor's note: Props and fond farewells to Mattt Thompson, who's heading back to college. Mattt, you'll be missed here at the YDN. So long and thanks for all the fishsticks.
Posted at 6:08 PM | Comments (2)
If you've been waiting 'till now to build a SearchMonkey application, you're in luck — we're pleased to announce that two intrepid SearchMonkey developers have independently created some great tutorials to help you get started.
First, Mark Birbeck of webBackplane has released a set of SearchMonkey tutorials that cover the DataRSS format, creating a DataRSS XSLT extractor, and creating a presentation application. Mark's specific focus is on government websites and expanding the ways these websites can share information with citizens. In fact, one of the reasons Mark is so interested in SearchMonkey is that SearchMonkey apps enable people to see what's possible when you embed RDFa in your pages. Mark also points out that custom data services are an excellent way to prototype these kinds of features before you go to the trouble of changing your markup.
If that weren't enough, Chris Lindsey of the Yahoo! Search Editorial team has posted a two-part SearchMonkey tutorial series on How to Build a Custom Data Service [PDF] and How to Build a SearchMonkey app (Infobar) [PDF]. These step-by-step tutorials are written with the newbie developer in mind, and cover various pitfalls that you might encounter when you're first learning how SearchMonkey works.
Finally, it's worth noting that a number of people have posted SearchMonkey-related presentations on slideshare.net, including Chris Heilmann, Neil Crosby, and David Berkowitz. If you've got SearchMonkey tutorials or interesting apps to share, feel free to let us know.
Evan Goer
Yahoo! SearchMonkey Team
Posted at 4:19 PM | Comments (2)
This is a cross-posting from the Yahoo! Search Blog.
In the past few months, SearchMonkey developers have told us they'd like to use Enhanced Results for site search. Yahoo! and other search engines have long had a site restrict operator (e.g. site:anysite.com) and other site search tools, but we decided to launch a new capability that lets you add a query parameter that automatically turns on the SearchMonkey Enhanced Result for the site you're searching. This is important for site owners because it makes it easier for their communities to get more complete answers when they search on Yahoo! Search.
This new parameter will work with any app that's in the Yahoo! Search Gallery as well as any official app. (To make an app official, a site owner just needs to authenticate their site using Site Explorer and then associate their app with their site when they make it sharable in the Developer Tool.)
How it works
To use this functionality, you just need to append a few parameters to a typical Yahoo! Search query string. Here's a quick example:
This query string can be generated using whatever mechanism you choose, including a simple search box on your site or blog. It works with both Infobars and Enhanced Results -- as long as the app is either in the Gallery or is official. If you accidentally try a SearchMonkey app along with a site restriction that doesn't match, the results will look like a typical site-restricted search. Here are a few more examples:
Site: Wikipedia
App: http://gallery.search.yahoo.com/application?smid=knb
SearchMonkey for site search query: http://search.yahoo.com/search?q=george+washington&vs=wikipedia.org&sm=knb
Site: Yahoo! Answers
App: http://gallery.search.yahoo.com/application?smid=ylc
SearchMonkey for site search query: http://search.yahoo.com/search?q=can+pigs+fly?&vs=answers.yahoo.com&sm=ylc
Site: java.sun.com
App: http://gallery.search.yahoo.com/application?smid=Zq0
SearchMonkey for site search query: http://search.yahoo.com/search?p=exception&vs=java.sun.com&sm=Zq0
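If you would rather generate these query strings in code than by hand, a minimal Python sketch might look like the following. The vs and sm parameter names come from the examples above. The function name is made up, and because the examples use both q and p for the search terms, the sketch takes that name as an argument (defaulting to p) instead of assuming which one is right.

```python
from urllib.parse import urlencode

SEARCH_URL = "http://search.yahoo.com/search"


def site_search_url(query, site, app_id, query_param="p"):
    """Build a site-restricted search URL that turns on a SearchMonkey app.

    query_param is the name of the search-terms parameter; the examples
    above show both "q" and "p", so it is left configurable here.
    """
    params = {query_param: query, "vs": site, "sm": app_id}
    # urlencode escapes spaces as "+" and keeps the parameters in order.
    return SEARCH_URL + "?" + urlencode(params)


java_url = site_search_url("exception", "java.sun.com", "Zq0")
wiki_url = site_search_url("george washington", "wikipedia.org", "knb",
                           query_param="q")
print(java_url)
print(wiki_url)
```

Both calls reproduce the corresponding example URLs above, which is a handy check that the encoding behaves as expected.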
Before we continue work on this feature, please let us know what you think below or on the Developer group. We'd be particularly interested to hear what else site owners would like to be able to customize in order to make SearchMonkey a more valuable site search tool.
The SearchMonkey team
Posted at 10:13 AM | Comments (1)
This is a cross-posting from the Yahoo! Search Blog.
Last month we opened up the Yahoo! Search Gallery to showcase all of the useful SearchMonkey applications that have been built by developers, site owners and Yahoo!. Today, we’re turning on a few of those applications for all users. Now, the Yelp, Yahoo! Local and LinkedIn Enhanced Results will automatically appear in the search results, eliminating the need for users to go into the Search Gallery to add them.
Why did we start with these applications? Before making an application “default on” we require a few things: access to the site’s structured data through semantic markup or a data feed, a well-designed and broadly useful application, and positive user metrics. To understand how a SearchMonkey app affects user metrics, we generally expose a small percentage of our users to a default-on experience and measure if and how it changes their usage. We started with Yelp, LinkedIn, and Yahoo! Local because they were among our first partners to share structured data. Our tests uncovered that users found these apps useful; in fact, in some cases, we saw a lift in click-through rate of as high as 15 percent.
In addition to testing a "default on" treatment, we also tested giving users the ability to add the LinkedIn Enhanced Result directly from the search results page. We'll continue to use this treatment as another way to promote high quality SearchMonkey apps to users.
While these are the first apps to be automatically included in the search results, they will certainly not be the last. We'll continue to work with our SearchMonkey developers to increase the exposure of other high-quality applications to the search results page in the months to come (so, make sure to submit your applications to the Gallery). Making it easier to find and add SearchMonkey apps is an important step in improving and enriching the search experience for our users.
In addition to turning these Enhanced Results on by default, we've also added "share-with-a-friend" functionality. By clicking the envelope icon on any Enhanced Result, users can now quickly send an email to their friends to share the app.
Let us know how these Enhanced Results are working and others you’d like to see in the comments below!
Amit Kumar
Director, Product Management
Yahoo! Search
Posted at 2:13 PM | Comments (0)
We are testing a new template design for Enhanced Result applications. This new template contains two key design improvements that stem directly from recent Yahoo! user research. If our tests are successful (we're exposing 5% of our traffic to this new template), we'll roll it out to all users in the coming months. Developers do not have to perform any actions to take advantage of this new redesign; all Enhanced Result applications will upgrade automatically. Infobar applications will be unaffected.




Key differences include:
Images appear to the right of the abstract, rather than the left. Our latest eye-tracking studies indicate that moving the image helps users selectively discover it — without disrupting their ability to scan the search result page. While we know that users do like relevant images, we also know that images can increase the risk that the result resembles an advertisement. Shifting the image to the right reduced the perception of the result as an advertisement, and caused users to report noticing the images more. The new design also helps users engage the image only when they feel it is relevant.
Links appear in a horizontal row, rather than as a vertical column next to the image. Research indicates that deep links are more discoverable when presented separately from the image. Additionally, moving the links into a horizontal row provides more real estate for the summary and key/value pairs.
In a subsequent post, one of our researchers will dive into the data behind this redesign in much more detail. Meanwhile, stay tuned — we'll let you know when the new designs are about to go live!
Evan Goer
Yahoo! SearchMonkey Team
Posted at 10:49 AM | Comments (0)

Search Monkey arrived in Paris with the sound of flying toy monkeys screeching as they flew through the La Cantine café. The festive, attentive group gathered to hear about the new Yahoo! Search feature and to fling the little guys across the room.
La Cantine is a cross between an Internet Café and a co-op workspace. Its central Paris location provides space to collaborate, learn, share, and explore the future of technology.

The Yahoo! Developer Network and local Yahoos joined forces to explain Search Monkey, an open interface for providing enhanced search result presentations. Participants were able to start building Search Monkey applications in small groups as well as learn about Microformats.
Search Monkey Applications
Search Monkey allows any developer to create enhanced presentations for a web site when it appears in Yahoo! Search. The standard result displays the page title, description, and web site address. A Search Monkey powered result could contain photographs, address, contact information, deep links to particular pages, and additional information from related pages. This makes search result pages more engaging for users.
Search Monkey applications do this by defining small bits of data that should fill the space. Search Monkey automatically grabs microformatted data on the page, as well as providing the standard Yahoo! Search indexed information. The programmer can also define custom data by scraping the desired page or using web services.
Search Monkey opportunities in France
Museums - Provide images, titles, artist, date, and other information for their objects when they appear in search results.
Traffic - Provide traffic conditions, rates, times, and tourist information when a city or train or metro route appears in the search result pages.
Restaurants - Directories could provide ratings, addresses, phone numbers, and summaries.
Theater - Cinema, Theater, Opera results include times, synopsis, rating, and location information.
Sports - Tour de France and other sports include player information, team scores, upcoming events, and standings.
Ted Drake
Web Developer - Global Finance
Posted at 10:50 AM | Comments (1)
Intrepid coder Bart Teeuwisse has written up an excellent technical account of creating "Tweet", a beautifully designed SearchMonkey app for Twitter. From a performance standpoint, writing a Twitter SearchMonkey app is particularly challenging, as Bart explains:
It turns out that execution speed of a SearchMonkey is key. To make the SearchMonkey Gallery a presentation monkey such as Tweet has to complete within a fraction of a second. Any call to fetch 3rd party takes too long to satisfy this requirement. Certainly calling Twitter's API whose fluctuating response times are all over the map.
Secondly, Twitter's profile API call takes a user ID, which first has to be extracted from Yahoo!'s indexed data. An additional data SearchMonkey can do that and whose output is the input to Tweet's profile feching data monkey. However, this chaining of data monkeys makes Tweet only slower.
Fortunately, Bart hit on a really clever solution: a mashup with Google App Engine, which acts as a simple proxy cache for Twitter data, which SearchMonkey can then consume. The result (after also adding Bart's own FriendNet infobar app):

Not only is the caching a nifty way to smooth out the API response times, but it also helps reduce the number of (rate-limited) API calls required. Read more about it at Bart's place.
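The general idea of a proxy cache can be sketched in a few lines of Python. To be clear, this is not Bart's implementation and has nothing App Engine or Twitter specific in it: the TTLCache class, the five-minute lifetime, and the slow_profile_lookup stand-in are all assumptions made for illustration.

```python
import time


class TTLCache:
    """Cache results of a slow lookup for a fixed number of seconds."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch      # the slow (and rate-limited) call to wrap
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the expiry logic can be tested
        self._entries = {}       # key -> (time stored, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # fresh enough: no upstream call
        value = self._fetch(key)  # missing or stale: go upstream once
        self._entries[key] = (now, value)
        return value


calls = []


def slow_profile_lookup(user):
    """Hypothetical stand-in for a third-party API call."""
    calls.append(user)
    return {"user": user}


fake_now = [0.0]
cache = TTLCache(slow_profile_lookup, ttl_seconds=300, clock=lambda: fake_now[0])

cache.get("bart")      # first request goes upstream
cache.get("bart")      # second request is served from the cache
fake_now[0] = 301.0
cache.get("bart")      # entry has expired, so we go upstream again
print(len(calls))
```

Repeat requests inside the lifetime never touch the slow API, which is what evens out the response times and saves rate-limited calls.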
Posted at 11:17 AM | Comments (0)
In our continued effort to search for the right balance between simplicity and expressiveness, we are revising some aspects of the DataRSS format used by SearchMonkey applications. As a first step, we have made it possible to provide a space-separated list of compact URIs as part of the rel and property attributes instead of providing a single value. This is to support situations where there are multiple properties (possibly from different vocabularies) expressing the proper relationship or there are multiple relationships to begin with. For example, mixing the FOAF and VCard vocabularies you may write:
<item rel="dc:subject rel:Card">
<meta property="vcard:fn foaf:name">Peter Mika</meta>
<item rel="foaf:knows xfn:co-worker">
<meta property="vcard:fn foaf:name">Amit Kumar</meta>
</item>
</item>
By providing support for space-separated lists of compact URIs (CURIEs), we are bringing DataRSS closer to RDFa, which also allows multiple values for the property and rel attributes.
The microformat data we generate now also features space-separated lists of CURIEs as in the above example. Note that presentation applications using Data::xpath and checking for the presence of a certain CURIE using equivalence ('=' in XPath) instead of containment (contains function in XPath) may fail. If your application uses Data::xpath and relies on microformat data from the index, you may need to update your application.
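The difference between the two checks is easy to see outside XPath. The Python sketch below is only an illustration with a made-up helper name. It also uses a whole-token test, which is slightly stricter than the substring match that XPath's contains function performs.

```python
def has_curie(attribute_value, curie):
    """Return True if curie is one of the space-separated tokens."""
    return curie in attribute_value.split()


value = "vcard:fn foaf:name"

equality_matches = (value == "foaf:name")      # the old '=' style check
token_matches = has_curie(value, "foaf:name")  # containment-style check

print(equality_matches, token_matches)
```

With a single CURIE both checks agree, but once the attribute carries a list, only the containment-style check still finds the property.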
As a next step, we will introduce support for the 'typeof' attribute of RDFa. For the moment, typing is explicit in DataRSS: in the above example, rel:Card is a property that expresses that the type of the item is a VCard. Based on the analysis of current usage it seems that this is confusing for many SearchMonkey developers and we need a clearer separation between types and properties. We will report on this improvement when it becomes available.
As always, your feedback is welcome. Please consider providing comments on existing features using our mailing list or proposing new features using the suggestions board.
Peter Mika
Data Architect for SearchMonkey
Posted at 12:04 PM | Comments (1)
This Wednesday, Yahoo!'s London-based SearchMonkeys hosted an evening to show the developer community just how ridiculously easy it is to build enhancements to their Yahoo! Search results using SearchMonkey. Even though we were up against one of Radiohead's concerts, and a Girl Geek Dinner, some fifty people came along to monkey around with us in the loft space of a lovely rambling building just off Covent Garden in London.
We were lucky to have Paul Tarjan, the Chief Technical Monkey from America along to give an overview of the inner workings of SearchMonkey. This was followed by Neil Crosby (that's me!) giving a live demo to show just how simple SearchMonkey makes enhancing your search results. Thankfully, the Internet stayed up, and the demos went without a hitch.
After the talks, the floor was opened for questions, and then people got down to the important task of eating pizza and making monkeys. Walking around the room, it was clear that people had interesting ideas about things they could make. I look forward to seeing them shared in the gallery soon.
The one recurring question I was asked during the evening was, "Are your slides available?" The good news is they're now up on Slideshare, and they walk through the process of creating a couple of simple monkeys as I did for our live audience.
All in all, a good time was had. We gave out a whole bunch of toy monkeys that make a ridiculous amount of noise as they fly through the air (sorry if they've shown up in your office), as well as a bunch of hats and stickers to remember us by. The biggest takeaway of all: Enhancing your search results is really easy with SearchMonkey.
We're planning more of these developer evenings in London over the coming months, so keep an eye on the YDN blog and come along to the next one!
Neil Crosby
Engineer, Yahoo London
Posted at 10:53 AM | Comments (1)
I'd like to share some highlights from the two-day Search Engine Strategies 2008 Toronto Conference & Expo that took place last week. I was delighted by the opportunity to give a talk on "Web 2.0 and Search Engines." My audience consisted of search engine marketers, salespeople, strategists, consultants, and engineers. The participants had a common goal: how to maximize search engine optimization -- for themselves and/or their clients.
Day 1: The keynote delivered by Fredrick Marckini was excellent, with many useful learnings. I also attended a great session called "Universal and Blended Search." Key takeaway here for me: Because users spend more time viewing images and video nowadays, SEO involves more than just text and links on the page.
The "Getting Found in Maps & Local Search" session offered useful information. The materials were not new to me, but that's probably because I'm a local search engineer. The session on "Twitter: Ultimate Time Waster, or Great Tool" was quite interesting. I didn't expect much from this session because I hadn't twittered before. After the session I started twittering.
The SES party sponsored by Yahoo! was great. Unfortunately, I couldn't party all night long because I needed to tweak my presentation and tune it for the SES audience. That's the advantage of presenting on Day 2 of a conference. Let's face it, as a speaker, you can't please everyone, but at least you can try to keep the audience awake. If you can't do that -- go take a Toastmasters course!
Day 2: I hid myself in a conference room to rehearse my presentation, which focused on Web 2.0, and how Yahoo! leverages microformats and semantic markup to enable SearchMonkey. With SearchMonkey and the semantic web, site owners can differentiate themselves from their competitors by creating their own blended search experience and unique presentation. Site owners can (theoretically) redefine the search result page heatmap. The customized search result and informative presentation will attract the users' focus to the publisher's listing. So the conventional wisdom of the heatmap and the golden triangle is outdated. SearchMonkey creates new opportunities for search marketers. SEO is no longer just about links, metadata, h1s, keywords, and text. It's about creating efficient access to information by recognizing the context of the data. Content alone is NOT King. Content without context is like a life without purpose.
My message to the audience was: Give meaning to your data with SearchMonkey's semantic markup -- then you'll be ahead of the curve. Happy SEOing!
Ambles Kwok
Technical Yahoo!
Yahoo! Canada
Posted at 8:55 AM | Comments (1)
The SearchMonkey team is inviting a whole lot of developers throughout (well, northwestern) Europe to come and meet SearchMonkey experts, have a drink and some food, and show and tell us what they could use SearchMonkey for.
Three locations in parallel will be monkeyed up:
So what are you waiting for? Go sign up and we'll see each other there next Wednesday!
Chris Heilmann
Yahoo Developer Network
Posted at 2:04 AM | Comments (0)
Here is a screencast showing how to make a "data triggered" plugin. It won't show the normal "Load Error" or "Enhancement Failed" for results that don't have the data. Remember, this technique only works on preexisting data, and not data services, so go mark up your pages! :)
On Yahoo Video: http://video.yahoo.com/watch/2827476/8201306
Module (so you can clone it if somehow the gallery clone doesn't work) : SearchMonkey-YDirectory.txt
Gallery (so you can use it, and clone it) : Y!Directory in Gallery
Posted at 12:28 PM | Comments (3)
On Wednesday night we announced that the Search Gallery, which had previously been open to developers, is now available to users. You will now notice a “Customize” drop down menu in the header of the search results pages. From there, users can browse applications and add those they find useful.

There are already a bunch of great applications in the Gallery, and a number of our partners have blogged about their apps. Here are a few examples:
last.fm built a great app for music searching, and they blogged about the process here.

LinkedIn blogged about their app, which includes profile photos and details.

Trulia’s blog discusses their app, which includes photos and links to related sections of their site.

Thanks again to everyone who’s built SearchMonkey apps (and blogged about it!). Don't forget to enter your apps in the SearchMonkey Developer Challenge!
Graham Mudd
Posted at 9:41 AM | Comments (0)
Recently we added two new requirements for gallery apps. If you have apps in the gallery (or are submitting new ones), please check that the apps meet these guidelines:
getOutput() method, so we default to the simple search result.
And for other requirements please refer to our previous blog post here.
Lawrence Kim
Yahoo! Search
Posted at 2:34 PM | Comments (0)
Digital Web just released my Step-by-Step walkthrough on how to create your first SearchMonkey application.
This is basically the demonstration we showed the attendees at the SearchMonkey developer breakfast in London and is a good starting point for those who like visual explanations.
Christian Heilmann
Yahoo Developer Network
Posted at 11:06 AM | Comments (0)
Have you ever wondered which pages on the internet have hCard on them? Or hAtom? hReview? XFN? Well, now that Yahoo! Search is indexing all of these formats, you can easily search the web for them.
Just do a search for:
We sort the output for relevance too, so you can add in query terms as well. For example, to find a SearchMonkey meeting advertised in hCalendar format, search for: searchmonkey searchmonkeyid:com.yahoo.uf.hcalendar.
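If you build such queries programmatically, the pattern is simply free-text terms followed by the searchmonkeyid: operator. The sketch below is an illustration only: format_query is a hypothetical helper, and the URL-encoding step assumes you are placing the query into a query-string parameter of whatever search URL you use.

```python
from urllib.parse import quote_plus


def format_query(format_id, *terms):
    """Combine free-text terms with a searchmonkeyid: operator."""
    return " ".join([*terms, f"searchmonkeyid:{format_id}"])


query = format_query("com.yahoo.uf.hcalendar", "searchmonkey")
print(query)              # searchmonkey searchmonkeyid:com.yahoo.uf.hcalendar
print(quote_plus(query))  # searchmonkey+searchmonkeyid%3Acom.yahoo.uf.hcalendar
```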
Enjoy the Monkey.
Paul Tarjan
(|): Chief Technical Monkey :(|)
Posted at 9:21 PM | Comments (4) | TrackBack
Besides the existing support for microformats, we have already shared our plans for supporting other standards for embedding metadata into HTML. Today we are announcing the availability of eRDF metadata for SearchMonkey applications, which will soon be followed by support for RDFa. SearchMonkey applications can make direct use of the eRDF data by choosing the com.yahoo.rdf.erdf data source, while RDFa data will appear under com.yahoo.rdf.rdfa. Nothing changes in the way applications are created: as SearchMonkey applications have already been built on a triple-based model, the same applications can work on microformat, eRDF, or RDFa data.
Content publishers, however, will now have an even wider array of choice for providing metadata inside HTML. Therefore it is worthwhile to briefly summarize the key differences between microformats, eRDF and RDFa and the possible migration paths across these approaches:
There are plenty of resources for familiarizing yourself with both eRDF and RDFa. The SearchMonkey guide has a brief overview of the topic. The eRDF specification and the RDFa Primer are more technical but also complete; they contain plenty of examples and are still fairly easy to read. The tools supporting eRDF are listed on the same page as the specification. Here are some links to the RDFa implementations and tools.
In summary, our support for eRDF and RDFa brings even more choice for publishers while opening up new data sources for application developers!
Peter Mika
Data Architect, SearchMonkey
Posted at 10:08 AM | Comments (5)
Just a quick note for everyone who is using any extracted hCard data. The way we store hCard changed. This was done to be more faithful to the vcard-in-rdf specification, and it also adds support for vcard:adr and organization name.
| Field | Version | Path |
|---|---|---|
| URL | Old | com.yahoo.uf.hcard/rel:Card/vcard:url |
| URL | New | com.yahoo.uf.hcard/rel:Card/@resource |
| Photo | Old | com.yahoo.uf.hcard/rel:Card/rel:Photo/@resource |
| Photo | Old | com.yahoo.uf.hcard/rel:Card/vcard:photo |
| Photo | New | com.yahoo.uf.hcard/rel:Card/vcard:photo/@resource |
We are going to have about 10 minutes of downtime tonight (around 10pm PST) while we do a string replace on the database for those fields. If you were doing any fancy things that wouldn't be caught by a simple string replace from OLD to NEW, then please update your code accordingly (and republish to the gallery if needed).
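If you keep copies of your application code outside the developer tool and want to apply the same old-to-new rename yourself, here is a rough sketch in Python. It is not the script we run on the database; migrate_hcard_paths is a hypothetical helper built only from the old and new paths listed above. It does the replacement in a single pass and skips photo paths that are already in the new form, so running it twice does not append /@resource twice.

```python
import re

PREFIX = "com.yahoo.uf.hcard/rel:Card/"

# Old path suffix -> new path suffix, taken from the paths listed above.
RENAMES = {
    "vcard:url": "@resource",
    "rel:Photo/@resource": "vcard:photo/@resource",
    "vcard:photo": "vcard:photo/@resource",
}

# The negative lookahead leaves already-migrated photo paths untouched.
PATTERN = re.compile(
    re.escape(PREFIX)
    + r"(vcard:url|rel:Photo/@resource|vcard:photo(?!/@resource))"
)


def migrate_hcard_paths(text):
    """Rewrite old hCard data paths in `text` to their new form."""
    return PATTERN.sub(lambda m: PREFIX + RENAMES[m.group(1)], text)


old = "com.yahoo.uf.hcard/rel:Card/vcard:photo"
print(migrate_hcard_paths(old))
```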
Paul Tarjan
(|): Chief Technical Monkey :(|)
Posted at 9:14 PM | Comments (0)
Ever since the SearchMonkey Gallery went live, we have been reviewing tons of app submissions from our developer community late into the night. We’re trying to discover apps that are creative, useful, fun and engaging so that we can proudly showcase them to the rest of the world. For those who have submitted their apps to the gallery, thank you for your contributions and we’re excited to have you participate! If you have developed an app but did not submit it to the gallery, please consider sending it over so that the rest of the Yahoo! community can benefit from your creation.
For developers who want to get their apps into the gallery, below is a list of things we consider when reviewing your submissions.
Well, now you know some of our key considerations when reviewing your apps. Please use this as a guideline for writing your awesome SearchMonkey applications in the future!
Lawrence Kim
Yahoo! Search
Posted at 7:30 AM | Comments (1)
It's been a week since the SearchMonkey Launch Party (which, by the way, was a TON of fun), but we've still been slaving in the jungles for you.
There is another SearchMonkey release going out RIGHT NOW with many neat new features. In no specific order:
<meta property="vcard:fn"><xsl:value-of select="substring-before(substring-after(//script, 'I think that '), ' is awesome')" /></meta>
Will still get my name from :
<script>alert("I think that Paul Tarjan is awesome!")</script>
P.S. For anyone who puts that script on their site--10 points, for some definition of "points."
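To see what the nested substring-after / substring-before expression above evaluates to, here is a small Python sketch that mirrors the two XPath 1.0 string functions. The function names are chosen for illustration; in XPath, both functions return an empty string when the marker is not found, which the sketch reproduces.

```python
def substring_after(text, marker):
    """XPath-style substring-after: empty string when marker is absent."""
    _, _, after = text.partition(marker)
    return after


def substring_before(text, marker):
    """XPath-style substring-before: empty string when marker is absent."""
    before, found, _ = text.partition(marker)
    return before if found else ""


# Text content of the <script> element from the example.
script_text = 'alert("I think that Paul Tarjan is awesome!")'

name = substring_before(
    substring_after(script_text, "I think that "), " is awesome"
)
print(name)  # Paul Tarjan
```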
We'll keep you posted about other changes as we roll them out. Don't forget to post to the Developer and Site owner Yahoo! groups about anything and everything. Well, maybe not your entire life history, but at least the part with SearchMonkey in it.
Paul Tarjan
(|): Chief Technical Monkey :(|)
Posted at 8:56 PM | Comments (0) | TrackBack
Check out this informative article on writing applications on the SearchMonkey platform at SearchEngineLand - a must-read!
Done? Now write your own application!
Posted at 3:00 PM | Comments (0)
This morning a group of selected developers from different companies took their first steps in creating their first SearchMonkey apps in the London office of Yahoo.
Developers from Skype, the BBC, eBay, MySpace and other interested parties came to the office to enjoy sandwiches, coffee and monkey cookies and to get a quick introduction to the idea of SearchMonkey, a sense of how it will apply to their companies, and hands-on help in developing their first applications.
Richard Allinson, Ben Cosgrave and Russell Aronson from the UK monkey team helped the others find the easiest way to get their buried content out into search result pages using a few lines of XSLT and PHP.
The initial feedback was very positive and one of the great things about SearchMonkey - making people look more closely at the markup of their sites and the idea of adding Microformats to ease data extraction - took off almost immediately.
What are you waiting for? Start monkeying around, too!
Chris Heilmann
Yahoo Developer Network
Posted at 5:05 AM | Comments (0)
Original post featured on Yahoo! Search Blog
It's been three weeks since we began the limited preview of Yahoo! Search's new open developer platform, SearchMonkey. Today, we're officially opening up the doors to all developers -- professionals and hobbyists -- to begin building applications that enhance the usefulness and relevance of search results.
There are three components to this open ecosystem:
So, what's in it for developers?
With SearchMonkey, developers have a hand in shaping the next generation of search by building customized search results and mash-ups that users can add to their Yahoo! Search experience. By leveraging structured data from sites like CitySearch, StumbleUpon, eBay, or Epicurious.com, developers can add navigational links, reviews, contact information, and even locations to provide enhanced search listings.
Developers can build two types of applications using SearchMonkey: Enhanced Results and Infobars. Enhanced Results replace the current standard results with a richer display. All the links in the Enhanced Results must point to the site to which the result refers. Infobars are appended below search results and can include metadata about the result, related links or content, or links for user actions (such as adding a movie to a Netflix queue).
The process for building SearchMonkey applications is very straightforward:
Announcing the SearchMonkey Developer Challenge
To foster innovation and creativity on the SearchMonkey platform, we're hosting a good old-fashioned competition. The SearchMonkey Developer Challenge will recognize innovative applications in four categories: Best Enhanced Result, Best Infobar, Most Innovative Use of Structured Data, and Best Data Service, plus a Grand Prize (best over all categories). You have until June 14th to submit your applications for a chance to win up to $10,000.
And don't forget to come kick things off with us this evening at the SearchMonkey Developer Launch Party. Catch live demos, meet the product team and enjoy free food, beer and, of course, schwag at Yahoo!'s headquarters in Sunnyvale, CA.
Whether you can join us for the party or not, keep in touch -- visit our suggestion forum or leave us a comment below. We want to know how the tool is working out for you.
We look forward to evolving web search with you.
Amit Kumar
Director, Product Management
Yahoo! Search
Posted at 9:04 AM | Comments (4)
Adding structured data to your site doesn't need to be complicated or difficult. It can be as simple as adding a handful of attributes to your page -- "class" and "rel" are the most common. Many sites use semantic markup, and already have these attributes, in which case you can insert additional values into existing attributes, since these actually hold space-separated lists. Microformats is the name of one common method for using this kind of simple markup.
Microformats are community-driven standards, put together and maintained by volunteers outside of a formal organization like IETF. Typically, they cover well-worn use cases, a concept called "paving the cowpaths." A number of microformat specifications in various stages of development are available at microformats headquarters.
Initially, the Yahoo! Search indexer supports the following microformats:
* hCard for personal or organization contact info
* hCalendar for event descriptions and timelines
* hAtom for syndicated content as might appear in an RSS feed
* hReview to record review ratings such as "8.5 out of 10"
* XFN to track relationships on the social graph in a lightweight fashion
The Web has a huge number of helpful articles and tutorials on using microformats. If the structured data your site exposes falls into any of the categories above, then microformats are probably a good choice for you.
Here's a simple example. If your personal site already has markup like this, pointing to one of your other sites:
<a href="http://myothersite.com/blog">My site</a>
Add XFN with a single attribute, like this:
<a rel="me" href="http://myothersite.com/blog">My site</a>
The value of rel="me" indicates that the other site is also representative of you.
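To give a feel for how a consumer might pick up this attribute, here is a minimal sketch using Python's standard html.parser. It is not how the Yahoo! Search indexer works; RelMeFinder and find_rel_me are hypothetical names, and the second link (with the XFN values "friend" and "met" and an example.org URL) is made up to show that rel is treated as a space-separated list of tokens.

```python
from html.parser import HTMLParser


class RelMeFinder(HTMLParser):
    """Collect href values of <a> elements whose rel list contains 'me'."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel holds a space-separated list, so compare whole tokens.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "me" in rel_tokens and attr.get("href"):
            self.links.append(attr["href"])


def find_rel_me(html):
    finder = RelMeFinder()
    finder.feed(html)
    finder.close()
    return finder.links


page = (
    '<a rel="me" href="http://myothersite.com/blog">My site</a>'
    '<a rel="friend met" href="http://example.org/">A friend</a>'
)
print(find_rel_me(page))  # ['http://myothersite.com/blog']
```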
A more involved example requires changes across more than one element. Let's say a page mentions a review of an iPod, like this:
<div>Overall, I give the iPod a rating of 8 (out of 10)</div>
To add hReview markup to this, a few additional wrapper elements are needed, like this:
<div class="hreview">
Overall, I give the
<span class="item">
<span class="fn">iPod</span>
</span>
a rating of
<span class="rating">
8
</span>
(out of
<span class="best">
10
</span>
)
</div>
This is simplified markup. Consult the microformats.org site for specific details. In general, these changes indicate that the overall structure is a review ("hReview"), that the item being reviewed is an iPod (additional details such as a URL are helpful), and the rating is 8 out of a possible 10.
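As a rough illustration of why the wrapper elements matter, the sketch below reads the simplified markup above with Python's standard html.parser and pulls out the three marked-up values. This is an assumption-laden toy, not a conforming hReview parser and not the indexer's implementation: HReviewSketch and parse_hreview are hypothetical names, and only the "fn", "rating", and "best" classes are handled.

```python
from html.parser import HTMLParser

FIELDS = ("fn", "rating", "best")
VOID = {"br", "img", "hr", "input", "meta", "link"}  # tags with no end tag


class HReviewSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []    # class tokens of each currently open element
        self.review = {}   # field name -> collected text

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        self.stack.append((dict(attrs).get("class") or "").split())

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Only collect text inside an element marked class="hreview".
        if not any("hreview" in classes for classes in self.stack):
            return
        # Attribute the text to the innermost enclosing field, if any.
        for classes in reversed(self.stack):
            field = next((f for f in FIELDS if f in classes), None)
            if field:
                self.review[field] = self.review.get(field, "") + data
                break


def parse_hreview(html):
    parser = HReviewSketch()
    parser.feed(html)
    parser.close()
    return {name: text.strip() for name, text in parser.review.items()}


markup = """<div class="hreview">
Overall, I give the
<span class="item"><span class="fn">iPod</span></span>
a rating of <span class="rating">8</span>
(out of <span class="best">10</span>)
</div>"""
print(parse_hreview(markup))  # {'fn': 'iPod', 'rating': '8', 'best': '10'}
```

Run against the unannotated div from the first version of the example, the same function returns an empty dictionary, since nothing in that markup says which words are the item or the rating.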
What if your structured markup needs go beyond the list of supported microformats? Please provide us with feedback, as we are continuously evaluating and adding support for additional microformats. On the other hand, you might want to consider using more expressive RDF markup. If you want to expose structured data but aren't ready (or able) to make site changes yet, then you might consider writing a custom data service. Stay tuned, we'll cover these topics in future blog posts.
By the way, we'll be talking about microformats among other things later today (Thursday, May 15) at the SearchMonkey Launch Party at Yahoo!'s Sunnyvale headquarters. Here are the details – join us for a bit of SearchMonkey talk and plenty of beer, food, and schwag.
Micah Dubinko
SearchMonkey Team Alumnus
Posted at 7:11 AM | Comments (4)
A few weeks ago, we announced SearchMonkey, a new open platform that lets developers and site owners use semantic markup and structured data to enhance Yahoo! Search results and make them more useful, relevant, and visually appealing.
We wanted to remind you that we're kicking off this launch in true SearchMonkey style with a Developer Launch Party next Thursday, May 15. Come get the inside scoop on SearchMonkey, meet with our product managers and engineers over tasty (read: free!) food and beer, see live demos, and take a closer look at the Developer Tool.
When: May 15, 2008, 5:30 -- 8:30 p.m.
Where: Yahoo! Headquarters @ URL's Cafe
701 First Ave.
Sunnyvale, CA 94089
RSVP: Email your full name and company name directly
to searchmonkeyevent@yahoo-inc.com. Space is limited.
For more information on the agenda and logistics, check out the event page.
Posted at 11:58 AM | Comments (0)
In February, we began talking about our plans to open up Yahoo! Search to website owners and all third-party developers. This new developer platform, which we’re calling SearchMonkey, uses data web standards and structured data to enhance the functionality, appearance, and usefulness of search results.
With SearchMonkey:
As Ari Balogh, Yahoo!’s CTO, will mention in his Web 2.0 Expo keynote this morning, we’re rolling out a limited preview of the SearchMonkey developer tool starting today. With this online tool, developers can build data services that can be used to present richer, more useful search results. These data services can be constructed using structured data either from the Yahoo! Search index or from publicly available sources (such as APIs).
In addition to signing up for the preview, head over to the Yahoo! Developer Network booth at Web 2.0 Expo to check out a demo.
And what’s a new product without a party to kick it off? We’re celebrating this component of our open platform with a Developer Event at our Sunnyvale campus on Thursday, May 15. If you’re interested in joining us, here’s more information. Hope to see you there!
Amit Kumar
Chief SearchMonkey
[This post originally featured on the Yahoo! Search blog]
Posted at 11:13 PM | Comments (0)
Copyright © 2008 Yahoo! Inc. All rights reserved.