Yahoo! Developer Network Blog
October 31, 2008
Create a niche search engine with Yahoo! BOSS
The Yahoo! BOSS API allows you to access the Yahoo! search index with new levels of freedom. You can rearrange the results, change their look, make unlimited requests, mash the results with other resources, and you don't even have to let people know that Yahoo! is powering the page. Many people are busy mashing the BOSS results with internal data sets, proprietary logic, and new visual interfaces.
BOSS is perfect for creating a niche search engine. With just a few tweaks, you can create a site that is fine-tuned to your particular subject. This article will walk you through some of those options.
Any Volkswagen aficionado can tell you it is difficult to find good information in search engines. VWs unfortunately do not have unique model names. Searching for information about a Rabbit, Bug, Beetle, or Golf is frustrating. You have to dig through thousands of results about insects, mammals, and Tiger Woods [sample unfiltered search result (.xml)]. However, we can create a killer VW search engine with just a few BOSS configurations.
First Steps
You'll need to apply for an application ID to create your own BOSS-based site. Get one at the Yahoo! Developer Network. It will only take a few minutes and you can update your information later.
This article shows sample API requests. Replace "your-BOSS-app-id" with the ID you receive from the previous step.
Site-Specific Search
This is the easiest configuration. Let's assume you want to search within a single site, such as VW.com. BOSS recognizes many of the search filters you would use in any search box. This includes +, -, "", and site:. Let's make a web service request that searches only the official VW web site.
The key here is the query. We will take the user's request and append "site:vw.com" to it. [sample site: search result (.xml)]
http://boss.yahooapis.com/ysearch/web/v1/golf+site:vw.com?appid=your-BOSS-app-id&format=xml&start=0&count=15
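If you are assembling that request in code, here is a minimal Python sketch of one way to do it. The function name site_search_url and the constant BOSS_WEB_ENDPOINT are made up for this illustration; only the URL pattern comes from the request above.

```python
from urllib.parse import quote_plus, urlencode

# Endpoint taken from the sample request in this article.
BOSS_WEB_ENDPOINT = "http://boss.yahooapis.com/ysearch/web/v1/"


def site_search_url(user_query, site, app_id, start=0, count=15):
    """Build a BOSS web search URL restricted to a single site."""
    # Append the site: filter to whatever the visitor typed.
    query = f"{user_query} site:{site}"
    # quote_plus turns spaces into "+"; ":" is kept literal so the
    # filter looks the same as in the article's example.
    path = quote_plus(query, safe=":")
    params = urlencode(
        {"appid": app_id, "format": "xml", "start": start, "count": count}
    )
    return f"{BOSS_WEB_ENDPOINT}{path}?{params}"


url = site_search_url("golf", "vw.com", "your-BOSS-app-id")
print(url)
```

Called with "golf" and "vw.com", this reproduces the sample request shown above. Multi-word queries are joined with "+" in the same way.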
Define Your Pool of Expert Resources
The site-specific filter is going to limit the usefulness of your site. Let's open the results to a wider range of resources. The "sites" query parameter lets you define a list of sites for BOSS to search through. While BOSS can handle a large number of sites, you are limited by the length of a URL in the request. To be safe, keep your list to no more than 30 sites.
The BOSS team is evaluating options for massive lists. However, this isn't an issue for most niche search engines.
The pattern is pretty simple. Insert a sites query parameter that equals a comma-separated list of URLs. Here's a very brief list of VW experts for this demonstration [ sample sites based search result (.xml) ]:
http://boss.yahooapis.com/ysearch/web/v1/golf?sites=vw.com,vwtrendsweb.com,performancevwmag.com,caranddriver.com&appid=your-BOSS-app-id&format=xml&start=0&count=15
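Here is a rough Python sketch of building that request from a list of sites. The names expert_search_url and MAX_SITES are invented for this example, and the cap of 30 is only the rule of thumb suggested above, not a limit enforced by BOSS.

```python
from urllib.parse import quote_plus, urlencode

BOSS_WEB_ENDPOINT = "http://boss.yahooapis.com/ysearch/web/v1/"
# Rule of thumb from this article to keep the URL short (an assumption,
# not a documented API limit).
MAX_SITES = 30


def expert_search_url(user_query, sites, app_id, start=0, count=15):
    """Build a BOSS request that searches only the listed sites."""
    if len(sites) > MAX_SITES:
        raise ValueError(f"too many sites: {len(sites)} > {MAX_SITES}")
    params = urlencode(
        {
            "sites": ",".join(sites),
            "appid": app_id,
            "format": "xml",
            "start": start,
            "count": count,
        },
        safe=",",  # keep the commas in the sites list readable
    )
    return f"{BOSS_WEB_ENDPOINT}{quote_plus(user_query)}?{params}"


VW_EXPERTS = ["vw.com", "vwtrendsweb.com", "performancevwmag.com", "caranddriver.com"]
print(expert_search_url("golf", VW_EXPERTS, "your-BOSS-app-id"))
```

Rejecting an oversized list up front is one way to keep the request from outgrowing the URL length limit without you noticing.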
Now that you've created a set of experts for VW news and information, you can create a second group for parts and stores. This is how you can quickly create sub-categories in your search site.
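One simple way to model these sub-categories is a lookup table from category name to site list. The Python sketch below assumes that layout. SITE_GROUPS and category_search_url are hypothetical names, and the "parts" domains are placeholders rather than real recommendations.

```python
from urllib.parse import quote_plus, urlencode

BOSS_WEB_ENDPOINT = "http://boss.yahooapis.com/ysearch/web/v1/"

# One list of sites per sub-category of the search site.
SITE_GROUPS = {
    "news": ["vw.com", "vwtrendsweb.com", "performancevwmag.com", "caranddriver.com"],
    # Placeholder domains: swap in the parts sites and stores you trust.
    "parts": ["parts.example.com", "store.example.com"],
}


def category_search_url(user_query, category, app_id, start=0, count=15):
    """Build a BOSS request limited to the sites of one sub-category."""
    if category not in SITE_GROUPS:
        raise KeyError(f"unknown category: {category}")
    params = urlencode(
        {
            "sites": ",".join(SITE_GROUPS[category]),
            "appid": app_id,
            "format": "xml",
            "start": start,
            "count": count,
        },
        safe=",",  # keep the commas in the sites list readable
    )
    return f"{BOSS_WEB_ENDPOINT}{quote_plus(user_query)}?{params}"


for category in SITE_GROUPS:
    print(category, category_search_url("golf", category, "your-BOSS-app-id"))
```

Adding a new sub-category then only means adding another entry to the table.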
Refine Your Search Results
As mentioned earlier, the Yahoo! BOSS API recognizes most of the advanced search filters. Let's see how we can start making our results even more specific. These can be used with either of the above techniques.
Tag Search
While BOSS does not have a tag search function, you can use the inurl: filter to get similar functionality. This works especially well if your set of resources includes blogs, which often have the tag as part of the URL, e.g. http://myblog.com/golf/sample-post [ sample inurl: search result (.xml) ]. Notice how the display URL in the sample XML has the query term wrapped in b tags. This makes it easier for your visitors to recognize the filtered results.
http://boss.yahooapis.com/ysearch/web/v1/inurl:golf?sites=vw.com,vwtrendsweb.com,performancevwmag.com,caranddriver.com&appid=your-BOSS-app-id&format=xml&start=0&count=15
Title Search
You may find better results by using the intitle: filter instead. This will only return pages that have the query in their title.
[ sample intitle: search result (.xml) ]
http://boss.yahooapis.com/ysearch/web/v1/intitle:golf?sites=vw.com,vwtrendsweb.com,performancevwmag.com,caranddriver.com&appid=your-BOSS-app-id&format=xml&start=0&count=15
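The inurl: and intitle: requests differ only in the prefix placed in front of the query term, so a single helper can cover both. This is a Python sketch under that assumption. filtered_search_url and ALLOWED_FILTERS are names invented for the example.

```python
from urllib.parse import quote_plus, urlencode

BOSS_WEB_ENDPOINT = "http://boss.yahooapis.com/ysearch/web/v1/"
# Only the two filters discussed in this article are accepted here.
ALLOWED_FILTERS = {"inurl", "intitle"}


def filtered_search_url(term, search_filter, sites, app_id, start=0, count=15):
    """Build a BOSS request that applies inurl: or intitle: to one term."""
    if search_filter not in ALLOWED_FILTERS:
        raise ValueError(f"unsupported filter: {search_filter}")
    # ":" stays literal so the path reads "intitle:golf", as in the examples.
    path = quote_plus(f"{search_filter}:{term}", safe=":")
    params = urlencode(
        {
            "sites": ",".join(sites),
            "appid": app_id,
            "format": "xml",
            "start": start,
            "count": count,
        },
        safe=",",  # keep the commas in the sites list readable
    )
    return f"{BOSS_WEB_ENDPOINT}{path}?{params}"


VW_EXPERTS = ["vw.com", "vwtrendsweb.com", "performancevwmag.com", "caranddriver.com"]
print(filtered_search_url("golf", "inurl", VW_EXPERTS, "your-BOSS-app-id"))
print(filtered_search_url("golf", "intitle", VW_EXPERTS, "your-BOSS-app-id"))
```

You could let visitors pick the filter from a drop-down and pass their choice straight into the second argument.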
Get Related Sites
Now let's add another layer to your search results. We can make a secondary request for each result to find related web pages. This would have some performance impact, but could be done as an AJAX request after the page has loaded.
We will use the related: filter. Let's grab this result from the above intitle: search result: http://www.caranddriver.com/car/2006-models/2006-golf.html. We will now create a secondary request for this URL to find websites that are related to it. [sample related search result (.xml)]
http://boss.yahooapis.com/ysearch/web/v1/related:http://www.caranddriver.com/car/2006-models/2006-golf.html?appid=your-BOSS-app-id&format=xml&start=0&count=15
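To issue one secondary request per result, you first need a related: URL for each result URL. The Python sketch below builds them. related_url is a hypothetical helper. Percent-encoding "?" and "&" inside the result URL is a precaution added in this example so they are not mistaken for the request's own query string, not something the sample request shows.

```python
from urllib.parse import quote, urlencode

BOSS_WEB_ENDPOINT = "http://boss.yahooapis.com/ysearch/web/v1/"


def related_url(result_url, app_id, start=0, count=15):
    """Build the secondary BOSS request for pages related to result_url."""
    # ":" and "/" stay literal to match the article's example; characters
    # such as "?" and "&" are percent-encoded.
    path = quote(f"related:{result_url}", safe=":/")
    params = urlencode(
        {"appid": app_id, "format": "xml", "start": start, "count": count}
    )
    return f"{BOSS_WEB_ENDPOINT}{path}?{params}"


# In a real page these would come from the first search response.
result_urls = ["http://www.caranddriver.com/car/2006-models/2006-golf.html"]
for result_url in result_urls:
    print(related_url(result_url, "your-BOSS-app-id"))
```

Because each result triggers its own request, it makes sense to generate these URLs up front and fetch them only after the main results are on screen.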
Start Innovating
You've now created a set of authorities on your niche subject, given the user the ability to fine-tune the results, and triggered a secondary request for related web sites. Next steps: offer multi-language support, display news and images for the query, and use Yahoo! Pipes to mash the results with other services. The possibilities are only limited by your imagination.
Ted Drake
Yahoo! Paris
Posted at October 31, 2008 10:17 AM
Comments
link on "Yahoo! BOSS API" is broken (htp:// -> http://)
Posted by: d2 at November 3, 2008 2:37 PM
hi fellas , thanks for all the great BOSS work .
i really love the technology and esp the amount of tweaking possible with BOSS .
i am looking forward for your best work to go beyond , and my site with boss is www.kidsuki.com/search.php .
i want to know how to remove certain set sites from the query result .
An example of this requirement is that , pokemonaholic.com is no more but yahoo index has lot of its result .
And any search with boss throws these as the relevant .
So how can i remove a set of sites from the result . Also can we refresh the index at will ?
Posted by: Deepan prabhu at November 28, 2008 12:41 AM
yay! this is what I'm looking for. Thanks for these tips, now i can begin with my boss app project. I really have a hard time in figuring out where to begin after having my API ID. Now, with this post, I know where to start. Hehehe.. I'm such a noob at this, but I'll try hard to create a good BOSS web app. :)
Posted by: Jehzeel Laurente at February 9, 2009 12:04 PM
Very helpful information.
Posted by: Carl Bowen at August 9, 2009 12:17 PM