Yahoo! Developer Network Blog
November 26, 2008
Finding relevant keywords for any term using Yahoo! BOSS
One of the new features of Yahoo! BOSS is that it returns relevant keywords for each of the search results. You can use this to match a search term with relevant keywords, for example to increase the search engine visibility of your web site. I've just released a free tool that does this for you: Keywordfinder. Here's a short explanation of how you can do the same using only JavaScript.
Before we start, make sure you get an application ID for BOSS. Got one? Good, let's go!
The plan to get to the keywords is simple: perform the search, get all the keywords and find out which keywords are most frequent.
Turning that into code is not quite as simple, but not really magic either.
First of all, we need to get the information from BOSS by assembling the right REST call and defining a callback function.
The URL for that is:
http://boss.yahooapis.com/ysearch/web/v1/{term}?
format=json&callback=mycallback&count=50&
view=keyterms&appid={your-app-id}
This will search the web for {term}, return the results as JSON, wrap the information in a call to a function named mycallback, limit it to 50 results and show the keywords Yahoo! indexed for each of the sites (this is the view=keyterms parameter).
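To make the URL assembly a bit more tangible, here is a minimal sketch of a helper that builds the call from its parts. The function name buildBossUrl and its argument order are made up for this illustration and are not part of the wrapper script; the view value used is keyterms, which is what the wrapper in this article sends.

```javascript
// Hypothetical helper, not part of the original script:
// assembles the BOSS REST call from its individual parts.
function buildBossUrl(term, appId, callbackName, count) {
  var base = 'http://boss.yahooapis.com/ysearch/web/v1/';
  var params = [
    'format=json',                                   // ask for JSON
    'callback=' + encodeURIComponent(callbackName),  // JSONP wrapper function
    'count=' + count,                                // number of results
    'view=keyterms',                                 // include keywords per result
    'appid=' + encodeURIComponent(appId)             // your application ID
  ];
  // Encode the term so spaces or ampersands cannot break the URL
  return base + encodeURIComponent(term) + '?' + params.join('&');
}

var url = buildBossUrl('tasmanian devil', 'your-app-id', 'mycallback', 50);
```

Encoding the term is a choice made for this sketch: it assumes you pass in a raw, not yet URL-encoded, term.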
We could hard-code the term in a function, but to keep things flexible, let's create a small wrapper that you can re-use in all your JavaScript solutions.
You can download the script or see it in action and here is the source with step-by-step explanations:
KEYWORDS = function(){
var config = {
appID:'your-app-id',
amount:20
}
We create a new module called KEYWORDS and give it a configuration object with two properties: the application ID (replace this with yours!) and the number of keywords you want to return.
var out = {};
var callback = alert;
function getTerms(term,cb){
var api = 'http://boss.yahooapis.com/ysearch/web/v1/' +
term + '?format=json&view=keyterms&callback=KEYWORDS.seed'+
'&appid=' + config.appID;
var s = document.createElement('script');
s.src = api;
s.type = 'text/javascript';
document.getElementsByTagName('head')[0].appendChild(s);
if(typeof cb === 'function'){
callback = cb;
}
}
We define an object called out which will later on store the result of all our work. We predefine the callback of our wrapper as alert which means that if your scripts using this wrapper don't define a callback function the information will be shown as an alert (a good reminder to define a callback). If there is a callback function, it overrides the alert preset. We create a new script node with the correct API call and add it to the head of the document. As we defined KEYWORDS.seed() as the callback for the BOSS call, this method will retrieve the relevant data once the call has been successful.
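If the script node trick is new to you, this small simulation may help. It is only a sketch: the response body below is heavily trimmed and hypothetical, containing just the properties our seed() function reads, and new Function stands in for the browser executing the loaded script. The KEYWORDS object here is a stub, not the real wrapper.

```javascript
// Stub standing in for the real wrapper: it only remembers what it was given.
var jsonpReceived = null;
var KEYWORDS = {
  seed: function (o) { jsonpReceived = o; }
};

// Hypothetical, trimmed-down response body. The real service sends much more
// data; this only shows that the JSON arrives wrapped in a function call.
var responseBody = 'KEYWORDS.seed({"ysearchresponse":{"resultset_web":[' +
  '{"keyterms":{"terms":["wombat","marsupial"]}}]}})';

// In the browser the script node executes this text for us.
// new Function imitates that step outside of a browser.
new Function('KEYWORDS', responseBody)(KEYWORDS);

var firstTerm = jsonpReceived.ysearchresponse.resultset_web[0].keyterms.terms[0];
```

The point to take away is that the returned file is not data you parse, it is code that calls your function with the data as its argument.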
function seed(o){
if(typeof o.ysearchresponse.nextpage !== 'undefined'){
var next = o.ysearchresponse.nextpage.split('?')[0];
var query = next.replace(/.*?\//g,'');
}
out.term = query;
out.keywords = [];
var results = o.ysearchresponse.resultset_web;
for(var i=0,j=results.length;i<j;i++){
out.keywords = out.keywords.concat(results[i].keyterms.terms);
}
var filtered = filter(out.keywords)
out.keywords = filtered[0];
out.toplist = filtered[1];
callback(out);
}
In the seed() function we need to find out which query term was used to return the current data set. As BOSS does not return it as a property of its own, we need to extract it from the nextpage property (which is a bit annoying but not too hard to do). If there is no nextpage property, the term stays undefined.
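Here is the extraction step on its own, as a sketch. The helper name extractQuery and the sample nextpage values are assumptions for illustration; the only thing taken from the script is that the term is the last path segment before the question mark. Decoding the segment is an extra step the original script does not take.

```javascript
// Hypothetical helper: pulls the search term out of a nextpage value.
function extractQuery(nextpage) {
  if (typeof nextpage !== 'string') {
    return undefined; // no further page, so nothing to extract
  }
  var path = nextpage.split('?')[0];                        // drop the parameters
  var lastSegment = path.slice(path.lastIndexOf('/') + 1);  // keep the final segment
  return decodeURIComponent(lastSegment);                   // turn %20 back into a space
}

// Sample values made up for this example
var simpleQuery = extractQuery('/ysearch/web/v1/wombats?format=json&start=10');
var spacedQuery = extractQuery('/ysearch/web/v1/tasmanian%20devil?format=json&start=10');
```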
We then start populating the out object by storing the query as a term property and creating a new array and storing it as a keywords property.
We loop through all the search results and concatenate the keyterms of each result onto the keywords array.
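As a quick illustration of that loop, here is a sketch working on a tiny made-up result set. The sample data and the name collectTerms are hypothetical; the keyterms.terms structure is the one our seed() function reads. The check for a missing keyterms property is a precaution added for this sketch.

```javascript
// Made-up sample in the shape seed() expects from resultset_web
var sampleResults = [
  { keyterms: { terms: ['Wombat', 'marsupial'] } },
  { keyterms: { terms: ['wombat', 'Australia'] } },
  { } // a result without keyterms, to show the precaution at work
];

// Hypothetical helper: gathers the terms of all results in one flat array
function collectTerms(results) {
  var all = [];
  for (var i = 0, j = results.length; i < j; i++) {
    var kt = results[i].keyterms;
    if (kt && kt.terms) {
      all = all.concat(kt.terms);
    }
  }
  return all;
}

var collected = collectTerms(sampleResults);
```

Note that the collected array still contains duplicates in mixed case, which is exactly what the filtering step has to deal with.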
We then need to do some heavy filtering (as the resulting array is 200 terms in no particular weighting or order). The filter method returns two things: we re-define keywords as the first and store the second as a toplist property.
The toplist property will be a comma-separated string of all the top terms.
Once we have all the data stored in fitting properties, we invoke the callback function and send it the out object.
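function filter(kw){
// lowercase all terms in one go, then sort so that equal terms are neighbours
kw = kw.join(',');
kw = kw.toLowerCase();
kw = kw.split(',');
kw.sort();
var count = 0;
var filtered = [];
// loop over the whole array so the last term is not dropped
for(var i=0,j=kw.length;i<j;i++){
// count the current item before comparing, so every term starts at 1
count++;
if(kw[i]!==kw[i+1]){
// skip the empty string we get when there were no keywords at all
if(kw[i]!==''){
filtered.push(count + '|' + kw[i]);
}
count = 0;
}
}
// parseInt() reads the leading number of each counter|term string
filtered.sort(function(a,b){
return parseInt(a,10) - parseInt(b,10);
});
filtered.reverse();
filtered = filtered.slice(0,config.amount);
var toplist = [];
for(i=0,j=filtered.length;i<j;i++){
var bits = filtered[i].split('|');
toplist.push(bits[1]);
filtered[i] = {term:bits[1],amount:bits[0]};
}
return [filtered,toplist.join(',')];
}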
The filter method is a bit of a pain, as JavaScript does not have an array_count_values() function like PHP does. It could be that I am off the mark with this implementation, so if you can shorten this, be my guest and put a link in the comments.
The problem is that we have an unordered array of keywords and we want to find out how often each one is used. We start by making the array elements all lowercase (using join(), toLowerCase() and split()) and sorting the array alphabetically. We then create a count variable and an empty array to store the filtered terms.
We then loop over the array, counting items as we go, and check if the next item is the same as the current one. As long as it is, the counter keeps growing; once it isn't, we store the current term in the filtered array as counter|term and reset the counter.
As each entry starts with its count, parseInt() can read that number and we can sort the resulting array numerically. We then reverse the order, which makes the most common term the first item and so on until we reach the least common term. We cut off the terms we don't want with slice() and that's the filtering done.
In order to make the data useful again we have to loop over the filtered array and assemble an array of all the top items containing objects with the properties term and amount. We also assemble a toplist array that only contains the most successful terms without any amount information and join this to a string when we return it.
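For comparison, here is one possible shorter take on the same job, as a sketch rather than a drop-in replacement. Instead of sorting and comparing neighbours, it counts terms in a lookup object, which is roughly what PHP's array_count_values() gives you. The name topTerms and its amount parameter are made up for this example, and unlike the filter method above it stores the amount as a number.

```javascript
// Hypothetical alternative to filter(): count with a lookup object
function topTerms(terms, amount) {
  // Object.create(null) avoids clashes with inherited names like "constructor"
  var counts = Object.create(null);
  for (var i = 0, j = terms.length; i < j; i++) {
    var t = terms[i].toLowerCase();
    counts[t] = (counts[t] || 0) + 1;
  }
  // Turn the lookup into an array of {term, amount} objects
  var list = Object.keys(counts).map(function (term) {
    return { term: term, amount: counts[term] };
  });
  // Most frequent first; equal counts are ordered alphabetically
  list.sort(function (a, b) {
    if (b.amount !== a.amount) { return b.amount - a.amount; }
    return a.term < b.term ? -1 : (a.term > b.term ? 1 : 0);
  });
  list = list.slice(0, amount);
  var toplist = list.map(function (entry) { return entry.term; }).join(',');
  return [list, toplist];
}

var top = topTerms(
  ['Wombat', 'marsupial', 'wombat', 'Australia', 'WOMBAT', 'australia'], 2
);
// top[1] is 'wombat,australia'
```

The return value mirrors the filter method: the detailed list first, the comma-separated string second.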
return{seed:seed,get:getTerms,config:config}
}();
We return the properties and methods that need to be public and voila - we've got ourselves a wrapper.
How to use the function?
You don't need to do much to use the function: all you need to do to get keywords with the wrapper is call KEYWORDS.get() with a term and a callback function.
For example:
function foo(o){
alert(o.toplist);
}
KEYWORDS.get('ozelots',foo);
The example page, however, is a bit more complex as it progressively enhances a link:
<p>
<a id="keywordlink" href="http://keywordfinder.org/?term=wombats">
Get keywords for wombats
</a>
</p>
<script type="text/javascript" src="keywords.js"></script>
We provide a working link, and include the keywords.js file.
<script type="text/javascript">
var x = document.getElementById('keywordlink');
if(x){
x.onclick = function(){
var term = this.href.split('=')[1];
this.innerHTML += ' (loading...)';
KEYWORDS.get(term,seed);
return false;
}
}
We check if the link with the right ID exists and apply an event handler that retrieves the search term. We add a loading message to the link when it is clicked to tell the user that something is happening and call the keywords API with seed() as the callback function. We return false to make sure the link is not being followed.
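Splitting the href on the equals sign is fine for a link with a single parameter like ours. If your links carry more than one parameter, something along these lines might be a safer way to get at the term. This is only a sketch: termFromHref is a made-up name and the second sample URL is hypothetical. It also decodes the value, which the handler above does not do.

```javascript
// Hypothetical helper: reads the term parameter from a link's href
function termFromHref(href) {
  var match = /[?&]term=([^&#]*)/.exec(href);
  if (!match) {
    return ''; // no term parameter in this link
  }
  // Treat + as a space, then decode escapes such as %20
  return decodeURIComponent(match[1].replace(/\+/g, ' '));
}

var fromSimpleLink = termFromHref('http://keywordfinder.org/?term=wombats');
var fromLongerLink = termFromHref('http://keywordfinder.org/?lang=en&term=tasmanian+devil');
```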
function seed(o){
var div = document.createElement('div');
var head = document.createElement('h2');
head.innerHTML = 'Keywords for '+o.term;
div.appendChild(head);
var p = document.createElement('p');
p.innerHTML = o.toplist;
div.appendChild(p);
var head = document.createElement('h3');
head.innerHTML = 'Details:';
div.appendChild(head);
var list = document.createElement('ol');
for(var i=0,j=o.keywords.length;i<j;i++){
var li = document.createElement('li');
li.innerHTML = o.keywords[i].term + '('+o.keywords[i].amount+')';
list.appendChild(li);
}
div.appendChild(list);
x.parentNode.replaceChild(div,x);
}
</script>
All the seed() function does is retrieve the data and create a bunch of appropriate HTML using DOM scripting to replace the link.
I hope this has given you some ideas of what to do with the keyword output of BOSS. Happy hacking!
Chris Heilmann
Yahoo Developer Network
Posted at November 26, 2008 8:05 AM
Comments
Thanks a lot, guys! From your tutorial I've made my own search engine and presented it on the page http://devaka.ru/articles/yahoo-boss in my blog.
Yahoo! BOSS is a really useful platform!
Posted by: devaka at November 28, 2008 12:57 PM
Very useful for related keywords research! Would it be possible for you to provide a PHP version of this script?
Posted by: Indian Web Master at November 29, 2008 3:32 AM
Very thanks for the script ! It's really great ! I want to translate this article ! And to rewrite this script into perl language. Or maybe php.
Posted by: Irvin at May 3, 2009 10:04 AM