Finding things on the web with BOSS

Post on 19-Oct-2014

description

Introduction to Yahoo's open search platform BOSS at Open Hack Day in Bangalore, India.

transcript

Finding things on the web with BOSS

Christian Heilmann | http://wait-till-i.com | http://scriptingenabled.org

Open Hack Day 2009 - Bangalore, India

What is innovation?

Innovation is improving the current state of affairs.

In the best of all cases this means that it gets easier for the person using a product to reach a goal.

This can be achieved by connecting several different products and turning them into one. (pssst... Mashup).

I’ve seen this done cleverly on several levels here in India.

BOSS is your chance to make the web an easier-to-navigate space.

You can help us turn a search engine into a find engine.

Back in the day, the web was small and consisted largely of static documents.

Nowadays it is huge, and both the amount of content and the speed of publication have increased a lot.

This is one challenge for search engines these days.

The others are old, but they are growing too.

Now, say you have the most awesome idea of a search engine that works around these issues.

Where you will get stuck is the overhead of indexing, storing and filtering the data of the web.

And this is where BOSS comes in.

It is an API to our data stores for search.

BOSS is Build Your Own Search Service:

http://developer.yahoo.com/search/boss/

To use it, you need an Application ID:

https://developer.yahoo.com/wsregapp/

And there is full documentation available:

http://developer.yahoo.com/search/boss/boss_guide/

Happy Hacking!

... oh alright then ...

You can get the code examples I will show here:

http://isithackday.com/hacks/bangalore/bosscode.zip

Say you want to search the web for donkeys.

... oh alright then ...

Because

Donkeys

Rock!

Using BOSS you can do this with a REST API and display the results any way you want!

The REST API: boss.yahooapis.com/ysearch/{type}/v1/{search}

type is what you want to search:

web: the interwebs

news: new stuff

images: pictures

search is the term to look for (url-encoded)

Put “” around terms to ensure the right order, e.g. “donkey fur” (you don’t want to see cats, do you?)

Filter out terms with a -, e.g. donkey -shrek

Restrict to a site using site:, e.g. donkey site:flickr.com
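The three operators above can be combined into one search term. Here is a minimal sketch of how that might look; buildSearchTerm and its option names (phrases, terms, exclude, site) are made up for this example and are not part of BOSS.

```javascript
// Hypothetical helper (not part of BOSS): assembles a search term from
// the operators described above and URL-encodes it for use in the path.
function buildSearchTerm(options) {
  var parts = [];
  // Exact phrases are wrapped in double quotes to keep the word order.
  (options.phrases || []).forEach(function (phrase) {
    parts.push('"' + phrase + '"');
  });
  // Plain terms are passed through unchanged.
  (options.terms || []).forEach(function (term) {
    parts.push(term);
  });
  // Excluded terms get a leading minus sign.
  (options.exclude || []).forEach(function (term) {
    parts.push('-' + term);
  });
  // An optional site restriction uses the site: prefix.
  if (options.site) {
    parts.push('site:' + options.site);
  }
  // The finished term goes into the URL path, so it has to be encoded.
  return encodeURIComponent(parts.join(' '));
}

var term = buildSearchTerm({
  phrases: ['donkey fur'],
  exclude: ['shrek'],
  site: 'flickr.com'
});
console.log(term); // %22donkey%20fur%22%20-shrek%20site%3Aflickr.com
```

The encoded string is what takes the place of {search} in the REST URL.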

Other parameters:

appid: your app ID (required)

count: number of results

start: offset of the first result to return

region / lang: country and language

format: xml or json

sites: restrict to certain sites (comma separated)
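Putting the URL pattern and the parameters together could look roughly like this. It is only a sketch: buildBossUrl is a made-up helper name and YOUR_APP_ID is a placeholder for your own Application ID.

```javascript
// Hypothetical helper (not part of BOSS): builds a request URL from the
// search type, the search term and an object of query parameters.
var BOSS_BASE = 'http://boss.yahooapis.com/ysearch/';

function buildBossUrl(type, search, params) {
  var pairs = [];
  for (var name in params) {
    if (Object.prototype.hasOwnProperty.call(params, name)) {
      // Encode both the name and the value of every parameter.
      pairs.push(encodeURIComponent(name) + '=' +
                 encodeURIComponent(params[name]));
    }
  }
  // The search term is part of the path, so it is encoded as well.
  return BOSS_BASE + type + '/v1/' + encodeURIComponent(search) +
         '?' + pairs.join('&');
}

var url = buildBossUrl('web', 'donkeys', {
  appid: 'YOUR_APP_ID', // placeholder, use your own Application ID
  format: 'json',
  count: 5,
  start: 10
});
console.log(url);
// http://boss.yahooapis.com/ysearch/web/v1/donkeys?appid=YOUR_APP_ID&format=json&count=5&start=10
```

Swapping 'web' for 'news' or 'images' targets the other search types in the same way.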

Web search REST API: boss.yahooapis.com/ysearch/web/v1/{search}

Extra parameters:

filter: to filter out nasties, use filter=-porn-hate

type: to search different document types. You can use html, text, pdf, xl, msword, ppt or groups like msoffice and nonhtml. You can also exclude a type from a group, e.g. type=msoffice,-xl

Image search REST API: boss.yahooapis.com/ysearch/images/v1/{search}

Extra parameters:

filter: no nudies

dimensions: all, small, medium, large, wallpaper, widewallpaper

refererurl: all images in that URL

url: image at that URL

News search REST API: boss.yahooapis.com/ysearch/news/v1/{search}

Extra parameters:

age: how old the news may be, in days. The last five days would be “5d”.

There are restrictions on how to display results, and there is information about what data comes back.

For this, read the guide! http://developer.yahoo.com/search/boss/boss_guide/

Everybody Duck!

There will be code

The easiest way to use BOSS is using JavaScript.

{"ysearchresponse":{"responsecode":"200","nextpage":"\/ysearch\/web\/v1\/donkeys?format=json&appid=[...]&start=10","totalhits":"492215","deephits":"15700000","count":"10","start":"0","resultset_web":[{"abstract":"Hyperlinked description of the domesticated mammal discussing its appearance, relationship to horses, economic <b>...<\/b> horses and <b>donkeys<\/b> were brought back <b>...<\/b>","clickurl":"http:\/\/lrd.yahooapis.com\/_ylc=X3oDMTU4b2NoaDR2BF9TAzIwMjMxNTI3MDIEYXBwaWQDVFg2YjRYSFYzNEVuUFhXMHNZRXI1MWhQMXBuNU84S0FHcy5MUVNYZXIxWjdSbW1WclpvdXo1U3Z5WGtXc1ZrLQRwb3MDMARzZXJ2aWNlA1lTZWFyY2hXZWIEc2xrA3RpdGxlBHNyY3B2aWQDR3lDaEgwU081cTlmSktUNG1ndTVUUUJNdlNjaS4wa1ZUVndBQVF5Sw--\/SIG=11820sato\/**http%3A\/\/en.wikipedia.org\
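The response above also carries paging information (start, count, totalhits and nextpage), and the numbers arrive as strings. Here is a small sketch of reading them, using a trimmed-down sample modelled on that response. pagingInfo is a made-up helper, the appid is a placeholder, and two things are assumptions on my part: that nextpage is a path on boss.yahooapis.com, and that it is missing when there are no more results.

```javascript
// Trimmed-down sample modelled on the response above
// (placeholder appid, empty result set).
var sample = {
  ysearchresponse: {
    responsecode: '200',
    nextpage: '/ysearch/web/v1/donkeys?format=json&appid=YOUR_APP_ID&start=10',
    totalhits: '492215',
    count: '10',
    start: '0',
    resultset_web: []
  }
};

// Hypothetical helper: summarises the paging fields of a response.
function pagingInfo(data) {
  var r = data.ysearchresponse;
  // All numbers come back as strings, so convert them first.
  var start = parseInt(r.start, 10);
  var count = parseInt(r.count, 10);
  var total = parseInt(r.totalhits, 10);
  return {
    ok: r.responsecode === '200',
    first: start + 1,
    last: start + count,
    total: total,
    // Assumption: nextpage is relative to the BOSS host and is
    // absent when there is no further page.
    nextUrl: r.nextpage ? 'http://boss.yahooapis.com' + r.nextpage : null
  };
}

var info = pagingInfo(sample);
console.log('Results ' + info.first + '-' + info.last + ' of ' + info.total);
// Results 1-10 of 492215
console.log(info.nextUrl);
```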

To use this across domains, simply define a callback parameter:

founddonkeys({"ysearchresponse":{"responsecode":"200","nextpage":"\/ysearch\/web\/v1\/donkeys?format=json&callback=founddonkeys&appid=TX6b4XHV34EnPXW0sYEr51hP1pn5O8KAGs.LQSXer1Z7RmmVrZouz5SvyXkWsVk-&start=10","totalhits":"492215","deephits":"15700000","count":"10","start":"0","resultset_web":[{"abstract":"Hyperlinked description of the domesticated mammal discussing its appearance, relationship to horses, economic <b>...<\/b> horses and <b>donkeys<\/b> were brought back <b>...<\/b>","clickurl":"http:\/\/lrd.yahooapis.com\/_ylc=X3oDMTU4cG05cXJwBF9TAzIwMjMxNTI3MDIEYXBwaWQDVFg2YjRYSFYzNEVuUFhXMHNZRXI1MWhQMXBuNU84S0FHcy5MUVNYZXIxWjdSbW1WclpvdXo1U3Z5WGtXc1ZrLQRwb3MDMARzZX

All you then need to do is put this url in a script node and write the founddonkeys function:

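<div id="searchresults"></div>
<script type="text/javascript">
// Callback named in the callback parameter: receives the BOSS response
function founddonkeys(o){
  var donkeys = o.ysearchresponse.resultset_web;
  var results = document.createElement('ul');
  for(var i=0,j=donkeys.length;i<j;i++){
    // One list item per result: a linked title followed by the abstract
    var item = document.createElement('li');
    var link = document.createElement('a');
    var abstract = document.createElement('p');
    link.setAttribute('href',donkeys[i].clickurl);
    link.innerHTML = donkeys[i].title;
    item.appendChild(link);
    abstract.innerHTML = donkeys[i].abstract;
    item.appendChild(abstract);
    results.appendChild(item);
  }
  // Show the finished list in the results container
  var resultsbox = document.getElementById('searchresults');
  resultsbox.innerHTML = '';
  resultsbox.appendChild(results);
}
</script>
<script type="text/javascript" src="http://boss.yahooapis.com/ysearch/web/v1/donkeys?format=json&callback=founddonkeys&appid=[...]"></script>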

Two problems though:

First of all - without JavaScript there are no donkeys!

Secondly - you can only find donkeys!

The solution: Event Handling and dynamic script generation.

<p>Warning: this is terrible code, USE A LIBRARY INSTEAD!</p>
<ul id="searches">
  <li><a href="http://search.yahoo.com/search?va=donkeys">Search for Donkeys</a></li>
  <li><a href="http://search.yahoo.com/search?va=kittens">Search for kittens</a></li>
</ul>
<div id="searchresults"></div>

<script type="text/javascript" charset="utf-8">
// Callback for the BOSS response: renders the results as a list
function founddonkeys(o){
  var donkeys = o.ysearchresponse.resultset_web;
  var results = document.createElement('ul');
  for(var i=0,j=donkeys.length;i<j;i++){
    var item = document.createElement('li');
    var link = document.createElement('a');
    var abstract = document.createElement('p');
    link.setAttribute('href',donkeys[i].clickurl);
    link.innerHTML = donkeys[i].title;
    item.appendChild(link);
    abstract.innerHTML = donkeys[i].abstract;
    item.appendChild(abstract);
    results.appendChild(item);
  }
  // Replace whatever was shown before with the new list
  var resultsbox = document.getElementById('searchresults');
  resultsbox.innerHTML = '';
  resultsbox.appendChild(results);
}
var APIkey = 'TX6b4XHV34EnPXW0sYEr51hP1pn5O8KAGs'+
             '.LQSXer1Z7RmmVrZouz5SvyXkWsVk-';
var searchlinks = document.getElementById('searches').getElementsByTagName('a');
for(var i=0;searchlinks[i];i++){
  // On click, request results for the term in the link's va= parameter
  searchlinks[i].onclick = function(){
    var searchterm = this.href.split('va=')[1];
    var url = 'http://boss.yahooapis.com/ysearch/web/v1/' + searchterm +
              '?format=json&' + 'callback=founddonkeys' +
              '&appid=' + APIkey;
    // A new script node triggers the cross-domain request
    var s = document.createElement('script');
    s.setAttribute('type','text/javascript');
    s.setAttribute('src',url);
    document.getElementsByTagName('head')[0].appendChild(s);
    return false; // keep the browser from following the link
  };
}
</script>

*click*

Using the YUI library (YUI3 JavaScript and CSS grids) you can easily make this much cooler:

To make using BOSS with JavaScript easier, I’ve written a BOSS wrapper called YBOSS:

http://icant.co.uk/sandbox/yboss/

<div id="results"></div>
<script type="text/javascript" src="yboss-lib.js"></script>
<script type="text/javascript">
YBOSS.get({
  searches:'search,news',
  query:'obama',
  count:10,
  callback:seedpics
});
function seedpics(o){
  var all = '<h4>Web Sites</h4>' + o.webHTML +
            '<h4>News</h4>' + o.newsHTML;
  var out = document.getElementById('results');
  out.innerHTML = all;
}
</script>

There’s also the Python mashup framework, which allows SQL-style remixing of arbitrary XML/JSON sources:

http://developer.yahoo.com/search/boss/mashup.html

All this has been around for a while.

Here are some new things added lately:

People are trying to make the web a less messy place by adding semantic data to HTML documents.

Using SearchMonkey technology BOSS now lists this information in the results.

http://www.flickr.com/photos/glenscott/3273401181/

Using the view=keyterms parameter you can get keywords associated with each result.

http://keywordfinder.org

In order to get longer descriptions of each result you can now use the abstract=long parameter to get up to 300 characters instead of 130.
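Both of these are just extra query parameters on the request URL. Here is a sketch of adding them to an existing URL, assuming the two can be combined in a single request (check the guide to be sure). addParams is a made-up helper and YOUR_APP_ID is a placeholder.

```javascript
// Hypothetical helper (not part of BOSS): appends extra query
// parameters to an existing request URL.
function addParams(url, extras) {
  var pairs = [];
  for (var name in extras) {
    if (Object.prototype.hasOwnProperty.call(extras, name)) {
      pairs.push(encodeURIComponent(name) + '=' +
                 encodeURIComponent(extras[name]));
    }
  }
  if (pairs.length === 0) {
    return url;
  }
  // Use ? for the first parameter and & for every one after that.
  var separator = url.indexOf('?') === -1 ? '?' : '&';
  return url + separator + pairs.join('&');
}

var base = 'http://boss.yahooapis.com/ysearch/web/v1/donkeys' +
           '?format=json&appid=YOUR_APP_ID'; // placeholder appid
var richer = addParams(base, { view: 'keyterms', abstract: 'long' });
console.log(richer);
// http://boss.yahooapis.com/ysearch/web/v1/donkeys?format=json&appid=YOUR_APP_ID&view=keyterms&abstract=long
```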

Another thing we’ve done is bundle the Yahoo Site Explorer with BOSS.

This way you can now get page information and page inlink information with two new BOSS services.

So what has been done using BOSS?

http://ask-boss.appspot.com/

http://hakia.com/

http://www.oneriot.com/

And on a lighter note:

The client side is where strange things happen.

http://isithackday.com/hacks/web-the-adventure/

The motherlode of BOSS implementations:

http://mashable.com/boss/

http://delicious.com/tag/bossmashup

Add yours by tagging it with “bossmashup” on Del.icio.us!

Keep in touch:

Christian Heilmann

http://wait-till-i.com

http://scriptingenabled.org

http://twitter.com/codepo8

T H A N K S !