Enterprise Search: How do we get there from here?

Post on 09-May-2015


description

Enterprise Search: How Do We Get There From Here? by Daniel Tunkelang (Head of Query Understanding, LinkedIn). Keynote at 2013 Enterprise Search Summit.

We've been tackling the challenges of enterprise and site search for at least three decades. We've succeeded to the point that search is the gateway to many of our information repositories. Nonetheless, users of enterprise search systems are frustrated with these systems' shortcomings. We see this frustration in surveys, but, more importantly, most of us experience it personally in our daily work life.

We all dream of a world where searching any information repository is as effective as searching the web, perhaps even more so. A world where we find what we're looking for, or quickly determine that it doesn't exist. Is this Utopia possible? If so, how do we get there from here? Or at least somewhere close?

In this talk, Tunkelang reviews the track record of enterprise search. He talks about what's worked and what hasn't, especially as compared to web search. Finally, he proposes some paths to bring us closer to our dream.

--

Daniel Tunkelang is Head of Query Understanding at LinkedIn. Educated at MIT and CMU, he has spent his career working on big data, addressing key challenges in search, data mining, user interfaces, and network analysis. He co-founded enterprise search and business intelligence pioneer Endeca, where he spent a decade as its Chief Scientist. In 2011, Endeca was acquired by Oracle for over $1B. Prior to LinkedIn, he led a team at Google working on local search quality. Daniel has authored fifteen patents, written a textbook on faceted search, and created the annual symposium on human-computer interaction and information retrieval.

transcript

Enterprise Search: How do we get there from here?

Daniel Tunkelang, Head of Query Understanding, LinkedIn

THERE: The Dream (Franz Marc, 1912)

“Computer, what is the nature of the universe?”

“a web of data that can be processed by machines”

Mind reading is now possible!

HERE: Office Space (1999)

Google VP Udi Manber on their in-house search: “It’s not that good.”

Beyond 10 blue links? Not so much.

Meta-utopia or Metacrap?

Cory Doctorow’s seven straw-men of meta-utopia:

1. People lie.
2. People are lazy.
3. People are stupid.
4. Mission: Impossible -- know thyself.
5. Schemas aren't neutral.
6. Metrics influence results.
7. There's more than one way to describe something.

So how do we get there from here?

Three Baby Steps on the Path to Utopia

1. Exercise common sense.

2. Show some humility.

3. If all else fails, cheat.

Remember what the Dormouse said: Feed your head.

From 2012 Google Zeitgeist

Monitor your top queries. Nail them.
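Monitoring top queries can start with a simple count over the query log. The following is a minimal sketch, assuming the log is available as a plain list of query strings; `normalize`, `top_queries`, and the sample log are hypothetical names and data, not something shown in the talk.

```python
from collections import Counter


def normalize(query):
    """Lowercase the query and collapse whitespace so trivial variants are counted together."""
    return " ".join(query.lower().split())


def top_queries(log, n=10):
    """Return the n most frequent normalized queries as (query, count, share of traffic)."""
    counts = Counter(normalize(q) for q in log)
    total = sum(counts.values())
    return [(q, c, c / total) for q, c in counts.most_common(n)]


# Hypothetical query log, made up for illustration only.
SAMPLE_LOG = ["Vacation Policy", "vacation  policy", "expense report", "VPN", "vpn", "vpn "]
TOP = top_queries(SAMPLE_LOG, n=2)
```

The share column shows how much traffic the head of the distribution covers, which helps decide how many queries are worth hand-tuning.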


for i in [1..n]
    B[i] ← {}
    s ← w1 w2 … wi
    if Pc(s) > 0
        a ← new Segment()
        a.segs ← {s}
        a.prob ← Pc(s)
        B[i] ← {a}
    for j in [1..i-1]
        for b in B[j]
            s ← wj+1 wj+2 … wi
            if Pc(s) > 0
                a ← new Segment()
                a.segs ← b.segs U {s}
                a.prob ← b.prob * Pc(s)
                B[i] ← B[i] U {a}
    sort B[i] by prob
    truncate B[i] to size k

Long tail? Structure and segment your queries.
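The segmentation pseudocode on this slide can be read as a beam search over prefixes of the query: B[i] keeps the k most probable segmentations of the first i words. Below is a minimal Python sketch under stated assumptions, not the speaker's implementation. `segment_query`, `toy_pc`, and the probabilities in `TOY_PC` are hypothetical and stand in for Pc(s); each new segment is taken to start at word j+1, right after the prefix that B[j] covers.

```python
def segment_query(words, pc, k=5):
    """Return up to k (probability, segments) pairs for the full query, best first."""
    n = len(words)
    # beams[i] holds the top-k segmentations of the first i words.
    beams = [[] for _ in range(n + 1)]
    for i in range(1, n + 1):
        candidates = []
        # Case 1: the whole prefix w1..wi forms a single segment.
        s = " ".join(words[:i])
        p = pc(s)
        if p > 0:
            candidates.append((p, [s]))
        # Case 2: extend each segmentation of w1..wj with the segment w(j+1)..wi.
        for j in range(1, i):
            s = " ".join(words[j:i])
            p = pc(s)
            if p > 0:
                for prob, segs in beams[j]:
                    candidates.append((prob * p, segs + [s]))
        # Sort by probability and truncate the beam to size k.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams[i] = candidates[:k]
    return beams[n]


# Hypothetical segment probabilities, made up for illustration only.
TOY_PC = {
    "new": 0.02, "york": 0.01, "times": 0.02, "square": 0.01,
    "new york": 0.05, "times square": 0.04, "new york times": 0.03,
}


def toy_pc(segment):
    """Stand-in for Pc(s): unknown segments get probability zero."""
    return TOY_PC.get(segment, 0.0)


BEST = segment_query("new york times square".split(), toy_pc, k=3)
```

With these toy numbers the top segmentation is "new york" followed by "times square". In practice Pc(s) would come from real data such as query logs or an entity dictionary.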

Entities and categories are your friends.

Even the entities for which you have no results.

Identify unsuccessful searches.

Use analytics to drive triage.

“Sorry, no results containing all your search terms were found.”

Analyzed representative random sample of name searches.

Leading causes:
1) Misspelled names.
2) Correctly spelled name of someone not on site.

Combine automated analysis with human judgment.
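The automated half of that triage could look like the sketch below, which sorts failed name searches into the two leading causes named above. It is a rough illustration under assumptions: `triage_failed_name_search`, the 0.8 similarity cutoff, and the sample names are hypothetical, and the fuzzy matching uses Python's standard `difflib` rather than whatever the speaker's team used. Borderline cases would still go to a human reviewer.

```python
import difflib
from collections import Counter


def triage_failed_name_search(query, known_names, cutoff=0.8):
    """Label a zero-result name query and return (label, closest known name or None)."""
    q = " ".join(query.lower().split())
    close = difflib.get_close_matches(q, known_names, n=1, cutoff=cutoff)
    if close:
        # A near match exists, so the query is probably a misspelling of it.
        return "likely_misspelling", close[0]
    # Nothing similar on file: plausibly a correct name of someone not on the site.
    return "possibly_not_on_site", None


# Hypothetical member names and failed queries, made up for illustration only.
KNOWN_NAMES = ["alice johnson", "robert smith", "maria garcia"]
FAILED_QUERIES = ["alise johnson", "wei zhang", "maria garsia"]

LABELS = Counter(
    triage_failed_name_search(q, KNOWN_NAMES)[0] for q in FAILED_QUERIES
)
```

Counting labels over a random sample of failed queries gives a first estimate of which cause dominates, and therefore which fix to build first.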

Triage drives and validates agile development.

Misspelled name?

Correctly spelled name of someone not on site?

You just ask them?

vs.

Recognize ambiguity and ask for clarification.

Clarify, then refine.

Computers   Books

It’s 2013. Please use faceted search.
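At its core, faceted search means counting the values of a field across the current results and letting the user narrow by one of them. The sketch below shows only that core, assuming results are plain dictionaries; `facet_counts`, `refine`, and the sample records are hypothetical and not tied to any particular search engine.

```python
from collections import Counter

# Hypothetical results for an ambiguous query; titles and fields are made up.
RESULTS = [
    {"title": "Learning Python", "category": "Books", "year": 2013},
    {"title": "Python Cookbook", "category": "Books", "year": 2013},
    {"title": "Python IDE license", "category": "Computers", "year": 2012},
]


def facet_counts(results, field):
    """Count how many results carry each value of the given facet field."""
    return Counter(doc[field] for doc in results)


def refine(results, field, value):
    """Keep only the results whose facet field equals the selected value."""
    return [doc for doc in results if doc[field] == value]


CATEGORIES = facet_counts(RESULTS, "category")
BOOKS = refine(RESULTS, "category", "Books")
```

Showing the counts next to each facet value tells the user what a refinement will return before they click, which is the "clarify, then refine" loop in miniature.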

Make your best guess, but hedge your bets.

Claudia Hauff, Query Difficulty for Digital Libraries [2009]

Not all queries are created equal in difficulty.

“It's ok to cheat, as long as you cheat your way to the top.”

Design an experience that doesn’t require search.

Crowd-source curation.

Unstructured data? Beg, borrow, or steal.

Solve an easier problem: re-finding.

Invest in type-ahead, especially instant results.
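A basic type-ahead can be built from past queries alone: keep them sorted, find the range that shares the typed prefix, and rank that range by popularity. This is a minimal sketch, assuming queries contain no characters at or above U+FFFF (the sentinel used to close the prefix range); `suggest` and the sample counts are hypothetical.

```python
import bisect


def suggest(prefix, sorted_queries, query_counts, limit=5):
    """Return up to `limit` past queries starting with `prefix`, most popular first."""
    # All strings with this prefix sit in one contiguous run of the sorted list.
    lo = bisect.bisect_left(sorted_queries, prefix)
    hi = bisect.bisect_left(sorted_queries, prefix + "\uffff")
    matches = sorted_queries[lo:hi]
    # Rank by descending count, breaking ties alphabetically.
    return sorted(matches, key=lambda q: (-query_counts[q], q))[:limit]


# Hypothetical query counts, made up for illustration only.
QUERY_COUNTS = {
    "enterprise search": 50,
    "endeca": 30,
    "entity extraction": 20,
    "faceted search": 40,
}
SORTED_QUERIES = sorted(QUERY_COUNTS)
SUGGESTIONS = suggest("en", SORTED_QUERIES, QUERY_COUNTS)
```

Returning matching documents or people instead of query strings turns the same lookup into instant results.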

“Good artists copy. Great artists steal.” -- Picasso / Jobs

Three Baby Steps on the Path to Utopia

1. Exercise common sense.

2. Show some humility.

3. If all else fails, cheat.

It’s the economy, stupid!

Warning: technology alone is not a solution.

The future is on the way.

But the present doesn’t have to be so bad.

Email: dtunkelang@linkedin.com

Connect: http://linkedin.com/in/dtunkelang