
What Can a Business Do with a Web Index?

Date posted: 14-Aug-2015
Upload: dixon-jones
Transcript
  1. From Trust Flows Understanding. The Deloitte Fast 50 Big Data company you never heard of… until now.
  2. @tryMajestic Some Stuff You'll Learn: how we built a search engine without $30 billion; how you can use it to make lots of predictions, insights, money and data stories.
  3. Reaching for the Stars
  4. An Inspiration of a Search Engine
  5. Majestic is a Specialist Search Engine: digital knowledge on a grand scale. Dixon Jones
  6. The BIG specialist search engine: Twitter has 500,000,000 Tweets per day on average. In the same day, Majestic crawls well over 2,000,000,000 NEW URLs (and sees 7 billion).
  7. How do they do that? Information Retrieval in the Zeta age: 1. Data Collection; 2. Data Grouping; 3. Data Indexing; 4. Data Matching.
  8. How to Collect 7 Billion URLs a Day?
  9. How to Analyze 200 Billion URLs a Day?
  10. Groups Make Search Much Better: Find a Fact; Find a Friend; Find a Customer; Finding Anything. Library of Congress, circa 1940. Research at: info.majestic.com/groupresearch
  11. We Group AND ANALYSE pages: Topical Trust Flow using decay algorithm ???
  12. The Index: for every page we know its influence in a simple score, its context, its context by keyword, and its influence in context, in a series of simple 0-100 scores.
  13. Works best with a Universal Data set: every signal is small and individually prone to error or opinion. At scale the error decreases and confidence increases. http://info.majestic.com/universal
  14. Data Matching
  15. Our Data Stack (For the Techies): Crawler: C# .NET / Mono. NoSQL read-only file system. Java interrogation. Dynamic front end: Perl/Ruby etc. Hadoop coming soon.
  16. So we built it. Now imagine: what COULD you do with it?
  17. 1: Compare Competitor Backlinks
  18. 2: Finding influencers. Who is more popular on Twitter? Lady Gaga? Barack Obama? Trust Flow 74; Trust Flow 70
  19. 3: Predicting Elections: Boris v Ken; Obama v Romney
  20. 4: Lobbying Senators
  21. 5: Data Art (Profiling Companies)
  22. What if we Pivot? Hadoop. Imagine your OWN version of our web index: a subset of the data, prepopulated for your needs; updated daily / weekly / monthly; stored in open source Hadoop instances ready for easy interrogation. What could you do then?
  23. Data Store Examples
  24. @tryMajestic
  25. @tryMajestic
  26. @tryMajestic
  27. Ways you could segment the web: All domains hosted in [Choose country or City Here]; Most influential sites about [Insert 800 Topics Here]; Best Web Pages for [Choose 50 Million Phrases Here]; Spammiest pages about [Insert 800 Topics Here]; Most influential Pages on [Choose any set of sites]; Create a set of pages with [Choose properties here]. Got a plan? We have the starting point for web data.
  28. Some Takeaways: how we built a search engine without $30 billion; how you can use it to make lots of predictions, insights, money and data stories.
  29. Out of Trust Flows understanding: real insight into the world wide web from Majestic, the specialist search engine.
  30. From Trust Flows Understanding
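
Slide 13 claims that small, individually error-prone signals become reliable at scale. The toy simulation below illustrates that general statistical point only; it is not Majestic's method. The function name, the true score of 50, and the noise level of 20 are all assumptions made for illustration. Each "signal" is a noisy guess at a page's true 0-100 score, and the sketch measures how far the average of n signals lands from the truth.

```python
import random
import statistics


def mean_abs_error(n_signals, true_value=50.0, noise_sd=20.0, trials=200, seed=0):
    """Average distance between the mean of n noisy signals and the true value.

    All parameters are hypothetical: true_value stands in for a page's real
    0-100 score and noise_sd for how unreliable a single signal is.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    errors = []
    for _ in range(trials):
        # Each signal is the true value plus independent Gaussian noise.
        signals = [rng.gauss(true_value, noise_sd) for _ in range(n_signals)]
        errors.append(abs(statistics.fmean(signals) - true_value))
    return statistics.fmean(errors)


# Compare the error of the averaged estimate at increasing scale.
errors_by_scale = {n: mean_abs_error(n) for n in (1, 10, 100, 1000)}
for n, err in errors_by_scale.items():
    print(f"{n:>5} signals -> mean abs error {err:.2f}")
```

With independent noise, the error of the average shrinks roughly with the square root of the number of signals, which is why a larger data set supports more confident scores. Real link signals are not independent, so treat this as an intuition aid rather than a model of the index.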
