Post on 13-Jun-2015
Internet Search Strategy
Advantages
The ability to learn faster than your competitor may be the only sustainable competitive advantage.
Peter Senge, The Fifth Discipline
Outline
• Background
• Browsers
• Search Engine
• Directory
• Blog
• Web 2.0
• Internet Politics
Background
History
Definition of Net
The Internet is the publicly accessible worldwide system of interconnected computer networks that transmit data by packet switching using a standardized Internet Protocol (IP). It is made up of thousands of smaller commercial, academic, domestic and government networks. It carries various information and services, such as electronic mail, online chat, and the interlinked web pages and other documents of the World Wide Web.
How big is the web?
• 56 billion static web pages are publicly-available on the World Wide Web.
• Another estimated 6 billion static pages are available within private intranet sites
• 200+ billion database-driven pages are available as dynamic database reports ("invisible web" pages)
• Google.com indexes 9.75 billion web pages.
WWW domination
Deep Web
• The invisible web, a vast repository of information that search engines don't have access to, such as databases
• Private networks, called intranets, that are not actually hooked up to the Web
• Forms, like ColdFusion or CGI
• Password-protected sites, like a university library
• Sites that intentionally, for various reasons, keep their information from being indexed by search engine spiders
Today
• 200 Billion
• Only 50 Billion is static web
• Google only indexed 10%
• Daily web space increase: 100,000 websites
Key Players
Larry Page - Co-Founder & President, Products
Sergey Brin - Co-Founder & President, Technology
Sir Timothy John "Tim" Berners-Lee
• the inventor of the World Wide Web and director of the World Wide Web Consortium
Search Strategy
a. Choose appropriate keywords
b. Select the right tools
c. Evaluate the information
Sharing
• Interesting sites?
• Your frustration?
• Questions ?
Your needs?
1. What information do you want to have right now?
A.________________________
B. _______________________
C. _______________________
Tools
• Search engine
• Meta Search
• Specialized search engine
• Directory
• Specialized Directory – academy, alexa
• Blog
• RSS
• Torrent
Tools for Multimedia
• Sound - Podcast
• TV – Online TV
• Photo – flickr
• Invisible Web
Browser
Definition
A web browser is a software application, technically a type of HTTP client, that enables a user to display and interact with HTML documents hosted by web servers or held in a file system.
HTML & HTTP
• In computing, HyperText Markup Language (HTML) is a markup language designed for the creation of web pages with hypertext and other information to be displayed in a web browser. HTML is used to structure information, denoting certain text as headings, paragraphs, lists and so on.
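To make "HTML structures information" concrete, here is a minimal sketch using Python's standard html.parser module. The sample page and the names SAMPLE_PAGE, OutlineParser and outline_of are invented for illustration and do not come from the slides.

```python
from html.parser import HTMLParser

# Hypothetical sample page, invented for this illustration.
SAMPLE_PAGE = """
<html><body>
<h1>Internet Search Strategy</h1>
<p>Choose appropriate key words.</p>
<ul><li>Search engine</li><li>Directory</li></ul>
</body></html>
"""

# Tags we treat as structure, mapped to a readable role name.
STRUCTURAL_TAGS = {"h1": "heading", "p": "paragraph", "li": "list item"}


class OutlineParser(HTMLParser):
    """Collects (role, text) pairs for headings, paragraphs and list items."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._current_role = None

    def handle_starttag(self, tag, attrs):
        if tag in STRUCTURAL_TAGS:
            self._current_role = STRUCTURAL_TAGS[tag]

    def handle_endtag(self, tag):
        if tag in STRUCTURAL_TAGS:
            self._current_role = None

    def handle_data(self, data):
        text = data.strip()
        # Only keep text that sits inside one of the structural tags.
        if self._current_role and text:
            self.outline.append((self._current_role, text))


def outline_of(html_text):
    """Return the list of (role, text) pairs found in the HTML text."""
    parser = OutlineParser()
    parser.feed(html_text)
    parser.close()
    return parser.outline


for role, text in outline_of(SAMPLE_PAGE):
    print(f"{role}: {text}")
```

A browser does far more than this (layout, styles, scripts); the sketch only shows that the markup labels each piece of text with its role.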
Browser - functions
Mozilla Firefox – tab, extensions
Internet Explorer – high security
Opera – sessions, ligh
Browsers
• Internet Explorer (decoder)
• Bookmark/Favorite
• Home Page (Google, Yahoo)
• Back Forward
• Refresh (7 seconds)
• History
• Text size
• Encoding
Search Engine
Definition
• A search engine is a searchable database of Internet files collected by a computer program (called a wanderer, crawler, robot, worm, or spider).
search engine
Spider: Program that traverses the Web from link to link, identifying and reading pages
Index: Database containing a copy of each Web page gathered by the spider
Search and retrieval mechanism: Technology that enables users to query the index and that returns results in a schematic order
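The three parts above (spider, index, search and retrieval) can be sketched in a few lines of Python. This is a toy under stated assumptions: the three-page "web" and its example URLs are invented, nothing is fetched over the network, and the index is a simplified word-to-page map rather than a full copy of each page.

```python
from collections import deque

# Hypothetical three-page "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "a.example/home": ("internet search strategy", ["a.example/tools", "b.example/blog"]),
    "a.example/tools": ("search engine and directory tools", ["a.example/home"]),
    "b.example/blog": ("blog about search habits", []),
}


def spider(start_url, web):
    """Traverse from link to link (breadth-first), returning the pages read."""
    seen, queue, pages = {start_url}, deque([start_url]), {}
    while queue:
        url = queue.popleft()
        text, links = web[url]
        pages[url] = text
        for link in links:
            if link in web and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = {}
    for url, text in pages.items():
        for word in text.split():
            index.setdefault(word, set()).add(url)
    return index


def search(index, query):
    """Return URLs containing every query word, in alphabetical order."""
    words = query.lower().split()
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


index = build_index(spider("a.example/home", TOY_WEB))
print(search(index, "search"))
print(search(index, "search tools"))
```

Real engines rank results instead of sorting them alphabetically; the alphabetical order here is only a stand-in for that step.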
Single: Google.com, All the Web
Meta: Vivisimo.com, Dogpile
Internet search engines can be the most useful--or useless--tools on the Internet
Search Engines
Boolean Search
Add +ABC
Minus -ABC
Default DEF OR ABC
Exact phrase “ABC”
Wild card ABC*
Synonym ~ABC
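As a rough illustration of the operator table above, the sketch below assembles a query string from its parts. The function name build_query and its parameters are hypothetical, straight quotes stand in for the curly quotes on the slide, and individual search engines may support only some of these operators.

```python
from urllib.parse import quote_plus


def build_query(must=(), exclude=(), phrase=None, prefix=None):
    """Assemble a query string using the operator notation from the table above."""
    parts = [f"+{word}" for word in must]          # Add: +ABC
    parts += [f"-{word}" for word in exclude]      # Minus: -ABC
    if phrase:
        parts.append(f'"{phrase}"')                # Exact phrase: "ABC"
    if prefix:
        parts.append(f"{prefix}*")                 # Wild card: ABC*
    return " ".join(parts)


query = build_query(
    must=["podcast"],
    exclude=["video"],
    phrase="search strategy",
    prefix="librar",
)
print(query)              # the query as you would type it
print(quote_plus(query))  # the same query encoded for use in a URL
```

Always check the help pages of the engine you use before relying on an operator.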
Boolean
OR 33,702,660
NOT 81,497
AND 1,677
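Why OR returns the most hits and AND far fewer can be seen by treating each term's results as a set. The page IDs below are made up for illustration; only the relationship between the counts matters.

```python
def boolean_counts(matches_a, matches_b):
    """Count results for A OR B, A NOT B, and A AND B using set operations."""
    return {
        "OR": len(matches_a | matches_b),   # pages matching either term
        "NOT": len(matches_a - matches_b),  # pages matching A but not B
        "AND": len(matches_a & matches_b),  # pages matching both terms
    }


# Hypothetical page IDs that contain each term.
pages_with_abc = {1, 2, 3, 4, 5, 6}
pages_with_def = {5, 6, 7, 8}

counts = boolean_counts(pages_with_abc, pages_with_def)
print(counts)
```

OR can never return fewer pages than AND or NOT, so use OR to widen a search and AND or NOT to narrow it.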
Effective habits:
• Study Search Engine Help Files
• Use The "Three Strikes" Rule
• Don't Play Favorites
• Use Specialized Search Sites
• Keep your bookmarks well classified
TEST
• Use Google to find the three items you listed
Directory
Characteristics
• Picked by humans
• Hierarchical
• Covers a small portion of cyberspace
• Low noise
General Directory
• Yahoo - largest collection of topical collections
• Google Web Directory – uses the Google link ranking technology; Google search results are also included with directory results
• Open Directory – volunteers pick the web pages
Specialized Directory
• About - large collection of topical collections gathered by subject specialists
• Alexa – lists highly ranked websites
• 100times – free education sites for business studies
• INFOMINE - large collection of scholarly Internet resources collectively maintained by several libraries, including those from the University of California
• The Internet Public Library - large, selective collection from the University of Michigan
• The WWW Virtual Library - highly respected guides to many disciplines sponsored by the W3 Consortium
TEST
• Find directories that relate to your profession
Blog
• http://www.technorati.com
• http://www.bloglines.com
• http://www.blogger.com
• http://blog.iht.com
• http://www.jeffooi.com
Mailing List/Groups - Info Exchange
Multimedia
Web 2.0
• Encyclopedia
• http://www.wikipedia.org
• Photo
• http://www.flickr.com
• RSS
• http://www.feedstar.com
TV
• PowerPoint slide shows online
• http://www.slide.com
• Online TV
• http://wwitv.com
• http://twit.tv
• http://websearch.about.com/od/imagesearch/a/freeonlineTV.htm
Magazine & Newsletter
Podcast - Sound
Podcast
• Podcasting is a term coined in 2004, when the use of RSS syndication technologies became popular for distributing audio content for listening on mobile devices and personal computers.
RSS - News Aggregator
Mailing List - Connect
Pictorial Explanation
Web Based Mailing List
• Google Groups
• Yahoo Groups
• MSN Groups
•LIST: SIPI, Dale Carnegie, TRDEV
Benefits
• Ask questions
• Sense of belonging
• International exposure
• Contribution
• Networking
Dangers
• Lurk first
• Beware of the audience
• Jokes
• Use emoticons
Information Evaluation
Web Evaluation Techniques
Before you click to view the page...
• Look at the URL: is it a personal page or a site? Look for ~ or % or "users" or "members"
• Is the domain name appropriate for the content? edu, com, org, net, gov, ca.us, uk, etc.
• Is it published by an entity that makes sense?
• News from its source? www.nytimes.com
• Advice from a valid agency? www.nih.gov / www.nlm.nih.gov / www.nimh.nih.gov
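The "before you click" checks above can be partly automated. The following is a rough heuristic sketch, not a reliable classifier: the function name pre_click_check and the example.edu and example.org URLs are invented for illustration, and the code only reports the last label of the host name, so a two-part ending such as ca.us comes back as "us".

```python
from urllib.parse import urlparse

# Markers of a personal page, as named on the slide.
PERSONAL_MARKERS = ("~", "%", "users", "members")


def pre_click_check(url):
    """Return (looks_personal, domain_suffix) for a URL, before visiting it."""
    if "://" not in url:
        url = "http://" + url  # urlparse needs a scheme to find the host
    parsed = urlparse(url)
    path = parsed.path.lower()
    looks_personal = any(marker in path for marker in PERSONAL_MARKERS)
    host = parsed.hostname or ""
    suffix = host.rsplit(".", 1)[-1] if "." in host else ""
    return looks_personal, suffix


for candidate in (
    "www.nytimes.com",
    "www.nlm.nih.gov",
    "http://example.edu/~jsmith/notes.html",  # hypothetical personal page
):
    print(candidate, pre_click_check(candidate))
```

A flagged URL is not necessarily a bad source; the flag only tells you to look harder at who stands behind the page.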
Web Evaluation Techniques
Scan the perimeter of the page
• Can you tell who wrote it?
– name of page author
– organization, institution, agency you recognize
– e-mail contact by itself not enough
• Credentials for the subject matter?
– Look for links to: “About us” “Philosophy” “Background” “Biography”
• Is it recent or current enough?
– Look for “last updated” date - usually at bottom
• If no links or other clues...
– truncate back the URL
http://hs.houstonisd.org/hspva/academic/Science/Thinkquest/gail/text/ethics.html
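"Truncating back" means deleting the end of the URL one path segment at a time and trying each shorter address, until you reach a page that says who publishes the site. A minimal sketch, using the example URL above; the function name truncate_back is made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def truncate_back(url):
    """List the URL followed by each shorter parent URL, ending at the site root."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    results = [url]
    while segments:
        segments.pop()  # drop the last path segment
        path = "/" + "/".join(segments)
        if segments:
            path += "/"
        results.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return results


EXAMPLE_URL = (
    "http://hs.houstonisd.org/hspva/academic/Science/Thinkquest/gail/text/ethics.html"
)

for step in truncate_back(EXAMPLE_URL):
    print(step)
```

Try each printed address in the browser, from longest to shortest, and stop at the first one that identifies the author or organization.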
Web Evaluation Techniques
Indicators of quality
• Sources documented
– links, footnotes, etc.
– As detailed as you expect in print publications?
– Do the links work?
• Information retyped or forged
– why not a link to published version instead?
• Links to other resources
– biased, slanted?
Web Evaluation Techniques
What Do Others Say ?
• Search the URL in alexa.com
– Who links to the site? Who owns the domain?
– Type or paste the URL into the basic search box
– Traffic for top 100,000 sites
• See what links are in Google’s Similar pages
• Look up the page author in Google
Web Evaluation Techniques
STEP BACK & ASK: Does it all add up?
• Why was the page put on the Web?
– inform with facts and data?
– explain, persuade?
– sell, entice?
– share, disclose?
– as a parody or satire?
• Is it appropriate for your purpose?
Try evaluating some sites...
1. Search a controversial topic in Google:
– "nuclear armageddon"
– prions danger
– “stem cells” abortion
2. Scan the first two pages of results
3. Visit one or two sites – try to evaluate their quality and reliability
Internet Politics
Virus
Freedom of speech
Pornography
Company policies
Copyright
Internet Politics
• Virus
• Data loss due to viruses is still less than 10%
• 2 hours to clear up; a major infection will probably take 5 days
What are the consequences?
Internet Politics
• Virus
One of the first major attacks in the United States occurred in 1988 with a virus created by a Cornell University graduate student. It jammed more than 6,000 computers across the country, shutting down some networks on what was then a much smaller national computer network.
Internet Politics
• Antivirus Rules For The Users
1. Never accept disks, programs or data files without checking them first
2. Never use software, demos or other software with doubtful origins
3. Always scan any program or document downloaded onto your machine before you open or read it; this includes attachments received via e-mail
4. If you lend a disk to anyone, check it when you get it back, BEFORE you use it again
5. Keep your antivirus software up to date
Internet Politics
• Freedom of speech
• Abide by non-disclosure agreements
• In discussion groups, lurk before you participate
• Do not use four-letter words
• Use emoticons for international communication
Internet Politics
• Pornography
• It’s a big NO-NO
• Why is it not allowed?
• If allowed, what would be the negative consequences?
• If you land on it accidentally, leave immediately
Internet Politics
• Company policies
• Internet Users Policy (IUP)
• Previous experience
Internet Politics
Copyright
• Three types of software:
– public domain, freeware and shareware
• Give credit to authors
– electronic, verbal or written forms
• Check for viruses
• Consult IT or HR if not clear