Post on 18-Dec-2015
T.Sharon-A.Frank1
Internet Resources Discovery (IRD)
Internet/WWW
Technical Background
Thanks to Miki Even-Haim and Yoram Dahan
T.Sharon-A.Frank2
Measuring the Web
"When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely in your thoughts advanced to the state of science." - Lord Kelvin
T.Sharon-A.Frank3
Internet/WWW Statistics
• Internet Size & Growth
• Population Sizes
• Various Activities
• Web Size & Growth
• Web Pages and Formats
T.Sharon-A.Frank4
The Domain Survey
The Domain Survey attempts to discover every host (i.e., every uniquely reachable connected computer) on the Internet by doing a complete search of the Domain Name System. The latest results, gathered in late January 2001, are listed together with Mark Lottor’s earlier work in this area over many years. For more information see RFC 1296; for more data see the archive site at the Internet Software Consortium, http://www.isc.org/ds/
Beginning with the January 1998 survey, Lottor switched to a new survey method to avoid the increasing blocking of DNS zone transfers. This method, which queries the DNS for known IP addresses, is explained at http://www.isc.org/ds/new-survey.html. Its results are not backward compatible with the old ones. The old and the new data are juxtaposed in the trend graphs using dotted lines.
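A minimal sketch of what "querying the DNS for known IP addresses" could look like at the naming level, assuming the survey relies on reverse (PTR) lookups; the slides do not spell out the mechanism. The helper names `ptr_query_name` and `ptr_names_for_block` are hypothetical, and the addresses come from a documentation range. The code only builds the query names and sends no DNS traffic.

```python
import ipaddress


def ptr_query_name(ip):
    """Return the reverse-DNS (PTR) query name for one IP address."""
    return ipaddress.ip_address(ip).reverse_pointer


def ptr_names_for_block(cidr):
    """Return PTR query names for every usable host address in a block."""
    network = ipaddress.ip_network(cidr)
    return [addr.reverse_pointer for addr in network.hosts()]


if __name__ == "__main__":
    # 192.0.2.0/30 is a documentation range, used here only as an example.
    for name in ptr_names_for_block("192.0.2.0/30"):
        print(name)
```

A survey of this kind would then ask a resolver for the PTR record of each name and count the addresses that have a host name assigned.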
T.Sharon-A.Frank6
Internet Hosts 1995-2001
[Figure: host counts from Jan-95 to Jan-01 on an axis from 0 to 120,000,000. Series: new survey data; adjusted old survey data.]
T.Sharon-A.Frank7
Internet Hosts - Overall Trend
[Figure: host counts from Jan-89 to Jan-07 on a logarithmic axis from 10,000 to 1,000,000,000. Series: historical; projected.]
T.Sharon-A.Frank8
Trends in Internet Hosts
• The figure of 109 million hosts represents a significant new benchmark for the number of Internet hosts. The current annual growth rate stands at 51%, within the 46-67% range seen over the past 2 years.
• The data show continued strong exponential growth, with the 100 million host barrier crossed in late 2000. If the same growth rate is sustained, the Internet would cross the 1 billion host mark in mid-2005.
• The Internet is now expanding at the rate of 63 new hosts and 11 new domains per minute worldwide.
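The projection above is a compound-growth calculation, which can be sketched as follows. This is an illustration only, assuming a constant annual rate and a starting point of 109 million hosts in January 2001; the function names are hypothetical. The crossing date is sensitive to the assumed rate, so the sketch evaluates the 46%, 51% and 67% rates quoted above.

```python
import math


def years_to_reach(current, target, annual_growth):
    """Years until `current` grows to `target` at a constant annual rate."""
    return math.log(target / current) / math.log(1.0 + annual_growth)


def hosts_per_year(hosts_per_minute):
    """Convert a per-minute rate into a per-year count (365-day year)."""
    return hosts_per_minute * 60 * 24 * 365


if __name__ == "__main__":
    for rate in (0.46, 0.51, 0.67):
        years = years_to_reach(109e6, 1e9, rate)
        print(f"{rate:.0%} growth: about {years:.1f} years after Jan 2001")
    per_year = hosts_per_year(63)
    print(f"63 hosts/minute is about {per_year / 1e6:.1f} million hosts/year")
```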
T.Sharon-A.Frank9
Total domains registered
• Total domains registered worldwide: 33,293,791
• International (COM): 23,121,005
• International (EDU): 6,708
• International (GOV): 1,269
• International (NET): 4,343,150
• International (ORG): 2,671,279
• United Kingdom (CO.UK): 3,150,380
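As a quick consistency check, the per-domain figures above can be summed and turned into shares. The sketch below uses only the numbers listed on this slide; the `shares` helper is a hypothetical name.

```python
# Registered domains as listed on the slide.
registrations = {
    "COM": 23_121_005,
    "EDU": 6_708,
    "GOV": 1_269,
    "NET": 4_343_150,
    "ORG": 2_671_279,
    "CO.UK": 3_150_380,
}
reported_total = 33_293_791


def shares(counts):
    """Return each entry's fraction of the summed total."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    print("Sum matches reported total:",
          sum(registrations.values()) == reported_total)
    for name, fraction in shares(registrations).items():
        print(f"{name}: {fraction:.1%}")
```

The listed categories add up exactly to the reported worldwide total, so the total covers only these six registries.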
T.Sharon-A.Frank10
Where the Internet hosts are by domain (Jan 2000)
• com: 35%
• net: 23%
• edu: 8%
• Japan: 4%
• UK: 3%
• US-dom: 3%
• USA-dom: 3%
• Germany: 2%
• Canada: 2%
• Others: 18%
T.Sharon-A.Frank11
Where the Internet hosts are by domain (Jan 2001)
• com: 34%
• net: 28%
• edu: 6%
• Japan: 4%
• US-dom: 3%
• UK: 2%
• Germany: 2%
• Canada: 2%
• mil: 2%
• Others: 18%
T.Sharon-A.Frank12
Hosts: Large Three-Letter Domains
[Figure: host counts per domain from Jan-96 to Jan-01 on an axis from 0 to 40,000,000. Series: com, net, edu, mil, org, gov.]
T.Sharon-A.Frank13
Trends in Domains Growth
• The largest domain is COM, jumping 7.8 million hosts since January 2000 to a new high of 36.3 million hosts. That represents a current annualized growth rate of 32%. As a percentage of the entire Internet, the COM host count stayed about the same at 33.2% of all Internet hosts.
• The number of hosts in the NET domain - which is heavily used by ISPs for dialup customers - remained the fastest growing of all the large domains, expanding at an annual growth rate of 45% to 30.8 million hosts.
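The shares quoted above can be reproduced approximately from the host counts. The sketch below assumes the rounded total of 109 million hosts from the earlier slide, so the last digit may differ from the slide's 33.2%. It also derives a plain year-over-year rate from the 7.8 million increase; this need not equal the slide's annualized 32%, whose measurement interval is not stated. Function names are hypothetical.

```python
TOTAL_HOSTS_MILLIONS = 109.0  # rounded total from the earlier slide


def share_percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


def year_over_year_percent(current, increase):
    """Growth over the period, given the current value and the increase."""
    previous = current - increase
    return 100.0 * increase / previous


if __name__ == "__main__":
    com, net = 36.3, 30.8  # millions of hosts, from the slide
    print(f"COM share: {share_percent(com, TOTAL_HOSTS_MILLIONS):.1f}%")
    print(f"NET share: {share_percent(net, TOTAL_HOSTS_MILLIONS):.1f}%")
    print(f"COM year-over-year: {year_over_year_percent(com, 7.8):.1f}%")
```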
T.Sharon-A.Frank16
Internet Users around the Globe
Source: http://www.geocities.com/Eureka/Enterprises/6930/enstat.html
T.Sharon-A.Frank17
Internet Users Statistics
Source: http://www.geocities.com/Eureka/Enterprises/6930/enstat.html
T.Sharon-A.Frank24
Language populations
Source: Global reach http://glreach.com/globstats/index.php3?goto
T.Sharon-A.Frank31
Number of Web Sites
• 1997: 1,570,000
• 1998: 2,851,000
• 1999: 4,882,000
• 2000: 7,399,000
• 2001: 8,745,000
[Figure: number of Web sites by year, on an axis from 1,000,000 to 10,000,000.]
T.Sharon-A.Frank32
Number of Unique Web Sites*
• 1998: 2,636,000
• 1999: 4,662,000
• 2000: 7,128,000
• 2001: 8,443,000
[Figure: number of unique Web sites by year, on an axis from 0 to 9,000,000.]
* If a site is located at multiple IP addresses, the site is retained in the sample only if the numerically lowest IP address is in the sample.
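The de-duplication rule in the footnote can be sketched as below. This is an illustration under stated assumptions, not the survey's actual code: the function name `retained_in_sample` and the example addresses are hypothetical, and "numerically lowest" is taken to mean comparison of the addresses as integers rather than as strings.

```python
import ipaddress


def retained_in_sample(site_ips, sampled_ips):
    """Keep a multi-homed site only if its lowest IP address was sampled."""
    lowest = min(ipaddress.ip_address(ip) for ip in site_ips)
    sampled = {ipaddress.ip_address(ip) for ip in sampled_ips}
    return lowest in sampled


if __name__ == "__main__":
    site = ["192.0.2.10", "192.0.2.9"]  # one site, two addresses
    print(retained_in_sample(site, ["192.0.2.9"]))   # lowest address sampled
    print(retained_in_sample(site, ["192.0.2.10"]))  # only the higher one sampled
```

Counting a site only through its lowest address means a site reachable at several sampled addresses is still counted once.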
T.Sharon-A.Frank33
Types of Unique Web Sites
         Public       Private      Provisional
1998:    1,457,000    315,000      864,000
1999:    2,229,000    790,000      1,643,000
2000:    2,942,000    1,494,000    2,692,000
2001:    3,119,000    2,078,000    3,246,000
Public: Offers content that is freely accessible to the general public.
Private: Offers restricted access to content: for example, via fee payment or prior authorization.
Provisional: Is in a transitory or unfinished state (e.g., “under construction”).
T.Sharon-A.Frank34
Growth of Unique Web Sites
                 1997-2001   1997-1998   1998-1999   1999-2000   2000-2001
Sites:           457%        82%         71%         52%         18%
Unique Sites:    N/A         N/A         77%         53%         18%
Public Sites:    290%        82%         53%         32%         6%
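The growth percentages can be recomputed from the site counts on the preceding slides. The sketch below is a cross-check only, taking the 2001 total as 8,745,000 sites; the public-site series starts in 1998 because no 1997 count is given. The `growth_pct` helper is a hypothetical name.

```python
sites = {1997: 1_570_000, 1998: 2_851_000, 1999: 4_882_000,
         2000: 7_399_000, 2001: 8_745_000}
unique_sites = {1998: 2_636_000, 1999: 4_662_000,
                2000: 7_128_000, 2001: 8_443_000}
public_sites = {1998: 1_457_000, 1999: 2_229_000,
                2000: 2_942_000, 2001: 3_119_000}


def growth_pct(counts, start, end):
    """Percentage growth between two years, rounded to a whole percent."""
    return round((counts[end] / counts[start] - 1) * 100)


if __name__ == "__main__":
    for label, counts in (("Sites", sites),
                          ("Unique sites", unique_sites),
                          ("Public sites", public_sites)):
        years = sorted(counts)
        steps = [f"{a}-{b}: {growth_pct(counts, a, b)}%"
                 for a, b in zip(years, years[1:])]
        print(label, "|", ", ".join(steps))
```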
T.Sharon-A.Frank35
Web Pages Statistics (1)
Note: all numbers below (source data: "Accessibility of Information on the Web") refer to publicly indexable web pages. These exclude pages that are not normally considered for indexing by web search engines, such as pages with authorization requirements (including firewalls), pages excluded from indexing using the robots exclusion standard, and dynamic pages.
• 12/97: At least 320 million pages;
• 02/99:
– 2.8 million servers on the publicly indexable web;
– 289 pages per server on average;
– 800 million publicly indexable web pages;
– 18.7 kilobytes is the mean size of a page;
– 3.9 kilobytes is the median size of a page;
T.Sharon-A.Frank36
Web Pages Statistics (2)
• 02/99:
– 7.3 kilobytes average size of textual content per page (after removing HTML tags, comments and extra white space);
– 0.98 kilobytes median size of the textual content;
– 15 terabytes of pages is the amount of data on the web;
– 6 terabytes is the amount of text data;
– 62.8 images per web server;
– 15.2 kilobytes average image size;
– 5.5 kilobytes median image size;
– 180 million images on the publicly indexable web;
– 3 terabytes total amount of image data;
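The totals on these two slides follow from the per-server and per-page averages. The sketch below checks that arithmetic, assuming decimal units (1 kilobyte = 1,000 bytes, 1 terabyte = 10^12 bytes); the source does not say which convention it uses, and the variable names are hypothetical.

```python
KB = 1_000   # assumption: decimal kilobytes
TB = 10**12  # assumption: decimal terabytes

servers = 2_800_000
pages_per_server = 289
reported_pages = 800_000_000
reported_images = 180_000_000

# Estimated counts from the per-server averages.
estimated_pages = servers * pages_per_server
estimated_images = servers * 62.8

# Estimated data volumes from the mean sizes.
page_data_tb = reported_pages * 18.7 * KB / TB
text_data_tb = reported_pages * 7.3 * KB / TB
image_data_tb = reported_images * 15.2 * KB / TB

if __name__ == "__main__":
    print(f"Pages:  {estimated_pages / 1e6:.1f} million (reported: 800 million)")
    print(f"Images: {estimated_images / 1e6:.1f} million (reported: 180 million)")
    print(f"Page data:  {page_data_tb:.2f} TB (reported: 15 TB)")
    print(f"Text data:  {text_data_tb:.2f} TB (reported: 6 TB)")
    print(f"Image data: {image_data_tb:.2f} TB (reported: 3 TB)")
```

Each reported total is the corresponding product rounded to one or two significant figures.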
T.Sharon-A.Frank37
Web Pages Statistics (3)
• As of 7/5/2000, the web has roughly:
– 2,170,000,000 pages;
– 40,800,000,000,000 bytes of text;
– 489,000,000 images;
– 8,160,000,000,000 bytes of image data;
• In the last 24 hours, the web added:
– 4,420,000 new pages;
– 82,800,000,000 new bytes of text;
– 994,000 new images;
– 16,600,000,000 new bytes of image data;
– 49,400,000 pages changed;
– 11,100,000 images changed;
• Average life span of the web page: 44 days;
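A few derived figures help put these counts in proportion. The sketch below uses only the numbers on this slide. Reading the 44-day life span as total pages divided by pages changed per day is an assumption; the source does not say how it was computed.

```python
pages = 2_170_000_000
text_bytes = 40_800_000_000_000
images = 489_000_000
image_bytes = 8_160_000_000_000
new_pages_per_day = 4_420_000
pages_changed_per_day = 49_400_000

# Average sizes implied by the totals.
text_bytes_per_page = text_bytes / pages
bytes_per_image = image_bytes / images

# Daily growth as a fraction of the existing web.
daily_growth = new_pages_per_day / pages

# Assumed reading of "life span": days for changes to equal the page count.
turnover_days = pages / pages_changed_per_day

if __name__ == "__main__":
    print(f"Text per page:  {text_bytes_per_page:,.0f} bytes")
    print(f"Bytes per image: {bytes_per_image:,.0f}")
    print(f"Daily growth:   {daily_growth:.2%}")
    print(f"Turnover:       {turnover_days:.1f} days")
```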
T.Sharon-A.Frank41
What is the "average page" like?
The page sizes are highly variable, as illustrated in the table below, which covers one snapshot of 1.524 million pages.
Mean                  6518
Median                2021
Standard Deviation    31678
T.Sharon-A.Frank42
Embedded Image Count
The Web is quite graphically rich. The table shows that just over 50% of all pages contain at least one image reference. It is interesting to note that about 15% of pages contain exactly one image. Quite likely, for many of the pages that contain large numbers of images, those images are in fact typographical marks of the "reddot.gif" variety.
T.Sharon-A.Frank45
References
• Internet Domain Survey
– http://www.isc.org/ds/
• Online Computer Library Center
– http://wcp.oclc.org/stats/size.html
• UCLA Center for Communication Policy
– http://www.ccp.ucla.edu/pages/InternetStudy.asp
• Network Facts
– http://www.netfactual.com/
• Internet Statistics
– http://www.mit.edu/people/mkgray/net/
• Domain Statistics
– http://www.domainstats.com