Electronic Publishing and Misc. Topics
Electronic Publishing
Advantages of Electronic Publishing
  Convenient
    easy to do text searches
    easy to access past issues
  Dynamic - easy to modify, faster to release
  Easy to Publish
  Inexpensive – no paper cost, no distribution cost
Disadvantages of Electronic Publishing
  Limited Audience - only reaches those with Internet access
  Credibility – printed materials usually get more editing and reviewing
  Readability – lengthy works are difficult to read on screen (e.g., a Stephen King novel)
  Those Annoying Ads
  Easy to copy
    difficult to track down offenders
    difficult to establish ownership
Copyright
Most material available on-line is there to:
  Generate income for the authors
  Provide information
  Promote the exchange of ideas
Some assume that because something is available on-line, anyone can copy it, use it, link to it, etc. → not so
US Copyright Law (aka the copyright statute) encourages and promotes artistic expression by:
  Protecting an individual’s creative effort
  Allowing the author to benefit financially from it
In the US, a work must meet 3 requirements:
  It must be an original work (not derived or copied)
  It must be in tangible form – written, recorded, videotaped, saved on disk
  It must be more than an idea (however, an expression of the idea can be copyrighted)
As soon as the work is “fixed,” it receives copyright protection
Copyright (cont’d)
Copyright notices on Web pages remind others not to borrow your work:
  © Paul Kimwong, 2007. All rights reserved.
After March 1, 1989, the copyright notice no longer needed to appear for a work to receive protection
A notice consists of:
  Copyright symbol - © or &#169; or the word “copyright”
  followed by: Name of owner
  followed by: Year work was first published
  Also: “All rights reserved” needs to appear in other countries (use it, since you do not know where your Web pages will be viewed)
Copyright owner can:
  Prohibit copies of his/her work OR
  Profit by charging for copies
Copyright (cont’d)
Difficult to control access to copyrighted electronic work because a digital copy has the same quality as the original (unlike a photocopy)
WIPO (World Intellectual Property Organization) and the US Government believe that copyright protection needs to extend to electronically published materials
Others want to make all info on the Web freely available:
  Digital Future Coalition, Electronic Frontier Foundation – “Information is free.”
Conflict between the cultural desire to make original material freely available and the rights of creators to receive compensation and protection
Copyright (cont’d)
Two approaches to making info freely accessible on-line while ensuring the rights of the owner:
  Use legal constraints and stiff penalties for copyright violation
  Develop technology that deals with credit and copyright issues → digital rights management systems
    InterTrust’s DigiBox containers
    RightsMarket RightsPublish
    IBM Cryptolope containers
    Digital watermarks
To avoid plagiarism (that is, using someone else’s work and calling it your own):
  Get permission from the copyright owner
  Do not assume that an electronic document without a copyright notice is not copyrighted – it probably is
Project Gutenberg/On-line Publishing
Some books are available on-line in their entirety – is this copyright infringement?
  No – copyrights expire a certain number of years after the death of the owner; then the work is said to be in the public domain
  After a copyright expires, anyone can make the work available on-line or use it however they would like
Project Gutenberg – effort to make previously published books available on-line
  Books are called e-texts or electronic texts
  Michael Hart, a student at the University of Illinois in 1971, based Project Gutenberg on the premise that anything that can be entered into a computer can be reproduced indefinitely
  Hart decided to make works of literature available in electronic form for free, anticipating that one day the Internet would be available to the general public; he started with the US Declaration of Independence, then the Bill of Rights, then the US Constitution, then the Bible
Project Gutenberg (cont’d)
Mission of Project Gutenberg:
To make information, books, and other materials available to the general public in a form that a vast majority of the computers, programs, and people can easily read, use, quote, and search.
Original goal was to make 10,000 books available in the PG Electronic Library by the end of 2001; as of Oct. 2006, there were over 19,000 items in its collection
Books are scanned by volunteers, converted to text using OCR software, and usually saved in plain text (aka “plain vanilla ASCII”) so that all hardware and software currently available can read and search the e-text
Works are divided into 3 groups:
  Heavy Literature: Shakespeare, Bible, Moby Dick
  Light Literature: Alice In Wonderland, Peter Pan
  References: Dictionaries, almanacs, set of encyclopedias
Web address: http://www.gutenberg.org
On-line Publishing
Other on-line efforts:
  Bartleby.com – (http://bartleby.com)
    Electronic text archive that started out as Project Bartleby in January 1993 by Steven H. van Leeuwen
  Google Print Project – (http://books.google.com)
    Print Publisher Program – publisher can authorize Google to scan the full text of a book
    Print Library Project – Google wants to scan materials from libraries at Harvard, Stanford, Oxford Univ., the U of Michigan, and New York Public Library (controversial)
  Wikipedia – (http://wikipedia.org) “The Free Encyclopedia That Anyone Can Edit”
    Wiki – type of Web site that allows anyone visiting the site to add, remove, or otherwise edit all content quickly and easily; considered a collaborative writing tool
    Started on 1/15/2001; its open nature allows vandalism, inaccuracy, and opinion
On-line Publishing (cont’d)
Electronic Magazines, Newspapers:
  Goals usually involve making money
  May need to subscribe or register
  May only get to see an abridged version, with the purpose being to get you to buy the print version
  Supported by advertising
Scholarly journals:
  Not for profit; purpose is to report discoveries and convey info
  Academics are reluctant to publish on-line: credibility issue
E-zines:
  Electronic version of a zine (fanzine or fan magazine)
  Self-published for self expression
  Originally focused on science fiction and comic books; now they exist on any and every topic
  Not for profit; original zines charged a small fee to cover publishing expenses
  Web is perfect medium for zines – fast and cheap
Communication Mechanisms
Then – computers used for computation; Now – computers provide information and facilitate communication
Methods of communication:
  Email – 1972; allowed sending messages directly to other people or groups of people; oldest means of communication over the Internet
  Newsgroups – late 1979
    First one was set up between Duke U. and U. of N.C. at Chapel Hill
    Offers continuous public discussion on a topic
    Need a newsreader to view or post articles; a newsreader is a graphical or a text-based news client
    NNTP – Network News Transfer Protocol
      Protocol used to post and distribute news articles
      Decentralized – messages are not kept on a single server; they are kept on hundreds of news servers around the world
Communication Mechanisms (cont’d)
Newsgroups (cont’d)
  Sometimes newsgroups are moderated – someone reads and evaluates articles
  Sometimes articles are encoded in ROT13 (a Caesar cipher with a shift of 13) to “hide” content (racy joke, movie ending); ROT13 is an encoding scheme that maps each letter 13 positions down the alphabet: a ↔ n, b ↔ o, c ↔ p, d ↔ q, etc.
    V ybir kugzy!
  Newsgroups are organized in hierarchies – the name describes what the group is about; the name of a newsgroup is formed from the hierarchy under which it falls: main topic, then sub-topic, etc., moving from general → specific
    alt.animals.dogs.labs
    comp.lang.java.programmer
  The newsgroup network as a whole is termed USENET
  Big Eight = USENET’s 8 main newsgroup categories – comp, humanities, misc, sci, news, soc, talk, rec
  alt is not part of the Big Eight but is now the biggest general category (alt = alternative, free form)
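The ROT13 mapping described on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `rot13` is made up for the example, and it handles plain ASCII letters only, leaving digits, spaces, and punctuation untouched.

```python
import string


def rot13(text: str) -> str:
    """Shift each ASCII letter 13 places, wrapping around the alphabet."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    # Build a translation table: a->n, b->o, ..., n->a, and the same for capitals.
    table = str.maketrans(
        lower + upper,
        lower[13:] + lower[:13] + upper[13:] + upper[:13],
    )
    return text.translate(table)


# The sample line from the slide, decoded and then re-encoded.
decoded = rot13("V ybir kugzy!")
encoded_again = rot13(decoded)
print(decoded)
print(encoded_again)
```

Because the alphabet has 26 letters, applying ROT13 twice returns the original text, so the same function both encodes and decodes.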
Communication Mechanisms (cont’d)
Mailing Lists (or listserv) – early 80’s
  Type of broadcast email; many got their start on BITNET
  Like a personal email distribution list except:
    Anyone on the list can send email to all others on the list
    A program called a listserver (not a person) receives and distributes the email
  Terminology is similar to that of newsgroups except that it uses email, not a newsreader:
    Subscriber – person whose email address is on the mailing list
    List owner – person in charge
    Lurker – person who is subscribed to the list, reads posts, but does not post messages
  Differences between mailing lists (ML) and newsgroups (NG):
    NG – message copy is stored by your news server – you need to retrieve it; ML – message is delivered to your mailbox
    NG – uses newsreader (or WWW now); ML – uses email client
    NG – can read news when you want until it expires; ML – need to delete messages or they can fill your mailbox
Communication Mechanisms (cont’d)
Chat – 1988
  Internet application that allows you to have a “real-time” conversation
  Chat room – actually a channel or path that allows communication between 2 or more computers on the Internet
  Uses chat room software – everyone sees what everyone else types
  IRC – Internet Relay Chat
Instant Messaging – 11/1996
  ICQ (I Seek You) – was first introduced as a free utility by Mirabilis
  AOL released AOL IM (AIM), which became the leading IM utility
  Other IM programs:
    MSN Explorer includes IM
    Yahoo! Messenger
  Allows users to share text, Web links, images, sounds, and files, to talk, and to create custom chat rooms
  Not considered a secure way to communicate – so do not send confidential info through the system
Is email the new snail mail??
Communication Mechanisms (cont’d)
Forum/Discussion Boards
  Like a newsgroup except that most forums are kept on a single server maintained by the owner of the board
Blogs
  Web logs (on-line diaries)
  Web page made up of short, frequently updated posts that are arranged chronologically like a journal – like IM to the WWW
  Content and purposes of blogs vary:
    Internet/Web topics
    Women’s issues
    News and politics
    Sports, travel, art, photos, career, personal diary
  Examples:
    Baghdad Blogger’s “Where’s Raed?” (http://dear_raed.blogspot.com)
    Blogger (http://www.blogger.com) – now owned by Google; start your own…
File Sharing
Napster – at its peak, described as the most popular Web site ever created
  Program was written by Shawn Fanning, who was attending Northeastern U; he dropped out in Jan 99 to focus on Napster
  Pioneered concept of peer-to-peer file sharing (MP3 music files)
  Store files you want to share on your hard disk and share directly with other people using downloaded Napster software; a central index server kept track of who had what to share
This approach worked:
  No way a central server could have enough disk space to hold all songs (or bandwidth to handle all requests)
  Took advantage of a perceived loophole in copyright law that allows friends to share music files with friends
Courts decided it was promoting copyright infringement
  Record labels sued, forcing Napster’s shutdown in July 2001
  It was easy for a court order to shut down the site (eventually) – Napster just had to eliminate the central database of song titles
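A rough sketch of the central-index idea, assuming a simple in-memory lookup table. The class `CentralIndex`, its method names, and the peer addresses are all hypothetical; nothing here reflects Napster's actual protocol or code. The point is only that the server stores who has which file, never the files themselves.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for a central index server: tracks who shares what."""

    def __init__(self):
        # Maps a lower-cased song name to the set of peer addresses sharing it.
        self._peers_by_song = defaultdict(set)

    def register(self, peer_address, songs):
        """A peer announces the files it is willing to share."""
        for song in songs:
            self._peers_by_song[song.lower()].add(peer_address)

    def lookup(self, song):
        """Return the peers that share this song; the download itself is peer-to-peer."""
        return sorted(self._peers_by_song.get(song.lower(), set()))


index = CentralIndex()
index.register("10.0.0.5", ["Song A.mp3"])
index.register("10.0.0.9", ["Song A.mp3", "Song B.mp3"])
print(index.lookup("song a.mp3"))
```

Every search goes through this one table, which is why removing the central database was enough to stop the whole service.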
File Sharing (cont’d)
Gnutella replaces Napster with another peer-to-peer network
Similarities:
  Users place files they want to share on their hard disks and make them available to everyone else for downloading in peer-to-peer fashion
  Users run gnutella software to connect to the gnutella network
Differences (these make it hard for a simple court order to shut the network down):
  No central database that knows all the files available on the gnutella network – all machines on the network tell each other about available files using a distributed query approach
  There are many different client applications available to access the gnutella network: BearShare, Gnucleus, LimeWire, Morpheus, WinMX, XoloX
File Sharing (cont’d)
How gnutella finds a song:
  Type in name of song
  Your machine knows of at least one other gnutella machine somewhere on the network, and the song name is sent to those machines
  Each machine checks whether the requested file is on its local hard disk and, if so, sends the file name and its IP address back to the requester
  Each machine also sends the same request on to the machines it is connected to
Disadvantages:
  No guarantee that the file you want will be found
  Queries can take some time to complete
  Your machine is part of the network so it needs to answer requests, etc.
Gnutella is itself legal – there is no law against sharing public domain files; it is illegal when people use it to distribute copyrighted music
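The search steps on this slide can be simulated in memory as a breadth-first flood. This is a sketch under assumptions, not the gnutella protocol itself: `flood_query`, the `network` and `files` dictionaries, and the machine names are hypothetical, and the `max_hops` limit is an assumption added here so the flood terminates (the slide does not mention one).

```python
from collections import deque


def flood_query(network, files, start, song, max_hops=4):
    """Ask neighbors for a song; each neighbor passes the request on.

    network: dict mapping machine address -> list of neighbor addresses
    files:   dict mapping machine address -> set of file names on its disk
    Returns a list of (file name, machine address) hits, nearest first.
    """
    hits = []
    seen = {start}  # machines that have already handled this request
    queue = deque((neighbor, 1) for neighbor in network.get(start, []))
    while queue:
        machine, hops = queue.popleft()
        if machine in seen:
            continue
        seen.add(machine)
        # The machine checks its own disk and reports back if it has the file.
        if song in files.get(machine, set()):
            hits.append((song, machine))
        # It then forwards the same request to the machines it knows about.
        if hops < max_hops:
            for neighbor in network.get(machine, []):
                if neighbor not in seen:
                    queue.append((neighbor, hops + 1))
    return hits


network = {"me": ["a"], "a": ["me", "b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}
files = {"b": {"song.mp3"}, "d": {"song.mp3"}}
print(flood_query(network, files, "me", "song.mp3"))
print(flood_query(network, files, "me", "song.mp3", max_hops=2))
```

With the smaller hop limit the copy on machine `d` is never reached, which illustrates why there is no guarantee that a file will be found.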
RSS
RSS – Really Simple Syndication OR Rich Site Summary OR RDF Site Summary
  Introduced in 1999 by Netscape (abandoned in 2001) and called Rich Site Summary
  Another version, pioneered by UserLand Software, was called Really Simple Syndication
Provides a way to have your favorite Web sites notify you when their content has been updated; otherwise:
  You might keep your favorite sites as bookmarks and check them periodically
  You might keep them in your head and check periodically
Difference between RSS feeds and Web content – content published in an RSS feed is set up to send out notification whenever new material is available
RSS (cont’d)
RSS terms:
  RSS/XML/Atom are technologies
    RSS and Atom are two varieties of the same thing – a feed, which is a “wrapper” for pieces of regularly updated content
    XML – the base technology that RSS, Atom, and XHTML are built on
  Syndication – the process of using RSS/Atom for automated updates
  News reader or aggregator – a program or a Web site that checks your list of bookmarks and lets you know what is new on each site in your list (list only has to be set up one time)
    Works by pulling in the feeds of your various bookmarks, so it delivers the updated content as well as notifying you that something has changed:
      You can read the new content in the news reader OR
      You can leave the reader and visit the site
  Analogy: News reader acts like a customizable newspaper; you can pull content from a growing number of sources into one place, but the source must provide a feed
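What a news reader does when it checks a feed can be sketched with Python's standard `xml.etree.ElementTree` module. The function `new_items`, the `SAMPLE_FEED` text, and the example.com addresses are hypothetical; a real aggregator would download the feed over HTTP and would also need to handle Atom, which this sketch does not.

```python
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed used only for this illustration.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example Site Updates</title>
<description>Hypothetical feed</description>
<link>http://example.com/</link>
<item><title>First post</title><description>Hello</description><link>http://example.com/1</link></item>
</channel></rss>"""


def new_items(feed_xml, seen_links):
    """Return (title, link) pairs for items whose link has not been seen before."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        if link and link not in seen_links:
            seen_links.add(link)  # remember it so it is reported only once
            fresh.append((title, link))
    return fresh


seen = set()
first_check = new_items(SAMPLE_FEED, seen)   # everything is new the first time
second_check = new_items(SAMPLE_FEED, seen)  # nothing has changed since
print(first_check)
print(second_check)
```

The `seen` set plays the role of the reader's record of which articles you have already been shown.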
RSS (cont’d)
Some popular news readers:
  Bloglines – completely Web-based news reader as well as a feed search tool (http://bloglines.com)
  Stand-alone news readers that are platform specific:
    FeedDemon for Windows
    NetNewsWire for Mac OS X
  Problem with stand-alone news readers – information about which feeds you subscribe to and which articles you have read stays on one computer; you will not be able to access it from both home and work
To subscribe to a Web site or blog, look for a button or label that says “RSS” or “Atom” or “Syndicate This” or “SUB BLOGLINES” and then copy the link that button points to into your news reader
RSS (cont’d)
How to create a feed:
  Use a text editor (if you have a blog, several blogging tools automatically generate RSS files)
  Your file needs to include at least one item → an item is a Web page that you want others to link to
  Need 3 pieces of info about each item:
    Title
    Description
    Link
  Example of item:
<item>
  <title>CS403 Sections 01, 03, and 06 Updates to class FAQs page</title>
  <description>11/16/06 – Added link to search examples</description>
  <link>http://pubpages.unh.edu/~cs403d/CS403/FAQ/faq.html</link>
</item>
RSS (cont’d)
Creating a feed (cont’d)
  You can include up to 15 items (a limit from RSS 0.91; RSS 2.0 does not cap the number) – insert new items at the top
  Now you need to define the site as a channel; use the same tags as with items – title, description, link – but this time the info is about your entire site
  Example:
<?xml version="1.0" ?>
<rss version="2.0">
  <channel>
    <title>CS403 Sections 01, 03, and 06 Site Updates</title>
    <description>CS403 Updates to our class site</description>
    <link>http://pubpages.unh.edu/~cs403d/CS403/</link>
    <item>
      <title>CS403 Sections 01, 03, and 06 Updates to class FAQs page</title>
      <description>11/16/06 – Added link to search examples</description>
      <link>http://pubpages.unh.edu/~cs403d/CS403/FAQ/faq.html</link>
    </item>
  </channel>
</rss>
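As an alternative to typing the tags by hand, the same channel-plus-items structure can be generated with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch: the function `build_feed` and its dictionary arguments are made up for the example, and only the three required pieces of info (title, description, link) are written.

```python
import xml.etree.ElementTree as ET


def build_feed(channel, items):
    """Build an RSS 2.0 document from a channel dict and a list of item dicts."""
    rss = ET.Element("rss", version="2.0")
    chan = ET.SubElement(rss, "channel")
    # Channel-level info describes the entire site.
    for tag in ("title", "description", "link"):
        ET.SubElement(chan, tag).text = channel[tag]
    # Items are written in the order given, so pass the newest item first.
    for entry in items:
        item = ET.SubElement(chan, "item")
        for tag in ("title", "description", "link"):
            ET.SubElement(item, tag).text = entry[tag]
    return ET.tostring(rss, encoding="unicode")


feed = build_feed(
    {
        "title": "CS403 Sections 01, 03, and 06 Site Updates",
        "description": "CS403 Updates to our class site",
        "link": "http://pubpages.unh.edu/~cs403d/CS403/",
    },
    [
        {
            "title": "CS403 Sections 01, 03, and 06 Updates to class FAQs page",
            "description": "11/16/06 – Added link to search examples",
            "link": "http://pubpages.unh.edu/~cs403d/CS403/FAQ/faq.html",
        }
    ],
)
print(feed)
```

One benefit of generating the file this way is that characters such as `&` and `<` in a title or description are escaped automatically.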
RSS (cont’d)
Creating a feed (cont’d)
  Save this file in public_html with .xml extension
  Change permissions on the file so that it is world-readable
  Validate it: http://feedvalidator.org (is the RSS file correct?)
  Syndicate it – let others know about your feed:
    Visit the RSS directories and search engines, as they offer submission pages
    Post a link on your Web page:
      Can use an ordinary link
      Many sites use a small orange XML icon or blue RSS icon to link to the feed
This was an example of making a feed by hand; some alternatives:
  RSS Headline Creator – fill info into a form and your RSS file’s code will be generated (you need to copy and paste)
  Create a Channel (myRSS)
    Enter the URL of any Web page that lists articles or content on your Web site
    The myRSS spider will “scrape” your page – it guesses at what the headlines are, loads a new page, and generates the RSS file
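The save and change-permissions steps at the top of this slide can also be scripted. The sketch below is an assumption-laden illustration: `publish_feed` and the file name `updates.xml` are hypothetical, a temporary directory stands in for public_html, and the parse step only confirms the XML is well-formed, which is a weaker check than running it through the feed validator.

```python
import os
import stat
import tempfile
import xml.etree.ElementTree as ET


def publish_feed(feed_xml, directory, name="updates.xml"):
    """Write the feed into a directory and make the file world-readable."""
    ET.fromstring(feed_xml)  # raises ParseError if the XML is not well-formed
    path = os.path.join(directory, name)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(feed_xml)
    # Mode 644: owner can read and write; group and others can only read.
    os.chmod(path, 0o644)
    return path


# A temporary directory stands in for public_html in this example.
demo_dir = tempfile.mkdtemp()
saved_path = publish_feed('<rss version="2.0"><channel></channel></rss>', demo_dir)
print(saved_path)
```

On a Unix Web server the same permission change is usually made once from the command line; doing it in the script simply keeps the file readable each time the feed is regenerated.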
Dot Con
Frontline PBS video:
  “When the Internet bubble burst in March 2000, unlucky investors watched more than $3 trillion of their money disappear. What spurred the incredible dot-com bull run on Wall Street? Was the public blinded by dreams of small fortunes and easy living or did the nation’s investment banks manipulate the IPO market and exploit public trust?”
August 9, 1995 – 16-month old Netscape goes public:
  First big “pop” stock (some consider this the start of the Internet boom)
  Offering price was $28/share, went up to $75/share, closed at $58/share; by Dec. 1995 it was trading at $171/share
  Proved that you did not need to be profitable to go public
“Stock Market Bubble” – term applied to a self-perpetuating rise or boom in the share prices of stocks of a particular industry.
Dot Con site: http://www.pbs.org/wgbh/pages/frontline/shows/dotcon
Dot Con (cont’d)
Questions to consider:
  What caused the tech bubble to burst?
  When the stock is bid up in the early days of trading, what money does the company make?
  What influence did the analysts have who touted their recommendations?
  So who was getting rich? Who lost out?
  What is a “Dutch Auction?”
    Google used a Dutch Auction for selling shares in August 2004
    It uses the Internet to bring transparency and level the playing field