Date post: | 25-Dec-2015 |
Upload: | alyson-hines |
Web Services
CPTE 433
John Beckett
Players
• Server – provides resources in terms of “pages”
• Client
  – Browser on a PC
  – Browser on a smaller device
  – Current trend: “App”
• HTTP: higher-level protocol defined by the IETF (with W3C involvement)
  – Migrating toward HTTPS
• IP: lower-level protocol defined by the IETF
Why Open Standards
• Seems like a dumb question now
• Formerly, systems could not talk to each other
• Then gateways were used to go between
• Now common standards and protocols are used widely
• We are no longer dependent on a single vendor to make the net work
Building Blocks
• URL specifies where a service is located:
  – Protocol
  – Username/Password
    • http://jbeckett:[email protected]
  – Hostname
  – Directory
  – Parameter(s)
    • Name
    • Value
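A minimal sketch of how the URL parts listed above can be pulled apart with Python's standard library. The URL, username, password, and parameter names here are made up for illustration and do not come from the slides.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL containing every building block named on the slide.
url = "http://user:secret@www.example.edu:8080/courses/list?term=fall&level=400"

parts = urlsplit(url)
params = parse_qs(parts.query)  # each parameter name maps to a list of values

print("Protocol: ", parts.scheme)
print("Username: ", parts.username)
print("Password: ", parts.password)
print("Hostname: ", parts.hostname)
print("Port:     ", parts.port)
print("Directory:", parts.path)
print("Params:   ", params)
```

Putting a password in a URL is shown only because the slide lists it as a possible part; it ends up in logs and browser history.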
Web 1 Server-Side
• Form is used to send information and request a response
• Data is transferred from client to host by either of:
  – Post – data is not visible on URL line
  – Get – data is visible on URL line
• Data can be implied by the URL itself
• Server-side program accepts data, processes it, and returns result via HTTP (Perl, PHP, ASP)
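To make the Get versus Post distinction concrete, here is a small sketch that builds both kinds of request without sending them. The host name and form fields are assumptions chosen for the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form fields; the same encoding is used for both methods.
fields = {"name": "Ada Lovelace", "course": "CPTE 433"}
encoded = urlencode(fields)

# Get: the data rides on the URL line, after the "?".
get_req = Request("http://www.example.edu/signup?" + encoded)

# Post: the URL stays clean and the data travels in the request body.
post_req = Request("http://www.example.edu/signup", data=encoded.encode("ascii"))

print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url, post_req.data)
```

Neither request is actually sent here. Note that Post only hides the data from the URL line; without HTTPS it is still readable on the network.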
Web 1 Client Side
• Embedded scripting language
  – Originally LiveScript (Netscape)
  – Became JavaScript for marketing reasons
  – Microsoft developed JScript (ignored by market)
  – Microsoft includes JavaScript now
  – Standardized as ECMAScript
Web 2: AJAX
• Asynchronous JavaScript And XML
  – Evens load on the server
• Set of techniques to disconnect from traditional query/response cycle
  – Can make it more difficult for user to determine state (e.g. “Did I click it?”), resulting in duplicate or confused requests
• Best example is Google Earth/Maps
• Now very fashionable in the industry
HTTP Messages to Know
• 200 (OK) – Request completed
• 301 (Moved Permanently) – Need to begin using the new URL
• 302 (Found) – Redirect to specified URL
• 307 (Temporary Redirect) – Redirect to specified URL temporarily
• 401 (Unauthorized) – Try again with authentication
• 403 (Forbidden) – Access not permitted
• 404 (Not Found) – No such page
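The status codes above are available by number in Python's standard library, which is a convenient way to check the official reason phrase for each. This is only a lookup sketch; nothing here talks to a server.

```python
from http import HTTPStatus

# The codes listed on the slide.
CODES_TO_KNOW = (200, 301, 302, 307, 401, 403, 404)

# Map each numeric code to its standard reason phrase.
phrases = {code: HTTPStatus(code).phrase for code in CODES_TO_KNOW}

for code, phrase in phrases.items():
    print(code, phrase)
```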
Webmaster Role
• Enable people to do their own updates
  – Data
  – Web pages
• Use software to do this
  – Content Management System
  – Dreamweaver?
    • Adobe Contribute allows you to control style while users enter content
  – Microsoft SharePoint
Web SLA
• Lead time for changes
  – Better yet, they do their own changes
• Performance:
  – For the proponent of the site, latency at a given number of queries per second
  – For the site visitor, other traffic is unimportant so your response to them as an individual is key
  – Metaphor: Web store versus brick store
    • In the bricks, people see traffic
    • In the Web store, they think of typing in another URL
Architectures
• Static Web server
• CGI server
  – “CGI” in this chapter is a generic term for all server-side methods.
  – Perl is traditional CGI, creates an additional process per request
  – Module based: PHP, VBScript, Java
  – Thread creation can be a problem in mod_perl (better to not use it)
• Database-driven site
  – Wide variety of methods used for this
  – Well-developed: Content Management System
  – CPU performance can be an issue
• Multimedia (streaming) server
  – CPU performance can be an issue
LAMP
• Linux, Apache, MySQL, Perl
• Linux, Apache, MySQL, PHP
• Linux, Apache, MySQL, Python
• Best to have a name for your application architecture to save time
• Like anything else, standardize your application architecture
Multiple Servers Per Host
• Apache and IIS can sense the URL the user was going to and automatically serve the appropriate page.
  – Hostname
  – Protocol (http:// versus https:// versus ftp://)
• You could use multiple Ethernet ports for the same purpose
  – Improves performance if you have a very high-bandwidth Internet connection
• You could virtualize the entire server
  – Simplifies https:// configuration
  – Requires separate IP address per site
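The first option above (one server answering for several sites) boils down to a lookup keyed on protocol and hostname. The sketch below is a simplified illustration of that idea, not how Apache or IIS is implemented; the site names and directory paths are hypothetical.

```python
# Hypothetical mapping from (protocol, hostname) to a document root.
SITES = {
    ("http", "www.example.edu"): "/srv/www/main",
    ("https", "www.example.edu"): "/srv/www/secure",
    ("http", "hw.example.edu"): "/srv/www/homework",
}


def pick_document_root(scheme, host_header):
    """Return the document root for a request, or None if no site matches.

    host_header is the value of the HTTP Host header, which may carry a
    port ("www.example.edu:80"). IPv6 literals are not handled here.
    """
    host = host_header.split(":", 1)[0].strip().lower()
    return SITES.get((scheme.lower(), host))


print(pick_document_root("http", "www.example.edu"))
print(pick_document_root("https", "www.example.edu:443"))
print(pick_document_root("http", "unknown.example.edu"))
```

A real server would also define a default site for requests that match nothing, rather than returning None.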
The Scaling Dilemma
• If your site is not useful, it won’t be used much
• If your site is useful, it will be overwhelmed
• Horizontal: Use a cluster
• Vertical: Segregate by function
  – Web application
  – Database server / Web services server
Horizontal Scaling
• Round-robin name server records
C:\>nslookup google.com
Server: cns.s3woodstock.ga.atlanta.comcast.net

Name: google.com
Addresses: 72.14.207.99, 64.233.187.99, 64.233.167.99
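The effect of round-robin records can be sketched as handing out the listed addresses in rotation. This is a simplification under stated assumptions: a real name server typically rotates the order of the whole record list per query, and the client usually takes the first entry. The addresses are the ones from the nslookup output above; the function name is made up.

```python
from itertools import cycle

# Addresses returned for one name, as in the nslookup output.
ADDRESSES = ["72.14.207.99", "64.233.187.99", "64.233.167.99"]

_rotation = cycle(ADDRESSES)


def next_address():
    """Hand out the next server address in strict rotation."""
    return next(_rotation)


# Four consecutive lookups wrap around to the first address again.
picks = [next_address() for _ in range(4)]
print(picks)
```

Rotation spreads new visitors evenly but knows nothing about server load or health, which is one reason a load balancer is often preferred.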
DNS Caching
• DNS caching can defeat round-robins:
  1. Browser remembers last lookup
  2. DNS client in the OS the browser is using may remember
    – ipconfig /flushdns
  3. Forwarders may remember
Better answer:
• Hardware load balancer
Vertical Scaling
• Partition your service according to type of use:
• Static Web application
• Dynamic Web application
• Database server
• Media file server
Application State
• The Web is a stateless system
  – It doesn’t “remember” of itself where the conversation was previously
• Applications must add on state control
  – Cookies
  – Server-side information (tokens connect one page to the next)
• The state-management system can present a scalability problem
  – Might be the hardest to solve
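A minimal sketch of server-side state, assuming a random token that the browser sends back (for example in a cookie) with each request. The class and method names are hypothetical. Keeping the sessions in one process's memory, as done here, is exactly what becomes a scalability problem once several servers share the load.

```python
import secrets


class SessionStore:
    """In-memory session table: token -> per-visitor data."""

    def __init__(self):
        self._sessions = {}

    def start(self):
        """Create a new session and return its unguessable token."""
        token = secrets.token_urlsafe(32)
        self._sessions[token] = {}
        return token

    def load(self, token):
        """Return the session data for a token, or None if unknown."""
        return self._sessions.get(token)


store = SessionStore()
token = store.start()                # page 1: server issues the token
store.load(token)["step"] = "page2"  # page 2: token connects the requests
print(store.load(token))
print(store.load("forged-token"))
```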
Security
• Information is going over the Net
• Cross-scripting
• Partition your data so that “above the fold” items are not kept on the application server
Above the fold: So significant that a newspaper would put it on the top half of page 1
Secure Connections & Certificates
• SA Responsibility: Key management
  – Private part should not be on the same server!
• When is there a person available?
• CPU time can be an issue
  – If you are network bound, it is not an issue
Protecting Content
• Deny automatic directory generation
• Beware of directory traversal
• Use server-side to hide things. In PHP:
<html>
<head><title>Hiding the secret</title></head>
<body>
<P>No, you aren't going to see the secret!</P>
<?php
// Assigned on the server but never echoed, so it never reaches the browser
$a = "My Secret Code is 42";
?>
</body>
</html>
http://computing.southern.edu/jbeckett/secret.php
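For the directory traversal warning above, one common defense is to resolve the requested path and refuse anything that lands outside the document root. This is a sketch under assumptions: the function name and the /srv/www root are hypothetical, and a production server would rely on its own built-in checks as well.

```python
from pathlib import Path


def resolve_inside(doc_root, requested):
    """Map a requested URL path to a file path, refusing escapes.

    Raises PermissionError if the resolved path is outside doc_root,
    for example when the request contains "../" segments.
    """
    root = Path(doc_root).resolve()
    candidate = (root / requested.lstrip("/")).resolve()
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"refusing path outside document root: {requested}")
    return candidate


print(resolve_inside("/srv/www", "/img/logo.png"))
try:
    resolve_inside("/srv/www", "/../etc/passwd")
except PermissionError as err:
    print("blocked:", err)
```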
Cross-Scripting
• Visitor looks at your HTML, and creates their own HTML that mimics the parameters yours provides… except that it is hostile.
• Check the referrer information
• Double-validate
  – In the browser
  – In your back-end program
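The back-end half of double validation can be sketched as a function that accepts only what the real form could have produced, no matter what the browser-side checks claimed. The field names, allowed room codes, and name pattern below are assumptions for illustration.

```python
import re

# Hypothetical rules for a two-field form.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z .'-]{0,49}")
ALLOWED_ROOMS = {"A03", "B12"}
EXPECTED_FIELDS = {"name", "room"}


def validate_form(params):
    """Return a list of problems; an empty list means the input is acceptable."""
    errors = []
    if not NAME_PATTERN.fullmatch(params.get("name", "")):
        errors.append("name")
    if params.get("room") not in ALLOWED_ROOMS:
        errors.append("room")
    if set(params) - EXPECTED_FIELDS:
        errors.append("unexpected fields")
    return errors


print(validate_form({"name": "Ada", "room": "B12"}))
print(validate_form({"name": "<script>", "room": "Z99", "admin": "1"}))
```

Checking against a list of allowed values is generally safer than trying to list everything that is forbidden.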
SQL Injection
• Don’t use what people type in as SQL – it’s a very powerful language
• Choose which SQL elements to include based on the user’s choices
• Are you using SQL that is stored in a database?
  – Can somebody put hostile SQL in?
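The two pieces of advice above can be sketched with Python's built-in sqlite3 module: typed-in text is passed as a bound parameter, never pasted into the SQL, and the user's choice only selects among SQL fragments written by the programmer. The table, columns, and function name are made up for the example.

```python
import sqlite3

# The user's choice is looked up here; the SQL fragment itself is ours.
ALLOWED_SORT_COLUMNS = {"name": "name", "room": "room"}


def find_students(conn, name_fragment, sort_choice):
    """Search by name; user text is bound with '?', sort column is whitelisted."""
    column = ALLOWED_SORT_COLUMNS.get(sort_choice, "name")
    sql = f"SELECT name, room FROM students WHERE name LIKE ? ORDER BY {column}"
    return conn.execute(sql, (f"%{name_fragment}%",)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, room TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)", [("Ada", "B12"), ("Bob", "A03")]
)

print(find_students(conn, "Ada", "room"))
# A classic injection attempt is treated as plain text and matches nothing.
print(find_students(conn, "x' OR '1'='1", "name"))
```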
Limit Potential Damage
• When possible, your Web server should contain only a copy of the “real” data
• Perhaps the Web server should dish out static pages that are created when the underlying data changes
• Bonus of this technique: Performance improvement
• Use read-only mode when possible
• Use OS permissions, limit to least needed
• Log, log, log
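The static-page idea above can be sketched as a small publishing step that runs wherever the real data lives and rewrites the page only when its content changes. The page layout, function names, and file name are assumptions; the temporary directory stands in for the Web server's document root.

```python
import html
import tempfile
from pathlib import Path


def render_page(title, items):
    """Build a static HTML page, escaping the data so it cannot inject markup."""
    rows = "\n".join(f"<li>{html.escape(item)}</li>" for item in items)
    return (
        f"<html><head><title>{html.escape(title)}</title></head>\n"
        f"<body><ul>\n{rows}\n</ul></body></html>\n"
    )


def publish_if_changed(title, items, out_path):
    """Write the page only when it differs from what is already published."""
    page = render_page(title, items)
    path = Path(out_path)
    if path.exists() and path.read_text(encoding="utf-8") == page:
        return False
    path.write_text(page, encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "rooms.html"
    first = publish_if_changed("Rooms", ["A03", "B12"], target)
    second = publish_if_changed("Rooms", ["A03", "B12"], target)
    third = publish_if_changed("Rooms", ["A03", "B12", "C <lab>"], target)
    published = target.read_text(encoding="utf-8")

print(first, second, third)
```

With this arrangement the public server needs no database access at all, which is where both the damage limit and the performance bonus come from.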
Webmaster or SA?
• Webmaster should have the privileges he/she needs, and not more
• Consider establishing a separate host for the Web – perhaps even a separate security zone
• Text: Don’t become the Webmaster, let the company hire one.
Types of Web Changes
• Update – Newer material than what’s there
• Change – Revising structure
• Fix – Correcting improper contents or behavior
Three Web hosts?
• www-draft – where new things are developed
• www-qa – where new things are placed for verification before going “live”
• www – where the public sees things
What Is This Server For?
• Internal, external, or both?
• Specific application?
• Who will be using it?
• Who will be updating it?
• Uptime requirements?
• Account management?
• Storage needs?
• Traffic expectations?
Namespace Principles
• People expect URLs to keep working
• A URL should not have confidential info in it (duh)
  – Student ID numbers
• Use the Include capability of your server software to implement your namespace scheme
  – /etc/dav for instance
External Sites – 4 steps
Task                   Example for sgsdaschool.org
• Registering domain   • No-ip.com
• DNS hosting          • 204.15.252.7
• Web hosting          • 216.249.119.71
• Content              • /home/sgsdaschool
If you have content already online when the domain is registered, Web spiders will find you automatically.
Never pay money to be “listed in all the best search engines.” Worthwhile search engines will find you. Only thing better is “buying a word”.
Why Outsource Web Hosting?
Pros
• No local software to install or maintain
  – Already set up with most packages you need
• Dashboard makes it easy
• They specialize in Web apps
• Cost can be extremely low
Cons
• Batch transfers to the server take longer
• Lack of control
Web Application Authentication
• .htaccess/.htpasswd – Cumbersome to maintain, does not scale well
• PAM (Pluggable Authentication Modules)
• SQL lookup on external database
• Active Directory lookup
• Who is updating the password service?
Mashup Apps
• Quickly bring life to an idea
• Grab existing site and re-format data
  – Is it your data to control?
• XML / Web Services are used extensively for such apps
• Inherent inefficiency may produce scaling problems
• Ability to cache may address scaling problems
  – …and create propagation delay problems
http://hw.cs.southern.edu/Rooms/
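The caching trade-off on this slide can be sketched as a cache with a time-to-live: repeated requests within the TTL skip the slow upstream fetch, but changes upstream stay invisible until the entry expires. The class name, the fetch callback, and the 60-second TTL are assumptions for illustration.

```python
import time


class TTLCache:
    """Cache fetched values for ttl_seconds before asking upstream again."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch      # function that does the slow upstream call
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the behavior can be tested
        self._entries = {}       # key -> (time stored, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]      # fresh enough: no upstream traffic
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value


def slow_lookup(key):
    """Stand-in for a call to somebody else's Web service."""
    return f"data for {key}"


cache = TTLCache(slow_lookup, ttl_seconds=60)
print(cache.get("rooms"))  # fetched upstream
print(cache.get("rooms"))  # served from the cache
```

A longer TTL means less load on the source site and a longer propagation delay; the right value depends on how stale the data is allowed to be.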