Page 1
Architecting, Building and Deploying Successful Commercial Websites
Gist.com case study
Paul Finster – Chief Technology Officer
Dave Ekhaus - Director, Platform Engineering
NYU
Feb 29, 2000
Page 2
Architecting, Building and Deploying
Agenda
Part I – Hardware Configurations
  Server Farms
  Databases
  Paradigms
Part II – Software Technology
  Architectural elements
    Relational database
    Java Server Pages (JSP)
    Java Beans
  Testing & Deploying
Part III - Questions & Answers
Page 3
Part I
Hardware Configurations
Page 4
The Internet is here!
Informational websites are big today!
  Yahoo!
  Snap.
  MarketWatch
  Gist.com
Even 1% of Yahoo! traffic is a lot of traffic
  Gist.com runs tv.yahoo.com on 6 servers
Like e-Commerce, informational websites are mission-critical applications for the businesses and individuals that rely on them
  These are enterprise-class applications!
  Denial of Service attacks proved popular need
Page 5
What is the Commercial Website landscape?
The scale and dynamic nature of the web change everything
  Possibly 100’s of thousands of hits per day
  Dynamic customized content
  Huge peaks during certain times of day
Common platforms in use
  XML, WAP, HTTP
  XSL, CSS
Custom Applications
  TV Listings – built in-house
Generic Content Applications
  Newsfeeds
Page 6
What is the right Hardware Architecture?
Two major hardware philosophies/paradigms
  Many cheap redundant machines
    Example: Yahoo.com
      100’s of Intel BSD machines with a specially modified Apache web server
      Content stored in huge memory caches
    Cost Estimate: $2,000 per server
  Few expensive highly-reliable machines
    Example: IWON.com
      12 high-end Sun Solaris web servers
      Content stored in 2 parallel Oracle databases running on Sun E10000 servers
    Cost Estimate: $20,000-$100,000 per server
Page 7
Common Hardware Requirements
Co-location at data centers
  Exodus
  GlobalCenter
  Level3
  AboveNet
Hardware Load Balancing: Cisco, F5, Radware
Application level switches
Hi-speed virtual networks
Firewalls
Network Monitoring software
Enterprise Storage devices
Page 8
Typical N-tier Hardware Architecture
[Diagram with components: Firewall, Load Balancer, Web Servers (five), Database Servers (two), Enterprise Storage, Legacy Applications (if any)]
Page 9
Gist’s Hardware Architecture
[Diagram with components: Cisco PIX Firewall, RadWare WSD Load Balancer, NT Web Servers (five), ISAPI DLL, SQL Server 7.0 Database Servers (two), EMC Enterprise Storage]
Page 10
Part II
Software Technology
Page 11
Gist’s Application Framework
[Layered diagram with labels:
  Website Production Process: Templates, Scripting Tool: JSP, Java Beans, Gist DevTools
  Applications: TV Listings, GRID, Bulletin Boards, Other Applications, Custom App, Process Step
  Services & APIs: Cookie-based Server-side Sessions, JSP Adapter, SQL Abstraction, Content Interface
  System Independence: OS Drivers, Web Drivers, JDBC Drivers, Article Drivers
  Data Sources: NT, Solaris, IIS, NSAPI, SQL Server & Informix, Oracle & Sybase, Filesystem]
Page 12
Java Code Samples
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<%@ page import="gist.external.gistcom.*,gist.*" %>
<jsp:useBean id="adf" scope="session" class="gist.internal.publishing.ADFObject"></jsp:useBean>
<%
    // Send anonymous (default) users to the login page first
    UserObject u = UserObject.getUser(request, response);
    if (u.isDefault()) {
        response.sendRedirect("/tv/login.jsp?nexturl=/tv/channels.jsp");
        return;
    }
    // Fall back to the channels page when no target URL was supplied
    String nexturl = request.getParameter("nexturl");
    if (nexturl == null)
        nexturl = "/tv/channels.jsp";
%>
<%@ include file="/tv/templates/global.jsp" %>
.
.
Channel channelsDisplay[] = u.getSortedChannels();
String c_name = null;
for (int i = 0; i < channelsDisplay.length; i++) {
    // Mark the checkbox of every visible channel as checked
    if (channelsDisplay[i].isVisible()) {
        chanchecked = " CHECKED ";
    } else {
        chanchecked = "";
    }
}
Page 13
Technology
Architecting
Page 14
Architecture Requirements
Scalability - performance, growth
Security - authentication, access control, privacy
Management - monitoring, dynamic configuration
Availability - fault tolerance
Stability - data integrity
Portability - OS, DB, WS independence
Extensibility - ability to adapt to changes in technology
Integration - integration of RDBMS and legacy systems
Page 15
Scalability
1000’s of “transactions” per minute
  Sub-second response time
  Database connection pooling
Performance
  Concurrency – multithreading
    Inherent support in Java
Growth
  Load balancing
  Want to be able to “Throw” hardware at the problem
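The idea behind database connection pooling can be sketched roughly as follows. This is an illustration only, not Gist's implementation: the class `SimplePool` and its method names are hypothetical, and it uses `java.util.concurrent`, which postdates this talk. In practice the pooled type would be `java.sql.Connection`; a plain generic type keeps the sketch runnable without a database.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical fixed-size pool; T would be java.sql.Connection in practice. */
class SimplePool<T> {
    private final BlockingQueue<T> idle;

    /** Creates all pooled resources up front so requests never pay the setup cost. */
    SimplePool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /** Waits up to timeoutMillis for a free resource; returns null on timeout or interrupt. */
    T borrow(long timeoutMillis) {
        try {
            return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return null;
        }
    }

    /** Hands a borrowed resource back so another request thread can reuse it. */
    void giveBack(T resource) {
        idle.offer(resource);
    }

    /** Number of resources currently idle in the pool. */
    int available() {
        return idle.size();
    }
}
```

Bounding the pool caps the load any one web server can place on the database, which is one reason pooling tends to pair well with adding more web servers behind a load balancer.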
Page 16
Security
Authentication
  Identifying the user via Cookies
  Architecture supports “cookie-less” mode with URL re-writing of session parameters
Access Control
  What is the user permitted to do?
  Ordering PPV over the web; credit card numbers
  Attributes of our UserObject
Privacy (if required)
  Encryption and hashing - RC4, MD5
  SSL
  URL rewriting
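A minimal sketch of what “cookie-less” mode with URL re-writing might look like. The helper `SessionUrls.withSession` and the parameter name `sid` are assumptions made for illustration and do not come from the slides. Servlet containers offer a comparable facility through `HttpServletResponse.encodeURL`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper: carries the session id in the URL when no cookie is available. */
class SessionUrls {
    static String withSession(String url, String sessionId, boolean hasCookie) {
        if (hasCookie || sessionId == null) {
            return url; // the cookie already identifies the user
        }
        // Append with '&' if the URL already has a query string, otherwise start one
        String separator = url.indexOf('?') >= 0 ? "&" : "?";
        return url + separator + "sid=" + encode(sessionId);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }
}
```

Every link a page emits has to pass through such a helper, otherwise the session is lost on the first click that skips it.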
Page 17
Management
Monitoring
  Consistent logging and reporting of system activity
  One way: Extensive use of site-wide email diagnostics
Enterprise Integration – SNMP
  Integration with load balancer
  Rebooting crashed servers (NT primarily)
    Automated
    Manual (if all else fails)
Dynamic Configuration
  Adding/Removing new features on-the-fly
  Incremental updating of site content
  Incremental Database updates
Page 18
Availability
100% Availability
  24x7x365 Operations
Fault Tolerance
  Hardware solutions
    Backup servers
  Software solutions
    Dynamic database connections
Page 19
Stability
Data Integrity
  Support for “transactions”
  Redundant databases
Protection
  Isolation of subsystems and application execution
  Java sandbox and exception handling
Resource Recovery
  Database Connectivity
  System Resources – memory
    Java memory garbage collection
Page 20
Portability
OS independence
  NT
  Various flavors of UNIX
  LINUX
DB independence
  SQL Server, Sybase, Oracle, Informix
WS independence
  Netscape Enterprise Server, Microsoft IIS, Apache
Page 21
Extensibility
Adapt to Changes in Business
  How well does the architecture allow you to change the content or navigation of your commercial website?
  Does the architecture support your current legacy systems?
  Does the architecture provide for Content and/or Editorial changes?
Adapt to Changes in Technology
  How quickly can you leverage new standards?
    XML
    WAP
    WDL
Page 22
Integration
Partner Advertising
  Co-branded websites with differing ad serving ratios
Statistical Processing/Analysis
  MarketWave statistics
Navigation controls
  Partner cookies versus Gist.com cookies
  URL links back and forth between partners
Billing Partners
  Measuring Page views
Page 23
What we’ve learned
Prototype ASAP in order to discover architectural dependencies
Database statements
  Be specific in your SQL
Test more: Then test again!
Keep objects as light as possible
Never store moving data
  Member Age vs. Birthdate
  Channels Change: Excluded channels
User Migration is HARD!
  Incremental vs. batch
Get training
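The “Member Age vs. Birthdate” lesson can be illustrated with a short sketch: store the fixed fact (the birthdate) and derive the moving value (the age) on demand. The `Member` class below is hypothetical and uses the modern `java.time` API rather than anything available when this talk was given.

```java
import java.time.LocalDate;
import java.time.Period;

/** Hypothetical member record: the birthdate is stored, the age never is. */
class Member {
    private final LocalDate birthdate;

    Member(LocalDate birthdate) {
        this.birthdate = birthdate;
    }

    /** Age in completed years on the given day, computed rather than stored. */
    int ageOn(LocalDate today) {
        return Period.between(birthdate, today).getYears();
    }
}
```

A stored age goes stale on every birthday; a stored birthdate never needs a batch update.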
Page 24
Technology
Building
Page 25
Development
Prototypes
  Work closely with partners to determine functionality
  Prototype Deep, not Wide
Process
  Small Development Teams (2-4 people)
    Include: Designers, Technical producers
  Develop Components in Parallel
  Frequent Releases
    3-6 day development cycles
  Scoping - Controlled Feature Set
Page 26
Technology
Testing & Deploying
Page 27
Testing
Quality
  Unit Test - thread safety, code coverage
  Smoke Test - quick validation
  BVT - build validation test
  Full Functional Test
  Regression Test - consistent functionality
  Load Test - high availability
  Benchmark - by Platform
  Installation Test - by Platform
Automated Tools
  Repeatability
  Consistent and Isolated Environment
Metrics
  Measure real world scenarios
  Load test specific subsystems
Page 28
Deploying
Capacity Planning - sizing exercise
Beta Testing - early, well defined subset
Focus Groups – early feedback
Performance - simulating real world load
Benchmarking - critical areas to measure
Maintenance - staging environment, versioning
Page 29
Capacity Planning
Sizing Exercise
  What workloads run at each node?
  What hardware is needed to maintain service due to workload growth?
  How many more users can each existing server support?
  How will server utilization be impacted if the number of transactions increases by n%?
  What are those “transactions” doing?
Page 30
Performance
Simulating Real World Load
  This is a Challenge
  Analyze existing system (web server logs)
  Forecast activity by looking at competitors
  Number of registered users
  “Transactions” per minute
  Database access requirements
  Legacy connection requirements
  Networking requirements
Principles of Algorithms really matter
Page 31
Performance (continued)
Scalability and Fail-over are required for 24x7x365 availability
Determine appropriate hardware architecture - maximum acceptable response time, target server CPU utilization at 80% (leave room for growth)
Determine # and type of transactions - reading web pages, executing a query, updating a database, searching, sorting
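The sizing questions and the 80% utilization target lend themselves to simple arithmetic. The sketch below is a rough first-order model, not a method from the talk: it assumes load spreads evenly across servers and that utilization scales linearly with transaction volume, and the class and method names are hypothetical.

```java
/** Hypothetical back-of-the-envelope sizing helpers. */
class Sizing {
    /**
     * Servers needed so that the peak load keeps each server at or below
     * targetUtilization (for example 0.80) of its measured capacity.
     */
    static int serversNeeded(double peakTxPerMinute,
                             double txPerMinutePerServer,
                             double targetUtilization) {
        double usableCapacity = txPerMinutePerServer * targetUtilization;
        return (int) Math.ceil(peakTxPerMinute / usableCapacity);
    }

    /** Projected utilization after volume grows by growthPercent, assuming linear scaling. */
    static double utilizationAfterGrowth(double currentUtilization, double growthPercent) {
        return currentUtilization * (1.0 + growthPercent / 100.0);
    }
}
```

For example, a peak of 3,000 transactions per minute on servers benchmarked at 1,000 per minute each calls for four servers at an 80% target, since each server is only counted on for 800. The per-server figure has to come from load tests of the actual transaction mix.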
Page 32
Benchmarking “Transactions”
Home Page
  Many hits, as light as possible
Grid Page
  Many hits, as fast as possible
Article Pages
  Remove archive links
Soaps Updates Pages
  Pre-compile pages if possible
User Registration Transactions
  As clear as possible
The answer to many problems is “caching”!
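One common way to act on the caching advice is a small in-memory cache with least-recently-used eviction. The sketch below is an assumption about how such a cache might look, not Gist's design. `PageCache` is a hypothetical name, and the class relies on the standard `LinkedHashMap` access-order mode and its `removeEldestEntry` hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache for rendered pages or query results. */
class PageCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    PageCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, oldest access first
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each put; evicts the least recently used entry when full. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

This class is not thread-safe on its own. On a multithreaded web server it would need wrapping, for example with `Collections.synchronizedMap`, along with a policy for expiring entries when the underlying listings change.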
Page 33
Maintenance
Staging Environment
  Mirrored hardware/software
  Separate Database
  Migration strategy
Versioning
  Component Version Control
  Change Management
Page 34
Part III
Questions & Answers
Page 35
Architecting, Building and Deploying Successful Commercial Websites
Paul Finster – Chief Technology Officer
[email protected]
Dave Ekhaus - Director, Platform Engineering
[email protected]
NYU
Feb 29, 2000