FAQ

Date post: 10-May-2017
Upload: mansha99
Page 1: FAQ

Q: What is a servlet?

A: A servlet is a Java class used to extend the capabilities of servers that host applications accessed through a request-response programming model. Before servlets, CGI (Common Gateway Interface) scripts were the usual way to do server-side programming.

Q: What is the use of a servlet?

A: Uses of servlets include:

Processing and storing data submitted by an HTML form.

Providing dynamic content.

Handling multiple requests concurrently, which helps when building high-performance systems.

Managing state information on top of the stateless HTTP protocol.

Q: What is the life cycle of servlet?

A: Life cycle of Servlet:

Servlet class loading

Servlet instantiation

Initialization (call the init method)

Request handling (call the service method)

Removal from service (call the destroy method)
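
The call order in this life cycle can be sketched in plain Java. This is only a toy model under stated assumptions: `ToyServlet` and `ToyContainer` are hypothetical names, not the real `javax.servlet` types, and the "container" here loads the servlet lazily on the first request.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these are NOT the javax.servlet types.
class LifecycleDemo {
    static class ToyServlet {
        final List<String> events = new ArrayList<>();

        ToyServlet() { events.add("constructor"); }       // instantiation
        void init() { events.add("init"); }               // called once
        void service(String request) {                    // called once per request
            events.add("service:" + request);
        }
        void destroy() { events.add("destroy"); }         // called once
    }

    static class ToyContainer {
        private ToyServlet servlet;   // a single instance serves every request

        void handle(String request) {
            if (servlet == null) {    // load and initialize on the first request
                servlet = new ToyServlet();
                servlet.init();
            }
            servlet.service(request);
        }

        // Removes the servlet from service and returns what happened to it.
        List<String> shutdown() {
            List<String> events = new ArrayList<>();
            if (servlet != null) {
                servlet.destroy();
                events = servlet.events;
                servlet = null;
            }
            return events;
        }
    }

    public static void main(String[] args) {
        ToyContainer container = new ToyContainer();
        container.handle("/a");
        container.handle("/b");
        System.out.println(container.shutdown());
    }
}
```

The point of the sketch is that the constructor and init run once, service runs once per request on the same instance, and destroy runs once at the end.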

Q: Why do we need a constructor in a servlet if we use init()?

A: Even though a servlet has an init() method that is called to initialize it, a constructor is still required to instantiate the servlet. You as the developer never need to call the servlet's constructor explicitly, but the container still uses it.

Q: How is a servlet loaded?

A: A servlet is loaded when:

The first request for it is made.

The server starts up (auto-load).

An administrator loads it manually.

There is only a single instance, which answers all requests concurrently. This saves memory and allows a servlet to easily manage persistent data.

Q: When is a servlet unloaded?

A: A servlet is unloaded when:

Server shuts down.

Administrator manually unloads.

Q: What is servlet interface?

A: The central abstraction in the Servlet API is the Servlet interface. All servlets implement this interface,

either directly or more commonly by extending a class that implements it.

Q: What is the generic servlet class?

A: GenericServlet is an abstract class that implements the Servlet interface and the ServletConfig

interface. In addition to the methods declared in these two interfaces, this class also provides simple

versions of the lifecycle methods init () and destroy (), and implements the log method declared in the

ServletContext interface.

Q: What is the difference between GenericServlet and HttpServlet?

A: The difference is:

GenericServlet is an abstract, protocol-independent class. HttpServlet extends GenericServlet and provides a framework for handling the HTTP protocol.

GenericServlet does not include protocol-specific methods for handling request parameters, cookies, sessions and setting response headers. HttpServlet's service() method passes each request to the relevant handler, such as doGet() or doPost().

GenericServlet is not specific to any protocol. HttpServlet supports only HTTP and HTTPS.

Q: Why is the HttpServlet class declared abstract?

A: The HttpServlet class is declared abstract because its default handler methods (doGet(), doPost() and so on) do no useful work: they simply report that the HTTP method is not supported, so a subclass must override at least one of them. It is a convenience implementation of the Servlet interface, which means that developers do not need to implement every handler.

If your servlet only has to handle GET requests, for example, there is no need to write a doPost() method too.

Q: Can servlet have a constructor?

A: Yes

Q: What types of protocol are supported by HttpServlet?

A: HttpServlet extends the GenericServlet base class and provides a framework for handling the HTTP protocol, so it supports only HTTP and HTTPS.

Q: What is the difference between doGet() and doPost()?

A: The difference is:

With GET, the parameters are appended to the URL as a query string. With POST, the parameters are sent in the request body, so they do not show up in the URL bar.

The amount of data you can send with GET is limited, because browsers and servers cap the length of a URL (a figure of 1024 characters is often quoted, but the actual limit depends on the browser and server). POST can carry much more data and is not restricted to text: it is possible to send files and even binary data such as serialized Java objects.

doGet() handles a request for information and should not change anything on the server (GET should be safe and idempotent). doPost() receives information (such as an order for merchandise) that the server is expected to remember.

Q: When should I use doGet() and when doPost()?

A: Prefer GET for simple, repeatable requests (its responses can be cached and its URLs bookmarked), except in the following cases:

The data is sensitive.

The data is too large to fit comfortably in a URL.

The request does not need to be bookmarked.

Q: How do I support both doGet() and doPost() from the same servlet?

A: The easy way is to implement doPost(), then have your doGet() method call your doPost() method.

Q: Should I override the service () method?

A: We never override the service method, since the HTTP Servlets have already taken care of it. The

default service function invokes the doXXX() method corresponding to the method of the HTTP request.

For example, if the HTTP request method is GET, doGet () method is called by default.

A servlet should override the doXXX() method for the HTTP methods that servlet supports. Because

HTTP service method checks the request method and calls the appropriate handler method, it is not

necessary to override the service method itself. Only override the appropriate doXXX() method.
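
The dispatch described here can be modelled in a few lines of plain Java. This is a toy sketch, not the servlet API: `MiniHttpServlet` and `OrderServlet` are hypothetical names, the handlers take and return strings instead of request and response objects, and only GET and POST are modelled.

```java
// Toy model of how an HTTP servlet's service() method picks a handler.
// The real class is javax.servlet.http.HttpServlet.
class DispatchDemo {
    static abstract class MiniHttpServlet {
        // Subclasses normally leave this alone and override doGet/doPost.
        String service(String httpMethod, String body) {
            switch (httpMethod) {
                case "GET":  return doGet(body);
                case "POST": return doPost(body);
                default:     return "501 Not Implemented"; // toy model: GET and POST only
            }
        }

        // Defaults reject the request until a subclass overrides them.
        String doGet(String body)  { return "405 Method Not Allowed"; }
        String doPost(String body) { return "405 Method Not Allowed"; }
    }

    static class OrderServlet extends MiniHttpServlet {
        @Override String doPost(String body) { return "200 handled: " + body; }

        // Support GET as well by delegating to the POST handler.
        @Override String doGet(String body) { return doPost(body); }
    }

    public static void main(String[] args) {
        MiniHttpServlet servlet = new OrderServlet();
        System.out.println(servlet.service("GET", "item=42"));
        System.out.println(servlet.service("POST", "item=42"));
    }
}
```

Because the base class already routes by HTTP method, the subclass only supplies the handlers it cares about, which is why overriding service() itself is rarely needed.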

Q: What is the ServletContext?

A: A servlet context object contains the information about the Web application of which the servlet is a

part. It also provides access to the resources common to all the servlets in the application. Each Web

application in a container has a single servlet context associated with it.

Q: What is the difference between the ServletConfig and ServletContext interface?

A: The ServletConfig interface is implemented by the servlet container in order to pass configuration

information to a servlet. The server passes an object that implements the ServletConfig interface to the

servlet's init () method. A ServletContext defines a set of methods that a servlet uses to communicate

with its servlet container.

Q: What is the difference between forward () and sendRedirect ()?

A: The difference is:

A forward is performed internally on the server. A redirect is a two-step process, in which the web application instructs the browser to fetch a second URL, which differs from the original.

With a forward, the browser is completely unaware that it has taken place, so its original URL remains intact. With sendRedirect(), the browser does the work and knows that it is making a new request.

Q: What is the difference between forward() and include()?

A: The RequestDispatcher include() method inserts the contents of the specified resource directly in the

flow of the servlet response, as if it were part of the calling servlet. The RequestDispatcher forward()

method is used to show a different resource in place of the servlet that was originally called.

Q: What is the use of servlet wrapper classes?

A: The HttpServletRequestWrapper and HttpServletResponseWrapper classes are designed to make it

easy for developers to create custom implementations of the servlet request and response types.

The classes are constructed with the standard HttpServletRequest and HttpServletResponse instances

respectively and their default behaviour is to pass all method calls directly to the underlying objects.
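
The wrapper idea (pass everything through by default, override only what you need) can be sketched without the servlet API. In this sketch `MiniRequest`, `MapRequest`, `MiniRequestWrapper` and `TrimmingRequest` are hypothetical stand-ins for HttpServletRequest, a container-supplied request, HttpServletRequestWrapper and a custom subclass.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the wrapper (decorator) pattern; not the servlet API.
class WrapperDemo {
    interface MiniRequest {
        String getParameter(String name);
    }

    // Stands in for the request object the container would supply.
    static class MapRequest implements MiniRequest {
        private final Map<String, String> params;
        MapRequest(Map<String, String> params) { this.params = params; }
        public String getParameter(String name) { return params.get(name); }
    }

    // Default behaviour: pass every call straight to the wrapped object.
    static class MiniRequestWrapper implements MiniRequest {
        private final MiniRequest delegate;
        MiniRequestWrapper(MiniRequest delegate) { this.delegate = delegate; }
        public String getParameter(String name) { return delegate.getParameter(name); }
    }

    // Custom behaviour: override only the method that needs to change.
    static class TrimmingRequest extends MiniRequestWrapper {
        TrimmingRequest(MiniRequest delegate) { super(delegate); }
        @Override public String getParameter(String name) {
            String value = super.getParameter(name);
            return value == null ? null : value.trim();
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("user", "  alice  ");
        MiniRequest request = new TrimmingRequest(new MapRequest(params));
        System.out.println("[" + request.getParameter("user") + "]");
    }
}
```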

Q: What is a deployment descriptor?

A: A deployment descriptor is an XML document with an .xml extension. It defines a component's

deployment settings. It declares transaction attributes and security authorization for an enterprise bean.

The information provided by a deployment descriptor is declarative and therefore it can be modified

without changing the source code of a bean.

Q: What is the preinitialization of a servlet?

A: By default, a container does not initialize servlets as soon as it starts up; it initializes a servlet when it receives the first request for that servlet. This is called lazy loading.

The servlet specification defines the <load-on-startup> element, which can be specified in the deployment descriptor to make the servlet container load and initialize the servlet as soon as it starts up. The process of loading a servlet before any request comes in is called preloading or preinitializing a servlet.

Q: What is the <load-on-startup> element?

A: The <load-on-startup> element of a deployment descriptor is used to load a servlet when the server starts instead of waiting for the first request. Its integer value also specifies the order in which servlets are loaded: servlets with lower non-negative values are loaded first.

Q: What is session?

A: A session refers to all the requests that a single client might make to a server in the course of viewing

any pages associated with a given application. Sessions are specific to both the individual user and the

application.

Q: What is the session tracking?

A: Session tracking is a mechanism that servlets use to maintain state about a series of requests from

the same user (requests originating from the same browser) across some period of time.

Q: What is the need of session tracking in web application?

A: HTTP is a stateless protocol. Every request is treated as new request. For web applications to be more

realistic they have to retain information across multiple requests. Such information which is part of the

application is referred as "state". To keep track of this state we need session tracking.

Q: What are the different types of session tracking?

A: Different types are:

URL rewriting

Hidden Form Fields

Cookies

Secure Socket Layer (SSL) Sessions

Q: How do I use cookies to store session state on the client?

A: In a servlet, the HttpServletRequest and HttpServletResponse objects passed to HttpServlet.service() can be used to create cookies on the client and to read the cookie information transmitted with client requests. JSPs can also use cookies, in scriptlet code or, preferably, from within custom tag code.

To set a cookie on the client, use the addCookie() method of HttpServletResponse. Multiple cookies may be set for the same request, and a single cookie name may have multiple values.

To get all of the cookies associated with a single HTTP request, use the getCookies() method of HttpServletRequest.
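
The cookies a servlet reads arrive in the HTTP Cookie request header as name=value pairs separated by semicolons. The sketch below splits such a header value in plain Java; `CookieHeaderDemo` and `parse()` are hypothetical helpers for illustration, and a real servlet would simply call request.getCookies().

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: split the value of a "Cookie" request header into name/value pairs.
// If a name repeats, this simple version keeps the last value seen.
class CookieHeaderDemo {
    static Map<String, String> parse(String headerValue) {
        Map<String, String> cookies = new LinkedHashMap<>();
        for (String pair : headerValue.split(";")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue;                       // not a name=value pair
            }
            String name = pair.substring(0, eq).trim();
            if (name.isEmpty()) {
                continue;                       // ignore pairs without a name
            }
            cookies.put(name, pair.substring(eq + 1).trim());
        }
        return cookies;
    }

    public static void main(String[] args) {
        System.out.println(parse("user=alice; theme=dark"));
    }
}
```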

Q: What are the advantages of storing session state in cookies?

A: Cookies are usually persistent, so for low-security sites, user data that needs to be stored long-term

(such as a user ID, historical information, etc.) can be maintained easily with no server interaction. For

small- and medium-sized session data, the entire session data (instead of just the session ID) can be kept

in the cookie.

Q: What is URL rewriting?

A: URL rewriting is a method of session tracking in which some extra data is appended at the end of each

URL. This extra data identifies the session. The server can associate this session identifier with the data it

has stored about that session.
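
As a rough illustration of URL rewriting, the sketch below appends a session identifier as a ";jsessionid=" path parameter ahead of any query string. `UrlRewriteDemo` and `rewrite()` are hypothetical names; in a servlet this job is done by HttpServletResponse.encodeURL(), and this simplified version ignores URL fragments.

```java
// Sketch of URL rewriting: carry the session identifier inside the URL itself.
class UrlRewriteDemo {
    static String rewrite(String url, String sessionId) {
        int queryStart = url.indexOf('?');
        if (queryStart < 0) {
            // No query string: append the session identifier at the end.
            return url + ";jsessionid=" + sessionId;
        }
        // Insert the session identifier between the path and the query string.
        return url.substring(0, queryStart) + ";jsessionid=" + sessionId
                + url.substring(queryStart);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("/shop/cart", "ABC123"));
        System.out.println(rewrite("/shop/cart?item=42", "ABC123"));
    }
}
```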

Q: How can a session be destroyed in a servlet?

A: By calling the session.invalidate() method.

Q: What is servlet lazy loading?

A: A container does not initialize the servlets as soon as it starts up; it initializes a servlet when it

receives a request for that servlet first time. This is called lazy loading.

Q: What is servlet chaining?

A: Servlet Chaining is a method where the output of one servlet is piped into a second servlet. The

output of the second servlet could be piped into a third servlet, and so on. The last servlet in the chain

returns the output to the Web browser

Q: What is filter?

A: Filters are Java components that are used to intercept an incoming request to a Web resource and a

response sent back from the resource. It is used to abstract any useful information contained in the

request or response.

Q: What are the advantages of jsp over servlet?

A: The advantage of JSP is that they are document-centric. Servlets, on the other hand, look and act like

programs. A Java Server Page can contain Java program fragments that instantiate and execute Java

classes, but these occur inside an HTML template file and are primarily used to generate dynamic

content.

Some of the JSP functionality can be achieved on the client, using JavaScript. The power of JSP is that it

is server-based and provides a framework for Web application development.

Q: What is the life cycle of jsp?

A: Life cycle of a JSP:

Translation

Compilation

Loading the class

Instantiating the class

jspInit()

_jspService()

jspDestroy()

Q: What is the jspInit() method?

A: The jspInit() method of the javax.servlet.jsp.JspPage interface is similar to the init() method of

servlets. This method is invoked by the container only once when a JSP page is initialized. It can be

overridden by a page author to initialize resources such as database and network connections, and to

allow a JSP page to read persistent configuration data.

Q: What is the _jspService ()?

A: The _jspService() method of the javax.servlet.jsp.HttpJspPage interface is invoked every time a new

request comes to a JSP page. This method takes the HttpServletRequest and HttpServletResponse

objects as its arguments. A page author cannot override this method, as its implementation is provided

by the container.

Q: What is the jspDestroy ()?

A: The jspDestroy() method of the javax.servlet.jsp.JspPage interface is invoked by the container when a

JSP page is about to be destroyed. This method is similar to destroy() method of servlets. It can be

overridden by a page author to perform any cleanup operation such as closing a database connection.

Q: What jsp life cycle method can I override?

A: You cannot override the _jspService() method within a JSP page. You can, however, override the jspInit() and jspDestroy() methods within a JSP page. jspInit() can be useful for allocating resources such as database connections and network connections for the JSP page. It is good programming practice to free any allocated resources within jspDestroy().

Q: What are implicit objects in jsp?

A: Implicit objects in JSP are the Java objects that the JSP Container makes available to developers in

each page. These objects need not be declared or instantiated by the JSP author. They are automatically

instantiated by the container and are accessed using standard variables; hence, they are called implicit

objects.

Q: How many implicit objects are available in jsp?

A: Nine implicit objects are available in JSP:

request

response

pageContext

session

application

out

config

page

exception

Q: What are jsp directives?

A: JSP directives are messages for the JSP engine. i.e., JSP directives serve as a message from a JSP page

to the JSP container and control the processing of the entire page.

They are used to set global values such as a class declaration, method implementation, output content

type, etc. They do not produce any output to the client.

Q: What is page directive?

A: About the page directive:

A page directive informs the JSP engine about the headers or facilities that the page should get from the environment.

The page directive is found at the top of almost all JSP pages.

There can be any number of page directives within a JSP page (although each attribute-value pair must be unique).

The syntax of the page directive is: <%@ page attribute="value" %>

Q: What are the attributes of page directive?

A: There are thirteen attributes defined for a page directive, of which the important ones are as follows:

import: It specifies the packages that are to be imported.

session: It specifies whether session data is available to the JSP page.

contentType: It allows a user to set the content type for a page.

isELIgnored: It specifies whether EL expressions are ignored when a JSP is translated to a servlet.

Q: What is the include directive?

A: Include directive is used to statically insert the contents of a resource into the current JSP. This

enables a user to reuse the code without duplicating it, and includes the contents of the specified file at

the translation time.

Q: What are the jsp standard actions?

A: The JSP standard actions affect the overall runtime behaviour of a JSP page and also the response

sent back to the client. They can be used to include a file at the request time, to find or instantiate a Java

Bean, to forward a request to a new page, to generate a browser-specific code, etc.

Q: What are the standards actions available in jsp?

A: The standards actions include:

<jsp:include>

<jsp:forward>

<jsp:useBean>

<jsp:setProperty>

<jsp:getProperty>

<jsp:param>

<jsp:plugin>

Q: What is the <jsp:useBean> standard action?

A: The <jsp:useBean> standard action is used to locate an existing JavaBean or to create one if it does not exist. It has attributes to identify the object instance, to specify the lifetime (scope) of the bean, and to specify the fully qualified class name and type.

Q: What scopes are available in <jsp:useBean>?

A: The scopes are:

page scope

request scope

session scope

application scope

Q: What is the <jsp:forward> standard action?

A: The <jsp:forward> standard action forwards the request from a JSP page to another resource, such as a JSP page or a servlet. The execution of the current page is stopped and control is transferred to the forwarded page.

Q: What is the <jsp:include> standard action?

A: The <jsp:include> standard action enables the current JSP page to include a static or a dynamic resource at runtime. In contrast to the include directive, the include action is used for resources that change frequently. The resource to be included must be in the same context.

Q: What is the difference between include directive and include action?

A: The difference is:

Include directive, includes the content of the specified file during the translation phase–when the page

is converted to a servlet. Include action, includes the response generated by executing the specified

page (a JSP page or a servlet) during the request processing phase–when the page is requested by a

user.

Include directive is used to statically insert the contents of a resource into the current JSP. Include

standard action enables the current JSP page to include a static or a dynamic resource at runtime.

Q: What is the difference between pageContext.include() and <jsp:include>?

A: The <jsp:include> standard action and the pageContext.include() method are both used to include resources at runtime. However, the pageContext.include() method always flushes the output of the current page before including the other component, whereas <jsp:include> flushes the output of the current page only if the value of flush is explicitly set to true.

Q: What is the <jsp:setProperty> action?

A: You use <jsp:setProperty> to give values to properties of beans that have been referenced earlier.

Q: What is the <jsp:getProperty> action?

A: The <jsp:getProperty> action is used to access the properties of a bean that was set using the action. The container converts the property to a String as follows:

If it is an object, it uses the toString() method to convert it to a String. If it is a primitive, it converts it directly to a String using the valueOf() method of the corresponding wrapper class.

The syntax of the <jsp:getProperty> action is: <jsp:getProperty name="Name" property="Property" />
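
The two conversion paths can be illustrated in plain Java. Using String.valueOf() for primitives is an assumption about what "converted directly" looks like in code; a real container may generate something different. `PropertyToStringDemo` and `Point` are hypothetical names.

```java
// Sketch of converting bean property values to text, as described above.
class PropertyToStringDemo {
    // A hypothetical bean property type with its own toString().
    static class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
        @Override public String toString() { return "(" + x + ", " + y + ")"; }
    }

    static String asText(int value)     { return String.valueOf(value); }  // primitive int
    static String asText(boolean value) { return String.valueOf(value); }  // primitive boolean
    static String asText(Object value)  { return value.toString(); }       // object property

    public static void main(String[] args) {
        System.out.println(asText(42));
        System.out.println(asText(true));
        System.out.println(asText(new Point(1, 2)));
    }
}
```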

Q: What is the <jsp:param> standard action?

A: The <jsp:param> standard action is used with <jsp:include> or <jsp:forward> to pass parameter names and values to the target resource.

Q: What is the <jsp:plugin> action?

A: This action lets you insert the browser-specific OBJECT or EMBED element needed to specify that the browser run an applet using the Java plugin.

Q: What is the scripting element?

A: JSP scripting elements let you insert Java code into the servlet that will be generated from the current

JSP page.

Expressions

Scriptlet

Declarations

comment

Q: What is the scriptlet?

A: A scriptlet contains Java code that is executed every time a JSP is invoked. When a JSP is translated to a servlet, the scriptlet code goes into the _jspService() method.

Hence, variables declared in a scriptlet are local to the _jspService() method. A scriptlet is written between the <% and %> tags and is executed by the container at request processing time.

Q: What is the jsp declaration?

A: JSP declarations are used to declare class-level variables and methods in a JSP page. They are initialized when the class is initialized. Anything defined in a declaration is available for the whole JSP page. A declaration block is enclosed between the <%! and %> tags. A declaration is not included in the _jspService() method when a JSP is translated to a servlet.

Q: What is the jsp expression?

A: A JSP expression is used to write output without using the out.print statement. It can be thought of as shorthand for a scriptlet that prints a value. An expression is written between the <%= and %> tags. The expression must not end with a semicolon, because the container wraps it in an out.print() call when generating the servlet.

Q: How is scripting disabled?

A: Scripting is disabled by setting the scripting-invalid element of the deployment descriptor to true. It is

a subelement of jsp-property-group. Its valid values are true and false.

Q: Why does _jspService() start with '_'?

A: The _jspService() method is written by the container, and methods that are not meant to be overridden by the page author are conventionally given names starting with '_'. This is why we do not override the _jspService() method in any JSP page.

Q: How to pre-compile jsp?

A: Add jsp_precompile as a request parameter and send a request to the JSP file. This will make the jsp

pre-compile. http://localhost:8080/jsp1/test.jsp?jsp_precompile=true

It causes execution of JSP life cycle until jspInit() method without executing _jspService() method.

Q: What is the benefit of pre-compile jsp page?

A: It removes the start-up lag that occurs when a container must translate a JSP page upon receipt of the

first request.

Q: What is the difference between a variable declared inside the declaration tag and a variable declared in a scriptlet?

A: A variable declared inside a declaration is treated as an instance variable and is placed directly at class level in the generated servlet. A variable declared in a scriptlet is placed inside the _jspService() method of the generated servlet, where it acts as a local variable.
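
The practical effect can be seen in a hand-written approximation of the generated servlet. The class below imagines a JSP containing the declaration `<%! int hits = 0; %>` and the scriptlet `<% int local = 0; %>`; `GeneratedPageSketch` is a hypothetical name, and real generated code looks different and takes request and response arguments.

```java
// Approximation of a servlet generated from a JSP, for illustration only.
class GeneratedPageSketch {
    int hits = 0;                   // from the declaration: instance variable

    String _jspService() {
        int local = 0;              // from the scriptlet: recreated on every request
        hits++;
        local++;
        return "hits=" + hits + " local=" + local;
    }

    public static void main(String[] args) {
        // One page instance serves many requests.
        GeneratedPageSketch page = new GeneratedPageSketch();
        System.out.println(page._jspService());
        System.out.println(page._jspService());
    }
}
```

The instance variable keeps its value between requests while the local variable starts fresh each time. Since one instance serves concurrent requests, the instance variable is also shared between threads, so unsynchronized updates like `hits++` are not thread-safe.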

Q: What are the three kinds of comment in jsp?

A: These are the three types of comments in JSP:

JSP comment: <%-- this is a JSP comment --%>

HTML comment: <!-- this is an HTML comment -->

Java comments: <% // single-line Java comment %> and <% /* multi-line Java comment */ %>

Q: What is the output comment?

A: A comment that is visible in the source of the response is called an output comment: <!-- this is an HTML comment -->

Q: What is a hidden comment?

A: This is also known as a JSP comment. It is visible only in the JSP source and not in the later phases of the JSP life cycle: <%-- this is a JSP comment --%>

Q: How does jsp handle the run time exception?

A: You can use the errorPage attribute of the page directive to have uncaught run-time exceptions

automatically forwarded to an error processing page.

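Q: How can I implement a thread-safe JSP page?

A: You can make your JSPs thread-safe by having them implement the SingleThreadModel interface. This is done by adding the directive <%@ page isThreadSafe="false" %> to the JSP. Note that SingleThreadModel has been deprecated since Servlet 2.4; avoiding shared mutable state, or synchronizing access to it, is the more robust approach.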

Q: Is there a way to reference the “this” variable within the jsp?

A: Yes, there is. The page implicit object is equivalent to "this", and returns a reference to the generated

servlet.

Q: Can you make the use of servletOutputStream object within jsp?

A: Yes. By using getOutputStream () method on response implicit object we can get it.

Q: What is autoflush?

A: autoFlush is an attribute of the page directive that controls what happens when the output buffer fills up. A value of true indicates that the buffer should be flushed whenever it is full. A value of false indicates that an exception should be thrown whenever the buffer is full. Accessing the page while the JSP is being converted into a servlet results in an error.

Q: What is the different scope available in jsp?

A:The different scopes are:

Page: Within the same page.

Request: After forward or include also you will get the request scope data.

Session: After sendRedirect also you will get the session scope data. All data stored in session is available

to end user till session closed or browser closed.

Application: Data will be available throughout the application. One user can store data in application

scope and other can get the data from application scope.

Q: When to use application scope?

A: If we want to make our data available to the entire application then we have to use application scope.

Q: Can a jsp page instantiate a serialized bean?

A: Yes. The useBean action has a beanName attribute, which can be used to refer to a serialized bean.

Q: In which situation we can use the static include and dynamic include?

A: If the target resource won’t change frequently, then it is recommended to use include directives. If

the target resource will change frequently, then it is recommended to use include action.

Q: What is JDBC?

A: Java Database Connectivity (JDBC) is a standard Java API for interacting with relational databases from Java. JDBC provides a set of classes and interfaces that a Java application can use to talk to a database without knowing vendor-specific RDBMS details; the database-specific work is done by JDBC drivers.

Q: What are the basic steps of using jdbc in java?

A: The basic steps are:

Load the RDBMS specific JDBC driver because this driver actually communicates with the database.

Open the connection to database which is then used to send SQL statements and get results back.

Create JDBC Statement object. This object contains SQL query.

Execute statement which returns resultset(s). Resultset contains the tuples of database table as a result

of SQL query.

Process the result set.

Close the connection.
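
The steps above can be sketched with the standard java.sql API. This is a minimal sketch under assumptions: `JdbcStepsDemo` and `firstColumn()` are hypothetical names, the JDBC URL and SQL text are supplied by the caller, and a matching driver JAR must be on the classpath for a connection to succeed.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Sketch of the basic JDBC steps; not tied to any particular database.
class JdbcStepsDemo {
    static List<String> firstColumn(String jdbcUrl, String sql) throws SQLException {
        List<String> values = new ArrayList<>();
        // Steps 1-2: DriverManager locates a driver and opens the connection.
        // try-with-resources closes the result set, statement and connection.
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement();        // step 3: create statement
             ResultSet rs = stmt.executeQuery(sql)) {         // step 4: execute query
            while (rs.next()) {                               // step 5: process results
                values.add(rs.getString(1));
            }
        }                                                     // step 6: connection closed
        return values;
    }

    public static void main(String[] args) {
        if (args.length < 2) {
            System.out.println("usage: JdbcStepsDemo <jdbc-url> <sql>");
            return;
        }
        try {
            System.out.println(firstColumn(args[0], args[1]));
        } catch (SQLException e) {
            System.out.println("Database error: " + e.getMessage());
        }
    }
}
```

Using try-with-resources is a design choice here: it guarantees the close step runs even when the query throws.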

Q: What are the main component of jdbc?

A: The main components are:

DriverManager

Driver

Connection

Statement

Resultset

Q: What is DriverManager?

A: DriverManager is a class whose methods are all static. It manages a list of database drivers and matches connection requests from the Java application with the proper database driver using the communication subprotocol. The first driver that recognizes a given subprotocol under JDBC is used to establish a database connection.

Q: What is Driver?

A: The JDBC API defines the Java interfaces and classes that programmers use to connect to databases and send queries. A JDBC driver implements these interfaces and classes for a particular DBMS vendor. The driver is the database communications link, handling all communication with the database.

Normally, once the driver is loaded, the developer need not call it explicitly.

Q: What is the connection?

A: Interface with all methods for contacting a database. The connection object represents

communication context, i.e., all communication with database is through connection object only

Q: What is the statement?

A: Encapsulates an SQL statement which is passed to the database.

Q: What is the resultset?

A: The Resultset represents set of rows retrieved due to query execution.

Q: How do we load a database driver with JDBC?

A: Provided the JAR file containing the driver is properly configured, just place the JAR file in the classpath. With JDBC 4.0 and later, Java developers no longer need to explicitly load JDBC drivers using code like Class.forName() to register a JDBC driver.

The DriverManager class takes care of this by automatically locating a suitable driver when the DriverManager.getConnection() method is called. This feature is backward-compatible, so no changes are needed to existing JDBC code.

Q: What is the JDBC Driver interface?

A: A JDBC driver provides vendor-specific implementations of the interfaces defined by the JDBC API. Each vendor driver must provide implementations of java.sql.Connection, Statement, PreparedStatement, CallableStatement, ResultSet and Driver.

Q: What does the connection object represent?

A: The connection object represents communication context, i.e., all communication with database is

through connection object only.

Q: What is the statement?

A: Statement acts like a vehicle through which SQL commands can be sent. Through the connection

object we create statement kind of objects.

Q: What is the prepared statement?

A: A prepared statement is an SQL statement that is precompiled by the database. Through

precompilation, prepared statements improve the performance of SQL commands that are executed

multiple times. Once compiled, prepared statements can be customized prior to each execution by

altering predefined SQL parameters.

Q: What is the difference between statement and PreparedStatement?

A: The difference is:

A standard Statement is used to create a Java representation of a literal SQL statement and execute it

on the database. A PreparedStatement is a precompiled statement. This means that when the

PreparedStatement is executed, the RDBMS can just run the PreparedStatement SQL statement without

having to compile it first.

Statement has to verify its metadata against the database every time. While a prepared statement has

to verify its metadata against the database only once.

If you want to execute an SQL statement once, use a Statement. If you want to execute the same SQL statement many times, use a PreparedStatement. PreparedStatement objects can be reused by passing different values to the query.

Q: What is the callable statement?

A: Callable statements are used from JDBC application to invoke stored procedures and functions.

Q: How to call a stored procedure from jdbc?

A: PL/SQL stored procedures are called from within JDBC programs by means of the prepareCall()

method of the Connection object created. A call to this method takes variable bind parameters as input

parameters as well as output variables and creates an object instance of the CallableStatement class.

Q: What are the types of JDBC driver?

A: The types are:

Type 1: JDBC-ODBC bridge

Type 2: Native API (partly Java driver)

Type 3: Open Protocol-Net

Type 4: Proprietary Protocol-Net (pure Java driver)

Q: Which type of JDBC driver is the fastest?

A: The Type 4 driver (pure Java) is generally the fastest because it converts JDBC calls directly into

vendor-specific protocol calls and interacts with the database without an intermediate layer.

Q: Does the JDBC-ODBC Bridge support multiple concurrent open statements per connection?

A: No. You can open only one Statement object per connection when you are using the JDBC-ODBC

Bridge.

Q: What are the standard isolation levels defined by JDBC?

A: The standard isolation levels are:

TRANSACTION_NONE

TRANSACTION_READ_COMMITTED

TRANSACTION_READ_UNCOMMITTED

TRANSACTION_REPEATABLE_READ

TRANSACTION_SERIALIZABLE
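
These levels are int constants on java.sql.Connection and are applied with Connection.setTransactionIsolation(). The sketch below (class and method names are illustrative) maps each constant to its name and to the read anomalies it rules out, following the descriptions in the java.sql.Connection documentation. The comparisons assume the constants' numeric ordering.

```java
import java.sql.Connection;

final class IsolationLevelSketch {
    /** Returns the constant's name for a java.sql.Connection isolation level. */
    static String nameOf(int level) {
        switch (level) {
            case Connection.TRANSACTION_NONE:             return "TRANSACTION_NONE";
            case Connection.TRANSACTION_READ_UNCOMMITTED: return "TRANSACTION_READ_UNCOMMITTED";
            case Connection.TRANSACTION_READ_COMMITTED:   return "TRANSACTION_READ_COMMITTED";
            case Connection.TRANSACTION_REPEATABLE_READ:  return "TRANSACTION_REPEATABLE_READ";
            case Connection.TRANSACTION_SERIALIZABLE:     return "TRANSACTION_SERIALIZABLE";
            default: throw new IllegalArgumentException("Unknown isolation level: " + level);
        }
    }

    /** Dirty reads are prevented from READ_COMMITTED upwards. */
    static boolean preventsDirtyReads(int level) {
        return level >= Connection.TRANSACTION_READ_COMMITTED;
    }

    /** Non-repeatable reads are prevented from REPEATABLE_READ upwards. */
    static boolean preventsNonRepeatableReads(int level) {
        return level >= Connection.TRANSACTION_REPEATABLE_READ;
    }

    /** Phantom reads are prevented only by SERIALIZABLE. */
    static boolean preventsPhantomReads(int level) {
        return level == Connection.TRANSACTION_SERIALIZABLE;
    }
}
```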

Q: What is a ResultSet?

A: A ResultSet represents the set of rows returned by executing a query. Example: ResultSet rs =

stmt.executeQuery(sqlQuery);

Q: What are the types of resultset?

A: The types are:

TYPE_FORWARD_ONLY specifies that a resultset is not scrollable, that is, rows within it can be advanced

only in the forward direction.

TYPE_SCROLL_INSENSITIVE specifies that a resultset is scrollable in either direction but is insensitive to

changes committed by other transactions or other statements in the same transaction.

TYPE_SCROLL_SENSITIVE specifies that a resultset is scrollable in either direction and is affected by

changes committed by other transactions or statements within the same transaction.

Q: What is the difference between TYPE_SCROLL_INSENSITIVE and TYPE_SCROLL_SENSITIVE?

A: An insensitive resultset is like a snapshot of the data in the database at the time the query was

executed. A sensitive resultset does NOT represent a snapshot of the data; rather, it contains pointers

to those rows which satisfy the query condition.

With an insensitive resultset, changes made to the data after the resultset was obtained are not visible

through it, hence the name. With a sensitive resultset, modifications made to the data after the

resultset was obtained are visible through it.

Q: What is the RowSet?

A: A RowSet is an object that encapsulates a set of rows from either Java Database Connectivity (JDBC)

result sets or tabular data sources like a file or spreadsheet. RowSets support component-based

development models like JavaBeans, with a standard set of properties and an event notification

mechanism.

Q: What are the different types of RowSet?

A: The different types are:

Connected - A connected RowSet object connects to the database once and remains connected until the

application terminates.

Disconnected - A disconnected RowSet object connects to the database, executes a query to retrieve the

data from the database and then closes the connection. A program may change the data in a

disconnected RowSet while it is disconnected. Modified data can be updated in the database after a

disconnected RowSet re-establishes the connection with the database.

Q: Why are batch updates needed?

A: The batch update feature allows us to group SQL statements together and send them to the database

server in a single round trip.
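
A minimal sketch of a batch, assuming a hypothetical `users` table and an open Connection supplied by the caller (class and method names are illustrative). The statements are queued with addBatch() and submitted together with executeBatch():

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

final class BatchUpdateSketch {
    // Hypothetical table and column, used only for illustration.
    static final String INSERT_SQL = "INSERT INTO users (name) VALUES (?)";

    /** Queues one insert per name and submits them as a single batch. */
    static int[] insertAll(Connection conn, List<String> names) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();        // queue this parameter set instead of executing it now
            }
            return ps.executeBatch(); // one update count per queued statement
        }
    }
}
```

Whether the driver really sends the whole batch in one network round trip depends on the driver and the database.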

Q: What is a data source?

A: A DataSource object is the representation of a data source in the Java programming language. In basic

terms:

A data source is a facility for storing data; a DataSource object is a factory for connections to it.

A DataSource can be looked up through JNDI.

A DataSource may point to an RDBMS, a file system, or any other DBMS.

Q: What are the advantages of data source?

A: The advantages are:

An application does not need to hardcode driver information, as it does with the DriverManager.

The DataSource implementations can easily change the properties of data sources.

The DataSource facility allows developers to implement a DataSource class to take advantage of

features like connection pooling and distributed transactions.

Q: What is the main advantage of connection pooling?

A: A connection pool is a mechanism to reuse connections created. Connection pooling can increase

performance dramatically by reusing connections rather than creating a new physical connection each

time a connection is requested.
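
The reuse idea can be sketched with a small generic pool. This is only an illustration of the principle, not a real JDBC connection pool: the class `ReusePool` and its methods are hypothetical, and production pools also validate, time out, and block or grow when empty.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

final class ReusePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();

    /** Creates all resources up front; the factory is never called again afterwards. */
    ReusePool(int size, Supplier<T> factory) {
        for (int i = 0; i < size; i++) {
            idle.push(factory.get());
        }
    }

    /** Hands out an idle resource instead of creating a new one. */
    synchronized T borrow() {
        if (idle.isEmpty()) {
            throw new IllegalStateException("pool exhausted");
        }
        return idle.pop();
    }

    /** Returns a resource to the pool so a later borrow() can reuse it. */
    synchronized void giveBack(T resource) {
        idle.push(resource);
    }
}
```

With a pool of size 2, ten borrow/giveBack cycles still create only two resources, which is where the saving over opening a new physical connection each time comes from.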

Q: What is multiprogramming?

A: Multiprogramming is a rapid switching of the CPU back and forth between processes.

Q: What is the difference between TCP and UDP?

A: TCP is designed to provide reliable communication across a variety of reliable and unreliable networks

and internets. UDP provides a connectionless service, so it is basically unreliable: delivery and

duplicate protection are not guaranteed.

Q: What is socket?

A: The combination of an IP address and a port number is called a socket.

Q: What is the advantage of java socket?

A: The advantages are:

Sockets are flexible and sufficient.

Efficient socket based programming can be easily implemented for general communications.

Sockets cause low network traffic.

Q: What is the disadvantage of java socket?

A: The disadvantages are:

Security restrictions are sometimes overbearing because a Java applet running in a Web browser is only

able to establish connections to the machine where it came from, and to nowhere else on the network.

Despite all of the useful and helpful Java features, socket-based communication only allows sending

packets of raw data between applications. Both the client side and the server side have to provide

mechanisms to make the data useful in any way.

Since the data formats and protocols remain application specific, the re-use of socket based

implementations is limited.

Q: What is RMI?

A: It stands for Remote Method Invocation. RMI is a set of APIs for building distributed

applications. RMI uses interfaces to define remote objects and turns local method invocations into

remote method invocations.

Q: What is socket()?

A: socket() is very similar to socketpair() except that only one socket is created instead of two. It

is most commonly used when the process you wish to communicate with is not a child process.

Q: What is ServerSocket?

A: The ServerSocket class is used to create a server socket, which waits for incoming client connections. Each accepted connection yields a Socket object that is used to communicate with that client.

Q: What is bind()?

A: It binds the socket to the specified address and port in the SocketAddress object. Use this method if

you instantiated the ServerSocket using the no-argument constructor.

Q: What is the Datagram?

A: A datagram is an independent, self-contained message sent over the network whose arrival, arrival

time, and content are not guaranteed.
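
A minimal sketch of sending and receiving one datagram with java.net.DatagramSocket, kept on the loopback interface so that it is self-contained. The class and method names are illustrative; the receive timeout is there precisely because arrival is not guaranteed.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

final class DatagramSketch {
    /** Sends one UDP datagram to a local receiver and returns the text that arrived. */
    static String sendAndReceive(String message) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback); // port 0: any free port
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // give up after 2 s; UDP does not promise delivery

            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            // Each datagram is self-contained: it carries its own destination address and port.
            sender.send(new DatagramPacket(payload, payload.length, loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[1024];
            DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
            receiver.receive(packet);
            return new String(packet.getData(), 0, packet.getLength(), StandardCharsets.UTF_8);
        }
    }
}
```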

Q: What is getLocalPort()?

A: It returns the port that the server socket is listening on. This method is useful if you passed in 0 as the

port number in a constructor and let the server find a port for you.

Q: What is accept()?

A: It waits for an incoming client. This method blocks until either a client connects to the server on the

specified port or the socket times out, assuming that the time-out value has been set using the

setSoTimeout() method. Otherwise, this method blocks indefinitely.
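
The ServerSocket methods described in the preceding answers can be combined into one short sketch. It runs client and server in a single thread on the loopback interface, which works for one small message because the operating system queues the pending connection; a real server would accept in a loop and usually handle clients on separate threads. The class and method names are illustrative.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

final class ServerSocketSketch {
    /** Sends one line from a client to a local server and returns what the server read. */
    static String roundTrip(String line) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket()) {     // no-argument constructor: not bound yet
            server.bind(new InetSocketAddress(loopback, 0)); // port 0 lets the system pick a free port
            server.setSoTimeout(2000);                       // accept() fails after 2 s instead of blocking forever
            int port = server.getLocalPort();                // the port that was actually chosen

            try (Socket client = new Socket(loopback, port); // client names the remote address and port
                 Socket peer = server.accept()) {            // server-side end of the same connection
                OutputStream out = client.getOutputStream();
                out.write((line + "\n").getBytes(StandardCharsets.UTF_8));
                out.flush();

                peer.setSoTimeout(2000);
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(peer.getInputStream(), StandardCharsets.UTF_8));
                return in.readLine();
            }
        }
    }
}
```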

Q: What is the network interface?

A: A network interface is the point of interconnection between a computer and a private or public

network. A network interface is generally a network interface card (NIC), but does not have to have a

physical form.

Q: What is the encapsulation technique?

A: Hiding data within the class and making it available only through the methods. This technique is used

to protect your class against accidental changes to fields, which might leave the class in an inconsistent

state.
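
As a small illustration (the class `Account` and its rules are hypothetical), the field below is private and can only change through methods that keep it consistent:

```java
final class Account {
    // Hidden state: code outside this class cannot assign to it directly.
    private long balanceCents;

    /** Adds money; rejects amounts that would make no sense. */
    void deposit(long cents) {
        if (cents <= 0) {
            throw new IllegalArgumentException("deposit must be positive");
        }
        balanceCents += cents;
    }

    /** Removes money; refuses to let the balance go negative. */
    void withdraw(long cents) {
        if (cents <= 0 || cents > balanceCents) {
            throw new IllegalArgumentException("invalid withdrawal");
        }
        balanceCents -= cents;
    }

    /** Read-only access to the hidden field. */
    long balanceCents() {
        return balanceCents;
    }
}
```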

Q: How does the race condition occur?

A: It occurs when two or more processes are reading or writing some shared data and the final result

depends on who runs precisely when.
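
The same effect can be shown with threads instead of processes. In this sketch (names are illustrative) several threads increment a plain int and an AtomicInteger. The plain counter may lose updates, so its final value depends on how the threads happened to interleave; the atomic counter always reaches the full total.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class RaceSketch {
    private int plain;                                        // shared data without any protection
    private final AtomicInteger atomic = new AtomicInteger(); // shared data with atomic updates

    /** Returns {plain total, atomic total} after all threads have finished. */
    static int[] run(int threads, int incrementsPerThread) throws InterruptedException {
        RaceSketch shared = new RaceSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    shared.plain++;                  // read-modify-write: updates can be lost
                    shared.atomic.incrementAndGet(); // indivisible: no update is lost
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join(); // wait so that the final values are visible here
        }
        return new int[] { shared.plain, shared.atomic.get() };
    }
}
```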

Q: What information is needed to create a TCP Socket?

A: Socket is created from this information:

Local System's: IP Address and Port Number

Remote System’s: IPAddress and

Contains a brief description on the Life Cycle and the different Testing Models.

SDLC:

The software development life cycle (SDLC) is a conceptual model used in project management that

describes the stages involved in an information system development project, from an initial feasibility

study through maintenance of the completed application/product.

V-Model:

The V-Model shows and translates the relationships between each phase of the development life cycle

and its associated phase of testing. The V-model is a software development model which is considered

to be an extension of the waterfall model. Instead of moving down in a linear way, the process steps are

targeted upwards after the coding phase, to form the typical V shape.

Requirements analysis: In this phase, the requirements of the proposed system are collected by

analyzing the needs of the user(s). This phase is concerned about establishing what the ideal system has

to perform. However, it does not determine how the software will be designed or built. Usually, the

users are interviewed and a document called the user requirements document is generated. The user

requirements document will typically describe the system’s functional, physical, interface, performance,

data, security requirements etc as expected by the user. The user acceptance tests are designed in this

phase.

System Design: System engineers analyze and understand the business of the proposed system by

studying the user requirements document. They figure out possibilities and techniques by which the

user requirements can be implemented. If any of the requirements are not feasible, the user is informed

of the issue. A resolution is found and the user requirement document is edited accordingly.

The software specification document which serves as a blueprint for the development phase is

generated. This document contains the general system organization, menu structures, data structures

etc. It may also hold example business scenarios, sample windows, reports for the better understanding.

Other technical documentation like entity diagrams, data dictionary will also be produced in this phase.

The documents for system testing are prepared in this phase.

High-level design: This phase is also called architecture design. The baseline in selecting the

architecture is that it should realize all the requirements. The high-level design document typically

consists of the list of modules, brief functionality of each module, their interface relationships,

dependencies, database tables, architecture diagrams, technology details etc. The integration testing

design is carried out in this phase.

Low-level design: This phase is also called module design. The designed system is broken up into

smaller units or modules and each of them is explained so that the programmer can start coding

directly. The low-level design document or program specifications will contain: a detailed functional

logic of the module, in pseudo-code; database tables, with all elements, including their type and size;

all interface details with complete API references; all dependency issues; error message listings;

complete inputs and outputs for a module. The unit test design is developed in this stage.

Waterfall Model:

The waterfall model is a popular version of the systems development life cycle model for software

engineering. Often considered the classic approach to the systems development life cycle, the waterfall

model describes a development method that is linear and sequential. Waterfall development has

distinct goals for each phase of development. Imagine a waterfall on the cliff of a steep mountain. Once

the water has flowed over the edge of the cliff and has begun its journey down the side of the mountain,

it cannot turn back. It is the same with waterfall development. Once a phase of development is

completed, the development proceeds to the next phase and there is no turning back.

The advantage of waterfall development is that it allows for departmentalization and managerial

control. A schedule can be set with deadlines for each stage of development and a product can proceed

through the development process like a car in a carwash, and theoretically, be delivered on time.

The disadvantage of waterfall development is that it does not allow for much reflection or revision.

Once an application is in the testing stage, it is very difficult to go back and change something that was

not well-thought out in the concept stage.

Stages: Project Planning -> Requirements definition -> Design -> Development -> Integration and Testing

-> Installation/Acceptance -> Maintenance

Spiral Model:

There are four phases in the "Spiral Model" which are: Planning, Evaluation, Risk Analysis and

Engineering. These four phases are iteratively followed one after other in order to eliminate all the

problems, which were faced in "The Waterfall Model". Iterating the phases helps in understanding the

problems associated with a phase and dealing with those problems when the same phase is repeated

next time, planning and developing strategies to be followed while iterating through the phases.

Agile Process:

Agile aims to reduce risk by breaking projects into small, time-limited modules or timeboxes

("iterations") with each iteration being approached like a small, self-contained mini-project, each lasting

only a few weeks. Each iteration has its own self-contained stages of analysis, design, production, testing

and documentation. In theory, a new software release could be done at the end of each iteration, but in

practice the progress made in one iteration may not be worth a release and it will be carried over and

incorporated into the next iteration. The project's priorities, direction and progress are re-evaluated at

the end of each iteration.

Test life cycle:

1. Test Requirements stage - Requirement Specification documents, Functional Specification documents,

Design Specification documents (use cases, etc), Use case Documents, Test Trace-ability Matrix for

identifying Test Coverage.

2. Test Plan - Test Scope, Test Environment, Different Test phase and Test Methodologies, Manual and

Automation Testing, Defect Management, Configuration Management, Risk Management, Evaluation &

identification – Test, Defect tracking tools, test schedule, resource allocation.

3. Test Design - Traceability Matrix and Test coverage, Test Scenarios Identification & Test Case

preparation, Test data and Test scripts preparation, Test case reviews and Approval, Base lining under

Configuration Management.

4. Test Environment Setup - Test Bed installation and configuration, Network connectivity, All the

Software/ tools Installation and configuration, Coordination with Vendors and others.

5. Test Automation - Automation requirement identification, Tool Evaluation and Identification,

Designing or identifying Framework and scripting, Script Integration, Review and Approval, Base lining

under Configuration Management.

6. Test Execution and Defect Tracking - Executing Test cases, Testing Test Scripts, Capture, review and

analyze Test Results, Raise the defects and tracking for its closure.

7. Test Reports and Acceptance - Test summary reports, Test Metrics and process Improvements made,

Build release, Receiving acceptance.

This FAQ deals with HTML. For information about the differences between HTML and XHTML, please see

the XHTML vs HTML FAQ.

What is HTML?

HTML, or HyperText Markup Language, is a tagged markup language primarily used for web documents.

A tagged markup language means that the content is interspersed with instructions, tags, that mark up

the semantic meaning of certain passages. HTML is an application of SGML (Standard Generalized

Markup Language), a more generic markup language.

HTML defines a number of element types (written in uppercase throughout this FAQ, although HTML is

case insensitive). An element type, e.g., EM, assigns some semantic meaning to its content.

An element is a concrete instance of an element type. An element usually consists of a start tag (<em>),

some content, and an end tag (</em>). Tags are written in lowercase in this FAQ. HTML allows some end

tags (and even a few start tags) to be omitted. Do not confuse tags with elements; the BODY element

will be present even if the <body> and </body> tags are omitted. Certain element types – declared as

EMPTY – must not have an end tag. One example is the IMG element type.

A start tag can contain attributes, comprising an attribute name, an equals sign (=), and an attribute

value. Example: lang="en". Attribute values must be quoted in some instances, so it is good practice to

always quote all attribute values. Some boolean attributes are allowed to be minimised in HTML, which

means the name and the equals sign are omitted; e.g. selected. Some attributes are required for some

element types, e.g., the alt attribute in an IMG element.

An example of an EM element with a lang attribute:

HTML Code:

<em lang="en">content</em>

Beginners often use phrases like 'alt tag', but that is not correct nomenclature; alt is an attribute, not a

tag. Tags are surrounded by <…>.

The first version of HTML (1989) didn't have a version number; it was just 'HTML'.

The first 'standardised' version of HTML (IETF, 1995) was called HTML 2.0.

Then the World Wide Web Consortium (W3C) was formed. It presented its first 'standard' version (W3C

isn't a standards body, so their 'standards' are really called 'recommendations') in 1997: HTML 3.2.

HTML 4.0 came out in 1998, and was quickly replaced by HTML 4.01 in 1999. That is the latest and

current version of HTML. The W3C has announced that it will not create further versions of HTML. HTML

4.01 is the recommended version for creating HTML documents.

However, the Web Hypertext Application Technology Working Group (WHATWG) are working on what is

referred to as HTML5, hoping that it will eventually be accepted as a W3C recommendation.

What does the DOCTYPE declaration do?

The DOCTYPE declaration, which must precede any other markup in the document, can look something

like this:

HTML Code:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">

It specifies the element type of the document's root element (HTML), a public identifier and a system

identifier.

The public identifier (-//W3C//DTD HTML 4.01//EN) shows who has issued the document type

definition, or DTD, (W3C); the name of the DTD (DTD HTML 4.01); and the language in which the DTD is

written (EN, for English). Note that it doesn't say anything about the language of the web page itself; it is

the language of the DTD that is specified.

The system identifier (http://www.w3.org/TR/html4/strict.dtd) is the URI (uniform resource identifier,

or 'web address') for the actual DTD.

The DOCTYPE declaration tells a validator (a program that checks the syntactic validity of a web page)

against which DTD to test for compliance. Browsers didn't used to care about the DOCTYPE declaration,

but modern browsers use it for a completely different purpose: deciding if the page is 'modern' (and

presumably standards compliant) or old-school. This affects the rendering mode in IE5/Mac, IE6+/Win,

Opera, Firefox (and other Gecko browsers), Safari, etc. A complete DOCTYPE declaration (including the

system identifier) tells the browser that it is a modern document. If the system identifier is missing, or if

there is no DOCTYPE declaration at all, browsers assume it is an old document and render it in 'quirks

mode'.

What is a DTD?

A DTD, or document type definition, specifies the element types and attributes that we can use in our

web page. It also specifies how these element types relate to one another, e.g., which type can be

subordinate to which. We can regard the DTD as the grammar specification for our markup language.

The DTD can also declare the character entities we can use; more about those later.

A validator will test a web page for compliance with the DTD specified in the DOCTYPE declaration;

either explicitly via the system identifier or implicitly using the public identifier. Browsers use non-

validating parsers and do not actually read the DTD. They have built-in knowledge about the various

element types and usually a hard-coded list of character entities as well.

For HTML 4.01, which is the latest and greatest version of HTML and the only one we should consider

when creating new web pages, there are three different DTDs: Strict, Transitional and Frameset.

What is the difference between Strict, Transitional and Frameset?

The difference is which element types and attributes they declare, and how they allow or require

element types to nest.

The HTML 4.01 Strict DTD emphasises the separation of content from presentation and behaviour. This

is the DTD that the W3C recommend for all new documents.

The HTML 4.01 Transitional DTD is meant to be used transitionally when converting an old-school (pre-

HTML4) document into modern markup. It is not intended to be used for creating new documents. It

contains 11 presentational element types and a plethora of presentational attributes that are

deprecated in the Strict DTD. The Transitional DTD is also often necessary for pages that reside within a

frameset, because it declares the target attribute required for opening links in another frame.

The HTML 4.01 Frameset DTD is for frameset pages. Frames are deprecated by the W3C. For modern

websites, using server-side scripting technologies is usually regarded as a far better solution.

Which DOCTYPE should I use?

If you are creating a new web page, the W3C recommend using HTML 4.01 Strict.

If you are trying to convert an ancient HTML 2.0 or HTML 3.2 document to the modern world, you can

use HTML 4.01 Transitional until you have managed to transfer all presentational issues to CSS and all

behavioural issues to JavaScript.

Why should I validate my markup?

Why should you spell-check your text before publishing it on the Web? Because mistakes and errors can

confuse your readers and detract from the important information. It is the same with markup. Invalid

markup can confuse browsers, search engines and other user agents. The result can be improper

rendering, dysfunctional pages, unindexed pages in the search engines, program crashes, or the end of

the universe as we know it.

If your page doesn't display the way you intended, always validate your markup before you start looking

for other problems (or asking for help on SitePoint). With invalid markup, there are no guarantees.

Use the HTML validator at W3C to check for compliance. Don't forget to include a DOCTYPE declaration,

so the validator knows what to check against.

HTML Tidy is a free tool that can help you tidy up sloppy markup and make it nicely formatted and easier

to read.

Why does HTML allow sloppy coding?

It doesn't, but it recommends that user agents handle markup errors and try to recover.

It is sometimes alleged – usually as an argument for why XHTML is superior to HTML – that HTML allows

improperly nested elements like <b><i>foo</b></i>. That is not true; the validator will complain about

that because it is not valid HTML. However, browsers will usually guess what the author meant, so the

error can go by undetected.

Some dislike that HTML allows certain (but not all!) end tags to be omitted. That is not a problem for

browsers, because valid markup can be parsed unambiguously anyway. In the early years it was very

common to omit certain end tags, e.g., </p> and </li>. Nowadays it's usually considered good practice to

use explicit end tags for all elements except those where it is forbidden (like BR and IMG).

Why does the validator complain about my <embed> tag?

Because EMBED has never been part of any HTML recommendation. It is a non-standard extension

which, although supported by most browsers, is not part of HTML.

During the 'browser wars' of the late 1990s, browser vendors like Microsoft and Netscape competed by

adding lots of 'cool' features to HTML, to make it possible to style web pages. The problem with those

additions was that they were not standardised and that they were mostly incompatible between

browsers.

There are other elements that used to be quite common (MARQUEE anyone?) that have never been

included in an HTML recommendation. Don't use them, if you can avoid it.

There are also a number of attributes that were very common in the 1990s, but which have never been

included in an official HTML recommendation. For example, marginwidth.

What does character encoding (charset) mean?

Computers can only deal with numbers. What we see on the screen as letters or images are really just

numeric codes, which the computer sees as groups of binary digits (ones and zeros).

First, we need to define a minimum unit capable of conveying some sort of information. This is called a

character. This is a rather abstract concept. The character known as 'uppercase A' has no defined visual

appearance; it's more like 'the idea of an A'.

Then we need to establish a set of such abstract characters that we will want to use. That is called a

character repertoire, or sometimes a 'character set', but that term is used for several different things, so

I will avoid it here. A character repertoire is the total set of abstract characters that we have at our

disposal. For HTML, the character set is ISO 10646, which is virtually the same thing as Unicode. It

is a repertoire of tens of thousands of characters representing most of the written languages on the

planet.

The visual appearance of a character is called a glyph. A certain set of glyphs is known as a font. The

glyph for 'uppercase A' will differ between fonts, but that doesn't change the underlying meaning of the

abstract character.

Now, since computers only deal with numbers, we must have a way to represent each character with a

numeric code. Each character in a repertoire has a code position, or code point. The code point is the

numeric representation (index) of the character within the repertoire. Code points in Unicode are

usually expressed in hexadecimal, e.g., U+0041 for 'uppercase A'.

Finally, the encoding – sometimes, unfortunately, called a 'character set' or 'charset' – is a mechanism

for expressing those code points, usually with octets, which are groups of 8 binary digits (thus capable of

representing numbers between 0 and 255, inclusive).

In the early days of computer communication, people used small character repertoires containing only

the bare necessities for a specific language. The most well-known one is probably ASCII (ISO 646),

which only contains 128 characters – and 33 of those are unprintable 'control codes' (the C0 range plus

DEL). The repertoire has 128 code points numbered sequentially from 0 to 127. The encoding is a simple

one-to-one: the codepoint for 'uppercase A' is 65 (0x41), which is encoded as 65 (the octet 01000001, in

binary).
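
The chain from character to code point to octet can be traced in a few lines. This is a small sketch with illustrative names; it handles ASCII characters only, where the encoding is one-to-one.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class CodePointSketch {
    /** Describes the code point and the single ASCII octet of a one-character string. */
    static String describe(String character) {
        int codePoint = character.codePointAt(0);
        if (codePoint > 127) {
            throw new IllegalArgumentException("not in the ASCII repertoire");
        }
        byte[] octets = character.getBytes(StandardCharsets.US_ASCII);
        // Left-pad the binary digits to a full octet of 8 bits.
        String bits = String.format("%8s", Integer.toBinaryString(octets[0] & 0xFF)).replace(' ', '0');
        return String.format(Locale.ROOT, "U+%04X = %d -> octet %s", codePoint, codePoint, bits);
    }
}
```

For example, describe("A") returns "U+0041 = 65 -> octet 01000001".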

ASCII isn't very useful outside the English-speaking world, because it only contains the letters A-Z, digits

0-9, and some punctuation. ISO issued a set of standards called ISO 8859, which augment the

ASCII repertoire with characters that are needed in other languages. In the Western world, the most

common one is ISO 8859-1, known as Latin-1. It contains characters needed to write most

Western European languages. The ISO 8859 series are both character repertoires and encodings

(one-to-one). Each repertoire contains 256 characters, which can be encoded using single octets. They

use the ASCII repertoire as a subset, i.e., the first 128 code points are the same.

But even 256 characters is not enough to write some languages. Chinese, for instance, needs thousands

of characters. Several mutually incompatible encodings for Chinese were devised, but there was still a

big problem when you wanted to exchange information across linguistic and cultural barriers. The

ISO 8859-1 encoding for 'uppercase A' might be something totally different in one of the Chinese

encodings; perhaps even something rude!

An easy solution would be to create a 32-bit encoding that would enable direct access to four billion

code points, or at least a 16-bit encoding (65,536 code points). Both of those exist, but there is a

drawback: there will be a lot of useless octets for most Western languages. With a 32-bit encoding,

every document would be four times as large as with an 8-bit encoding.

The solution was a variable-length encoding called UTF-8. It uses between one and four octets to encode

each code point (the original design allowed up to six, but RFC 3629 limits it to four), and it can address

the entire Unicode (or ISO 10646) character repertoire. The first 128 code points are encoded with single

octets, and are identical to the same code points in the ISO 8859 series or in ASCII (US version). Most

Western European languages can be encoded with single octets, sprinkled with the occasional double

octet for letters with diacritical marks (e.g., 'Ä').
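
The variable length is easy to observe by encoding a few characters and printing the resulting octets. A minimal sketch with illustrative names:

```java
import java.nio.charset.StandardCharsets;

final class Utf8LengthSketch {
    /** Returns the UTF-8 octets of the text as space-separated hexadecimal pairs. */
    static String utf8Hex(String text) {
        StringBuilder hex = new StringBuilder();
        for (byte octet : text.getBytes(StandardCharsets.UTF_8)) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", octet & 0xFF));
        }
        return hex.toString();
    }
}
```

Here utf8Hex("A") gives "41" (one octet, same as ASCII), utf8Hex("\u00C4") gives "C3 84" (two octets for 'Ä'), and utf8Hex("\u20AC") gives "E2 82 AC" (three octets for the Euro sign).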

OK, so how does this affect us as authors of web documents? If we use characters whose code point is

outside the ASCII range, the encoding becomes really crucial. Specify the wrong encoding, and the page

will be difficult – or even impossible – to read.
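
What "specifying the wrong encoding" does to the text can be simulated by encoding with one charset and decoding with another. A small sketch with illustrative names:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class WrongEncodingSketch {
    /** Encodes text the way it was saved, then decodes it the way it was declared. */
    static String decodeAs(String text, Charset savedAs, Charset declaredAs) {
        byte[] octets = text.getBytes(savedAs);
        return new String(octets, declaredAs);
    }

    public static void main(String[] args) {
        // 'Ä' saved as UTF-8 (octets C3 84) but declared as ISO 8859-1 turns into two characters.
        String garbled = decodeAs("\u00C4", StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length()); // prints 2
    }
}
```

Plain ASCII text survives such a mismatch, which is why the problem often goes unnoticed until the first accented letter appears.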

So how do we go about specifying the encoding? The proper way to do it is to send this information in

the Content-Type HTTP header:

Code:

Content-Type: text/html; charset=utf-8

The HTTP headers are sent by our web server, so we must tweak the server to change the encoding

information. How to do that depends on which web server we use. For Apache, it can be specified in the

global configuration file (/etc/httpd.conf) or in local .htaccess files. But if we are on a shared host, we

may not have sufficient privileges to tweak the configuration. In that case, we need a server-side

scripting language to send our own HTTP header; e.g., with PHP:

PHP Code:

header('Content-Type: text/html; charset=utf-8');

We can also specify the encoding using an HTTP equivalent in a META element:

HTML Code:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

This META element will be ignored if the real HTTP header contains encoding information. It can be

useful anyway, however, because it will be used if a visitor saves our page to the hard drive and looks at

it locally. In that situation there is no web server to send HTTP headers, so the META element will be

used instead.

There is no default encoding for HTML, so we should always make sure to specify it.

Under Microsoft Windows, a common encoding is Windows-1252. It is very similar to ISO 8859-1,

but there are differences. In ISO 8859-1, the range of code points between 128 and 159 (0x80-

0x9F) is reserved for C1 control characters. In Windows-1252, that range is instead used for a number of

useful characters that are missing from the ISO encoding, e.g., typographically correct quotation marks.

This is not an encoding that I would recommend for use on the Web, since it's Windows specific. It is,

however, the default encoding in many text editors under Windows.
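
The difference in the 0x80-0x9F range can be checked by decoding the same octet with both charsets. A minimal sketch with illustrative names, assuming the Java runtime provides the windows-1252 charset (common JDKs do):

```java
import java.nio.charset.Charset;

final class Cp1252Sketch {
    /** Decodes a single octet with the named charset and returns the resulting code point. */
    static int decodeOctet(int octet, String charsetName) {
        byte[] single = { (byte) octet };
        return new String(single, Charset.forName(charsetName)).codePointAt(0);
    }
}
```

Octet 0x93 decodes to U+201C (a typographic opening quotation mark) as windows-1252 but to the control character U+0093 as ISO-8859-1, while an ASCII octet such as 0x41 decodes to U+0041 in both.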

What is a BOM?

The BOM, or byte order mark, is used for some encodings that use multiple octets to encode code

points, e.g., UTF-8 and UTF-16. Computer processors (CPUs) can employ different schemes for storing

large integer numbers, e.g., 'big-endian' or 'little-endian' (this has to do with whether the least

significant byte comes first or last). The BOM is the code point U+FEFF (65,279). In UTF-16 it is written

as two octets, which makes it possible to detect the byte order (FE FF for big-endian vs FF FE for

little-endian). In UTF-8 it is written as the three octets EF BB BF. Either way, the BOM sits at the very

beginning of the file and tells the parser how to interpret the octets that follow.

Unfortunately, many older browsers cannot handle this, so they display these octets as character data. If

you see something like 'ï»¿' at the top of the page, the reason is probably a BOM that

isn't handled by the browser (or an incorrectly specified encoding).

The only resolution is to avoid using the BOM. Editors that can save as UTF-8 will usually allow us to

choose whether or not to include the BOM.
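
A short Python sketch can show what the BOM looks like on disk and why it leaks onto the page as stray characters. It relies only on the standard codecs module; decoding the octets as Windows-1252 is an assumption about what a confused browser might do.

```python
import codecs

BOM = "\ufeff"  # the byte order mark as a code point (U+FEFF)

utf16_be = BOM.encode("utf-16-be")   # big-endian octets
utf16_le = BOM.encode("utf-16-le")   # little-endian octets
utf8 = BOM.encode("utf-8")           # three-octet UTF-8 signature

# What appears if the UTF-8 signature is treated as Windows-1252 text.
leaked = utf8.decode("cp1252")

# The "utf-8-sig" codec drops a leading BOM while decoding.
clean = (utf8 + b"<html>").decode("utf-8-sig")

print(utf16_be.hex(), utf16_le.hex(), utf8.hex())
print(repr(leaked), repr(clean))
```

When we cannot control how a file was saved, reading it with a BOM-aware codec such as "utf-8-sig" and writing it back as plain "utf-8" is one way to strip the mark.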

What encoding should I declare?

It's very, very simple: we must specify the encoding that we used when saving our .html file! If we save

the file as ISO 8859-1, we must specify the encoding as iso-8859-1; if we save as UTF-8, we specify

it as utf-8. The only problem here is that we may not always know what encoding our editor is using to

save the file. Any editor worth its salt should give us an option to specify the encoding, though.
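
What goes wrong when the declared encoding does not match the saved one can be sketched in a few lines of Python. The sample word is arbitrary; the two decode calls stand in for a browser that follows a correct and an incorrect declaration.

```python
text = "Smörgåsbord"

saved = text.encode("utf-8")          # what the editor wrote to disk

right = saved.decode("utf-8")         # declared encoding matches the file
wrong = saved.decode("iso-8859-1")    # declared encoding does not match

print(right)
print(wrong)
```

Each non-ASCII letter occupies two octets in UTF-8, so the mismatched reading shows two wrong characters in place of each one.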

If we are writing in English, it doesn't matter all that much what encoding we choose, because we are

mostly going to use characters that are encoded the same in most encodings. US-ASCII, ISO 8859-1,

UTF-8, … take your pick. For those of us who write in other languages, the choice becomes more

important. My native language – Swedish – uses three letters more than what the English alphabet has

to offer. Those are present in ISO 8859-1, though, so I can choose between that and UTF-8. If you

are writing in an East Asian language like Chinese or Japanese, you may want to look into UTF-16, since

UTF-8 can be a bit inefficient for those languages. Otherwise, though, I wouldn't recommend UTF-16,

because there seem to be quite a few problems with it in Western browsers.

Avoid Windows-1252 on public web pages, since it's a Windows-specific encoding. Use ISO 8859-1

instead (or ISO 8859-15, if you need the Euro sign, €).

If at all possible, my recommendation is to use UTF-8. It can natively represent any character in the

Unicode repertoire.
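
The trade-offs above can be checked with a rough Python sketch. The sample words are arbitrary and only meant to give a feel for the sizes; real pages also contain a lot of ASCII markup, which shifts the balance toward UTF-8.

```python
samples = {
    "English": "encoding",
    "Swedish": "räksmörgås",
    "Japanese": "文字コード",
}

# Number of octets each sample needs in UTF-8 and in UTF-16.
sizes = {
    language: {name: len(text.encode(name)) for name in ("utf-8", "utf-16-le")}
    for language, text in samples.items()
}

# The Euro sign exists in ISO 8859-15 but not in ISO 8859-1.
euro_in_8859_15 = "€".encode("iso-8859-15")
try:
    "€".encode("iso-8859-1")
    euro_fits_8859_1 = True
except UnicodeEncodeError:
    euro_fits_8859_1 = False

for language, result in sizes.items():
    print(language, result)
print(euro_in_8859_15, euro_fits_8859_1)
```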

How do I insert characters outside the encoding range?

What if we are using ISO 8859-1 as the encoding and wish to include an em-dash in our content?

There is no em-dash in that character repertoire, and hence no way to encode it, although it is present

in ISO 10646 and can be used on a web page.

We have two choices: a named entity or a numeric reference.

The named entity for an em-dash is &mdash;. Entities start with an ampersand (&) and end with a

semicolon. In some circumstances we can get away with omitting the semicolon, but it is definitely good

practice to always put it in.

A numeric reference can be either decimal (&#8212;) or hexadecimal (&#x2014;), but it's

generally safer to stick with decimal notation, because some old browsers don't handle the hex version.

Note that the numeric value references the code point in ISO 10646; it has nothing whatsoever to

do with what encoding we have specified for our document.

References (in decimal) always work. Named entities may cause problems in older browsers, because

some of them only support a subset of HTML entities.
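
As a quick check, the Python sketch below decodes the three notations and also lets the codec produce decimal references for anything outside ISO 8859-1. It uses the standard html module and the built-in "xmlcharrefreplace" error handler; the sample sentence is made up.

```python
import html

em_dash = "\u2014"

# The named entity and both numeric notations refer to the same code point.
decoded = [html.unescape(s) for s in ("&mdash;", "&#8212;", "&#x2014;")]

# Encode as ISO 8859-1, replacing unencodable characters with decimal references.
body = f"wait {em_dash} what?".encode("iso-8859-1", errors="xmlcharrefreplace")

print(decoded)
print(body)
```

The numeric value (8212) comes from the code point, not from the document's encoding, which is why the same reference works whatever encoding we declare.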

Why do I need to write &amp; instead of &?

(How do I show HTML markup on the page?)

Certain characters have special meaning in HTML: '<' (less than), '>' (greater than), '&' (ampersand), '"'

(double quote) and ''' (single quote). In some circumstances, we need to 'escape' them. For instance, the

'<' character signals the start of a tag, and needs to be escaped. The ampersand signals the start of a

named entity or numeric reference, and must always be escaped (except in CDATA elements like SCRIPT

and STYLE). The quotation marks need only be escaped within attribute values surrounded by the same

quotation marks.

The first four have predefined entities in HTML, but not the single quote. XML defines &apos;, but HTML

does not, so a single quote (apostrophe, really) can only be escaped using a reference (&#39;). The

entities for the other four are as follows:

&lt; (<)

&gt; (>)

&amp; (&)

&quot; (")

Since the ampersand is very special, it must always be escaped, including when it's used inside attribute

values, such as the href attribute of links. Unfortunately, the ampersand is a very common argument

separator in URIs, which means that it's quite common to encounter ampersands in URIs.

Most of the time, it doesn't break anything (in HTML; XHTML is a different story). The error handling

routines in browsers recover from the error and it all works. But if we should happen to have a query

parameter whose name matches one of the predefined named entities in HTML …
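
A minimal Python sketch of the escaping, using the standard html module. The URI and its parameter names are hypothetical; "copy" was picked only because it happens to match the name of a predefined entity.

```python
import html

# Hypothetical URI whose second parameter name collides with an entity name.
raw_href = "search.php?q=cows&copy=2"

# Escape the ampersand before placing the URI inside an attribute value.
escaped_href = html.escape(raw_href, quote=True)
link = f'<a href="{escaped_href}">Results</a>'

# Showing markup as text on the page uses the same escaping.
shown = html.escape('<p class="note">Fish & chips</p>', quote=True)

print(link)
print(shown)
```

A parser that decodes the escaped attribute value gets the original URI back, so the link still points to the same place.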

How should heading elements be used?

HTML heading element types are H1, H2, H3, H4, H5 and H6. The number denotes the structural level of

the heading, which means we should look at headings as in those outlines we had to learn in school (and

promptly forgot about right after graduation).

The top-level heading on a page must be an H1. It should describe what the page is about. Most pages

will only have one H1 heading, but very complex documents that deal with several disparate topics may

need more.

H2 headings will mark up the next structural level. Any sub-levels under that will be H3, and so on. We

should never skip a heading level (downward). An H4 should not directly follow an H2; there should be an

H3 in between. (The validator will not complain; this is merely an issue of good practice.)

It's important to mark up headings with the Hx element types. Assistive technologies, e.g., screen

readers, can make use of a proper heading hierarchy to present an outline of the document. If we use

<font size="7">...</font>, they cannot.
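
Since the validator does not check the heading hierarchy, a small script can. The following Python sketch collects the heading levels with the standard html.parser module and reports places where a level is skipped downward; the class and function names, and the sample markup, are made up for this example.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the levels of H1-H6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag names in lower case.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_levels(markup):
    """Return (previous, current) pairs where a level is skipped downward."""
    collector = HeadingCollector()
    collector.feed(markup)
    pairs = zip(collector.levels, collector.levels[1:])
    return [(prev, cur) for prev, cur in pairs if cur > prev + 1]


good = "<h1>Cattle</h1><h2>Breeds</h2><h3>Angus</h3><h2>Feeding</h2>"
bad = "<h1>Cattle</h1><h2>Breeds</h2><h4>Angus</h4>"

print(skipped_levels(good))
print(skipped_levels(bad))
```

Going back up the outline (an H2 after an H3) is fine; only downward jumps of more than one level are reported.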

What are block-level and inline elements?

There are two main categories of element types in HTML: block-level elements and inline elements. The

differences between them are mainly semantic and grammatical.

Block-level elements are usually 'containers' for other elements. Examples of block-level elements are

DIV, P, FORM and TABLE. Some block-level elements, e.g., P, can only contain text and inline elements.

Others, e.g., FORM, can only contain block-level elements (in the Strict DTD). And some, like DIV, can

contain text, inline elements and block-level elements. Block-level elements are by default rendered

with an implicit line break before and after; in other words, we cannot have two block-level elements

side by side using only Strict HTML (that requires CSS).

Inline elements are elements that can occur 'inline' within text. Examples: A, EM, Q and SPAN. An inline

element can only contain text and other inline elements. An inline element cannot contain a block-level

element, with one exception: OBJECT (which is known as a replaced inline element, the same as IMG).

Inline elements, when rendered, do not have any implied line breaks before or after.

Sometimes there are additional restrictions on child element types. For instance, anchor links (A) can

contain text and inline elements, but not other A elements; you cannot nest links.

The rules are somewhat different between the Strict and the Transitional DTD. In the Strict DTD, some

block-level elements like BODY, BLOCKQUOTE and FORM can only have block-level children. In the

Transitional DTD they can also contain text and inline elements as immediate children.

Can I make an inline element block-level with CSS?

No. This is a common misconception. Beginners sometimes think that using display:block on an A

element will allow them to put a block-level H1 inside the link. That is not the case.

HTML has block-level and inline elements. CSS has block and inline boxes (plus a few others). Those are

very different things. The distinction in HTML has to do with semantics and syntax, while the distinction

in CSS has to do with rendering and presentation. By default, block-level elements generate block boxes,

and inline elements generate inline boxes (grossly simplified). The display property can change the type

of the generated box, but CSS cannot change the grammatical or syntactical rules of HTML.

Why are external CSS and JavaScript files a good idea?

From a maintenance perspective, a full separation between content, presentation and behaviour is

something to strive for. If we want to redesign our site, we can simply edit a single style sheet instead of

updating possibly thousands of HTML documents. If we use style attributes and write inline CSS, we will

have to edit those HTML documents when redesigning our site, instead of simply editing a single style

sheet file.

There is also another issue: both CSS and JavaScript often contain characters with special meaning in

HTML. If the CSS code or JavaScript code is embedded into the HTML document, these characters need

to be escaped. If we have embedded JavaScript, and use the archaic practice of 'hiding' the script code

within SGML comments (<!--…-->), we cannot use the decrement operator (e.g., --i), because the double

hyphen will terminate the comment.

Should I use P or BR?

The P element marks up a paragraph of text. A paragraph is one or more sentences that deal with a

single thought.

A line break (BR) is mostly a presentational thing, and should be handled by CSS rather than HTML.

However, there are a few cases where line breaks can be said to have semantic meaning, for instance in

poetry, song lyrics, postal addresses and computer code samples. These can be legitimate uses for BR,

but using BR to separate 'paragraphs' is definitely not a legitimate use.

On the other hand, P has a very clear semantic meaning: it denotes a paragraph. Sometimes web

authors tend to treat P as a generic block-level container element, but that's not correct. It's not

uncommon to see a LABEL and an INPUT field wrapped inside a P within a FORM, but I would argue that

it's semantically wrong. A label and an input field do not constitute a 'paragraph'.

What does 'semantic' mean?

se·man·tic [si-'man-tik]

adj. Of, pertaining to, or arising from the different meanings of words or other symbols.

When we talk about 'semantic markup', we mean the proper use of element types – based on their

meaning – to mark up content. The opposite is 'presentational markup' or 'tag soup', where authors

choose element types because of their default rendering, rather than their semantic meaning.

An example: This is a semantically correct way to mark up the top-level heading of a web page:

HTML Code:

<h1>Heading Text</h1>

This is an unsemantic (presentational) way to do it:

HTML Code:

<br><font size="7"><b>Heading Text</b></font><br>

The semantic richness of HTML is quite limited. HTML was originally used by physicists to exchange

scientific documents, and that shows quite clearly in the set of available element types. HTML would

probably have had a very different set of element types if it had been invented by accountants or

librarians.

HTML has two semantically neutral element types as well: the block-level DIV and the inline-level SPAN.

Neither of those two implies any particular semantics about its content; DIV is just a 'division of the

document', while SPAN is a 'span of characters'. On the other side of the spectrum we have element

types with clearly defined semantics: P (paragraph of text), TABLE (tabular data), UL (unordered list), etc.

The purpose of HTML is to mark up the semantics of a document, and – to some extent – to show the

structure of its content. It has nothing at all to do with the way this document looks in a browser

(although browsers have a default style for each element type).

Should I replace B/I with STRONG/EM?

Only if we really mean to emphasise something. They are not interchangeable.

In the Bad Old Days, authors would use B and I to emphasise words.

In the Equally Bad Modern Days, authors will use STRONG and EM to make text boldfaced or italic.

EM signifies semantic emphasis. The content should have some sort of emphasis (louder, more slowly)

when read out loud. STRONG indicates even stronger emphasis, but is now often considered to be

redundant (you could nest EM elements to indicate increasing emphasis). Some experts recommend

that STRONG be used only for certain page elements that should be clearly indicated (like a 'current

page' indicator), and not to mark up words or phrases in the body copy.

B and I have no semantics; they only indicate bold or italics. They are useful for adhering to typographic

conventions that do not have a semantically correct element type in HTML. For instance, ship names are

traditionally written in italics, but there is no SHIP element type in HTML. Thus we can use

<i>Titanic</i>.

Why are layout tables considered harmful?

Because it is semantically wrong to mark up non-tabular information as a TABLE.

Because they can cause accessibility or usability problems (especially with some assistive technologies),

particularly when nested several levels deep.

Because they mix presentational issues with the content, making it difficult or impossible to achieve

alternate styling and output device independence.

Because they bloat the markup with lots of unnecessary HTML tags, which can be detrimental for low-

bandwidth users (dial-up, mobile devices) as well as for the web server's load and bandwidth.

Should I use DIVs instead of layout tables?

No, we should use semantically correct element types as far as possible, and only revert to DIVs when

there are no other options.

Abusing DIVs is no better than abusing TABLEs. We can set id and class attributes on virtually any

element type. We can assign CSS rules to virtually any element type. Not only DIVs.

Are all TABLEs bad?

Not at all. TABLE is the proper, semantically correct element type to use for marking up tabular data:

information with relationships in two or more dimensions. Tables are not deprecated, only layout tables.

What is the use of the ADDRESS element type?

To mark up contact information for the page (or for a part of a page). This can be a postal address, an

email address, a telephone number, or virtually anything. ADDRESS is a block-level element which can

only contain text and inline elements. The default rendering is italic in most browsers, but that can easily

be changed with CSS.

A common misconception is that ADDRESS is meant to be used for marking up any postal address, but

that is not the case.

What is the use of the DFN element type?

To mark up the 'defining instance' of a term. It is a typographic convention, especially in scientific

documents, that the first time a new term – with which the reader cannot be expected to be familiar –

appears in the text, it is italicised. The default rendering of DFN is thus italic.

A common misconception is that DFN means 'definition', and many authors use it in the same way that

they use ABBR or ACRONYM: using the title attribute to provide an explanation of the term. A certain

term should only be marked up with DFN once in a document (where it is first used and explained).

What is the use of the VAR element type?

To mark up a variable, or placeholder, part of an example. It is a typographic convention to italicise such

variables, which will be replaced by actual data in real life use. For instance, in a telephone system

manual, the instruction for relaying incoming calls to another extension could look something like this:

HTML Code:

<kbd>* 21 * <var>extension</var> #</kbd>

Here, a VAR element is used to mark up 'extension' (which will be italic by default). Someone trying to

program the telephone system to relay his incoming calls to extension 942 would type '*21*942#'. Thus

the VAR element indicates that you shouldn't actually type 'e-x-t-e-n-s-i-o-n', but enter the actual

extension number instead. The word 'extension' is a variable.

A common misconception is that VAR should be used for marking up variables in programming code

samples.

Should I use quotation marks within or around a Q element?

No, the specification clearly says that it is the responsibility of the user agent to add quotation marks to

inline quotations. Unfortunately, Internet Explorer 6 and older do not comply with the specification and

will not add quotation marks. An option is to insert those with JavaScript for IE, and use some special

styling with CSS to distinguish quotations for IE users with JavaScript disabled. Some CSS-only solutions

have been proposed, but they will fail in non-CSS browsers like Lynx.

What is the difference between ABBR and ACRONYM?

No one really seems to know. Even the HTML specification contradicts itself.

ABBR was a Netscape extension to HTML during the browser wars. ACRONYM was Microsoft's

extension. Both meant the same thing, more or less. Both element types were incorporated into the

HTML specification, with different semantics. The problem is that no one seems to be able to explain

what those semantics are.

Let us look at a couple of dictionary definitions, then:

ab·bre·vi·a·tion [uh-bree-vee-'ey-shuhn]

n. A shortened or contracted form of a word or phrase, used to represent the whole.

ac·ro·nym ['ak-ruh-nim]

n. A word formed from the initial letters or groups of letters of words in a set phrase or series of words.

The definition for acronym says that it is a word, i.e., it can be pronounced. Thus, NATO would be an

acronym, formed from the initial letters in the phrase North Atlantic Treaty Organization. FBI, however,

would not be an acronym according to the dictionary definition, because it is not pronounced as a word,

but rather spelled out (eff bee eye). And this is where the problems begin. FBI is technically known as an

initialism, about which the dictionary has the following to say:

in·i·tial·ism [i-'nish-uh-liz-uhm]

n. 1. A name or term formed from the initial letters of a group of words and pronounced as a separate

word.

2. A set of initials representing a name, organization, or the like, with each letter pronounced separately.

The first definition is almost the same as for acronym, but the second is more relaxed. But there is no

INITIALISM element type in HTML, and the confusion is exacerbated by the fact that 'acronym' in normal

American parlance is used as a synonym for 'initialism'.

The HTML specification has the following definitions:

ABBR: Indicates an abbreviated form (e.g., WWW, HTTP, URI, Mass., etc.).

ACRONYM: Indicates an acronym (e.g., WAC, radar, etc.).

So far it looks like it is adhering to the dictionary definitions, which means that FBI should be marked up

with ABBR since it's not pronounceable as a word. However, a few paragraphs further down, the

specification says,

Western languages make extensive use of acronyms such as "GmbH", "NATO", and "F.B.I.", as well as

abbreviations like "M.", "Inc.", "et al.", "etc.".

Are you confused yet? I am. The safe thing to do then, should be to always use ABBR, since all acronyms

are abbreviations, but not vice versa. Aaahh … ahem … there's a slight problem with that. Microsoft

were so miffed when the W3C decided(?) to use ABBR for abbreviations and initialisms instead of their

ACRONYM, that they actually refused to support ABBR! They've finally promised to support ABBR in IE7,

only eight years after HTML4 became a recommendation, but there will still be millions of users with

older IE versions out there for many years to come.

So what is a poor web author to do? Why should we even bother? It might be nice to have an element

to attach a title attribute to, but we could use SPAN for that. The idea, allegedly, is that marking up

abbreviations and acronyms would be beneficial for assistive technologies; especially screen readers.

But screen readers tend to ignore ABBR and ACRONYM, since no one knows how to use them properly

and Microsoft doesn't support ABBR. Catch-22.

The answer to this frequently asked question is: I don't know! I personally use ABBR for obvious

abbreviations like 'Inc.' and for initialisms like 'FBI', and I use ACRONYM for things that can be

pronounced as words, like 'GIF'. But due to the ambiguity of the specification, I cannot fault anyone for

marking up 'FBI' as an acronym (although 'Inc.' certainly is not). And what about 'SQL', which some spell

out and others pronounce as 'sequel'? (I would use ABBR.)

Why is <feature X> deprecated?

The most common 'feature' that beginners ask about is the target attribute for links. This is deprecated

(disapproved) in HTML 4.01 Strict, but it's still valid in HTML 4.01 Transitional. Many other element types

and attributes that are allowed in Transitional are removed from Strict.

The reason for deprecating those things is that the W3C want to promote the separation between

content (HTML), presentation (CSS) and behaviour (JavaScript). Making an element centred within the

viewport is a presentational issue, thus it should be handled with CSS instead of a CENTER element.

Opening a new browser window is a behavioural issue, thus it should be handled with JavaScript instead

of a target attribute.

The deprecated features are those that arose during the browser war era of the late 1990s, when

browser vendors were competing by adding various extensions to HTML to make it into some sort of

page layout language. They were included in HTML 3.2 to bring some sort of order to the chaos, but this

is not what HTML was intended for. When HTML4 came out, the authors tried to 'reclaim the Web' by

deprecating what they saw as 'harmful' parts of HTML 3.2, at least in the Strict DTD.

In other words, things are deprecated for a reason. Don't use them unless you absolutely have to.

Must I have an ALT attribute for every image?

Yes, the alt attribute is required for the IMG element type. Why? Because not all users can perceive

images, and because not all user agents can understand or display images. Examples:

A person who is blind or has very low vision cannot see an image. A screen reader cannot describe an

image.

Users with slow connections (dial-up or mobile) sometimes disable images for faster surfing.

Text browsers like Lynx do not support images.

Search engine 'bots cannot understand images.

Thus we have to provide a text equivalent for each image, using the alt attribute. This should not

describe the image; it should convey the equivalent information. Writing good text equivalents is not

easy, and it takes a lot of practice. The text equivalent is displayed instead of the image.
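
Finding images that lack the attribute altogether is easy to automate, even if writing a good text equivalent is not. Here is a rough Python sketch based on the standard html.parser module; the class name, function name and file names are hypothetical.

```python
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Record the src of every IMG element that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            # An empty alt still counts as present; only a missing one is reported.
            if "alt" not in attributes:
                self.missing.append(attributes.get("src"))


def images_without_alt(markup):
    """Return the src values of all images that have no alt attribute."""
    checker = AltChecker()
    checker.feed(markup)
    return checker.missing


page = (
    '<img src="cow.jpg" alt="">'
    '<img src="chart.png" alt="Beef 60%, dairy 40%">'
    '<img src="logo.gif">'
)

print(images_without_alt(page))
```

A script like this can only tell us that a text equivalent is missing; whether an existing one is appropriate still depends on the context and has to be judged by a person.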

So what is a good text equivalent for a given image? That depends on the context in which the image is

used! It's not like there is a single 'perfect' text equivalent for each image. Let us look at an example: say

we have an image of a grazing cow. This particular cow happens to be an Aberdeen Angus. Let us then

consider a few use cases for this image.

In the first case, this image is used as a generic illustration for an article about beef cattle farming in

Scotland. The actual cow isn't germane to the article; it's just an illustration, a decorative design element

that draws the reader's eye and relieves the monotony of the text. In this case, the image doesn't

convey any relevant information. Therefore it should have a null (empty) text equivalent: alt="".

In the second case, the image is used on a children's website about farm animals. The page shows

pictures of various animals: a horse, a sheep, a pig, a cow, etc. Next to each image is a block of text that

presents some facts about each species. In this case, alt="Cow:" could be appropriate. It's not important

that it's an Aberdeen Angus; it represents bovine quadrupeds in general.

In the third case, the image is used on a site about different breeds of cattle. Here it is used to illustrate

what an Aberdeen Angus looks like, and how it is different from other breeds. The page comprises a

number of images, each with a caption that identifies the breed, but no other textual information. In

this case, the text equivalent should describe the particular attributes and traits that are specific to an

Aberdeen Angus: the robust build, the massive chest, the relatively short legs, the buffalo-like hump

behind the head, etc.

In the fourth case, the image is used on a photographer's portfolio page. It's one image among several

others, with very different motifs. This is one of the few cases where the alt attribute might actually

include a description of the image itself, e.g., 'A black Aberdeen Angus grazing in the sunshine with Ben

Nevis in the background.'

As we can see, the appropriate text equivalent depends on the context. Sometimes (often, actually) it

should be null, because the image doesn't convey any information that isn't available in the

accompanying text. Some claim that such images should be background images specified via CSS, but

there are many cases where that is impractical and where the image is really part of the content – even

though it doesn't convey any useful information to those who cannot see it.

For images containing text, the text equivalent should of course be the same text as in the image. For

things like pie charts, the text equivalent should convey information about the percentages: the same

information as the image conveys.

The alt text shouldn't be too long. Some browsers don't word wrap them, and they cannot be formatted

in any way. If we need a longer text equivalent, we should put it somewhere else and link to it via the

longdesc attribute.

Internet Explorer and old Netscape browsers display the alt attribute in a tool-tip when the user hovers

the mouse pointer over the image. This is wrong. We should use the title attribute for 'tool-tip'

information. To suppress the tool-tip for alt texts, we can use an empty title: title="" (at least in IE).

What is the difference between CLASS and ID?

An ID uniquely identifies a particular element in an HTML document. It's like a social security number,

providing a unique handle for that element. Just as two people cannot have the same SS#, no two

elements in a document can have the same ID. IDs must be unique within the page.

A class says that an element has some traits which it (possibly) shares with other elements. An element

can belong to more than one class. An analogy could be professions: a person could be both a carpenter

and a nurse, and there are many carpenters and many nurses. (They all have unique social security

numbers, though.)

Both IDs and classes are mainly used with CSS and/or JavaScript. In CSS, an ID has higher specificity than

a class, making it easy to specify special rules for a specific element. With JavaScript we can look up an

element using its ID (document.getElementById()).
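
The specificity rule can be illustrated with a deliberately simplified Python sketch. It only counts IDs, classes and type names in plain selectors; attribute selectors, pseudo-classes and the other details of the real CSS algorithm are ignored, and the function name and sample rules are made up.

```python
import re


def specificity(selector):
    """Return (ids, classes, types) for a simple selector.

    Only type names, .classes and #ids separated by whitespace are
    counted; everything else in real CSS selectors is ignored.
    """
    ids = len(re.findall(r"#[\w-]+", selector))
    classes = len(re.findall(r"\.[\w-]+", selector))
    types = len(re.findall(r"(?:^|\s)[a-zA-Z][\w-]*", selector))
    return (ids, classes, types)


# Two hypothetical rules that both match the same element.
rules = [(".menu", "blue"), ("#menu", "red")]

# The rule with the highest specificity wins.
winner = max(rules, key=lambda rule: specificity(rule[0]))

print(specificity("#menu"), specificity("ul.menu li.current a"))
print(winner)
```

Tuples compare element by element, so a single ID outweighs any number of classes and type names in this model.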

We assign IDs to page elements that can occur at most once per page, like a navigation menu, a footer, a

sidebar, etc. We can also assign IDs to specific elements that only occur once in the whole site, like a

specific image, if we want to have certain CSS rules for it or manipulate it with JavaScript.

We assign classes to elements that share some common traits, usually display properties via CSS rules.

IDs and class names should be as 'semantic' as possible. They should describe what something is, not

what it looks like. Thus, id="menu" is much better than id="left"; especially if we redesign and move the

menu to the right-hand side.

IDs and class names are case sensitive, even in HTML. We shouldn't rely on that, though – i.e., we should

not have names that differ only in case.
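
Both rules (unique IDs, and no names that differ only in case) can be checked with a short script. The following Python sketch uses the standard html.parser module; the class name, function name and sample markup are made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect every id attribute value in the document."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def id_problems(markup):
    """Return (duplicate ids, groups of ids that differ only in case)."""
    collector = IdCollector()
    collector.feed(markup)
    counts = Counter(collector.ids)
    duplicates = sorted(i for i, n in counts.items() if n > 1)
    by_folded = {}
    for identifier in counts:
        by_folded.setdefault(identifier.lower(), set()).add(identifier)
    case_clashes = sorted(
        sorted(group) for group in by_folded.values() if len(group) > 1
    )
    return duplicates, case_clashes


page = (
    '<div id="menu"></div><ul id="Menu"></ul>'
    '<p id="footer"></p><div id="footer"></div>'
)

print(id_problems(page))
```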

=====================================================================================

Frequently asked questions by the external examiner

What exactly does your project do?

What are the roles of the different team members?

Why did you choose to develop this particular project?

What extra functionality does it provide that existing software does not?

Frontend and backend of your project.

Why this programming language? (Don’t say that you are comfortable with it; rather,

discuss the properties of your project and match them with the respective programming

language.)

Why this backend? Why not others?

In designing, you MUST have a clear picture of your project.

ER diagrams, DFDs, class, use case and sequence diagrams (you must have a

proper understanding of these terms).

Which s/w engineering model?

Cost estimation of your project.

SWOT analysis of your project.

Well-written references (don't write www.google.com or www.wikipedia.org)

