An Introduction to Java Servlets
by Hans Bergsten
March 10, 1999

Java servlets are making headlines these days, claiming to solve many of the problems associated with CGI and proprietary server APIs. In this article I will describe the overall servlet architecture and what you need to develop your application with servlets. I will use several code examples to show you how to use the Servlet API, and compare it with CGI and proprietary server APIs where appropriate.
I assume that you're familiar with HTTP and CGI or a proprietary server API like NSAPI or ISAPI. I also assume that you are somewhat familiar with Java programming or some other object-oriented language, such as C++. Even if you're not a Java programmer you should be able to appreciate the benefits of servlets from reading this article, but before you develop your own servlets I recommend that you first learn the Java basics.
The Dark Ages
Early in the World Wide Web's history, the Common Gateway Interface (CGI) was defined to allow Web servers to process user input and serve dynamic content. CGI programs can be developed in any scripting or programming language, but Perl is by far the most common language. CGI is supported by virtually all Web servers and many Perl modules are available as freeware or shareware to handle most tasks.
But CGI is not without drawbacks. Performance and scalability are big problems since a new process is created for each request, quickly draining a busy server of resources. Sharing resources such as database connections between scripts or multiple calls to the same script is far from trivial, leading to repeated execution of expensive operations.
Security is another big concern. Most Perl scripts use the command shell to execute OS commands with user-supplied data, for instance to send mail, search for information in a file, or just leverage OS commands in general. This use of a shell opens up many opportunities for a creative hacker to make the script remove all files on the server, mail the server's password file to a secret account, or do other bad things that the script writer didn't anticipate.
The Web server vendors defined APIs to solve some of these problems, notably Microsoft's ISAPI and Netscape's NSAPI. But an application written to these proprietary APIs is married to one particular server vendor. If you need to move the application to a server from another vendor, you have to start from scratch. Another problem with this approach is reliability. The APIs typically support C/C++ code executing in the Web server process. If the application crashes, e.g. due to a bad pointer or division by zero, it brings the Web server down with it.
Servlets to the rescue!
The Servlet API was developed to leverage the advantages of the Java platform to solve the issues of CGI and proprietary APIs. It's a simple API supported by virtually all Web servers and even load-balancing, fault-tolerant Application Servers. It solves the performance problem by executing all requests as threads in one process, or in a load-balanced system, in one process per server in the cluster. Servlets can easily share resources as you will see in this article.
Security is improved in many ways. First of all, you rarely need to let a shell execute commands with user-supplied data since the Java APIs provide access to all commonly used functions. You can use JavaMail to read and send email, Java Database Connectivity (JDBC) to access databases, the File class and related classes to access the file system, and RMI, CORBA and Enterprise JavaBeans (EJB) to access legacy systems. The Java security model makes it possible to implement fine-grained access controls, for instance only allowing access to a well-defined part of the file system. Java's exception handling also makes a servlet more reliable than proprietary C/C++ APIs - a divide by zero is reported as an error instead of crashing the Web server.
The Servlet Run-time Environment
A servlet is a Java class and therefore needs to be executed in a Java VM by a service we call a servlet engine.
The servlet engine loads the servlet class the first time the servlet is requested, or optionally already when the servlet engine is started. The servlet then stays loaded to handle multiple requests until it is explicitly unloaded or the servlet engine is shut down.
Some Web servers, such as Sun's Java Web Server (JWS), W3C's Jigsaw and Gefion Software's LiteWebServer (LWS) are implemented in Java and have a built-in servlet engine. Other Web servers, such as Netscape's Enterprise Server, Microsoft's Internet Information Server (IIS) and the Apache Group's Apache, require a servlet engine add-on module. The add-on intercepts all requests for servlets, executes them and returns the response through the Web server to the client. Examples of servlet engine add-ons are Gefion Software's WAICoolRunner, IBM's WebSphere, Live Software's JRun and New Atlanta's ServletExec.
All Servlet API classes and a simple servlet-enabled Web server are combined into the Java Servlet Development Kit (JSDK), available for download at Sun's official Servlet site (see Resources below). To get started with servlets I recommend that you download the JSDK and play around with the sample servlets.
As this article is being written (early March 1999), the released version of the JSDK is for the Servlet 2.0 API, with an Early Access version of the JSDK 2.1 available at Java Developer's Connection. All servlet engines mentioned above support the Servlet 2.0 API, and a few also support the 2.1 API. The examples of 2.1 API features in this article are clearly marked so you won't be surprised when they don't work with your 2.0 servlet engine.
Servlet Interface and Life Cycle
Let's implement our first servlet. A servlet is a Java class that implements the Servlet interface. Three methods in this interface define the servlet's life cycle:
• public void init(ServletConfig config) throws ServletException
This method is called once when the servlet is loaded into the servlet engine, before the servlet is asked to process its first request.
• public void service(ServletRequest request, ServletResponse response) throws ServletException, IOException
This method is called to process a request. It can be called zero, one or many times until the servlet is unloaded. Multiple threads (one per request) can execute this method in parallel so it must be thread safe.
• public void destroy()
This method is called once just before the servlet is unloaded and taken out of service.
The init method has a ServletConfig parameter. The servlet can read its initialization arguments through the ServletConfig object. How the initialization arguments are set is servlet engine dependent but they are usually defined in a configuration file.
A typical example of an initialization argument is a database identifier. A servlet can read this argument from the ServletConfig at initialization and then use it later to open a connection to the database during processing of a request:
...
private String databaseURL;

public void init(ServletConfig config) throws ServletException {
    super.init(config);
    // Read the "database" initialization argument once, at load time
    databaseURL = config.getInitParameter("database");
}
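The calling order described above can be tried out without a servlet engine. The following is a minimal sketch in plain Java, not the real Servlet API: MiniServlet, CountingServlet and LifeCycleDemo are hypothetical stand-ins, the Map plays the role of the ServletConfig initialization arguments, and the database value is made up. It only illustrates that init runs once, service runs once per request, and destroy runs once at the end.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Servlet interface (not javax.servlet.Servlet).
interface MiniServlet {
    void init(Map<String, String> initArgs);
    String service(String request);
    void destroy();
}

// A servlet-like class that records the order of its life-cycle calls.
class CountingServlet implements MiniServlet {
    final List<String> calls = new ArrayList<>();
    private String databaseURL;

    public void init(Map<String, String> initArgs) {
        // Read the initialization argument once, before any request is served
        databaseURL = initArgs.get("database");
        calls.add("init");
    }

    public String service(String request) {
        calls.add("service");
        return request + " handled using " + databaseURL;
    }

    public void destroy() {
        calls.add("destroy");
    }
}

class LifeCycleDemo {
    // Plays the role of the servlet engine: init once, service per request, destroy once.
    static List<String> run(CountingServlet servlet, String... requests) {
        Map<String, String> initArgs = new HashMap<>();
        initArgs.put("database", "jdbc:example:demo"); // made-up value for illustration
        servlet.init(initArgs);
        for (String request : requests) {
            servlet.service(request);
        }
        servlet.destroy();
        return servlet.calls;
    }

    public static void main(String[] args) {
        System.out.println(run(new CountingServlet(), "req1", "req2"));
    }
}
```

A real engine would keep the servlet loaded between requests rather than destroying it right away; the sketch compresses the whole life cycle into one call for brevity.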
The Servlet API is structured to make servlets that use a different protocol than HTTP possible. The javax.servlet package contains interfaces and classes intended to be protocol independent and the javax.servlet.http package contains HTTP specific interfaces and classes. Since this is just an introduction to servlets I will ignore this distinction here and focus on HTTP servlets. Our first servlet, named ReqInfoServlet, will therefore extend a class named HttpServlet. HttpServlet is part of the JSDK and implements the Servlet interface plus a number of convenience methods. We define our class like this:
import java.io.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class ReqInfoServlet extends HttpServlet {

...

}
An important set of methods in HttpServlet are the ones that specialize the service method in the Servlet interface. The implementation of service in HttpServlet looks at the type of request it's asked to handle (GET, POST, HEAD, etc.) and calls a specific method for each type. This way the servlet developer is relieved from handling the details about obscure requests like HEAD, TRACE and OPTIONS and can focus on taking care of the more common request types, i.e. GET and POST. In this first example we will only implement the doGet method.
protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {

    ...

}
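The dispatching that HttpServlet performs in service can be pictured roughly as follows. This is a simplified sketch in plain Java, not the actual HttpServlet source: MethodDispatcher, InfoHandler and the string-returning doGet and doPost are hypothetical, and the real class handles more request types and reports errors through the response object instead of returning strings.

```java
// Hypothetical, simplified model of a base class that routes a request
// to a method chosen by the HTTP method name.
class MethodDispatcher {
    // Entry point, comparable in spirit to service()
    String service(String method, String path) {
        switch (method) {
            case "GET":
                return doGet(path);
            case "POST":
                return doPost(path);
            default:
                return "501 Not Implemented";
        }
    }

    // Defaults report that the method is not supported; subclasses override them.
    protected String doGet(String path) {
        return "GET not supported";
    }

    protected String doPost(String path) {
        return "POST not supported";
    }
}

// A subclass only overrides the request types it cares about.
class InfoHandler extends MethodDispatcher {
    @Override
    protected String doGet(String path) {
        return "info for " + path;
    }
}

class DispatchDemo {
    public static void main(String[] args) {
        MethodDispatcher handler = new InfoHandler();
        System.out.println(handler.service("GET", "/foo"));
        System.out.println(handler.service("POST", "/foo"));
    }
}
```

The point of the design is that the subclass never inspects the request type itself; it simply overrides the methods for the types it wants to support.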
Request and Response Objects
The doGet method has two interesting parameters: HttpServletRequest and HttpServletResponse. These two objects give you full access to all information about the request and let you control the output sent to the client as the response to the request.
With CGI you read environment variables and stdin to get information about the request, but the names of the environment variables may vary between implementations and some are not provided by all Web servers. The HttpServletRequest object provides the same information as the CGI environment variables, plus more, in a standardized way. It also provides methods for extracting HTTP parameters from the query string or the request body depending on the type of request (GET or POST). As a servlet developer you access parameters the same way for both types of requests. Other methods give you access to all request headers and help you parse date and cookie headers.
Instead of writing the response to stdout as you do with CGI, you get an OutputStream or a PrintWriter from the HttpServletResponse. The OutputStream is intended for binary data, such as a GIF or JPEG image, and the PrintWriter for text output. You can also set all response headers and the status code, without having to rely on special Web server CGI configurations such as Non Parsed Headers (NPH). This makes your servlet easier to install.
Let's implement the body of our doGet method and see how we can use these methods. We will read most of the information we can get from the HttpServletRequest (saving some methods for the next example) and send the values as the response to the request.
protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    response.setContentType("text/html");
    PrintWriter out = response.getWriter();

    // Print the HTML header
    out.println("<HTML><HEAD><TITLE>");
    out.println("Request info");
    out.println("</TITLE></HEAD>");

    // Print the HTML body
    out.println("<BODY><H1>Request info</H1><PRE>");
    out.println("getCharacterEncoding: " + request.getCharacterEncoding());
    out.println("getContentLength: " + request.getContentLength());
    out.println("getContentType: " + request.getContentType());
    out.println("getProtocol: " + request.getProtocol());
    out.println("getRemoteAddr: " + request.getRemoteAddr());
    out.println("getRemoteHost: " + request.getRemoteHost());
    out.println("getScheme: " + request.getScheme());
    out.println("getServerName: " + request.getServerName());
    out.println("getServerPort: " + request.getServerPort());
    out.println("getAuthType: " + request.getAuthType());
    out.println("getMethod: " + request.getMethod());
    out.println("getPathInfo: " + request.getPathInfo());
    out.println("getPathTranslated: " + request.getPathTranslated());
    out.println("getQueryString: " + request.getQueryString());
    out.println("getRemoteUser: " + request.getRemoteUser());
    out.println("getRequestURI: " + request.getRequestURI());
    out.println("getServletPath: " + request.getServletPath());

    // Print every parameter with all of its values
    out.println();
    out.println("Parameters:");
    Enumeration paramNames = request.getParameterNames();
    while (paramNames.hasMoreElements()) {
        String name = (String) paramNames.nextElement();
        String[] values = request.getParameterValues(name);
        out.println(" " + name + ":");
        for (int i = 0; i < values.length; i++) {
            out.println(" " + values[i]);
        }
    }

    // Print every request header
    out.println();
    out.println("Request headers:");
    Enumeration headerNames = request.getHeaderNames();
    while (headerNames.hasMoreElements()) {
        String name = (String) headerNames.nextElement();
        String value = request.getHeader(name);
        out.println(" " + name + " : " + value);
    }

    // Print every cookie; some servlet engines return null
    // instead of an empty array when the request has no cookies
    out.println();
    out.println("Cookies:");
    Cookie[] cookies = request.getCookies();
    if (cookies != null) {
        for (int i = 0; i < cookies.length; i++) {
            String name = cookies[i].getName();
            String value = cookies[i].getValue();
            out.println(" " + name + " : " + value);
        }
    }

    // Print the HTML footer
    out.println("</PRE></BODY></HTML>");
    out.close();
}
The doGet method above uses most of the methods in HttpServletRequest that provide information about the request. You can read all about them in the Servlet API documentation so here we'll just look at the most interesting ones.
getParameterNames and getParameterValues help you access HTTP parameters no matter if the servlet was requested with the GET or the POST method. getParameterValues returns a String array because an HTTP parameter may have multiple values. For instance, if you request the servlet with a URL like http://company.com/servlet/ReqInfoServlet?foo=bar&foo=baz you'll see that the foo parameter has two values: bar and baz. The same is true if you use the same name for more than one HTML FORM element and submit the form with the POST method.
If you're sure that an HTTP parameter can only have one value you can use the getParameter method instead of getParameterValues. It returns a single String and if there are multiple values it returns the first value received with the request.
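To see why a parameter lookup has to be able to return several values, here is a rough sketch of query-string parsing in plain Java. It is not how any particular servlet engine implements getParameterValues; QueryParams and QueryParamsDemo are hypothetical names, and the sketch assumes UTF-8 percent-decoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that collects every value seen for each parameter name.
class QueryParams {
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    QueryParams(String queryString) {
        if (queryString == null || queryString.isEmpty()) {
            return;
        }
        for (String pair : queryString.split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = decode(eq < 0 ? pair : pair.substring(0, eq));
            String value = decode(eq < 0 ? "" : pair.substring(eq + 1));
            // Repeated names append to the same list instead of overwriting
            values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // All values in the order received, or null if the parameter is absent
    String[] getParameterValues(String name) {
        List<String> list = values.get(name);
        return list == null ? null : list.toArray(new String[0]);
    }

    // Only the first value received, or null if the parameter is absent
    String getParameter(String name) {
        List<String> list = values.get(name);
        return list == null ? null : list.get(0);
    }
}

class QueryParamsDemo {
    public static void main(String[] args) {
        QueryParams params = new QueryParams("foo=bar&foo=baz");
        System.out.println(String.join(",", params.getParameterValues("foo")));
        System.out.println(params.getParameter("foo"));
    }
}
```

With the query string from the example URL, the first line printed lists both values of foo and the second shows only the first one.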
You have access to all HTTP request headers with the getHeaderNames and getHeader methods. getHeader returns the String value of the header. If you know that the header has a date value or an integer value you can get help converting the header to an appropriate format. getDateHeader returns a date as the number of milliseconds since January 1, 1970, 00:00:00 GMT. This is the standard numeric representation of a timestamp in Java and you can use it to construct a Date object for further manipulation. getIntHeader returns the header value as an int.
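As a rough illustration of the numeric representation that getDateHeader returns, the sketch below converts an HTTP date string to milliseconds since the epoch using only the standard library. HttpDates and toEpochMillis are hypothetical names, the header value is made up, and a servlet engine may well parse dates differently (for example by also accepting older date formats).

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Date;

class HttpDates {
    // Parses an RFC 1123 date such as "Thu, 01 Jan 1970 00:00:01 GMT"
    // and returns the number of milliseconds since January 1, 1970, 00:00:00 GMT
    static long toEpochMillis(String headerValue) {
        return ZonedDateTime.parse(headerValue, DateTimeFormatter.RFC_1123_DATE_TIME)
                .toInstant()
                .toEpochMilli();
    }

    public static void main(String[] args) {
        long millis = toEpochMillis("Wed, 10 Mar 1999 12:00:00 GMT");
        Date date = new Date(millis); // the long value feeds straight into Date
        System.out.println(millis + " -> " + date);
    }
}
```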
getCookies parses the Cookie header and returns all cookies as an array of Cookie objects. To add a cookie to a response the HttpServletResponse class provides an addCookie method that takes a Cookie object as its argument. This saves you from dealing with the format for different versions of cookie header strings.
If you compile the ReqInfoServlet and install it in your servlet engine you can now invoke it through a browser with a URL like http://company.com/servlet/ReqInfoServlet/foo/bar?fee=baz. If everything goes as planned you will see something like this in your browser:
________________________________________
Request info
getCharacterEncoding:
getContentLength: -1
getContentType: null
getProtocol: HTTP/1.0

getRemoteAddr: 127.0.0.1
getRemoteHost: localhost
getScheme: http
getServerName: company.com
getServerPort: 80
getAuthType: null
getMethod: GET
getPathInfo: /foo/bar
getPathTranslated: D:\PROGRA~1\jsdk2.1\httproot\servlet\ReqInfoServlet\foo\bar
getQueryString: fee=baz
getRemoteUser: null
getRequestURI: /servlet/ReqInfoServlet/foo/bar
getServletPath: /servlet/ReqInfoServlet

Parameters:
fee:
baz

Request headers:
Connection : Keep-Alive
User-Agent : Mozilla/4.5 [en] (WinNT; I)
Host : company.com
Accept : image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
Accept-Encoding : gzip
Accept-Language : en
Accept-Charset : iso-8859-1,*,utf-8
Cookie : TOMCATID=TO04695278486734222MC1010AT

Cookies:
TOMCATID : TO04695278486734222MC1010AT
________________________________________
What if you want this servlet to handle both GET and POST requests? The default implementations of doGet and doPost return a message saying the method is not implemented. So far we have only provided a new implementation of doGet. To handle a POST request the same way we can simply call doGet from doPost:
protected void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    doGet(request, response);
}
Persistent and Shared Data
One of the more interesting features of the Servlet API is the support for persistent data. Since a servlet stays loaded between requests, and all servlets are loaded in the same process, it's easy to remember information from one request to another and to let different servlets share data.
The Servlet API contains a number of mechanisms to support this directly. We'll look at some of them in detail below. Another powerful mechanism is to use a singleton object to handle shared resources. You can read more about this technique in Improved Performance with a Connection Pool.
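The singleton idea can be sketched in a few lines. This is only an illustration, not the code from the article referenced above: SharedResourcePool, its size of 2 and the String "connections" are hypothetical placeholders for real resources such as database connections.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// One instance per Java VM, shared by every servlet loaded in that process.
final class SharedResourcePool {
    private static final SharedResourcePool INSTANCE = new SharedResourcePool(2);

    private final Deque<String> free = new ArrayDeque<>();

    private SharedResourcePool(int size) {
        for (int i = 1; i <= size; i++) {
            free.push("connection-" + i); // placeholder for an expensive resource
        }
    }

    static SharedResourcePool getInstance() {
        return INSTANCE;
    }

    // Hands out a free resource, or null if all are in use
    synchronized String acquire() {
        return free.poll();
    }

    // Returns a resource to the pool so other requests can reuse it
    synchronized void release(String resource) {
        free.push(resource);
    }

    synchronized int available() {
        return free.size();
    }
}

class PoolDemo {
    public static void main(String[] args) {
        SharedResourcePool pool = SharedResourcePool.getInstance();
        String con = pool.acquire();
        try {
            System.out.println("using " + con + ", " + pool.available() + " left");
        } finally {
            pool.release(con); // always give the resource back
        }
    }
}
```

Because the methods are synchronized, request threads running in parallel can share the pool safely. A production pool would typically also block or grow when it runs empty, which this sketch leaves out.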
Session Tracking
An HttpSession interface was introduced in the 2.0 version of the Servlet API. Objects implementing this interface can hold information for one user session between requests. You start a new session by requesting an HttpSession object from the HttpServletRequest in your doGet or doPost method:
HttpSession session = request.getSession(true);
This method takes a boolean argument. true means a new session shall be started if none exists, while false only returns an existing session. The HttpSession object is unique for one user session. The Servlet API supports two ways to associate multiple requests with a session: cookies and URL rewriting.
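The meaning of the boolean argument can be pictured with a small model of a session table. This is a hedged sketch in plain Java, not a servlet engine's implementation: MiniSession, SessionTable, the UUID-based session IDs and the Map standing in for the session's stored values are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for an HttpSession: an ID plus named values.
class MiniSession {
    final String id;
    final Map<String, Object> values = new HashMap<>();

    MiniSession(String id) {
        this.id = id;
    }
}

class SessionTable {
    private final Map<String, MiniSession> sessions = new HashMap<>();

    // Returns the existing session for the ID the client sent, if any.
    // A new session is started only when none exists and 'create' is true.
    synchronized MiniSession getSession(String requestedId, boolean create) {
        MiniSession session = requestedId == null ? null : sessions.get(requestedId);
        if (session == null && create) {
            session = new MiniSession(UUID.randomUUID().toString());
            sessions.put(session.id, session);
        }
        return session;
    }
}

class SessionDemo {
    public static void main(String[] args) {
        SessionTable table = new SessionTable();
        // First request: the client has no session ID yet
        System.out.println(table.getSession(null, false));
        MiniSession session = table.getSession(null, true);
        session.values.put("cart", "book");
        // A later request carrying the same ID sees the same data
        System.out.println(table.getSession(session.id, false).values.get("cart"));
    }
}
```

The ID created here is what the engine would have to get back from the client on each later request in order to find the right session.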
If cookies are used, a cookie with a unique session ID is sent to the client when the session is established. The client then includes the cookie in all subsequent requests so the servlet engine can figure out which session the request is associated with. URL rewriting is intended for clients that don't support cookies or when the user has disabled cookies. With URL rewriting the session ID is encoded in the URLs your servlet sends to the client. When the user clicks on an encoded URL, the session ID is sent to the server where it can be extracted and the request associated with the correct session as above. To use URL rewriting you must make sure all URLs that you send to the client are encoded with the encodeURL or encodeRedirectURL methods in HttpServletResponse (these are the 2.1 API names; the 2.0 API spells them encodeUrl and encodeRedirectUrl).
An HttpSession can store any type of object. A typical example is a database connection allowing multiple requests to be part of the same database transaction, or information about purchased products in a shopping cart application so the user can add items to the cart while browsing through the site. To save an object in an HttpSession you use the putValue method:
...
Connection con = DriverManager.getConnection(databaseURL, user, password);
session.putValue("myappl.connection", con);
...
In another servlet, or the same servlet processing another request, you can get the object with the getValue method:
...
HttpSession session = request.getSession(true);
Connection con = (Connection) session.getValue("myappl.connection");
if (con != null) {
    // Continue the database transaction
    ...
You can explicitly terminate (invalidate) a session with the invalidate method or let it be timed out by the servlet engine. The session times out if no request associated with the session is received within a specified interval. Most servlet engines allow you to specify the length of the interval through a configuration option. In the 2.1 version of the Servlet API there's also a setMaxInactiveInterval method so you can adjust the interval to meet the needs of each individual application.
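The time-out rule itself is simple enough to sketch. The class below is a hypothetical model in plain Java (SessionClock, touch and isTimedOut are made-up names, and the interval is assumed to be given in seconds); it shows only the comparison an engine has to make, not how any engine schedules the check.

```java
// Hypothetical model of the inactivity check behind a session time-out.
class SessionClock {
    private final long maxInactiveMillis;
    private long lastAccessedMillis;

    SessionClock(int maxInactiveSeconds, long createdAtMillis) {
        this.maxInactiveMillis = maxInactiveSeconds * 1000L;
        this.lastAccessedMillis = createdAtMillis;
    }

    // Called for every request associated with the session
    void touch(long nowMillis) {
        lastAccessedMillis = nowMillis;
    }

    // True once more than the configured interval has passed without a request
    boolean isTimedOut(long nowMillis) {
        return nowMillis - lastAccessedMillis > maxInactiveMillis;
    }
}

class SessionClockDemo {
    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        SessionClock clock = new SessionClock(1800, start); // 30 minute interval
        System.out.println(clock.isTimedOut(start + 60_000L));     // one minute later
        System.out.println(clock.isTimedOut(start + 3_600_000L));  // one hour later
    }
}
```

Passing the current time in as an argument keeps the rule easy to test; every request resets the clock through touch, so only a genuinely idle session expires.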
ServletContext Attributes
All servlets belong to one servlet context. In implementations of the 1.0 and 2.0 versions of the Servlet API all servlets on one host belong to the same context, but with the 2.1 version of the API the context becomes more powerful and can be seen as the humble beginnings of an Application concept. Future versions of the API will make this even more pronounced.
Many servlet engines implementing the Servlet 2.1 API let you group a set of servlets into one context and support more than one context on the same host. The ServletContext in the 2.1 API is responsible for the state of its servlets and knows about resources and attributes available to the servlets in the context. Here we will only look at how ServletContext attributes can be used to share information among a group of servlets.
There are three ServletContext methods dealing with context attributes: getAttribute, setAttribute and removeAttribute. In addition the servlet engine may provide ways to configure a servlet context with initial attribute values. This serves as a welcome addition to the servlet initialization arguments for configuration information used by a group of servlets, for instance the database identifier we talked about above, a style sheet URL for an application, the name of a mail server, etc.
A servlet gets a reference to its ServletContext object through the ServletConfig object. The HttpServlet actually provides a convenience method (through its superclass GenericServlet) named getServletContext to make it really easy:
...
ServletContext context = getServletContext();
String styleSheet = request.getParameter("stylesheet");
if (styleSheet != null) {
    // Specify a new style sheet for the application
    context.setAttribute("stylesheet", styleSheet);
}
...
The code above could be part of an application configuration servlet, processing the request from an HTML FORM where a new style sheet can be specified for the application. All servlets in the application that generate HTML can then use the style sheet attribute like this:
...
ServletContext context = getServletContext();
// getAttribute returns an Object, so cast it to the expected type
String styleSheet = (String) context.getAttribute("stylesheet");
out.println("<HTML><HEAD>");
out.println("<LINK HREF=\"" + styleSheet + "\" TYPE=\"text/css\" REL=\"STYLESHEET\">");
...
Request Attributes and Resources
The 2.1 version of the API adds two more mechanisms for sharing data between servlets: request attributes and resources.
The getAttribute, getAttributeNames and setAttribute methods were added to the HttpServletRequest class (or to be picky, to the ServletRequest superclass). They are primarily intended to be used in concert with the RequestDispatcher, an object that can be used to forward a request from one servlet to another and to include the output from one servlet in the output from the main servlet.
The getResource and getResourceAsStream methods in the ServletContext class give you access to external resources, such as an application configuration file. You may be familiar with the methods with the same names in the ClassLoader. The ServletContext methods, however, can provide access to resources that are not necessarily files. A resource can be stored in a database, available through an LDAP server, anything the servlet engine vendor decides to support. The servlet engine provides a context configuration option where you specify the root for the resource base, be it a directory path, an HTTP URL, a JDBC URL, etc.
Examples of how to use these methods may be the subject of a future article. Until then you can read about them in the Servlet 2.1 specification.
Multithreading
As you have seen above, concurrent requests for a servlet are handled by separate threads executing the corresponding request processing method (e.g. doGet or doPost). It's therefore important that these methods are thread safe.
The easiest way to guarantee that the code is thread safe is to avoid instance variables altogether and instead pass all information needed by a method as arguments. For instance:
private String someParam;

protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {

    someParam = request.getParameter("someParam");
    processParam();
}

private void processParam() {
    // Do something with someParam
}
This code is not safe. If the doGet method is executed by two threads it's likely that the value of the someParam instance variable is replaced by the second thread while the first thread is still using it.
A thread safe alternative is:
protected void doGet(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {

    // A local variable is private to the thread executing this call
    String someParam = request.getParameter("someParam");
    processParam(someParam);
}

private void processParam(String someParam) {
    // Do something with someParam
}
Here processParam gets all the data it needs as arguments instead of relying on instance variables.
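The thread-safe pattern can be exercised without a servlet engine. The sketch below is plain Java with hypothetical names (SafeHandler, handle, ThreadSafetyDemo); it calls one shared handler from several threads at once and collects what each caller got back. Because all state lives in arguments and local variables, no thread can overwrite another thread's value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SafeHandler {
    // No instance variables: each call works only on its own arguments and locals
    String handle(String someParam) {
        return processParam(someParam);
    }

    private String processParam(String someParam) {
        return "processed:" + someParam;
    }
}

class ThreadSafetyDemo {
    // Calls one shared handler from several threads and returns the results
    // in submission order
    static List<String> runConcurrently(int requests) {
        SafeHandler handler = new SafeHandler(); // one instance, like one loaded servlet
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                final String param = "p" + i;
                Callable<String> task = () -> handler.handle(param);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(3));
    }
}
```

Every result matches the parameter its own task passed in. The unsafe variant with a shared instance variable gives no such guarantee, although a short test run may not always expose the problem.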
Another reason to avoid instance variables is that in a multi-server system, there may be one instance of the servlet for each server and requests for the same servlet may be distributed between the servers. Keeping track of information in instance variables in this scenario doesn't work at all. In this type of system you can instead use the HttpSession object, the ServletContext attributes, or an external data store such as a database or an RMI/CORBA service to maintain the application state. Even if you start out with a small, single-server system it's a good idea to write your servlets so that they can scale to a large, multi-server system the day you strike oil.
Resources
This article barely scratches the surface of the Servlet API and all the things you can do with servlets. You can learn more by visiting some of the Web sites below:

http://java.sun.com/products/servlet/
Sun Microsystems' official Servlet API site

http://java.sun.com/products/servlet/runners.html
Servlet enabled Web servers and add-on servlet engines

http://java.sun.com/docs/books/tutorial/servlets/index.html
The servlet chapter in Sun's Java tutorial

http://novocode.com/doc/servlet-essentials/
Novocode's Servlet Essentials, a Servlet programming tutorial

http://servletcentral.com
Servlet Central, articles about servlet technology, success stories, resources and more

http://javashareware.com
A database with many servlets, both freeware with source code and commercial products

http://servlets.com
Information about the O'Reilly Java Servlet Programming book by Jason Hunter and William Crawford
Hans Bergsten has worked in the computer industry for 18 years, with everything from IBM mainframes to PCs.
Last year he founded Gefion Software, a software development company focused on platform-independent network-based applications.
The current product line includes a free Servlet Engine named WAICoolRunner and a Servlet-based product for database access named InstantOnline Basic.