Putting a legacy system on the Internet
Every day the Internet gains importance for business, academic and personal use. Any serious business that cares about its image and customer service now needs an Internet presence in order to get its message out and compete effectively. As this trend continues, the need to place legacy information on the Internet for public and commercial purposes will increase. This article looks at some of the problems faced when interfacing a legacy system to the Internet, and at some solutions that may apply to the unique circumstances that make such a project both challenging and exciting.
Legacy approaches vs. the Way of the ‘Net
The basic problem with interfacing a legacy system to the Internet stems from how much the model of computer access has changed over the years. The technical world today is very different from what it was just a short time ago. The paradigms have shifted, to say the least: from central control to the desktop, and back again to something very different from the original.
The first computers were designed and expected to perform only one task at a time. A single program would be run, its results analyzed, and then the next program would be run.
Things progressed in the 60s to time-shared systems, in which more than one program could be in execution at the same time. This made far better use of system resources for, as every programmer knows, file access takes eons compared with memory access. While one program was waiting for a file buffer to fill, another could be performing a matrix calculation.
The next shift was to interactive processing. Each user is a job on the system, and the computer services each one in round-robin fashion. State information is maintained for each user from the time the user signs on until the user logs off. At all times, the CPU knows who is at the end of each wire and what they are up to. Herein lies the problem one faces when converting such a system to the Internet.
In the world of the Internet, there is no such thing as a job associated with an individual user. By the very nature of the Internet, it is very likely that the next request from a given Internet user will be for a server halfway around the world from the server that fielded the previous request. In this sense, each interaction with any server, with the exception of the user’s provider, can be considered a sort of one-night stand. The connection is made, the information is received, and as far as anyone knows, this is the end of it for all of eternity. This sort of situation does not lend itself well to interfacing with a legacy system that expects each user to be tied up neatly as an entity known as a job, with a specific identity within the system. For example, it is logically meaningless for an Internet request to ask for the next page of information. The question arises - ‘Next page from where?’ With no associated state information, there is no way of knowing the answer to this question.
This problem becomes evident when considering a typical lookup situation. For example, a name is entered: Smith. The computer locates the first screen full of Smiths and sends them back down the pike. However, the desired name is not on the list, and the user asks to see the next screen full of names. In most cases, the legacy system is not designed to accept a request of "Give me screen two of all the Smiths," but rather "Give me the next screen full of data." The difference between these two concepts, although subtle, is quite profound to the designer. In the latter case, some sort of state information must be maintained to determine whose next screen full of data to send to whom. This becomes a key part of designing the interface: the Internet server must believe that it is operating in a stateless environment, while the host system believes the opposite.
The browser back button presents a similar problem to the interface designer. With this button, the user can page back to a previously cached page, thus ending up on a different screen than the one the host believes is being displayed. This can lead to unpredictable results if the user then requests another page full of data.
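As a rough illustration of that bridging role, the sketch below keeps one host-side job per opaque token, so the web side can stay stateless while the host keeps answering "next screen." It is only a sketch under stated assumptions: HostJob, Gateway, the token scheme and the three-line screen size are all hypothetical stand-ins, not part of any real legacy system.

```python
import uuid


class HostJob:
    """Hypothetical stand-in for a legacy job that only understands 'next screen'."""

    def __init__(self, names, screen_size=3):
        self._names = names
        self._screen_size = screen_size
        self._position = 0  # state the host keeps between requests

    def next_screen(self):
        start = self._position
        self._position += self._screen_size
        return self._names[start:self._position]


class Gateway:
    """Hypothetical interface layer: maps a token from the browser to a host job."""

    def __init__(self, directory):
        self._directory = directory
        self._jobs = {}  # token -> HostJob

    def lookup(self, prefix):
        # Start a new host job and hand the browser a token that identifies it.
        matches = sorted(n for n in self._directory if n.startswith(prefix))
        token = uuid.uuid4().hex
        job = HostJob(matches)
        self._jobs[token] = job
        return token, job.next_screen()

    def next_screen(self, token):
        # The token answers the question 'next page from where?'
        return self._jobs[token].next_screen()
```

The browser would send the token back with each request, for instance as a hidden form field, so that every request carries enough information to find its job again.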
Preliminary Design Considerations
The most important consideration in adapting an existing system to the Internet is to resist the temptation to develop parallel systems: one to continue providing the existing service, and the other to interface with the Internet. Parallel systems are a maintenance nightmare, and it is also valuable for debugging purposes to be able to operate the same system locally as well as through the Internet. If at all possible, the existing system should continue to operate interactively while also serving Internet requests. The small amount of time taken to provide this function will pay for itself many times over during the testing and debugging phase.
Modifying the Existing System to be Stateless
One way to deal with the state/stateless problem is to modify the legacy system to allow for a stateless request. For example, "Give me the third screen full of names that begin with ‘Smith.’" One key advantage of the Internet over the traditional ‘Green Screen’ format is that the idea of a screen full of data need not be as restrictive as it once was. For example, if the legacy system has at most 2 or 3 screens full of data for each request, it may be advisable to just send the entire batch to the Internet requester and let the user take advantage of the scroll bars to navigate through the data. In some cases, it may be possible to combine several screens’ worth of input from the legacy system into one screen in the web browser. (See Putting the Library on the Net - a case study.) In other words, let the function of the legacy system rather than the actual screens drive the Internet request.
When taking this approach, it is also necessary to monitor the jobs on the host system and cancel them after a set amount of idle time. With no way to determine whether or not the Internet user will make another request, this is the only practical way to deal with abandoned jobs.
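A stateless request of this kind might look like the following minimal sketch. The function name get_screen, its parameters and the 20-line screen size are assumptions made for illustration; a real legacy system would query its own files rather than a Python list.

```python
def get_screen(names, prefix, screen_number, screen_size=20):
    """Return screen `screen_number` (counting from 1) of names starting with `prefix`.

    Everything needed to answer is in the arguments, so no per-user
    state has to survive between requests.
    """
    matches = sorted(n for n in names if n.startswith(prefix))
    start = (screen_number - 1) * screen_size
    return matches[start:start + screen_size]
```

Asking for screen 3 twice returns the same data both times, which is exactly the property the "next screen" style of request lacks.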
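One way such monitoring could be sketched is with a small table of last-activity times that is swept periodically. The class name JobTable, the five-minute default and the injected clock are all assumptions for illustration; actually cancelling a job depends entirely on the host system and is left as a comment.

```python
import time


class JobTable:
    """Hypothetical tracker of when each host job last served a request."""

    def __init__(self, idle_limit_seconds=300, clock=time.monotonic):
        self._idle_limit = idle_limit_seconds
        self._clock = clock
        self._last_seen = {}  # job id -> time of most recent request

    def touch(self, job_id):
        # Call this every time a request for the job arrives.
        self._last_seen[job_id] = self._clock()

    def cancel_idle(self):
        # Drop every job that has been idle longer than the limit.
        now = self._clock()
        expired = [job_id for job_id, seen in self._last_seen.items()
                   if now - seen > self._idle_limit]
        for job_id in expired:
            # A real system would also tell the host to cancel the job here.
            del self._last_seen[job_id]
        return expired
```

Calling cancel_idle from a timer or at the start of each request would keep abandoned jobs from piling up on the host.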
Dealing with the Back Button
To prevent pages from getting out of sync, it is possible to prevent the browser from caching the page. To do this, send a "Pragma: no-cache" header with the page. I also like to include an expiration date that has already passed, just for good measure. When the user clicks the back button, a message will be displayed indicating that the page has not been cached and offering to re-load it. Re-loading causes a request to go to the server, thus indirectly informing the host of a page back request.
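The headers described above could be produced by a CGI program along these lines. This is a sketch only: the function name is hypothetical, and the Cache-Control line is an addition not mentioned in the article, included on the assumption that HTTP/1.1 browsers should be covered as well.

```python
def no_cache_response(body):
    """Build a CGI-style response whose headers ask the browser not to cache it."""
    headers = [
        "Content-Type: text/html",
        "Pragma: no-cache",  # the header named in the article
        # Assumption: HTTP/1.1 counterpart, not part of the original text.
        "Cache-Control: no-cache, no-store, must-revalidate",
        # An expiration date that has already passed, for good measure.
        "Expires: Thu, 01 Jan 1970 00:00:00 GMT",
    ]
    # A blank line separates the headers from the page itself.
    return "\r\n".join(headers) + "\r\n\r\n" + body
```

A CGI script would write the returned string to standard output as its entire response.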
Avoiding the no-cache message
In most cases, it will be desirable to avoid this message and just send the request back to the server without involving the user. This can be done by using method=get instead of method=post in the form that calls the CGI script. With method=get the request data is carried in the URL itself rather than in the body of the request, so the browser can simply re-send the request to the server with no fuss. It takes a little modification of the CGI script to use method=get rather than method=post, so it is important to be aware of this consideration up front, before designing the CGI code.
The Internet provides exciting new opportunities for business, education and all other aspects of life. An important part of the growth of the Internet into the next century will continue to be making valuable legacy information accessible through this new medium. Some planning and information up front can aid in this process and minimize the negative impact.
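To show what "carried in the URL" means in practice, here is a small sketch that builds and then decodes such a request using the Python standard library. The path /cgi-bin/lookup and the parameter names name and screen are assumptions chosen for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_lookup_url(base_url, name, screen):
    """Encode the whole request in the query string, as a method=get form would."""
    return base_url + "?" + urlencode({"name": name, "screen": screen})


def parse_lookup_url(url):
    """Recover the request on the server side from the URL alone."""
    query = parse_qs(urlsplit(url).query)
    return query["name"][0], int(query["screen"][0])


url = build_lookup_url("/cgi-bin/lookup", "Smith", 2)
name, screen = parse_lookup_url(url)
```

Because the URL alone describes the request, the browser can repeat it on a back or reload without asking the user to confirm anything.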