Shantanu Sen
4 min read · Apr 7, 2022


EVOLUTION OF WEB SERVICES — 1

HTML to XML

A computer, acting as a server, holds tons of digital assets in the form of rich text pages, images, music and movies. Another computer, thousands of miles away, connects to the server through a complex web of interconnected communication networks and transfers whatever digital asset it wants to itself for immediate consumption. The complex web of interconnected communication networks has a better name. It’s called the Internet. The resource-hoarding, capitalistic computer also has a better name. It’s called a Web Server.
To start with, web servers were dumb clerks. They could give you information only if you could navigate them to its precise location. So, in order to know about Alaska, you had to click the following links in order: Nations > USA > States > Alaska. There was no option to search for “the coldest province of USA”.
The usual arrangement was:

But over time, keeping pace with the explosive growth of the Internet, the humble web server became more responsive and computationally capable. It still served HTML pages, but those pages carried more meaningful information. It achieved this through some clever political alliances, like this:

The application server was the game-changer. Application servers handled the problem in a crude way. They never attempted to understand the Internet. They were built in such a way that, when invoked with a parameter string, they would respond with a complete HTML page. The web server would play the postman and deliver that page to the browser.
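As a rough illustration of "parameter string in, complete HTML page out", here is a minimal sketch in Python. The `STATE_CAPITALS` table, the `state` parameter and the `render_page` function are all invented for this example; a real application server of that era would have queried a database and used its own templating mechanism.

```python
from html import escape
from string import Template
from urllib.parse import parse_qs

# Hypothetical data store; a real application server would query a database.
STATE_CAPITALS = {"Alaska": "Juneau", "Texas": "Austin"}

# Static page skeleton; the $placeholders are filled in per request.
PAGE = Template(
    "<!doctype html><html><head><title>$state</title></head>"
    "<body><p>The capital of $state is $capital.</p></body></html>"
)


def render_page(parameter_string: str) -> str:
    """Turn a parameter string such as 'state=Alaska' into a complete HTML page."""
    params = parse_qs(parameter_string)
    state = params.get("state", [""])[0]
    capital = STATE_CAPITALS.get(state)
    if capital is None:
        return "<!doctype html><html><body><p>No such state.</p></body></html>"
    # Escape the dynamic values so they cannot break the surrounding markup.
    return PAGE.substitute(state=escape(state), capital=escape(capital))


if __name__ == "__main__":
    print(render_page("state=Alaska"))
```

Note that the application server knows nothing about networks here: it only maps a string to a page, and the web server carries the result.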

Application servers made the tough job of blending static content with dynamic content easy. Heavy text processing became the order of the day. To see what I mean by blending of contents, please take a look at the following text extracted from www.zauba.com:

Microsoft Corporation (india) Pvt Ltd is a Private incorporated on 22 July 1988. It is classified as Non-govt company and is registered at Registrar of Companies, Delhi. Its authorized share capital is Rs. 150,000,000 and its paid up capital is Rs. 24,114,760. It is involved in OTHER BUSINESS ACTIVITIES

Microsoft Corporation (india) Pvt Ltd’s Annual General Meeting (AGM) was last held on 23 September 2020 and as per records from Ministry of Corporate Affairs (MCA), its balance sheet was last filed on 31 March 2020.

Directors of Microsoft Corporation (india) Pvt Ltd are Sashikumar Sreedharan, Benjamin Owen Orndorff, Keith Ranger Dolliver, Vivek Mehrotra, .

Microsoft Corporation (india) Pvt Ltd’s Corporate Identification Number is (CIN) U74899DL1988PTC032549 and its registration number is 32549.Its Email address is sanjay.agarwal@microsoft.com and its registered address is 807, New Delhi House Barakhamba Road New Delhi Central Delhi DL 110001 IN.

The italicized part, added by me, is dynamic. The rest is static.

The browser gets the same text in this format:

<!doctype html><html><head><meta charset="utf-8"><title>Untitled Page</title><link href="Untitled1.css" rel="stylesheet"><link href="index.css" rel="stylesheet"></head><body>
<div id="wb_Text1" style="position:absolute;left:86px;top:43px;width:250px;height:432px;z-index:1;"><span style="color:#000000;font-family:Arial;font-size:13px;">Microsoft Corporation (india) Pvt Ltd is a Private incorporated on 22 July 1988. It is classified as Non-govt company and is registered at Registrar of Companies, Delhi. Its authorized share capital is Rs. 150,000,000 and its paid up capital is Rs. 24,114,760. It is involved in OTHER BUSINESS ACTIVITIES<br><br>Microsoft Corporation (india) Pvt Ltd’s Annual General Meeting (AGM) was last held on 23 September 2020 and as per records from Ministry of Corporate Affairs (MCA), its balance sheet was last filed on 31 March 2020.<br><br>Directors of Microsoft Corporation (india) Pvt Ltd are Sashikumar Sreedharan, Benjamin Owen Orndorff, Keith Ranger Dolliver, Vivek Mehrotra, .<br><br>Microsoft Corporation (india) Pvt Ltd’s Corporate Identification Number is (CIN) U74899DL1988PTC032549 and its registration number is 32549.Its Email address is sanjay.agarwal@microsoft.com and its registered address is 807, New Delhi House Barakhamba Road New Delhi Central Delhi DL 110001 IN</span></div></body></html>

Now suppose you have to intercept that chunk of text at the browser’s end and extract two pieces of information for further processing: 1. the names of the directors and 2. the CIN. There is a technique for this called Web Scraping. But it is error-prone and requires a lot of manual checks to be reliable in a computing environment. To cut a long story short, the demand of the hour was to make data available in a machine-readable format.
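To see why scraping is fragile, here is a minimal sketch in Python that pulls the directors and the CIN out of a shortened copy of the page above. The `scrape` function and both regular expressions are my own illustration, not a standard recipe; they work only as long as the site keeps this exact wording.

```python
import re

# Shortened copy of the HTML shown above (same sentences, fewer attributes).
HTML = (
    '<span style="color:#000000;">Directors of Microsoft Corporation (india) Pvt Ltd are '
    "Sashikumar Sreedharan, Benjamin Owen Orndorff, Keith Ranger Dolliver, Vivek Mehrotra, ."
    "<br><br>Microsoft Corporation (india) Pvt Ltd’s Corporate Identification Number is (CIN) "
    "U74899DL1988PTC032549 and its registration number is 32549.</span>"
)


def scrape(html: str) -> dict:
    """Extract directors and CIN from prose; both patterns depend on the exact wording."""
    directors_match = re.search(r"Directors of .*? are (.*?)<br>", html)
    cin_match = re.search(r"\(CIN\)\s+(\w+)", html)
    directors = []
    if directors_match:
        # Drop empty pieces and the stray trailing "." left by the page generator.
        directors = [
            name.strip()
            for name in directors_match.group(1).split(",")
            if name.strip(" .")
        ]
    return {"directors": directors, "cin": cin_match.group(1) if cin_match else None}


if __name__ == "__main__":
    print(scrape(HTML))
```

If the site rephrases "Directors of ... are" as, say, "Board members:", the function silently returns an empty list. That is the kind of failure that forces manual checks.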

The B2B segment was the key driver. A purchase order issued to a vendor in HTML format would stall downstream processing, because the vendor’s system could not reliably read it. A machine-readable PO, on the other hand, could be read by the vendor’s system with ease and subsequently internalized as a Sales Order.
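A small sketch of what "machine-readable" buys you, using Python's standard `xml.etree.ElementTree`. The purchase order layout (`PurchaseOrder`, `Buyer`, `Item` and their attributes) and the `to_sales_order` function are hypothetical, made up for this illustration; real B2B documents follow whatever schema the trading partners agree on.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order; element and attribute names are invented for illustration.
PO_XML = """
<PurchaseOrder number="PO-1001">
  <Buyer>Example Buyer Ltd</Buyer>
  <Item sku="A-17" quantity="3" unitPrice="250.00"/>
  <Item sku="B-02" quantity="10" unitPrice="12.50"/>
</PurchaseOrder>
"""


def to_sales_order(po_xml: str) -> dict:
    """Read the buyer's PO and internalize it as the vendor's sales order."""
    root = ET.fromstring(po_xml)
    lines = [
        {
            "sku": item.get("sku"),
            "quantity": int(item.get("quantity")),
            "unit_price": float(item.get("unitPrice")),
        }
        for item in root.findall("Item")
    ]
    return {
        "customer_po": root.get("number"),
        "customer": root.findtext("Buyer"),
        "lines": lines,
        "total": sum(line["quantity"] * line["unit_price"] for line in lines),
    }


if __name__ == "__main__":
    print(to_sales_order(PO_XML))
```

There is no guessing at sentence structure here: every value sits in a named element or attribute, so the vendor's system can pick it up directly.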

The next question to be answered was which document format was to be used as a standard. In the early days of the millennium, only one option remained. That was XML. SOAP (Simple Object Access Protocol) was the best XML carrier available.
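To give a feel for SOAP as a carrier, the sketch below wraps an XML payload in a SOAP 1.1 envelope and unwraps it again. The namespace URI is the SOAP 1.1 envelope namespace; the `wrap_in_envelope` and `unwrap` helpers and the `PurchaseOrder` payload are assumptions made for this example. A real exchange would also involve an HTTP POST and a service contract, which are left out here.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def wrap_in_envelope(payload: ET.Element) -> bytes:
    """Put an XML payload inside <soap:Envelope><soap:Body>...</soap:Body></soap:Envelope>."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return ET.tostring(envelope, encoding="utf-8")


def unwrap(message: bytes) -> ET.Element:
    """Return the first element carried inside the SOAP Body."""
    envelope = ET.fromstring(message)
    body = envelope.find(f"{{{SOAP_NS}}}Body")
    if body is None or len(body) == 0:
        raise ValueError("SOAP Body is missing or empty")
    return body[0]


if __name__ == "__main__":
    # Hypothetical payload; the element name is invented for illustration.
    po = ET.Element("PurchaseOrder", number="PO-1001")
    message = wrap_in_envelope(po)
    print(message.decode("utf-8"))
    print(unwrap(message).get("number"))
```

The envelope adds nothing to the business data itself. It only gives both sides an agreed place to look for the payload.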


Shantanu Sen

I was born with a curious and carnivorous mind. A career in IT gave me an opportunity to feed it liberally. From PHP to OData, everything got accommodated.