| Introduction |
| History |
| Components |
| Implications for E-Commerce |
| XML and EDI |
| Community Standards |
| Predictions and Recommendations |
| Sources |
The introduction of the Extensible Markup Language, or XML as it is commonly known, created a buzz in the business world, particularly within the e-commerce community. It "provides both a standards-based way to identify the information that is of importance in a particular application, and the ability to process information tagged according to highly user-specific requirements with general-purpose software, such as editing tools, composition engines, and electronic browsers" (Usdin & Graham, 1998, p.125). In simpler terms, XML allows users to customize a markup language and apply it to an information object that can then be interpreted to determine its contents, whether it is an order form, a newspaper or an advertisement. Given these descriptions, it becomes apparent that XML is a tool, an enabling technology that can be used in conjunction with other tools to provide powerful Web applications. How this tool can be customized and utilized by the business community is the subject of this white paper.
XML's roots lie in the Standard Generalized Markup Language, or SGML. SGML was developed 20 years ago as a formal method of annotating documents to describe their meaning and structure, but its complexity and cost hindered widespread acceptance. However, an application of SGML called the Hypertext Markup Language, or HTML, is a phenomenon that has enabled the rapid growth of the Web over the past decade. Used primarily for stylistic and formatting purposes, HTML has frustrated many users who wanted to use its tag set for more complex presentation control, data processing and programming (Treese, 1998). Because of these issues, the World Wide Web Consortium, or W3C, started a working group for a new subset of SGML, XML, in January 1997. The group "proposed a markup language that could work in concert with existing Web technologies, using some of the tools developed for use with HTML, while moving forward with more manageable techniques" (St. Laurent, 1999, p.11). A year later, in February 1998, the XML specification was published as a W3C Recommendation.
While XML has its foundation in SGML, its philosophy differs and is based on four fundamental principles (Usdin & Graham, 1998).
The basic components of XML are similar to those of HTML: tags, elements and their attributes. A tag is a piece of markup such as an opening tag <P> or a closing tag </P> . When combined, these tags are used in the composition of elements. For example,
<P align="center"> This text is part of a paragraph element. It includes the <B> bold </B> element and the <I> italics </I> element. </P>
The entire paragraph has 6 tags comprising 3 elements, 2 of which are contained within the paragraph element. The paragraph element also contains an attribute specifying that the paragraph should be centered on the page. This style of markup is used in the creation of XML documents, which can be of two types: well-formed and valid. A well-formed document is syntactically correct and can be interpreted by the computer, but it need not refer to a Document Type Definition (DTD) that specifies tag requirements and allows the document to be validated. Syntactical correctness includes a single root element, a matching closing tag for every opening tag, properly nested elements, and quoted attribute values, as in the following catalog entry:
<?xml version="1.0"?>
<CATITEM CATEGORY="rugs">
<ITEMNAME>South Shore decorator rug</ITEMNAME>
<DESCRIPTION><STORY>This rug will add a new dimension to any room in your home and protect those hardwood floors from life's daily activities.</STORY>
<FEATURES>Resilient, textural sisal, complemented by canvas band in dark green, black, blue or natural.</FEATURES></DESCRIPTION>
<MANUFACTURER NAME="PB-1">Pottery Barn</MANUFACTURER>
<ITEM><PRODNAME>South Shore decorator rug</PRODNAME>
<LENGTH>5</LENGTH><WIDTH>7</WIDTH>
<PRICE>$99.95</PRICE>
<AIRSHIP>$14.00</AIRSHIP>
<GROUNDSHIP>$7.00</GROUNDSHIP></ITEM>
</CATITEM>
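The difference between a well-formed and a malformed document can be checked mechanically. The following is a minimal sketch, assuming Python and its standard-library `xml.etree.ElementTree` parser, neither of which is part of the original paper. The helper name `is_well_formed` and the two broken fragments are made up for illustration. The parser checks syntax only and does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML (no DTD validation)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# A shortened version of the catalog entry: one root, matched and nested tags.
GOOD = '<CATITEM CATEGORY="rugs"><ITEMNAME>South Shore decorator rug</ITEMNAME></CATITEM>'

# Hypothetical broken fragment: the closing tags are in the wrong order,
# so the elements are not properly nested.
BAD_NESTING = "<CATITEM><ITEMNAME>South Shore decorator rug</CATITEM></ITEMNAME>"

# Hypothetical broken fragment: the attribute value is not quoted.
BAD_ATTRIBUTE = "<CATITEM CATEGORY=rugs></CATITEM>"

for label, text in [("good", GOOD), ("bad nesting", BAD_NESTING), ("bad attribute", BAD_ATTRIBUTE)]:
    print(label, is_well_formed(text))
```

A browser or parser that meets a document like either of the broken fragments is expected to report an error rather than guess at the intended structure, which is one of the ways XML is stricter than HTML.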
A valid XML document is well formed and complies with the guidelines of a DTD, which defines a tag set like the one used in the catalog entry example above. The DTD can be part of the XML document, or it can be referred to by the XML document. An example of a DTD for a catalog entry and the code that refers to it can be seen below.
Document Type Definition (catalog.dtd):
<!ELEMENT CATITEM (ITEMNAME,DESCRIPTION,MANUFACTURER)>
<!ATTLIST CATITEM
CATEGORY CDATA #REQUIRED>
<!ELEMENT ITEMNAME (#PCDATA)>
<!ELEMENT DESCRIPTION (STORY,FEATURES,PRICE)>
<!ELEMENT STORY (#PCDATA)>
<!ELEMENT FEATURES (#PCDATA)>
<!ELEMENT PRICE (#PCDATA)>
<!ELEMENT MANUFACTURER (#PCDATA)>
<!ATTLIST MANUFACTURER
NAME CDATA #REQUIRED>
XML Document:
<?xml version="1.0"?>
<!DOCTYPE CATITEM SYSTEM "catalog.dtd">
<CATITEM CATEGORY="rugs">
<ITEMNAME>South Shore decorator rug</ITEMNAME>
<DESCRIPTION><STORY>This rug will add a new dimension to any room in your home and protect those hardwood floors from life's daily activities.</STORY>
<FEATURES>Resilient, textural sisal, complemented by canvas band in dark green, black, blue or natural.</FEATURES>
<PRICE>$99.95</PRICE></DESCRIPTION>
<MANUFACTURER NAME="PB-1">Pottery Barn</MANUFACTURER>
</CATITEM>
Together, the XML document and the DTD provide content for the browser (in this case, Internet Explorer 5) to interpret and display. The <?xml version="1.0"?> and the <!DOCTYPE CATITEM SYSTEM "catalog.dtd"> statements make up the prolog of the XML document, or "the glue that binds DTDs to the code that applies to them" (St. Laurent, 1999, p.117). The first statement tells the browser the version of XML in use. The second names the root element of the document, states whether the DTD is a system or public DTD, and gives the DTD's location or file name on the system. A system DTD is one that has been developed for a particular Web site or business, while a public DTD has been developed for use by types of organizations (e.g. advertising, newspapers, etc.).
The elements and attributes comprise the logical structure of the XML document. The DTD defines the available elements and attributes, and these specifications can be incorporated by a single XML document or document groups. In the example above, <CATITEM> is the root element and contains the attribute CATEGORY. The value of the attribute, rugs, is enclosed in quotation marks. <DESCRIPTION> is also an element, and it is the parent element to the <PRICE>, <STORY>, and <FEATURES> elements. Another element, <MANUFACTURER>, also contains an attribute, NAME, which is required and identifies the manufacturer.
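The logical structure described above (a root element, its attributes, and parent and child elements) can be made visible with a few lines of code. This is a rough sketch, assuming Python's standard-library `xml.etree.ElementTree`. The function name `outline` is hypothetical, and the story and features text is abridged from the catalog entry.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the catalog entry; the prolog is omitted for brevity.
CATALOG_ENTRY = """<CATITEM CATEGORY="rugs">
<ITEMNAME>South Shore decorator rug</ITEMNAME>
<DESCRIPTION><STORY>This rug will add a new dimension to any room in your home.</STORY>
<FEATURES>Resilient, textural sisal.</FEATURES>
<PRICE>$99.95</PRICE></DESCRIPTION>
<MANUFACTURER NAME="PB-1">Pottery Barn</MANUFACTURER>
</CATITEM>"""


def outline(element, depth=0):
    """Return one line per element: indented tag name followed by its attributes."""
    attrs = "".join(f' {name}="{value}"' for name, value in element.attrib.items())
    lines = ["  " * depth + element.tag + attrs]
    for child in element:  # iterating an element yields its direct children
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(CATALOG_ENTRY)
print("\n".join(outline(root)))
```

The indentation in the printed outline mirrors the nesting: STORY, FEATURES and PRICE appear one level below DESCRIPTION, which in turn sits below the CATITEM root.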
Notice that the contents of the XML documents presented above are not formatted; formatting requires the use of a stylesheet language such as CSS (Cascading Style Sheets) or XSL (Extensible Stylesheet Language). Using a stylesheet adds another layer of complexity to the XML document display process. In the XML document, a line is added below the <?xml version="1.0"?> line that contains a reference to the CSS formatting file, such as <?xml-stylesheet href="xml.css" type="text/css"?> . The contents of this CSS file are shown below.
CATITEM {
display:block;
font-family:arial;
}
ITEMNAME {
display:block;
font-size:16px;
font-weight:bold;
}
DESCRIPTION {
display:block;
font-size:12px;
}
STORY {
display:block;
font-size:12px;
}
FEATURES {
display:block;
font-size:12px;
}
PRICE {
display:block;
font-weight:bold;
color:red;
}
MANUFACTURER {
display:block;
font-weight:bold;
color:blue;
}
In this example, each element of the DTD, and hence of the resulting XML document, is displayed according to formatting properties such as display, font-size, font-weight, and color. The display property determines whether the contents of an element will be displayed as a separate block or within an existing paragraph. Font-size, font-weight and color all refer to the style of the text.
The most far-reaching effect of XML is the integration of different data sources and
the consequences of that integration. Until now, the logistics of integrating data
sources have been a limiting factor. Legacy systems are difficult to replace or to
integrate into seamless new systems that process their data jointly. The departments
within an organization may have developed their own databases and processes to support
their efforts without coordination with other departments, often duplicating efforts.
Recent efforts have emphasized sharing knowledge within an organization, and large
custom-built systems were preferred over boxed applications. The rise of the Internet
and the World Wide Web pushed organizations to present their data to the customer
through a seamless interface. Back-end systems could be displayed via HTML, but HTML could not
transport or define the organizational information. XML offers that missing link.
XML facilitates integration of data from multiple sources that are dispersed and/or
incompatibly formatted while retaining the meaning of the data through each step
of processing. The value added by XML lies in retrieving data from several sources, combining
and customizing it, and passing it on to the next process. Aggregating information from multiple
databases allows organizations to personalize the data and deliver it to browsers while
the original information stays in its original database and format. "Without
XML, data retrieval particular to each database would have to be implemented. The problem
with that is that you can not easily change what information you want or how it should
be combined" (SoftQuad White Paper).
In other words, XML simply tags relevant data with explanatory information, saying what
the data is and allowing it to be manipulated.
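The kind of aggregation described above can be sketched in a few lines. The example below assumes Python's standard-library `xml.etree.ElementTree`, and every name and value in it is hypothetical: the SKU attribute, the second product, the price rows standing in for a legacy pricing database, and the `build_catalog` helper are all made up for illustration. It combines a product feed that is already in XML with rows from another source into a single tagged catalog, leaving both sources untouched.

```python
import xml.etree.ElementTree as ET

# Source 1 (hypothetical): a product-description feed already in XML.
DESCRIPTIONS_XML = """<PRODUCTS>
<PRODUCT SKU="R-100"><ITEMNAME>South Shore decorator rug</ITEMNAME></PRODUCT>
<PRODUCT SKU="R-200"><ITEMNAME>Harbor stripe rug</ITEMNAME></PRODUCT>
</PRODUCTS>"""

# Source 2 (hypothetical): rows as they might come from a legacy pricing database.
PRICE_ROWS = [("R-100", "99.95"), ("R-200", "149.00")]


def build_catalog(descriptions_xml, price_rows):
    """Merge both sources into one CATALOG element, matching records on SKU."""
    prices = dict(price_rows)
    catalog = ET.Element("CATALOG")
    for product in ET.fromstring(descriptions_xml).iter("PRODUCT"):
        sku = product.get("SKU")
        item = ET.SubElement(catalog, "CATITEM", SKU=sku)
        ET.SubElement(item, "ITEMNAME").text = product.findtext("ITEMNAME")
        if sku in prices:  # products without a known price get no PRICE element
            ET.SubElement(item, "PRICE").text = prices[sku]
    return catalog


catalog = build_catalog(DESCRIPTIONS_XML, PRICE_ROWS)
print(ET.tostring(catalog, encoding="unicode"))
```

Because each value in the result carries its own tag, a later step can pick out prices or item names without knowing which source supplied them.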
Companies have been "pushing HTML to its limits" (SoftQuad), attempting to use HTML
to provide more information than its tags were designed to hold. The next wave of
business web sites will be shaped by their data, not their format alone. Organizations
can now utilize all of the data and information available to them to make their processes
much more robust and flexible. Managing updates will be much easier
because data can be changed at the interface and stored in the proper databases
without the user having to understand multiple sources. Companies can offer their users a variety
of "customizable slices of display data." Developers will have a new tool at their
disposal, and small to medium-sized companies will be able to afford a demanding web
presence without scrapping entire back-end legacy systems.
In addition to internal data integration, XML can perform a similar function for
business-to-business (B2B) transactions, where it meets many of the needs of exchange between
businesses. B2B transactions via the web are experiencing exponential growth, and any technology
that facilitates these transactions will also ease the integration of the entire supply
chain. For example, a vendor would be able to use the information in the systems of its
suppliers and/or manufacturers without physically transferring and matching up the
information within them. As long as all of the members of the supply chain use
the same XML tags, the transactions among them are mutually intelligible. "Instant availability
transforms rigid supply chains into 'supply Webs,' in which participants transact business
spontaneously" (Glusko, et al). With XML, supply chain integration can be implemented and
incorporated into the organization with less effort.
Electronic Data Interchange (EDI) has been the means of communicating B2B transactions for
many organizations with different equipment and connections. Although efficient at transferring
data, EDI implies direct computer-to-computer transactions using private networks and
EDI-specific data formats (TechEncyclopedia).
EDI systems are intrinsically complex, expensive and proprietary networks, and their brittle
syntax necessitates a custom integration solution between partners in a supply chain (Glusko).
Formal EDI standards were developed twenty-five years ago, but new business practices,
the development of global economies, and advancements in computer technologies are just a few
of the factors that have made those standards unworkable and impracticable for many
organizations (Laplante). Since the rise of the Internet, EDI has also begun to appear rather
inflexible for data transfer across the preferred Internet protocol, TCP/IP.
With the Internet, a universal platform for multi-directional information exchange has permeated the business world.
Small to mid-sized companies that could not afford
EDI systems will be able to get into the e-commerce game; others that were able to afford
EDI systems will be able to keep using these existing systems. EDI/XML systems can
make the supply chain flexible, increasing the circles of businesses that can interoperate.
XML enables businesses to connect and increases the viability of EDI systems. EDI can
be carried by XML over TCP/IP. "EDI's attraction to XML lies in their shared love of
specificity" (Weiss, 42). Without XML, using EDI in coordination with the Internet was like
"square pegs and round holes" (Laplante).
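To illustrate how EDI-style data might be carried in XML, the sketch below converts a delimited message into tagged elements. It rests on several assumptions: the message format is only loosely modeled on delimited EDI messages and is not a real EDI standard, and the segment names (ORD, ITM), the field names and the `edi_to_xml` helper are hypothetical. Python's standard-library `xml.etree.ElementTree` is used to build the result.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names for each segment type. A real EDI standard
# defines its own segments and fields; these are made up for illustration.
SEGMENT_FIELDS = {
    "ORD": ["ORDERNUMBER"],
    "ITM": ["SKU", "QUANTITY", "UNITPRICE"],
}


def edi_to_xml(message, segment_end="~", field_sep="*"):
    """Turn a delimited message into an ORDER element with one child per segment."""
    root = ET.Element("ORDER")
    for raw in message.split(segment_end):
        raw = raw.strip()
        if not raw:  # skip the empty piece after the final terminator
            continue
        name, *values = raw.split(field_sep)
        segment = ET.SubElement(root, name)
        # Values of segments missing from SEGMENT_FIELDS are dropped in this sketch.
        for field, value in zip(SEGMENT_FIELDS.get(name, []), values):
            ET.SubElement(segment, field).text = value
    return root


# Hypothetical message: one order header and one line item.
MESSAGE = "ORD*1001~ITM*R-100*2*99.95~"
order = edi_to_xml(MESSAGE)
print(ET.tostring(order, encoding="unicode"))
```

The positional fields of the delimited message become named elements, so the resulting document is self-describing and can travel over ordinary Internet protocols like any other XML document.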
Many XML projects are currently under way in the business community. Open Buying
on the Internet (OBI) provides a way to define the interactions between trading partners.
The Open Trading Protocol (OTP) provides a framework for consumer electronic commerce that can
incorporate different kinds of purchasing and payment protocols (Treese). The OTP is a
consortium with over thirty member companies that have developed an XML standard for
information exchange on the Internet to enable a framework for multiple forms of electronic
commerce (Usdin and Graham).
Other communities of business and organizational partners are working together on
common XML tags and data elements. The Astronomical Markup Language, the Legal XML
Working Group and Genealogical Data in XML are only a few examples of communities working
on common terms.
While some may wonder if XML will live up to all of the hype, several players in the Internet
world have already prepared for this evolution. Many of the browsers, specifically Internet
Explorer 5.0 and Netscape Communicator 5, already support XML. XML parsers
for several programming languages are now available, including Java and C++. Larry Wall, the
inventor of Perl, announced that it would also soon support XML.
Creating Web pages that act like database records is a logical next step in the digitizing of data.
Companies should, at the very least, be thinking about XML and the possibilities of integrating
XML into their existing sites. Those who do not will be left playing catch-up with a competitor, and
may even risk losing business opportunities if they cannot easily conform to a community of standards.
Continuing to develop proprietary data forms will limit future growth and inclusion into
supply chain circles.
Companies will be using XML documents for publishing product catalogs and bank statements, placing orders
and scheduling shipments. The integration of data will permanently transform the use of the Internet
in business. The need for custom interfaces with every customer and supplier will be gone, empowering
buyers to compare products across vendors and formats. Sellers will be able to publish their catalog one
time and reach several potential buyers. Online businesses will build on each other's content and services
to create a new level of virtual markets and trading. Companies that fear XML-tagged information will make it
too easy for buyers to compare products and prices, or will expose their data to competitors, will
realize that opportunities will be lost as e-commerce proliferates.
XML will provide businesses with benefits, but it will not be without its difficulties. It will be more
difficult than HTML because there are no ubiquitous rules or manuals. Communities developing XML
standards and libraries will overlap and conflict. Each supply chain cannot invent its own XML tags for
products and catalogs, or "the web would be scarcely more usable as a platform for agents and other
automated processes than it is today." The need for standards is obvious to many, as seen by all of the
XML working groups and communities. However, they are acting independently. The W3C will be pressured
to standardize certain building blocks that companies can mix and match to assemble XML applications
quickly while preserving the ability to appropriately customize them.
XML will fundamentally change the future of e-commerce. HTML was the simple and powerful tool that
made the first wave of e-commerce, the business-to-consumer phase, possible. XML is the tool that will
make the second phase of e-commerce, business-to-business, universally achievable. XML will facilitate
enterprise integration. While the infrastructure of the Internet filled in the communication gaps that
limited e-commerce, XML will fill in the information gap between participants in the supply chain.