Second-Generation Legacy to Web Strategies via XML
Part 1: Business Strategic Issues

by
Don Estes

1.6 XML Schema and Data Integrity

B2B e-commerce is the ultimate challenge in program-to-program data sharing. Implementing XML as the data exchange mechanism addresses a critical part of the problem with its loosely coupled architecture, but another critical part remains: data integrity. Successful general-purpose data exchange requires absolute data integrity.

Data integrity must be considered at four layers:

  • Physical layer - hardware
  • Logical layer - basic data attributes
  • Context layer - where and how used
  • Semantic layer - the meaning of the data

The physical layer is usually taken for granted: we assume that parity errors and other failures at the hardware level cannot contaminate our data.

The logical layer requires programmers and database analysts to coordinate each data item definition with regard to its basic attributes, such as type (numeric, alpha, date, etc.), size, format, scale, valid values, and so on. Getting the logical layer right has been an ongoing challenge since the advent of computers, but diligence combined with solid tools and procedures generally makes a failure at this layer an unusual event at most sites.

The context layer is the focus of data integrity issues at most sites. This involves referential integrity among data files and database tables, data flow between modules, and where and how the data items are used. Given that the logical layer is under control, this is the layer where most data-related failures will occur.

The semantic layer is seldom the focus of data integrity concerns, but this will change with a vengeance as B2B data exchange becomes common. When all data is exchanged internally, two groups may still differ on what a given data item means, but such differences are readily resolved when they arise because both sides of the exchange are under one roof. This will not be so with B2B data exchange. When data must be exchanged among partners and competitors, among dissimilar cultures and languages, and among differing hardware and software platforms, we are facing a digital version of the Biblical Tower of Babel.
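To make the difference between the logical and context layers concrete, here is a minimal sketch in Python. The field names, the rule table, and both helper functions are hypothetical and chosen only for illustration; a real site would draw these definitions from its data dictionary and database constraints.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FieldRule:
    """Logical-layer definition of one data item (hypothetical)."""
    python_type: type
    max_size: int
    valid: Callable[[object], bool] = lambda value: True


# Assumed definitions for an order record; not taken from any real system.
ORDER_RULES = {
    "order_id": FieldRule(str, 8),
    "customer_id": FieldRule(str, 6),
    "quantity": FieldRule(int, 5, lambda q: q > 0),
}


def logical_errors(record, rules):
    """Check type, size, and valid values for each field of one record."""
    errors = []
    for name, rule in rules.items():
        if name not in record:
            errors.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, rule.python_type):
            errors.append(f"{name}: expected {rule.python_type.__name__}")
        elif len(str(value)) > rule.max_size:
            errors.append(f"{name}: longer than {rule.max_size}")
        elif not rule.valid(value):
            errors.append(f"{name}: value not allowed")
    return errors


def context_errors(orders, known_customer_ids):
    """Check referential integrity: every order must name a known customer."""
    return [
        f"{order['order_id']}: unknown customer {order['customer_id']}"
        for order in orders
        if order["customer_id"] not in known_customer_ids
    ]
```

A record can pass every logical check and still fail at the context layer, for example an order that is well formed but refers to a customer that does not exist. Neither check says anything about the semantic layer: both sides could agree that `quantity` is a positive integer and still disagree about whether it counts units or cases.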

XML provides the mechanism to address this issue through the concept of data tags. Regardless of the name or names by which a data item is identified in different contexts, it is assigned a universal data tag that uniquely identifies the semantic meaning of that data item wherever it is used. For example, our recent Y2000 experience could be characterized as a data tagging exercise, in which we had to identify dates wherever they appeared and ensure that both their format and their use were correct. Now we need to do the same thing for all data items that are candidates for data exchange in B2B e-commerce.

However, Y2000 was comparatively simple, because there was little problem agreeing on what MMDDYY or YYYYMMDD meant. Consider a simple tag such as "name". Does this mean "first name + space + last name", or "last name + comma + space + first name", or something else? And that is a USA-centric view of the problem. How do you account for the fact that Spanish names have two family names, so that the name an American would treat as the family name is not necessarily the one that comes last? In many Asian countries the family name comes first. Then there are cultures where some people have only one name. Globalization, with its complementary problem of localization of display and usage, is beginning to expose this sort of problem even before B2B e-commerce issues are considered.
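The "name" problem can be sketched in a few lines of Python using the standard library's `xml.etree.ElementTree`. Everything named here is an assumption made for illustration: the `PersonName`, `GivenName`, and `FamilyName` tags, the `displayOrder` attribute, and the local field names are not drawn from any published vocabulary. The point is only that exchange partners must agree on a tag whose structure carries the meaning, rather than on a bare "name" string.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from each system's local field name to the one
# tag that all exchange partners have agreed on.
LOCAL_TO_UNIVERSAL = {
    "CUST-NM": "PersonName",
    "custName": "PersonName",
    "nombre_cliente": "PersonName",
}


def person_name_element(given_names, family_names, family_first=False):
    """Build a structured name element instead of one ambiguous string."""
    elem = ET.Element("PersonName")
    elem.set("displayOrder", "family-first" if family_first else "given-first")
    for given in given_names:
        ET.SubElement(elem, "GivenName").text = given
    for family in family_names:
        ET.SubElement(elem, "FamilyName").text = family
    return elem


def display_name(elem):
    """Render the name for display using the order recorded in the data."""
    given = [e.text for e in elem.findall("GivenName")]
    family = [e.text for e in elem.findall("FamilyName")]
    if elem.get("displayOrder") == "family-first":
        parts = family + given
    else:
        parts = given + family
    return " ".join(parts)


# Two family names, family name first, and a single name all fit one tag.
spanish = person_name_element(["Gabriel"], ["García", "Márquez"])
family_first = person_name_element(["Ming"], ["Yao"], family_first=True)
single = person_name_element(["Sukarno"], [])
```

With this structure the receiving program no longer has to guess which part is the family name, since the sender states it explicitly. The harder, non-technical work of getting every partner to adopt the same tag definitions remains.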

Forecross is a registered trademark of Forecross Corporation.
Copyright © 1996-2008 Forecross Corporation
All Rights Reserved.