[[TracNav(TracNav/ISO15926Primer)]] [[Image(wiki:IdsAdiBranding:Logo-128x128.gif)]] = History of ISO 15926 = ---- [[PageOutline(2-4,Contents,inline)]] == Abstract == Interoperability of digital information became an issue almost as soon as computers made their way into engineering offices. Many organizations from around the world have been working on this topic for many years, from Owner/Operators, Constructors, Consulting Engineers, and Software Developers. Many standards organizations world wide are involved, some having been created just for this purpose. ---- == Metaphor: Interoperability is Like Heavier-Than-Air Flight == There have been many attempts at interoperability, some fizzling out in a few years, some lasting until today. Different organizations, with different needs have tried slightly different approaches. All of these attempts have had to deal with how to convey the ''meaning'' of the data as it (the data) is being transmitted. Some solutions are based on limiting the scope of the data in order to simplify the task of conveying meaning, others attempt to allow unlimited scope. At the lowest level, interoperability is extremely complex, just as the mechanics of flying is extremely complex. Fortunately, when it is mature, ''using'' ''ISO'' ''15926'' will be about as complicated as ''using'' ''flight'' is today. For instance, your humble author, sitting in the middle of Western Canada in the coldest winter since Al Gore started on the rubber chicken circuit, is right now thinking about using heavier-than-air flight. But if I do, I will not have to concern myself with things like power-to-weight ratios, or the exact curve of the wing to maximize the difference in air pressure between the upper and lower surfaces. I will simply phone my travel agent and book a flight to Mexico. Similarly, when ISO 15926 is mature, all most users will need to know is which button to push to connect to a business partner. 
ISO 15926 is a solution to interoperability of plant information made possible by the confluence of four areas of interest:
 * How we store and exchange textual information
 * How we know and understand things
 * How we use the Internet to find things
 * How we store and exchange plant information

We may well end up with different tools for interoperability, just as there are many solutions today for heavier-than-air flight depending on one's need (glider, propeller airplane, jet airplane, helicopter, lifting body). But just as in flight, where the common element to all modes of flight is a particular shape of whatever is doing the lifting (wing, rotor, aircraft body), we are starting to see that the dictionary of terms is becoming a common element. In Figure 1, below, this is shown as the common use of ISO 15926-4, the reference data library.

[[Image(History01.JPG, 500px)]]

'''Fig 1 - History of ISO 15926'''

----
== How We Store and Exchange Textual Information ==
Human society has always had to find ways to manage, store, and retrieve information. The Library of Alexandria, which burned down in 48 BC [http://en.wikipedia.org/wiki/Library_of_Alexandria (according to one story)], is an example of both the best technology for managing information in hard-copy form, and a major limitation of doing so. With the advent of computer-managed storage in the mid twentieth century, information managers have had to grapple with two problems:
 * Survival of information beyond the lifetime of proprietary hardware and software.
 * Moving a large amount of information between proprietary systems.

=== Dealing with Proprietary Hardware ===
A typical example of these types of questions is a help desk inquiry from the mid 1980s: ''I have data I want to keep for decades.
Should I invest in a good card reader, or should I transfer my data to these far more efficient but newfangled "floppy disks"?''

Unfortunately, the best answer to this kind of question has always been rather labor intensive. That is, the only reliable way to keep digital information for decades is to upgrade your storage media every few years to whatever is the latest and greatest at the time. For personal use, in the 1980s it would have been 5 1/4" floppy disks. By the 1990s you would have had to copy your archive to 3 1/2" floppies. Then, sometime around 2000, the best storage medium became CDs, and a bit later, DVDs. At first everyone thought they would last for decades, but sometimes they didn't even last two years:
 * [http://www.computerworld.com/action/article.do?command=viewArticleBasic&taxonomyName=storage&articleId=9123244&taxonomyId=19&intsrc=kc_top Restored DVD key to conviction in criminal case]

Now, nearing the end of the first decade in the twenty-first century, flash drives are looking like they will be readable for quite a while. But ask yourself: how likely is it that personal computers will still have USB ports in twenty years? Maybe, but whether in twenty years or forty years, at some point you will still have to load up your thumb drives and copy them to some new medium; perhaps a three-dimensional, holographic memory block.

=== Dealing with Proprietary Software - Personal Scale ===
Unfortunately, even if you go through the exercise of transferring your archive every few years, how are you going to open the files twenty-five years from now? In the lifetime of your humble author (who is so old he can remember when an entire family had to make do with a single telephone), the word processor of choice has gone from !WordStar, to !WordPerfect, to Microsoft Word. (This would be a good place for a Mac vs PC joke if I could think of one!)
Working with Word 2002, now, as this is being written, we can see that Word users can open the following word processor file formats:
 * Word 2.0
 * Word 5.1 for Mac
 * Word 6.0 (95)
 * !WordPerfect 5.0
 * Works 2000

Where is my beloved !WordStar? In addition to copies of all my data files, do I have to keep copies of all my old authoring software? And even if I do, what will I run it on? Do I also have to keep a working model of each vintage of personal computer? What if it breaks down?

So now, if I actually want to be able to retrieve my personal archives for decades (perhaps I am thinking that after I become a famous author, a publisher will give me a million dollar advance to write my memoirs), I will have to open each of my archived files every couple of years and somehow transfer the contents to whatever the new authoring software is. This will remove the problem of having to keep old hardware and software around, but will introduce a new set of problems:

First, this solution will create an upper limit on how much information I can keep around. Since it will take a certain amount of time to upgrade my archive each cycle, I will have less and less time each round to create new information. Eventually I will just finish one upgrade when I will have to start over with new technology.

Second, who's to say there will always be a clear and easy upgrade path from one authoring software to the next? For example, what if I have a large number of files authored with obscure CAD software? What if none of the current set of dominant players wrote the appropriate conversions into their offerings?

Well, there is another option:

[[Image(History_LongtermStorage.JPG, 500px)]]

'''Fig 2 - Long Term Information Storage Using the Internet'''

''(This is taken from a Slashdot discussion on the topic of long-term data storage.
[http://ask.slashdot.org/article.pl?sid=08/12/13/1434216 Here is the complete article.])''

=== Dealing with Proprietary Software - Industrial Scale ===
If the problem of moving information between proprietary systems is daunting on a personal level, try to imagine what it is like for organizations that create large bodies of documentation. For instance, every model of aircraft you see today requires several million pages of documentation which has to be revised and published every quarter. [http://www.amazon.com/Charles-Goldfarbs-XML-Handbook-4th/dp/product-description/0130651982 (XML Handbook)] The combined documentation libraries of the aircraft industry probably rival the size of the entire World Wide Web. Yet every few years the dominant hardware changes, and along with it, the software used. Governments and law firms are in a similar situation.

=== Markup Languages ===
It is precisely these issues, the survival of information beyond the lifetime of proprietary hardware, and moving a large amount of information between proprietary systems, that prompted Charles Goldfarb, Ed Mosher, and Ray Lorie at IBM to create "Generalized Markup Language" (GML) in the late 1960s. GML was the first of a family of markup languages:
 * GML
 * SGML
 * HTML
 * XML

Except for GML (which became SGML), all of these markup languages are in wide use today. SGML is used for managing large bodies of textual information. HTML is the language of the World Wide Web, linking documents for human retrieval. XML is increasingly being used to manage large bodies of ''knowledge'', including plant information with ISO 15926.

Most people will not need to know how markup languages are used to manage plant information, but a brief history of markup languages will be interesting for background information.
 * [wiki:ISO15926Primer_History_Markup The History of Markup Languages]

----
== How We Know and Understand Things ==
When we go beyond custom-built methods to exchange information between two particular computer applications--when we try to design a way for any two computer applications to connect to each other automatically without having to know anything at all about each other--we confront the question of how we represent knowledge. This is not just sophistry; if two computer systems are to connect to each other automatically, we must have a way to embed the necessary context (the understanding that humans bring) within the data that is being exchanged. For this we need to understand how we know things. The study of how we know things in philosophy and mathematics is called ''ontology''.

''Philosophy, sufficiently advanced, is indistinguishable from bulls**t.[[BR]] --Greg Berge''

=== Ontology ===
The study of ontology is well beyond what most people will need to know in order to use ISO 15926, and therefore beyond the scope of this primer. However, a brief example to explain what ''ontology'' is will be helpful:

Your humble author rides a bicycle to work most days. (Among other things, it lets me indulge in the luxury of eating the fine Ukrainian food my wife cooks for me!) The distance to work makes a nice workout but is too far to walk if the bicycle were to break down. Therefore, I have developed what you might call an ''Ontology'' ''of'' ''Things'' ''That'' ''Will'' ''Carry'' ''A'' ''Bicycle.''

Now, in Western Canada, which to most Europeans is but a few years out of the horse age, the pickup truck is king. In Western Canada, all ''Real'' ''Men'' have pickups. As you can see from Figure 3, there is ample room in a pickup truck to carry a bicycle.
[[Image(History_PickupTruck.JPG, 150px)]] '''Fig 3 - Pickup Truck''' So it is not hard to imagine that if my bicycle broke down on the way to work, I would try to think of everyone who owned a pickup truck that might have driven it to work that day. Suppose one such friend is Bill, who owes me a big favor. But when I talk to Bill he tells me he can't help me. He tells me he is going camping that weekend and to make a fast getaway he's already loaded his camper. How do I know this will be a problem? Because I know that when you load a "camper" onto a pickup truck, there is no room for a bicycle. [[Image(History_Camper.JPG, 150px)]] '''Fig 4 - Pickup Truck with a Camper Loaded''' But hold on! My father used to own a camper for his own pickup truck (he being a ''Real'' ''Man'' and all), and I remember looking inside it. There was space just inside the door that might be able to fit a bicycle. Alas, Bill tells me, he has already filled the available space with his other camping gear leaving no room. So with that conversation, I start planning how to get home on public transit. Being a ''Real'' ''Man'' myself, I own a pickup truck and will have to drive it back to work to pick up my bicycle. But by coincidence, a new engineer, who's just emigrated from the Czech Republic, walks by and overhears my dilemma. He tells me that when he moved to Canada, he brought with him his ''Felicia'' ''Fun''. I can't imagine what a ''Felicia'' ''Fun'' is, but judging by the expectant smile on his face I suspect it might be relevant so I ask him about it. Being new to Canada he doesn't know how to describe it so he says it is like the F150 his friend has, but a bit smaller. (The Czech Republic has ''Real'' ''Men'' too!) I immediately accept his kind offer to drive me and my bicycle home after work. (Oh, and I owe him a ''really'' big favor. Perhaps I will invite him in for Ukrainian food!) How did I know that a ''Felicia Fun'' would carry my bicycle? 
Because it is "like an F150", which is the name of a particular model of pickup truck common in North America. Figure 5 shows the relationship of things in my ontology.

[[Image(History05.JPG, 400px)]]

'''Fig 5 - Ontology of Things That Will Carry A Bicycle'''

This example is all most people will ever have to know about ontology. But if you are interested in digging deeper, the W3C has created two languages with which to create ontologies, the Resource Description Framework (RDF) and the Web Ontology Language (OWL). Neither is for the faint of heart.
 * [http://www.w3.org/TR/rdf-primer/ RDF Primer]
 * [http://www.w3.org/2003/08/owlfaq.html Frequently Asked Questions about OWL]

----
== How We Use the Internet to Find Information ==
The Internet is the enabling technology for sharing plant information easily. (Going back to the ''flight'' metaphor, ''the Internet'' probably occupies the same place in the interoperability of plant information as does ''air'' in flight.) Without the Internet, on top of all the other steps required to transfer information between our software applications, we would have to add the chore of creating a link between each pair of business partners.

But when we try to use the Internet for more than connecting plant project participants and calling up web pages, we run into many of the same issues that we run into trying to make plant applications communicate with each other. This brings us to the Semantic Web.

=== The Semantic Web ===
The Semantic Web is another topic that most people will not have to know about in order to use ISO 15926. However, it is interesting as background information.

When Tim Berners-Lee invented the World Wide Web in 1990, he envisioned much more than what we see today, which is essentially version 1.0.
He envisioned a web environment where people could ask their personal digital assistants questions like "Is there a medical doctor near here that specializes in geriatrics who has an open appointment before Friday noon?"--and then go for coffee.

Currently the World Wide Web is built to link documents primarily for human consumption. Computers can process web pages for layout and visual format, but they have no way to process the semantics; to know what they ''mean''. Thus, if you wanted to find a doctor in the example above, you may be able to use the World Wide Web to get a list of doctors and their specialties, and maps with which to judge the distance, but you would still have to call each doctor's office individually to see if she is taking new patients, and if there is a suitable open appointment. Using existing sources of information, one might get lucky and get an appointment with the first call from the Yellow Pages, but it could easily take much longer.

The Semantic Web is all about describing things in a manner that computers can understand, so that you can ask questions like this one and let a digital assistant do the leg work. Using Semantic Web technology, data can be shared and re-used across application, enterprise, and community boundaries.

ISO 15926 uses some Semantic Web technology to describe plant objects in a way that computers can understand. Where it differs from the Semantic Web is in the level of precision. The Semantic Web initiative seeks to map all the legacy data on the World Wide Web in all its chaotic glory to give "pretty good" information. In the field of Plant Design, "pretty good" is a pretty good way to blow things up and kill people. ISO 15926 requires more precise definitions, but uses some of the same tools.
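To make the doctor example a little more concrete, here is a minimal sketch in Python of what "describing things in a manner that computers can understand" might look like. It is not RDF, OWL, or anything defined by ISO 15926; the doctors, the predicate names (`is_a`, `specialty`, `distance_km`, `open_slot`), and the `find_doctors` function are all invented for illustration, and the "before Friday noon" test is left out to keep the sketch short. The only idea borrowed from the Semantic Web is that each fact is stored as a simple subject-predicate-object statement.

```python
# Hypothetical facts, each written as a (subject, predicate, object) statement.
# All names and values here are made up for illustration.
TRIPLES = [
    ("dr_adams", "is_a", "Doctor"),
    ("dr_adams", "specialty", "geriatrics"),
    ("dr_adams", "distance_km", 3),
    ("dr_adams", "open_slot", "Thursday 10:00"),
    ("dr_baker", "is_a", "Doctor"),
    ("dr_baker", "specialty", "geriatrics"),
    ("dr_baker", "distance_km", 40),
    ("dr_baker", "open_slot", "Wednesday 14:00"),
    ("dr_chan", "is_a", "Doctor"),
    ("dr_chan", "specialty", "pediatrics"),
    ("dr_chan", "distance_km", 2),
    ("dr_chan", "open_slot", "Thursday 09:00"),
    ("dr_diaz", "is_a", "Doctor"),
    ("dr_diaz", "specialty", "geriatrics"),
    ("dr_diaz", "distance_km", 5),  # nearby, but no open appointment
]


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def find_doctors(specialty, max_km):
    """Return (doctor, first open slot) pairs for nearby doctors with the specialty."""
    doctors = sorted(s for s, p, o in TRIPLES if p == "is_a" and o == "Doctor")
    matches = []
    for doctor in doctors:
        if specialty not in objects(doctor, "specialty"):
            continue  # wrong specialty
        if not any(km <= max_km for km in objects(doctor, "distance_km")):
            continue  # too far away
        slots = objects(doctor, "open_slot")
        if slots:  # skip doctors with no open appointment
            matches.append((doctor, slots[0]))
    return matches


if __name__ == "__main__":
    print(find_doctors("geriatrics", 10))
```

Because the meaning of each value is carried by its predicate rather than by its position on a web page, a program can answer the question without a person reading anything. This toy version only gives "pretty good" answers; the precision ISO 15926 asks for comes from agreeing on exactly what each term means.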
If you are interested in knowing more about the Semantic Web, here are some references:
 * [http://en.wikipedia.org/wiki/Semantic_Web Wikipedia]
 * [http://www.scientificamerican.com/article.cfm?id Scientific American article describing the doctor search scenario used above.]
 * [http://www.w3.org/2001/sw/ Main W3C Semantic Web site]
 * [http://www.w3.org/2001/sw/BestPractices/Tutorials A compilation of resources for learning about Semantic Web enabling technology]
 * [http://infomesh.net/2001/swintro/ Introduction to the Semantic Web]

----
== How We Store and Exchange Plant Information ==
Interoperability of plant information between proprietary systems became an issue almost from the advent of CAD in the 1950s. There are many organizations dedicated to interoperability in just about every industry. Work on interoperability started in the mid twentieth century in the U.S. Department of Defense, and expanded to include the aerospace, automotive, and plant industries. Included here are some of the more significant initiatives.

=== Plant Information Interoperability Projects ===
 * IGES
 * STEP / ISO 10303
 * PlantSTEP
 * PISTEP
 * PIEBASE
 * ProcessBASE

[[Image(History_STEP.JPG, 600px)]]

'''Fig 6 - History of STEP'''

=== The Initial Graphics Exchange Specification (IGES) ===
Computer based graphics systems started appearing in the mid 1950s in the U.S. defense industry. By the 1970s the Department of Defense wanted a neutral format that would allow the digital exchange of information between CAD systems. The IGES project was started in 1979 by a group of CAD users and vendors, with the support of the National Bureau of Standards (NBS), now known as the National Institute of Standards and Technology (NIST). In 1980 the NBS published what they called the Digital Representation for Communication of Product Definition Data. This standard was also published by ASME/ANSI as Y14.26M, which is how many military standards refer to it.
By 1988, any CAD software vendor who wanted to sell to the DoD had to support reading and writing IGES format files. Since then, IGES has been used in the automotive, shipbuilding and defense industries for everything from small parts up to entire aircraft carriers, where the digital drawings have to be used many years after the vendor of the original design software has gone out of business.

By 1994 a competing standard, STEP, was released as an ISO standard. Development of IGES was stopped.

'''References'''
 * [http://en.wikipedia.org/wiki/IGES Wikipedia]

=== Standard for the Exchange of Product Data (STEP) ===
The development of STEP started in 1984. The objective was to provide a means of describing product data throughout its lifecycle, independent of any particular computer system.

STEP shares many goals with ISO 15926. STEP's neutral files mean that product data can be archived over many years, and can be shared between different software systems. As well, the standard is implemented within commercial software particular to the engineering discipline, and so will be invisible to the average user.

But STEP differs from ISO 15926 in two important ways:
 * The manner in which templates and descriptions of plant objects are changed: STEP requires a lengthy review before approval of changes, whereas ISO 15926 allows class extensions to be made in as little as five minutes, by trained and approved individuals.
 * The ability to store temporal, or time-related information. Recording changes to a processing plant over its lifetime is outside the scope of STEP.

In 1994 STEP was issued as ISO 10303 ''Industrial automation systems and integration - Product data representation and exchange''.

STEP's credits include:
 * 1995 - Boeing 777
 * 2000 - GM exchanges parts drawings with suppliers
 * 2004 - Endorsed for U.S. Navy

'''References'''
 * [http://en.wikipedia.org/wiki/ISO_10303 Wikipedia]

=== PlantSTEP ===
PlantSTEP was active in the 1990s.
It was a consortium of organizations with the purpose of developing and exchanging standards based on ISO 10303. The hope is that these standards will enable concurrent engineering, design, construction, and operation of large facilities by allowing full information sharing among all project contributors. The vision is for all parties to be able to use their own tools and work methods, but to be able to share appropriate information between them seamlessly. The list of specific benefits mirrors that of all interoperability initiatives: * Reuse data * Share and exchange data between multiple participants with full integrity and fidelity * Lifetime data availability and retrieval at varying levels of detail * Owners can receive consistent deliverables from vendors, engineers, and constructors * Allows easier plant modification over life of facility '''References''' * [http://cic.nist.gov/plantstep/ NIST PlantSTEP pages] === Process Industries STEP Consortium (PISTEP) === PISTEP was created in 1992 to further the awareness of STEP in the process industries. The first phase culminated with major presentations at conferences in London, England in 1993 and 1995. The second phase continued until the end of that decade raising awareness of STEP, by then known as ISO 10303, as well as ISO 15926. In 2000, PISTEP merged with POSC Caesar, with PISTEP becoming the UK chapter. '''References''' * [http://homepages.rya-online.net/matthew-west/pistep/index.html PISTEP] === PIEBASE === ''Process'' ''Industry'' ''Executive'' ''for'' ''Achieving'' ''Business'' ''Advantage'' ''using'' ''Standards'' ''for'' ''data'' ''Exchange'' (PIEBASE) was chartered in the fall of 1996. The intent was to achieve a common strategy and vision for the delivery and use of internationally accepted standards for information sharing and exchange. PIEBASE is a global umbrella for many process industry consortia active in the development of STEP and other standards for industrial data. 
Its mandate is the overall coordination of the development and implementation of these standards.

'''References'''
 * [http://www.posc.org/piebase/ PIEBASE]
 * [http://fire.nist.gov/bfrlpubs/build98/art007.html PIEBASE Roadmap to Achieve Industry Vision for Information Exchange and Sharing]

=== European Union and ESPRIT ===
One of the main drivers for the European Union, founded in 1993, was to develop a single market for its member states. It has largely achieved this goal through a standardized system of laws, to which all member states adhere, and a common currency, which most have adopted. It is not surprising, then, that the EU is also interested in efforts to standardize the flow of information between its manufacturers.

The European Commission saw that the success of Europe itself depended on the ability of its industry to provide competitive goods, and that this success in turn would be helped by standardized methods of exchanging information about these goods. Having had significant participation in STEP already, the EU used the ESPRIT programme to sponsor a significant new part, AP221 ''Functional'' ''data'' ''and'' ''schematic'' ''representation'' ''of'' ''process'' ''plants''.

The ESPRIT programme was managed by the Directorate General for Industry of the European Commission. The Directorate General initiated a number of programmes which set the priorities for the EU's R&D activities; ESPRIT was part of the fourth in that series. Running from 1994 to 1998, one of ESPRIT's initiatives was to co-fund !ProcessBase.

'''References'''
 * http://cordis.europa.eu/esprit/src/intro.htm

=== !ProcessBase ===
!ProcessBase (alternately spelled ''Process Base'', ''ProcessBASE'', and ''PROCESSBASE'', depending on the source) was co-funded by ESPRIT and a consortium of mostly EU enterprises with two objectives:
 * To promote the use of new technologies in the area of product data, including STEP, and to facilitate data transfer among actors in the process industry.
* To ensure that Application Protocol AP221, developed by !ProcessBase, would interoperate with the ''Spatial'' ''Configuration'' ''for'' ''Process'' ''Plants'' to be developed by the US PlantSTEP initiative. In this the programme was largely successful. !ProcessBase developed a neutral format based on ISO 10303 and demonstrated the exchange of process plant functional data and schematics between CAD systems, analysis systems, and process plant databases in a pilot project. They contributed the application protocol AP221 to the development of STEP, which helped to solve problems of size and harmonization that plagued previous application protocols. Participants: * Framatome, France (Coordinator) * AKZO Nobel Engineering, NL * Bertin et CIE, France * Caesar Systems Limited, UK * DRAL, Rutherford Appleton Laboratory, UK * Initec, Spain * PDIT, USA '''References''' * http://research.cs.ncl.ac.uk/cabernet/www.laas.research.ec.org/esp-syn/text/iim-us9403.html == The History of Markup Languages == Markup languages have a long history in enabling computers to handle large bodies of text properly, without human intervention. When encoded with a markup language, the ''content'' of a body of text is separated from the ''format'', or appearance of the text. This is an important concept in ISO 15926 where the goal is to embed enough ''context'' into the ''content'' that we do not need to see the format, or appearance, of the information to know what it means. Key factors in a Markup Language: * A standard format in which to store information that lasts many times longer than proprietary commercial software. * A means to transfer information between proprietary computer systems. === What is a Markup Language? === In the context of understanding ISO 15926, a "markup language" is a set of conventions for marking up text that are used together with the text to tell a computer the meaning of the text. 
At a very simple level, punctuation, capitalization, and even the spaces between words themselves can be considered ''markup''. These features tell human readers when there is a break between ideas, when to pause, and where individual words start and stop. (If the reader thinks these are obvious necessities for understanding written text, there are numerous examples in the history of human societies where written material was in the style of [http://en.wikipedia.org/wiki/Scriptio_continua scriptio continua].)

Another example is spoken words, where the volume and tone of voice can be considered ''markup''. For instance, a given string of words may have a completely different meaning if they are yelled, spoken in a soft voice, or spoken with a condescending tone of voice. Thus, the ''value'' of the message (that is, the actual words spoken) must be considered together with the ''tags'' (that is, the volume and tone of voice) to obtain the correct meaning.

We have seen this concept previously in this primer. In the section about ''context'', we saw that the numerical value ''1034'' on its own had no meaning, but in the context of a particular spot on a particular data sheet, it meant the pressure of the seal flush of a centrifugal pump. Thus, the location of a value on a data sheet can be considered a sort of ''markup''.

If the meaning of a piece of text is embedded in the text by means of a markup language, one can use the same body of text for different purposes without modifying the text manually. For example, consider a scientific journal that publishes papers both in a printed magazine format and on its website. In the magazine, footnotes might be grouped at the end of the article, while footnotes on the website might pop up in their own little window. The publisher could manually edit the text for each purpose, but this would be doing it the hard way.
The easy way is to encode the text with a markup language that ''marks'' the beginning and end of each footnote, and shows its correct anchor point in the manuscript. When the publishing software prepares the text for print, it will group the footnotes at the end of the article, but when it prepares the text for the website, it will include the necessary HTML tags to create a popup window.

=== 1960s GML ===
Generalized Markup Language (GML) was developed in 1969 by the team of Charles Goldfarb, Ed Mosher, and Ray Lorie. (Look at the initials formed by their last names--it's not a coincidence. In fact Goldfarb invented the term "Markup Language" just to be able to use them!)

Goldfarb, a lawyer at the time, had joined IBM to get some high tech experience. He was assigned to a project to figure out how to merge case law research results together into one document, compose it, and print it. At the time there were no systems that would do all three things, so the text to be printed had to be transferred from one proprietary system to another, all without losing its fidelity, or meaning. GML was a set of macros that described the logical structure of the document, for instance, to declare some text to be a heading and other text to be a body paragraph.

Note that the issue of being able to transfer information between proprietary systems is the same issue that drives ISO 15926.

'''References'''
 * [http://www.sgmlsource.com/press/index.htm Charles Goldfarb's Online Press Kit]

=== 1980s Standard Generalized Markup Language (SGML) ===
SGML is a descendant of GML. SGML was originally intended for publishing databases and text. One of its first applications was publishing an early edition of the Oxford English Dictionary. SGML is known as a ''metalanguage'' since it can be used to describe other markup languages.

In the field of publishing, historically, ''markup'' has meant the marks that an editor makes when reviewing a manuscript.
For instance, marks to indicate that one phrase is to be rendered in bold face and another in italics. In an age of machine-readable text, this term has now come to mean special formatting codes inserted in-line with the text to give direction to the computer that does the publishing.

''Metalanguage'' means that SGML can be used to create other markup languages: SGML has the means to describe which markups are required and how to tell markups from text.

The first working draft of SGML by the American National Standards Institute (ANSI) was published in 1980. By 1983 it was ready for prime time and was adopted by the US Internal Revenue Service and the US Department of Defense. The next year the International Organization for Standardization (ISO) got involved, and in 1986 it issued SGML as an international standard (ISO 8879:1986).

One feature of SGML that distinguished it from other markup languages at the time is its emphasis on ''descriptive'' markup rather than ''procedural'' markup. This means that the tags ''describe'' the text rather than tell ''what to do with it''. For instance, it was common to use a proprietary markup language which told proprietary publishing equipment to, say, print this in 10pt Times Roman, and that in 12pt sans serif. But if the publisher wanted to process the text on different equipment, the tags would all have to be stripped out and new tags entered. SGML, however, simply said "this is body text", or "this is a footnote".

Of interest to the history of ISO 15926 are some of the reasons for using SGML:
 * In government and law, large bodies of text must be readable for decades. Therefore they must not be stored in any proprietary format that may go out of fashion in a few years.
 This is also one of the reasons to use ISO 15926; the life of a typical plant also spans several decades, during which time computer operating systems and text handling software go through many generations. The people dismantling the plant forty years later may not even remember the name of the software that was used by the engineers designing the plant.
 * SGML was also used as a means to transfer texts from one system to another in a manner that preserved the intent of the formatting. Similarly, ISO 15926 can be used to transfer information about plant objects from one system to another in a manner that preserves the meaning of the attributes of the plant object.

'''Example'''

Continuing the example of a pump data sheet, here is what the information might look like encoded in SGML:
{{{
CENTRIFUGAL PUMP DATA SHEET
Client: ABC Chemical Company
Tag No: P101
Service: Chemical Injection to D-101
...
Seal Flush Pressure: 1034 kPa
}}}

This shows the information on the data sheet as plain text. The title will likely be in a larger font, and the two headings will be in bold face. The rest of the text is understandable by humans, but you could not have a computer read it to extract, for instance, the tag number of the pump, its attributes, or its relationship to D-101.

'''References'''

Good introductory material:
 * [http://xml.coverpages.org/naggumWhat.html SGML: Erik Naggum's Brief Description] is one of the best places to start.
 * [http://www.isgmlug.org/sgmlhelp/g-index.htm A Gentle Introduction to SGML] (in HTML) or [http://xml.coverpages.org/gentle.html A Gentle Introduction to SGML] (in plain text).
 * [http://xml.coverpages.org/general.html#hist History of Generalized Markup and SGML]
 * [http://en.wikipedia.org/wiki/Standard_Generalized_Markup_Language Wikipedia: SGML]
 * [http://xml.coverpages.org/sgml.html SGML and XML as Markup Languages]

For more detailed information:

 * [http://www.w3.org/MarkUp/SGML/ Overview of SGML Resources]

=== 1989 - Hypertext Markup Language (HTML) ===

HTML is a descendant of SGML. It was invented by Tim Berners-Lee as a way to embed references to a document within another document. He envisioned being able to open such a referenced document directly, without having to exit the first document.

Berners-Lee based HTML on SGML since SGML could already be implemented on any machine. As with SGML, the idea was to be able to mark up text in a way that separated the ''message'' from the manner in which the message was displayed. For instance, ''<em> some text </em>'' meant that the enclosed text was to be somehow emphasized. Web browsers intended to be read with eyes might render the text slightly larger and in bold face, or perhaps underlined. Alternatively, web browsers intended to be listened to might render it in a slightly louder tone.

HTML attracted mostly academic interest for the first year or two. But as the Internet became more widely known, organizations started to realize how HTML could open the Internet to average people.

From the early 1990s, HTML became a battleground for various competing interests who added their own tags. One of the biggest issues was getting fine control over the appearance of the text and images. The example above, ''<em> some text </em>'', says that "some text" should be somehow emphasized, but in what way? Print publishers were used to tweaking text by adjusting the point size, leading, and kerning, and were not happy trusting the default handling of "emphasized" text.
The result today is that HTML has a great many tags for fine-tuning the appearance of text, but no tags to convey the meaning of text; you still need a person to read the material.

'''Example'''

HTML defined a number of tags and attributes, among them:

 * P, for paragraph
 * H1 thru H6, for heading levels
 * OL, ordered lists
 * UL, unordered lists
 * LI, list items
 * HREF (an attribute of A), references to other objects
 * A, to anchor HREF references

Here is how we might use them to encode our pump data sheet in HTML:

{{{
CENTRIFUGAL PUMP DATA SHEET

Client: ABC Chemical Company

Tag No: P101

Service: Chemical Injection to D-101

...

Seal Flush

}}}

Here we have used some of the new tags to lay the data sheet out a little more neatly. The title is the same, but now we can group the pump's attributes under headings. However, we are still formatting the text for human viewers. We have more tags to handle the appearance of the information, but nothing to tell a computer what the various bits of text mean.

'''References'''

 * [http://www.w3.org/People/Raggett/book4/ch02.html History from W3.Org]
 * [http://infomesh.net/html/history/early/ The Early History of HTML]
 * [http://www.yourhtmlsource.com/starthere/historyofhtml.html The History of HTML]
 * [http://en.wikipedia.org/wiki/HTML Wikipedia: HTML]
 * [http://www.livinginternet.com/w/ww_html.htm Hypertext Markup Language (HTML)]

=== 1990s - Extensible Markup Language (XML) ===

XML, also a descendant of SGML, is likewise a metalanguage in that it can be used to define other markup languages. XML was intended to get back to the SGML roots without the SGML complexity.

When it was released in its first draft in late 1996, its developers were not shy about proclaiming it to be the holy grail of computing, solving the problem of universal data interchange between dissimilar systems. Since its introduction it has accomplished at least some of what was intended of it. For instance, most of our Office documents are now stored in XML format. While some argue that the particular dialect of Office Open XML isn't the best formed in the world, it's still an order of magnitude better than the myriad proprietary formats that preceded it. Now it is much easier for third parties to reverse-engineer documents in order to open them in different authoring software.

Of interest to the history of ISO 15926 are some of the implications of widespread use of XML in web publishing. Looking into our crystal ball we can see applications written by webmasters that will allow untrained users to write in something that looks like Microsoft Word, then upload their fine prose (or poetry, or...)
straight into the local content management system. And as XML-written documents displace documents written with proprietary software (and uploaded as inscrutable binary files), more and more data will be open, available to be searched and indexed, and therefore available for all.

The "X" in XML means "Extensible". We can use this feature to mark up information about plant objects in a way that will let a computer read it.

'''Example'''

{{{
CENTRIFUGAL PUMP DATA SHEET ABC Chemical Company P101 Chemical InjectionD-101 ... 1034 kPa ... ...
}}}

=== Advantages over HTML ===

The example above shows how we can extend XML to include any kind of tag we wish. Right away you can see how we could then use a computer program to search the information to pull back the name of the pump, its associated equipment (D-101), and the seal flush pressure of 1034 kPa. Since XML is extensible, any organization can create its own tags for whatever it needs.

=== Drawback for Interoperability ===

The drawback is the need for agreement on the definition of terms. In order to get interoperability between systems, the owners of the systems have to agree on terms. As we have seen in previous sections, getting that agreement is not a trivial matter.

'''Descendants'''

 * SOAP
 * XML-RPC

'''References'''

 * [http://en.wikipedia.org/wiki/XML Wikipedia: XML]
 * [http://www.w3.org/XML/hist2002 Development History]
 * [http://www.w3.org/TR/WD-xml-961114.html Extensible Markup Language (XML)]
 * [http://www.itwriting.com/xmlintro.php Introducing XML]
 * [http://www.ibm.com/developerworks/library/x-xml2008prevw.html?ca=dgr-lnxw01XML-Future The future of XML]
 * [http://www.extropia.com/tutorials/xml/index.html Introduction to XML for Web Developers]

== NEXT ==

 * [wiki:ISO15926Primer_History_ExchangeTextInformation How we Store and Exchange Textual Information]

...

----
[[ViewTopic(ISO15926Primer_History)]]