Changes between Version 44 and Version 45 of ISO15926Primer_Mapping_Expensive

Show
Ignore:
Timestamp:
11/16/11 04:52:49 (12 years ago)
Author:
gordonrachar (IP: 75.156.216.35)
Comment:

--

Legend:

Unmodified
Added
Removed
Modified
  • ISO15926Primer_Mapping_Expensive

    v44 v45  
    33= Mapping Databases to Each Other is Expensive = 
    44 
    5 ---- 
     5The '''ISO 15926 Primer''' has been replaced with '''An Introduction to ISO 15926''', a free download from Fiatech. 
    66 
    7 [[PageOutline(2-4,Contents,inline)]] 
     7This page is out of date and has been depricated. 
    88 
    9 == Abstract == 
    10 When we want to exchange information between two software applications, the traditional way is to map the respective databases together.  We preserve context by manually examining the terms in each database to determine which are equivalent.  Directly mapping two applications together is often the easiest short-term solution.  But as more applications are mapped to form a network, the cost to maintain the individual maps grows exponentially. 
     9If you reached this page from a link in another web page please inform the webmaster. 
    1110 
    12 ---- 
     11For a peek at the new book and instructions on how to download a copy please follow this link. 
    1312 
    14 == Mapping In a Perfect World == 
    15 Let us say that you are an engineer working for a construction company planning the erection of some piping spools.  Part of your job is to plan the hydrotesting of all the pipelines before commissioning the plant.  So, along with a great deal of other information, you need to know the design temperatures and pressures of all the pipelines in the project.  Fortunately, the design engineer has given you a ''Line List'', a list of all the pipe line numbers and their critical attributes. 
    16  
    17 Your first job, then, is to simply get the design engineer's line list into a form you are used to dealing with--your company's construction management software.  You might decide to bite the bullet and just rekey the information manually.  But this gets old really quickly and after a few dozen lines you will be wondering why you can't just download the data from the engineer's design software into your construction software automatically.  After all, the design engineer did not (in this day and age) have a secretary type the line list onto a piece of paper with a typewriter and fax it to you; it almost certainly exists in an electronic database somewhere. 
    18  
    19 Now it is in the design engineer's best interest to make sure that the construction contractor has the correct information, so it is easy to imagine that "your IT folks" and "their IT folks" get together to give you a connection over the Internet to just "suck in the data."  And since the "lexical scope" of a Line List is only a few terms, it is easy to imagine that the data will "go right in". 
    20  
    21   ''"Artificial intelligence is the study of how to make real computers act like the ones in the movies."'' [[BR]]  Port 2000 Newsletter, The Information Technology Newsletter for Port Washington Educators, Dec. 96 
    22  
    23 In a perfect world, everyone developing an application for some part of plant design would use the same column names, and ensure the meaning of the contents of the columns were the same.  (In database jargon, the "schemas" would be the same.)  Figure 1 shows what this would look like. 
    24  
    25 [[Image(Mapping01.JPG, 300px)]] 
    26  
    27 '''Figure 1 - A Perfect World''' 
    28  
    29 == Mapping in the Real World == 
    30 But of course in the real world, things seldom work out this nicely.  Even in the rather small scope of a "line list" (where there the "lexical scope" is only a dozen or so terms) there is usually ambiguity. 
    31  
    32 [[Image(Mapping02.JPG, 300px)]] 
    33  
    34 '''Figure 2 - Actual Situation''' 
    35  
    36 For instance, in Figure 2, two columns have names that do not match.  One of them, "tag_no" in the Engineering Application is most probably equivalent to the "id" column in the Construction Application.  But what about the column "cl" in the Construction Application?  And, for that matter, what does "temp" really mean?  Could it mean "Temporary"?  It's possible.  It is common to have temporary lines erected for commissioning, then demolished when a plant is put in production.  And further, if you are being a bit paranoid (and if you have to sign off on the actual hydrotest pressure you ''better'' be at least a ''bit'' paranoid) what about "press"?  It almost certainly means "pressure", but what kind of pressure? 
    37  
    38  * Operating Pressure 
    39  * Maximum Allowable Working Pressure 
    40  * Design Pressure 
    41  
    42 If you have to specify the hydrotest pressure, you have to know for sure. 
    43  
    44 The reality is that we are inferring everything, based on context. 
    45  
    46 The only solution is to start digging.  Fortunately, the lexical scope of a line list is only a couple dozen terms.  To determine if "temp" means "temperature" or "temporary", you will have to look at a few rows of data.  If the data values are "y/n", or "0/1", it points to "temporary".  But if the data values are numbers in the range of, say "-50" to "1500", it probably means "temperature". 
    47  
    48 "press" will take a bit longer.  You will have to look for clues in other documents and use your engineering judgment to determine which kind of pressure it is.  You may have to contact the design engineer. 
    49  
    50 What looked fairly simple just a while ago is looking to be a bit more complex. 
    51  
    52   ''"There is always an easy solution to every human problem--neat, plausible, and wrong."'' - H. L. Mencken, 1917 (Variously attributed to Albert Einstein, Winston Churchill and others.) 
    53  
    54 == A Custom Map == 
    55  
    56 If you have to import information from the Engineering Application more than once, you will probably have someone create a custom mapping application. 
    57  
    58 [[Image(Mapping03.JPG, 350px)]] 
    59  
    60 '''Figure 3 - A Custom Solution''' 
    61  
    62 In Figure 3, someone has examined the data coming from the Engineering Application and determined which columns matched those of your Construction Application.  In software development jargon they have "Mapped the databases together."  The red box represents software that uses the map to transfer information from the Design application to the Construction application. 
    63  
    64 Note that very little data is actually transferred.  For instance if we know that both applications expect pipe sizes based on ASME B36.10M, we would only have to transfer numerical digits, such as "6" or "12". 
    65  
    66 === Complication - The Real World is, well, Complex === 
    67  
    68 In the simplistic example above, writing the mapping software would be quite simple, as these things go.  But the real world is more complex.  One common problem (from the point of view of a humble construction engineer) is that the data is often mangled across many fields. 
    69  
    70 Take, for example, a "simple" line number such as what might be represented by the column "tag_no".  As a construction engineer you are likely to look at a line number as "a label representing the name of this run of pipe, from here to there."  You expect the line number to be one string of text, something like: 
    71  
    72 ||'''Line Number'''|| 
    73 ||150-HCL-250-1C200-35|| 
    74  
    75 But the design engineer needs to be able to sort and filter on the pieces.  Her idea of a line number probably looks more like this: 
    76  
    77 ||'''Line Number'''|| 
    78 ||Area||Fluid Code||Nom Dia||Service Class||Insulation Thk|| 
    79 ||150||HCL||250||1C200||35|| 
    80  
    81 This added complication may not reveal itself until you get "under the hood" and look at the database dump.  The design engineer could well have concatenated the components of the line number in the written report to appear as if they were a single string of text.  If you are the one transferring information from one application to another, it is up to you to figure out all the pieces and put Humpty together  for your Construction Application. 
    82  
    83 It's a good thing that the lexical scope of a line list is only a few dozen terms. 
    84  
    85 == A Confederation of Applications == 
    86  
    87 If linking the first two applications together works out well, you may be tempted to link in another one.  Figure 4 shows how the Engineering Application might be mapped to a Procurement Application with a second custom database map. 
    88  
    89 [[Image(Mapping04.JPG, 600px)]] 
    90  
    91 '''Figure 4 - A Confederation of Applications''' 
    92  
    93 This is the beginning of what might be called a "Confederation of Applications", where many applications get information from each other, with custom-built database maps between each pair of applications.  Conceivably, you could eventually link all of the Construction Applications at your company together in this manner. 
    94  
    95 [[Image(Mapping05.JPG, 400px)]] 
    96  
    97 '''Figure 5 - Enterprise-Wide Solution''' 
    98  
    99 === Benefits of a Confederation of Applications === 
    100  * You no longer have to manually rekey information to transfer it between applications. 
    101  
    102 === Complication - You Have to Redo Work Every Time === 
    103   * To make everything work together you will likely have to re-discover everything about all of the applications in your confederation ''each time you map between any two of them.''  What this means is that when you want to, for instance, create the map between the Engineering Application and the Procurement Application (see Figure 4, above) there will likely be new developers.  Unless the second mapping exercise is conducted right away, the original developers may not be available.  Thus, even if several applications need all or part of the same line list, the new developers will have to re-discover everything about the Engineering Application every time.  As the number of linked applications expands, the effort required to connect them expands exponentially. 
    104  
    105 === Complication - Short Lifetime, High Maintenance === 
    106   * For a short period of time, say, the duration of a construction project, linking software applications with custom mapping routines may work just fine.  But software tends to get revised with "new and improved" capability.  Eventually, one of the "improvements" will be to refine the definitions of the data and the names of the database columns.  Your hard-coded custom database application map will break.  And if your applications are daisy-chained together, they may all break like a row of dominos.  To keep all applications working together someone has to maintain the linking functions. 
    107  
    108 === Complication - Ossification === 
    109   * To reduce your maintenance chores, you organization may try "freezing" the design of your software.  This will give short-term relief, but at the expense of stopping progress.  (Of course some may see this as an advantage!)  But eventually your organization will be left with a portfolio of old software than no one knows how to maintain anymore. 
    110  
    111 What we need is a way of transferring information that does not require us to maintain custom maps.  This is where ISO 15926 comes in. 
    112  
    113 == Next == 
    114  
    115   * [wiki:ISO15926Primer_PeekAt15926 Primer: A Peek at ISO 15926] 
    116  
    117 ---- 
     13  * [wiki:ISO15926Primer An Introduction to ISO 15926] 
Home
About PCA
Reference Data Services
Projects
Workgroups