The UCF_Meta_Data_Type complex element components
The UCF_Meta_Data_Type complex element is found in every UCF XML specification in version 2 and greater. The UCF_Meta_Data_Type complex element currently contains eight elements within its structure:
![meta-data-components-1.png [image]](http://www.unifiedcompliance.com/converted/images/meta-data-components001.png)
Figure 1 The UCF_Meta_Data_Type complex element (taken from the Authority Document XML file)
Within the list below, the "XXs" would be replaced with each XML list's appropriate identifier, such as "AD" within the Authority Document list, or "CE" for the Controls list.
Each element's definition will be presented, with the UCF's applicable editorial rules and QA methodologies following the description. The rule numbers will not appear in order. Rather, they will appear in the order in which we created the rules.
UCF_XX_Release_Date (xs:string)
The first element within the meta data component is used to signify which release version the table belongs to. While formatted as a string in the XSD, the text will be as follows:
Q3 09 - Final
The first section "Q3" represents which quarter the release belongs to.
The second section "09" is the two-digit year the release belongs to.
The final section "Final" represents the status of the release. Final is just that, the final, fully QA'd release. Pre-release will signify that this table belongs to a work in progress release, such as those sent to our XML licensees when we are working through the addition of new elements. And Update represents a critical update to a final release in case there is something urgent that needs to be sent out to the licensees.
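The three parts of the release string can be pulled apart with a small parser. This is a minimal sketch, not the UCF's own tooling: the function name `parse_release`, the `ReleaseTag` type, and the exact spellings accepted for the status ("Final", "Pre-release", "Update") are assumptions based on the description above.

```python
import re
from typing import NamedTuple

# Statuses named in the text; treating them as a closed set is an assumption.
KNOWN_STATUSES = {"Final", "Pre-release", "Update"}

# Matches strings shaped like "Q3 09 - Final".
_RELEASE_RE = re.compile(r"^Q([1-4]) (\d{2}) - (.+)$")


class ReleaseTag(NamedTuple):
    quarter: int  # 1 through 4
    year: str     # two-digit year kept as text, e.g. "09"
    status: str   # release status, e.g. "Final"


def parse_release(text: str) -> ReleaseTag:
    """Split a release string into quarter, two-digit year, and status."""
    match = _RELEASE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {text!r}")
    quarter, year, status = match.groups()
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unknown release status: {status!r}")
    return ReleaseTag(int(quarter), year, status)
```

The year is kept as text so that the sketch does not have to guess a century.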
UCF Editorial rules for the Release Date element
To be documented.
QA rule violations and checks for the Release Date element
To be documented.
UCF_XX_ID (UCF_ID_Type2)
UCF_XX_ID is the unique and persistent identifier for each authority document, restricted by the global definition of UCF_ID_Type (five digits) or UCF_ID_Type2 (seven digits). We use the UCF_XX_ID as the identifier so that if there is a discrepancy in any of the record's other information, linked references to the record will not change. For the same reason, we use the UCF_XX_ID field as the linking field when referencing this list from other lists.
One key note about the UCF_XX_ID element is that even though the element appears to be a string of numbers, the element should be treated as text. The reason for this is twofold:
1. If treated as a number, the leading zeroes are most often deleted or at least ignored by the database.
2. Other elements, like Genealogy, which use the ID, are looking for text elements instead of number elements (the reasoning for which will be made clear under the Genealogy discussion below).
The ID element is created when the record in question is created and is always assigned the next highest non-used, non-reserved ID in the system for that particular list.
This element will always be present in all UCF XML lists.
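The effect of treating IDs as text can be illustrated with a short sketch. The helper names `format_ucf_id` and `next_ucf_id` are hypothetical, and reading "next highest non-used, non-reserved ID" as "one above the highest ID in use, skipping reserved values" is an assumption rather than a documented algorithm.

```python
def format_ucf_id(number: int, digits: int = 7) -> str:
    """Render a number as a zero-padded, text-based ID (5 or 7 digits)."""
    if number < 0 or number >= 10 ** digits:
        raise ValueError(f"{number} does not fit in {digits} digits")
    return str(number).zfill(digits)


def next_ucf_id(used: set[str], reserved: set[str], digits: int = 7) -> str:
    """Return the next ID above the highest one in use, skipping reserved IDs."""
    candidate = max((int(i) for i in used), default=-1) + 1
    while format_ucf_id(candidate, digits) in reserved:
        candidate += 1
    return format_ucf_id(candidate, digits)


# Converting the text ID to a number silently drops the leading zeros.
print(int("0000042"))          # 42
print(format_ucf_id(42))       # 0000042
print(format_ucf_id(42, 5))    # 00042
```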
UCF Editorial rules for the ID element
There are two editorial rules for the ID element.
| ID | Rule |
|---|---|
| 1 | ID - Unique and Persistent Identifiers: Each and every unique record within the UCF will maintain an ID in the UCF format. |
| 2 | ID - Format: All IDs within the UCF will follow the text-based numerical format. This means that all IDs will have the required number of digits (five for Type 1 of the UCF, seven for Type 2), even if the leading digits are comprised of a series of zeros (000 for example). Because the UCF ID format requires the leading zeros in the ID field, the IDs themselves must be treated as text. |
QA rule violations and checks for the ID element
Below is the table which summarizes the rules mentioned above and their potential violations.
| Rule | Error | Potential Error |
|---|---|---|
| 1 | 14 | Content - Missing: The content for this field is completely missing in a field that does not allow a null value. |
| 1 | 31 | Content - Duplicate content: The content in question is a duplicate of content of the same type and therefore should not exist. |
| 2 | 24 | ID - Malformed ID: The ID, Genealogy, or Sort ID field does not fit the current ID format and a checksum calculation shows that the ID field is not correct. |
| 2 | 30 | Field Format - Improper field type: Certain data elements are looking for specific field formats. Our ID fields (also including our genealogy and sort IDs) are text-based numbers that are looking for text fields instead of number fields. Our date created is a date field, while our date modified is a date-time field. Therefore, this violation can occur when an element is entered into an improperly formatted field. |
UCF_XX_ID_CheckDigit (xs:integer)
We humans have to use numbers. However, when entering numbers, we humans also have a tendency to screw up the entry or copying of those numbers. A Dutch mathematician named Jacobus Verhoeff conducted a study of 12,000 numerical errors and from that, proposed a check digit calculation scheme that catches all single errors as well as all adjacent transpositions and most other errors.
To ensure that the IDs assigned by the system have integrity during input as well as distribution while being transferred into various formats (such as Excel, Word, Text, XML), each ID will also have its own checksum value stored in a checksum field. Currently, the methodology for creating and verifying the checksum follows the Verhoeff calculation format.
This element will always be present in all UCF XML lists.
The CheckDigit is created along with the record's ID as a calculation by the UCF database system. As such, once assigned it should never change because the ID will never change. A sample calculation format is shown in the use case scenarios.
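For readers who want to see the general shape of a Verhoeff calculation, the following is a minimal sketch of the standard published algorithm. Whether the UCF database applies exactly this variant (digit order, handling of leading zeros) is an assumption; the function names are hypothetical.

```python
# Multiplication table of the dihedral group D5.
_D = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
    [2, 3, 4, 0, 1, 7, 8, 9, 5, 6],
    [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
    [4, 0, 1, 2, 3, 9, 5, 6, 7, 8],
    [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
    [6, 5, 9, 8, 7, 1, 0, 4, 3, 2],
    [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
    [8, 7, 6, 5, 9, 3, 2, 1, 0, 4],
    [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
]
# Position-dependent permutation table.
_P = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
    [5, 8, 0, 3, 7, 9, 6, 1, 4, 2],
    [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
    [9, 4, 5, 3, 1, 2, 6, 8, 7, 0],
    [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
    [2, 7, 9, 3, 8, 0, 6, 4, 1, 5],
    [7, 0, 4, 6, 9, 1, 3, 2, 5, 8],
]
# Inverse of each element in D5.
_INV = [0, 4, 3, 2, 1, 5, 6, 7, 8, 9]


def verhoeff_check_digit(id_text: str) -> int:
    """Compute the Verhoeff check digit for a text-based numeric ID."""
    c = 0
    # Digits are processed right to left; position 0 is left for the check digit.
    for i, ch in enumerate(reversed(id_text)):
        c = _D[c][_P[(i + 1) % 8][int(ch)]]
    return _INV[c]


def verhoeff_is_valid(id_text: str, check_digit: int) -> bool:
    """Return True when the stored check digit matches the ID."""
    return verhoeff_check_digit(id_text) == check_digit


print(verhoeff_check_digit("236"))    # 3
print(verhoeff_is_valid("236", 3))    # True
print(verhoeff_is_valid("326", 3))    # False (adjacent transposition)
```

Because the digits are read from the text form of the ID, leading zeros are part of the input, which is one more reason to store the ID as text.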
Editorial rules for the CheckDigit element
There is only one rule for the CheckDigit element.
| ID | Rule |
|---|---|
| 18 | ID - Checksum: To ensure that the IDs assigned by the system have integrity during distribution while being transferred into various formats (such as Excel, Word, Text, XML), each ID will also have its own checksum value stored in a checksum field. Currently, the methodology for creating and verifying the checksum follows the Verhoeff calculation format. |
QA rule violations and checks for the CheckDigit element
Below is the table which summarizes the rules mentioned above and their potential violations.
| Rule | Error | Potential Error |
|---|---|---|
| 18 | 12 | Content - Co-referenced fields do not match: There are times when two fields (which might not be related in the database sense) must refer to each other or whose contents must interrelate. This rule has been broken when those fields do not co-reference each other correctly. As a case in point, the information in a record's Originator field might not match that found on the associated record's domain field. Or, the Acronym doesn't match the term in the Glossary. Or the Checksum doesn't match the ID field. |
| 18 | 14 | Content - Missing: The content for this field is completely missing in a field that does not allow a null value. |
| 18 | 30 | Field Format - Improper field type: Certain data elements are looking for specific field formats. Our ID fields (also including our genealogy and sort IDs) are text-based numbers that are looking for text fields instead of number fields. Our date created is a date field, while our date modified is a date-time field. Therefore, this violation can occur when an element is entered into an improperly formatted field. |
UCF_XX_Genealogy (xs:string)
Within the UCF, a record's genealogy is a set of UCF IDs strung together as distinct words (e.g., 0000000 0000001 0000002) that represent (from right to left) the current record's parent, grand-parent, great-grand-parent, on back to the very root element that spawned the list. At minimum, every record will have a genealogy of 0000000 which represents the root record within the list.
The genealogy element is initially created by the UCF database system when the record in question is created. If the record in question is moved lower or higher in the taxonomy, the genealogy is automatically re-calculated and the value will change to reflect the new taxonomic structure. Because the UCF editorial team does not have edit privileges for this element, the genealogy will always reflect the taxonomic position the record was last stored in. If there is a dispute about the record's genealogy, the dispute is an editorial one, and not a programming one.
This element will not always be present in all UCF XML lists. Some lists, like the Auditable Artifacts list do not require a genealogy and therefore the element will not be present in the XML schema.
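Reading and building a genealogy string can be sketched as follows. The helper names are hypothetical; the only behavior taken from the text is that the string runs from the root (0000000) on the left to the record's parent on the right, with IDs separated by spaces.

```python
ROOT_ID = "0000000"


def ancestors(genealogy: str) -> list[str]:
    """Return ancestor IDs ordered from the parent back to the root."""
    return list(reversed(genealogy.split()))


def child_genealogy(parent_id: str, parent_genealogy: str) -> str:
    """Build a record's genealogy from its parent's ID and genealogy."""
    if parent_id == ROOT_ID:
        # Records directly under the root carry only the root ID.
        return ROOT_ID
    return f"{parent_genealogy.strip()} {parent_id}"


print(ancestors("0000000 0000001 0000002"))
# ['0000002', '0000001', '0000000']
print(child_genealogy("0000002", "0000000 0000001"))
# 0000000 0000001 0000002
```

Splitting on whitespace keeps every ID as text, so leading zeros survive, and a trailing space after the last ID is tolerated.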
Editorial rules for the genealogy element
There are three editorial rules for the genealogy element.
| ID | Rule |
|---|---|
| 3 | Genealogy - No null values: Each and every unique record within a UCF table that is a hierarchical table will have a genealogy field that is a string of UCF formatted IDs (without the checksum) followed by a space. For more information, see the full technical documentation about UCF genealogical structure. |
| 4 | Genealogy - All records will follow a contextual taxonomic structure: Taxonomies are contextual. That means that some taxonomies, such as the glossary, can follow a simple structure (dogs fall under the heading "D", cats fall under the heading "C"). Other taxonomies (such as authority documents) must follow a version structure (version 1.1 of PCI-DSS falls under the heading of PCI-DSS). And yet others, such as our control list and our audit list, must fall under a dependency structure ("Turn off FTP" falls under "Disable all unnecessary applications"). All controls and their audit questions will follow a "natural hierarchy" according to the citation structures found in the Authority Documents being mapped. |
| 19 | Genealogy - Format: All Genealogy fields will follow a set format of stringing together the record's direct taxonomic hierarchy beginning with the root record for the table (always all zeroes) and ending with the record's parent. |
QA rule violations and checks for the genealogy element
Below is the table which summarizes the rules mentioned above and their potential violations.
| Rule | Error | Potential Error |
|---|---|---|
| 3 | 14 | Content - Missing: The content for this field is completely missing in a field that does not allow a null value. |
| 4 | 6 | Genealogy - Change: The record's genealogy is not correct (which is different than its sort order). This record has an incorrect parent and should be moved to become a child of a different record (which should be noted in the review comments). |
| 19 | 30 | Field Format - Improper field type: Certain data elements are looking for specific field formats. Our ID fields (also including our genealogy and sort IDs) are text-based numbers that are looking for text fields instead of number fields. Our date created is a date field, while our date modified is a date-time field. Therefore, this violation can occur when an element is entered into an improperly formatted field. |
| 19 | 24 | ID - Malformed ID: The ID, Genealogy, or Sort ID field does not fit the current ID format and a checksum calculation shows that the ID field is not correct. |
UCF_XX_Sort_ID (xs:integer)
We sort our displayed information according to a taxonomic display hierarchy (which means that the genealogy plays a vital role). For the most part, each element in any of our lists is given a three-digit sort identifier. We then append the record's sort identifier to its parent's Sort ID to create its own Sort ID. We treat this numeric Sort ID as a text field so that we can run our sort routine from left to right in the character string.
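A small sketch shows why the numeric Sort ID is compared as text. The helper name `build_sort_id` and the sample values are hypothetical; only the "three digits appended to the parent's Sort ID" scheme comes from the description above.

```python
def build_sort_id(parent_sort_id: str, position: int) -> str:
    """Append a three-digit sort identifier to the parent's Sort ID."""
    if not 0 <= position <= 999:
        raise ValueError("sort identifier must fit in three digits")
    return parent_sort_id + f"{position:03d}"


first_parent = "001"
second_parent = "002"
child_a = build_sort_id(first_parent, 2)    # "001002"
child_b = build_sort_id(first_parent, 10)   # "001010"

sort_ids = [second_parent, child_b, first_parent, child_a]

# Text comparison runs left to right, so children follow their own parent.
print(sorted(sort_ids))
# ['001', '001002', '001010', '002']

# Numeric comparison would push both children after the second parent.
print(sorted(sort_ids, key=int))
# ['001', '002', '001002', '001010']
```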
There are some exceptions to the numeric Sort ID field, namely in the glossary and vendor lists wherein the Sort ID is actually the genealogical name of the record's predecessors through its title. For instance, in the vendor list one of the vendors might be Sybari, which is a subsidiary of Microsoft. Therefore, its Sort ID would be "Microsoft Sybari".
The Sort ID is created and managed in the same manner as the genealogy (it is a dynamic calculation). It directly reflects the record's place within the taxonomic hierarchy and is therefore uneditable by the UCF's editorial team (although the team does set the sort order, the system handles the ID to manage the sort order). Any disputes with the validity of the sort ID are in effect a dispute with where the UCF's editorial team placed the record in question within the taxonomic structure.
This element will not always be present in all UCF XML lists. Some lists, like the Auditable Artifacts list do not require sorting and therefore the element will not be present in the XML schema.
Editorial rules for the Sort ID element
There are two editorial rules for the Sort ID element.
| ID | Rule |
|---|---|
| 10 | Sort Order - Contextual Sort Order: Sort orders must be set contextually. For example, the Glossary will always be sorted alphanumerically (because some terms are numeric). As another example, Authority Documents are sorted by their category, originator, alphabetically, and then by version history. Each UCF table will maintain its own contextual sorting rules. |
| 20 | Sort ID - Format: All records will follow a prescribed sort ID format. For lists that are not naturally alphabetical, the Sort ID format will be a numeric calculation. For further information on the Sort ID format for any given list, see that list's XML specification. |
QA rule violations and checks for the Sort ID element
Below is the table which summarizes the rules mentioned above and their potential violations.
| Rule | Error | Potential Error |
|---|---|---|
| 10 | 13 | Sort Order - Inaccurate: The Sort Order of the record is incorrect, which is a separate item from the genealogy order. The reviewer or approver must annotate the record ID to which this record must follow in order to correct the sort order problem. |
| 20 | 24 | ID - Malformed ID: The ID, Genealogy, or Sort ID field does not fit the current ID format and a checksum calculation shows that the ID field is not correct. |
| 20 | 30 | Field Format - Improper field type: Certain data elements are looking for specific field formats. Our ID fields (also including our genealogy and sort IDs) are text-based numbers that are looking for text fields instead of number fields. Our date created is a date field, while our date modified is a date-time field. Therefore, this violation can occur when an element is entered into an improperly formatted field. |
| 20 | 31 | Content - Duplicate content: The content in question is a duplicate of content of the same type and therefore should not exist. |
UCF_XX_Live_Status (xs:integer)
Because the UCF™ treats every ID as both unique and persistent, we never delete an ID once used, nor do we re-use the ID. Therefore, if we have to redact a record, we merely mark the Live Status as moving from 1 (live) to 0 (redacted).
All records are initially created and marked by the system as Live (1). There are certain scripts that the UCF's database team will run to ensure that two instances of automated deprecation take place:
1. If an Authority Document has been deprecated, all of its citations will be deprecated.
2. If a control has no citations pointing to it, the control in question will be deprecated.
Other than the instances noted above, records must be deprecated as an editorial process and approved by both the editorial reviewer and the editorial approver. When the Live Status is set to deprecated (0), there might also be a corresponding setting for the Deprecated By element, but this is not mandatory.
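The two automated deprecation cases described above can be sketched as follows. The data model (`Record`, `Citation`, and their field names) is hypothetical, and counting only live citations when deciding whether a control still has citations pointing to it is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Record:
    ucf_id: str
    live_status: int = 1  # 1 = live, 0 = deprecated


@dataclass
class Citation(Record):
    authority_document_id: str = ""
    control_id: str = ""


def run_automated_deprecation(authority_documents, citations, controls):
    """Apply both automated deprecation cases in place."""
    # Case 1: citations of a deprecated Authority Document are deprecated.
    deprecated_ads = {
        ad.ucf_id for ad in authority_documents if ad.live_status == 0
    }
    for citation in citations:
        if citation.authority_document_id in deprecated_ads:
            citation.live_status = 0

    # Case 2: controls with no (live) citations pointing to them are deprecated.
    cited = {c.control_id for c in citations if c.live_status == 1}
    for control in controls:
        if control.ucf_id not in cited:
            control.live_status = 0


ads = [Record("0000001", live_status=0), Record("0000002")]
cits = [
    Citation("0000010", authority_document_id="0000001", control_id="0000100"),
    Citation("0000011", authority_document_id="0000002", control_id="0000101"),
]
ctrls = [Record("0000100"), Record("0000101"), Record("0000102")]
run_automated_deprecation(ads, cits, ctrls)
print([c.live_status for c in cits])   # [0, 1]
print([c.live_status for c in ctrls])  # [0, 1, 0]
```

Records are never removed from the lists; only the Live Status value changes, which keeps every ID persistent.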
This element will always be present in all UCF XML lists.
Editorial rules for the Live Status element
There is one editorial rule for the Live Status element.
| ID | Rule |
|---|---|
| 21 | Live Status - No null values: Live Status will either be 1 (live) or 0 (deprecated) at all times. |
QA rule violations and checks for the Live Status element
Below is the table which summarizes the rules mentioned above and their potential violations.
| Rule | Error | Potential Error |
|---|---|---|
| 21 | 14 | Content - Missing: The content for this field is completely missing in a field that does not allow a null value. |
| 21 | 20 | Content - Wrong restricted value type: The value selected for the record is contained in the list, but is the wrong value from the list. |
| 21 | 30 | Field Format - Improper field type: Certain data elements are looking for specific field formats. Our ID fields (also including our genealogy and sort IDs) are text-based numbers that are looking for text fields instead of number fields. Our date created is a date field, while our date modified is a date-time field. Therefore, this violation can occur when an element is entered into an improperly formatted field. |
UCF_XX_Deprecated_By (xs:string)
If a record in the UCF needs to be deprecated, the record will not be deleted from the system. Instead, the record will be marked as deprecated (its "Live Status" field will be set to 0), and the Deprecated By field will be filled out with the ID(s) of the record(s) that took its place (if any).
Initially this element is blank and only a UCF editorial process can indicate a Deprecated By content change. That change is then reviewed by the editorial reviewer and editorial approver. If there are contents in this field, the Live Status field must be set to deprecated (0).
This element will not always be present in all UCF XML lists. Therefore the element will not always be present in the XML schema.
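The relationship between Deprecated By and Live Status lends itself to a simple consistency check. This is a sketch only: the function name is hypothetical, and the assumption that multiple IDs in Deprecated By are separated by whitespace (as in the genealogy) is not confirmed by the text.

```python
def deprecation_errors(live_status: int, deprecated_by: str) -> list[str]:
    """Return a list of problems found in a record's deprecation fields."""
    errors = []
    if live_status not in (0, 1):
        errors.append("Live Status must be 1 (live) or 0 (deprecated)")

    replacement_ids = deprecated_by.split()
    # Any content in Deprecated By requires the record to be deprecated.
    if replacement_ids and live_status != 0:
        errors.append("Deprecated By is filled in but Live Status is not 0")

    # Replacement IDs must be text-based numbers of five or seven digits.
    for ucf_id in replacement_ids:
        if not (ucf_id.isdigit() and len(ucf_id) in (5, 7)):
            errors.append(f"malformed ID in Deprecated By: {ucf_id!r}")
    return errors


print(deprecation_errors(0, "0000123"))  # []
print(deprecation_errors(1, "0000123"))
# ['Deprecated By is filled in but Live Status is not 0']
```

An empty Deprecated By with a Live Status of 0 passes the check, because a replacement record is not mandatory.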
Editorial rules for the Deprecated By element
There is one editorial rule for the Deprecated By element.
| ID | Rule |
|---|---|
| 13 | Deprecation - Records will never be deleted: If a record in the UCF needs to be deprecated, the record will not be deleted from the system. Instead, the record will be marked as deprecated (its "Live Status" field will be set to 0), and the Deprecated By field will be filled out with the ID of the record that took its place (if any), and the record's genealogy will be changed to all nines (9999999) if the record is in a table that supports a hierarchical structure. |
QA rule violations and checks for the Deprecated By element
Below is the table which summarizes the rules mentioned above and their potential violations.
| Rule | Error | Potential Error |
|---|---|---|
| 13 | 8 | Deprecation - Missing: The record should be deprecated. |
| 13 | 12 | Content - Co-referenced fields do not match: There are times when two fields (which might not be related in the database sense) must refer to each other or whose contents must interrelate. This rule has been broken when those fields do not co-reference each other correctly. As a case in point, the information in a record's Originator field might not match that found on the associated record's domain field. Or, the Acronym doesn't match the term in the Glossary. Or the Checksum doesn't match the ID field. |
| 13 | 15 | Content - Does not match restricted value type: The content that was expected for this field does not match the restricted value list associated with the field. |
| 13 | 16 | Deprecation - Inappropriate: The record should not be deprecated. |
| 13 | 24 | ID - Malformed ID: The ID, Genealogy, or Sort ID field does not fit the current ID format and a checksum calculation shows that the ID field is not correct. |
| 13 | 30 | Field Format - Improper field type: Certain data elements are looking for specific field formats. Our ID fields (also including our genealogy and sort IDs) are text-based numbers that are looking for text fields instead of number fields. Our date created is a date field, while our date modified is a date-time field. Therefore, this violation can occur when an element is entered into an improperly formatted field. |
UCF_XX_Deprecation_Notes (xs:string)
Deprecation notes are new to version 2.1 of the UCF, and we've done as good a job as possible back-filling them to ensure that we have covered our bases.
In a nutshell, when our mappers, reviewers, or approvers have made the decision to deprecate one of the records in the various XML tables, they will add their deprecation notes, their reasoning, to this field. There is no set format for what they are writing, so there aren't any hard and fast editorial rules, other than something has to be added to the field during deprecation.
UCF Editorial rules for the Deprecation Notes element
To be documented.
QA rule violations and checks for the Deprecation Notes element
To be documented.
UCF_XX_Date_Added (xs:date)
UCF_XX_Date_Added is a date stamp for when the record was created.
This element is created when the record is entered into the UCF's Master Content database and not the working database. We chose this method because the UCF team's editorial process is a fluid one which allows, during the editing process, for records to be added, moved, deleted, or even "un-deleted" fluidly until the lock-date that ends the editorial process. Once the lock-date has been reached, all of the records are then finalized from the "working" list and uploaded as a batch to the Master Content database, which also triggers the change log process. Therefore, it is common to see all new records for any given quarter being added on the same date.
Because the Date Added element is controlled post-editorial process, the UCF database system manages everything automatically.
This element will always be present in all UCF XML lists.
Editorial rules for the Date Added element
There is one editorial rule for the Date Added element.
| ID | Rule |
|---|---|
| 17 | Dates - All date added and modified fields will be programmatically controlled: All date fields, such as date added, date modified, etc., will be programmatically controlled by the database and those fields will be set so that they cannot be modified by the end user. |
QA rule violations and checks for the Date Added element
Below is the table which summarizes the rules mentioned above and their potential violations.
| Rule | Error | Potential Error |
|---|---|---|
| 17 | 14 | Content - Missing: The content for this field is completely missing in a field that does not allow a null value. |
| 17 | 30 | Field Format - Improper field type: Certain data elements are looking for specific field formats. Our ID fields (also including our genealogy and sort IDs) are text-based numbers that are looking for text fields instead of number fields. Our date created is a date field, while our date modified is a date-time field. Therefore, this violation can occur when an element is entered into an improperly formatted field. |
| 17 | 32 | Date fields - Chicken and Egg: All Modified Dates should be after the creation date. |
UCF_Date_Modified (xs:date)
UCF_Date_Modified is a date stamp for when the record was modified. We use this as a key field for tracking all roll forward and roll backward field calculations. The initial date reflects the date the authority document was added to the database.
This element is created and updated when the record is entered into the UCF's Master Content database and not the working database. We chose this method because the UCF team's editorial process is a fluid one which allows, during the editing process, for records to be added, moved, deleted, or even "un-deleted" fluidly until the lock-date that ends the editorial process. Once the lock-date has been reached, all of the records are then finalized from the "working" list and uploaded as a batch to the Master Content database, which also triggers the change log process, which relies on this field to trigger that a change has taken place in the record. Therefore, it is common to see all new records for any given quarter being "modified" on the same date, and all modifications for the quarter to happen on the same date as well.
We have heard from multiple XML licensees that they would rather have the exact date and time that the record was modified instead of the batch upload date. That isn't possible, given that all of the XML licensees also want us to produce a compact and digestible change log. A change log based upon the exact date of modification would have already produced several instances with over ten changes for certain records, changes that were of no consequence to either the XML licensee or an end user because they were simply a part of our internal editorial process. Therefore, to save processing time on the change log and to shorten the length of the (already very heavy) change log, we made the strategic decision to limit both date modified and date created to the batch upload dates.
Because the Date Modified element is controlled post-editorial process, the UCF database system manages everything automatically.
This element will always be present in all UCF XML lists.
Editorial rules for the Date Modified element
There is one editorial rule for the Date Modified element.
| ID | Rule |
|---|---|
| 17 | Dates - All date added and modified fields will be programmatically controlled: All date fields, such as date added, date modified, etc., will be programmatically controlled by the database and those fields will be set so that they cannot be modified by the end user. |
QA rule violations and checks for the Date Modified element
Below is the table which summarizes the rules mentioned above and their potential violations.
| Rule | Error | Potential Error |
|---|---|---|
| 17 | 14 | Content - Missing: The content for this field is completely missing in a field that does not allow a null value. |
| 17 | 30 | Field Format - Improper field type: Certain data elements are looking for specific field formats. Our ID fields (also including our genealogy and sort IDs) are text-based numbers that are looking for text fields instead of number fields. Our date created is a date field, while our date modified is a date-time field. Therefore, this violation can occur when an element is entered into an improperly formatted field. |
| 17 | 32 | Date fields - Chicken and Egg: All Modified Dates should be after the creation date. |
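The "Chicken and Egg" check (error 32) can be sketched in a few lines. The function name is hypothetical. Allowing the two dates to be equal is an assumption, made because a new record's initial modified date reflects the batch upload date on which it was added.

```python
from datetime import date


def dates_in_order(date_added: date, date_modified: date) -> bool:
    """Return True when the modified date does not precede the added date."""
    # Equal dates are accepted: new records are stamped with the same
    # batch upload date for both fields (an assumption, see lead-in).
    return date_modified >= date_added


print(dates_in_order(date(2009, 7, 1), date(2009, 10, 1)))  # True
print(dates_in_order(date(2009, 10, 1), date(2009, 7, 1)))  # False
```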
