Bulk Compare Tool
HL7Spy Enterprise and Professional Only
The Bulk Compare tool allows messages from 2 different Tabs (files) to be compared for differences in message structure and content.
Baseline and Candidate Message Sets
A typical use of this tool is to compare the output of 2 different Integration Engines using the same source file. In the image below a source file is being processed by a "Legacy Integration Engine" resulting in a "Baseline File" and by a "Replacement Integration Engine", resulting in the "Candidate File".
In HL7Spy we use the terminology "Baseline" to represent the expected (or correct) output, and the "Candidate" to represent the file being compared to the "Baseline".
Linking Tabs
The principle behind the operation of this tool is to "link" two Message Tabs together by using a field (called the key path) within the message that identifies the same message in each tab. Once the equivalent message from each tab has been identified, a comparison is performed, and the differences are noted. By default, the key path is MSH-10 (Message Control ID), but this can be configured to use any combination of fields within the message.
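The linking idea can be illustrated outside the tool with a minimal Python sketch. This is not HL7Spy's implementation: the helper names `get_field` and `index_by_key` are hypothetical, and the code assumes the standard "|" field separator and carriage-return segment terminators.

```python
def get_field(message: str, segment_id: str, field_no: int) -> str:
    """Return one field from the first matching segment, or '' if absent."""
    for segment in message.replace("\n", "\r").split("\r"):
        fields = segment.split("|")
        if fields[0] != segment_id:
            continue
        # In MSH the field separator itself counts as MSH-1,
        # so MSH-n sits at list index n-1 rather than n.
        index = field_no - 1 if segment_id == "MSH" else field_no
        return fields[index] if index < len(fields) else ""
    return ""


def index_by_key(messages: list[str]) -> dict[str, int]:
    """Map each message's MSH-10 (Message Control ID) to its position in the tab."""
    return {get_field(m, "MSH", 10): i for i, m in enumerate(messages)}


# Hypothetical sample tab containing a single message.
sample_tab = ["MSH|^~\\&|A|B|C|D|20240101||ADT^A08|MSG001|P|2.3\rPID|1||123"]
key_index = index_by_key(sample_tab)
```

Building one such index per tab makes it cheap to find the message in the Candidate tab that carries the same key as a given Baseline message.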
Linking two tabs is accomplished by clicking the "Bulk Compare" option in the main window and selecting the "Baseline" and "Candidate" message tabs as shown below. (Note: the message files need to be first loaded in HL7Spy before they will appear in the Bulk Compare tool.)
Running the Comparison
Once the tabs have been selected for comparison, click the "Run" button to start the comparison process. For each message in the Baseline tab, the tool finds the matching message in the Candidate tab and looks for differences between the two. Any differences found are recorded for later navigation and reporting. A summary of the results is displayed in the main Bulk Compare window, an example of which is shown below.
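As a rough illustration of the comparison pass described above, the following Python sketch walks the Baseline messages, looks up each one in the Candidate set by key, and records field-level differences. The functions `flatten` and `compare_tabs` and the path format `SEG[n]-field` are assumptions made for this example, not HL7Spy's internals.

```python
def flatten(message: str) -> dict[str, str]:
    """Flatten a message to {path: value}, e.g. 'PID[1]-5' -> 'DOE^JOHN'."""
    fields_by_path: dict[str, str] = {}
    seen: dict[str, int] = {}
    for segment in filter(None, message.replace("\n", "\r").split("\r")):
        parts = segment.split("|")
        seg_id = parts[0]
        seen[seg_id] = seen.get(seg_id, 0) + 1
        # MSH-1 is the separator itself, so MSH fields are offset by one.
        first_field = 2 if seg_id == "MSH" else 1
        for number, value in enumerate(parts[1:], start=first_field):
            fields_by_path[f"{seg_id}[{seen[seg_id]}]-{number}"] = value
    return fields_by_path


def compare_tabs(baseline: list[str], candidate: list[str], key_path: str = "MSH[1]-10"):
    """Return (field differences per key, keys missing from the candidate)."""
    candidate_by_key = {flatten(m).get(key_path, ""): m for m in candidate}
    diffs: dict[str, dict[str, tuple[str, str]]] = {}
    not_in_candidate: list[str] = []
    for message in baseline:
        base_fields = flatten(message)
        key = base_fields.get(key_path, "")
        match = candidate_by_key.get(key)
        if match is None:
            not_in_candidate.append(key)
            continue
        cand_fields = flatten(match)
        # A field missing on one side is treated as an empty value.
        changed = {
            path: (base_fields.get(path, ""), cand_fields.get(path, ""))
            for path in sorted(base_fields.keys() | cand_fields.keys())
            if base_fields.get(path, "") != cand_fields.get(path, "")
        }
        if changed:
            diffs[key] = changed
    return diffs, not_in_candidate
```

The sketch compares whole fields only. A real comparison would also need to descend into repeats, components, and subcomponents.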
Interpreting the Results
Results are displayed with one row per message type. The "(All)" row represents the aggregate of the results for all message types.
In this example, the 2 message tabs being compared each had 100,000 messages and were very close to being identical. In fact, only 4 messages had differences (shown in orange): 2 ADT^A08, 1 ORM, and 1 ORU message. A total of 10 fields differed: 2 fields across the A08 messages, 1 field in the ORM messages, and 7 fields across the ORU (R01) messages.
A definition of the columns from left to right is as follows:
Type - the HL7 message type.
Baseline - the number of messages in the Baseline tab.
Candidate - the number of messages in the Candidate Tab.
Exact - the number of messages that are identical, or binary equal.
Exact % - the percentage of messages that are exactly the same.
Same - the number of messages that are semantically the same, meaning they have the same content but may have an extra empty field or a field that was configured to be explicitly ignored.
Same % - the percentage of messages that are semantically the same.
Diff - the number of messages that are different between the Baseline and Candidate tabs.
Not in Candidate - the number of messages in the Baseline tab that were not found in the Candidate tab.
Not in Baseline - the number of messages in the Candidate tab that were not found in the Baseline tab.
Field Diffs - the total number of fields that were different between the Baseline and Candidate message sets.
Errors - the number of parsing errors found during message comparison.
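To make the distinction between the Exact, Same, and Diff columns concrete, here is a small Python sketch of one way a matched pair of messages could be classified. It is an assumption-laden illustration, not the tool's algorithm: the function names, the simplified `SEG-field` path format for ignored fields, and the choice to treat the three categories as mutually exclusive are all assumptions.

```python
def significant_fields(message: str, ignore: frozenset[str]) -> dict[tuple[int, str], str]:
    """Collect non-empty, non-ignored fields keyed by (segment position, path)."""
    fields = {}
    for seg_no, segment in enumerate(filter(None, message.split("\r")), start=1):
        parts = segment.split("|")
        # MSH-1 is the separator itself, so MSH fields are offset by one.
        first_field = 2 if parts[0] == "MSH" else 1
        for number, value in enumerate(parts[1:], start=first_field):
            path = f"{parts[0]}-{number}"
            if value and path not in ignore:
                fields[(seg_no, path)] = value
    return fields


def classify(baseline: str, candidate: str, ignore: frozenset[str] = frozenset()) -> str:
    """Return 'exact', 'same', or 'diff' for one matched pair of messages."""
    if baseline == candidate:
        return "exact"  # identical text
    if significant_fields(baseline, ignore) == significant_fields(candidate, ignore):
        return "same"   # differs only in empty or ignored fields
    return "diff"
```

Under these assumptions, a trailing empty field or a change confined to an ignored path yields "same", while any other change yields "diff".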
Navigating Message Differences
To view the list of messages that are Different, Exact, or the Same, click the "+" on the left side of a row in the Bulk Comparison Summary window. This list also includes the Key used to identify the messages and the index of the message in both the Baseline and Candidate message tabs.
By clicking on a row, the Baseline message will be displayed in the left window of the "Compare" tool, while the Candidate message will be displayed in the right window.
Summary of Field Differences
The Field Differences view shows an HL7 Field-centered perspective of changes across the selected message type. In the example image, all 10 different fields are displayed, with the Baseline and Candidate field values in the last two columns. The "Field Difference Summary" provides a summary of changes based on each unique HL7 Path.
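The per-path summary can be thought of as a simple grouping of field differences by HL7 path. The Python sketch below shows that idea. The function name, the tuple layout, and the sample values are hypothetical and are not taken from the tool or from the example image.

```python
from collections import Counter


def summarize_by_path(field_diffs: list[tuple[str, str, str]]) -> Counter:
    """Count differences per HL7 path from (path, baseline_value, candidate_value) rows."""
    return Counter(path for path, _baseline, _candidate in field_diffs)


# Hypothetical field differences, for illustration only.
example_diffs = [
    ("PID-5", "DOE^JOHN", "DOE^JON"),
    ("OBX-5", "7.1", "7.10"),
    ("OBX-5", "140", "140.0"),
]
summary = summarize_by_path(example_diffs)
```

Sorting such a summary by count is a quick way to spot a systematic difference, such as one field that is formatted differently in every message.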
Ignore Fields
If specific fields are not relevant to the comparison results, they can be ignored by adding them to the "Paths to Ignore" setting.
Examples:
- ORC-3.2 - ignore the 2nd component of ORC-3
- PID-5[2] - ignore the 2nd repeat of PID-5
- OBX[*]-11 - ignore the 11th field of all OBX segments
- IN1 or IN1[1] - ignore the 1st IN1 segment
- IN1[*] - ignore all IN1 segments
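The path forms listed above follow a regular pattern: a segment name, an optional segment index, and an optional field with an optional repeat or component. The Python sketch below parses only the forms shown in these examples. The `IgnorePath` type, the regular expression, and the function name are assumptions made for illustration and do not describe HL7Spy's own parser, which may accept more syntax.

```python
import re
from typing import NamedTuple, Optional

PATH_RE = re.compile(
    r"^(?P<segment>[A-Z][A-Z0-9]{2})"   # segment name, e.g. ORC, IN1
    r"(?:\[(?P<seg_index>\*|\d+)\])?"   # optional segment index or [*]
    r"(?:-(?P<field>\d+)"               # optional field number
    r"(?:\[(?P<repeat>\d+)\])?"         # optional field repeat
    r"(?:\.(?P<component>\d+))?)?$"     # optional component
)


class IgnorePath(NamedTuple):
    segment: str
    seg_index: Optional[int]   # None means every occurrence ("[*]")
    field: Optional[int]       # None means the whole segment
    repeat: Optional[int]      # None means no repeat was specified
    component: Optional[int]   # None means no component was specified


def parse_ignore_path(text: str) -> IgnorePath:
    """Parse one ignore path; raise ValueError if it does not fit the pattern."""
    match = PATH_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized ignore path: {text!r}")

    def to_int(name: str) -> Optional[int]:
        value = match[name]
        return int(value) if value is not None else None

    raw_index = match["seg_index"]
    if raw_index == "*":
        seg_index = None        # all occurrences of the segment
    elif raw_index is None:
        seg_index = 1           # a bare segment name means the 1st occurrence
    else:
        seg_index = int(raw_index)
    return IgnorePath(match["segment"], seg_index,
                      to_int("field"), to_int("repeat"), to_int("component"))
```

For example, `parse_ignore_path("IN1")` and `parse_ignore_path("IN1[1]")` produce the same value in this sketch, mirroring the equivalence stated in the list.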