Bulk Compare Tool

HL7Spy Enterprise and Professional Only

The Bulk Compare tool allows messages from 2 different Tabs (files) to be compared for differences in message structure and content.

Baseline and Candidate Message Sets

A typical use of this tool is to compare the output of 2 different Integration Engines using the same source file. In the image below a source file is being processed by a "Legacy Integration Engine" resulting in a "Baseline File" and by a "Replacement Integration Engine", resulting in the "Candidate File".

In HL7Spy we use the terminology "Baseline" to represent the expected (or correct) output, and the "Candidate" to represent the file being compared to the "Baseline". 

Linking Tabs

The principle behind the operation of this tool is to "link" 2 Message Tabs together by using a field (called the key path) within the message that identifies the same message in each tab. Once the equivalent message from each tab has been identified, a comparison is performed, and the differences noted. By default, the key path is MSH-10 (Message Control ID) but this can be configured to use any combination of fields within the message.
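The linking step can be illustrated with a short Python sketch. This is not HL7Spy's implementation; the function names message_key and link_tabs are hypothetical, and the sketch assumes each message is a string whose segments are separated by carriage returns and whose first segment is MSH.

```python
def message_key(message):
    """Return MSH-10 (Message Control ID) from a raw HL7 v2 message."""
    msh = message.split("\r", 1)[0]
    field_sep = msh[3]  # the character right after "MSH" is the field separator
    fields = msh.split(field_sep)
    # fields[0] is "MSH" and MSH-1 is the separator itself,
    # so MSH-n sits at fields[n - 1] for n >= 2.
    return fields[9]


def link_tabs(baseline, candidate):
    """Pair messages from two lists by key.

    Returns (linked, missing): linked holds (key, baseline_index,
    candidate_index) tuples, missing holds baseline keys with no match.
    """
    candidate_index = {message_key(m): i for i, m in enumerate(candidate)}
    linked, missing = [], []
    for i, message in enumerate(baseline):
        key = message_key(message)
        if key in candidate_index:
            linked.append((key, i, candidate_index[key]))
        else:
            missing.append(key)
    return linked, missing
```

Building a dictionary over the Candidate messages first means each Baseline message is matched with a single lookup, which matters when each tab holds many thousands of messages.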

Linking 2 tabs is accomplished by clicking the "Bulk Compare" option in the main window and selecting the "Baseline" and "Candidate" message tabs as shown below. (Note that the message files must first be loaded in HL7Spy before they will appear in the Bulk Compare tool.)


Running the Comparison

Once the tabs have been selected for comparison, select the "Run" button to start the comparison process. Bulk message comparison is performed by taking each message in the Baseline tab and finding a matching message in the Candidate tab and looking for any differences between the 2 messages. Any differences that are found are recorded and saved for later navigation and reporting. A summary of the results is displayed in the main Bulk Compare window, an example of which is shown below.
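As a rough illustration of what comparing a matched pair of messages involves, the Python sketch below lists the fields that differ between two messages. It is a simplified assumption of how such a comparison could work, not HL7Spy's algorithm: the names fields_by_path and diff_messages are hypothetical, the default "|" field separator and carriage-return segment terminator are assumed, and a missing trailing field is treated as empty.

```python
from collections import defaultdict


def fields_by_path(message):
    """Map paths such as 'PID[1]-5' to the raw field value."""
    result = {}
    seen = defaultdict(int)
    for segment in filter(None, message.split("\r")):
        parts = segment.split("|")
        name = parts[0]
        seen[name] += 1
        # MSH-1 is the field separator itself, so MSH fields are shifted by one.
        first = 2 if name == "MSH" else 1
        for number, value in enumerate(parts[1:], start=first):
            result[f"{name}[{seen[name]}]-{number}"] = value
    return result


def diff_messages(baseline, candidate):
    """Return (path, baseline_value, candidate_value) for each differing field."""
    b, c = fields_by_path(baseline), fields_by_path(candidate)
    diffs = []
    for path in sorted(b.keys() | c.keys()):
        b_value, c_value = b.get(path, ""), c.get(path, "")
        if b_value != c_value:
            diffs.append((path, b_value, c_value))
    return diffs
```

Because an absent field is read as an empty string, an extra empty field at the end of a segment does not show up as a difference in this sketch.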


Interpreting the Results

Results are displayed with one row per message type. The "(All)" row represents the aggregate of the results for all message types.

In this example, the 2 message tabs being compared each had 100,000 messages and were very close to being identical. In fact, there were only 4 messages that had differences (shown in orange): 2 ADT^A08, 1 ORM, and 1 ORU message. The total number of fields that had differences was 10: 2 fields across all A08 messages, 1 field in the ORM message, and 7 fields across all R01 messages.

The columns, from left to right, are defined as follows:

Type - the HL7 message type.

Baseline - the number of messages in the Baseline tab.

Candidate - the number of messages in the Candidate Tab.

Exact - the number of messages that are identical, or binary equal.

Exact % - the percentage of messages that are exactly the same.

Same - the number of messages that are semantically the same. That is, they have the same content, but may have an extra empty field, or a field that was configured to be explicitly ignored.

Same % - the percentage of messages that are semantically the same.

Diff - the number of messages that are different between the Baseline and Candidate tabs.

Not in Candidate - the number of messages in the Baseline tab that were not found in the Candidate tab.

Not in Baseline - the number of messages in the Candidate tab that were not found in the Baseline tab.

Field Diffs - the total number of fields that were different between the Baseline and Candidate message sets.

Errors - the number of parsing errors found during message comparison.
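The relationship between these counts can be sketched in Python. The sketch rests on several assumptions that are not confirmed by HL7Spy: the names SummaryRow, normalize, summarize, and percent are hypothetical, "semantically the same" is reduced to ignoring trailing empty fields, Exact and Same are counted separately, and percentages are taken relative to the Baseline count.

```python
from dataclasses import dataclass


@dataclass
class SummaryRow:
    baseline: int = 0
    candidate: int = 0
    exact: int = 0
    same: int = 0
    diff: int = 0
    not_in_candidate: int = 0
    not_in_baseline: int = 0


def normalize(message):
    """Drop trailing empty fields so 'PID|1|' and 'PID|1' compare as the same."""
    return "\r".join(s.rstrip("|") for s in message.split("\r") if s)


def summarize(baseline, candidate):
    """baseline and candidate map a key (for example MSH-10) to raw message text."""
    row = SummaryRow(baseline=len(baseline), candidate=len(candidate))
    for key, b_msg in baseline.items():
        c_msg = candidate.get(key)
        if c_msg is None:
            row.not_in_candidate += 1
        elif b_msg == c_msg:
            row.exact += 1  # identical text
        elif normalize(b_msg) == normalize(c_msg):
            row.same += 1   # equal once trailing empty fields are ignored
        else:
            row.diff += 1
    row.not_in_baseline = sum(1 for key in candidate if key not in baseline)
    return row


def percent(count, total):
    """Percentage of total, or 0.0 when total is zero."""
    return 100.0 * count / total if total else 0.0
```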

Navigating Message Differences

Clicking the "+" on the left of a row in the Bulk Comparison Summary window displays the list of messages that are Different, Exact, or the Same. Each entry includes the Key used to identify the message as well as the index of the message in both the Baseline and Candidate message tabs.



Clicking on a row displays the Baseline message and the Candidate message in the "Compare" tool, with the Baseline message in the left window and the Candidate message in the right window.


Summary of Field Differences

The Field Differences view provides an HL7 field-centred view of the changes across the selected message type. In the image below all 10 fields that were different are displayed with the Baseline and Candidate field values shown in the last 2 columns. A summary of changes based on each unique HL7 Path can be found in the "Field Difference Summary".
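The idea of summarising differences per unique HL7 Path can be sketched in a few lines of Python. The function name field_difference_summary and the (path, baseline value, candidate value) tuple layout are assumptions made for illustration only.

```python
from collections import Counter


def field_difference_summary(diffs):
    """Count differences per HL7 path.

    diffs is an iterable of (path, baseline_value, candidate_value) tuples.
    """
    return Counter(path for path, _baseline, _candidate in diffs)
```

Calling most_common() on the returned Counter lists the paths that changed most often first, which is one way to spot a systematic mapping difference between the two engines.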

Ignore Fields

If specific fields are not relevant to the comparison results, they can be ignored by adding them to the "Paths to Ignore" setting.

Examples:

  • ORC-3.2 - ignore the 2nd component of ORC-3
  • PID-5[2] - ignore the 2nd repeat of PID-5
  • OBX[*]-11 - ignore the 11th field of all OBX segments
  • IN1 or IN1[1] - ignore the 1st IN1 segment
  • IN1[*] - ignore all IN1 segments
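
To make the path examples above concrete, here is a minimal Python sketch of how such paths could be parsed and matched. It covers only the forms listed above and is not HL7Spy's parser: the names parse_path and is_ignored are hypothetical, and treating an omitted field repeat as "any repeat" is an assumption. An omitted segment index is taken to mean the 1st segment, as the IN1 example states.

```python
import re

_PATH = re.compile(
    r"^(?P<seg>[A-Z][A-Z0-9]{2})"      # segment name, e.g. PID
    r"(?:\[(?P<seg_idx>\*|\d+)\])?"    # optional segment index or *
    r"(?:-(?P<field>\d+)"              # optional field number
    r"(?:\[(?P<rep>\*|\d+)\])?"        # optional field repeat or *
    r"(?:\.(?P<comp>\d+))?)?$"         # optional component number
)


def parse_path(path):
    """Parse an ignore path into a dict; None in a slot means 'match anything'."""
    match = _PATH.match(path.strip())
    if match is None:
        raise ValueError(f"unsupported path: {path!r}")
    groups = match.groupdict()

    def number(value, default):
        if value is None:
            return default
        return None if value == "*" else int(value)

    return {
        "seg": groups["seg"],
        "seg_idx": number(groups["seg_idx"], 1),  # omitted index means 1st segment
        "field": number(groups["field"], None),
        "rep": number(groups["rep"], None),       # assumption: omitted means any repeat
        "comp": number(groups["comp"], None),
    }


def is_ignored(location, ignore_paths):
    """location is (segment, segment_index, field, repeat, component), all 1-based."""
    seg, seg_idx, field, rep, comp = location
    actual = {"seg_idx": seg_idx, "field": field, "rep": rep, "comp": comp}
    for path in ignore_paths:
        parsed = parse_path(path)
        if parsed["seg"] != seg:
            continue
        if all(parsed[k] is None or parsed[k] == v for k, v in actual.items()):
            return True
    return False
```

A comparison routine could call is_ignored for each differing location and drop the difference when it returns True.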