Data Diffing
Data diffing with radiff2 allows you to compare binary data between files of different sizes. This is useful for identifying differences at the byte level, regardless of file length.
For example, comparing two files with radiff2 -x shows the differences in a two-column hexdump+ascii format:
$ cat 1
hello
$ cat 2
hallo
$ radiff2 -x 1 2
offset 0 1 2 3 4 5 6 7 01234567 0 1 2 3 4 5 6 7 01234567
0x00000000! 68656c6c6f0a hello. 68616c6c6f0a hallo.
Also in hexII format:
$ radiff2 -X 1 2
0x00000000! .h.e.l.l.o0a .h.a.l.l.o0a
Or even in the unified diff format, using the -U flag:
$ radiff2 -U 1 2
--- /tmp/r_diff.61dd4e41da041 2024-07-22 14:07:37.682683431 +0200
+++ /tmp/r_diff.61dd4e41da06b 2024-07-22 14:07:37.682683431 +0200
@@ -1 +1 @@
-hello
+hallo
Let's break down the two-column output. In your terminal, green and red highlighting marks the bytes added or removed in the byte-to-byte comparison. Each row contains:
- the offset, followed by a ! sign that indicates whether the block is equal or not (in the example above it marks a block that differs)
- hexdump portion of file 1
- hexdump portion of file 2
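The row layout described above can be sketched in a few lines of Python. This is only an illustration, not radiff2's implementation: the function name `hexdump_diff_rows`, the 8-byte block width, the column padding, and the blank marker for equal blocks are all assumptions made for the example.

```python
def hexdump_diff_rows(a: bytes, b: bytes, width: int = 8):
    """Yield one text row per block of `width` bytes, both files side by side."""
    def ascii_col(chunk: bytes) -> str:
        # Printable ASCII is shown as-is, anything else as a dot.
        return "".join(chr(c) if 0x20 <= c < 0x7F else "." for c in chunk)

    hex_width = width * 2
    for off in range(0, max(len(a), len(b)), width):
        ca, cb = a[off:off + width], b[off:off + width]
        # Assumption: '!' flags a differing block, a blank an equal one.
        mark = "!" if ca != cb else " "
        yield (f"0x{off:08x}{mark} "
               f"{ca.hex():<{hex_width}} {ascii_col(ca):<{width}} "
               f"{cb.hex():<{hex_width}} {ascii_col(cb):<{width}}")


for row in hexdump_diff_rows(b"hello\n", b"hallo\n"):
    print(row)
```

For the two sample files this prints a single row that starts with `0x00000000!`, because the only block in each file differs at its second byte.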
When comparing files of different sizes, we need to use the -d flag, which performs a delta-diffing algorithm that tries to find the patterns of bytes that have been added or removed when a specific change is found.
$ cat 1
hello
$ cat 3
helloworld
$ radiff2 1 3
INFO: File size differs 6 vs 11
INFO: Buffer truncated to 6 byte(s) (5 not compared)
0x00000005 0a => 77 0x00000005
$ radiff2 -d 1 3
INFO: File size differs 6 vs 11
0x00000000 68656c6c6f0a => 68656c6c6f776f726c640a 0x00000000
$
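To see why the plain comparison reports only `0a => 77`, here is a minimal Python sketch of a positional byte-to-byte diff that, like the default mode in the transcript above, compares only as many bytes as the shorter input has. The function name `bytewise_diff` and the grouping of consecutive differing bytes into one change are assumptions of the sketch, not a description of radiff2's internals.

```python
def bytewise_diff(a: bytes, b: bytes):
    """Return (offset, old_bytes, new_bytes) for each run of differing bytes.

    Only the first min(len(a), len(b)) bytes are compared.
    """
    n = min(len(a), len(b))
    changes = []
    start = None
    for i in range(n + 1):
        # The extra iteration at i == n closes a run that reaches the end.
        differs = i < n and a[i] != b[i]
        if differs and start is None:
            start = i
        elif not differs and start is not None:
            changes.append((start, a[start:i], b[start:i]))
            start = None
    return changes


for off, old, new in bytewise_diff(b"hello\n", b"helloworld\n"):
    print(f"0x{off:08x} {old.hex()} => {new.hex()}")
# prints: 0x00000005 0a => 77
```

The inserted bytes shift the trailing newline, so a positional comparison only sees the newline of the first file against the `w` of the second. Recognizing the change as an insertion is what the delta-diffing mode is for.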
Add the -j flag (radiff2 -j -d) to get the same diff information in JSON format:
$ radiff2 -j -d 1 3 |jq .
INFO: File size differs 6 vs 11
{
"files": [
{
"filename": "1",
"size": 6,
"sha256": "5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03"
},
{
"filename": "3",
"size": 11,
"sha256": "8cd07f3a5ff98f2a78cfc366c13fb123eb8d29c1ca37c79df190425d5b9e424d"
}
],
"changes": [
{
"addr": 0,
"from": "68656c6c6f0a",
"to": "68656c6c6f776f726c640a"
}
]
}
$
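The JSON form is convenient for scripting. As a hedged sketch, the following Python snippet decodes the "changes" array into raw bytes. It embeds a trimmed copy of the output shown above (hashes omitted) instead of invoking radiff2, and the helper name `decode_changes` is hypothetical. It assumes the report has exactly the fields seen in this example.

```python
import json

# Trimmed copy of the radiff2 -j -d output shown above (sha256 fields omitted).
SAMPLE = """
{
  "files": [
    {"filename": "1", "size": 6},
    {"filename": "3", "size": 11}
  ],
  "changes": [
    {"addr": 0, "from": "68656c6c6f0a", "to": "68656c6c6f776f726c640a"}
  ]
}
"""


def decode_changes(json_text: str):
    """Return (addr, old_bytes, new_bytes) tuples from the "changes" array."""
    report = json.loads(json_text)
    return [
        (c["addr"], bytes.fromhex(c["from"]), bytes.fromhex(c["to"]))
        for c in report["changes"]
    ]


for addr, old, new in decode_changes(SAMPLE):
    print(hex(addr), old, new)
# prints: 0x0 b'hello\n' b'helloworld\n'
```

When feeding real output to such a script, note that the INFO line is not part of the JSON document. In the transcript above jq parses the output despite it.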
Together, these examples show how radiff2 reports byte-level differences, both for files of equal length and, with -d, for files of different lengths.