In this example I’m using hashdeep to generate two hash sets, redirecting the output of each to its own file. I’m doing that with the following commands:
hashdeep -rj0 /path-to-drive-1 > hashes.drive1
and
hashdeep -rj0 /path-to-drive-2 > hashes.drive2
I have those running in their own terminal windows. I then optionally open another two windows running a tail on each file so I can monitor them:
tail -f hashes.drive1
The hard drives are located in an external multi-bay enclosure and all the drive LEDs are flashing away like mad, which is a good sign. But every now and then I’ll run an ‘ls’ to check the size of the output files, or alternatively (usually more useful, but more resource intensive) a line count of the hash files. Since I know how many files there should be, the line count gives a fair indication of the progress of the whole process.
wc -l hashes.drive*
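To put a number on that, the line count can be turned into a rough percentage. A minimal sketch, assuming you know the total file count in advance; the log file here is fabricated with seq purely for illustration, and note that hashdeep also writes a few header lines, so the count is only approximate:

```shell
# Sketch: turn the hash-log line count into a rough progress figure.
# TOTAL (the known file count) and the log name are assumptions;
# the log is fabricated with seq purely for illustration.
TOTAL=200
LOG=hashes.drive1
seq 50 > "$LOG"              # stand-in for a partially written hash log
DONE=$(wc -l < "$LOG")
echo "$DONE of $TOTAL files hashed ($((100 * DONE / TOTAL))%)"
rm -f "$LOG"                 # remove the illustration file
```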
In today’s example I was simply comparing the sizes of the two hash files against a known hash set of one of the drives that was a month old. The sizes should be relatively similar. I was getting results like this:
madivad@server:~$ ls -al hashes*
-rw-rw-r-- 1 madivad madivad 330483319 Feb 11 09:26 hash.drive1.1602
-rw-rw-r-- 1 madivad madivad 341570757 Mar 23 12:09 hash.drive1.1603
-rw-rw-r-- 1 madivad madivad 243344728 Mar 23 11:18 hash.drive2.1603
The fact that drive1.1603 is larger is of no consequence; there are simply more files to consider.
After running the above check for some time, I realised that one of the files (in this case drive1.1603) had stalled for several hours. I’m not sure exactly when it stopped growing, but a tail of the file confirmed it had stopped. The last entry was an inconsequential .DS_Store file roughly 6K in size. After physically watching it for some time I began to get concerned. I could see all 4 RAID drives getting activity, but nothing was being recorded. The 5th drive, the backup, was hashing away without a problem and its log file was growing as expected.
After some quick research I came across this Stack Exchange Q&A: “How do I know which file a program is trying to access?” The first answer provided the solution that best fit my scenario:
lsof -c hashdeep
I’d never seen this output before but very quickly I could see the important pieces of information it had dumped out. Namely:
madivad@server:~$ lsof -c hashdeep
COMMAND   PID  USER    FD  TYPE DEVICE      SIZE/OFF      NODE NAME
hashdeep 2539 madivad  1w  REG  252,0      243344728   5535319 /home/madivad/hash.drive1.1603
hashdeep 2539 madivad  3r  REG  259,0   499418030080 113639426 /path1/largeFiles/a-very-big-image-of-500GB.img
hashdeep 2552 madivad  1w  REG  252,0      341611062   5535320 /home/madivad/hash.drive2.1603
hashdeep 2552 madivad  3r  REG  8,33      3152347139 126025746 /path2/misc/random.file
The ‘w’ in the FD column entry ‘1w’ signifies the file is open for writing, and the file being written was hash.drive1.1603. The ‘r’ in ‘3r’ signifies the file is open for reading, i.e. being hashed, and that file is one I know to be around 500GB. Running the command again showed that the file being read by the second process had changed, yet the first had stayed the same.
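If you only care about which file is being hashed right now, the NAME column of the ‘3r’ rows can be pulled out with awk. A hedged sketch that parses a captured sample line rather than live lsof output, so the whitespace-separated field layout (FD in field 4, NAME last) and the file name are assumptions; note paths containing spaces would break this simple split:

```shell
# Sketch: extract just the file currently being read (FD "3r") from
# lsof-style output. A captured sample line is parsed here instead of
# calling lsof live, so the field positions are the assumption.
SAMPLE='hashdeep 2539 madivad 3r REG 259,0 499418030080 113639426 /path1/largeFiles/big.img'
READING=$(printf '%s\n' "$SAMPLE" | awk '$4 == "3r" { print $NF }')
echo "currently hashing: $READING"
```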
Given that the file is very large and would take considerable time to hash, and that the hard drive LEDs were still flashing, I realised all was good in the world and I could move on with the day’s activities.
UPDATE: after reading the man page on lsof, I found a better way to monitor ongoing progress is to run it with the -r “repeat” switch, which defaults to repeating every 15 seconds; the interval can be made more or less frequent by adding a numerical argument:
lsof -r 5 -c hashdeep