CCA-Public / diskimageprocessor

Tool for automated processing of disk images in BitCurator

Disks with multiple volumes

tw4l opened this issue · comments

Disks with multiple volumes/partitions (see test images) may only have files exported from one volume, depending on what tsk_recover is able to do.

I think it should be possible to make this work for multiple partitions. Here's what I did manually for a large image of a hard disk:

disktype testsata.dd

Result:

--- testsata.dd
Regular file, size 298.1 GiB (320072933376 bytes)
DOS/MBR partition map
Partition 1: 350 MiB (367001600 bytes, 716800 sectors from 2048, bootable)
Type 0x07 (HPFS/NTFS)
NTFS file system
    Volume size 350.0 MiB (367001088 bytes, 716799 sectors)
Partition 2: 297.7 GiB (319703482368 bytes, 624420864 sectors from 718848)
Type 0x07 (HPFS/NTFS)
NTFS file system
    Volume size 297.7 GiB (319703481856 bytes, 624420863 sectors)

So in this case my disk image contains 2 partitions:

  • A 350 MiB boot partition
  • A 297.7 GiB partition

The second partition starts at sector 718848 (byte offset 512*718848 = 368050176, assuming 512-byte sectors). tsk_recover's -o option takes the offset in sectors, not bytes, so to export the files from the second partition:

tsk_recover -a -o 718848 ./image/testsata.dd ./fsOut

It should be pretty straightforward to automate this by some additional parsing of the disktype output, and then iterating over all partitions. I might give this a try myself ...
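As a rough idea of what that automation could look like, here is a minimal Python sketch. It assumes the disktype output always reports partitions in the "Partition N: ... (X bytes, Y sectors from Z)" form shown above; other partition schemes may print differently and are not handled. The function names and the per-partition output folder layout (fsOut/partition1, fsOut/partition2, ...) are my own hypothetical choices, not anything that exists in diskimageprocessor. It only builds the tsk_recover command lines and does not run them.

```python
import re

# Matches disktype partition lines such as:
#   Partition 2: 297.7 GiB (319703482368 bytes, 624420864 sectors from 718848)
# Assumption: the layout is the one shown in the sample output in this thread.
PARTITION_RE = re.compile(
    r"^\s*Partition (\d+):.*\((\d+) bytes, (\d+) sectors from (\d+)"
)

# Sample disktype output, copied from the report above.
SAMPLE = """\
--- testsata.dd
Regular file, size 298.1 GiB (320072933376 bytes)
DOS/MBR partition map
Partition 1: 350 MiB (367001600 bytes, 716800 sectors from 2048, bootable)
Type 0x07 (HPFS/NTFS)
NTFS file system
    Volume size 350.0 MiB (367001088 bytes, 716799 sectors)
Partition 2: 297.7 GiB (319703482368 bytes, 624420864 sectors from 718848)
Type 0x07 (HPFS/NTFS)
NTFS file system
    Volume size 297.7 GiB (319703481856 bytes, 624420863 sectors)
"""


def parse_partitions(disktype_output):
    """Return one dict per 'Partition N:' line found in disktype's text output."""
    partitions = []
    for line in disktype_output.splitlines():
        match = PARTITION_RE.match(line)
        if match:
            number, size_bytes, sectors, start = map(int, match.groups())
            partitions.append(
                {
                    "number": number,
                    "bytes": size_bytes,
                    "sectors": sectors,
                    "start_sector": start,
                }
            )
    return partitions


def build_recover_commands(image_path, out_dir, partitions):
    """Build one tsk_recover argument list per partition (hypothetical layout)."""
    commands = []
    for part in partitions:
        # Each partition gets its own subfolder so exports do not overwrite each other.
        target = f"{out_dir}/partition{part['number']}"
        commands.append(
            ["tsk_recover", "-a", "-o", str(part["start_sector"]), image_path, target]
        )
    return commands


if __name__ == "__main__":
    parts = parse_partitions(SAMPLE)
    for cmd in build_recover_commands("./image/testsata.dd", "./fsOut", parts):
        print(" ".join(cmd))
```

Each argument list could then be handed to subprocess.run(). In practice the disktype text would come from running disktype on the image rather than from a hard-coded string.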

Hi Johan. This would be a great addition! I wonder if Alex Nelson's disktype_to_dfxml might come in handy here. It's been on my list to add better handling of multi-partition disks in Brunnhilde as well, so I'd be interested in your approach.

From a logistical perspective: I no longer have access to the CCA-Public account, so I would recommend forking the codebase and doing your development work there. If you'd want to merge changes back into the master, I could put you in touch with folks at the CCA who might be able to help with that (but I can't speak for them).

There's also disktype_json, which spits out JSON instead of plain text files. That might make parsing/handling a bit easier.