log2timeline / plaso

Super timeline all the things

Home Page: https://plaso.readthedocs.io

Merge: AttributeError: 'NoneType' object has no attribute 'path_spec'

trashg0blin opened this issue

I am attempting to parse the contents of a browser profile. I have changed the run parameters multiple times, and each time I receive the error below on seemingly random files, with no observable pattern. Memory for the WSL 2 instance has been increased to 8 GB, and no resource limitations have been observed.

Running from Docker log2timeline/plaso image d00889bbcf2a

docker version

Client: Docker Engine - Community
Cloud integration: v1.0.35+desktop.5
Version: 24.0.7
API version: 1.43
Go version: go1.20.10
Git commit: afdd53b
Built: Thu Oct 26 09:08:02 2023
OS/Arch: linux/amd64
Context: default
Server: Docker Desktop
Engine:
Version: 24.0.7
API version: 1.43 (minimum version 1.12)
Go version: go1.20.10
Git commit: 311b9ff
Built: Thu Oct 26 09:08:02 2023
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.25
GitCommit: d8f198a4ed8892c764191ef7b3b06d8a2eeb5c7f
runc:
Version: 1.1.10
GitCommit: v1.1.10-0-g18a0cb0
docker-init:
Version: 0.19.0
GitCommit: de40ad0

wsl --version

WSL version: 2.0.9.0
Kernel version: 5.15.133.1-1
WSLg version: 1.0.59
MSRDC version: 1.2.4677
Direct3D version: 1.611.1-81528511
DXCore version: 10.0.25131.1002-220531-1700.rs-onecore-base2-hyp
Windows version: 10.0.22631.2715

Error received

2023-12-07 15:01:20 Traceback (most recent call last):
2023-12-07 15:01:20 File "/usr/lib/python3/dist-packages/plaso/multi_process/extraction_engine.py", line 731, in _ProcessEventSources
2023-12-07 15:01:20 self._MergeTaskStorage(storage_writer, session_identifier)
2023-12-07 15:01:20 File "/usr/lib/python3/dist-packages/plaso/multi_process/extraction_engine.py", line 632, in _MergeTaskStorage
2023-12-07 15:01:20 self._task_merge_helper.Close()
2023-12-07 15:01:20 AttributeError: 'NoneType' object has no attribute 'Close'
2023-12-07 15:01:20
2023-12-07 15:01:20 During handling of the above exception, another exception occurred:
2023-12-07 15:01:20
2023-12-07 15:01:20 Traceback (most recent call last):
2023-12-07 15:01:20 File "/usr/bin/log2timeline.py", line 103, in <module>
2023-12-07 15:01:20 if not Main():
2023-12-07 15:01:20 File "/usr/bin/log2timeline.py", line 77, in Main
2023-12-07 15:01:20 tool.ExtractEventsFromSources()
2023-12-07 15:01:20 File "/usr/lib/python3/dist-packages/plaso/cli/extraction_tool.py", line 763, in ExtractEventsFromSources
2023-12-07 15:01:20 processing_status = self._ProcessSource(session, storage_writer)
2023-12-07 15:01:20 File "/usr/lib/python3/dist-packages/plaso/cli/extraction_tool.py", line 553, in _ProcessSource
2023-12-07 15:01:20 processing_status = extraction_engine.ProcessSourceMulti(
2023-12-07 15:01:20 File "/usr/lib/python3/dist-packages/plaso/multi_process/extraction_engine.py", line 1160, in ProcessSourceMulti
2023-12-07 15:01:20 self._ProcessSource(
2023-12-07 15:01:20 File "/usr/lib/python3/dist-packages/plaso/multi_process/extraction_engine.py", line 799, in _ProcessSource
2023-12-07 15:01:20 self._ProcessEventSources(storage_writer, session_identifier)
2023-12-07 15:01:20 File "/usr/lib/python3/dist-packages/plaso/multi_process/extraction_engine.py", line 757, in _ProcessEventSources
2023-12-07 15:01:20 '{0!s}').format(exception), event_source.path_spec)
2023-12-07 15:01:20 AttributeError: 'NoneType' object has no attribute 'path_spec'
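
The two tracebacks above are chained: the first `AttributeError` (on `Close`) was raised during the merge, and the second (on `path_spec`) was raised while the first was being handled, so the second is what the tool finally reports. The following is a minimal sketch of that pattern only. `EngineSketch`, `process_event_sources` and `run_sketch` are hypothetical names, and the assumption that both objects are `None` is taken from the error messages, not from Plaso's source code.

```python
class EngineSketch:
    """Hypothetical stand-in; this is not Plaso's extraction engine."""

    def __init__(self):
        # Assumption: the merge helper was never set for this task.
        self._task_merge_helper = None

    def _merge_task_storage(self):
        # Raises the first AttributeError because the helper is None.
        self._task_merge_helper.Close()

    def process_event_sources(self):
        event_source = None  # Assumption: no event source is available here.
        try:
            self._merge_task_storage()
        except Exception as exception:
            # Building the error report dereferences None again, so a second
            # AttributeError is raised while the first is being handled.
            message = 'Unable to merge: {0!s}'.format(exception)
            return message, event_source.path_spec


def run_sketch():
    """Return the reported error text and the text of the error it masked."""
    try:
        EngineSketch().process_event_sources()
    except AttributeError as error:
        # __context__ holds the exception that was being handled.
        return str(error), str(error.__context__)
```

When reading a chained traceback like this, the earlier exception is usually the one closer to the root cause; here that is the failed merge, which matches the "Merge task storage path is not a file" error in the main log.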

Main log

2023-12-07 20:19:17,515 [DEBUG] (MainProcess) PID:7 <extraction_engine> Scheduled task: d3b6276c4a7a475ab152781640a3a8c4 for path specification: type: OS, location: /data/Edge/Default/Cache/Cache_Data/f_0002e0
2023-12-07 20:19:17,515 [DEBUG] (MainProcess) PID:7 <zeromq_queue> main_task_queue sending item
2023-12-07 20:19:17,516 [DEBUG] (MainProcess) PID:7 <zeromq_queue> main_task_queue sent item
2023-12-07 20:19:17,527 [DEBUG] (MainProcess) PID:7 <task_manager> Task fd4b520d2dfd497ba635fd92323b6046 was queued, now merging.
2023-12-07 20:19:17,625 [DEBUG] (MainProcess) PID:7 <task_manager> Task d3b6276c4a7a475ab152781640a3a8c4 was queued, now processing.
2023-12-07 20:19:17,702 [DEBUG] (MainProcess) PID:7 <task_manager> Completed task e6334be4da794ad7b43cb0fc54e63783.
2023-12-07 20:19:17,703 [DEBUG] (MainProcess) PID:7 <task_manager> Checking for pending tasks
2023-12-07 20:19:17,710 [DEBUG] (MainProcess) PID:7 <task_manager> Created task: 570659c2952f4d32afd27371512d1a99.
2023-12-07 20:19:17,711 [DEBUG] (MainProcess) PID:7 <extraction_engine> Scheduled task: 570659c2952f4d32afd27371512d1a99 for path specification: type: OS, location: /data/Edge/Default/Cache/Cache_Data/f_0002e1
2023-12-07 20:19:17,711 [DEBUG] (MainProcess) PID:7 <zeromq_queue> main_task_queue sending item
2023-12-07 20:19:17,713 [DEBUG] (MainProcess) PID:7 <zeromq_queue> main_task_queue sent item
2023-12-07 20:19:17,720 [DEBUG] (MainProcess) PID:7 <task_manager> Task 59c3fbabb51d4f878c93b6074cd8c0c7 was queued, now merging.
2023-12-07 20:19:17,721 [ERROR] (MainProcess) PID:7 <extraction_engine> Unable to merge results of task: 59c3fbabb51d4f878c93b6074cd8c0c7 with error: Merge task storage path is not a file.

Worker Log

2023-12-07 20:19:17,358 [DEBUG] (Worker_02 ) PID:15 <extraction_process> Started processing task: 59c3fbabb51d4f878c93b6074cd8c0c7.
2023-12-07 20:19:17,509 [DEBUG] (Worker_02 ) PID:15 [ProcessFileEntry] processing file entry: OS:/data/Edge/Default/Cache/Cache_Data/f_0002de
2023-12-07 20:19:17,510 [DEBUG] (Worker_02 ) PID:15 [ProcessFileEntryDataStream] processing data stream: "" of file entry: OS:/data/Edge/Default/Cache/Cache_Data/f_0002de
2023-12-07 20:19:17,510 [DEBUG] (Worker_02 ) PID:15 [AnalyzeDataStream] analyzing file: OS:/data/Edge/Default/Cache/Cache_Data/f_0002de
2023-12-07 20:19:17,527 [DEBUG] (Worker_02 ) PID:15 <hashing_analyzer> Processing results for hasher sha256
2023-12-07 20:19:17,527 [DEBUG] (Worker_02 ) PID:15 [AnalyzeFileObject] attribute sha256_hash:c7ce4347e9445eeb67fbbeb5f14824da71c167d5ece780c49245d15e1fb729bd calculated for file: OS:/data/Edge/Default/Cache/Cache_Data/f_0002de.
2023-12-07 20:19:17,528 [DEBUG] (Worker_02 ) PID:15 [AnalyzeDataStream] completed analyzing file: OS:/data/Edge/Default/Cache/Cache_Data/f_0002de
2023-12-07 20:19:17,542 [DEBUG] (Worker_02 ) PID:15 [ExtractMetadataFromFileEntry] processing file entry: OS:/data/Edge/Default/Cache/Cache_Data/f_0002de
2023-12-07 20:19:17,565 [DEBUG] (Worker_02 ) PID:15 Skipping content extraction of: OS:/data/Edge/Default/Cache/Cache_Data/f_0002de
2023-12-07 20:19:17,566 [DEBUG] (Worker_02 ) PID:15 [ProcessFileEntry] done processing file entry: OS:/data/Edge/Default/Cache/Cache_Data/f_0002de
2023-12-07 20:19:17,586 [DEBUG] (Worker_02 ) PID:15 <extraction_process> Completed processing task: 59c3fbabb51d4f878c93b6074cd8c0c7.
2023-12-07 20:19:17,587 [DEBUG] (Worker_02 ) PID:15 <zeromq_queue> Pop on Worker_02 task queue queue, port 41709
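
The main log names the task that failed to merge, and the worker log names the file that task processed. The sketch below shows one way to link the two automatically. It assumes the log line wording is exactly as in the excerpts above; `failed_merge_tasks` and `files_for_task` are hypothetical helper names, not part of Plaso.

```python
import re

# Wording copied from the main log excerpt above.
MERGE_ERROR = re.compile(
    r'Unable to merge results of task: (?P<task>[0-9a-f]{32})')
# Wording copied from the worker log excerpt above.
FILE_ENTRY = re.compile(r'processing file entry: (?P<path>.+)$')


def failed_merge_tasks(main_log_lines):
    """Return identifiers of tasks whose results could not be merged."""
    tasks = []
    for line in main_log_lines:
        match = MERGE_ERROR.search(line)
        if match:
            tasks.append(match.group('task'))
    return tasks


def files_for_task(worker_log_lines, task_identifier):
    """Return file entries logged between a task's start and completion."""
    paths = []
    active = False
    for line in worker_log_lines:
        if 'Started processing task: ' + task_identifier in line:
            active = True
        elif 'Completed processing task: ' + task_identifier in line:
            active = False
        elif active:
            match = FILE_ENTRY.search(line.rstrip())
            if match and match.group('path') not in paths:
                paths.append(match.group('path'))
    return paths
```

Applied to the excerpts above, this links task 59c3fbabb51d4f878c93b6074cd8c0c7 to the cache file f_0002de.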

file output
..Edge/Default/Cache/Cache_Data/f_0002de: JPEG image data, JFIF standard 1.01, resolution (DPI), density 96x96, segment length 16, baseline, precision 8, 612x304, components 3

> Running from Docker log2timeline/plaso image d00889bbcf2a

Which version is this? I don't see d00889bbcf2a in any of the recent versions: https://hub.docker.com/r/log2timeline/plaso/tags

I grabbed the image ID instead of the digest. The digest is 06a97ccdc9fc3a6dc84aa8a82e60d10d42c155bc20c629baedfa7f684353e21e

I did some additional testing. My source when the error occurs is a zip file, so I suspect the issue lies in dfvfs enumeration of the path. I ran log2timeline against a source directory and the run completed successfully. I did another run with the directory in a zip and got the error outlined above. A subsequent test run after unzipping the zip produced the same error. I'm assuming something in the zip compression is mangling the files.

Further testing shows that the issue is with drvfs. I moved my archive over to the native filesystem in WSL and everything runs as expected.
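
The workaround described here, copying the source onto the native Linux filesystem before processing, could be scripted roughly as follows. This is a sketch under stated assumptions: `stage_source` is a hypothetical helper, and the default temporary directory is assumed to live on the native WSL filesystem, not on a drvfs mount.

```python
import shutil
import tempfile
from pathlib import Path


def stage_source(source, staging_root=None):
    """Copy a file or directory into a fresh directory and return the copy.

    staging_root is an optional parent directory for the copy. When it is
    None the system temporary directory is used, which is assumed here to
    be on the native Linux filesystem.
    """
    source = Path(source)
    staging_dir = Path(
        tempfile.mkdtemp(prefix='plaso_source_', dir=staging_root))
    target = staging_dir / source.name
    if source.is_dir():
        # Recursively copy the directory tree.
        shutil.copytree(source, target)
    else:
        # Copy a single file together with its timestamps.
        shutil.copy2(source, target)
    return target
```

The returned path, not the original drvfs path, would then be the one bind-mounted into the container. The copy gets new file system timestamps on the directory entries, which may matter if those timestamps are themselves evidence.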

@trashg0blin I have a hard time following your write-up. If I'm not able to reproduce the issue, I cannot address it. Can you please provide an easy-to-follow description of the issue?

Apologies for the scattered thoughts.

Within WSL, invoke log2timeline via the Docker container where the source target is any folder or file from a drvfs-mounted folder or drive.

Expected result:
Data is parsed

Actual result:
Consistent path_spec resolution errors at storage file merge time.
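
To tell whether a given source path falls under these reproduction conditions, one option is to look the path up in the mount table. The sketch below parses text in the `/proc/mounts` format. `find_mount` and `is_drvfs` are hypothetical helpers, and the detection rule is an assumption: WSL 1 is assumed to report the filesystem type as `drvfs`, while WSL 2 is assumed to report `9p` with `aname=drvfs` among the mount options.

```python
import posixpath


def find_mount(path, mounts_text):
    """Return (mount_point, fs_type, options) for the mount holding path."""
    path = posixpath.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        prefix = mount_point.rstrip('/') + '/'
        if path == mount_point or path.startswith(prefix):
            # Keep the longest matching mount point.
            if best is None or len(mount_point) > len(best[0]):
                best = (mount_point, fs_type, options)
    return best


def is_drvfs(path, mounts_text):
    """Guess whether path is on a Windows drive mounted into WSL."""
    mount = find_mount(path, mounts_text)
    if mount is None:
        return False
    _, fs_type, options = mount
    # Assumption: "drvfs" on WSL 1, "9p" with "aname=drvfs" on WSL 2.
    return fs_type == 'drvfs' or (
        fs_type == '9p' and 'aname=drvfs' in options)
```

Inside WSL, the mount table text would come from reading `/proc/mounts`. The check applies to the host-side path before it is bind-mounted into the container, since inside the container the source only appears as `/data`.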

I get that high-level description, but what is this zip file? The Plaso issue tracker has a template; could you please use it? The reason for the template is to make sure we have the necessary information to assess the issue to begin with.

**Describe the problem:**

Please provide a clear and detailed description of what the problem is.

**To Reproduce:**

The version of Plaso you used:

For example: 20171231

The operating system you are running Plaso on (Not the operating system of the image/files you're trying to analyze):

For example: Ubuntu 22.04

Steps to reproduce the behavior including command line and arguments and output:

First I ran `log2timeline.py --help` that provided me the following output `...`

Please provide the source data you used when you experienced the problem. For publicly available data please provide a URL or path of the source data.

For example: individual ChromeOS syslog file

The method you used to install Plaso:

For example:
* installed from [GiFT PPA](https://launchpad.net/~gift) stable track
* installed from [GiFT COPR](https://copr.fedorainfracloud.org/coprs/g/gift/) stable track
* installed from [l2tbinaries](https://github.com/log2timeline/l2tbinaries) main branch
* built using [l2tdevtools](https://github.com/log2timeline/l2tdevtools)
* other, namely ...

If multiple installation methods were used please indicate.

**Expected behavior:**

A clear and concise description of what you expected to happen.

**Debug output/tracebacks:**

You can run log2timeline tools with "-d" to generate debug output, and include anything relevant. Also see: [Producing debug logs](https://plaso.readthedocs.io/en/latest/sources/Troubleshooting.html#producing-debug-logs)

Please DO NOT provide screenshots, they can be hard to read.

For more information see the [troubleshooting guide](https://plaso.readthedocs.io/en/latest/sources/Troubleshooting.html)

**Additional context**

Any other context about the problem here.