sumeshi/ntfsfind

ntfsfind

A command-line tool for efficiently searching files, directories, and alternate data streams directly from NTFS image files.

Overview

ntfsfind lets digital forensic investigators and incident responders search NTFS file system records in disk images with regular expressions, without mounting the images. It relies on backend libraries to read common forensic image formats such as RAW, E01, VHD/VHDX, and VMDK, and to parse NTFS structures.

Features

  • Direct Search: Search files directly from NTFS partitions without mounting the image.
  • Multiple Image Formats: Read RAW, E01, VHD, VHDX, and VMDK images.
  • Regex Queries: Search file paths with regular expressions. Partial matching is used by default, similar to grep.
  • Alternate Data Streams (ADS): Find hidden alternate data streams.
  • CLI and Python Module: Use it from the command line or integrate it into your own automation tools.
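
To illustrate what grep-like partial matching means in practice, here is a minimal sketch using Python's standard `re` module. It is not ntfsfind's actual implementation. The sample paths come from the example output in this README, and `partial_matches` and `full_matches` are hypothetical helper names.

```python
import re

# Sample paths taken from the example output in this README.
paths = [
    "/Windows/System32/winevt/Logs/Setup.evtx",
    "/$MFT",
    "/$MFTMirr",
]


def partial_matches(pattern: str, candidates: list[str]) -> list[str]:
    """Keep candidates where the pattern matches anywhere in the path."""
    regex = re.compile(pattern)
    return [p for p in candidates if regex.search(p)]


def full_matches(pattern: str, candidates: list[str]) -> list[str]:
    """Keep candidates where the pattern matches the whole path."""
    regex = re.compile(pattern)
    return [p for p in candidates if regex.fullmatch(p)]


# Partial matching finds both paths that contain "$MFT".
print(partial_matches(r"\$MFT", paths))

# Anchoring with ^ and $ narrows a partial match to a single exact path.
print(partial_matches(r"^/\$MFT$", paths))
```

Under partial matching, a query such as `\$MFT` also returns `/$MFTMirr`. Assuming the regex engine honors `^` and `$` the way Python's does, anchoring the query is one way to narrow the result to a single path.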

Execution Environment

  • Python: Compatible with Python 3.13+.
  • Precompiled Binaries: Available for both Windows and Linux in the GitHub releases section.

Installation

# From PyPI
pip install ntfsfind

# From GitHub Releases (Precompiled Binaries)
chmod +x ./ntfsfind
./ntfsfind --help

# On Windows
ntfsfind.exe --help

Supported Input

  • Image formats: RAW, E01, VHD, VHDX, VMDK.
  • File system: NTFS.
  • Partition tables: GPT is supported. MBR may be auto-detected depending on the image.
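
When scripting over many images, it can help to derive the format name from the file extension. The sketch below is an assumption-laden convenience helper, not part of ntfsfind. `guess_format` is a hypothetical name, the extension table is a guess at common naming conventions, and the returned values are the format names listed for the `--format` option.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a documented format name.
# The extensions are assumptions; the values are the documented choices.
_SUFFIX_TO_FORMAT = {
    ".e01": "e01",
    ".vhd": "vhd",
    ".vhdx": "vhdx",
    ".vmdk": "vmdk",
}


def guess_format(image_path: str) -> str:
    """Return a format name for the image, falling back to 'raw'."""
    suffix = Path(image_path).suffix.lower()
    return _SUFFIX_TO_FORMAT.get(suffix, "raw")


print(guess_format("./path/to/your/image.raw"))
print(guess_format("evidence.E01"))
```

Falling back to `raw` mirrors the documented default of the `--format` option. Extensions are only a hint, so an explicit format is safer when the image's origin is known.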

Usage

Command Line Interface

You can pass arguments directly to the CLI. Search queries are matched against NTFS paths that have been normalized to use forward slashes (/).

ntfsfind [OPTIONS] <IMAGE> [SEARCH_QUERY]

Options:

  • --help, -h: Show help message.
  • --version, -V: Display program version.
  • --volume, -n: Target specific NTFS volume number (default: auto-detects main OS volume).
  • --format, -f: Image file format (default: raw). Options: raw, e01, vhd, vhdx, vmdk.
  • --ignore-case, -i: Enable case-insensitive search.
  • --fixed-strings, -F: Interpret search query as a literal fixed string instead of a regular expression.
  • --multiprocess, -m: Enable multiprocessing for the operation.
  • --out-mft: Export the parsed $MFT raw bytes to the specified file path.
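
The difference between a regular-expression query, `--fixed-strings`, and `--ignore-case` can be sketched with Python's `re` module. This is an illustration of the grep-style semantics the options describe, not ntfsfind's own code. `compile_query` is a hypothetical helper, and `/Logs/Setupxevtx` is a made-up path used only to show the effect of the unescaped dot.

```python
import re


def compile_query(query: str, ignore_case: bool = False,
                  fixed_strings: bool = False) -> re.Pattern:
    """Build a pattern that mimics the documented option semantics."""
    if fixed_strings:
        # Treat every character literally, so "." only matches a dot.
        query = re.escape(query)
    flags = re.IGNORECASE if ignore_case else 0
    return re.compile(query, flags)


real_path = "/Windows/System32/winevt/Logs/Setup.evtx"
odd_path = "/Logs/Setupxevtx"  # hypothetical path without a dot

# As a regex, "." matches any character, so both paths match ".evtx".
print(bool(compile_query(".evtx").search(odd_path)))

# As a fixed string, ".evtx" requires a literal dot.
print(bool(compile_query(".evtx", fixed_strings=True).search(odd_path)))

# Case-insensitive matching finds "Setup" when the query is "setup".
print(bool(compile_query("setup", ignore_case=True).search(real_path)))
```

In short, escape dots (`\.evtx`) or use `--fixed-strings` when a literal extension is intended.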

Examples

Find Eventlogs:

$ ntfsfind ./path/to/your/image.raw '.*\.evtx'
/Windows/System32/winevt/Logs/Setup.evtx
/Windows/System32/winevt/Logs/Microsoft-Windows-All-User-Install-Agent%4Admin.evtx
/Logs/Windows PowerShell.evtx
/Logs/Microsoft-Windows-Winlogon%4Operational.evtx
/Logs/Microsoft-Windows-WinINet-Config%4ProxyConfigChanged.evtx
...

Find the $MFT file and other paths that contain `$MFT` (partial matching also returns $MFTMirr):

$ ntfsfind ./path/to/your/image.raw '\$MFT'
/$MFT
/$MFTMirr

Find alternate data streams:

$ ntfsfind ./path/to/your/image.raw '.*:.*'
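
Since the query above looks for a colon in the path, a result can be split into the file path and the stream name. The sketch below assumes results take the form `path:stream`. The README does not show sample output for this query, so the record used here is hypothetical, as is the helper name `split_stream`.

```python
from typing import Optional


def split_stream(record: str) -> tuple[str, Optional[str]]:
    """Split 'path:stream' into (path, stream); stream is None if absent."""
    file_path, sep, stream = record.partition(":")
    if not sep:
        return record, None
    return file_path, stream


# Hypothetical record; the actual output format is an assumption.
sample = "/Users/user/Downloads/setup.exe:Zone.Identifier"
print(split_stream(sample))
print(split_stream("/$MFT"))
```

Splitting on the first colon is safe under this assumption because NTFS file names cannot contain a colon.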

Export $MFT and search it directly: a dumped $MFT file can also be used as input, which makes repeated searches faster.

# 1. Export MFT from the image (search query can be omitted)
$ ntfsfind --out-mft /tmp/my_mft.bin ./path/to/your/image.raw

# 2. Later you can query the dumped MFT file instead of the heavy image!
$ ntfsfind /tmp/my_mft.bin '.evtx'
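
The same two-step workflow can be driven from a script. The sketch below only builds and runs the CLI commands shown above. It assumes `ntfsfind` is on `PATH` and that results are printed one path per line, as in the example output. The function names are hypothetical.

```python
import subprocess


def export_mft_cmd(image: str, mft_path: str) -> list[str]:
    """Command line for step 1: export $MFT without a search query."""
    return ["ntfsfind", "--out-mft", mft_path, image]


def query_cmd(source: str, query: str) -> list[str]:
    """Command line for step 2: search an image or a dumped $MFT."""
    return ["ntfsfind", source, query]


def search(source: str, query: str) -> list[str]:
    """Run a query and return the printed paths (assumes one per line)."""
    result = subprocess.run(
        query_cmd(source, query),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


# Show the commands without executing them.
print(export_mft_cmd("./path/to/your/image.raw", "/tmp/my_mft.bin"))
print(query_cmd("/tmp/my_mft.bin", r"\.evtx"))
```

Passing arguments as a list avoids shell quoting problems with characters such as `$` and `\` in queries.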

Working with ntfsdump

When combined with ntfsdump, the paths printed by ntfsfind can be piped to ntfsdump via standard input to dump the matching files directly from the image. ntfsfind and ntfsdump are compatible when they share the same major and minor version (for example, both at version 3.0.x).

$ ntfsfind ./path/to/imagefile.raw '.*\.evtx' | ntfsdump -o ./dump ./path/to/imagefile.raw
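
The compatibility rule (same major and minor version) can be checked before running a pipeline. The sketch below is a minimal illustration. `same_major_minor` is a hypothetical helper, the version strings are made up, and it assumes versions follow a dotted `major.minor.patch` form.

```python
def same_major_minor(version_a: str, version_b: str) -> bool:
    """Return True when both versions share major and minor components."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]


# Hypothetical version strings for illustration.
print(same_major_minor("3.0.1", "3.0.7"))
print(same_major_minor("3.0.1", "3.1.0"))
```

When both tools are installed from PyPI, `importlib.metadata.version` from the standard library is one way to obtain the installed version strings. The distribution name `ntfsdump` is an assumption here.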

Python Module

You can incorporate ntfsfind logic into your own scripts.

from ntfsfind import ntfsfind

# image: str
# search_query: str
# volume: Optional[int] = None
# format: Literal['raw', 'e01', 'vhd', 'vhdx', 'vmdk'] = 'raw'
# multiprocess: bool = False
# ignore_case: bool = False
# fixed_strings: bool = False
# out_mft: Optional[str] = None
#
# -> List[str]

records = ntfsfind(
    image='./path/to/your/imagefile.raw',
    search_query=r".*\.evtx",
    volume=2,
    format='raw',
    multiprocess=False,
    ignore_case=True,
    fixed_strings=False,
    out_mft='/tmp/dumped_mft.bin'
)

for record in records:
    print(record)
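
Because the function returns a plain list of path strings, the results are easy to post-process. The sketch below groups paths by parent directory. It uses a hard-coded list copied from the example output in this README so that it runs without an image, and `group_by_directory` is a hypothetical helper.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Stand-in for the list returned by ntfsfind(); copied from the README output.
records = [
    "/Windows/System32/winevt/Logs/Setup.evtx",
    "/Logs/Windows PowerShell.evtx",
    "/Logs/Microsoft-Windows-Winlogon%4Operational.evtx",
]


def group_by_directory(paths: list[str]) -> dict[str, list[str]]:
    """Map each parent directory to the file names found in it."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for raw in paths:
        path = PurePosixPath(raw)  # results use forward slashes
        grouped[str(path.parent)].append(path.name)
    return dict(grouped)


for directory, names in group_by_directory(records).items():
    print(directory, names)
```

`PurePosixPath` is used rather than `Path` so the forward-slash paths are handled the same way on Windows and Linux.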

Contributing

Bug reports, issues, and feature requests are welcome. Please submit them on the GitHub repository. 🍣 🍣 🍣

License

Released under the MIT License.

Powered by:

Third-party licenses

The standalone binaries distributed via GitHub Releases bundle the following third-party libraries. The libyal libraries (libewf, libvhdi, libvmdk) and pytsk3 are pulled in transitively via ntfsdump, but they are physically bundled inside the ntfsfind binary, so their notices are reproduced here as well.

LGPL-3.0-or-later

The libyal libraries (libewf, libvhdi, libvmdk) are licensed under the GNU Lesser General Public License v3.0 or later (LGPL-3.0-or-later). You may obtain, modify, and rebuild them from their upstream sources in accordance with the LGPL.

Apache-2.0

MIT
