dvalter / p0f

Fork of fork of p0f, the passive fingerprinter

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool



CAUTION:  This is not the original p0f software. This repository contains a highly modified version of p0f.

Please check the original author's website if you're in search for the original p0f application.









                        =============================
                        p0f v3: passive fingerprinter
                        =============================

                    http://lcamtuf.coredump.cx/p0f3.shtml

         Copyright (C) 2012 by Michal Zalewski <lcamtuf@coredump.cx>


---------------
1. What's this?
---------------

P0f is a tool that utilizes an array of sophisticated, purely passive traffic
fingerprinting mechanisms to identify the players behind any incidental TCP/IP
communications (often as little as a single normal SYN) without interfering in
any way.

Some of its capabilities include:

  - Highly scalable and extremely fast identification of the operating system
    and software on both endpoints of a vanilla TCP connection - especially in
    settings where NMap probes are blocked, too slow, unreliable, or would
    simply set off alarms,

  - Measurement of system uptime and network hookup, distance (including
    topology behind NAT or packet filters), and so on.

  - Automated detection of connection sharing / NAT, load balancing, and
    application-level proxying setups.

  - Detection of dishonest clients / servers that forge declarative statements
    such as X-Mailer or User-Agent.

The tool can be operated in the foreground or as a daemon, and offers a simple
real-time API for third-party components that wish to obtain additional
information about the actors they are talking to.

Common uses for p0f include reconnaissance during penetration tests; routine
network monitoring; detection of unauthorized network interconnects in corporate
environments; providing signals for abuse-prevention tools; and miscellanous
forensics.

A snippet of typical p0f output may look like this:

.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (syn) ]-
|
| client   = 1.2.3.4
| os       = Windows XP
| dist     = 8
| params   = none
| raw_sig  = 4:120+8:0:1452:65535,0:mss,nop,nop,sok:df,id+:0
|
`----

.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (syn+ack) ]-
|
| server   = 4.3.2.1
| os       = Linux 3.x
| dist     = 0
| params   = none
| raw_sig  = 4:64+0:0:1460:mss*10,0:mss,nop,nop,sok:df:0
|
`----

.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (mtu) ]-
|
| client   = 1.2.3.4
| link     = DSL
| raw_mtu  = 1492
|
`----

.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (uptime) ]-
|
| client   = 1.2.3.4
| uptime   = 0 days 11 hrs 16 min (modulo 198 days)
| raw_freq = 250.00 Hz
|
`----

A live demonstration can be seen here:

http://lcamtuf.coredump.cx/p0f3/

--------------------
2. How does it work?
--------------------

A vast majority of metrics used by p0f were invented specifically for this tool,
and include data extracted from IPv4 and IPv6 headers, TCP headers, the dynamics
of the TCP handshake, and the contents of application-level payloads.

For TCP/IP, the tool fingerprints the client-originating SYN packet and the
first SYN+ACK response from the server, paying attention to factors such as the
ordering of TCP options, the relation between maximum segment size and window
size, the progression of TCP timestamps, and the state of about a dozen possible
implementation quirks (e.g. non-zero values in "must be zero" fields).

The metrics used for application-level traffic vary from one module to another;
where possible, the tool relies on signals such as the ordering or syntax of
HTTP headers or SMTP commands, rather than any declarative statements such as
User-Agent. Application-level fingerprinting modules currently support HTTP
and SSL. Before the tool leaves "beta", I want to add SMTP and FTP. Other
protocols, such as FTP, POP3, IMAP and SSH may follow.

The list of all the measured parameters is reviewed in section 5 later on.
Some of the analysis also happens on a higher level: inconsistencies in the
data collected from various sources, or in the data from the same source
obtained over time, may be indicative of address translation, proxying, or
just plain trickery. For example, a system where TCP timestamps jump back
and forth, or where TTLs and MTUs change subtly, is probably a NAT device.

-------------------------------
3. How do I compile and use it?
-------------------------------

To compile p0f, try running './build.sh'; if that fails, you will be probably
given some tips about the probable cause. If the tips are useless, send me a
mean-spirited mail.

It is also possible to build a debug binary ('./build.sh debug'), in which case,
verbose packet parsing and signature matching information will be written to
stderr. This is useful when troubleshooting problems, but that's about it.

The tool should compile cleanly under any reasonably new version of Linux,
FreeBSD, OpenBSD, MacOS X, and so forth. You can also builtdit on Windows using
cygwin and winpcap. I have not tested it on all possible varieties of un*x, but
if there are issues, they should be fairly superficial.

Once you have the binary compiled, you should be aware of the following
command-line options:

  -i iface   - asks p0f to listen on a specific network interface. On un*x, you
               should reference the interface by name (e.g., eth0). On Windows,
               you can use adapter index instead (0, 1, 2...).
               
               Multiple -i parameters are not supported; you need to run
               separate instances of p0f for that. On Linux, you can specify
               'any' to access a pseudo-device that combines the traffic on
               all other interfaces; the only limitation is that libpcap will
               not recognize VLAN-tagged frames in this mode, which may be
               an issue in some of the more exotic setups.

               If you do not specify an interface, libpcap will probably pick
               the first working interface in your system.
               
  -L         - lists all available network interfaces, then quits. Particularly
               useful on Windows, where the system-generated interface names
               are impossible to memorize.
               
  -r fname   - instead of listening for live traffic, reads pcap captures from
               the specified file. The data can be collected with tcpdump or any
               other compatible tool. Make sure that snapshot length (-s
               option in tcpdump) is large enough not to truncate packets; the
               default may be too small.

               As with -i, only one -r option can be specified at any given
               time.
               
  -o fname   - appends grep-friendly log data to the specified file. The log
               contains all observations made by p0f about every matching
               connection, and may grow large; plan accordingly.

               Only one instance of p0f should be writing to a particular file
               at any given time; where supported, advisory locking is used to
               avoid problems.
               
  -s fname   - listens for API queries on the specified filesystem socket. This
               allows other programs to ask p0f about its current thoughts about
               a particular host. More information about the API protocol can be
               found in section 4 below.

               Only one instance of p0f can be listening on a particular socket
               at any given time. The mode is also incompatible with -r.

  -d         - runs p0f in daemon mode: the program will fork into background
               and continue writing to the specified log file or API socket. It
               will continue running until killed, until the listening interface
               is shut down, or until some other fatal error is encountered.

               This mode requires either -o or -s to be specified.

               To continue capturing p0f debug output and error messages (but
               not signatures), redirect stderr to another non-TTY destination,
               e.g.:
               
               ./p0f -o /var/log/p0f.log -d 2>>/var/log/p0f.error
               
               Note that if -d is specified and stderr points to a TTY, error
               messages will be lost.

   -u user   - causes p0f to drop privileges, switching to the specified user
               and chroot()ing itself to said user's home directory.

               This mode is *highly* advisable (but not required) on un*x
               systems, especially in daemon mode. See section 7 for more info.

More arcane settings (you probably don't need to touch these):

  -p         - puts the interface specified with -i in promiscuous mode. If
               supported by the firmware, the card will also process frames not
               addressed to it. 

  -S num     - sets the maximum number of simultaneous API connections. The
               default is 20; the upper cap is 100.

  -m c,h     - sets the maximum number of connections (c) and hosts (h) to be
               tracked at the same time (default: c = 1,000, h = 10,000). Once
               the limit is reached, the oldest 10% entries gets pruned to make
               room for new data.

               This setting effectively controls the memory footprint of p0f.
               The cost of tracking a single host is under 400 bytes; active
               connections have a worst-case footprint of about 18 kB. High
               limits have some CPU impact, too, by the virtue of complicating
               data lookups in the cache.

               NOTE: P0f tracks connections only until the handshake is done,
               and if protocol-level fingerprinting is possible, until few
               initial kilobytes of data have been exchanged. This means that
               most connections are dropped from the cache in under 5 seconds;
               consequently, the 'c' variable can be much lower than the real
               number of parallel connections happening on the wire.

  -t c,h     - sets the timeout for collecting signatures for any connection
               (c); and for purging idle hosts from in-memory cache (h). The
               first parameter is given in seconds, and defaults to 30 s; the
               second one is in minutes, and defaults to 120 min.

               The first value must be just high enough to reliably capture
               SYN, SYN+ACK, and the initial few kB of traffic. Low-performance
               sites may want to increase it slightly.

               The second value governs for how long API queries about a
               previously seen host can be made; and what's the maximum interval
               between signatures to still trigger NAT detection and so on.
               Raising it is usually not advisable; lowering it to 5-10 minutes
               may make sense for high-traffic servers, where it is possible to
               see several unrelated visitors subsequently obtaining the same
               dynamic IP from their ISP.

Well, that's about it. You probably need to run the tool as root. Some of the
most common use cases:

# ./p0f -i eth0

# ./p0f -i eth0 -d -u p0f-user -o /var/log/p0f.log

# ./p0f -r some_capture.cap

The greppable log format (-o) uses pipe ('|') as a delimiter, with name=value
pairs describing the signature in a manner very similar to the pretty-printed
output generated on stdout:

[2012/01/04 10:26:14] mod=mtu|cli=1.2.3.4/1234|srv=4.3.2.1/80|subj=cli|link=DSL|raw_mtu=1492

The 'mod' parameter identifies the subsystem that generated the entry; the
'cli' and 'srv' parameters always describe the direction in which the TCP
session is established; and 'subj' describes which of these two parties is
actually being fingerprinted.

Command-line options may be followed by a single parameter containing a
pcap-style traffic filtering rule. This allows you to reject some of the less
interesting packets for performance or privacy reasons. Simple examples include:

  'dst net 10.0.0.0/8 and port 80'
  
  'not src host 10.1.2.3'
  
  'port 22 or port 443'

You can read more about the supported syntax by doing 'man pcap-fiter'; if
that fails, try this URL:

  http://www.manpagez.com/man/7/pcap-filter/
  
Filters work both for online capture (-i) and for previously collected data
produced by any other tool (-r).


== TCP signatures ==

For TCP traffic, signature layout is as follows:

sig   = 4:128:mss,nop,nop,sok:df,id+
sig   = 4:128:mss,nop,ws,nop,nop,sok:df,id+
sig   = 4:128:mss,nop,ws,nop,nop,sok:df,id+
sig   = 4:128:mss,nop,ws,sok,ts:df,id+

sig = ver:ittl:olayout:quirks

  ver        - signature for IPv4 ('4'), IPv6 ('6'), or both ('*').

               NEW SIGNATURES: P0f documents the protocol observed on the wire,
               but you should replace it with '*' unless you have observed some
               actual differences between IPv4 and IPv6 traffic, or unless the
               software supports only one of these versions to begin with.

  ittl       - initial TTL used by the OS. Almost all operating systems use
               64, 128, or 255; ancient versions of Windows sometimes used
               32, and several obscure systems sometimes resort to odd values
               such as 60.

               NEW SIGNATURES: P0f will usually suggest something, using the
               format of 'observed_ttl+distance' (e.g. 54+10). Consider using
               traceroute to check that the distance is accurate, then sum up
               the values. If initial TTL can't be guessed, p0f will output
               'nnn+?', and you need to use traceroute to estimate the '?'.

               A handful of userspace tools will generate random TTLs. In these
               cases, determine maximum initial TTL and then add a - suffix to
               the value to avoid confusion.

  olayout    - comma-delimited layout and ordering of TCP options, if any. This
               is one of the most valuable TCP fingerprinting signals. Supported
               values:

               eol+n  - explicit end of options, followed by n bytes of padding
               nop    - no-op option
               mss    - maximum segment size
               ws     - window scaling
               sok    - selective ACK permitted
               sack   - selective ACK (should not be seen)
               ts     - timestamp
               ?n     - unknown option ID n

               NEW SIGNATURES: Copy this string literally.

  quirks     - comma-delimited properties and quirks observed in IP or TCP
               headers:

               df     - "don't fragment" set (probably PMTUD); ignored for IPv6
               id+    - DF set but IPID non-zero; ignored for IPv6
               id-    - DF not set but IPID is zero; ignored for IPv6
               ecn    - explicit congestion notification support
               0+     - "must be zero" field not zero; ignored for IPv6
               flow   - non-zero IPv6 flow ID; ignored for IPv4

               seq-   - sequence number is zero
               ack+   - ACK number is non-zero, but ACK flag not set
               ack-   - ACK number is zero, but ACK flag set
               uptr+  - URG pointer is non-zero, but URG flag not set
               urgf+  - URG flag used
               pushf+ - PUSH flag used

               ts1-   - own timestamp specified as zero
               ts2+   - non-zero peer timestamp on initial SYN
               opt+   - trailing non-zero data in options segment
               exws   - excessive window scaling factor (> 14)
               bad    - malformed TCP options

               If a signature scoped to both IPv4 and IPv6 contains quirks valid
               for just one of these protocols, such quirks will be ignored for
               on packets using the other protocol. For example, any combination
               of 'df', 'id+', and 'id-' is always matched by any IPv6 packet.

               NEW SIGNATURES: Copy literally.

NOTE: The TCP module allows some fuzziness when an exact match can't be found:
'df' and 'id+' quirks are allowed to disappear; 'id-' or 'ecn' may appear; and
TTLs can change.

To gather new SYN ('request') signatures, simply connect to the fingerprinted
system, and p0f will provide you with the necessary data. To gather SYN+ACK
('response') signatures, you should use the bundled p0f-sendsyn utility while p0f
is running in the background; creating them manually is not advisable.

== HTTP signatures ==

A special directive should appear at the beginning of the [http:request]
section, structured the following way:

ua_os = Linux,Windows,iOS=[iPad],iOS=[iPhone],Mac OS X,...

This list should specify OS names that should be looked for within the
User-Agent string if the string is otherwise deemed to be honest. This input
is not used for fingerprinting, but aids NAT detection in some useful ways.

The names have to match the names used in 'sig' specifiers across p0f.fp. If a
particular name used by p0f differs from what typically appears in User-Agent,
the name=[string] syntax may be used to define any number of aliases.

Other than that, HTTP signatures for GET and HEAD requests have the following
layout:

sig = ver:horder:habsent:expsw

  ver        - 1.0 for HTTP/1.0, 1.1 for HTTP/1.1

  horder     - comma-separated, ordered list of headers that should appear in
               matching traffic. Substrings to match within each of these
               headers may be specified using a name=[value] notation.

               The signature will be matched even if other headers appear in
               between, as long as the list itself is matched in the specified
               sequence.

               Headers that usually do appear in the traffic, but may go away
               (e.g. Accept-Language if the user has no languages defined, or
               Referer if no referring site exists) should be prefixed with '?',
               e.g. "?Referer". P0f will accept their disappearance, but will
               not allow them to appear at any other location.

               P0f automatically removes some headers, prefixes others with '?',
               and inhibits the value of fields such as 'Referer' or 'Cookie' -
               but this is not a substitute for manual review.

  expsw      - expected substring in 'User-Agent' or 'Server'. This is not
               used to match traffic, and merely serves to detect dishonest
               software. If you want to explicitly match User-Agent, you need
               to do this in the 'horder' section, e.g.:

               User-Agent=[Firefox]

Any of these sections sections except for 'ver' may be blank.

There are many protocol-level quirks that p0f could be detecting - for example,
the use of non-standard newlines, or missing or extra spacing between header
field names and values. There is also some information to be gathered from
responses to OPTIONS or POST. That said, it does not seem to be worth the
effort: the protocol is so verbose, and implemented so arbitrarily, that we are
getting more than enough information just with a simple GET / HEAD fingerprint.

== SSL signatures ==

P0f is capable of fingerprinting SSL / TLS requests. This can be used
to verify if a browser using https is legitimate, and in some cases
can leak some details about client operating system.

SSL signatures have the following layout:

sig = sslver:ciphers:extensions:sslflags

  sslver     - two dot-separated integers representing SSL request version,
               may be something like 2.0 or 3.1.

               NEW SIGNATURES: Copy literally.

  ciphers    - comma-separated list of hex values representing SSL
               ciphers supported by the client.

               NEW SIGNATURES: Review the list and feel free to
               substitute parts that bring no information with the
               match all sign '*'.  For efficiency avoid using star in
               the beginning of the signature.

               You may also use '?' to make particular value optional.

  extensions - comma-separated list of hex values representing
               SSL extensions.

               NEW SIGNATURES: Same as for ciphers.

  sslflags   - comma-separated list of SSL flags:

               v2    - client used an older SSLv2 handshake frame, instead
                       of more recent SSLv3 / TLSv1

               Flags specific to SSLv3 handshake:

               ver   - requested SSL protocol was different on a protocol
                       (request) than on a record layer
               stime - gmt_unix_time field has unusually was small value,
                       most likely time since boot
               rtime - gmt_unix_time field has value that is very far off
                       local time, most likely it is random
               compr - client supports deflate compression

               NEW SIGNATURES: Copy literally.

Any of these sections except for the 'sslver' may be blank.

For a fingerprint to match signature an exact fit must happen - sslver
and flags must be exactly equal, ciphers and extensions must match
with respect to '*' and '?' signs.

Note that the fingerprint is matched against signatures in order. You
may take advantage of this and keep broader signatures closer to the
bottom of the list.

---------------------------
9. Acknowledgments and more
---------------------------

P0f is made possible thanks to the contributions of several good souls,
including:

  Phil Ames
  Jannich Brendle
  Matthew Dempsky
  Jason DePriest
  Dalibor Dukic
  Mark Martinec
  Damien Miller
  Josh Newton
  Nibbler
  Bernhard Rabe
  Chris John Riley
  Sebastian Roschke
  Peter Valchev
  Jeff Weisberg
  Anthony Howe
  Tomoyuki Murakami
  Michael Petch

If you wish to help, the most immediate way to do so is to simply gather new
signatures, especially from less popular or older platforms (servers, networking
equipment, portable / embedded / specialty OSes, etc).

Problems? Suggestions? Complaints? Compliments? You can reach the author at
<lcamtuf@coredump.cx>. The author is very lonely and appreciates your mail.

About

Fork of fork of p0f, the passive fingerprinter


Languages

Language:C 94.8%Language:Shell 4.5%Language:Makefile 0.4%Language:Dockerfile 0.3%