pixie-io / pixie

Instant Kubernetes-Native Application Observability

Home Page:https://px.dev

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Populate local IP address from BPF space

oazizi000 opened this issue · comments

Is your feature request related to a problem? Please describe.
With #1808, we collect local IP addresses from eBPF space when doing server-side tracing. For consistency, we should investigate adding support for collecting the local IP address with client-side tracing. This will provide more consistency in understanding of our tables for all users.

Describe the solution you'd like
Collect and populate local IP address in any tables that contain the local IP address.

Describe alternatives you've considered
Alternative solutions could try to populate it from user-space, but BPF-based collection is preferred for performance and reliability reasons.

Additional context
For background see #1807 #1808 and #1809

I believe we can merge http_events with tcp_stats_events using the tcp stats connector.

Ran a quick test in PxL and this seems to do the trick.

import px

# Load http events data
df = px.DataFrame(table='http_events', start_time='-300s')
df = df['time_', 'local_port', 'local_addr', 'remote_port', 'remote_addr', 'trace_role']
df = df[df.trace_role == 1]
df = df[df.local_port == -1]

# Load tcp stats data
tcp_stats_df = px.DataFrame(table="tcp_stats_events", start_time='-300s', select=["time_","local_addr", "local_port", "remote_addr", "remote_port"])

# Group by remote_addr and remote_port and aggregate to get one local_addr per remote_addr:port tuple (drop local port column, which presumably contains several local ports).
tcp_stats_df = tcp_stats_df.groupby(['local_addr', 'remote_addr', 'remote_port']).agg()

# merge on connection tuple
# note that there may be duplicates because multiple Pods configured with hostNetwork: true may be using their respective host node's IP addresses, leading them to share the same local IP addresses as those of their nodes.
merged_df = df.merge(tcp_stats_df, how='inner', left_on=["remote_addr", "remote_port"], right_on=["remote_addr", "remote_port"], suffixes=['_http', '_tcp'])

px.display(df, "http events")
px.display(tcp_stats_df, "tcp")
px.display(merged_df, "http merged with tcp")

For this to work, the TCP stats source connector needs to enabled in stirling. Perhaps it would be worth adding it to kProd?