new: add Troubleshooting Guides + Dedicated Metrics section #1246

Merged: 10 commits into falcosecurity:master on Jan 30, 2024

incertum
Contributor

What type of PR is this?

Uncomment one (or more) /kind <> lines:

/kind bug

/kind cleanup

/kind design

/kind user-interface

/kind content

/kind translation

/kind event

Any specific area of the project related to this PR?

Uncomment one (or more) /area <> lines:

/area blog

/area documentation

/area community

What this PR does / why we need it:

Add new help guides.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

@incertum
Contributor Author

/assign @leogr

weight: 10
---

## Action Items (TL;DR)
Contributor Author

@Andreagit97 and @mikegcoleman does this complement the planned revamp of the Install and Operate Guides? Do you have additional suggestions?

@@ -0,0 +1,196 @@
---
title: Help, Falco Is Dropping Syscalls Events!
Contributor Author

@FedeDP 😅 we finally have it ...


Furthermore, Linux does not always behave as you might expect. One concrete example is that shell built-ins like `echo` do not spawn a new process, so the `echo` command is not logged by Falco. Similarly, if a base64-encoded string is interpreted during decoding, the original base64 blob does not appear in the command args unless the command was passed via `sh -c`. Lastly, some fields only work for certain kernel versions or system configurations (e.g. [proc.is_exe_upper_layer](https://falco.org/docs/reference/rules/supported-fields/#field-class-process) requires a container overlayfs).
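
The built-in behavior can be checked directly. The following is a minimal Python sketch, assuming a POSIX `sh` is available on the `PATH`. The helper name `describe_command` is hypothetical and not part of Falco. It asks the shell how it would resolve a command name.

```python
import subprocess


def describe_command(name: str) -> str:
    """Ask a POSIX shell how it resolves `name` (hypothetical helper, not part of Falco)."""
    # `type` reports whether a name is a shell built-in or an external executable.
    # The name is passed as a positional argument ($1) to avoid quoting problems.
    result = subprocess.run(
        ["sh", "-c", 'type "$1"', "sh", name],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # A built-in runs inside the shell process itself, so no new process is
    # spawned for it and there is no separate process-spawn event to match on.
    print(describe_command("echo"))
```

If the shell reports a name as a built-in, a rule that matches on newly spawned processes should not be expected to fire for it.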

## Missing Container Images
Contributor Author

@leogr I really hope we can make this clearer. I'll post in one of the open issues that perhaps we should reconsider adding host to the image if it's a host process, and for Falco 0.38 even pod_sandbox_container once the relevant libs PR is merged ...

The `k8s.*` fields are extracted from the container runtime socket at the same time as we look up the `container.*` fields from the CRI API call responses.
{{% /pageinfo %}}

Carefully read the field description documentation:
Contributor Author

We already updated them in libs a while ago for improved technical clarity.


Here is a brief glossary of the currently supported metrics. The snippet was retrieved from a more or less idle test `x86_64` Linux machine, so counters and event rates are very low. Note that `aarch64` will have slightly different kernel tracepoints.

```yaml
# ...
```
Contributor Author

@Andreagit97 😅 finally ... it's up!

@leogr
Member

leogr commented Jan 22, 2024

cc @LucaGuerra @mikegcoleman

- Ensure the DKMS package is installed for the `kmod` driver, and note that your system may require custom-signed kernel modules. Also, verify the availability of the host `/dev` mount (e.g. `/dev:/host/dev` when running Falco in a container).
- In general, check that Falco has all host mounts when running from a container or as a daemonset in Kubernetes. Critical mounts for running Falco, assuming the kernel driver is available, include: `/etc:/host/etc`, `/proc:/host/proc`, `/boot:/host/boot`, `/dev:/host/dev`.
- For `ebpf` and `kmod` drivers, the kernel object code needs to be available for the exact kernel release (`uname -r`) of your system. This invites a wide range of possible issues:
  - Assuming you use Falco's open-source artifacts and open-source kernels, one source of error can be that the pre-built kernel driver (published by The Falco Project) is not available for download.
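
The mount and kernel-release checks in the list above can be sketched as a small pre-flight script. This is a minimal Python sketch, assuming Falco runs in a container with the host directories mounted under `/host`, as in the mount examples above. The names `missing_host_mounts` and `kernel_release` are hypothetical helpers, not part of Falco.

```python
import os
import platform

# Host directories the list above names as critical, relative to the mount prefix.
REQUIRED_HOST_DIRS = ("etc", "proc", "boot", "dev")


def missing_host_mounts(host_root: str = "/host", required=REQUIRED_HOST_DIRS) -> list:
    """Return the expected host mount paths that are not present as directories."""
    expected = [os.path.join(host_root, name) for name in required]
    return [path for path in expected if not os.path.isdir(path)]


def kernel_release() -> str:
    """Return the running kernel release, the same string `uname -r` prints."""
    return platform.release()


if __name__ == "__main__":
    missing = missing_host_mounts()
    if missing:
        print("missing host mounts:", ", ".join(missing))
    else:
        print("all expected host mounts are present")
    # A pre-built or locally built driver has to match this exact release.
    print("kernel release a driver must match:", kernel_release())
```

The check only confirms that the directories exist inside the container. It does not verify that they are actual bind mounts of the host paths, so treat a clean result as a first sanity check rather than proof of a correct setup.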
Member

Users can explore this site to see which drivers are available: https://download.falco.org/driver/site/index.html

Contributor Author

@Issif added, btw nice suggestion!

@leogr leogr added this to the falco-0.37.0 milestone Jan 23, 2024
incertum and others added 2 commits January 23, 2024 10:25
Co-authored-by: Thomas Labarussias <[email protected]>
Signed-off-by: Melissa Kilby <[email protected]>
Signed-off-by: Melissa Kilby <[email protected]>
@incertum
Contributor Author

Reworked the metrics field explanations section so that it is less crowded and better organized.

Signed-off-by: Melissa Kilby <[email protected]>
Signed-off-by: Melissa Kilby <[email protected]>
@incertum
Contributor Author

@leogr and @mikegcoleman I changed my mind and created two new top-level sections, namely "Metrics" and "Help!". Is that better?

@incertum
Contributor Author

As we are deciding to start deprecating the old drop stats / alerts we could consider removing https://falco.org/docs/event-sources/kernel/dropped-events/ once the metrics page is up. WDYT?

@incertum
Contributor Author

Also https://falco.org/docs/event-sources/kernel/dropped-events/#actions-rate-throttling is actually already removed CC @Andreagit97

@leogr
Member

leogr commented Jan 24, 2024

> As we are deciding to start deprecating the old drop stats / alerts we could consider removing https://falco.org/docs/event-sources/kernel/dropped-events/ once the metrics page is up. WDYT?

Keep this on hold for now.

Signed-off-by: Melissa Kilby <[email protected]>
@mikegcoleman
Contributor

@incertum should this be /area docs instead of /area blog?

@mikegcoleman
Contributor

@incertum I think it'd be better if the top level topic was "troubleshooting" instead of "help" - when I see help I think of in-app help, which is basically what documentation is so it seemed a little odd.

@incertum
Contributor Author

> @incertum I think it'd be better if the top level topic was "troubleshooting" instead of "help" - when I see help I think of in-app help, which is basically what documentation is so it seemed a little odd.

Happy to change it if everyone is on board. On the other hand, you often have help Slack channels. Not sure what the best naming would be.

@leogr
Member

leogr commented Jan 25, 2024

> > @incertum I think it'd be better if the top level topic was "troubleshooting" instead of "help" - when I see help I think of in-app help, which is basically what documentation is so it seemed a little odd.
>
> Happy to change it if everyone is on board. On the other hand, you often have help Slack channels. Not sure what the best naming would be.

I usually assume the whole document is already a "help" guide, so "troubleshooting" sounds a bit better to me. No strong opinion either way.

@incertum incertum changed the title from "new: add Help Guides" to "new: add Troubleshooting Guides + Dedicated Metrics section" on Jan 25, 2024
Co-authored-by: Mike Coleman <[email protected]>
Signed-off-by: Melissa Kilby <[email protected]>
@poiana poiana added the lgtm label Jan 30, 2024
@poiana

poiana commented Jan 30, 2024

LGTM label has been added.

Git tree hash: af1a75a78fdf8bcdc253054ca244afbe8f91541d

@poiana

poiana commented Jan 30, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: incertum, leogr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@poiana poiana merged commit 40bda03 into falcosecurity:master Jan 30, 2024
3 of 6 checks passed
5 participants