
fix(bpfassets): Fix object file lookup #1419

Merged
rootfs merged 1 commit into sustainable-computing-io:main on May 13, 2024

Conversation

dave-tucker
Collaborator

For local development the instructions suggest running:

./_output/bin/linux_amd64/kepler

This checks for the bytecode in /var/lib/kepler/bpfassets. However, for local development this directory doesn't exist.

The fallback was to look in ../../../bpfassets/libbpf/bpf.o. Running the recommended command was causing kepler to look in strange locations (i.e. /bpfassets) for bytecode.

This PR fixes the lookup for local development to use a glob pattern starting at the current directory. This works well for local development, with the added bonus of making it easier to test kepler builds on remote systems since you can also copy the binary and bytecode files together, without having to place the bytecode in a special path.
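
For illustration only, here is a minimal Go sketch of the lookup order described above: check the packaged path first, then search downward from the current working directory. The function and constant names are hypothetical and not taken from the kepler codebase, and the actual PR may implement the search differently.

```go
// Hypothetical sketch: prefer /var/lib/kepler/bpfassets, otherwise search
// for the object file starting at the current directory.
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

const installedDir = "/var/lib/kepler/bpfassets" // packaged location

// findBPFObject returns a path to the named eBPF object file (e.g.
// kepler.bpfel.o), falling back to a recursive search rooted at ".".
func findBPFObject(name string) (string, error) {
	if p := filepath.Join(installedDir, name); fileExists(p) {
		return p, nil
	}
	var found string
	walkErr := filepath.WalkDir(".", func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return nil // skip unreadable entries instead of aborting
		}
		if !d.IsDir() && d.Name() == name {
			found = path
			return fs.SkipAll // stop walking once we have a match
		}
		return nil
	})
	if walkErr != nil {
		return "", walkErr
	}
	if found == "" {
		return "", fmt.Errorf("failed to find bpf object file: no matches found")
	}
	return found, nil
}

func fileExists(p string) bool {
	info, err := os.Stat(p)
	return err == nil && !info.IsDir()
}

func main() {
	path, err := findBPFObject("kepler.bpfel.o")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("using bpf object:", path)
}
```

With this kind of lookup, the binary and the bytecode only need to live under the same working directory, which is what makes copying them together to a remote system convenient.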

@vimalk78
Collaborator

How will the developer workflow look after this?
Earlier you needed to run kepler from _output/bin/linux_amd64 to load the eBPF program, but now it doesn't seem to load either from the repository root or from _output/bin/linux_amd64.

@dave-tucker
Collaborator Author

With this PR it works from the repository root.

@vimalk78
Collaborator

@dave-tucker can you please add some usage examples to the PR description?
I am still getting:

vimalkum (dave-pr-1419) kepler $ sudo ./_output/bin/linux_amd64/kepler 
[sudo] password for vimalkum: 
I0513 15:46:04.659038  544866 gpu.go:47] Trying to initialize GPU collector using dcgm
W0513 15:46:04.659135  544866 gpu_dcgm.go:104] There is no DCGM daemon running in the host: libdcgm.so not Found
W0513 15:46:04.659163  544866 gpu_dcgm.go:108] Could not start DCGM. Error: libdcgm.so not Found
I0513 15:46:04.659168  544866 gpu.go:54] Error initializing dcgm: not able to connect to DCGM: libdcgm.so not Found
I0513 15:46:04.659173  544866 gpu.go:47] Trying to initialize GPU collector using nvidia-nvml
I0513 15:46:04.659219  544866 gpu.go:54] Error initializing nvidia-nvml: failed to init nvml. ERROR_LIBRARY_NOT_FOUND
I0513 15:46:04.659222  544866 gpu.go:47] Trying to initialize GPU collector using dummy
I0513 15:46:04.659225  544866 gpu.go:51] Using dummy to obtain gpu power
I0513 15:46:04.660086  544866 exporter.go:85] Kepler running on version: v0.7.2-247-g93ad7716
I0513 15:46:04.660101  544866 config.go:284] using gCgroup ID in the BPF program: true
I0513 15:46:04.660131  544866 config.go:286] kernel version: 6.8
I0513 15:46:04.660209  544866 config.go:311] The Idle power will be exposed. Are you running on Baremetal or using single VM per node?
I0513 15:46:04.660215  544866 exporter.go:103] EnabledBPFBatchDelete: true
I0513 15:46:04.660242  544866 redfish.go:169] failed to get redfish credential file path
I0513 15:46:04.660815  544866 acpi.go:71] Could not find any ACPI power meter path. Is it a VM?
I0513 15:46:04.661252  544866 watcher.go:66] Using in cluster k8s config
I0513 15:46:04.661258  544866 watcher.go:73] failed to get config: unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined
I0513 15:46:04.661269  544866 watcher.go:125] k8s APIserver watcher was not enabled
I0513 15:46:04.661314  544866 prometheus_collector.go:92] Registered Container Prometheus metrics
I0513 15:46:04.661333  544866 prometheus_collector.go:97] Registered VM Prometheus metrics
I0513 15:46:04.661345  544866 prometheus_collector.go:101] Registered Node Prometheus metrics
I0513 15:46:04.661750  544866 attacher.go:172] failed to attach bpf with libbpf: failed to find ebpf bytecode: failed to find bpf object file: no matches found
I0513 15:46:04.661760  544866 exporter.go:155] failed to start : failed to attach bpf assets: failed to find ebpf bytecode: failed to find bpf object file: no matches found
I0513 15:46:04.661785  544866 exporter.go:180] starting to listen on 0.0.0.0:8888
I0513 15:46:04.661790  544866 exporter.go:186] Started Kepler in 1.718091ms
^C
vimalkum (dave-pr-1419) kepler $ pwd
/home/vimalkum/src/powermon/kepler
vimalkum (dave-pr-1419) kepler $ ls -al ./bpfassets/libbpf/bpf.o/kepler.bpfe*
-rw-r--r--. 1 vimalkum vimalkum 29352 May 10 18:21 ./bpfassets/libbpf/bpf.o/kepler.bpfeb.o
-rw-r--r--. 1 vimalkum vimalkum 29352 May 10 18:21 ./bpfassets/libbpf/bpf.o/kepler.bpfel.o

@dave-tucker
Collaborator Author

@vimalk78 ah! There was a bug. It turns out golang's filepath.Glob doesn't work as I expected (** doesn't work the same way as it does in a shell). I didn't notice it as I had a tmp/kepler.bpfel.o lying around. I've fixed the lookup and re-tested.
sudo ./_output/bin/linux_amd64/kepler seems to work for me now.
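
As a side note, a tiny standalone program (not from the PR) illustrates the pitfall: filepath.Glob treats ** as a single path segment rather than a recursive match, so a shell-style pattern finds nothing when the object file sits several directories deep.

```go
// Demonstration of the ** pitfall in Go's filepath.Glob, assuming the
// repository layout shown in the log above (bpfassets/libbpf/bpf.o/kepler.bpfel.o).
package main

import (
	"fmt"
	"path/filepath"
)

func main() {
	// "**" only matches one path segment, so this does NOT recurse and
	// prints an empty slice when the file is three directories deep.
	matches, err := filepath.Glob("**/kepler.bpfel.o")
	fmt.Println(matches, err)

	// Spelling out the directory depth does match.
	matches, _ = filepath.Glob("bpfassets/*/*/kepler.bpfel.o")
	fmt.Println(matches)
}
```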

@vimalk78
Collaborator

seems to work for me now

yupp, for me too.

I0513 17:16:46.475647  550834 attacher.go:288] Successfully load eBPF module from libbpf object

rootfs merged commit 5050297 into sustainable-computing-io:main on May 13, 2024
20 checks passed