What Is eBPF? An Introductory Guide


This post was written by Suleiman Abubakar Sadeeq. Scroll down to view the author’s bio.


It’s sometimes necessary to modify the core of an OS, for example, to add flexibility or to run custom code inside it. This is possible, but it can introduce security risks, performance problems, or even damage to the OS. Let’s say you modify a core function of an OS and everything works fine.

The problem is that the OS gets regular updates with new patches and version releases. A new release might remove a function or feature your modification relies on, which will leave you with a broken OS. To avoid this, you can use a tool that extends the core of an OS without actually modifying it. In this post, we’re going to discuss eBPF: what it is, how it works, its pros and cons, and some ways it can help.

Let’s begin by defining eBPF.

What is eBPF?

Extended Berkeley Packet Filter (eBPF) is a technology that provides a sandbox environment in which bytecode programs can run inside the OS kernel, using its resources without changing the kernel source code or loading kernel modules. eBPF emerged from the Linux kernel as an extension of the original Berkeley Packet Filter (BPF), which was created to filter packets in the kernel and so prevent needless packet transfers from kernel space to user space.

A website analogy is a good way of describing eBPF’s role in the kernel. Let’s say you have a website built with pure HTML. HTML is a static markup language, so the website has no programmability whatsoever. To make it more useful, you add JavaScript, which brings programmability and flexibility to the website. Similar to HTML, the Linux kernel is not very customizable; changing it is possible, but it might leave you with too many bugs or even a broken kernel. eBPF plays a role for the OS kernel similar to the one JavaScript plays for a web page.

How does eBPF work?

You normally start with a program written in a high-level language, typically a restricted subset of C, that reacts to an event in the kernel. This program might be observing file access or manipulating network traffic. A compiler such as Clang/LLVM compiles the program into eBPF bytecode, which is the format the kernel accepts. A user-space loader then loads the bytecode into the kernel and attaches it to a specified hook.

But before a program can execute, it has to go through a verifier. The verifier runs a set of safety checks on the bytecode to ensure it’s safe to run in the kernel, and if the code is unsafe, it rejects the program so that it never runs. Once all verifier checks pass, the built-in Just-In-Time (JIT) compiler translates the bytecode into native machine code. The program is then bound to the specified hook in the kernel, where it waits for an event to occur. When the event occurs, the program runs and can write its results to a map, which stores state and shares it with user space.

Let’s look at how each of these works within eBPF in a little more detail.

Hooks

Because eBPF is event-driven, a program executes only when the kernel or an application passes the hook it’s attached to. The program can then inspect or manipulate the data associated with that event. Predefined hooks available in eBPF include system calls, tracepoints, network events, kprobes, and uprobes.
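To make the event-driven idea concrete, here is a minimal sketch in plain Python. It's only a user-space toy model, not real eBPF: the `attach` and `fire` functions and the event dictionaries are hypothetical, and the hook names are just illustrative labels.

```python
from collections import defaultdict

# Toy user-space model of hook dispatch; nothing here touches the kernel.
# attach(), fire(), and the event format are hypothetical.
hooks = defaultdict(list)

def attach(hook_name, program):
    """Register a program (a plain callable) to run when hook_name fires."""
    hooks[hook_name].append(program)

def fire(hook_name, event):
    """Simulate the kernel reaching a hook point: run every attached program."""
    return [program(event) for program in hooks[hook_name]]

seen = []
attach("sys_enter_openat", lambda event: seen.append(event["filename"]))

fire("sys_enter_openat", {"filename": "/etc/hosts"})  # the attached program runs
fire("sys_enter_read", {"fd": 3})                     # nothing attached, nothing runs
print(seen)  # ['/etc/hosts']
```

The point of the sketch is that the program does nothing on its own; it only runs when something passes through the hook it's attached to.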


Helper calls

To access kernel functionality such as memory, eBPF programs use helper functions. Rather than calling arbitrary kernel functions, an eBPF program calls these helpers, which the kernel provides as a stable, API-like interface. When a hook triggers an attached program, the program relies on helper calls to read or modify kernel state. Some of the tasks helper calls carry out include manipulating network packets, generating random numbers, getting the current date and time, accessing socket data, and performing tail calls.
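The API-like nature of helpers can be sketched with another toy model in Python. The dictionary keys below borrow the names of two real helpers (`bpf_ktime_get_ns` and `bpf_get_prandom_u32`), but the implementations are ordinary Python stand-ins, and `call_helper` and `program` are hypothetical names made up for this illustration.

```python
import random
import time

# Toy stand-ins named after real helpers; the bodies are ordinary Python,
# not the kernel's implementations.
HELPERS = {
    "bpf_ktime_get_ns": time.monotonic_ns,
    "bpf_get_prandom_u32": lambda: random.getrandbits(32),
}

def call_helper(name, *args):
    """Only functions on the allowlist are reachable, much like an API."""
    if name not in HELPERS:
        raise PermissionError(f"helper {name!r} is not available")
    return HELPERS[name](*args)

def program(event):
    """A toy 'eBPF program': it reaches outside itself only via call_helper."""
    return {"pid": event["pid"], "timestamp_ns": call_helper("bpf_ktime_get_ns")}

print(program({"pid": 42})["pid"])  # 42
```

Asking for a name that isn't in `HELPERS` raises an error, which mirrors the idea that a program can't call arbitrary kernel functions.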

Verifier

This is the step eBPF takes to ensure that a program is safe to load and execute. Here, the eBPF program passes through a set of checks that enforce safety and security requirements, and if verification fails, the program is rejected and never runs. Some of the checks the verifier performs include the following:

  • It ensures that the process loading the program into the kernel has the privileges required to do so.
  • It checks that the program always runs to completion, so it can’t loop forever and hang the kernel.
  • It checks that the code can’t harm the kernel, for example, by crashing it or accessing memory out of bounds.
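The checks above can be sketched as a toy verifier in Python. This is a deliberately simplified model, not the kernel's algorithm: the instruction format (a list of `(opcode, operand)` tuples), the `verify` function, and the instruction limit are all assumptions made for illustration. It guarantees termination by rejecting every backward jump, which is stricter than the real verifier.

```python
def verify(program, privileged=True, max_insns=4096):
    """Return (ok, reason) for a toy program: a list of (opcode, operand) tuples.

    max_insns is an illustrative size limit, not a claim about any kernel version.
    """
    if not privileged:
        return False, "loader lacks privilege"
    if not program or len(program) > max_insns:
        return False, "program empty or too large"
    if program[-1][0] != "exit":
        return False, "last instruction must be exit"
    for pc, (op, arg) in enumerate(program):
        if op == "jump":
            # Jump targets are relative to the next instruction.
            if arg < 0:
                return False, f"backward jump at {pc}: possible infinite loop"
            if pc + 1 + arg >= len(program):
                return False, f"jump at {pc} leaves the program"
    return True, "ok"

good = [("load", 1), ("jump", 1), ("load", 2), ("exit", None)]
loop = [("load", 1), ("jump", -2), ("exit", None)]

print(verify(good))                    # (True, 'ok')
print(verify(loop)[0])                 # False
print(verify(good, privileged=False))  # (False, 'loader lacks privilege')
```

Because every jump moves forward and stays inside the program, execution always reaches an `exit`, so a program that passes this toy check can't run forever.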

Just-In-Time (JIT)

The JIT compiler translates the bytecode program into native machine code to improve the execution speed of the eBPF program in the kernel.

Map

eBPF programs can share information between the kernel and user space through maps. A map stores statistics or other collected data in one of a variety of data structures, including arrays, hash tables, stack traces, and ring buffers.
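Here's a minimal sketch of that sharing pattern, again as a Python toy model rather than real eBPF. A plain dictionary stands in for a hash map, and `on_syscall` and `read_counts` are hypothetical names for the kernel-side program and the user-space reader.

```python
# Toy model: one dict stands in for an eBPF hash map shared by both sides.
counts_map = {}

def on_syscall(event):
    """'Kernel side': runs once per event and updates the map in place."""
    key = event["pid"]
    counts_map[key] = counts_map.get(key, 0) + 1

def read_counts():
    """'User-space side': reads a snapshot of the same map."""
    return dict(counts_map)

for pid in (101, 202, 101):
    on_syscall({"pid": pid})

print(read_counts())  # {101: 2, 202: 1}
```

The program never returns data directly to the reader; both sides only agree on the map, which is what lets state outlive a single event.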

Where can eBPF be used?

  • Management of resources: You can use eBPF to monitor and administer the usage of system resources such as memory and disk I/O to ensure that resources are managed appropriately across the system.

  • Networking: Networking is one of the major uses of eBPF. You can use it to create custom packet filters that filter network packets based on, let’s say, the packet’s contents, IP address, or protocol type. With this custom filter, you can enhance your network security and improve your traffic handling or load balancing.

  • Security: You can use eBPF to implement runtime security policies. These policies monitor and control how an application behaves in real time, preventing it from performing unauthorized actions such as making forbidden system calls, executing malicious code, or breaching access policies. This can help in improving the performance and efficiency of the application, and it can ensure you have control over the security of your application.

  • Observability and debugging: You can leverage eBPF to create custom tracing systems that help identify complex issues at runtime that might be hard to track down with traditional tracing tools. You can use this to find and fix bugs early in your program and improve performance and stability.
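To illustrate the networking use case from the list above, here is a sketch of the decision logic a custom packet filter might apply. It's written in plain Python for readability and runs in user space, so it only models the logic. The blocklist, the allowed protocols, and the `filter_packet` and `make_header` functions are assumptions for this example.

```python
import ipaddress
import struct

BLOCKED_NETS = [ipaddress.ip_network("203.0.113.0/24")]  # hypothetical blocklist
ALLOWED_PROTOCOLS = {6, 17}  # IANA protocol numbers for TCP and UDP

def filter_packet(ipv4_header: bytes) -> str:
    """Return 'DROP' or 'PASS' based on the protocol and source address."""
    if len(ipv4_header) < 20:
        return "DROP"  # too short to be a valid IPv4 header
    protocol = ipv4_header[9]                         # protocol field, byte 9
    src = ipaddress.ip_address(ipv4_header[12:16])    # source address, bytes 12-15
    if protocol not in ALLOWED_PROTOCOLS:
        return "DROP"
    if any(src in net for net in BLOCKED_NETS):
        return "DROP"
    return "PASS"

def make_header(src: str, protocol: int) -> bytes:
    """Build a minimal 20-byte IPv4 header for the demo (checksum left as zero)."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0, 0, 64, protocol, 0,
        ipaddress.ip_address(src).packed,
        ipaddress.ip_address("192.0.2.1").packed,
    )

print(filter_packet(make_header("198.51.100.7", 6)))  # PASS
print(filter_packet(make_header("203.0.113.9", 6)))   # DROP
print(filter_packet(make_header("198.51.100.7", 1)))  # DROP
```

The first packet is TCP from an address outside the blocklist, the second comes from a blocked network, and the third uses ICMP, which isn't in the allowed set.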

With the uses for eBPF in mind, let’s now cover some of the pros and cons of using it.

Pros

  • Any program you load through eBPF runs in a sandbox environment, where it is verified before it executes. This ensures that unsafe code is rejected before it can affect the kernel and provides a safe environment for the program to execute.

  • Using eBPF doesn’t require writing a kernel module or changing the kernel source to interact with or extend the kernel. This is a more reliable and flexible way of writing programs that interact with the kernel.

  • eBPF has proven to be a go-to technology that tech giants use to extend the kernel. For instance, Meta uses eBPF in Katran for network load balancing, and other projects using eBPF include Cilium and Falco.

Cons

  • Because eBPF enforces strict security and verification rules on the programs it executes, it might not be the best technology to go for if your aim is to have more flexibility and control over the way your program executes.

  • For applying network policies with a massive number of IP addresses, it might be better to use a native packet filtering tool like iptables rather than eBPF, which might result in high CPU usage.

Alternatives to eBPF

Here’s a brief overview of some technologies similar to eBPF:

  1. iptables: This is a network address translation (NAT) and packet filtering tool for Linux.

  2. SystemTap: This is an open-source tool that provides developers with a command line interface and scripting language that enables them to collect data from a Linux system, which can help in diagnosing and solving issues.

  3. LTTng: Linux Trace Toolkit Next Generation is a tracing framework that provides developers with a way to trace a system’s activities to optimize and improve its performance. It is also open source.

Do you need eBPF?

As mentioned above, eBPF might not be your best option in some use cases, for example, when you want more flexibility in the way your program runs in the kernel. In that case, it might be better to create your own kernel module. But when your program needs to extract metrics from the kernel without changing or breaking the kernel itself, eBPF is an excellent fit.


Conclusion

In this post, we’ve discussed what eBPF is, how it works, some of its core components, and its pros and cons. Hopefully, this post has given you an introductory insight into eBPF. For further reading, you can refer to the official eBPF documentation.

Author bio

This post was written by Suleiman Abubakar Sadeeq, an ambitious React developer learning and helping to build enterprise apps. In his free time, he plays football, watches soccer, and enjoys playing video games.