eBPF Summit Day 1 Recap

The first day of the eBPF Summit is a wrap, and it certainly was an amazing
day full of information about eBPF, the technology that is changing the shape
of Linux networking, observability, and performance. If you missed the keynote
talks from Day 1 you can watch a replay
of the event in its entirety. Individual talks, along with their links,
will be made available within the next week.

Thomas Graf started by greeting attendees with a warm message of “Be kind,
be human.” He went on to say that the team expected hundreds of attendees, but
the event turnout far surpassed any original expectations. Thomas then set the
stage for the keynote and lightning talks to follow, which covered a diverse
set of topics from an equally diverse group of presenters, ranging from
graduate students to industry leaders. Whether you are a newcomer to
eBPF, or you’re a veteran with years of experience, there’s sure to be something
new for you.

The opening keynote talk, titled “Beginners Guide to eBPF Programming”, was
presented by Liz Rice from Aqua Security. Liz started us off with a brief
explanation of what an eBPF program is, along with explaining the difference
between user and kernel space. She went on to provide an overview of what an eBPF
map is, and how an eBPF program can be triggered based on an event. She then
demonstrated how to quickly build an eBPF program that prints “Hello, World”
whenever a kprobe fires. A kprobe, a dynamic probe attached to a kernel function,
is commonly used for debugging or monitoring production systems. After showing us how to do this with the bcc
library, Liz then demonstrated how to store this data in an eBPF map, which
creates a more scalable solution, as well as how to retrieve that data from a
user space application. If you are new to eBPF, Liz provided an amazing
introduction in this opening keynote talk.
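
For readers who want to try this at home, here is a minimal sketch of the kind of program described above, written against bcc's Python front end. It is not Liz's demo code: probing the execve syscall, the map name counts, and the helper names format_counts and run are all assumptions made for illustration. Running it requires root privileges and an installed bcc.

```python
import time

# eBPF program in bcc's restricted C dialect. The map name `counts` and the
# function name `hello` are illustrative choices, not taken from the talk.
PROGRAM = r"""
BPF_HASH(counts, u32, u64);   // key: user ID, value: number of execve() calls

int hello(void *ctx) {
    u32 uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;  // low 32 bits hold the UID
    counts.increment(uid);
    // The message lands in the kernel trace pipe (readable with b.trace_print()).
    bpf_trace_printk("Hello, World\n");
    return 0;
}
"""


def format_counts(pairs):
    """Turn (uid, count) pairs into a printable summary sorted by uid."""
    return ", ".join(f"uid {uid}: {n}" for uid, n in sorted(pairs))


def run(interval=2.0):
    """Load the program, attach it to execve, and poll the map from user space."""
    from bcc import BPF  # imported here so the module loads without bcc present

    b = BPF(text=PROGRAM)
    # Attach the kprobe to the kernel function that implements execve().
    b.attach_kprobe(event=b.get_syscall_fnname("execve"), fn_name="hello")
    try:
        while True:
            time.sleep(interval)
            # Map keys and values come back as ctypes objects; .value unwraps them.
            pairs = [(k.value, v.value) for k, v in b["counts"].items()]
            print(format_counts(pairs))
    except KeyboardInterrupt:
        pass
```

Calling run() as root and then launching a few commands in another terminal should show the per-user counters growing. Keeping the data in a map rather than printing a trace line per event is what lets the user space side read it at its own pace.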

Next up, Daniel Borkmann presented a talk titled “BPF as a Fundamentally Better
Dataplane”. Daniel started with an overview of BPF and how it functions as an
execution engine. Daniel cited the improvements that have been
made to the kernel’s support for eBPF, including support for BPF-to-BPF function
calls, bounded loops, global variables, static linking, BTF, and support for up to
1 million instructions per program. All of this allows for solving a lot of
interesting use cases. Daniel also discussed reducing the kernel’s attack surface
with eBPF and showed how the workflow to implement an eBPF-based fix can be much
faster and easier to deploy than waiting on a backported kernel from a specific
distribution. He went on to talk about improving kernel scalability/extensibility
with BPF and how XDP can outperform DPDK in some use cases. In addition, he
discussed how implementing eBPF-based policy can perform better and faster than a
traditional firewall. Then he discussed how an eBPF-based solution can be used
in the case of pod to pod networking to improve performance, as well as in the
case of service load balancing to a backend application. Next, he discussed how
recently merged support for bpf_redirect_peer() and bpf_redirect_neigh() can
provide pod-to-pod connectivity which completely bypasses the host stack,
resulting in significant performance gains. Finally, he showed how eBPF can be
used to implement additional controls related to bandwidth, TCP congestion, and more.

After a short break, Tabitha Sable and Laurent Bernaille presented “Our eBPF
Journey at Datadog”. Datadog runs tens of thousands of hosts, dozens of Kubernetes
clusters and operates on multiple cloud providers. Their cloud architecture
relies on per pod routable IP addresses, ensuring that each cluster is directly
accessible using a unique range. This adds the overhead of managing IP
address space and makes cross-cluster discovery more challenging. Initially,
Datadog relied on various CNI plugins for each provider but there were
differences between providers. Network policy support was lacking in many cases,
and there was no easy way to implement end-to-end encryption. For service load
balancing, the initial design relied on kube-proxy and iptables; however, at scale
this overhead became challenging as well. While IPVS was able to alleviate some
of the initial pain, it brought with it the additional challenges of connection
tracking, not at one layer but two, and IPVS lacked feature parity with iptables.
Inherently, neither solution was designed to be a client-side load balancer,
and especially not for Kubernetes. As a result of these requirements, Datadog
selected Cilium as their CNI plugin. With Cilium, Datadog was able to
completely remove kube-proxy and also begin enforcing network policies using eBPF.
Cilium was a good fit because it is a universal CNI for multiple cloud providers,
and has the ability to easily enable end-to-end encryption. In addition to looking
at internal traffic, Datadog is also exploring eBPF for network edge filtering,
DDoS mitigation, routing, and even more as they continue to build eBPF into their
own product offerings around security, compliance and network performance.

Next up, KP Singh from Google presented “Security Auditing and Enforcement using
BPF”. His talk focused on his motivations for building Linux Security Modules on
eBPF. In 2019, KP was presented with a request for some audit data which was not
available and his work in this area led him to building an all new way to do
auditing and enforcement in Linux. Kernel Runtime Security Instrumentation, or
KRSI, handles both monitoring what is taking place on a system and enforcing
policy on it. Around 200 LSM hooks provide all the data needed, allowing
LSMs to make appropriate decisions. He then showed us the code – or rather,
walked us through an eBPF program line-by-line. To close, he presented an overview
of new BPF features recently merged into the kernel such as BPF ring buffers,
the bpf_d_path() helper, storage blobs (aka bpf_local_storage), sleepable BPF, and loading
BPF programs at boot time. Work on atomic operations is presently in progress. KP
believes that, while eBPF may not replace other LSMs, the two solutions can
peacefully co-exist.
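
To give a feel for what an LSM hook written in eBPF looks like, here is a small audit-only sketch using bcc's LSM_PROBE macro. It is not the program KP walked through: the choice of the bprm_check_security hook and the names PROGRAM and run are assumptions for illustration. It needs root, a 5.7+ kernel built with CONFIG_BPF_LSM, and "bpf" included in the active LSM list.

```python
# Audit-only LSM program in bcc's C dialect. The hook choice is illustrative.
PROGRAM = r"""
#include <linux/binfmts.h>

// Runs each time the kernel checks a binary that is about to be executed.
LSM_PROBE(bprm_check_security, struct linux_binprm *bprm)
{
    u32 pid = bpf_get_current_pid_tgid() >> 32;   // upper 32 bits hold the PID
    bpf_trace_printk("exec checked, pid %d\n", pid);
    // Returning 0 allows the operation; a non-zero value such as -EPERM
    // would deny it, which is how the same hook can enforce policy.
    return 0;
}
"""


def run():
    """Load the LSM program and stream its audit messages until interrupted."""
    from bcc import BPF  # imported here so the module loads without bcc present

    if not BPF.support_lsm():
        print("This kernel does not expose BPF LSM hooks; nothing to attach.")
        return
    # bcc is expected to attach LSM_PROBE functions when the program is loaded.
    b = BPF(text=PROGRAM)
    try:
        b.trace_print()
    except KeyboardInterrupt:
        pass
```

The single return value is what ties monitoring and enforcement together: the same hook that only logs here becomes an enforcement point once it returns an error for the cases a policy wants to block.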

Finally, there was a great selection of lightning talks, each one just 5 minutes
in length but full of useful information for the eBPF community. Here’s a quick
list of the presenters and topics:

  • Bryce Kahle (Datadog) – How and When You Should Measure CPU Overhead of eBPF Programs
  • Brandon Cook (Adobe Systems) – eBPF at Adobe
  • Andreas Gerstmayr (Red Hat) – Using BCC and bpftrace with Performance Co-Pilot
  • Bradley Whitfield (Capital One) – Building a Secure and Maintainable PaaS
  • Lorenzo Fontana (Sysdig) – Debugging the eBPF Virtual Machine
  • Jianlin Lv (ARM) – Enabling eBPF Superpowers on ARM64 with Cilium
  • Beatriz Martinez (Isovalent) – Zero Instrumentation Monitoring with Your First Steps in eBPF
  • Javier Honduvilla Coto (Facebook) – Building rbperf: a Ruby BPF Profiler
  • Pablo Moncada (MasMovil) – Scaling a Multi-tenant K8S Cluster in a Telco
  • Jakub Sitnicki (Cloudflare) – Steering Connections to Sockets with BPF Socket Lookup Hooks
  • Manali Shukla (Cisco) – Implementation of Hardware Breakpoint in BCC
  • Sam White (GitLab) – Securing Kubernetes Clusters with DevSecOps and GitLab
  • Lou Xun (CCP Games) – Traffic Control the Rabbit with Rust using RedBPF

Thomas Graf wrapped up the day by thanking all of the presenters as well as the team working in the background to help support the event and make it a success. Join us tomorrow for Day Two of the eBPF Summit and also be sure to join us on Slack.

If you want to attend eBPF Summit Day 2 on Oct 29,
register for the free live stream.