WEBVTT 00:00.000 --> 00:10.000 Please be quiet, we need to start now, thank you. 00:10.000 --> 00:15.000 Quiet, please. 00:15.000 --> 00:17.000 All right, next up we've got Axel, 00:17.000 --> 00:19.000 going to be talking to us about reducing container 00:19.000 --> 00:24.000 image sizes with eBPF and Podman. 00:24.000 --> 00:29.000 Hello everyone, so my name is Axel Fennany. 00:29.000 --> 00:31.000 I'm also French and you're right for that. 00:31.000 --> 00:34.000 And today I'm going to talk about how you can tackle 00:34.000 --> 00:37.000 the problem of bloated images. 00:37.000 --> 00:42.000 And we will see how Podman and eBPF can be used 00:42.000 --> 00:45.000 to target this problem. 00:45.000 --> 00:51.000 So, what we will see today: why you should 00:51.000 --> 00:56.000 reduce your container images, how Podman and the OCI, 00:56.000 --> 01:00.000 the Open Container Initiative, spec can be used, 01:00.000 --> 01:04.000 and how eBPF and containers can work together. 01:04.000 --> 01:08.000 And we'll do a demo, because I think that 01:08.000 --> 01:11.000 showing my code running is probably way clearer 01:11.000 --> 01:15.000 than the slides that I'm going to show. 01:15.000 --> 01:20.000 So, depending on the context and what you are doing 01:20.000 --> 01:24.000 with your images, in an enterprise context, 01:24.000 --> 01:27.000 usually you want to reduce your container size 01:27.000 --> 01:29.000 just to avoid CVEs. 01:29.000 --> 01:33.000 You don't want to be called at 2 a.m. 01:33.000 --> 01:37.000 because there is one image flagged with 10 CVEs, 01:37.000 --> 01:39.000 wherever it is. 01:39.000 --> 01:43.000 And sometimes your image is flagged when 01:43.000 --> 01:46.000 it is just a shared library that is not even used. 01:46.000 --> 01:50.000 But how can you determine what's being used and what's 01:50.000 --> 01:53.000 not inside your image? 01:54.000 --> 01:57.000 There is obviously the network bandwidth. 
01:57.000 --> 02:00.000 The bigger your image is, the more network 02:00.000 --> 02:02.000 bandwidth it will use. 02:02.000 --> 02:04.000 And also the startup time. 02:04.000 --> 02:06.000 A five-megabyte image will obviously start 02:06.000 --> 02:09.000 faster than a five-gigabyte one. 02:09.000 --> 02:13.000 So how do you determine what is going to be 02:13.000 --> 02:15.000 used in your container? 02:15.000 --> 02:18.000 It's a pretty tricky problem, and static analysis 02:18.000 --> 02:20.000 may work in some languages. 02:20.000 --> 02:24.000 But when you have a very big image with different 02:24.000 --> 02:27.000 components, a range of binaries, 02:27.000 --> 02:29.000 utility stuff, different programming languages, 02:29.000 --> 02:33.000 you may want to go into runtime analysis. 02:33.000 --> 02:37.000 But yeah, there are different approaches 02:37.000 --> 02:39.000 that I tried for this problem, because it's something 02:39.000 --> 02:42.000 that I was thinking about for a pretty long time. 02:42.000 --> 02:46.000 I tried first to create my own file system 02:46.000 --> 02:49.000 with FUSE, filesystem in user space, trying to intercept 02:49.000 --> 02:52.000 every file which is opened, but this has 02:52.000 --> 02:54.000 way too much overhead on performance. 02:54.000 --> 02:59.000 And it did not make my application work. 02:59.000 --> 03:03.000 But I recently came across, like a few months ago, 03:03.000 --> 03:06.000 an article from Valentin Rothberg and Dan Walsh. 03:06.000 --> 03:10.000 Their main idea was how you can limit 03:10.000 --> 03:13.000 a container's system call access. 03:13.000 --> 03:17.000 So Docker, when they first created containers, 03:17.000 --> 03:21.000 limited the number of system calls that the 03:21.000 --> 03:24.000 processes inside your container can call. 03:24.000 --> 03:28.000 They created a subset of something like 240 system 03:28.000 --> 03:30.000 calls, but that's still a lot.
03:30.000 --> 03:33.000 And most applications don't need that many. 03:33.000 --> 03:36.000 So they made a tool combining 03:36.000 --> 03:39.000 the two. 03:39.000 --> 03:43.000 So they created a hook with eBPF and 03:43.000 --> 03:45.000 Podman to be able to track 03:45.000 --> 03:49.000 every system call that the container is making. 03:49.000 --> 03:53.000 And then, later on in production, you can use this list 03:53.000 --> 03:56.000 to restrict the container, and if there is a 03:56.000 --> 03:59.000 remote code execution and an attacker tries to use 03:59.000 --> 04:04.000 a system call that is not in the list, it's just denied. The idea is 04:04.000 --> 04:08.000 to do the same for file access. 04:08.000 --> 04:12.000 So Podman: Podman is just a container 04:12.000 --> 04:15.000 engine, very similar to Docker, almost a 04:15.000 --> 04:16.000 drop-in replacement. 04:16.000 --> 04:19.000 You can run podman run, podman pull, everything. 04:19.000 --> 04:23.000 And they implement the Open Container Initiative 04:23.000 --> 04:27.000 spec, which is a spec that defines how containers 04:27.000 --> 04:29.000 should work. 04:29.000 --> 04:30.000 And why Podman? 04:30.000 --> 04:33.000 Because I'm working at Red Hat and I'm working on 04:33.000 --> 04:36.000 Podman Desktop, so I would have 04:36.000 --> 04:39.000 issues if I had not used it. 04:39.000 --> 04:42.000 So, to be able to do your 04:42.000 --> 04:43.000 runtime analysis, 04:43.000 --> 04:46.000 you want to be called, 04:46.000 --> 04:49.000 that is, your program, which is going to run the 04:49.000 --> 04:52.000 profiling, called before the container is running. 04:52.000 --> 04:54.000 So you can use the prestart hook. 04:54.000 --> 04:59.000 Hooks in containers are a bit tricky.
05:00.000 --> 05:03.000 So you don't want to have your container running yet: 05:03.000 --> 05:06.000 when this is used, when you run a 05:06.000 --> 05:09.000 container, you call my binary in a synchronous 05:09.000 --> 05:12.000 manner; until I return, you do not start the 05:12.000 --> 05:13.000 container. 05:13.000 --> 05:15.000 Because you don't want to have your container running 05:15.000 --> 05:16.000 before you are ready to profile it. 05:16.000 --> 05:21.000 Because you would probably miss some loading of 05:21.000 --> 05:26.000 libraries, 05:26.000 --> 05:30.000 other things. And for example, you can define a very simple 05:30.000 --> 05:36.000 hook, and your hook, your binary, will have some information, 05:36.000 --> 05:42.000 which is the PID of your container and the annotation value 05:42.000 --> 05:48.000 that you set, that is required. But in containers, 05:48.000 --> 05:52.000 you can create some processes, you can have tons of 05:52.000 --> 05:56.000 things happening in it. So how do you 05:56.000 --> 06:00.000 profile everything that's happening inside one container? 06:00.000 --> 06:04.000 You check the mount namespace. In the Linux world, for 06:04.000 --> 06:08.000 containers, for isolation, the engine usually 06:08.000 --> 06:12.000 creates a mount namespace. This mount namespace has an ID, 06:12.000 --> 06:16.000 and every process inside your container, without 06:16.000 --> 06:18.000 privileges or without going outside the container, 06:18.000 --> 06:24.000 will have the same mount namespace. So with this ID, 06:24.000 --> 06:28.000 you now have a way to identify processes, but you want to be 06:28.000 --> 06:32.000 able to capture everything that is happening in 06:32.000 --> 06:38.000 a performant manner. So eBPF is the solution. So, 06:38.000 --> 06:40.000 to go quickly, there is a room at FOSDEM 06:40.000 --> 06:44.000 on eBPF, it's a very big subject, very nice.
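
The hook input and the mount namespace lookup described above can be sketched in a few lines. This is only an illustration, assuming the runtime passes the OCI container state as JSON on the hook's standard input (with `pid` and `annotations` fields, as the OCI runtime spec describes) and that the namespace ID is read from `/proc/<pid>/ns/mnt` on Linux. The annotation key `example.profiler/output` and every function name are hypothetical and are not taken from the tool shown in the talk.

```python
import json
import os
import re
from typing import Optional, Tuple

# Hypothetical annotation key, chosen only for this sketch.
ANNOTATION_KEY = "example.profiler/output"


def parse_hook_state(raw: str) -> Tuple[int, Optional[str]]:
    """Extract the container PID and the output-path annotation from OCI state JSON."""
    state = json.loads(raw)
    pid = int(state["pid"])
    output = state.get("annotations", {}).get(ANNOTATION_KEY)
    return pid, output


def parse_ns_link(link: str) -> int:
    """Turn a namespace link target such as 'mnt:[4026531840]' into its numeric ID."""
    match = re.fullmatch(r"mnt:\[(\d+)\]", link)
    if match is None:
        raise ValueError(f"unexpected namespace link: {link!r}")
    return int(match.group(1))


def mount_namespace_id(pid: int) -> int:
    """Read the mount namespace ID of a process from /proc (Linux only)."""
    return parse_ns_link(os.readlink(f"/proc/{pid}/ns/mnt"))
```

A hook binary built this way would read its standard input once, call `parse_hook_state`, and then call `mount_namespace_id(pid)` to get the ID that identifies every process of that container.
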
06:44.000 --> 06:48.000 It allows you, in broad terms, to run code 06:48.000 --> 06:52.000 in a privileged manner inside the Linux kernel. You don't need to 06:52.000 --> 06:56.000 recompile your kernel to run some custom logic. Why is it 06:56.000 --> 07:00.000 important? Because you can hook in 07:00.000 --> 07:04.000 most places inside your Linux kernel, and you can 07:04.000 --> 07:08.000 access the internal data structures in 07:08.000 --> 07:12.000 some way. So you should use it because it's very efficient, 07:12.000 --> 07:16.000 there is almost no overhead if you write your eBPF program 07:16.000 --> 07:20.000 pretty nicely. And it allows you a lot of 07:20.000 --> 07:24.000 flexibility. So there are tons of eBPF 07:24.000 --> 07:28.000 programs, there is one for everything: you can hook into 07:28.000 --> 07:30.000 system calls, you can hook into file systems, you can 07:30.000 --> 07:34.000 hook into drivers, but there is one which is very interesting, 07:34.000 --> 07:38.000 which is that you can hook into a Linux Security Module, and the specific 07:38.000 --> 07:42.000 one is file open. I tried first to hook into the 07:42.000 --> 07:46.000 open system call, but some newer applications are now using 07:46.000 --> 07:48.000 openat, and when you use exec, it's not 07:48.000 --> 07:52.000 necessarily opening a file that way. So it was very hard to 07:52.000 --> 07:56.000 be sure to catch everything that is read, opened, accessed, 07:56.000 --> 08:02.000 executed; I want to catch all of this information. So this one, this Linux 08:02.000 --> 08:06.000 Security Module file open hook, which is in a list of 08:06.000 --> 08:10.000 locations that you can attach your eBPF program to, is what 08:10.000 --> 08:16.000 has been used. So now you have 08:16.000 --> 08:20.000 Podman: before starting the container, it calls your binary. 08:20.000 --> 08:24.000 Your binary will be able to load your eBPF program.
08:24.000 --> 08:28.000 In your binary, you will be able to 08:28.000 --> 08:32.000 identify the mount namespace, so you now have a way to determine 08:32.000 --> 08:36.000 which processes are from within your container, and inside the 08:36.000 --> 08:40.000 eBPF program, every time a file is opened, it's an 08:40.000 --> 08:44.000 event, there is a task, and in this task, you can access 08:44.000 --> 08:50.000 the mount namespace. So now every time a file is opened anywhere 08:50.000 --> 08:54.000 on your system, you will get a call, and you can filter 08:54.000 --> 08:56.000 out a call, saying, okay, this is not relevant to my 08:56.000 --> 09:00.000 problem, and otherwise you can just send back your 09:00.000 --> 09:04.000 information. So you receive an event, you see your 09:04.000 --> 09:08.000 file, this file is from within the container, and 09:08.000 --> 09:12.000 eBPF allows you to define some maps, which are data 09:12.000 --> 09:16.000 structures to communicate between kernel space and user space, and you 09:16.000 --> 09:20.000 just stream the data. And in your user program, because you 09:20.000 --> 09:24.000 use the annotation, the annotation value, you just put an 09:24.000 --> 09:28.000 absolute path where you want this data to be, to be 09:28.000 --> 09:36.000 written. So, next, 09:36.000 --> 09:42.000 okay, can I, okay, I may need to 09:46.000 --> 09:52.000 just let me roll. 
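
The filtering step described above can be illustrated outside the kernel. In the talk the check runs inside the eBPF program itself; the sketch below is only a plain Python simulation of the same logic, assuming each file-open event has already been reduced to a `(mount_namespace_id, path)` pair. The function name and the event shape are hypothetical.

```python
from typing import Iterable, List, Tuple


def paths_for_container(events: Iterable[Tuple[int, str]], target_ns: int) -> List[str]:
    """Keep only paths opened from the target mount namespace, de-duplicated in first-seen order."""
    seen = set()
    kept = []
    for ns_id, path in events:
        if ns_id != target_ns:
            # Event comes from another container or from the host: not relevant.
            continue
        if path not in seen:
            seen.add(path)
            kept.append(path)
    return kept
```

The resulting list is what a user-space program would then write to the absolute path given in the annotation.
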
09:52.000 --> 09:56.000 Yeah, we're a bit there, so 09:56.000 --> 09:58.000 you could, I don't have, yeah, thank 09:58.000 --> 10:00.000 you, I don't have much time. So in an 10:00.000 --> 10:04.000 ideal world, you want to have your production container, 10:04.000 --> 10:06.000 or at least you want to reproduce a 10:06.000 --> 10:08.000 production-like use case, because let's say you have 10:08.000 --> 10:10.000 two endpoints: one endpoint is reading a 10:10.000 --> 10:12.000 config file, the other is not. If you only test 10:12.000 --> 10:16.000 the one that is not opening it, not 10:16.000 --> 10:18.000 fully covering your use case, you will just get data which 10:18.000 --> 10:22.000 is not relevant, or not representative of what's 10:22.000 --> 10:24.000 happening inside the container. So you could say, okay, 10:24.000 --> 10:28.000 we could use it in the CI, you could use it with your 10:28.000 --> 10:32.000 end-to-end tests to at least have an idea. 10:32.000 --> 10:38.000 So, for example, we could run, just for 10:38.000 --> 10:40.000 the demonstration, we have the Fedora 10:40.000 --> 10:44.000 image, which is a utility base image, just like 10:44.000 --> 10:48.000 Ubuntu, and it has, like, tons of binaries, there are a lot of things 10:48.000 --> 10:52.000 in it, but that's expected, because it's a utility base image. 10:52.000 --> 10:56.000 What I want to do is, here, no overhead, I'm not using 10:56.000 --> 11:00.000 my annotation, so nothing is happening. I'm just going to 11:00.000 --> 11:08.000 copy-paste the annotation. 
11:08.000 --> 11:14.000 So, now I'm just doing the same, but I'm adding the annotation 11:14.000 --> 11:20.000 and the path where I want the content to be. 11:20.000 --> 11:24.000 Doing the same, let's use some of the binaries that we 11:24.000 --> 11:28.000 have in it, so we can use date, I can use 11:28.000 --> 11:36.000 grep, I can use cat, we can see the profile, 11:36.000 --> 11:40.000 what else could we do, we can use some binaries 11:40.000 --> 11:44.000 here, but I think that's all. We can see now in our 11:44.000 --> 11:48.000 disk folder that we have a profiling file that has been created, 11:48.000 --> 11:52.000 and it's not really nice to read, but I 11:52.000 --> 11:56.000 made a quick UI tool, which allows you 11:56.000 --> 12:02.000 to take this file, and what it does, it just 12:02.000 --> 12:06.000 goes to Podman, dumps the image, 12:06.000 --> 12:10.000 checks all the layers, and creates a tree structure, just a 12:10.000 --> 12:14.000 file tree of your system, and it combines it with the 12:14.000 --> 12:18.000 dump file that you got, and it just tells you if 12:18.000 --> 12:22.000 a file has been opened or not, and tells you what 12:22.000 --> 12:24.000 percentage of the content is used or not. 12:24.000 --> 12:28.000 The percentages use the file size, not the number of 12:28.000 --> 12:32.000 files, so very small files you will not see, but 12:32.000 --> 12:34.000 all of this could be configured.
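
The size-weighted percentage mentioned above can be sketched as follows. This is a minimal illustration, assuming the image's file tree is available as a mapping from path to size in bytes and the profiling output as a set of opened paths; the function name and both inputs are hypothetical and do not reflect how the UI tool from the demo is implemented.

```python
from typing import Dict, Set


def used_percentage(sizes: Dict[str, int], opened: Set[str]) -> float:
    """Share of the image content (by bytes, not by file count) that was opened."""
    total = sum(sizes.values())
    if total == 0:
        # Empty tree: nothing to report, avoid dividing by zero.
        return 0.0
    used = sum(size for path, size in sizes.items() if path in opened)
    return 100.0 * used / total
```

With sizes of 300, 100 and 600 bytes and only the first two files opened, two out of three files are used, but the size-weighted figure is 40 percent, which is why small files barely move the number.
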
So, let's go into 12:34.000 --> 12:38.000 our bin folder, and we can see here that 12:38.000 --> 12:42.000 among all of the binaries, the bash one is obviously used, 12:42.000 --> 12:46.000 because it's our entry point. We used cat to 12:46.000 --> 12:50.000 see the profile, we see date, we see 12:50.000 --> 12:54.000 dircolors, okay, that's used, probably 12:54.000 --> 12:56.000 for something, and yeah, other things like 12:56.000 --> 13:00.000 grep. So with this method you can just see everything that has 13:00.000 --> 13:02.000 been used in your container, and everything which has 13:02.000 --> 13:06.000 not been used. Here also we can't do 13:06.000 --> 13:14.000 it, we can see it, one slide, why is it empty? 13:14.000 --> 13:20.000 Next, this operation is not supported, is it? 13:20.000 --> 13:24.000 But the next slide, the next slide is a 13:24.000 --> 13:30.000 thing, so probably this thing, just, okay, it's not working, 13:30.000 --> 13:32.000 so let's name. 13:54.000 --> 13:58.000 Hello, thank you for that presentation. 13:58.000 --> 14:02.000 Sure. Is this tool already integrated in 14:02.000 --> 14:06.000 Podman? Sorry, I don't know. 14:06.000 --> 14:10.000 Hello, it does, just, it's just very, 14:10.000 --> 14:16.000 is this tool already integrated in the 14:16.000 --> 14:20.000 Fedora image or in Podman? 14:20.000 --> 14:24.000 I don't know, I'm sorry, I can't, I don't see. 14:24.000 --> 14:26.000 Is it already integrated in the Fedora image or in Podman? 14:26.000 --> 14:32.000 Oh no, no, this is just a thing that I work on in my 14:32.000 --> 14:36.000 free time. I linked the repository, and you need to 14:36.000 --> 14:38.000 install it yourself. 14:50.000 --> 14:52.000 Any other questions? 14:56.000 --> 15:00.000 Thank you.