Plenary session
Main hall
20th May 2024
4 p.m.

BRIAN NISBET: Okay. Good afternoon. If you could all take your seats, please, we will kick off with the second Plenary session.

So, my name is Brian Nisbet and along with Clara Wade, we'll be chairing this Plenary session and we have got some really cool talks for you all. If you could please take your seats or take your conversations outside, that is your binary choice. So, first up, we have Kemal from ThousandEyes talking about the surprising effect of 1% packet loss.

KEMAL SANJTA: Hello everyone, my name is Kemal, I work at Cisco ThousandEyes, and it's great to be back at RIPE. It's been a few years since I was here last time, and every single time it gets better and better.

So, today I am going to speak about the surprising impact of 1% packet loss which is the research that we did which we thought might be interesting for specifically this audience.
So, TCP has been with us for a couple of decades at this stage and its intricacies are very well researched. As part of that, we know how it works, we know what the problems are, we know some of the solutions and so forth. In fact it's so well researched that today, if you are going for a network engineering job, it's very likely that one of the interview loops is going to be dedicated to TCP.

So now, as part of all of that experience that we have with the protocol itself and its research, we realised that packet loss has negative effects on flows. At Cisco ThousandEyes we often publish various outage analyses, and in those, more often than not, a spike in packet loss closely correlates with a drop in application performance. It's pretty obvious.

However, even with all of that research and experience that we gained over the years, the impact of packet loss is not something that we quantify. It's hard to quantify the impact of packet loss. And within the network engineering community at large, we tend to look past small amounts of packet loss, say 1% or 2%, especially if it's something that just spikes and goes away, like temporary spikes, right.

Now, if countless tcpdump sessions taught us anything, it's that TCP has multiple different ways to handle packet loss: we are speaking about duplicate acknowledgements, timeouts, explicit congestion notification, selective acknowledgements. But a pretty large part of that is dealt with by congestion avoidance algorithms, and those are going to be one of the main topics today.

So, when it comes to that, given the increased popularity of the Internet and the rapid growth in Ethernet speeds that we have observed over the last few decades, network engineers observed that earlier congestion avoidance algorithms were quite slow at utilising available bandwidth, especially in higher-bandwidth networks, right. Something needed to be done about it, and effectively CUBIC was invented. CUBIC at this time is the default congestion avoidance algorithm on pretty much every single major operating system, be it Linux, Windows, and so on.

So, how it works:
There are a couple of different aspects of a congestion avoidance algorithm that we need to take into consideration, and the first one is congestion window adjustment. CUBIC employs a cubic function, which researchers found has nice characteristics, to adjust the congestion window size. Initially the congestion window increases aggressively during the slow start phase, which is pretty ironic when you think about it: it's called slow start but the window is growing aggressively, but that's what we call it. Then it grows cautiously during congestion avoidance. It reduces the congestion window sharply on detection of packet loss, and when I say it sharply reduces the congestion window, that's probably an understatement, as we'll see.

One of the functions is window scaling: it utilises TCP timestamp measurements of round-trip time, which helps it to estimate the available bandwidth and adjust the congestion window accordingly. The main function is obviously congestion avoidance. Once CUBIC switches to that mode, it increases the congestion window slowly and gradually, probing for additional bandwidth while trying to avoid undue congestion, and once packet loss happens, it reduces the congestion window sharply and then increases it again, adjusting it dynamically.
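For readers who want to see the shape of that function, the following is a minimal sketch of the CUBIC window curve as specified in RFC 8312, W(t) = C(t - K)^3 + W_max. The constants C = 0.4 and beta = 0.7 are the RFC defaults; the function names and the example window of 100 segments are illustrative assumptions and were not shown in the talk.

```python
# Sketch of the CUBIC window growth function from RFC 8312.
# Names and the example w_max are illustrative assumptions.

C = 0.4      # scaling constant (RFC 8312 default)
BETA = 0.7   # multiplicative decrease factor (RFC 8312 default)


def cubic_k(w_max: float) -> float:
    """Seconds needed to grow back to w_max after a loss event."""
    return (w_max * (1.0 - BETA) / C) ** (1.0 / 3.0)


def cubic_window(t: float, w_max: float) -> float:
    """Congestion window (in segments) t seconds after the last loss."""
    delta = t - cubic_k(w_max)
    return C * delta ** 3 + w_max


if __name__ == "__main__":
    w_max = 100.0  # window size (segments) at the moment loss was detected
    for t in (0.0, 1.0, 2.0, 4.0, 6.0):
        print(f"t={t:.0f}s  cwnd={cubic_window(t, w_max):.1f} segments")
```

Right after a loss the window restarts at beta times W_max, flattens out as it approaches the old maximum, and only then starts probing upwards again, which is the "cautious" growth described above.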

What did we test as part of our research? Well, we were interested in real-life scenarios. Unlike bandwidth, which represents the maximum capacity of the channel, throughput represents the performance, or the efficiency, of the data transmission process, and that's what we were interested in measuring; therefore we measured throughput. For this particular research, we took five different Linux hosts, we deployed Ubuntu 22.04 on each of them, and using sysctl we configured them for packet forwarding. Each of these hosts had a 1 gigabit network interface card, and we tested throughput in two different topologies, which we'll show. Given the fact that the devices had a single network interface, we needed to utilise the concept of sub-interfaces to achieve what we wanted to achieve, which meant that some additional configuration on the switch itself was required in the form of VLANs, and for this particular purpose we utilised a Cisco 3750.

And then we measured the throughput using iPerf3.

We had a symmetric and an asymmetric topology. In the symmetric one, the forwarding path is exactly the same as the reverse path. Why did we test this? Enterprise networks today often implement symmetric topologies, and they do so for the simple reason of troubleshooting: if there is a problem on the forwarding path, it's going to show up on the return path at exactly the same point. The other thing is that implementing security features such as firewalls, IDPs and stuff like that is significantly easier in symmetric topologies. However, we also wanted to test in an asymmetric topology, so we had this topology down below, in which the return path is completely different from the forwarding path, and for this purpose we utilised that concept of sub-interfaces and VLANs on that particular switch.

Now, this closely resembles the majority of the flows on the open Internet.

The first thing we did was measure the throughput in those two topologies without any packet loss. So we wanted to establish a baseline, and as part of that, in the symmetric topology we achieved 804.67 megabits per second and in the asymmetric one we achieved 864.13 megabits per second. That's reasonable enough. Those were 1 gigabit interfaces, so it worked as expected.

Given the fact that we utilised Linux, we used tc, the traffic control utility on Linux boxes. tc has capabilities such as shaping, scheduling, policing and dropping, but it also has this enhancement called netem, which stands for network emulation, and which allows us to add delay, packet loss and stuff like that to the packets leaving a specific egress interface. That's exactly what we did.
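As an illustration of the kind of netem configuration described here, the sketch below builds (but does not run) a `tc qdisc add ... netem loss` command. The exact commands used in the research were not shown in the transcript, so the interface name `eth0` and the helper name are assumptions.

```python
# Sketch: build a tc/netem command that drops a percentage of egress packets.
# The interface name and helper name are hypothetical; the command is only
# printed here, because applying it requires root on a Linux host.
import shlex
from typing import List


def netem_loss_command(dev: str, loss_percent: float) -> List[str]:
    """Return the argv for adding random packet loss on dev's egress."""
    if not 0 <= loss_percent <= 100:
        raise ValueError("loss_percent must be between 0 and 100")
    return ["tc", "qdisc", "add", "dev", dev, "root",
            "netem", "loss", f"{loss_percent:g}%"]


if __name__ == "__main__":
    print(shlex.join(netem_loss_command("eth0", 1)))
```

To actually apply it, the returned list could be passed to `subprocess.run(..., check=True)` as root on a test machine.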

So the first thing that we did was to introduce 1% packet loss. On this particular plot, the orange line is the baseline throughput that we achieved, which is 806 megabits per second in the symmetric topology. As indicated by the blue line, we were getting 235.5 megabits per second of throughput at 1% packet loss. So on average, 1% of packet loss causes a 70-plus percent decrease in throughput, which is remarkable, regardless of the topology you are testing.

More specifically, what we achieved is 235.5 megabits per second in the symmetric and 222.49 megabits per second in the asymmetric topology. That means 1% packet loss caused a 70.7 percent decrease in throughput in the symmetric network, while in the asymmetric topology it resulted in a 74.2 percent decrease in throughput, which is pretty remarkable.

We tested up to 10%, and as visible here, the compounding negative effect of packet loss in CUBIC-enabled topologies happens really early on: even at 1 percent we were down 70-plus percent in throughput, which is really remarkable.

This shows two different plots in which we are visualising achieved throughput in these two topologies at various levels of packet loss. Here, the most notable thing is actually this white space in between the orange line and pretty much everything else at the bottom.

So, the next thing that we asked ourselves is whether there is anything else that we can use, and fortunately enough there is BBR. BBR stands for bottleneck bandwidth and round-trip time; it is the congestion avoidance algorithm developed by Google, and it's designed to optimise network utilisation and throughput by continuously probing for available bandwidth. If you think about it, over the last ten years there was this pretty large trend at hyperscalers such as Google to move from vendor equipment towards white boxes. It's not surprising that this came out of Google, given the fact that white boxes usually have high-bandwidth interfaces. However, they have shallow queues, which present some challenges.

So, how BBR works. There are a couple of different things that it does. Most importantly, it estimates the bandwidth by measuring the delivery rate of packets, and it uses the concept of pacing to ensure a steady flow of packets without causing undue congestion. Probably the most important functionality of the algorithm is round-trip time estimation, because it maintains an estimate of the minimum RTT of the connection. This is important because BBR utilises RTT as a sign that a queue is getting filled, and whenever that happens, the algorithm itself scales back when it comes to sending rate and drops ‑‑ maximising the amount of throughput that can be achieved through that particular channel.

And when it comes to bottleneck detection, it identifies the bottleneck link in a path using techniques such as probing for increased delivery rate and utilising RTT feedback, just as I said. Obviously, as any congestion avoidance algorithm does, it controls the sending rate, by maintaining parameters such as probing gain and pacing gain, and it prefers low-latency operation over everything else.
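The model described above can be sketched in a few lines: the bottleneck bandwidth is taken as the maximum observed delivery rate, the propagation delay as the minimum observed RTT, and their product is the bandwidth-delay product that BBR tries to keep in flight. This is a simplification under stated assumptions: real BBR uses time-windowed filters and a state machine that cycles the pacing gain, and the sample values and names below are hypothetical.

```python
# Simplified sketch of BBR's path model (no windowing, no state machine).
# Sample tuples are (delivered_bytes, interval_seconds, rtt_seconds) and
# are made-up values for illustration only.
from typing import Dict, List, Tuple

Sample = Tuple[float, float, float]


def bbr_model(samples: List[Sample], pacing_gain: float = 1.0) -> Dict[str, float]:
    """Estimate bottleneck bandwidth, min RTT, BDP and pacing rate."""
    btl_bw = max(delivered / interval for delivered, interval, _ in samples)
    min_rtt = min(rtt for _, _, rtt in samples)
    return {
        "btl_bw": btl_bw,                     # bytes per second
        "min_rtt": min_rtt,                   # seconds
        "bdp": btl_bw * min_rtt,              # bytes that fit "in the pipe"
        "pacing_rate": pacing_gain * btl_bw,  # bytes per second
    }


if __name__ == "__main__":
    samples = [(100e6, 1.0, 0.025), (125e6, 1.0, 0.020), (110e6, 1.0, 0.030)]
    for key, value in bbr_model(samples).items():
        print(f"{key}: {value:,.3f}")
```

Note that a lost packet does not appear anywhere in this model, which is one way to see why random loss affects BBR far less than a loss-based algorithm.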

Now, when it comes to differences between CUBIC and BBR, we categorise them in a few different areas. The first one is congestion window adjustment. CUBIC adjusts the congestion window based on the cubic function. It reacts pretty sharply on detection of packet loss. When I say pretty sharply, it literally cuts the congestion window size by one third whenever it detects a packet loss. BBR, on the other hand, dynamically adjusts the sending rate based on the bandwidth and RTT estimations, avoiding unnecessary loss.

When it comes to bandwidth estimation, CUBIC relies on packet loss as an indicator of congestion, which is ‑‑ that's pretty much a legacy thing from way back then. BBR, on the other hand, probes for available bandwidth and minimises latency. Latency wise, CUBIC prioritises throughput over latency, as a result of which there is sometimes increased latency under heavy congestion, while BBR maintains low latency by continuously monitoring network conditions and adjusting congestion control parameters accordingly.

Lastly, implementation wise, CUBIC is the default congestion avoidance algorithm on pretty much every operating system these days. BBR was developed by Google for use in their data centres. And as I said, it's not really surprising that it came out of one of these hyperscalers that moved towards white boxes, given the fact that those usually come with shallow queues.

When it comes to enabling it, the good news is that it's fairly easy to enable BBR. It needs to be done just on the sending side of the connection, so only on the sender. In this particular case, we are showcasing a few commands that you need to execute to enable it on Linux. The first one is catting the proc file system to find out what the default congestion avoidance algorithm is; in this particular case we can see that it's CUBIC. Then we are echoing two settings, one for fair queue and the other for the congestion avoidance algorithm, to enable BBR. We apply the changes with the sysctl -p command, and lastly we verify that it has actually been configured successfully by catting the proc file system again. It's quite easy to do on Windows as well; it requires the additional step of a reboot.
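The slide itself is not reproduced in the transcript, so the following is only a sketch of the steps as described: read the current algorithm from the proc file system and produce the two sysctl settings (`net.core.default_qdisc` and `net.ipv4.tcp_congestion_control` are the standard Linux sysctl keys). The helper names are assumptions, and nothing is written to the system.

```python
# Sketch: inspect the current congestion control algorithm and render the
# two sysctl settings described in the talk. Helper names are hypothetical.
from pathlib import Path
from typing import List, Optional

PROC_CC = Path("/proc/sys/net/ipv4/tcp_congestion_control")


def current_congestion_control() -> Optional[str]:
    """Return the system-wide default algorithm, or None if unavailable."""
    try:
        return PROC_CC.read_text().strip()
    except OSError:  # not Linux, or /proc not mounted
        return None


def bbr_sysctl_lines() -> List[str]:
    """Lines one would append to /etc/sysctl.conf before running sysctl -p."""
    return [
        "net.core.default_qdisc=fq",
        "net.ipv4.tcp_congestion_control=bbr",
    ]


if __name__ == "__main__":
    print("current:", current_congestion_control())
    print("\n".join(bbr_sysctl_lines()))
```

After appending those lines and running `sysctl -p` as root, calling `current_congestion_control()` again should report `bbr`, provided the kernel has the BBR module available.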

We did the same thing with BBR as we did with CUBIC. We first established a baseline and we achieved 868.50 for the symmetric topology and 827.20 for asymmetric topology. Which is pretty much in line with what we were getting for the baseline in the topology that was using CUBIC.

Now, this is where a pretty significant difference happens. Once we started introducing packet loss using the netem enhancement from tc, on average 1% packet loss caused only an 8.5 percent decrease in throughput while using BBR, which is in stark contrast to the 70.7% decrease that we saw using CUBIC. So it's an order of magnitude difference.

So, more specifically, on average we were getting 794.06 megabits per second in the symmetric topology and 763.4 in the asymmetric topology. That means 1% packet loss in the symmetric network topology using BBR caused an 8.5 percent throughput decrease, compared to a 70.7% throughput decrease in the same topology while using CUBIC. In the asymmetric network we saw a 7.7% throughput decrease, compared to a 74.2% decrease while using CUBIC. It's an order of magnitude difference between those two algorithms.
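The percentages quoted so far follow directly from the baseline and 1% loss throughput figures given in the talk. A small sketch of that arithmetic, with the figures copied from the transcript and only the variable and function names being assumptions:

```python
# Sketch: recompute the throughput decreases from the figures in the talk.
# Values are (baseline Mbit/s, throughput at 1% loss in Mbit/s).


def percent_decrease(baseline: float, measured: float) -> float:
    """Relative throughput drop versus the baseline, in percent."""
    return (baseline - measured) / baseline * 100.0


RESULTS = {
    ("CUBIC", "symmetric"): (804.67, 235.5),
    ("CUBIC", "asymmetric"): (864.13, 222.49),
    ("BBR", "symmetric"): (868.50, 794.06),
    ("BBR", "asymmetric"): (827.20, 763.4),
}

if __name__ == "__main__":
    for (algo, topo), (baseline, at_loss) in RESULTS.items():
        drop = percent_decrease(baseline, at_loss)
        print(f"{algo:5s} {topo:10s} {drop:.2f}% decrease")
```

The computed values agree with the quoted 70.7%, 74.2%, 8.5% and 7.7% to within rounding.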

This is probably the most striking plot that I'm going to show today. What we are indicating here is achieved throughput using BBR and CUBIC. BBR is shown in blue and CUBIC in orange. It almost looks like we are comparing CUBIC at the baseline and at 1% packet loss, which is not the case. In fact, these are two congestion avoidance algorithms applied in the same topology at the same level of packet loss, which is pretty remarkable.

So, we also tested up to 10%, and the interesting thing to note here is that even at 10% packet loss, we were getting 751 megabits per second regardless of the topology type that we were testing in, which indicates that even at 10% packet loss BBR is so much more effective at dealing with packet loss compared to CUBIC at any percentage of loss.

When it comes to production testing, there are some really good articles that were published by different vendors out there. For example, Dropbox published a document on the performance of BBR in a single POP. They effectively went and enabled it in one POP, which is the NRT POP, and interestingly enough, they compared performance between BBR Version 1 and BBR Version 2. Now, at RIPE 76, Geoff Huston from APNIC spoke about some of the problems that were observed with BBR v1: BBR v1 was crowding out other sessions, such as CUBIC and Reno enabled sessions, which speaks to the unfairness of the protocol, and it was actually a significant problem. So if you mixed BBR with Reno and CUBIC, it would lead to starvation of the other flows.

When it comes to the results that Dropbox published from their production-level tests, they indicate that bandwidth allocation in BBR v2 was scaled down, which made it much more fair compared to BBR v1. And, you know, even though we are speaking about a smaller amount of throughput achieved, that's actually a positive thing, given the fact that it was more fair to the rest of the flows.

There were other problems with the protocol, such as the fact that Version 1 did not support ECN, explicit congestion notification; Google went ahead and fixed that by implementing DCTCP-style ECN in the protocol. There were also performance problems with aggregated links. But that was all pretty much put to bed in Version 2 of the protocol.

So, the Dropbox test also compared production performance with CUBIC and Reno, and similarly to what we observed in our research, their results are pretty remarkable as well.

So the overall results indicate production readiness. Spotify went ahead and tested the protocol on a subset of their customers. The good thing here is that, as I said previously, it's fairly easy to enable BBR, and you can do it at the operating system level, as we did in this particular research, or you can actually do it per session. That's exactly what the Spotify guys did: they just enabled it on a subset of the already established sessions. And their results also indicate production readiness.
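On Linux, per-session selection is possible through the `TCP_CONGESTION` socket option. The sketch below shows that mechanism only; the transcript does not say how Spotify implemented it, so this is an assumption about one possible approach, and the helper name is hypothetical.

```python
# Sketch: choose the congestion control algorithm for a single TCP socket.
# Uses the Linux TCP_CONGESTION socket option; on other platforms, or when
# the requested algorithm is not available, the helper returns None.
import socket
from typing import Optional


def use_congestion_control(sock: socket.socket, name: str) -> Optional[str]:
    """Try to set the algorithm on sock; return the one in effect or None."""
    if not hasattr(socket, "TCP_CONGESTION"):
        return None  # option not exposed on this platform
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION,
                        name.encode("ascii"))
        raw = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, 16)
    except OSError:
        return None  # e.g. BBR module not loaded or not permitted
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("algorithm in effect:", use_congestion_control(s, "bbr"))
```

Because only the sender's algorithm matters, a server could apply this to a fraction of its accepted connections and compare the results against the rest.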

Google, well, there is no surprise there; after all, they invented the algorithm itself. Google went ahead and publicly announced that they are using it for all of their flows in GCP. They are using it at Search. If it's ready for them, I guess it's pretty much ready for anyone. Google also spoke about working closely with Netflix on porting the code. That was back in Version 1; by this time that has been done. There are multiple different reports that indicate that Netflix is using BBR, but Netflix itself never published any kind of article indicating that explicitly.

When it comes to networking vendors, Cisco Catalyst SD-WAN, as part of its TCP optimisation capability, enables BBR on both ends of SD-WAN tunnels, which is really interesting because in this particular case you are getting the benefits of BBR bidirectionally, and that's really the proper way to go.

So in conclusion, what we saw is that even the smallest amount of packet loss has extremely negative consequences on throughput. At 1% we saw a 70-plus percent decrease in throughput, regardless of the topology it's configured in.

This indicates the importance of monitoring and addressing even the minor levels of packet loss, especially given the fact that CUBIC, as of right now, is the congestion avoidance algorithm by default enabled on pretty much every single modern operating system, and we know that these are the things that network operators don't usually change.

Packet loss outcomes differ significantly based on the congestion avoidance algorithm, as shown in this research. We saw that at 1% we had a 70% decrease in throughput using CUBIC and only an 8.5%, or even 7%, decrease using BBR, which is an order of magnitude difference.

And BBR shows significantly better results at any packet loss percentage. It's interesting to note here that it was so much better at dealing with loss that at 10% it outshone CUBIC at 1% packet loss by an enormous margin.

So the last question is what's preventing you from deploying BBR either for testing or production?
Thank you so much.

AUDIENCE SPEAKER: Jen Linkova. Someone who hasn't been paying attention to transport protocols. I am curious ‑‑ so you tested it with packet loss, have you tried adding jitter? Because I expect ‑‑ I remember that some time ago BBR used to have a problem with that. So, it would be interesting to see how both of them behave, and are there any conditions where jitter would actually affect BBR badly?

KEMAL SANJTA: I haven't tested it with jitter. I haven't played with latency, where it really shines; latency is really where BBR outperforms everything. The purpose of this test was to test BBR in a very controlled environment to exclude noise. So, we actually kept the complete environment completely vanilla, so to say, just to test how it's going to perform in this isolated case. But that's something that we could research for sure.

JEN LINKOVA: Because I was thinking, I would expect two cases of jitter: one is if you have it in the infrastructure, and second, your latency might actually start to change, say you have LSP optimisation, traffic going over other paths, your latency will change and so on. Thank you, I am looking forward to another presentation.

KEMAL SANJTA: The other thing that I think, when speaking about jitter, is that it would actually be quite interesting to see the effects on choppy voice in voice-over-IP applications using BBR. So it would be quite interesting for potentially any kind of researcher in this room to figure out the best methodology to determine the direct correlation between jitter and choppiness in voice-over-IP applications using CUBIC and BBR.

AUDIENCE SPEAKER: Alexander Azimov. So, I have one comment and one question. You mentioned that to enable BBR on Linux you have to use two knobs, one that I think ‑‑ and another one to change the congestion control. In modern Linux kernels you need to change only one, because basically pacing was moved into the TCP stack itself, so in a modern kernel, to enable BBR you just need to change the congestion control setting. That's all.

And the second question is: Have you measured the performance of BBR 3? Because 2 is already outdated.

KEMAL SANJTA: We haven't. We kept the default kernel that was deployed with 22.04, and that was done on purpose. We just wanted to see it like ‑‑ so there are other kernel enhancements with the newer kernels, obviously, BBR v3 being just one part of the story. But also, if you think about other measurement tools, such as ss -tni, for example, to see sockets and the performance of the sockets, you get significant benefits of doing that. We wanted to see the out-of-the-box performance given the default kernel version in Ubuntu 22.04.

AUDIENCE SPEAKER: Just another comment. I will be very glad to see your measurements for version 3 in both flavours, because it has two flavours, for that environment and for external connectivity. Thank you for your effort.

AUDIENCE SPEAKER: Daniel Karrenberg: Thank you very much for providing measurements. You only know things when you measure them. What I'm curious about is the interactions between multiple flows using different congestion control protocols. Are you aware of any work where you basically have a link that's shared by multiple flows and the link has degradations, like packet loss and jitter, or are you planning to do work like that?

KEMAL SANJTA: For sure we can do that. The Dropbox research that we referenced during this talk actually has exactly that. It was known that BBR Version 1 had the problem that it used to crowd out Reno and CUBIC flows, and it was really bad, right. It would just completely shatter those flows. So that was one of the most important improvements in BBR Version 2, and it's already taken care of. When it comes to flows, it still significantly outperforms CUBIC, as I have shown here, but at least it's much more fair when it comes to bandwidth allocation; it does not kill them.

DANIEL KARRENBERG: It would be really nice to have published research on BBR 2 and 3 because, yeah, we can all be selfish, but you crowding out the other people is not quite neighbourly.

KEMAL SANJTA: Perhaps that's a good talk idea for the next RIPE.

DANIEL KARRENBERG: By all means, please.

AUDIENCE SPEAKER: Antonio Prado: Congrats on your research. I would just like to know if you considered, or maybe used, the Mathis formula, which relates to the probability of losing a TCP segment, during your test?

KEMAL SANJTA: Thank you for your question, it's a good one. I think Mathis actually contributed to the development of BBR. Other notable people include Van Jacobson, who is the original one as far as I know. I haven't applied the formula to BBR. I know that it has ‑‑ so, at the start of the talk we actually spoke about the CUBIC congestion window being maintained by the cubic function, and I suspect that the Mathis formula applies there, given that mathematical probability. I'm not sure how well it's going to translate to BBR, but that's something to check.
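For context, the Mathis formula bounds the throughput of a classic loss-based TCP flow as roughly (MSS / RTT) x (C / sqrt(p)), where p is the loss probability and C is a constant close to 1.22. The sketch below evaluates it; the MSS and RTT values are hypothetical, since the round-trip time of the lab setup is not given in the transcript, and the model was derived for Reno-style congestion avoidance rather than for CUBIC or BBR.

```python
# Sketch of the Mathis et al. throughput model for loss-based TCP:
#   rate <= (MSS / RTT) * (C / sqrt(p))
# The MSS and RTT used in the example are assumptions, not lab values.
import math

MATHIS_C = math.sqrt(3.0 / 2.0)  # about 1.22, the constant from the model


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss_prob: float) -> float:
    """Upper bound on throughput in bits per second."""
    if not 0 < loss_prob <= 1:
        raise ValueError("loss_prob must be in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (MATHIS_C / math.sqrt(loss_prob))


if __name__ == "__main__":
    for p in (0.001, 0.01, 0.1):
        mbps = mathis_throughput_bps(1460, 0.001, p) / 1e6
        print(f"loss={p:.1%}  bound={mbps:.1f} Mbit/s")
```

The key property is the 1/sqrt(p) dependence: quadrupling the loss rate halves the bound, whatever the MSS and RTT happen to be.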

BRIAN NISBET: I just have one, it's a comment, not a question from Meetecho from Randy Bush.
"BBR competing with BBR and competing with CUBIC tells a much more nuanced story."

KEMAL SANJTA: Sure, sure it does. So, BBR versus CUBIC, Version 1 to 2: as I said, Version 1 was really unfair to the rest of the flows. Like, it was just crowding them out. With BBR Version 2, we saw that it achieves lower throughput, but in this particular case, and this is one of these rare occasions in our world where less is better, we saw less achieved throughput but it's much more fair to CUBIC and Reno. Awesome to see that Mr. Bush is looking at these meetings.

CLARA WADE: All right. Thank you, Kemal.

And next up is Asbjorn with hardware-offloaded IP forwarding in the NIC.

ASBJORN SLOTH TONNESEN: I will talk about some stuff I have done with offloading the routing of normal Internet access traffic to hardware.

So, we are a smaller ISP in Copenhagen, where we do FTTB. We have normal 10 gig access switches in the basement and we have VLANs going out from there.

And we have 35,000 connected at this time.

So if you look at the small model of a simple ISP network, then we have some PNIs, IXPs and transits, where in this setup we use a one-armed router. So we have it going into the Layer2 switch and then up to a router, because router ports are expensive. And then we have an internal Layer3 cloud. Then users are connected, and we also have internal caches.

So we used to have some big equipment. Then we replaced it with some other big equipment. And then we just did a soft router, and we could get away with that for some time. Then in 2021 we upgraded the NIC, and we started doing some static offloading.

So, this is just a short slide about what a one-armed router is, if anyone is unfamiliar. Basically, we have all the different external links going into the router as VLAN tags. The router basically just shuffles the packets around, and therefore we can get away with just having one port on the NIC. So, we have this feature that NVIDIA developed as part of a solution for OVS, containers and so on: when you have virtual ports on the NIC with SR-IOV, each of which then has its own hardware queue, you can programme the NIC to say that this IP address or this MAC address should go straight into this queue for this virtual machine, and thereby bypass the hypervisor and go straight to the virtual machine.

So, this is the main use case for this feature set. However, it's also possible to use it for sending the packet back out of the network port again. In future, it might also be possible to go between the ports if there is a dual-port NIC; at the moment it's not possible, but there are some hints that NVIDIA might make that possible in the future. It's possible, but each driver can limit it if they want.
So this is done using a default Linux API that they have upstreamed, and that's done through Linux TC, traffic control, as we also saw in the earlier presentation.

So, TC contains chains and priorities, and each rule can also have handles, so you can have multiple versions of the same rule installed at the same priority in a certain chain. The hardware offloading can be installed in three different ways. It can be installed opportunistically, which is the default, or it can be skip software or skip hardware.

So, in the opportunistic case, it tries to install the rule in hardware, and if it cannot, if the hardware says it's not supported, then it will just do it in software.

And then the skip software version is where it's only installed in hardware, and if it cannot be installed in hardware, then the insertion command will just fail and give you an error saying why it couldn't be installed.

And then the skip hardware part is where it's only done in software.

The opportunistic part is also used for some of the offloads. For instance, if you have some tunnel encapsulation offload, then the hardware might not be able to do it on the first packet, but then it will fall back to software and process it there, and then it will make sure that the hardware has the rest of the components so that it can do it on the second packet.

So, that's the three main points of the different types.
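The three installation modes can be summarised as a small decision model. This is a sketch of the semantics as described in the talk, not of the kernel implementation; `skip_sw` and `skip_hw` are the names of the corresponding tc flags, while the enum, function and exception names are assumptions.

```python
# Sketch: where a TC rule ends up for each offload mode, depending on
# whether the NIC driver accepts it. Python names are hypothetical.
from enum import Enum
from typing import Set


class OffloadMode(Enum):
    OPPORTUNISTIC = "default"  # no flag given
    SKIP_SW = "skip_sw"        # hardware only
    SKIP_HW = "skip_hw"        # software only


class OffloadError(RuntimeError):
    """Raised when a hardware-only rule is rejected by the driver."""


def install_rule(mode: OffloadMode, hardware_accepts: bool) -> Set[str]:
    """Return the datapaths ('hw', 'sw') in which the rule is installed."""
    if mode is OffloadMode.SKIP_HW:
        return {"sw"}
    if mode is OffloadMode.SKIP_SW:
        if not hardware_accepts:
            raise OffloadError("driver rejected rule and software is skipped")
        return {"hw"}
    # Opportunistic: the software path always has the rule, and the
    # hardware gets it too whenever the driver supports it.
    return {"sw", "hw"} if hardware_accepts else {"sw"}


if __name__ == "__main__":
    for mode in OffloadMode:
        print(mode.value, "->", sorted(install_rule(mode, hardware_accepts=True)))
```

In practice `skip_sw` is useful during development, because an unsupported rule fails loudly instead of silently falling back to the CPU.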
It's very vendor agnostic-ish, because you have to look at the source code to see what is supported in each driver. It is mostly consistent, but as always there are differences in the limitations, unfortunately.

And for instance, some drivers, they only allow you to have off load rules in chain 0 and otherwise they will just give you an error.

So, just to give a more graphic overview of how these chains and rules tie together: in a rule, you can go to another chain and continue processing there. I haven't seen any performance degradation from doing that, but it might differ with other hardware; I have only tested on NVIDIA cards. You can also drop a packet, or you can trap it, which just means send it to the CPU, or you can redirect, which is used for sending it back out of the incoming port or sending it out to another queue for a virtual port.

So, if we are going to use this for forwarding, what do we actually need to do in these rules? We need to change the VLAN tag so that it goes back out as part of another Layer2 network. We need to change the MAC addresses. And we need to decrement the TTL. Fortunately there are commands for all these things. The checksum update is optional in most of the implementations; it's not optional in the software path, but in the hardware path, in my experience, it's optional most of the time because it's always done either way.

And then you can push it back out with a redirect target.
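To make the per-packet work concrete, here is a small model of the header rewrite a forwarding rule has to perform: swap the VLAN tag, rewrite both MAC addresses and decrement the TTL. It models the effect only, not the TC actions themselves; the field names, VLAN IDs and MAC addresses are made up for illustration.

```python
# Sketch: the header changes needed to forward a frame on a one-armed
# router. All values and names are illustrative assumptions.
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    vlan_id: int
    src_mac: str
    dst_mac: str
    ttl: int


def forward(frame: Frame, out_vlan: int, router_mac: str, next_hop_mac: str) -> Frame:
    """Return the frame as it should leave the port again."""
    return replace(
        frame,
        vlan_id=out_vlan,      # move to the Layer2 network of the next hop
        src_mac=router_mac,    # the router becomes the Layer2 sender
        dst_mac=next_hop_mac,  # the next hop becomes the Layer2 receiver
        ttl=frame.ttl - 1,     # a router must decrement the TTL
    )


if __name__ == "__main__":
    incoming = Frame(vlan_id=100, src_mac="02:00:00:00:00:aa",
                     dst_mac="02:00:00:00:00:01", ttl=64)
    print(forward(incoming, out_vlan=200,
                  router_mac="02:00:00:00:00:01",
                  next_hop_mac="02:00:00:00:00:bb"))
```

For IPv4 the TTL change also alters the header checksum, which is why the checksum update step comes up in the talk.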

So, if you look at how these rules look: this is a simple example of how to add a TC flower rule, where it matches on the VLAN packet and changes the things we have just talked about. It changes the VLAN tag, it updates the MAC addresses, it decrements the TTL and then updates the checksum, because this is IPv4, and then it redirects it out of the egress port of the incoming device.

If we then look at how this looks when we show this rule, we can see on line 6 that it's in hardware, and that it has a reference count of 1. If we had bonding and the like, then we could have it installed in multiple places.

And we see these three actions. And then if we look at the statistics for this

That's the next one. On the previous one, the actions were compressed into single lines to fit on the slide. Here, they are expanded so you can see what the actual output of the command is, with the details of these actions, how it does the MAC replacement and so on.
This is the one where it shows the statistics output, where we can see that it has access to the hardware counters of how many packets and bytes this rule has been applied to. You could also use this for prioritising which rules are installed, or for removing some if they are not needed, if you would rather do some processing in hardware and some in the software path.

We have been using this for three years or something like that, where we have done this purely statically, because all of the addresses in our own address space are known. We know that the next destination is a Layer3 router, so that has a static address. We know that if there is power to the system, then that's always going to be true. And therefore, we could do this statically for some time, for half the traffic, or half the packets.

And so, this is where we just set up the rule set in this manner, and simply match: if it's going towards our own network space but it's not used for any link nets, then we can just send it directly to the Layer3 router without going through the CPU.

We ‑‑ so the thing we do here with TTL is that if the TTL is expiring, we do send it to the CPU so that CPU can send back an ICMP packet.
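The static match logic described here can be sketched as a small decision function: offload packets destined for the ISP's own address space, except for link nets, and leave anything with an expiring TTL to the CPU so that it can answer with ICMP. The prefixes below are documentation ranges standing in for the real ones, and all names are assumptions.

```python
# Sketch: decide whether a packet takes the hardware path or goes to the
# CPU. Prefixes are documentation ranges used as placeholders.
import ipaddress

OWN_SPACE = [ipaddress.ip_network("198.51.100.0/24")]  # hypothetical customer space
LINK_NETS = [ipaddress.ip_network("198.51.100.0/30")]  # hypothetical link net


def datapath_for(dst_ip: str, ttl: int) -> str:
    """Return 'hardware' if the packet can be offloaded, else 'cpu'."""
    addr = ipaddress.ip_address(dst_ip)
    if ttl <= 1:
        return "cpu"  # TTL expiring: the CPU must generate the ICMP reply
    if any(addr in net for net in LINK_NETS):
        return "cpu"  # link nets are handled by the software router
    if any(addr in net for net in OWN_SPACE):
        return "hardware"
    return "cpu"      # everything else still needs a full route lookup


if __name__ == "__main__":
    for dst, ttl in [("198.51.100.50", 64), ("198.51.100.1", 64),
                     ("198.51.100.50", 1), ("203.0.113.5", 64)]:
        print(dst, ttl, "->", datapath_for(dst, ttl))
```

In the real rule set the same ordering would be expressed through rule priorities, with the TTL and link-net exceptions matched before the broad own-space rule.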

Because we could not really do this any more, or it didn't scale well enough, we wanted to also be able to do it on some of the outbound traffic, where we can't just do it statically and where we needed BGP integration with it. So we have developed a small daemon that can talk netlink and can take in all of the knowledge from the host system, with the interfaces, neighbours and routes, and match it all up into a TC rule set. We have written it so that it only does that and just talks to netlink, and it learns the routes from a separate routing table, so that we don't have to offload the full table but can just offload a minimal set, or we can do it in stages, so we still have some control over how much we are depending on it and what can go wrong at what time.

So, with this, we have used BIRD (all of our BGP is running BIRD, by the way), and we have set that up to inject a small subset of the routes, so that all of the hot paths can get hardware offloaded.

So, if we have a block diagram, then we get all the link, neighbour and routing table information into the daemon, and then it can generate the rule set, and it has BGP and BIRD feeding into it. We will look at the BIRD configuration for this.

Then we have all the BGP sessions feeding into the main table in BIRD. For this example, I have just done it for one address family; you can think of it as being duplicated, because we have it for both v6 and v4. All the BGP sessions and the router sessions go into the BIRD table, and they also go to the main kernel table, so the router can do the full table and know where everything is. Then we have a pipe protocol in BIRD that has a filter for which routes should go into the offload table in BIRD, and then we synchronise the offload table into the kernel with netlink, so that the daemon can then take the table back out again from the kernel and feed it back into the kernel with TC.

In order to get stability, we actually have two pipe protocols, just so that we can update the filter on one and refresh that one before we refresh the other one, so that we don't have flapping routes in the offload table.

So, I was a bit puzzled when I first tried to deploy this. I saw that the more rules we put into the hardware offload, the more the performance dropped. This graph is just showing the packets processed in software. So the rules were being processed, but we got higher CPU usage on the software path: all of the packets that were not processed in the hardware used more CPU, because whenever a packet comes into the Linux kernel, it checks a flag for whether a rule should be used, or whether it should skip the rule because it says skip software. But that flag was on the individual rule, so the kernel had to do all of the matching in software to figure out which rule it should look at the flag for.

So, what I have done in Linux 6.10, which is coming out in July, is that I have made a bypass, so that if the number of rules installed in the kernel is the same as the number of rules that have the skip software flag, then it can be assumed that any rule will have the skip software flag, and therefore the software path can be bypassed.
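The counting idea behind that bypass can be modelled in a few lines of Python. This is a toy model of the idea as described in the talk, not the kernel code: the `Rule` and `Chain` classes and their fields are hypothetical names, and matching is reduced to a string comparison.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    match_dst: str   # simplified match key (hypothetical)
    skip_sw: bool    # True if the rule is hardware-only


@dataclass
class Chain:
    rules: list = field(default_factory=list)
    skip_sw_count: int = 0

    def add(self, rule: Rule) -> None:
        self.rules.append(rule)
        if rule.skip_sw:
            self.skip_sw_count += 1

    def can_bypass(self) -> bool:
        # If every installed rule is skip-software, no rule can ever apply
        # in the software path, so matching can be skipped entirely.
        return len(self.rules) == self.skip_sw_count

    def classify_in_software(self, dst: str):
        """Return (matching software rule or None, rules compared)."""
        if self.can_bypass():
            return None, 0
        compared = 0
        for rule in self.rules:
            compared += 1
            if rule.match_dst == dst and not rule.skip_sw:
                return rule, compared
        return None, compared
```

Without the bypass, a packet that ends up matching nothing in software still pays for a comparison against every rule; with it, a chain of only hardware rules costs nothing on the software path.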

And then we could reach 3 million packets per second on this software setup.

The reason that the graph still goes down around 70 rules here is that this is the point where the hardware cannot keep up any more.

So if we look at the scalability of how many hardware rules we can install, and at the hardware forwarding performance: in this test I was only able to generate around 37 million packets per second, and therefore the graph tops out at 37 million packets per second. These rules were installed in a single chain, with a lot of rules that don't match, at different priorities, and then a final rule that matches. It's clear that it's expensive for the forwarding hardware to go on to the next rule, next rule, next rule that doesn't match.

But if, instead of doing this with priorities, I did it with handles, then it could keep up. I haven't done the full testing on this; this graph was just made to be similar to the previous slide. It's limited to 100, but I have gone up to around 900,000 before I saw it starting to drop, and I need to do more testing on that to see if there's some cache stuff playing into it.

But, yeah... so the numbers can go higher. The issue is that on these Mellanox cards, whenever the NIC runs out of memory, it does DMA over PCI Express to the host. So it will report to the kernel that it can do the full 32‑bit range in rules, handles, chains and so on, but, in reality, there is a performance cost at some point. I haven't had enough time to find what the top end of it is, because those tests were running quite slowly when I had to set up so many hundreds of thousands of rules.

So, future steps for this code base are to get some better tooling, so it's easier to dump the netlink state of a production server and then simulate the rules that would be generated if you were to deploy it on that production server.

Some other things: I haven't done reverse path filtering on it yet. In our setup, we do that at an earlier stage, and therefore we don't need it right at the edge. We also haven't done offload of directly connected hosts, but we want that for the setup in our hackerspace, so that will come soon. Then we also need to use TC handles so the rules can scale better.

And then we also need ECMP support. NVIDIA had a proposal at Netdev in 2020 to introduce a new flower API for ECMP called hash, but there was some pushback on the mailing list for that, and therefore they have more or less dropped it at this point.

We'll see if we can get that resurrected. Beyond that, we also need some API support for MTU differences, because we have some Internet Exchange connections that have a higher MTU, and the routing needs to know if it's safe to throw a big packet out to a small VLAN, but at the moment there is no way in TC flower to check the size of a packet. So, we have a small project of getting that in there.

And then we also have stuff like verifying that bonding works. From the code that I have written in the kernel, it should just work, but let's see.

So the setup that we have planned in our hackerspace is based on one of these thin clients; they might be familiar. It has this, which is meant for some Wi‑Fi or something, but it can be converted to talk to an x8 dual 25 gig NIC. Since this client normally uses 3 watts and can use 5 watts, and then there is 7 watts for the NIC and 1 watt for a fan or something, we can reach something like 25 gig wire speed routing at around 15 watts of power usage. So, I think that's nice for running in a hackerspace, or if you have a connection or something that you need a router for.

So, that's more or less it. Patches are of course welcome. This is one of the first implementations of this API outside of iproute2. The next update on this will be a blog post about the setup in our hackerspace when we get that done. Hopefully that will be around the release of Linux 6.10 in mid‑July. And then I'll have a talk at BornHack as well to update on that.

BRIAN NISBET: Okay. Thank you very much.

So, do we have any questions? We do.

AUDIENCE SPEAKER: Hi, Ben from bgp.tools. I didn't quite catch the initial motivation for doing this. Is it energy efficiency? I am curious because it mostly informs my next question. Have you looked into things like VPP, which is not quite hardware offloaded but gets quite close to hardware offloaded?

ASBJORN SLOTH TONNESEN: Sure. So we started doing this statically in 2021 or something like that, and we have had the idea of doing it this way since then. Right now we use these ConnectX‑6 cards, but with newer versions we can also do 400 gig or 800 gig, so it scales quite well. And it is a bit of an underappreciated feature of these cards, because there is more or less no documentation. The only available thing is the OVS documents, which then also have some examples of TC commands that have never worked, like wrong keywords and stuff like that.

AUDIENCE SPEAKER: It's probably also worth pointing out that the Mellanox switches, or they are now NVIDIA switches, also have these same functions. I am a user of the exact same TC flower APIs as you are now. Not on the routing side... they are particularly good.

ASBJORN SLOTH TONNESEN: There are two different main drivers: mlx5, which is for these ConnectX‑6 cards, and then there is the switch driver, which follows some of the same things. It's a bit different what is supported in the flower API in each of them. But... it is definitely power consumption, so that we can do it without spending too much on electricity.

AUDIENCE SPEAKER: That makes sense. Cool. Thank you.

AUDIENCE SPEAKER: Hi. Michael. Thank you, it's quite an interesting presentation. Actually, the PC router that we build using BIRD, or maybe another platform like feeat power or something else, is also quite popular now in Indonesia. My question is: how is the resilience of the router itself? You mentioned before that you started in 2021; how is the uptime of the router, and how is the latency? Is it pretty good, or are there maybe some issues with the BGP updates sometimes? How long will it take if we have the full routing of the BGP table? And one more thing, since it's an interesting topic: how big is the potential traffic that we can carry, based on the kernel limitations? I mean, if we don't care about the hardware, what is the potential of this kind of project?

ASBJORN SLOTH TONNESEN: When I did the testing for the benchmarks here, it was before someone found a lock contention issue in the bypass patch, which has been fixed in the kernel since then. Therefore it was a bit slow, and that's why I haven't done all the benchmarks yet on how many hundreds of thousands of rules I can install using handles. There is no information about these kinds of things, so it's good to do a presentation about that. I have seen some papers that say that you can install quite a lot of rules quite quickly, but I couldn't really reproduce that. But then someone also found a bug in the code. It might be fixed now. I haven't done the testing yet.

BRIAN NISBET: Okay. We have time for really only I think one question, which was over here first, sorry. You'll be around. You can ask for questions.

AUDIENCE SPEAKER: Sebastian. Have you also looked at the NVIDIA BlueField NICs? They offer something very similar, which they call host‑based networking, where they basically do something similar and offload to the same internal ConnectX NIC, but it's proprietary. Your stuff is obviously open source and nicer.

ASBJORN SLOTH TONNESEN: Basically, BlueField is just an embedded computer that has eight Arm cores, and then it has the same NIC. So it has a ConnectX‑6 Dx embedded in BlueField 2, and BlueField 3, I think, has a ConnectX‑7. But it is just an embedded computer that runs Linux in the BlueField NIC as well. So you might be able to manage it from Windows and run Windows in the main host. I don't know.

AUDIENCE SPEAKER: Have you looked at what they are doing? Are they using flower or...?

ASBJORN SLOTH TONNESEN: They are probably using flower underneath, because it is just straight up the same NIC, so they are using those kinds of APIs internally. It is just Linux running on the Arm, and therefore, because Linux supports the flower API, they can do it on that NIC as well.

BRIAN NISBET: Okay. Thank you very much.

So, now we move into our lightning talks, and remember there may or may not be time for questions and answers after these. And please, Lai Yi from M‑Lab.

LAI YI OHLSEN: I am going to start by lowering the mic because ‑‑ hello. Hi everyone. I am here on behalf of Measurement Lab, where I lead the research and data programme, and today I'm going to be talking about how you can measure your network openly with Measurement Lab.

So, the main takeaways. If you walk away from this with anything, know that, 1, M‑Lab is seeking new server‑side vantage points for our open global measurement platform and, 2, our new options make it easier to contribute than it has been in the past.

First of all, just to give a bit of background about M‑Lab for those of you who are not familiar.

So our mission is to measure the Internet, save the data and make it universally accessible and useful. We were founded in 2008 as a way to think about how to scale Internet research from the so‑called relevant Internet. We run a platform of servers globally, in interconnection or off‑net networks and Cloud networks worldwide.

On these servers, we host what we call measurement services, each of which measures a different aspect of Internet performance. Our most well known and used service is the Network Diagnostic Tool, or NDT, which measures single TCP stream bulk transfer capacity. This is more commonly known as a speed test, which I have feelings about, I'm sure we all have feelings about; there are multiple definitions of speed, of course, but this is the way that NDT measures it. NDT is integrated into Google search, and from that we are able to collect a lot of different measurements from different users all around the world. With all of that data, we collect it, archive it and publish it for open use in BigQuery, so it's free and open to access. As of today we have about 6 billion rows of measurement data, and about 4 million new measurements come in per day. I should also mention that for every NDT test we run a traceroute from the server back to the client, and in the next year or so we are going to launch a reverse traceroute being run as well. So, lots of data.

This data is then used by researchers and policy makers. One example of this is that one of the telecommunications agencies in the US, NTIA, integrates NDT into their Indicators of Broadband Need map. This is alongside Google and Microsoft, to demonstrate areas that need more investment. There are also smaller, regional, community‑led organisations that are using M‑Lab's data and tools to create maps of their communities and use data‑driven evidence of where connectivity is insufficient.

So, there are also a number of academic publications that cite M‑Lab data. It may be interesting in relation to the first presentation: we ran CUBIC for some time, until 2020, and then we transitioned to BBR v1, so there are potentially some interesting comparisons to be done there. We have a tool actually called, I don't have a slide for it, Measurement Swiss Army Knife, that allows you to configure the congestion control algorithm as well as the number of streams. So there are some interesting opportunities there as well.

That was just a crash course in M‑Lab. It sort of pains me not to go into a lot of detail about some of those things, but I'm going to focus on the evolution of the platform and, as I mentioned, the different opportunities for contributing infrastructure.

So, again, measurements come from all around the world. They come from integrations with our open source clients, and they test through our servers. This map shows the geographic element of that, where the servers are geographically, but of course the question is also: where do they sit within the network? Because, as we know, that fundamentally changes what we are measuring.

Historically, as I have mentioned, we have placed our servers in interconnection points, or in off‑net networks outside of the user's access network. In 2022, we started measuring in Cloud networks, and we did this primarily to provide an updated perspective of the user's experience of the Internet and pathway to content. So, with these new options that I'll be talking about for contributing vantage points, we are now looking to measure in what we would call both on‑net and off‑net networks, again with the goal of diversifying the vantage points that we are providing and the vantage points that are essentially representing the Internet within our data.

Our original motivation to place servers in interconnection points was to measure the degree to which ISPs are peering with one another, and we still remain invested in measuring that component of the network. But we are also motivated to provide new vantage points and a more diverse representation of the user's experience, and to be able to compare and contrast pathways. I am particularly interested in the idea of being able to measure from a single user to different points in the network, and maybe being able to identify bottlenecks in that way.

Here's an overview of the options that are now available for contributing a vantage point. I'll go through these. Essentially there are now options for M‑Lab managing the hardware and the deployments, and then there are host‑managed options, and within both of these it can be physical or virtual within a Cloud network, full scale or minimal.

So, the full scale deployments are sort of our bread and butter. This includes four servers and a switch, IPv4 space, IPv6 space and a 10 gigabit uplink. This is historically, again, managed by M‑Lab, and this is how we built out the initial platform. This is still available since 2019, but this is one of the options now instead of the only one.

As I mentioned, we are also now offering Cloud deployment, so if you wanted to contribute a server within a Cloud network, if you are a Cloud network, you could donate credits and allow us to run servers within your network that way. This is also a good option because we often get contacted by folks who want some representation in a specific geographic region, and with this option they can just sponsor a Cloud server in a network that has a presence in that region.

So this is a pretty straightforward way to contribute to the M‑Lab network and have more presence in a Cloud network of choice. And for us, we're interested in diversifying the Cloud networks that are represented, and in being able to have, again, a representational view of the user's perspective of the Internet.

Now, this I am particularly excited about. This is a long time in the making. This is a minimal site deployment option. We would get interest from networks that want to have their network represented in our dataset but are not able to provide the full deployment. We have now enabled an option where you can just donate a single server, so this is a lot less of a lift, and it is particularly relevant to networks in more developing environments, where it's just going to be more resource intensive to donate four servers and a switch. You can donate a single server, and this can be managed by M‑Lab or host managed.

Lastly, this is also a development where basically if you are already running an NDT server or any one of the measurement services, you can continue running that yourself, but that can be pushing data to our archive and being published publicly. So this option works well if you are already running an NDT server and want to contribute data to the open M‑Lab dataset.

Here are the options. Host managed or M‑Lab managed; within that you can contribute virtual or physical. There are full site deployments. The takeaway really is that these options are the way forward to being able to diversify our vantage points, and to have a more nuanced and representative perspective of what the Internet is for the user, which of course involves many different pathways.

Our new options enable that. If you are interested, there is a form you can fill out to express your interest in donating infrastructure and being added to our platform. You can also go to the Measurement Lab website, where there is a tab that says "Contribute", and all that information is there in much more detail than in the ten minutes I have today. And there is my e‑mail if you'd like to reach out. Thank you.


CLARA WADE: Thank you. We have no time for questions right now, but next up is Christian, who is going to tell us about Community‑IX.

CHRISTIAN SEITZ: This talk is about Community‑IX, and I will try to explain to you how it can simplify sponsoring and how it speeds up communities. I am Chris, I am working for and today I'm here for my non‑profit organisation, Individual Network Berlin, where I have volunteered for 24 years now.

Community‑IX is a connectivity platform for non‑profit organisations. It provides free IP transit and is powered by IN‑Berlin, which is a non‑profit organisation itself. We started it in 2016 at the RIPE 72 meeting; "we" is Theo from inter.link and me, when I was working for Strato. We both have sponsored some non‑commercial organisations for some time now, and it's always a lot of work to connect a new peer. As you know, you need some things. And we tried to simplify that for sponsors. We currently have about 100 to 100 gigabits of traffic. You can see the peers later. We also talked to some commercial IXPs, and some of them are also connected to our platform, so we can connect our connected communities to the peering LANs of those IXPs.

What makes it special? Well, it's a Layer2 platform like every other IXP, but it's for IP transit and for non‑profits only. And we try to reduce the effort for both communities and sponsors to connect, because when a sponsor tries to do things for non‑commercial communities, they need a cross‑connect, they need a router port (and router ports are expensive), they need a transfer network, they need to set up BGP sessions, they need documentation and so on. With Community‑IX, from the point of view of the sponsor and the community, everything is available behind a single port. They just connect to the Layer2 platform and can reach all the connected parties. They need just one cross‑connect, one set of documentation, and only BGP sessions to the route servers, if desired. Some sponsors don't want to peer with everyone, but then they can just set up BGP sessions to the peers they would like to sponsor.

But there are some exceptions. Everybody is sending full tables, and there are many paths with the same length, because they are all well connected carriers. And you know how big full tables are. Some sponsors are in multiple locations, so we have roughly 15 million IPv4 paths and 3 million IPv6 paths, and if we send those paths to every peer, some of the routers might crash because of too many paths.

We are using BIRD route servers, but they currently cannot limit the number of paths sent to peers. There is an IETF Working Group draft that may solve this issue: when it becomes an RFC and is maybe implemented in BIRD, it can reduce the number of paths that are being sent; for example, four or eight paths may be enough. But if we don't use BGP ADD‑PATH, if we don't send multiple paths, we just overload single ports to sponsors.
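To make the idea of capping the number of paths per prefix concrete, here is a small Python sketch. It is not how BIRD or the draft does it: the function name `limit_paths`, the tuple layout and the choice to prefer shorter AS paths are all assumptions made for the illustration, and real BGP path selection has many more steps.

```python
from collections import defaultdict


def limit_paths(paths, max_paths):
    """Keep at most max_paths next hops per prefix.

    paths: iterable of (prefix, next_hop, as_path_len) tuples.
    Shorter AS paths are preferred; ties are broken by next hop
    so that the result is deterministic.
    """
    by_prefix = defaultdict(list)
    for prefix, next_hop, as_path_len in paths:
        by_prefix[prefix].append((as_path_len, next_hop))

    limited = {}
    for prefix, candidates in by_prefix.items():
        candidates.sort()
        limited[prefix] = [next_hop for _, next_hop in candidates[:max_paths]]
    return limited
```

With a cap of four or eight, a peer would still see several alternatives per prefix, so traffic can be spread over more than one sponsor port, while the total number of paths it has to hold stays bounded.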

So, this is how it currently looks. There are two organisations. We have Community‑IX in Germany, in six locations, and in Switzerland there is another organisation who copied our idea and do it in Switzerland, currently in three locations. There is currently no interconnect between the countries. So we have one Layer2 in Germany and one Layer2 in Switzerland.

It could look a bit better, but that's how it currently looks in Berlin. It's colourful and diverse: some people connect with single mode, with multi‑mode, with direct copper, with BiDi, whatever they like.

These are our connected communities in Germany. You see a lot of similar logos, and also some other ones, here on the slide. There is also measurement.network, who are also presenting here at RIPE.

Then we have a list of our awesome sponsors at Community‑IX in Germany. You can see there are several IP transit sponsors, several connected IXPs, and also some sponsors for transport, rack space, hardware and so on. I did not use logos here because of company logo policies. I do not want to violate them.

So, what are our policies in Germany to connect? We are currently just connecting non‑profit initiatives and community driven projects. We are not connecting individuals. In Switzerland they are also connecting individuals, but that's their local policy; everybody can set their own policies. And you just need an ASN, some IP networks and the ability to connect physically. We don't peer via tunnels.

What do you need to build a Community‑IX? First of all, someone who is willing to do the work. You need some resources. You need the projects; if you don't have any projects, you don't need a Community‑IX, because for whom would you be building it? And you need IP transit sponsors, a data centre location, hardware, cross‑connects and so on, BGP route servers to make it easier, and you could use IXP Manager. I suggest you do, but it's not mandatory; it helps a lot.

We did it in Germany, but we don't know the situation in other countries. Maybe it's good to build something similar in your country, because you know who the non‑profits are in your country, where the data centres are, where they are connected and who the sponsors are, and maybe it's an idea to take this initiative to start a Community‑IX in your country for your community and support the local Internet community in your country.

So, we built it for Germany, and maybe you build it in your country.

That's it, thank you.


CLARA WADE: Thank you Christian, we have two minutes for questions. The first one is online from Nico. He says: "Great work. Did you look at the Dragon route aggregation algorithm, to avoid sending downstream users massive routing tables?"

CHRISTIAN SEITZ: No, we didn't look at it yet.

CLARA WADE: Okay. All right. So no more questions. Thank you, Christian.

And next up we have an online presenter, Vladimir.

VLADIMIR VASSILEV: I'll be quick. I'll just try to see if it's easy to manipulate this.

Okay, so, I'm presenting mostly the work that's done in the IETF, in the benchmarking Working Group, aiming at standardising a YANG/NETCONF configuration interface for network testers. It's mostly Layer2 and 3 network testers, and we have specified YANG models for configuring the transactions. This tester is useful for running testing and validation of things, and standardised benchmarks like RFC 2544. There is a link in the presentation. You can find the draft with the contact information, and if you are interested in the topic, you are very welcome to participate in this work.

So, I'll go to the next slide. This is what network testers look like today. And they are not compatible. You cannot take one of these testers and replace with another one and reuse your script. That's the problem.

So, I have a quiz question to get some attention: which two of the five network testers have standards‑based measurement interfaces and protocols? Well, A, B and E have quite a lot of work behind them and they have good support, but they are not based on standards; the models and the interfaces are all proprietary, documented by the companies that stand behind them. C is using SCPI, which is a serial configuration language, very similar to what was standardised back in the eighties; it's still very good, and it's still used. And D, the picture with the acrylic box and the 1 gigabit SFP interfaces, is open source and open hardware that implements the drafts that we are trying to standardise. So this is using only YANG and NETCONF, according to that draft.

So, network tests are simple ‑‑ the testers are simple in their intent. It's like something that you should be able to send packets in defined way that you can control and time stamp. And you should be able to receive and find out if packets were lost and delayed. Pretty much that's it.
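That intent, timestamped sending on one side and loss and delay on the other, can be sketched in Python. This is an illustrative model only and is not taken from the draft or any tester: the function `analyse` and the sequence-number-to-timestamp dictionaries are assumptions, and it presumes both sides share a synchronised clock.

```python
def analyse(sent, received):
    """Compare what a generator sent with what an analyser received.

    sent:     {sequence_number: transmit_time_ns}
    received: {sequence_number: receive_time_ns}
    Returns (lost sequence numbers, per-packet delay in ns, loss ratio).
    """
    lost = sorted(seq for seq in sent if seq not in received)
    delays = {
        seq: received[seq] - sent[seq]
        for seq in sent
        if seq in received
    }
    loss_ratio = len(lost) / len(sent) if sent else 0.0
    return lost, delays, loss_ratio
```

For example, with four packets sent and sequence number 2 never arriving, the function reports `[2]` as lost and a loss ratio of 0.25, together with the delay of each packet that did arrive.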

A more complicated case is when you have an asymmetric network; then you have two testers, one on each side, like you see in this picture.

So the network tester management solutions that are in the picture here are these. Cisco TRex, this is something new, mostly for software testers that generate not very much traffic, but it is open source.

Keysight just open sourced a lot of their models and software; they did that in 2023. And then you have Cisco and Spirent, which have a high‑level test application programming interface, which is pretty much calling functions that do complicated things. And there are other interfaces, like on the picture, as well. So now we are introducing something which I think is a good solution, at least for a subset of the problems. It's using YANG and NETCONF to control your network testers the same way you can control your routers or any other equipment. It saves a lot of time, and if you start using it you will never go back to anything else. That's what I say, at least.

So this is a wish list of requirements that I had in my mind before starting with that work: what should a test and measurement instrument have as an information and automation interface? It should have an interchangeable interface. That's very important. That would allow us to buy either of these test generators, run the same test and see if we get the same result. Because test generators get complex, and sometimes you can spend months trying to figure out the problem in your application and your implementation, and it turns out the test generator has strange behaviour.

So, this was done for general measurement instruments with a standard called IVI, back in the seventies or eighties maybe. It was based on devices that had GPIB interfaces, and they made classes, and companies made tonnes of money because they could guarantee that you can buy different signal generators and they will be replaceable, interchangeable.

And the same thing can be done with a YANG/NETCONF interface. If you have this, you have a thing called SDN, that allows you to buy different switches from different vendors and replace them and have the same software running. In most cases we probably have some issues, but it is good; that's where we're going.

The second is be scalable instead of complex. I'll go faster because we don't have that much time.

Have a transactional management model with hierarchical tree like configuration. Most of the generators actually do that already. They just don't use YANG/NETCONF.

Have a model that does not put constraints on the precision of the implementation. If you have to choose whether your traffic generator will take packets per second, bandwidth or inter‑frame gaps as an argument, then if you look into the implementation, it's best to use inter‑frame gaps, because a second doesn't contain an integer number of packets, and there is clock shifting and things like that. So there are different reasons. But if you make a model, you have to have some experience and know why this implementation is much simpler, and the design has to choose that.
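The point about a second not containing an integer number of packets can be checked with a little exact arithmetic. The sketch below assumes standard Ethernet framing overhead (8 bytes of preamble and start delimiter, 12 bytes minimum inter‑frame gap); the function names are hypothetical and are not taken from the draft.

```python
from fractions import Fraction

PREAMBLE_BYTES = 8   # preamble plus start-of-frame delimiter
MIN_IFG_BYTES = 12   # minimum Ethernet inter-frame gap


def max_frame_rate(line_rate_bps, frame_bytes):
    """Exact frames per second at line rate with the minimum gap."""
    bits_per_slot = (frame_bytes + PREAMBLE_BYTES + MIN_IFG_BYTES) * 8
    return Fraction(line_rate_bps, bits_per_slot)


def gap_bytes_for_rate(line_rate_bps, frame_bytes, pps):
    """Exact inter-frame gap, in byte times, needed to hit a given rate."""
    bytes_per_second = Fraction(line_rate_bps, 8)
    slot_bytes = bytes_per_second / pps
    return slot_bytes - frame_bytes - PREAMBLE_BYTES


# 64-byte frames on 1 Gbit/s: the line-rate frame rate is not an integer.
rate = max_frame_rate(10**9, 64)
print(float(rate))                               # about 1488095.24
print(gap_bytes_for_rate(10**9, 64, 1_400_000))  # a fractional gap, 121/7
```

An integer gap in byte times always maps to something the hardware can actually send, whereas a round packets‑per‑second figure can require a fractional gap that the implementation has to approximate.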

Have a model that does not limit reusability of implementations.

Have a model with optional features and a mechanism to announce them. This is handled by the YANG library, so you can connect to your instrument and see everything it supports automatically.

6. The measurement model has to have a compatible and simple command line manipulation syntax. This is because you don't want the people doing the testing to have to be XML or object‑oriented programmers. They use the command line. And YANG is good because it can be converted to a command line, and someone has already done that.

So, 7. The model has to be as deterministic as it can be. So, there are some things that come with NETCONF and YANG, like NMDA, which are not good for instruments. If you do a commit and you expect it to generate packets, but it delays that and after one second tells you it's in the process of configuring it, this is not going to work. It has to be fast and it has to happen immediately. So, a bit different than what routers do today.

So there are some other requirements. I'll just skip a bit, but you can read them if you are interested. And 12, a commit should take 10 milliseconds. This is not a router with thousands of routes. It is doable. We managed to do it. So others should also manage.

And there is some chronology, which I'm not going to go into in detail. It's an interesting read. You can check it out in the presentation.

One date that is interesting is that in 2022, the IETF Benchmarking Working Group adopted the draft that we presented, after several years and some reference implementations, and we ‑‑ the draft was supported by the Chair at the time, the late Al Morton, and even by some Cisco representatives like Rob Wilton, so this has been checked by people who understand the fabric.

And it's a big project with small tasks in different domains, which we have taken on, and we have come pretty far. What you are seeing is even open source hardware. So this has gone through the Open Source Hardware Association certification, we have these IDs, this NO000006, this GPS synchronisation board, which is inside, so you get synchronisation, so you get this asymmetric network testing, and you have a lot of links here, so if you are into firmware you can find how you do it. It's also a useful platform for other open source things. It's like a NetFPGA alternative for a newer generation of FPGA, this is a stand‑alone one unit, so you can do a lot of useful things with that.

So, just to make sure that people are aware of what the RFC 2544 benchmark is, I have published the code for it, and it's only 407 lines of Python. This is how it starts. It looks pretty much like what you'd pass to the high‑level API of a proprietary tester, but it is open source and it's Python and it's using NETCONF and YANG to connect ‑‑ you'll see it's built after an IETF model, IETF networks. And you specify the nodes which you want to address.
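For readers who have not met RFC 2544, its throughput test looks for the highest offered rate at which the device forwards every frame it is sent. The sketch below is not the 407‑line script mentioned in the talk; it is a minimal illustration that assumes a binary search (a common way to run the test, not one the RFC mandates) and uses a simulated device in place of a real generator and analyzer. All function and parameter names are hypothetical.

```python
def throughput_search(trial, max_rate_pps, resolution_pps=1.0):
    """Binary-search the highest offered rate at which trial() reports zero loss.

    trial(rate_pps) must return the number of frames lost in one trial.
    """
    low, high = 0.0, float(max_rate_pps)
    if trial(high) == 0:
        return high            # no loss even at the maximum rate
    while high - low > resolution_pps:
        rate = (low + high) / 2
        if trial(rate) == 0:
            low = rate         # no loss: the throughput is at least this rate
        else:
            high = rate        # loss seen: the throughput is below this rate
    return low

def simulated_trial(rate_pps, capacity_pps=1_000_000, duration_s=1):
    """Stand-in for a real trial: every frame beyond the capacity is lost."""
    offered = int(rate_pps * duration_s)
    forwarded = min(offered, capacity_pps * duration_s)
    return offered - forwarded

result = throughput_search(simulated_trial, max_rate_pps=14_880_952)
print(result)
```

In a real run, `trial` would configure the generator over NETCONF, send traffic for the trial duration, and compare the transmit and receive counters.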

BRIAN NISBET: We're going to have to stop you there unfortunately, we are out of time.

VLADIMIR VASSILEV: Yeah, so the message was sent through. So if people are interested in network testing, they can take a look at the draft, participate in the mailing list and look into the implementation.

BRIAN NISBET: Thank you very much.

All the slides and lots more information is on the website.

So, that is the end of our session. I will just ask you again to consider rating the talks. Also consider volunteering for the Programme Committee if you wish to do such a thing.

At 1800 CEST, there is a BoF from the Internet Society on the ISOC pulse piece in this room, and then at 1900, there is the welcome reception in the room downstairs. So look forward to seeing lots of you at those. Thank you all very much.