Random Linear Network Coding-FAQ

Learn more about how and why RLNC is so useful across data transport and data storage solutions.

RLNC stands for Random Linear Network Coding. It is a powerful new technology that can be used to improve the performance of many of today’s communication systems. For example, RLNC can speed up the internet, improve video quality for streaming movies and live events, and decrease size while increasing reliability for the data centers used for cloud computing.

The benefits are huge. In some cases, we see 5-fold increases in throughput over the internet, 80% improvements in data availability for streaming video, and 20-50% reductions in storage center energy consumption.

See below for more information on RLNC's unique capabilities.

FAQ

HOW DOES RLNC MORE EFFICIENTLY TRANSMIT DATA?

RLNC is versatile code and thus, can improve transmissions in a range of different implementations at different layers of the network — often with cumulative results. For explanation purposes, this section describes a typical implementation used in cloud-based applications — it is implemented at the application or TCP / IP layer.

Today’s Internet operates under very strict rules or Internet protocols like TCP / IP to effectively transmit information. Communications under these protocols are based on ‘packets’ or small, broken down pieces of the information that is being sent. In a TCP communication, packets are transmitted sequentially from their point of origin and only accepted at the destination if they arrive in order. Every packet is acknowledged by the recipient. If the Internet were set up to wait for each packet arrival to be acknowledged before transmitting the next packet, it would run way too slowly. Instead, the sender keeps sending out packets, assuming that they will get through. If, however, the sender doesn’t receive the next acknowledgment that it is expecting within some reasonable time frame or it receives the acknowledgment for a later packet before receiving the one it expects, then it treats the unacknowledged packet as lost and sends it again. The sender may be sending Packet #83 when it learns that Packet #21 didn’t arrive. It then re-sends Packet #21 and all subsequent packets. If there are a lot of missing packets, you can imagine that most of the time is spent on re-sending.

in TCP transmissions, RLNC improves communication efficiency by using linear algebra to combine the information of several packets into one, same sized packet. The same number of packets still need to be sent and received, but the packets are now interchangeable, so that if Packet #21 does not arrive, the receiver can use Packet #22 as a substitute, avoiding the need to resend Packets #22 through #83. This packet versatility from RLNC greatly reduces packet re-sends and the associated signaling needed to track ‘lost packets’.

AREN’T YOU JUST MAKING ONE CONNECTION BETTER AT THE EXPENSE OF ALL THE OTHERS?

No. Coding using RLNC doesn’t hurt anyone’s flows. Everyone who applies coding benefits from coding, and those who don’t apply coding can co-exist in the network with those who do without seeing any negative consequences. This allows a graceful introduction of RLNC into the network, with immediate benefits once a few users start to apply RLNC, and increased benefits as more users adopt the technology. Additionally, RLNC users typically free up bandwidth for other users due to reduced signaling and transmissions completing more quickly.

WHAT STANDARDS CHANGES ARE REQUIRED FOR RLNC TO BE ADOPTED?

No standards need to be changed to begin using RLNC for transport. That’s one of the great features of RLNC. It can be implemented within the existing framework. RLNC can be implemented at the application layer and run in a manner that remains compatible with existing systems.

WHAT IS THE COST OF THE EXTRA PROCESSING REQUIRED TO CODE AND DECODE RLNC?

The processing requirements of encoding and decoding vary greatly from device to device and depend on a wide variety of hardware and software characteristics (e.g., chip implementations, memory configurations, available computation libraries, etc.). However, energy consumption measurements performed in wireless devices indicate that RLNC does not impose a high processing cost.

Studies show that RLNC coding uses 2 to 3 orders of magnitude (100 to 1000 times) less energy than WiFi transmission. This small encoding cost reduces download delays by 4x in real-world wireless scenarios, hence offsetting the processing, latency, and energy involved in retrieving the missing data without coding. A mobile device using RLNC uses less energy and therefore has greater battery life.

WHEN DOES RLNC NOT PROVIDE AN ADVANTAGE?

When your connection has no data losses. However, keep in mind that some methods to make sure all data arrives, such as Skype and LTE, which send many copies of the same packet, are simply masking the losses and are inherently wasteful of bandwidth. In these situations, using RLNC in place of the wasteful approach achieves huge advantages. Also, in communications where long round trip times (RTT) can be interpreted as loses, RLNC will help.

CAN RLNC IMPROVE DATA SECURITY?

RLNC is inherently secure since data cannot be extracted from coded packets without the coding coefficients (the constants used in the coded packet equations).

In mobile devices an RLNC security solution has the added benefit of improving battery life while reducing delay for streaming content like video. Unlike traditional encryption, which encrypts the full data packet, RLNC can just encrypt the coding coefficients. Encoding the coefficients results in very low overhead (keeps the key size small) and thus, has low computational overhead, enabling faster decryption.

In cloud applications, the same RLNC characteristics that enable distributed storage and the combining of different flows of data, also enable cloud security. In short, with RLNC not all your data needs to be stored in one place. Today a hacker could break into your corporate server, Dropbox, Google Drive, Box, etc., and steal your valuable files. If you stored only a portion of each file on each site (say 35% of the data on each of the four), but encoded in RLNC, then a hacker breaking into one or two of them would not have what they needed to reconstruct your data. Furthermore, any of those sites could be offline and you would still have full access.

SOUND TOO GOOD TO BE TRUE?

We hear this a lot. Isn’t this all just too good to be true??? Most of us are trained to be doubtful of such amazing claims, so it’s an easy first reaction to fall back on. That’s why it’s good to see the results. This is the real thing. Test after test, by many different teams in different circumstances, have proven it.

HASN’T ANYONE ELSE THOUGHT ABOUT WAYS TO SOLVE THIS?

The Internet has evolved through a series of innovations, coming in recent years from commercial entities looking to gain an edge for their product offering. Since the current internet is too large for any single company to contemplate a ‘fix’, it is not surprising that the system-level ‘fix’ should originate at the top technical universities in the world. RLNC was invented in 2003 by a group of information theorists, looking at the challenges and inefficiencies of the modern internet; it was designed to address system-level inefficiencies and to be backward compatible with prior generations of coding technologies.

WHY DOES ‘RANDOM’ MATTER?

Random codes are useful for the remarkable combination of simplicity, versatility, and robustness that they provide. The codes are simple because no complicated computations are required to generate the combination to be used in each new coded block. Instead, combinations are generated at random. It’s as simple as rolling dice.

The codes are versatile in the sense that many nodes in a network can independently generate coded packets and new coded packets can be created either from uncoded packets or from other previously coded packets. This is in contrast to other forms of coding that generally require decoding before re-encoding and operate only at the “edges” of the network.

The ability to independently generate data mixtures at random allows each coding node to create new mixtures of whatever information it has available. Coding nodes can then effectively work together even if they have no means to coordinate, since each node can create data distinct from and as useful as that created at other nodes without any knowledge of what the other nodes have done.

Random codes are robust in the sense that a random mixture will almost always be distinct from other mixtures previously created in the network. In fact, the probability that any two mixtures turn out to be the same is less than your chance of winning the lottery!

ISN’T THIS JUST FORWARD ERROR CORRECTION (FEC)? WHAT MAKES THIS SPECIAL?

No. RLNC is different. Like RLNC, FEC can be used to protect information against packet loss. What differs here is that FEC is performed only at the edges of the network while RLNC can (and should!) be used all over the network. Want to gather data twice as fast off the web? If the website has a mirror site, that is a copy somewhere on the web, then you can pull the information off both the original site and its mirror simultaneously. With FEC, the sites would need to decide in advance who sends which packets or coordinate to ensure they are sending different things. With RLNC, each rolls its own dice and sends out its own coded packets; your phone can decode as soon as it gets enough different packets. Even better, anyone can now send you packets, so everyone who has downloaded (some or all of) the data can now act as a mirror site, sending coded packets of what it has received to date. RLNC can also accommodate the bursty losses, frequently found in WiFi links.

HOW DO RLNC LINEAR COMBINATIONS WORK IN TCP TRANSMISSIONS?

TCP transmissions are a bit like collecting baseball cards. If packets were sent out like baseball cards, then you would suffer the fate of collectors everywhere: collecting cards is easy, but getting the ones you want is hard. Said another way, collecting sequential packets would be similar to collecting baseball cards in alphabetical order. If we discard all cards that aren’t the next in line alphabetically, then keeping track of what we have is easy, but getting a complete collection may be hard. You certainly wouldn’t collect baseball cards this way…

To add to the challenge, TCP / IP interprets packet loss as a sign of network congestion, so the rate of packet transmission is halved when a packet loss occurs and increased again slowly after a sequence of successful packet receipts. This is a pretty good approach in wired networks where congestion is a likely explanation for packet loss, but doesn’t work well in wireless networks where dropped packets can be caused by many other factors. In these networks, TCP / IP can magnify even small error rates into large delays in downloading data.

RLNC changes the paradigm by noticing that packets are not baseball cards. Each packet is a string of zeros and ones that can be represented by a number. TCP / IP sends packets 1 and 2 by sending precisely these numbers. If the first data packet is lost or delivered too slowly, then no acknowledgment is received in time, and data packet 1 is transmitted again. This process repeats until data packet 1 is received and acknowledged. An RLNC alternative of the same protocol would calculate “coded packets” 1 and 2 and send those in place of the “data packets.” The coded packets are formed by multiplying the numbers represented by data packets 1 and 2 by constants and then adding the two together, giving

code packet 1 = random number 1 x data packet 1 + random number 2 x data packet 2
code packet 2 = random number 3 x data packet 1 + random number 4 x data packet 2

The random numbers are constants, meaning that they do not depend on the data packets. These constants can be sent to the receiver in the header (that is, at the beginning) of the coded packets in which they are used.

If the receiver gets random numbers 1 through 4, then it can easily solve for packets 1 and 2. As a result, the coded scheme is similar to the standard scheme when no packets are lost. One advantage of coding becomes evident under packet loss. As you remember from high school algebra, any two distinct equations are enough to solve for two unknowns. As a result, in the coded system, the transmitter can keep on sending out new combinations:

code packet 3 = random number 5 x data packet 1 + random number 6 x data packet 2
code packet 4 = random number 7 x data packet 1 + random number 8 x data packet 2

and so on until at least two distinct combinations are received. With coded packets, it doesn’t matter which packets are lost and which received; any two receipts are sufficient to solve for data packets 1 and 2. What’s remarkable about this approach is that there are so many different combinations possible, that we need not choose the coefficients carefully. Instead, we choose these coefficients at random, effectively rolling some dice to choose random numbers; this is the “random” in RLNC.

There are many ways to apply RLNC. One is to stop keeping track of which combinations have been received and simply acknowledge each time a new packet arrives. The transmitter just sends out a sequence of random combinations and the receiver acknowledges each time a new combination is received. The transmission is completed once the number of distinct equations received equals the number of packets to be received. No packets need ever be thrown out, and the transmitter need never know which packets have and haven’t been received. Transmission stops when the number of distinct packets received equals the number of packets x_i to be transmitted. Certainly you couldn’t do better than that.