12-04-2016, 11:14 AM
techniques of segment performance evaluation
Abstract
A turbo-coded hybrid automatic repeat request (TC-HARQ) scheme is proposed based on packet combining and segment selective repeat. First, we demonstrate through simulation that data combining gives a substantial performance improvement over log-likelihood ratio (LLR) combining, at the cost of more buffers. Second, we propose a novel retransmission strategy for TC-HARQ: segment selective repeat (SSR), which selects the most corrupted part of the packet for retransmission. We show that SSR increases system throughput and has the potential to perform better with more sophisticated use of LLR.
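As a rough illustration of the two ideas above, the following Python sketch adds per-bit LLRs from two transmissions (LLR combining) and then picks one segment to retransmit. The abstract does not say how the most corrupted segment is chosen, so the criterion used here (smallest mean absolute LLR) is an assumption, and the function names `combine_llrs` and `least_reliable_segment` are hypothetical.

```python
def combine_llrs(previous, received):
    """LLR combining: add the per-bit LLRs of two transmissions of the same bits."""
    if len(previous) != len(received):
        raise ValueError("LLR sequences must have the same length")
    return [a + b for a, b in zip(previous, received)]


def least_reliable_segment(llrs, segment_len):
    """Return (index, start, end) of the segment with the smallest mean |LLR|.

    Assumption: a small |LLR| means a low-confidence bit decision, so the
    segment with the lowest average |LLR| is treated as the most corrupted.
    """
    if not llrs or segment_len <= 0:
        raise ValueError("need a non-empty LLR list and a positive segment length")
    best = None
    for idx, start in enumerate(range(0, len(llrs), segment_len)):
        seg = llrs[start:start + segment_len]
        reliability = sum(abs(x) for x in seg) / len(seg)
        # Strict comparison keeps the earliest segment on ties.
        if best is None or reliability < best[0]:
            best = (reliability, idx, start, start + len(seg))
    return best[1], best[2], best[3]


# Hypothetical decoder soft outputs for a 6-bit packet split into 2-bit segments.
llrs = [4.0, -5.0, 0.5, -0.2, 3.0, 6.0]
segment = least_reliable_segment(llrs, 2)
```

Here the middle segment has the weakest soft outputs, so it would be the one requested again. A real receiver would apply the same selection to the decoder's output LLRs after each decoding attempt.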
Introduction
Targeted optimization of program segments can provide an additional program speedup over the highest default optimization level, such as -O3 in GCC. The key challenge is how to automatically search a given program for performance-sensitive segments to which a customized set of compiler optimization options can be applied. In this paper we propose a method for automatic detection of performance-sensitive program segments based on program segment similarity. First, we create a proxy segment template database trained over a set of random input programs. The compiler then identifies program segments by correlating them with the pre-built proxy segment templates, using similarity in syntax structure and in architecture-dependent behavior. We argue that the identified program segments can be custom-optimized to improve overall program performance. The method is evaluated on the Intel XScale PXA255 platform using randomly selected benchmarks. The experimental results show that our method can provide additional speedups over the highest optimization level in GCC 3.3 (-O3) for an arbitrary set of applications.
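The matching step can be pictured with a small Python sketch. It is not the paper's algorithm: the abstract does not give the features or the similarity measure, so the two-element feature vectors, the cosine similarity, the 0.9 threshold, and the template names are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_template(features, templates, threshold=0.9):
    """Return the name of the most similar template, or None if none reaches the threshold."""
    best_name, best_score = None, threshold
    for name, template in templates.items():
        score = cosine(features, template)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical template database: (loop intensity, branch intensity).
templates = {
    "loop_heavy": [1.0, 0.0],
    "branch_heavy": [0.0, 1.0],
}

matched = match_template([0.9, 0.1], templates)    # close to "loop_heavy"
unmatched = match_template([1.0, 1.0], templates)  # too far from both templates
```

A segment that matches a template would then be compiled with the option set associated with that template. A segment that matches none keeps the default optimization level.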
The notion of Software-Defined Networking (SDN) has already been introduced into cloud datacenter networks for provisioning virtual network environments. In public cloud datacenters, network virtualization today is generally achieved with L2-in-L3 tunneling protocols such as VXLAN (Virtual eXtensible LAN) and NVGRE (Network Virtualization using Generic Routing Encapsulation). Some leading production packages for network virtualization have adopted an Edge-Overlay model, which performs tunnel encapsulation and decapsulation at highly functional virtual switches so that existing network equipment can be used. However, a severe performance problem arises because the tunneling is processed in software. The STT (Stateless Transport Tunneling) protocol overcomes the problem by modifying the semantics of the TCP header, but such changes in semantics raise a practical issue: network middleboxes can discard STT packets as anomalous. In this paper, we propose a novel layer 4 protocol (Segment-oriented Connection-less Protocol, SCLP) for existing tunneling protocols such as VXLAN and NVGRE. SCLP is designed not only to accelerate the throughput of tunneling protocols, but also to prevent the packet discarding problem by providing a single-semantic header. Specifically, SCLP can exploit the GRO (Generic Receive Offload) feature supported by the Linux kernel to reduce the number of packets that trigger software interrupts. We implemented SCLP and applied it to VXLAN in place of UDP. As a result, VXLAN over SCLP achieved, at maximum, almost double the throughput of the original UDP-based VXLAN.
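To show why receive-side coalescing reduces per-packet work, here is a toy Python model of the idea behind GRO: consecutive segments of the same flow with contiguous offsets are merged into one larger buffer before being passed up the stack. This is neither kernel code nor the SCLP header format, which the abstract does not describe. The tuple layout `(flow_id, offset, payload)` and the function name `coalesce` are assumptions made for illustration.

```python
def coalesce(packets):
    """Merge consecutive packets of the same flow whose byte offsets are contiguous.

    packets: list of (flow_id, offset, payload) tuples in arrival order,
    where payload is a bytes object. Only the most recently buffered packet
    is checked, which is simpler than the per-flow tracking a real stack does.
    """
    merged = []
    for flow, offset, payload in packets:
        if merged:
            last_flow, last_offset, last_payload = merged[-1]
            if last_flow == flow and last_offset + len(last_payload) == offset:
                # Contiguous with the previous segment: extend it in place.
                merged[-1] = (flow, last_offset, last_payload + payload)
                continue
        merged.append((flow, offset, payload))
    return merged


# Four arriving segments; the first two belong together.
arrivals = [
    ("a", 0, b"xx"),
    ("a", 2, b"yy"),
    ("b", 0, b"zz"),
    ("a", 4, b"ww"),
]
delivered = coalesce(arrivals)
```

In this example four arriving segments become three delivered buffers. With long runs of in-order segments from one flow, the reduction in per-packet processing is correspondingly larger.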