This script uses the netem enhancement of the Linux traffic control facilities to create random packet loss and iperf3 to generate a stream of bytes, and measures the throughput under that loss. It uses the statistics from netstat to measure the actual retransmission rate. The netem packet loss ranges from 0.000001% to 10% and each iperf3 run lasts 300 seconds. Each loss setting is tested twice, once with TCP selective acknowledgement (SACK) enabled and once with it disabled. There is also a 60 second delay after each run, so the script takes slightly less than 6 hours to complete (12 minutes per loss rate setting * 28 packet loss settings = 336 minutes); see Example 1.
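The run-time estimate can be checked with a little shell arithmetic. This is only a sketch; the variable names are illustrative and do not appear in the script.

```shell
#!/bin/bash
# Rough run-time estimate; variable names are illustrative only.
loss_settings=28      # number of netem loss rates tested
runs_per_setting=2    # SACK enabled, then SACK disabled
test_seconds=300      # length of each iperf3 run
sleep_seconds=60      # pause after each run

total_seconds=$(( loss_settings * runs_per_setting * (test_seconds + sleep_seconds) ))
total_minutes=$(( total_seconds / 60 ))
echo "estimated run time: ${total_minutes} minutes"   # prints 336 minutes
```

The measured wall-clock time in Example 1 (336m13s) agrees with this estimate.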
There is 1 line of output for every test run so you can keep track of the script's progress. The lines have the format
     Loss Rate X SACK = Y
where X is the current netem loss setting and Y is either 1 for SACK enabled or 0 for SACK disabled.
The output from each test run is saved in "statistics" files in the current working directory and named
     DEVICE-YYYY_MM_DD_HH_MM-sack_X-Loss-Rate_Y-.
The time stamp is based on the start of the script. These files are not deleted at the end of the script so that the raw data is available for inspection.
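As an illustration of the naming scheme, the sketch below builds one statistics file name and then splits it apart again the way the summary step of the script does. The device, time stamp and settings are sample values taken from the directory listing further down, not output from a new run.

```shell
#!/bin/bash
# Sketch of how a statistics file name is assembled and later split apart.
# The values below are sample inputs, not output from a real run.
device="team0"
stamp="2017_09_29_19_10"   # the script uses: date +%Y_%m_%d_%H_%M
sack=1
loss=0.5

name="${device}-${stamp}-sack_${sack}-Loss-Rate_${loss}-"
echo "$name"

# Turning '-' and '_' into spaces leaves the SACK flag in field 8 and the
# loss rate in the last field.
parsed=$(echo "$name" | tr '_-' '  ' | awk '{print $8, $NF}')
echo "$parsed"
```

Because the dash is a field separator here, a dash inside the device name would shift the fields, which is why the script replaces dashes in the device name before building file names.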
Following the progress lines there is a table detailing the results. The table has the columns
     X Y Measured-retransmission-rate minimum average maximum
Where
     X is 1 for SACK enabled or 0 for SACK disabled
     Y is the netem loss setting
     Measured-retransmission-rate is the actual measured retransmission rate calculated from before and after netstat statistics
          After_segments_retransmitted - Before_segments_retransmitted
          ------------------------------------------------------------ * 100
               After_segments_send_out - Before_segments_send_out
     minimum, average and maximum are the minimum, average and maximum bandwidth values, in megabits per second, reported by iperf3, which prints a summary once per second.
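As a worked example of the retransmission-rate calculation, the sketch below plugs in the before and after netstat counters from the sample statistics file for SACK 1 and loss rate 0.5 shown later on this page. The variable names are illustrative.

```shell
#!/bin/bash
# Worked example of the retransmission-rate formula using the before/after
# counters from the sample statistics file (SACK 1, loss rate 0.5).
before_out=4318328295
before_retrans=5520901
after_out=4341187451
after_retrans=5637905

rate=$(awk -v bo="$before_out" -v br="$before_retrans" \
           -v ao="$after_out"  -v ar="$after_retrans" \
           'BEGIN { printf "%.6f", (ar - br) / (ao - bo) * 100 }')
echo "measured retransmission rate: ${rate}%"   # prints 0.511847%
```

The result matches the 0.511847 entry in the Example 1 table, and the difference in retransmitted segments (117004) matches the Retr total in the iperf3 summary line of that file.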
The file DEVICE-YYYY_MM_DD_HH_MM-measure-retrans-effect-summary, also found in the current working directory, holds the lines used to create the table.
Usage
measure-retrans-effect.sh IP-ADDRESS-OF-IPERF3-SERVER
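Requirements
     1. The script must be run as root.
     2. iperf3 must be installed.
     3. There should be as little other TCP activity as possible while the script is running. The netstat statistics are system global, so segments sent out and retransmitted that are not related to the test will throw the statistics off.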
Example 1
     # time ./measure_retrans_effect.sh 172.16.1.200
     Loss Rate 0.000001 SACK = 1
     RTNETLINK answers: No such file or directory
     Loss Rate 0.000001 SACK = 0
     Loss Rate 0.001 SACK = 1
     Loss Rate 0.001 SACK = 0
     Loss Rate 0.002 SACK = 1
     Loss Rate 0.002 SACK = 0
     Loss Rate 0.003 SACK = 1
     Loss Rate 0.003 SACK = 0
     Loss Rate 0.004 SACK = 1
     Loss Rate 0.004 SACK = 0
     Loss Rate 0.005 SACK = 1
     Loss Rate 0.005 SACK = 0
     Loss Rate 0.006 SACK = 1
     Loss Rate 0.006 SACK = 0
     Loss Rate 0.007 SACK = 1
     Loss Rate 0.007 SACK = 0
     Loss Rate 0.008 SACK = 1
     Loss Rate 0.008 SACK = 0
     Loss Rate 0.009 SACK = 1
     Loss Rate 0.009 SACK = 0
     Loss Rate 0.01 SACK = 1
     Loss Rate 0.01 SACK = 0
     Loss Rate 0.02 SACK = 1
     Loss Rate 0.02 SACK = 0
     Loss Rate 0.03 SACK = 1
     Loss Rate 0.03 SACK = 0
     Loss Rate 0.04 SACK = 1
     Loss Rate 0.04 SACK = 0
     Loss Rate 0.05 SACK = 1
     Loss Rate 0.05 SACK = 0
     Loss Rate 0.1 SACK = 1
     Loss Rate 0.1 SACK = 0
     Loss Rate 0.2 SACK = 1
     Loss Rate 0.2 SACK = 0
     Loss Rate 0.5 SACK = 1
     Loss Rate 0.5 SACK = 0
     Loss Rate 1 SACK = 1
     Loss Rate 1 SACK = 0
     Loss Rate 2 SACK = 1
     Loss Rate 2 SACK = 0
     Loss Rate 3 SACK = 1
     Loss Rate 3 SACK = 0
     Loss Rate 4 SACK = 1
     Loss Rate 4 SACK = 0
     Loss Rate 5 SACK = 1
     Loss Rate 5 SACK = 0
     Loss Rate 6 SACK = 1
     Loss Rate 6 SACK = 0
     Loss Rate 7 SACK = 1
     Loss Rate 7 SACK = 0
     Loss Rate 8 SACK = 1
     Loss Rate 8 SACK = 0
     Loss Rate 9 SACK = 1
     Loss Rate 9 SACK = 0
     Loss Rate 10 SACK = 1
     Loss Rate 10 SACK = 0
     0  0.000001  0.000093404  928   941.173  957
     0  0.001     0.00194828   744   939.577  956
     0  0.002     0.00324975   566   936.263  957
     0  0.003     0.0065149    743   931.2    959
     0  0.004     0.00736402   705   928.517  960
     0  0.005     0.00828617   556   927.483  957
     0  0.006     0.0122555    594   919.647  958
     0  0.007     0.0165709    558   911.927  950
     0  0.008     0.0134935    565   917.767  956
     0  0.009     0.0175471    556   908.62   957
     0  0.01      0.0198389    398   905.193  950
     0  0.02      0.0338234    557   881.31   957
     0  0.03      0.044234     536   866.033  946
     0  0.04      0.075378     409   816.273  955
     0  0.05      0.0926536    359   791.55   949
     0  0.1       0.182543     229   677.98   942
     0  0.2       0.370417     58.4  511.955  941
     0  0.5       0.914121     16.7  288.058  694
     0  1         1.77885      6.77  145.887  537
     0  2         3.36569      0.00  58.6247  219
     0  3         4.64272      0.00  28.29    136
     0  4         6.10859      0.00  14.3949  73.0
     0  5         7.70439      0.00  10.1231  82.4
     0  6         8.74312      0.00  7.57743  40.7
     0  7         10.1129      0.00  5.48117  25.6
     0  8         11.3716      0.00  4.17613  19.8
     0  9         12.2447      0.00  3.32527  19.3
     0  10        13.8307      0.00  2.62407  21.4
     1  0.000001  0            931   941.58   955
     1  0.001     0.000738239  933   941.71   957
     1  0.002     0.00147647   932   941.47   958
     1  0.003     0.00258384   933   941.39   957
     1  0.004     0.00424487   933   941.657  956
     1  0.005     0.00535223   937   941.417  960
     1  0.006     0.00535981   851   940.553  956
     1  0.007     0.00701327   933   941.543  957
     1  0.008     0.00775151   935   941.403  957
     1  0.009     0.00645974   934   941.37   958
     1  0.01      0.0112659    852   941.177  955
     1  0.02      0.0171682    935   941.42   959
     1  0.03      0.031367     935   941.373  957
     1  0.04      0.0388971    935   941.407  959
     1  0.05      0.0478588    936   941.52   958
     1  0.1       0.0972876    935   941.52   961
     1  0.2       0.198604     935   941.497  956
     1  0.5       0.511847     377   882.787  946
     1  1         1.01652      83.9  554.856  944
     1  2         2.04561      31.4  203.7    514
     1  3         3.13621      10.5  107.562  284
     1  4         4.16288      0.00  68.0029  252
     1  5         5.17257      0.00  47.05    136
     1  6         6.3048       0.00  30.9149  126
     1  7         7.45547      0.00  25.3401  94.5
     1  8         8.64183      0.00  18.8597  108
     1  9         10.0035      0.00  13.3149  94.5
     1  10        11.4885      0.00  11.006   84.0

     real    336m13.094s
     user    0m2.914s
     sys     1m38.070s
This is one of the statistics files
     # cat team0-2017_09_29_19_10-sack_1-Loss-Rate_0.5-
     4318328295 segments send out 5520901 segments retransmited
     net.ipv4.tcp_sack = 1
     qdisc netem 8179: root refcnt 17 limit 1000 loss 0.2%
     qdisc netem 817a: root refcnt 17 limit 1000 loss 0.5%
     Connecting to host 172.16.1.200, port 5201
     [  4] local 172.16.1.207 port 58134 connected to 172.16.1.200 port 5201
     [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
     [  4]   0.00-1.00   sec  85.8 MBytes   719 Mbits/sec  453    141 KBytes
     [  4]   1.00-2.00   sec   112 MBytes   937 Mbits/sec  428    156 KBytes
     [  4]   2.00-3.00   sec   110 MBytes   927 Mbits/sec  510   76.4 KBytes
     [  4]   3.00-4.00   sec   113 MBytes   944 Mbits/sec  291    165 KBytes
     [  4]   4.00-5.00   sec   111 MBytes   934 Mbits/sec  356    110 KBytes
     [  4]   5.00-6.00   sec  85.7 MBytes   719 Mbits/sec  293    178 KBytes
     [  4]   6.00-7.00   sec   112 MBytes   939 Mbits/sec  286    188 KBytes
     [  4]   7.00-8.00   sec   110 MBytes   923 Mbits/sec  519    137 KBytes
     [  4]   8.00-9.00   sec   113 MBytes   945 Mbits/sec  406    192 KBytes
     [  4]   9.00-10.00  sec   112 MBytes   937 Mbits/sec  386    134 KBytes
     [  4]  10.00-11.00  sec   113 MBytes   945 Mbits/sec  481    146 KBytes
     . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
     [  4] 290.00-291.00 sec   108 MBytes   902 Mbits/sec  430    218 KBytes
     [  4] 291.00-292.00 sec   111 MBytes   933 Mbits/sec  574    154 KBytes
     [  4] 292.00-293.00 sec   112 MBytes   944 Mbits/sec  370    148 KBytes
     [  4] 293.00-294.00 sec   112 MBytes   944 Mbits/sec  366    174 KBytes
     [  4] 294.00-295.00 sec   110 MBytes   923 Mbits/sec  181    290 KBytes
     [  4] 295.00-296.00 sec   111 MBytes   933 Mbits/sec  362    157 KBytes
     [  4] 296.00-297.00 sec   109 MBytes   912 Mbits/sec  432    109 KBytes
     [  4] 297.00-298.00 sec   112 MBytes   944 Mbits/sec  229    132 KBytes
     [  4] 298.00-299.00 sec   108 MBytes   902 Mbits/sec  540   36.8 KBytes
     [  4] 299.00-300.00 sec  45.0 MBytes   377 Mbits/sec  327   74.9 KBytes
     - - - - - - - - - - - - - - - - - - - - - - - - -
     [ ID] Interval           Transfer     Bandwidth       Retr
     [  4]   0.00-300.00 sec  30.8 GBytes   883 Mbits/sec  117004          sender
     [  4]   0.00-300.00 sec  30.8 GBytes   883 Mbits/sec                  receiver

     iperf Done.
     net.ipv4.tcp_sack = 1
     4341187451 segments send out 5637905 segments retransmited
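The minimum, average and maximum columns of the results table come from the per-second interval lines of a statistics file like this one. The script extracts them with three separate pipelines; the sketch below is an alternative that does the same job in a single awk pass, using a few sample lines copied from the file above. It is an illustration, not the script's own code.

```shell
#!/bin/bash
# Sketch: pull minimum, average and maximum bandwidth out of iperf3 interval
# lines in one awk pass. The sample lines are copied from the statistics file
# shown above; a real run would read the whole file instead.
sample=$(cat <<'EOF'
[ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
[  4]   0.00-1.00   sec  85.8 MBytes   719 Mbits/sec  453    141 KBytes
[  4]   1.00-2.00   sec   112 MBytes   937 Mbits/sec  428    156 KBytes
[  4]   2.00-3.00   sec   110 MBytes   927 Mbits/sec  510   76.4 KBytes
[  4]   0.00-300.00 sec  30.8 GBytes   883 Mbits/sec  117004          sender
EOF
)

# Keep only the interval lines (they contain "sec"), drop the sender and
# receiver totals, then track min, max and a running sum of field 7.
summary=$(echo "$sample" | grep sec | grep -v -e sender -e receiver | awk '
    NR == 1 || $7 < min { min = $7 }
    NR == 1 || $7 > max { max = $7 }
    { sum += $7; n++ }
    END { print min, sum / n, max }')
echo "$summary"   # prints: 719 861 937
```

Field 7 is the bandwidth because the "[  4]" prefix splits into two whitespace-separated fields.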
The files left in the working directory
     # ls -t team*
     team0-2017_09_29_19_10-measure-retrans-effect-summary
     team0-2017_09_29_19_10-sack_0-Loss-Rate_10-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_10-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_9-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_9-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_8-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_8-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_7-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_7-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_6-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_6-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_5-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_5-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_4-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_4-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_3-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_3-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_2-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_2-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_1-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_1-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.5-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.5-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.2-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.2-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.1-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.1-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.05-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.05-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.04-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.04-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.03-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.03-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.02-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.02-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.01-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.01-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.009-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.009-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.008-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.008-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.007-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.007-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.006-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.006-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.005-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.005-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.004-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.004-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.003-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.003-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.002-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.002-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.001-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.001-
     team0-2017_09_29_19_10-sack_0-Loss-Rate_0.000001-
     team0-2017_09_29_19_10-sack_1-Loss-Rate_0.000001-
Example 2
This shows what the output looks like if the device already had traffic control settings in place. Note that there is no "RTNETLINK answers: No such file or directory" message, and the DEVICE-YYYY_MM_DD_HH_MM-sack_1-Loss-Rate_0.000001- file has qdisc and (in this case, but not always) class and filter information before the qdisc netem line.
     # ./measure_retrans_effect.sh 172.16.1.200
     Loss Rate 0.000001 SACK = 1
     Loss Rate 0.001 SACK = 0
     Loss Rate 0.002 SACK = 1
     . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
     #
     # cat team0-2017_09_30_06_18-sack_1-Loss-Rate_0.000001-
     4383866346 segments send out 6649671 segments retransmited
     net.ipv4.tcp_sack = 1
     qdisc htb 1: root refcnt 17 r2q 10 default 0 direct_packets_stat 0
     class htb 1:1 root prio 0 rate 496bit ceil 496bit burst 1599b cburst 1599b
     filter parent 1: protocol ip pref 1 u32
     filter parent 1: protocol ip pref 1 u32 fh 800: ht divisor 1
     filter parent 1: protocol ip pref 1 u32 fh 800::800 order 2048 key ht 800 bkt 0
       match ac100132/ffffffff at 16
     qdisc netem 8190: root refcnt 17 limit 1000 loss 1.00117e-06%
     Connecting to host 172.16.1.200, port 5201
     [  4] local 172.16.1.207 port 58248 connected to 172.16.1.200 port 5201
     [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
     [  4]   0.00-1.00   sec   114 MBytes   954 Mbits/sec    0    392 KBytes
     . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
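Since the script deletes whatever traffic control settings it finds, it can help to pull the original settings back out of the first statistics file once the run is over. The sketch below prints every qdisc, class and filter line that appears before the first "qdisc netem" line. The function name preexisting_tc is hypothetical, the sample input is abbreviated from the Example 2 listing, and the approach assumes the original root qdisc was not itself a netem qdisc.

```shell
#!/bin/bash
# Sketch: list traffic control settings recorded before the first netem line
# of a statistics file. Only meaningful for the first file of a run
# (sack_1, Loss-Rate_0.000001). preexisting_tc is a hypothetical helper.
preexisting_tc() {
    awk '/^qdisc netem/ { exit }
         /^(qdisc|class|filter) / { print }' "$1"
}

# Abbreviated sample input taken from the Example 2 listing.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
4383866346 segments send out 6649671 segments retransmited
net.ipv4.tcp_sack = 1
qdisc htb 1: root refcnt 17 r2q 10 default 0 direct_packets_stat 0
class htb 1:1 root prio 0 rate 496bit ceil 496bit burst 1599b cburst 1599b
filter parent 1: protocol ip pref 1 u32
qdisc netem 8190: root refcnt 17 limit 1000 loss 1.00117e-06%
EOF

found=$(preexisting_tc "$tmpfile")
rm -f "$tmpfile"

if [ -n "$found" ]; then
    echo "settings to restore after the run:"
    echo "$found"
fi
```

An empty result corresponds to the Example 1 case, where nothing was configured and the script reported the RTNETLINK message instead.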
#!/bin/bash
# measure-retrans-effect.sh begins on the previous line
#
# This script uses the netem enhancement of the Linux traffic control
# facilities to create a random packet loss and iperf3 to generate a
# stream of bytes and measures the throughput with that loss. It uses
# the statistics from netstat to measure the actual retransmission rate.
# The netem packet loss ranges from 0.000001% to 10% and each iperf3 run
# is 300 seconds. It runs each test twice, once with TCP selective
# acknowledgement (SACK) enabled and once with it disabled. There is also a 60
# second delay between runs so this script will take slightly less than 6
# hours to run (12 minutes per loss rate setting * 28 packet loss settings)
#
# There is 1 line of output for every test run so you can keep track of the
# script's progress. The line has the format
#      Loss Rate X SACK = Y
# where X is the current netem loss setting and Y is either 1 for SACK
# enabled or 0 for SACK disabled.
#
# The output from each test run is saved in "statistics" files in the current
# working directory and named
#      DEVICE-YYYY_MM_DD_HH_MM-sack_X-Loss-Rate_Y-.
# The time stamp is based on the start of the script. These files are not
# deleted at the end of the script so that the raw data is available for
# inspection.
#
# Following the progress lines there is a
# table detailing the results. The table has the columns
#      X Y Measured-retransmission-rate minimum average maximum
# Where X is 1 for SACK enabled or 0 for SACK disabled
#       Y is the netem loss setting
#       Measured-retransmission-rate is the actual measured retransmission
#           rate calculated from before and after netstat statistics
#
#       After_segments_retransmitted - Before_segments_retransmitted
#       ------------------------------------------------------------ * 100
#            After_segments_send_out - Before_segments_send_out
#
#       minimum, average and maximum are the minimum, average and maximum
#           megabits per second values reported by iperf3 where iperf3
#           reports a summary once per second.
# The file DEVICE-YYYY_MM_DD_HH_MM-measure-retrans-effect-summary, also found
# in the current working directory, holds the lines used to create the table
#
# Requirements
#   1. Script must be run as root
#   2. iperf3 must be installed
#   3. There should be as little other TCP activity as possible running when the
#      script is running. The netstat statistics are system global so packets
#      sent out and retransmissions that are not related to the test will throw
#      the statistics off
#
# NOTES:
#   One of the first things this script does is delete any qdisc traffic control
#   settings. If none are set you will get the error
#        RTNETLINK answers: No such file or directory
#   which can be safely ignored. If you do not get the error it means that
#   something else was set. This setting has now been deleted and will need
#   to be reset after the script completes. See the
#   DEVICE-YYYY_MM_DD_HH_MM-sack_1-Loss-Rate_0.000001- file to see what it was.
#
#   If this script is interrupted BE SURE to run the command
#        tc qdisc del dev {DEVICE} root
#   to clear the netem settings that have most likely not been cleared.
#
#
# Version 1.0 September 29, 2017
MEASURERETRANSEFFECTVERSION="1.0_2017-09-29"
#
# Copyright (C) 2017 Noah Davids
# This program is free software: you can redistribute it and/or modify it
# under the terms of the GNU General Public License as published by the Free
# Software Foundation, version 3, https://www.gnu.org/licenses/gpl-3.0.html
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.

if [ $# -ne 1 ]
then
    echo "Usage:"
    echo "   measure-retrans-effect.sh TARGET-IP"
    exit 1
fi

# Use the "ip route get" command to determine which device will be used to
# send packets. Replace any dash characters in the device name with tilde
# characters. The dash is used as a separator in the constructed file names
# and things will break if there is a dash in the name.
SERVER=$1
DEVICE=$(ip route get $SERVER | \
         awk '{ for (x=1; x<=NF; x++) if ($x=="dev") print $(x+1) }')
DEVNODASH=$(echo $DEVICE | tr "-" "~")

# Get a current date-time stamp to include in the file names
NOW=$(date +%Y_%m_%d_%H_%M)

# for a lot of different loss rates
for x in 0.000001 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009 \
         0.01 0.02 0.03 0.04 0.05 0.1 0.2 0.5 1 2 3 4 5 6 7 8 9 10
do
    # for SACK enabled (1) and disabled (0)
    for y in 1 0
    do
        # this goes to standard out so we have some idea of the progress
        # of the test
        echo Loss Rate $x SACK = $y

        # I could not get the block to work without this, not sure why
        for q in 1
        do
            # extract out the segments out and retransmitted segment counter
            # before the test starts. Put both values on 1 line
            echo $(netstat -s | \
                   grep -E "segments send out|segments retransmited" | \
                   tr "\n" " ")

            # set the SACK value
            sysctl net.ipv4.tcp_sack=$y

            # show the current traffic control settings. If something is odd
            # we may need this for debugging or to restore the initial settings
            tc qdisc show dev $DEVICE
            tc class show dev $DEVICE
            tc filter show dev $DEVICE

            # delete any traffic control settings on the device, this will
            # clear class and filter too
            tc qdisc del dev $DEVICE root

            # set the traffic control loss setting
            tc qdisc add dev $DEVICE root netem loss random $x%

            # show the new settings, again may be needed for debugging
            tc qdisc show dev $DEVICE
            tc class show dev $DEVICE
            tc filter show dev $DEVICE

            # now run the iperf test. Output is in megabits (-f m), the
            # reporting interval is 1 second (-i 1). This is the default but I
            # wanted it explicit and run the test for 300 seconds (-t 300) so
            # we get a good average behavior even for the very low loss rates.
            iperf3 -f m -i 1 -t 300 -c $SERVER

            # show the SACK value -- debugging aid -- again
            sysctl net.ipv4.tcp_sack

            # extract out the segments out and retransmitted segment counter
            # after the test ends. Put both values on 1 line
            echo $(netstat -s | \
                   grep -E "segments send out|segments retransmited" | \
                   tr "\n" " ")

        # write everything to a statistics file
        done > $DEVNODASH-$NOW-sack_$y-Loss-Rate_$x-

        # go to sleep for a minute to try to let things go back to normal
        # before running another test
        sleep 60
    done
done

# create the summary file name
OUTPUT=$DEVICE-$NOW-measure-retrans-effect-summary

# output a header -- start by putting everything in a temporary file so
# that it can be "columnized". Note that the grep filters further down keep
# only lines starting with 0 or 1, so this header line does not reach the
# final summary file.
echo "SACK Loss-Rate retrans-% Min Average Max" > /tmp/$OUTPUT

# For each of the statistics files
for x in $(ls $DEVNODASH-$NOW-sack_*-Loss-Rate_*)
do
    # extract out the SACK and Loss-Rate values from the name, then calculate
    # the percentage increase in the change of retransmitted segments versus
    # segments out. Then for each of the one second statistics lines extract
    # out the Bandwidth value and find the minimum, the average of all the
    # bandwidth values and the maximum value
    echo $(echo $x | tr "\-_" " " | awk '{print $8 " " $NF}') \
         $(cat $x | grep segments | awk '{print $1 " " $5}' | tr "\n" " " | \
               awk '{print ($4-$2)/($3-$1) * 100}') \
         $(cat $x | grep sec | grep -v sender | grep -v receiver | \
               awk '{print $7}' | sort -n | head -1) \
         $(cat $x | grep sec | grep -v sender | grep -v receiver | \
               awk '{sum += $7; n++} END { print sum / n;}') \
         $(cat $x | grep sec | grep -v sender | grep -v receiver | \
               awk '{print $7}' | sort -n | tail -1)
done | sort -nk2 >> /tmp/$OUTPUT

# group the output by SACK settings and run the output through column to make
# it pretty and save it in the current dir along with everything else.
(grep ^0 /tmp/$OUTPUT; grep ^1 /tmp/$OUTPUT) | column -t > $OUTPUT

# output the summary file
cat $OUTPUT

# clean up the temporary file
rm -f /tmp/$OUTPUT

# clean up the last traffic control setting
tc qdisc del dev $DEVICE root
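The notes in the script ask you to run "tc qdisc del dev {DEVICE} root" by hand if the script is interrupted. One way to automate that, sketched here and not part of the original script, is a shell trap. The TC_CMD variable and the default DEVICE value are assumptions made for this illustration; TC_CMD defaults to a dry run that only prints the command, so the sketch can be tried without root.

```shell
#!/bin/bash
# Sketch only, not part of the original script. TC_CMD and the default
# DEVICE value are assumptions; TC_CMD defaults to a dry run that only
# prints the command. Set TC_CMD=tc (as root) to really delete the qdisc.
DEVICE=${DEVICE:-team0}
TC_CMD=${TC_CMD:-echo tc}

cleanup() {
    # The same command the script's notes say to run after an interruption.
    $TC_CMD qdisc del dev "$DEVICE" root
}

# Run cleanup whenever the shell exits, and turn Ctrl-C or a TERM signal
# into a normal exit so the EXIT trap fires in those cases too.
trap cleanup EXIT
trap 'exit 1' INT TERM

echo "test loop would run here"
```

With a trap like this near the top of the script, the final "tc qdisc del" line at the end would become redundant, since the EXIT trap also fires on normal completion.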