esnet / iperf

iperf3: A TCP, UDP, and SCTP network bandwidth measurement tool

Extremely high packet loss with UDP test

clemensg opened this issue

Hi,

iperf2 does not report much packet loss when receiving UDP traffic on an i.MX6 quad-core processor with the fec Ethernet driver on Linux (4.2-rc7):

10 Mbit/s: 0% packet loss
100 Mbit/s: 0.31% packet loss

iperf3 on the other hand:

10 Mbit/s: 93% packet loss
100 Mbit/s: 99% packet loss

These values can't be right, because both the iperf2 and iperf3 TCP RX tests show a maximum bandwidth of over 200 Mbit/s. The problem only occurs in the UDP RX test with iperf3.

This happens with iperf 3.1b3 and 3.0.7 and looks like a serious bug in the iperf3 UDP test. Any idea what's going on?

Cheers,
Clemens

Frequently this (lossy or slow UDP tests) can be the result of using the default socket buffer size. Try using the -w option to set a larger socket buffer size.

I did not change the socket buffer size. On the same system, with the same socket buffer size, iperf2 works fine, but iperf3 shows a huge amount of packet loss, so isn’t it more likely that there is a bug in the UDP tests of iperf3? Shouldn’t iperf3 with default parameters work at least as well and as accurately as iperf2 did?

Can you reproduce this issue? If not, please try on a multi-core ARM-based system, for example a Raspberry Pi 2 or something similar. I was using a Freescale i.MX6Q.

On 04 Sep 2015, at 18:36, Bruce A. Mah notifications@github.com wrote:

Frequently this (lossy or slow UDP tests) can be the result of using the default socket buffer size. Try using the -w option to set a larger socket buffer size.



I don't have any multi-core ARM systems available, and these are not platforms that we officially support.

Please try increasing the socket buffer size. I believe that iperf2 and iperf3 use different defaults.
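
For reference, a minimal sketch of that suggestion (the server address and buffer sizes here are placeholders, not values from this thread):

iperf3 -s                               # server side
iperf3 -c <server> -u -b 100M -w 2M     # client: UDP at 100 Mbit/s, requesting a 2 MB socket buffer
iperf -c <server> -u -b 100M -w 2M      # iperf2 equivalent for comparison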

Hi Bruce, I can confirm what Clemens sees. Actually it doesn't matter whether I use an i.MX6, an OMAP, or an x86 machine. Even with a target bandwidth of a few Mbit/s, I get packet loss of at least 3-4% and up to 25%. Unfortunately, setting the window/buffer size doesn't change that (I tried 500K, 1M, 2M). I can observe the behaviour even when running server and client on the same machine.
What does change the behaviour is setting the bandwidth to unlimited (-b 0). Rates go way up, e.g. 16 Gbit/s on an i7, and loss is 0.1%, which is roughly what I expected.
As I otherwise much prefer iperf3 over iperf2, is there anything that can be done to find the cause of the issue?
cheers,
Andreas

I observe the same issue using two devices connected to a switch (consumer grade, not professional), where the sending device is connected through 1000BASE-T and the server device through 100BASE-TX, with the bandwidth set to anything higher than 10M. Disabling autonegotiation on the client side and forcing 100 Mbit/s full duplex results in no packet loss (or negligible loss), even with a bandwidth like 90M.

Is it possible that iperf3 on a fast machine sends most of the data at the beginning of each interval, causing a buffer overflow on the switch? I think iperf2 delays its writes so that data is sent more evenly across the time frame.

EDIT: The sending machine has an i7 on board, and I tested on both Windows and Linux.

I have run into an issue just like this. On the same network connection, using UDP, iperf2 is fine, but iperf3 showed very high packet loss, above 95%. After I added the -l option in iperf3, it worked well. For iperf2, no -l option was needed.

Confirmed. The UDP test report on the server side seems wrong.
Host A sends a UDP stream of 1000 packets, and host B (the server) reports that it lost 500 packets. Testing with tcpdump/Wireshark: on A's sending interface the packet analyzer counted 1000 packets, and on host B's receiving interface the packet analyzer counted 1000 packets. There is something wrong with the math inside the server code.
One more thing: if host A and host B are on the same subnet, there is no "packet loss" in the iperf3 UDP test. If there is any router between A and B, there is random packet loss, regardless of whether the tested bandwidth is 20 Mbps or 1 Gbps.

It looks like having fq_codel as the qdisc does improve the situation. Can you confirm that setting tc qdisc add dev $ETH root fq_codel reduces the UDP packet loss for you too?
But in my opinion this should work with the default pfifo_fast qdisc as well; not everybody can change the queueing discipline, and it is very confusing to get such results with the default qdisc settings.
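
For anyone who wants to try this quickly and revert afterwards, a sketch (eth0 is a placeholder interface name):

tc qdisc show dev eth0                 # note the current qdisc
tc qdisc add dev eth0 root fq_codel    # switch to fq_codel for the test
# ... run the iperf3 UDP test ...
tc qdisc del dev eth0 root             # restore the previous default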

I really like the idea from @folsson (in #386) to reduce the interval of the throttling algorithm to 1ms. At the moment it is still at 100ms: https://github.com/esnet/iperf/blob/master/src/iperf_api.c#L1203

By the way, thanks to @mpitt for the IO graphs in #386! 👍

I cannot confirm that fq_codel provides an improvement. However, using plain tc qdisc add dev $ETH root fq allows me to raise the bandwidth considerably (without packet loss), but only with version 3.1.3.
Also, using multiple parallel streams (the -P option) helped me get more reliable measurements and slightly higher throughput. I'm doing UDP RX on an i.MX6 Solo though, and at ~200 Mbit the CPU load is ~90%. What I do not understand is why TCP is able to run about twice as fast.
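
A sketch of that combination (interface and server names are placeholders; if -b is applied per stream, as I believe it is in iperf3, four streams at 50M target roughly 200 Mbit/s aggregate):

tc qdisc add dev eth0 root fq        # plain fq instead of pfifo_fast
iperf3 -c <server> -u -b 50M -P 4    # four parallel UDP streams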

Whilst I haven't seen this issue with iperf 3, it amounts to a ground-up rewrite compared to iperf 2. The UDP code in iperf 2 is normally serviced by an independent thread. This can have an effect on how responsive the pacing of packet generation is; iperf 2 calculated its inter-packet times using floating point. I noticed bwping underfilling the pipe in each time window; it was using select() and integer math only to compute delays/mean packets to send. I suspect some of the underlying variables have changed in iperf 3 given the rewrite. In experimental setups with iperf 2, I've never used anything other than a first-come, first-served queueing discipline.

Pages 197-199 of my PhD thesis https://hdl.handle.net/10023/8681 describe some of these differences. It is known that IP multipath probably requires a different approach from what iperf has now for accurate measurements, but that's really out of the scope of this specific issue. However, if the packet sourcing is bursty, that text sheds light on EXACTLY where to look. The most annoying thing about iperf 2 is how it is written in a broad C++ style, trying to treat UDP sockets like TCP ones as regards binding.

@legraps Maybe TCP is able to run twice as fast because using the TCP sliding window protocol and Window Scale Option reduces RX FIFO overflows in the i.MX6 Ethernet MAC. There is an erratum (ERR004512) mentioning RX FIFO overflows. See http://cache.nxp.com/files/32bit/doc/errata/IMX6DQCE.pdf

@clemensg I'm aware of that erratum and have actually observed that using switches without PAUSE frame support leads to much worse throughput and loss in both TCP and UDP traffic. So apparently Ethernet flow control (which works independently of the layer 3 protocol) improves the situation (for the i.MX6).

But I still wonder why the TCP test appears to be less CPU intensive than simple UDP. Is counting/verifying the UDP packets in user space more expensive than the kernel's TCP implementation?

@legraps Yes, I observed the same thing. From now on we only buy switches with support for IEEE 802.3x flow control. The pause frames reduce the RX FIFO overflows, but this is merely treating the symptoms. I am not entirely sure whether the real cause is in the ENET MAC IP from Freescale, in the fec driver on Linux, or both.
At least when using the right switches, the real-world impact is negligible.

Hm, good question: maybe there are fewer context switches when using TCP, due to the in-kernel implementation of TCP?

If you run ethtool -S eth0 before and after the UDP test, does the IEEE_rx_macerr counter increase, and by how much? It's probably worse than with TCP. Maybe this error counter leads to other driver behaviour, resets, ...? But I still don't fully understand the fec driver in that regard: http://lxr.free-electrons.com/source/drivers/net/ethernet/freescale/fec_main.c
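
A minimal sketch of that check (eth0 and the server address are placeholders):

ethtool -S eth0 | grep IEEE_rx_macerr   # record the counter before the test
iperf3 -c <server> -u -b 100M           # run the UDP test
ethtool -S eth0 | grep IEEE_rx_macerr   # the difference is the number of overflows during the test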

Seeing this also: iperf3 version 3.0.11, Ubuntu 14.04 + update/upgrade.

Here is an example; note the 50%+ UDP datagram loss:

-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.0.12, port 58304
[  5] local 10.5.8.20 port 5201 connected to 192.168.0.12 port 38930
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  6.57 MBytes  55.1 Mbits/sec  0.116 ms  0/841 (0%)
[  5]   1.00-2.00   sec  3.93 MBytes  33.0 Mbits/sec  0.152 ms  476/979 (49%)
[  5]   2.00-3.00   sec  3.38 MBytes  28.4 Mbits/sec  0.294 ms  567/1000 (57%)
[  5]   3.00-4.00   sec  3.39 MBytes  28.4 Mbits/sec  0.197 ms  567/1001 (57%)
[  5]   4.00-5.00   sec  3.51 MBytes  29.4 Mbits/sec  0.148 ms  553/1002 (55%)
[  5]   5.00-6.00   sec  3.64 MBytes  30.5 Mbits/sec  0.146 ms  534/1000 (53%)
[  5]   6.00-7.00   sec  3.48 MBytes  29.2 Mbits/sec  0.171 ms  555/1000 (56%)
[  5]   7.00-8.00   sec  3.38 MBytes  28.4 Mbits/sec  0.166 ms  561/994 (56%)
[  5]   8.00-9.00   sec  3.80 MBytes  31.9 Mbits/sec  0.113 ms  519/1006 (52%)
[  5]   9.00-10.00  sec  3.71 MBytes  31.1 Mbits/sec  0.198 ms  517/992 (52%)
[  5]  10.00-10.25  sec   624 KBytes  20.7 Mbits/sec  0.174 ms  108/186 (58%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Jitter    Lost/Total Datagrams
[  5]   0.00-10.25  sec  78.1 MBytes  64.0 Mbits/sec  0.174 ms  4957/10001 (50%)
-----------------------------------------------------------

Here is a session between the same two hosts, same network, same everything, but the TCP results are solid, while the UDP results show insane datagram loss.

-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.0.12, port 58136
[  5] local 10.5.8.20 port 5201 connected to 192.168.0.12 port 58138
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec  6.86 MBytes  57.6 Mbits/sec
[  5]   1.00-2.00   sec  6.88 MBytes  57.7 Mbits/sec
[  5]   2.00-3.00   sec  6.84 MBytes  57.4 Mbits/sec
[  5]   3.00-4.00   sec  6.94 MBytes  58.2 Mbits/sec
[  5]   4.00-5.00   sec  7.07 MBytes  59.3 Mbits/sec
[  5]   5.00-6.00   sec  7.03 MBytes  59.0 Mbits/sec
[  5]   6.00-7.00   sec  6.94 MBytes  58.2 Mbits/sec
[  5]   7.00-8.00   sec  7.06 MBytes  59.3 Mbits/sec
[  5]   8.00-9.00   sec  7.10 MBytes  59.6 Mbits/sec
[  5]   9.00-10.00  sec  7.05 MBytes  59.1 Mbits/sec
[  5]  10.00-10.15  sec  1.08 MBytes  59.0 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth       Retr
[  5]   0.00-10.15  sec  73.9 MBytes  61.1 Mbits/sec   43             sender
[  5]   0.00-10.15  sec  70.9 MBytes  58.5 Mbits/sec                  receiver
-----------------------------------------------------------

Lots of very good analysis here:

http://serverfault.com/questions/691723/extreme-udp-packet-loss-at-300mbit-14-but-tcp-800mbit-w-o-retransmits

Looks like we have to set both the read/write buffer length (-l) and the bandwidth target:

ex: -l 8192 -b 1G

Trying this did improve my UDP performance quite a bit. The other option that might help is the --zero-copy option, which I have not tried yet.
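
For reference, the full client invocation would look something like this (the server address is a placeholder):

iperf3 -c <server> -u -l 8192 -b 1G     # 8 KB datagrams, 1 Gbit/s target
# optionally add --zero-copy (-Z) to see whether sender-side CPU load drops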

I am also facing the same issue. I have tried all the options, including -l, -Z, and -A, but all give the same result: huge packet loss. In one direction the packet loss is around 30% and in the other direction it is around 97%. I am really confused about what is wrong here. The same scenario works fine with iperf2.
The test is running between two Ubuntu 14.04 VMs which have 4 vCPUs and 8 GB RAM.

Hi all,
I am having the same issue. I am testing a new RGW with UDP and it's giving me all kinds of losses. From some reading, it says to set the UDP buffer size. Has anyone tried that? This is driving me crazy as the deadline is nearing, and I can't keep switching between iperf2 and iperf3 since the results are so different. Please help!

This seems to be related to the iperf3 design somehow. We see this consistently with perfSONAR, and recommend using nuttcp for UDP tests instead.

@bltierney Hey there, thank you so much, that worked! nuttcp can be installed as a package on Linux, and yes, it works very well. ESnet uses it, so it's pretty cool. @purumishra - I think you should use it as well for UDP testing. Remember to use -v for verbose output, as that will also show you the buffer length.

link - http://nuttcp.net/nuttcp/5.1.3/examples.txt

I have another question: how do I check for jitter in nuttcp?

We note that iperf 3.1.5 and newer contain a fix for UDP sending defaults (the old defaults resulted in too-large packets needing to be fragmented at the IP layer). That can account for some of the problems seen in this thread. Closing for now; please re-open or file a new issue if this problem persists.
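
(For context on the fragmentation point: with a standard 1500-byte Ethernet MTU, the largest UDP payload that avoids IP fragmentation is 1500 - 20 bytes of IPv4 header - 8 bytes of UDP header = 1472 bytes, so any default datagram size above that gets fragmented.)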

This one hit me pretty hard. Massive packet loss over UDP (over the internet) with iperf3. 0 packet loss with iperf2.

I'd like to test that fix in 3.1.5. Unfortunately all the builds on the site stop at 3.1.3.

I can build it on Mac but I'm not set up to build it on Windows. Can someone provide a Windows build of 3.1.5 or later?

Dragonfax, because iperf3 seems to send ALL the packets every 1/10 of a second, it depends on the transmit speed of the sender/server.
If the server has a 10 Gbit/s connection, it might burst a large number of packets to the destination, even if the destination is traffic-shaped to, for example, 100 Mbit/s and you set the -b100M parameter.
Then the destination network might not be able to handle those bursts and just fills its buffers, giving your connection a congestion problem.
Even though the amount of data is no more than 100 Mbit/s calculated per second, it might be too many packets sent within each 1/10 of a second for the traffic shaping.

Note that iperf 3.2 (which is the current version) doesn't do these massive bursts every 0.1 second anymore... the default is to send packets on 1 ms boundaries (which should be somewhat less bursty), but you can also tune the granularity of the timer with the --pacing-timer option.
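
For example, a quick sketch of tuning that knob (the server address is a placeholder; --pacing-timer takes its interval in microseconds):

iperf3 -c <server> -u -b 100M --pacing-timer 1000      # 1 ms pacing, the iperf 3.2 default
iperf3 -c <server> -u -b 100M --pacing-timer 100000    # 100 ms pacing, roughly the old bursty behaviour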

The place to ask this question (which is not a bad one BTW) is probably on the iperf-dev or iperf-users lists. (I don't do Windows, but I had the impression it's a pretty straightforward build on Cygwin.)

Try these parameters.
iperf3 -c ping.online.net -u -w10000 -l1472 -b10M -t4 -i.5 -p5203 -R

-w10000 (10 KB socket buffer size)
-l1472 (1472 bytes is the max payload size on an Ethernet WAN)
-b10M (10 Mbit/s bandwidth test)
-t4 (test for 4 seconds)
-i.5 (to get updates every half second)
-p5203 (ports 5201-5208 are available on this host)
-R (download test, from the server to you)

A problem with traffic shaping seems to occur with smaller packets (e.g. -l100),
because there will be 10-15x more packets for the same bandwidth.

Tested UDP packet loss on Windows with 3.1.3 and it is still broken. Packet loss is not reported correctly; it is way too high.
Oh, and TCP also shows way too many retries in the summary (I suppose that Retr means how many times a TCP packet had to be resent).

EDIT: Oh, I read through some comments on iperf and it seems that this tool has never worked correctly in various regards. For example, this comment shows other problems (https://arstechnica.com/civis/viewtopic.php?t=1113215). So the best thing to do is to avoid this unreliable software altogether.

@DennisEdlund Tried what you suggested. Doesn't work. 80-95% packet loss.

We note that iperf 3.1.5 and newer contain a fix for UDP sending defaults (the old defaults resulted in too-large packets needing to be fragmented at the IP layer). That can account for some of the problems seen in this thread. Closing for now; please re-open or file a new issue if this problem persists.

Hi @bmah888,

Is there an option for tuning this UDP sending packet size?
The default value may not always be small enough.

@wangyu- : Try -l (or --length) to set the send size (for a UDP test, this sets the UDP payload size for each datagram).
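
For example (the server address is a placeholder; 1472 bytes is the largest payload that avoids fragmentation on a plain 1500-byte Ethernet MTU, as worked out earlier in the thread):

iperf3 -c <server> -u -b 100M -l 1472   # largest unfragmented datagram on a 1500-byte MTU
iperf3 -c <server> -u -b 100M -l 1200   # smaller value leaving headroom for tunnels/VLANs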

If you have other questions, it's probably best to post in the iperf-users@sourceforge.net mailing list, rather than adding a comment to a closed issue.

@bmah888

Thank you very much.

If you have other questions, it's probably best to post in the iperf-users@sourceforge.net mailing list, rather than adding a comment to a closed issue.

Got it.

I encounter the same issue, also with the i.MX6. But running iperf2 on the exact same HW configuration shows much lower loss. So I am not sure: is it real Ethernet packet loss on the i.MX6, or is it a bug in iperf3? @clemensg, is it an i.MX6 issue?

Hi @ranshalit, there is a problem in the i.MX6 FEC that results in FIFO overflows. This can be mitigated by using a switch with IEEE 802.3x flow control enabled.
You should also take a look at your device tree and pin muxing to see whether you can enable the workaround for ERR006687 (MX6QDL_PAD_GPIO_6__ENET_IRQ 0x000b1 and interrupts-extended = <&gpio1 6 IRQ_TYPE_LEVEL_HIGH>, <&intc 0 119 IRQ_TYPE_LEVEL_HIGH>;); however, this requires that pad T3 (GPIO06) is not connected to anything!
Besides that, you should also use the fq queueing discipline instead of pfifo_fast or fq_codel and, if available, tcp_bbr as the congestion avoidance algorithm.
(tc qdisc add dev eth0 root fq and sysctl net.ipv4.tcp_congestion_control=bbr)
What's also important is that you are on a current Linux kernel, because many bugs have been fixed in recent releases. So the closer to mainline, the better. Also, bbr is not available before 4.9.
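
Spelled out as a quick sketch (eth0 is a placeholder; bbr only affects the TCP tests and may need modprobe tcp_bbr first if it is built as a module):

tc qdisc show dev eth0                              # see the current qdisc
tc qdisc replace dev eth0 root fq                   # switch to fq
sysctl net.ipv4.tcp_available_congestion_control    # verify that bbr is available (kernel >= 4.9)
sysctl -w net.ipv4.tcp_congestion_control=bbr       # use BBR for the TCP tests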

I am using iperf 3.6 now on both server and client and I still see around 50-60% packet loss, as below:

iperf3 -c 11.1.201.2 -R -P 4 -u -b 0 -p 52014 -l 1440

connected to kernel driver /dev/iperf0
Connecting to host 11.1.201.2, port 52014
Reverse mode, remote host 11.1.201.2 is sending
[ 6] local 11.14.239.22 port 32934 connected to 11.1.201.2 port 52014
[ 8] local 11.14.239.22 port 41636 connected to 11.1.201.2 port 52014
[ 10] local 11.14.239.22 port 45900 connected to 11.1.201.2 port 52014
[ 12] local 11.14.239.22 port 51535 connected to 11.1.201.2 port 52014
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 6] 0.00-1.00 sec 30.8 MBytes 258 Mbits/sec 0.052 ms 18024/40429 (45%)
[ 8] 0.00-1.00 sec 30.8 MBytes 258 Mbits/sec 0.067 ms 18025/40424 (45%)
[ 10] 0.00-1.00 sec 30.8 MBytes 258 Mbits/sec 0.067 ms 18021/40424 (45%)
[ 12] 0.00-1.00 sec 30.8 MBytes 258 Mbits/sec 0.052 ms 18032/40428 (45%)
[SUM] 0.00-1.00 sec 123 MBytes 1.03 Gbits/sec 0.059 ms 72102/161705 (45%)


[ 6] 1.00-2.00 sec 30.5 MBytes 256 Mbits/sec 0.091 ms 43784/65986 (66%)
[ 8] 1.00-2.00 sec 30.6 MBytes 257 Mbits/sec 0.094 ms 43707/65986 (66%)
[ 10] 1.00-2.00 sec 30.6 MBytes 257 Mbits/sec 0.101 ms 43694/65985 (66%)
[ 12] 1.00-2.00 sec 30.5 MBytes 256 Mbits/sec 0.090 ms 43747/65986 (66%)
[SUM] 1.00-2.00 sec 122 MBytes 1.03 Gbits/sec 0.094 ms 174932/263943 (66%)


[ 6] 2.00-3.00 sec 30.6 MBytes 256 Mbits/sec 0.108 ms 44898/67146 (67%)
[ 8] 2.00-3.00 sec 30.6 MBytes 257 Mbits/sec 0.094 ms 44854/67150 (67%)
[ 10] 2.00-3.00 sec 30.6 MBytes 257 Mbits/sec 0.095 ms 44871/67151 (67%)
[ 12] 2.00-3.00 sec 30.6 MBytes 256 Mbits/sec 0.101 ms 44891/67147 (67%)
[SUM] 2.00-3.00 sec 122 MBytes 1.03 Gbits/sec 0.099 ms 179514/268594 (67%)


[ 6] 3.00-4.00 sec 30.5 MBytes 256 Mbits/sec 0.048 ms 44826/67013 (67%)
[ 8] 3.00-4.00 sec 30.3 MBytes 254 Mbits/sec 0.041 ms 44984/67013 (67%)
[ 10] 3.00-4.00 sec 30.3 MBytes 254 Mbits/sec 0.044 ms 44982/67013 (67%)
[ 12] 3.00-4.00 sec 30.5 MBytes 256 Mbits/sec 0.049 ms 44824/67012 (67%)
[SUM] 3.00-4.00 sec 121 MBytes 1.02 Gbits/sec 0.045 ms 179616/268051 (67%)


[ 6] 4.00-5.00 sec 30.6 MBytes 257 Mbits/sec 0.069 ms 44929/67220 (67%)
[ 8] 4.00-5.00 sec 30.5 MBytes 256 Mbits/sec 0.063 ms 44986/67219 (67%)
[ 10] 4.00-5.00 sec 30.5 MBytes 256 Mbits/sec 0.062 ms 44996/67219 (67%)
[ 12] 4.00-5.00 sec 30.6 MBytes 257 Mbits/sec 0.068 ms 44946/67220 (67%)
[SUM] 4.00-5.00 sec 122 MBytes 1.03 Gbits/sec 0.066 ms 179857/268878 (67%)


[ 6] 5.00-6.00 sec 29.3 MBytes 245 Mbits/sec 0.097 ms 45828/67137 (68%)
[ 8] 5.00-6.00 sec 29.3 MBytes 246 Mbits/sec 0.089 ms 45788/67138 (68%)
[ 10] 5.00-6.00 sec 29.3 MBytes 246 Mbits/sec 0.084 ms 45781/67138 (68%)
[ 12] 5.00-6.00 sec 29.3 MBytes 245 Mbits/sec 0.101 ms 45823/67137 (68%)
[SUM] 5.00-6.00 sec 117 MBytes 982 Mbits/sec 0.093 ms 183220/268550 (68%)


[ 6] 6.00-7.00 sec 28.9 MBytes 243 Mbits/sec 0.081 ms 47599/68672 (69%)
[ 8] 6.00-7.00 sec 29.0 MBytes 244 Mbits/sec 0.068 ms 47536/68673 (69%)
[ 10] 6.00-7.00 sec 29.0 MBytes 243 Mbits/sec 0.068 ms 47566/68673 (69%)
[ 12] 6.00-7.00 sec 28.9 MBytes 243 Mbits/sec 0.078 ms 47609/68672 (69%)
[SUM] 6.00-7.00 sec 116 MBytes 973 Mbits/sec 0.074 ms 190310/274690 (69%)


[ 6] 7.00-8.00 sec 30.3 MBytes 254 Mbits/sec 0.033 ms 45840/67882 (68%)
[ 8] 7.00-8.00 sec 30.3 MBytes 254 Mbits/sec 0.040 ms 45814/67876 (67%)
[ 10] 7.00-8.00 sec 30.3 MBytes 254 Mbits/sec 0.040 ms 45813/67876 (67%)
[ 12] 7.00-8.00 sec 30.3 MBytes 254 Mbits/sec 0.031 ms 45841/67882 (68%)
[SUM] 7.00-8.00 sec 121 MBytes 1.02 Gbits/sec 0.036 ms 183308/271516 (68%)


[ 6] 8.00-9.00 sec 29.8 MBytes 250 Mbits/sec 0.051 ms 46551/68218 (68%)
[ 8] 8.00-9.00 sec 29.7 MBytes 250 Mbits/sec 0.048 ms 46562/68223 (68%)
[ 10] 8.00-9.00 sec 29.8 MBytes 250 Mbits/sec 0.059 ms 46556/68223 (68%)
[ 12] 8.00-9.00 sec 29.8 MBytes 250 Mbits/sec 0.051 ms 46536/68218 (68%)
[SUM] 8.00-9.00 sec 119 MBytes 999 Mbits/sec 0.052 ms 186205/272882 (68%)


[ 6] 9.00-10.00 sec 30.5 MBytes 256 Mbits/sec 0.121 ms 46190/68423 (68%)
[ 8] 9.00-10.00 sec 30.4 MBytes 255 Mbits/sec 0.121 ms 46273/68421 (68%)
[ 10] 9.00-10.00 sec 30.4 MBytes 255 Mbits/sec 0.115 ms 46282/68421 (68%)
[ 12] 9.00-10.00 sec 30.5 MBytes 256 Mbits/sec 0.112 ms 46188/68424 (68%)
[SUM] 9.00-10.00 sec 122 MBytes 1.02 Gbits/sec 0.117 ms 184933/273689 (68%)


[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 6] 0.00-10.00 sec 898 MBytes 753 Mbits/sec 0.000 ms 0/648126 (0%) sender
[ 6] 0.00-10.00 sec 302 MBytes 253 Mbits/sec 0.121 ms 428469/648126 (66%) receiver
[ 8] 0.00-10.00 sec 898 MBytes 753 Mbits/sec 0.000 ms 0/648123 (0%) sender
[ 8] 0.00-10.00 sec 302 MBytes 253 Mbits/sec 0.121 ms 428529/648123 (66%) receiver
[ 10] 0.00-10.00 sec 898 MBytes 753 Mbits/sec 0.000 ms 0/648123 (0%) sender
[ 10] 0.00-10.00 sec 302 MBytes 253 Mbits/sec 0.115 ms 428562/648123 (66%) receiver
[ 12] 0.00-10.00 sec 898 MBytes 753 Mbits/sec 0.000 ms 0/648126 (0%) sender
[ 12] 0.00-10.00 sec 302 MBytes 253 Mbits/sec 0.112 ms 428437/648126 (66%) receiver
[SUM] 0.00-10.00 sec 3.51 GBytes 3.01 Gbits/sec 0.000 ms 0/2592498 (0%) sender
[SUM] 0.00-10.00 sec 1.18 GBytes 1.01 Gbits/sec 0.117 ms 1713997/2592498 (66%) receiver

I have tried other options like -w and even setting the -b option to a desired bandwidth, but the maximum I get is around 1 Gbps. I am running the test between two 10G Linux servers.

@gourabmajumdar:
Your sender is sending without a bandwidth limitation and is pushing about 3 Gbit/s in total.
Your receiver is only receiving about 1 Gbit/s.
66% packets dropped is therefore the expected value.
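
As a sanity check on that arithmetic: the four streams offered about 3.01 Gbit/s in total while the receive side delivered about 1.01 Gbit/s, and 1 - 1.01/3.01 ≈ 0.66, i.e. the reported 66% loss. If the aim is to test the ~1 Gbit/s receive path rather than saturate it, capping the aggregate offered load below the bottleneck should avoid the drops, e.g. (reusing the address and port from the command above; assuming -b is applied per stream, four streams at 230M offer roughly 920 Mbit/s):

iperf3 -c 11.1.201.2 -R -P 4 -u -b 230M -p 52014 -l 1440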