I am running openfiler as a guest under 5.1, and I am presenting iSCSI LUNs to virtualized 5.1 hosts. If I place trunking vSwitch or vDS port groups on the same switch as the openfiler NICs, with no uplinks, I can achieve read/write speeds of around 240 MB/s and 400 MB/s, respectively. This does not change if I add more paths on isolated VLANs/vSwitches. I have tried all kinds of different Round Robin IOPS/bytes configurations with "esxcli storage nmp psp roundrobin deviceconfig set," yet when I run iftop on the openfiler guest the traffic is simply spread evenly across the paths rather than the total bandwidth aggregating. I have tried many sysctl tweaks for window sizing and send/receive buffers, and I have tried changing the window size on the vHost, with absolutely no difference in speeds. I have also turned off TCP_DELAY, TCP_CORK and TCP_SACK at the kernel level on openfiler.
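For reference, this is roughly what I have been running (the device ID below is just a placeholder for one of my LUNs, and the sysctl values shown are only one of the combinations I tested):

    # on the ESXi host, per iSCSI device, cycling through different IOPS/bytes limits
    esxcli storage nmp psp roundrobin deviceconfig set --device=naa.xxxxxxxxxxxxxxxx --type=iops --iops=1
    esxcli storage nmp device list --device=naa.xxxxxxxxxxxxxxxx   # confirm the policy took

    # on the openfiler guest, window/buffer sizing along these lines
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
    sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"
    sysctl -w net.ipv4.tcp_sack=0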
Where is the bottleneck occurring here? I feel like I'm very close to figuring this out, but I don't want to keep wasting my time if it's not possible to break this threshold. It seems like the potential for at least 400 MB/s reads is there. The storage is sliced out of a RAID 0 SSD array that can achieve 1 GB/s read/write speeds within a top-level guest. I am using E1000E adapters on the vHost. This really feels like something that needs to be tweaked within the hypervisor OS.
The interesting thing is that if I put an iperf binary on the vHost and on openfiler for testing, I can achieve 4 Gbps over a single link from openfiler to the vHost, and if I run 4 concurrent iperf tests along 4 individual paths I top out around 8 Gbps total, so it seems like the raw network potential is there.
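The iperf runs were nothing fancy, roughly along these lines (the address and port are placeholders for each path's subnet):

    # server side, the binary copied onto the vHost, one instance per path
    ./iperf -s -p 5001
    # client side on openfiler, one of these per path for the concurrent test
    iperf -c 10.0.1.1 -p 5001 -t 30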
Thanks!