ESXi with iSCSI SAN slows down with multiple VMs running
As you probably already know, DAVG is the disk (device) latency, and anything above 30 ms is usually going to give you a noticeable drop in performance and responsiveness. Latency can have a lot of causes, but first and foremost your disks must be able to handle the I/O load you are throwing at them.
I/O load means not only the number of I/Os per second (IOPS) but also the access pattern. Random I/O is pretty much what you should expect from virtualized servers, so your disk configuration needs to perform well under random I/O. Unfortunately, RAID-Z doesn't fit the bill. According to Oracle:
The situation of random inputs is one that needs special attention when considering RAID-Z.
Effectively, as a first approximation, an N-disk RAID-Z group will behave as a single device in terms of delivered random input IOPS. Thus a 10-disk group of devices each capable of 200-IOPS, will globally act as a 200-IOPS capable RAID-Z group. This is the price to pay to achieve proper data protection without the 2X block overhead associated with mirroring.
In other words, Oracle says that a RAID-Z set can handle about the same number of random IOPS as a single disk in the set. A single 7.2k RPM disk can do about 80 random IOPS (and that may be a generous number, depending on who you ask), so in RAID-Z your entire array can only do about 80 random IOPS. Running 5-7 servers on so few IOPS is a recipe for terrible performance.
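To put rough numbers on that, here is a minimal Python sketch of the arithmetic. It uses Oracle's first approximation (one RAID-Z group delivers the random IOPS of one disk) and the 80 IOPS per-disk figure from above. The function names are made up for illustration, and splitting the IOPS evenly across VMs is a simplifying assumption, not how real workloads behave.

```python
def raidz_random_iops(per_disk_iops: int, groups: int = 1) -> int:
    """First approximation: each RAID-Z group behaves like a single disk
    for random IOPS, no matter how many disks are in the group."""
    return per_disk_iops * groups


def iops_per_vm(total_iops: int, vm_count: int) -> float:
    """Even split of the array's random IOPS across VMs (a simplification)."""
    return total_iops / vm_count


# Assumption: one RAID-Z group of 7.2k drives at roughly 80 random IOPS each.
total = raidz_random_iops(80)
for vms in (5, 7):
    print(f"{vms} VMs: about {iops_per_vm(total, vms):.1f} random IOPS each")
```

Even with a perfectly even split, each server ends up with well under 20 random IOPS, which is a fraction of what a single desktop drive delivers.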
You would see far better performance if you configured your 4 drives as a RAID-10 set. If you need more than the 2TB of usable capacity that RAID-10 would give you, use RAID-5 instead. Either will give you better random I/O performance than RAID-Z in this case.
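As a rough comparison of the three layouts, the sketch below estimates usable capacity and random read IOPS for 4 drives. The assumptions are mine: 1 TB drives (implied by the 2TB RAID-10 figure), 80 random IOPS per drive, reads spread over every disk in RAID-10 and RAID-5, and RAID-Z treated as a single disk per Oracle's approximation. Write penalties and caching are not modeled, so treat the output as a rule of thumb only.

```python
DISKS = 4
DISK_TB = 1.0    # assumption: 1 TB drives
DISK_IOPS = 80   # assumption: rough random IOPS for a 7.2k drive


def estimate(layout: str, disks: int = DISKS,
             disk_tb: float = DISK_TB, disk_iops: int = DISK_IOPS):
    """Return (usable_tb, random_read_iops) for a layout, rule of thumb only."""
    if layout == "raid10":
        # Half the disks hold mirror copies; every disk can serve reads.
        return disks / 2 * disk_tb, disks * disk_iops
    if layout == "raid5":
        # One disk's worth of parity; reads are spread across all disks.
        return (disks - 1) * disk_tb, disks * disk_iops
    if layout == "raidz":
        # Single parity; the whole group acts like one disk for random reads.
        return (disks - 1) * disk_tb, disk_iops
    raise ValueError(f"unknown layout: {layout}")


for name in ("raid10", "raid5", "raidz"):
    usable_tb, read_iops = estimate(name)
    print(f"{name}: {usable_tb:.0f} TB usable, ~{read_iops} random read IOPS")
```

Under these assumptions RAID-5 matches RAID-Z on usable capacity while keeping the random read throughput of all four spindles, which is why it is the fallback when RAID-10 is too small.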