Look at any network vendor’s “recommended designs” and you will invariably find large chassis-based switches, multi-tier switch architectures, leaf-spine topologies, or any number of other designs that involve lots of switches. For the majority of businesses there is no need for more than a single pair of switches. The “recommended designs” are put together by network manufacturers, and they have a vested interest in selling you more, not less.
Note: This is for a typical enterprise environment. If you run large GPU farms or Hadoop clusters, or have a huge amount of east-west network bandwidth requirements that necessitate 100 GbE connectivity, then this falls outside the scope of this post.
Only two switches? But, but!!?! Let’s do some math.
The majority of network equipment and servers are either 1 Gb or 10 Gb connected. 25/40/50/100 GbE do exist, but in most typical enterprise-style use cases 10 GbE is sufficient, and the 40/100 GbE ports are usually used for interconnecting switches. Look at a typical Cisco ACI leaf/spine architecture diagram: it will almost always have 10 GbE at the access layer and 40/100 GbE on the spine switches. Servers can pretty universally have either 10Gb-T or 10Gb-SFP+ ports, while network devices (firewalls/routers) tend to stick mostly with 10Gb-SFP+. If you use third-party transceivers there is not much cost difference between 10Gb-T and 10Gb-SFP+, so:
Focus on a pair of 10Gb-SFP+ based switches.
Cisco has the Nexus 31128PQ and Arista has the 7050SX-128. Both are 96-port 10Gb-SFP+ and 8-port 40 GbE QSFP+. Both switches are line-rate to all ports, so there is no port-to-port oversubscription to worry about, and list price for a pair of either is around $50k (standard discounts are usually between 40-50% off list).
A pair of 96-port 10Gb-SFP+ switches is used
Let’s set aside 24 ports per switch for network devices: routers, firewalls, load balancers, security devices, and other appliances. Twenty-four pairs of ports for network devices is probably way overkill, as most environments do not have that many physical network devices/appliances…but this post and the associated math are here to prove a point, so a ‘safe’, larger number of 24 ports is chosen. That leaves 72 ports to connect servers; let’s assume that all 72 servers are hypervisor hosts (VMware, Hyper-V, KVM, etc.) since there are very few reasons in current times to *not* virtualize everything.
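The port budget above can be sketched in a couple of lines (the numbers are this post’s assumptions, not fixed requirements):

```python
# Port budget per switch, using the assumptions above:
# 96-port switches, 24 ports reserved for network devices.
TOTAL_PORTS = 96
NETWORK_DEVICE_PORTS = 24

server_ports = TOTAL_PORTS - NETWORK_DEVICE_PORTS  # 72 ports left for hypervisor hosts
print(server_ports)
```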
72 host servers
For average VM size, let’s use 2 vCPU, 16 GB RAM, and 100 GB of storage space. This is slightly higher than the actual industry average.
4:1 oversubscription on CPU (VMware will recommend 8:1, so 4:1 is very safe)
1:1 oversubscription on RAM
1:1 oversubscription on storage space
Hosts are filled to a maximum of 75% to allow for fail-over and redundancy.
There is still a huge price premium on 128 GB DIMMs; a dual-socket server will support 24 DIMMs @ 64 GB each = 1.5 TB of RAM per server
1536 GB RAM * 75% = 1152 GB
1152 GB / 16 GB per VM = 72 VM per host
72 VM * 2 vCPU = 144 vCPU
144 vCPU at 4:1 oversubscription = 36 physical CPU cores = dual 18-core CPUs
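The RAM-driven host sizing above can be reproduced with a short sketch (using this post’s assumptions: 64 GB DIMMs, 75% maximum fill, 16 GB / 2 vCPU per VM, 4:1 CPU oversubscription):

```python
# Per-host sizing: RAM is the limiting factor, then CPU follows from it.
DIMMS_PER_SERVER = 24
DIMM_SIZE_GB = 64
MAX_FILL = 0.75     # leave 25% headroom for fail-over and redundancy
VM_RAM_GB = 16
VM_VCPU = 2
CPU_OVERSUB = 4     # 4 vCPU per physical core

ram_per_host_gb = DIMMS_PER_SERVER * DIMM_SIZE_GB       # 1536 GB
usable_ram_gb = ram_per_host_gb * MAX_FILL              # 1152 GB
vms_per_host = int(usable_ram_gb // VM_RAM_GB)          # 72 VMs per host
vcpus_per_host = vms_per_host * VM_VCPU                 # 144 vCPU
physical_cores = vcpus_per_host // CPU_OVERSUB          # 36 cores = dual 18-core CPUs
print(vms_per_host, vcpus_per_host, physical_cores)
```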
Let’s also assume that we are using software-defined storage such as vSAN.
72 VM per host * 72 hosts * 100 GB per VM = 506.25 TB of storage space
506.25 TB / 75% maximum fill = 675 required TB of usable storage
675 TB / 72 hosts = 9.375 TB per host
A typical 1U server has 10x 2.5″ drive bays: two disk groups of 4+1 each (four capacity drives plus one cache drive), for 8 capacity drives per host.
9.375 TB / 8 capacity drives = 1.17 TB or larger SSD drives for capacity drives
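The storage math above works out the same way in code (same assumptions: 72 hosts, 72 VMs per host, 100 GB per VM, 75% maximum fill, 8 capacity drives per host):

```python
# vSAN-style capacity sizing from the numbers above.
HOSTS = 72
VMS_PER_HOST = 72
VM_DISK_GB = 100
MAX_FILL = 0.75
CAPACITY_DRIVES_PER_HOST = 8  # 2 disk groups of 4 capacity + 1 cache each

raw_need_tb = HOSTS * VMS_PER_HOST * VM_DISK_GB / 1024  # 506.25 TB consumed
usable_tb = raw_need_tb / MAX_FILL                      # 675 TB usable required
per_host_tb = usable_tb / HOSTS                         # 9.375 TB per host
drive_size_tb = per_host_tb / CAPACITY_DRIVES_PER_HOST  # ~1.17 TB per capacity SSD
print(raw_need_tb, usable_tb, per_host_tb, drive_size_tb)
```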
108 TB of RAM
2592 physical CPU cores
675 TB of usable SSD storage
72U of rackspace
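The cluster-wide totals fall straight out of the per-host numbers (1536 GB RAM and 36 cores per 1U host, 72 hosts):

```python
# Aggregate totals for the whole 72-host cluster.
HOSTS = 72
RAM_PER_HOST_GB = 1536
CORES_PER_HOST = 36

total_ram_tb = HOSTS * RAM_PER_HOST_GB / 1024  # 108 TB of RAM
total_cores = HOSTS * CORES_PER_HOST           # 2592 physical CPU cores
total_rack_units = HOSTS * 1                   # 72U of 1U servers
print(total_ram_tb, total_cores, total_rack_units)
```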
All of that, plus the 24 different network devices, on a single pair of network switches, and it will all fit in three of the smaller 42U racks or only two of the taller 50+U racks. The majority of organizations have infrastructures that are far smaller than that.
You would never touch software-defined storage with a 10-foot pole? Fine (but you might want to rethink that opinion). Since you are more conservative when it comes to storage, put in a pair of 80-port 8Gb Fibre Channel switches or 96-port 16Gb Fibre Channel switches and something such as an all-flash Pure array. The whole setup will fit in 10U of rackspace, and the rest of this post still applies.
Some things to think about
- Think of how simple the network is: only a single pair of switches. Your entire switching infrastructure has a total cost of $30k or less. A typical recommended design from any consultant or networking manufacturer would likely start at $150k, and some of the more traditional designs would easily approach a million.
- Think of the space savings and the reduction of costs in hosting square footage.
- The cost savings in management overhead. I’ve personally seen environments with a server population only 10% of this total, yet with more than 20 different network switches and a whole team dedicated to keeping it all running.
- From a business perspective think of the drastic reduction in capital expenditure costs.
- This is how cloud providers think and design, and they are who every internal IT infrastructure department is competing against. Either innovate and change, or you will eventually be replaced by the cloud.
- If all you need are two fixed-port switches, and the cost for name-brand switches is in the $30k range for both of them, it becomes easier to select the name-brand switches instead of whitebox if that is what you so wish. Whitebox/ODM switches vs. a full-blown Cisco Nexus 7k/9k chassis or a 9k leaf/spine ACI setup, where there is likely a $100k+ price difference, is a much different discussion and financial consideration than a pair of whitebox/ODM switches for $20k vs. a pair of Cisco switches for $30k.