Cluster problem - slower machine slows down whole cluster

Forum dedicated to older versions of the EMS package (WRFEMS v3.2, v3.1 or older). Support is user-to-user, so please help others if you can.
bunjil
Posts: 1
Joined: Wed Jan 27, 2010 12:54 pm

Cluster problem - slower machine slows down whole cluster

Post by bunjil » Wed Jan 27, 2010 1:02 pm

Hi!

I am building a small gigabit ethernet cluster with three quad-core machines. The problem is that when I add my third machine I get no benefit.

My setup looks like this:

Head: Phenom II X4 965, 3.4 GHz
Node1: Phenom II X4 965, 3.4 GHz
Node2: Phenom X4 9650, 2.3 GHz (slower than the two machines above)

I am using your latest beta release, EMS Version 3.1.1.4.26.beta, running on openSUSE.
When I do the NMM benchmark, I get the following:

Head: 30m59s
Head + Node1: 18m58s
Head + Node1 + Node2: 20m45s

My guess is that the slower machine (Node2, the third machine) drags the entire cluster's performance down.

Any other ideas on how to go about this?

jpb
Posts: 14
Joined: Thu Jan 28, 2010 7:25 am

Re: Cluster problem

Post by jpb » Thu Jan 28, 2010 7:28 am

The overall performance depends on the slowest node's performance...
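
To put some numbers on that, here is a rough back-of-the-envelope sketch (my own toy model, not anything from wrf-ems) which assumes the domain is split equally across all cores and that per-core speed scales with clock rate:

Code: Select all

# Rough model: with an equal domain decomposition every core gets the same
# amount of work, so the wall-clock time is set by the slowest core.
# Clock rates (GHz) are used here as a crude proxy for per-core throughput.

def wall_time(total_work, core_speeds):
    """Time to finish when the work is split equally over all cores."""
    work_per_core = total_work / len(core_speeds)
    return max(work_per_core / speed for speed in core_speeds)

total_work = 100.0                 # arbitrary units
head  = [3.4] * 4                  # Phenom II X4 965
node1 = [3.4] * 4                  # Phenom II X4 965
node2 = [2.3] * 4                  # Phenom X4 9650

for label, cores in [("Head", head),
                     ("Head + Node1", head + node1),
                     ("Head + Node1 + Node2", head + node1 + node2)]:
    print("%-22s relative wall time: %.2f" % (label, wall_time(total_work, cores)))

In this idealized model the 2.3 GHz cores need roughly 48% longer per slice, so going from 8 to 12 cores buys almost nothing; once you add the extra MPI communication of a third machine, the 12-core run can easily come out slower than the 8-core run, which is what the benchmark times above show.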

meteoadriatic
Posts: 1566
Joined: Wed Aug 19, 2009 10:05 am

Re: Cluster problem

Post by meteoadriatic » Thu Jan 28, 2010 1:05 pm

So ideally, all nodes should have the same computing power?

jpb
Posts: 14
Joined: Thu Jan 28, 2010 7:25 am

Re: Cluster problem

Post by jpb » Fri Jan 29, 2010 6:16 pm

Apart from the head, it is a good idea to have identical nodes.
Beyond that, performance also depends on the ethernet switch, of course. With a direct connection between the head and a single node you will get proportionally better performance than with multiple nodes connected through a switch.
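
If you want to see the difference, you can measure node-to-node round-trip latency directly. Here is a minimal TCP ping-pong sketch (my own throwaway script, nothing wrf-ems specific; the port number is arbitrary). Run it with the argument 'server' on one node and 'client <hostname>' on the other:

Code: Select all

import socket
import sys
import time

PORT = 5201          # arbitrary unused port; change it if something else uses it

def server():
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", PORT))
    srv.listen(1)
    conn, _ = srv.accept()
    while True:
        data = conn.recv(64)
        if not data:
            break
        conn.sendall(data)             # echo straight back
    conn.close()

def client(host, rounds=1000):
    conn = socket.create_connection((host, PORT))
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    start = time.perf_counter()
    for _ in range(rounds):
        conn.sendall(b"x" * 64)        # small message, roughly halo-exchange sized
        conn.recv(64)                  # assumes the 64-byte echo arrives in one piece
    rtt_us = (time.perf_counter() - start) / rounds * 1e6
    print("average round-trip: %.0f microseconds" % rtt_us)
    conn.close()

if __name__ == "__main__":
    if sys.argv[1] == "server":
        server()
    else:
        client(sys.argv[2])

Gigabit ethernet through a switch typically lands somewhere around 50-200 microseconds for small messages like these, a direct cable somewhat less, and InfiniBand in the single digits; with many small exchanges per timestep those microseconds add up.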

wedgef5
Posts: 3
Joined: Mon Nov 30, 2009 1:55 pm

Re: Cluster problem

Post by wedgef5 » Wed Feb 03, 2010 1:35 pm

I found that even a slower head can drag down a cluster. Matching hardware is definitely optimal.

meteoadriatic
Posts: 1566
Joined: Wed Aug 19, 2009 10:05 am

Re: Cluster problem

Post by meteoadriatic » Wed Feb 03, 2010 6:40 pm

If you have two machines, one somewhat slower than the other, what would be better: to put the slower machine as the head or as a slave node?

scott.gayer
Posts: 1
Joined: Thu Apr 29, 2010 1:01 pm

Re: Cluster problem

Post by scott.gayer » Thu Apr 29, 2010 1:16 pm

I have the same issue; however, all of my machines are identical: same hardware and same software loads. All boxes benchmark out the same. My run times for the ARW benchmark are: 1 box, 107 minutes; 2 boxes, 65 minutes; 3 boxes, 95 minutes; and 4 boxes, 110 minutes. While the benchmark is running I do not see my memory load increase with more boxes, my network load never goes above 20 Mb/s, and the CPUs are all maxed at 100%. I am running IBM IntelliStation Z Pro Model 6221 machines: dual-processor Xeon 2.4 GHz with PIV cores, 2 GB RAM, and gigabit ethernet. Any ideas?

meteoadriatic
Posts: 1566
Joined: Wed Aug 19, 2009 10:05 am

Re: Cluster problem

Post by meteoadriatic » Sun May 02, 2010 10:58 am

I think this is a network-related problem. My thinking is that there is no demand for high network bandwidth between nodes, but there is a demand for minimum latency, and that is where the congestion might be found: if network latency becomes high, your remote processors will spend a lot of time waiting for input.

If I am correct, then you will find in the htop output that a large portion of the CPU utilization is displayed in red instead of green. That means your CPU is not doing useful work at 100%, only for the portion shown in green; the remaining CPU time (red) is in fact spent waiting for data that arrives slowly from the network.
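
One way to put a number on that while a run is in progress (this is just standard Linux accounting from /proc/stat, nothing wrf-ems specific) is to sample the CPU counters and see how the time splits between user work, kernel/system time, and iowait:

Code: Select all

import time

def read_cpu_times():
    """Aggregate CPU counters (in jiffies) from the first line of /proc/stat."""
    with open("/proc/stat") as f:
        fields = f.readline().split()
    # fields: 'cpu', user, nice, system, idle, iowait, irq, softirq, ...
    return [int(x) for x in fields[1:8]]

before = read_cpu_times()
time.sleep(10)                       # sample for 10 seconds while the model runs
after = read_cpu_times()

deltas = [b - a for a, b in zip(before, after)]
total = sum(deltas) or 1
names = ["user", "nice", "system", "idle", "iowait", "irq", "softirq"]

for name, delta in zip(names, deltas):
    print("%-8s %5.1f %%" % (name, 100.0 * delta / total))

A healthy compute-bound run should be almost all 'user' time; a large 'system' or 'iowait' share means the cores are busy in the kernel or waiting for data (roughly the red portion in htop) instead of doing physics.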

So, if you find that I'm correct, your best move would be upgrading the network to InfiniBand.

I hope this helps; please share your findings with us.

taylormade
Posts: 18
Joined: Fri Aug 26, 2011 9:36 pm

Re: Cluster problem - slower machine slows down whole cluster

Post by taylormade » Wed Aug 31, 2011 10:56 pm

I agree with meteoadriatic; InfiniBand is wonderful. However, if you do not want to drop the cash for it (and its configuration is a bit more complex), one simpler solution is to aggregate several ethernet cards together to form a larger pipe. You can increase your bandwidth by "bonding" two gigabit ethernet cards together so that traffic is distributed across both. You will not get InfiniBand performance, but it should help. If you google "ethernet aggregation" or "bonding", you should find some tutorials on how to do this.
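
If you do go the bonding route, a quick way to confirm that the bond came up the way you intended is to look at what the Linux bonding driver reports. The small sketch below assumes the bonded interface is called bond0 (adjust the name if yours differs):

Code: Select all

# The bonding driver reports its status under /proc/net/bonding/<interface>.
# 'bond0' is only an assumption; use whatever name you configured.
with open("/proc/net/bonding/bond0") as f:
    status = f.read()

for line in status.splitlines():
    line = line.strip()
    if line.startswith(("Bonding Mode:", "Slave Interface:", "MII Status:", "Speed:")):
        print(line)

You want every slave interface to show 'MII Status: up' at the expected speed (1000 Mbps per card). One caveat: several bonding modes balance traffic per connection rather than per packet, so a single TCP stream between two hosts may still be limited to one card's bandwidth; bonding mainly helps when several node pairs are talking at once.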

I'm not entirely sure how it is implemented in wrf-ems, but it may also be worth investigating the MPI configuration you are using, as MPI can be 'tuned' to specific interconnects to increase performance (for instance, if you decided to go for InfiniBand).

In terms of slower machines bogging down the simulation: for sure, if the grid is partitioned equally, faster nodes can get stuck waiting for slower nodes to finish their calculations, especially in a tightly coupled application like NWP.

Hope that helps a bit,
