wrf - multi cpu amd (4x12core)

kamilp
Posts: 1
Joined: Sat Mar 17, 2012 4:54 pm


Post by kamilp » Sat Mar 17, 2012 5:10 pm

I am looking for someone who runs the WRF model on a multi-processor, multi-core system. I have a server with 48 GB RAM and 4x AMD Opteron 6172 (12-core) CPUs running SLES 11 SP1 64-bit, with 2x 300 GB SAS 10k rpm disks, the GNU gcc compiler and Open MPI.
I am having trouble scaling wrf.exe across multiple processors.
When running 1 process with 12 threads, my calculation takes around 2:30 h; with 2 processes of 12 threads each it takes 2:10 h, and with 4 processes of 12 threads each it is slightly below 2:00 h. So the scalability is very poor.
I also tested 1 process with 48 threads, and 48 processes with 1 thread per process; nothing helps.
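
To put numbers on the timings above, here is a minimal Python sketch that computes speedup and parallel efficiency relative to the 12-core run. The 48-core time is entered as 120 minutes, which only approximates "slightly below 2:00 h", and the function name `scaling_report` is purely illustrative. It also computes the Karp-Flatt estimate of the fraction of the run that does not speed up when cores are added.

```python
def scaling_report(baseline_minutes, baseline_cores, runs):
    """Return (cores, speedup, efficiency, non_scaling_fraction) per run.

    Everything is measured relative to the baseline run, so a
    'speedup' of 2.0 at twice the baseline core count would be ideal.
    """
    rows = []
    for cores, minutes in runs:
        ratio = cores / baseline_cores          # how many times more cores
        speedup = baseline_minutes / minutes    # how many times faster
        efficiency = speedup / ratio            # 1.0 means perfect scaling
        # Karp-Flatt metric: share of the baseline runtime that behaves
        # as if it were serial (does not shrink with more cores).
        non_scaling = (1 / speedup - 1 / ratio) / (1 - 1 / ratio)
        rows.append((cores, speedup, efficiency, non_scaling))
    return rows


# Timings from the post, in minutes; 120 is an approximation.
report = scaling_report(150, 12, [(24, 130), (48, 120)])
for cores, speedup, efficiency, non_scaling in report:
    print(f"{cores} cores: speedup {speedup:.2f}, "
          f"efficiency {efficiency:.2f}, non-scaling fraction {non_scaling:.2f}")
```

With these inputs, both the 24-core and the 48-core run give the same non-scaling fraction, roughly 0.73 of the 12-core runtime, which is the pattern a fixed serial or bandwidth-bound portion would produce.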

Does anyone have advice on how to improve the scalability? I need to get the calculation time below 1 hour.

Surprisingly, I tested the same configuration on a PC with an Intel Core i7 920 and got a similar time, around 2:30 h.

Thanks for any ideas...

EDIT: I noticed that I am getting an unusually high number of interrupts: according to vmstat, approx. 20000-60000 per second.
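
One way to see where such interrupts come from is to compare two snapshots of /proc/interrupts taken a few seconds apart. The following is a minimal Python sketch, assuming a Linux system with the usual /proc/interrupts layout (a header row of CPU names, then one row per interrupt source). The function names and the default 5-second interval are arbitrary choices for illustration.

```python
import time
from itertools import takewhile


def parse_interrupts(text):
    """Sum per-CPU counts for each row of /proc/interrupts-style text."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())            # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        # Leading numeric fields are per-CPU counts; some rows (e.g. ERR)
        # have fewer count columns than there are CPUs.
        counts = [int(f) for f in takewhile(str.isdigit, fields[:ncpu])]
        description = " ".join(fields[len(counts):])
        key = (label.strip() + " " + description).strip()
        totals[key] = sum(counts)
    return totals


def top_deltas(before, after, n=5):
    """Return the n sources whose counts grew the most between snapshots."""
    deltas = {key: after[key] - before.get(key, 0) for key in after}
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return [(key, delta) for key, delta in ranked[:n] if delta > 0]


def sample_top(interval=5, n=5, path="/proc/interrupts"):
    """Take two snapshots 'interval' seconds apart and rank the growth."""
    with open(path) as handle:
        before = parse_interrupts(handle.read())
    time.sleep(interval)
    with open(path) as handle:
        after = parse_interrupts(handle.read())
    return top_deltas(before, after, n)
```

Calling `sample_top()` while wrf.exe is running returns the busiest sources over the interval; dividing each delta by the interval gives a per-second rate comparable to the vmstat figure.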

beowrf
Posts: 43
Joined: Sat Jul 23, 2011 3:07 pm

Re: wrf - multi cpu amd (4x12core)

Post by beowrf » Wed Mar 21, 2012 8:24 pm

Hi,

I also ran the model on a 48-core machine at the university. You would be better off using the Intel Fortran Compiler or PGI; GNU does not handle the compilation properly when using MPI.
The Intel compiler is free for non-commercial use.

Cheers

alfe
Posts: 99
Joined: Thu Nov 25, 2010 8:13 pm

Re: wrf - multi cpu amd (4x12core)

Post by alfe » Wed Mar 21, 2012 8:43 pm

Actually, which compiler is used for the binary delivered with the EMS distribution?

meteoadriatic
Posts: 1603
Joined: Wed Aug 19, 2009 10:05 am

Re: wrf - multi cpu amd (4x12core)

Post by meteoadriatic » Thu Mar 22, 2012 12:50 am

alfe wrote: Actually, which compiler is used for the binary delivered with the EMS distribution?
PGI

pattim
Posts: 199
Joined: Sun Jun 24, 2012 8:42 pm
Location: Los Angeles, CA, USA

Re: wrf - multi cpu amd (4x12core)

Post by pattim » Tue Aug 28, 2012 5:11 am

"1 process with 12 threads" - what does that mean? One WRF-EMS run with NODECPUS = 12? What's the typical CPU utilization during these runs? What memory are you running? Is that a SuperMicro or TYAN box?

I think it also depends on the domain decomposition method you choose and the number of tiles per processor. When I run on a machine like that, it doesn't scale much beyond ~24 processors. I think part of the problem is that the HyperTransport (HT) links in 4x12 Opteron boxes have lower bandwidth than those in 2x12 Opteron boxes, but I have not been able to verify that this makes a difference. I was able to verify that the 48-core Opteron box is faster than a slightly older Intel box for the same number of CPUs.
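
As a rough illustration of the decomposition point: WRF splits the domain into one patch per MPI task and, within each patch, into tiles for the OpenMP threads. The Python sketch below assumes a near-square task grid, which is believed to be what WRF picks when nproc_x and nproc_y are not set in namelist.input. The 200x200 grid and the 3-point halo are hypothetical numbers chosen only for illustration, since the thread does not give the actual domain size. It shows how the patch shrinks and the relative halo overhead grows as the task count goes up.

```python
import math


def near_square_factors(n):
    """Return (px, py) with px * py == n, px <= py, as close to square as possible."""
    px = math.isqrt(n)
    while n % px != 0:
        px -= 1
    return px, n // px


def patch_stats(nx, ny, ntasks, halo=3):
    """Return (px, py, patch_x, patch_y, halo_overhead) for the largest patch.

    halo_overhead is the number of extra halo points divided by the
    number of interior points; which axis gets px or py is arbitrary here.
    """
    px, py = near_square_factors(ntasks)
    patch_x = -(-nx // px)                  # ceiling division
    patch_y = -(-ny // py)
    interior = patch_x * patch_y
    with_halo = (patch_x + 2 * halo) * (patch_y + 2 * halo)
    return px, py, patch_x, patch_y, (with_halo - interior) / interior


# Hypothetical 200x200 grid, decomposed over 12, 24 and 48 MPI tasks.
for ntasks in (12, 24, 48):
    px, py, patch_x, patch_y, overhead = patch_stats(200, 200, ntasks)
    print(f"{ntasks} tasks: {px}x{py} grid, patch {patch_x}x{patch_y}, "
          f"halo overhead {overhead:.2f}")
```

For a small domain, the halo exchange becomes a large share of each patch at high task counts, which would be one possible reason for scaling to flatten out before all 48 cores are in use.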
