Monday 16 March 2009

CMS CPU consumer, back to business

It looks like someone in CMS was reading this blog: a few hours after the last post, on Saturday evening, CMS jobs started arriving at PIC. We have seen a constant load of about 300 CMS jobs since then. Not bad.
Apparently these are the so-called "backfill" jobs. All the Tier-1s except us (and Taiwan, down these days due to a serious fire incident) started running these backfill jobs in early March. After a bit of asking around, we found out that PIC was not getting its share of the workload for two reasons: the old 32-bit batch queue names were hardcoded somewhere in the CMS system (we deprecated the 32-bit queues more than a month ago!), and a bug in their setup script got the available TMPDIR space wrong.
It is good that we found these problems and that they were promptly solved. Now CMS is back in the CPU-burning business at PIC. ATLAS is still debugging the memory-exploding problem that stopped jobs from being sent to PIC about one week ago. It looks like we are close to the solution (missing packages), and we will soon see both experiments competing again for the CPU cycles at PIC.

1 comment:

joseflix said...

Yes, we are back. Be prepared!