Tuesday 19 February 2008

January Availability

Last week we received the availability data for the LCG Tiers for January 2008. This time, at PIC we landed right on the target for Tier-0/Tier-1 sites: 93%.
The positive reading is that we are still one of only three sites that have reached the reliability target every month since last July. The other two are CERN and TRIUMF.
The negative reading is that 93% looks like too low a figure, now that we had grown used to scoring over 95% in the last quarter of 2007.
The 7% unreliability of PIC in January 2008 is entirely due to a single incident that we had in the Storage system on the weekend of 26-27 January. The Monday before (21/01/2008) had been a black Monday in the global markets (European and Asian exchanges plummeted 4 to 7%), so we have not yet ruled out that our failure might be correlated with that fact.
However, Gerard's investigations indicate that the most probable cause of our incident was a less glamorous problem in the system disk of the dCache core server. The funny symptom was that all the GET transfers from PIC were working fine, while the PUT transfers to PIC were failing. The problem could only be solved by manual intervention of the MoD, who came in on Sunday to "press the button".
So the "moraleja" (the moral of the story, as we call it in Spanish) could read: a) we need to implement remote reboot, at least on the critical servers; b) a little sensor that checks that the system disk is alive would be very useful.
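As a rough idea of what the "little sensor" in point b) could look like, here is a minimal Python sketch. It is not what runs at PIC: the function name `system_disk_alive`, the probe directory and the payload are all made up for illustration. It assumes that a sick system disk shows up as a failed or inconsistent small write, so it writes a tiny file, forces it to disk, reads it back and cleans up.

```python
import os
import tempfile


def system_disk_alive(probe_dir="/var/tmp", payload=b"disk-probe"):
    """Hypothetical probe: return (ok, detail) for the disk holding probe_dir.

    Assumption: a failing disk makes a small write, fsync or read-back
    fail with an OSError or return different bytes.
    """
    try:
        # Create a uniquely named probe file on the disk under test.
        fd, path = tempfile.mkstemp(prefix="diskprobe-", dir=probe_dir)
    except OSError as exc:
        return False, f"cannot create probe file: {exc}"
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to the device
        with open(path, "rb") as handle:
            if handle.read() != payload:
                return False, "read-back mismatch"
        return True, "ok"
    except OSError as exc:
        return False, f"I/O error: {exc}"
    finally:
        # Always try to remove the probe file, even after a failure.
        try:
            os.remove(path)
        except OSError:
            pass


if __name__ == "__main__":
    ok, detail = system_disk_alive()
    print("system disk:", "OK" if ok else "FAILED", "-", detail)
```

A real sensor would also need a timeout around the call, since a hung disk may block the write forever instead of returning an error, and its result would have to be hooked into whatever alarm system the site already uses.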
Now, back to work, and let's see if next month we reach the super-cool, super-green 100% monthly reliability that up to now only CERN has been able to reach, apparently without much effort.

Monday 18 February 2008

2008, the LHC and PIC

So, this is 2008. The year that the LHC will (finally) start colliding protons at CERN. At PIC we are deploying one of the so-called Tier-1 centres: large computing centres that will receive and process the data from the detectors online. There will be eleven such Tier-1s worldwide. Together with CERN (the Tier-0) and almost 200 more sites (the Tier-2s), these will form one of the largest distributed computing infrastructures in the world for scientific purposes: the LHC Computing Grid.
So, handling the many petabytes of data from the LHC is the challenge, and the LCG must be the tool.