Tuesday, 27 July 2010

CMS Dark Data

Last month it was ATLAS who checked the consistency of their catalogs against the actual contents of our Storage. The ultimate goal is to get rid of what has been called "dark" or uncatalogued data, which fills up the disks with unusable files. Let us recall that at that time ATLAS found that 10% of their data at PIC was dark...
Now it has been CMS who carried out this consistency check on the Storage at PIC. Fortunately, they also have quite automated machinery for this, so we got the results pretty fast.
Out of the almost 1PB they have at PIC, CMS found a mere 15TB of "dark data", i.e. files that were not present in their catalog. Most of them come from fairly recent (Jan 2010) productions that were known to have failed.
So, for the moment, the CMS data seems to be about one order of magnitude "brighter" than the ATLAS data... another significant difference between two quite similar detectors.
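In essence, such a consistency check boils down to a set difference between two file lists: what the storage system actually holds and what the experiment catalog thinks it holds. A minimal sketch of the idea, assuming the two dumps have already been mapped to a common namespace (the file names and dump format here are invented for illustration):

```python
# Hypothetical sketch of a catalog-vs-storage consistency check.
# The input files ("storage_dump.txt", "catalog_dump.txt") are assumed
# to contain one file path per line, already in a comparable namespace.

def load_paths(filename):
    """Return the set of non-empty file paths listed in a dump file."""
    with open(filename) as f:
        return {line.strip() for line in f if line.strip()}

def consistency_check(storage_file, catalog_file):
    """Compare a storage dump against a catalog dump.

    Returns (dark, lost):
      dark: files present on disk but absent from the catalog ("dark data")
      lost: files registered in the catalog but missing on disk
    """
    storage = load_paths(storage_file)
    catalog = load_paths(catalog_file)
    dark = storage - catalog
    lost = catalog - storage
    return dark, lost
```

The real machinery the experiments run is of course far more involved (chunked namespace dumps, checksum and size comparisons), but the core logic is this set difference in both directions.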

Friday, 23 July 2010

ATLAS pilot analysis stressing LAN

These days a big physics conference is starting in Paris. Maybe this is the reason behind the ATLAS "I/O storm" analysis jobs we saw running at PIC yesterday... if so, I hope the guy sending them got a nice plot to show to the audience.
The first two plots on the left show the last 24h of monitoring of the number of jobs in the farm and the total bandwidth in the Storage system, respectively. We see two nice peaks, around 17h and 22h, which actually got very near to a total of 4 Gbytes/second being read from dCache. As far as I remember we had never seen this before at PIC, so we got another record for our picture album.
Looking at the pools that got the load, we can deduce that it was ATLAS who was generating it. The good news is that the Storage and LAN systems at PIC coped with the load with no problems. Unfortunately, there is not much more we can learn from this: were these bytes actually generating useful information, or were they just the artifact of some suboptimal ROOT cache configuration?

Monday, 5 July 2010

LHCb token full: game over, insert coin?

This is what happened last 23rd June. The MC-M-DST space token of the LHCb experiment at PIC got full and, according to the monitoring, we have been stuck since then.
PIC is probably the smallest LHCb Tier1. Smaller than the average, and this probably creates some issues for the LHCb data distribution model. To first order, they consider all Tier1s to be the same size, so essentially all DST data should go everywhere.
PIC cannot pledge 16% of the LHCb needs for various reasons, which is why some months ago we agreed with the experiment that, in order to still make efficient use of the space we could provide, the stored data should be somehow "managed". In particular, we agreed that we would keep just the two most recent versions of the reprocessed data at PIC instead of a longer history. It looked like a fair compromise.
Now we have our token full and it looks like we are stuck. It is time to check whether that nice idea of "keeping only the two most recent versions" can actually be implemented.
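Whatever tooling ends up enforcing the policy, the selection logic itself is simple: group the reprocessed datasets by name, sort the versions, and flag everything older than the two most recent for cleanup. A minimal sketch, assuming datasets can be represented as (name, version) pairs (the names, grouping and version scheme here are invented for illustration, not how LHCb actually labels its data):

```python
# Hypothetical sketch of a "keep only the N most recent versions" policy.
from collections import defaultdict

def versions_to_delete(datasets, keep=2):
    """Given (name, version) pairs, return the pairs that the retention
    policy would remove: everything except the `keep` highest versions
    of each dataset name."""
    groups = defaultdict(list)
    for name, version in datasets:
        groups[name].append(version)
    doomed = []
    for name, versions in groups.items():
        # Sort ascending; drop everything before the last `keep` entries.
        for v in sorted(versions)[:-keep]:
            doomed.append((name, v))
    return doomed
```

A dataset with two or fewer versions is left untouched, so the policy only kicks in once a third reprocessing pass appears. The hard part in practice is not this selection but actually deleting the old replicas from the space token in a way the experiment's bookkeeping agrees with.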