
PIC Tier-1 availability during August was right on target: 97%. About 1% of the unavailability was due to the usual monthly Scheduled Downtime (SD), which took place on 25th August. Most of the remaining 2% was also spent on that same day, which suggests that there is still room for improvement in SD coordination. One of the A-critical services, the site-bdii, was off for almost 4h after its scheduled intervention... and none of us noticed! There was also an issue with the Computing Service (B-critical), which had its queues closed for 2h longer than planned. We should now feed this experience back into our operations system and make sure the relevant procedures are improved.
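To put those percentages into hours, here is a rough sketch. It assumes that availability is measured over the full calendar month of August (31 days); the real availability metric may weight services or time windows differently, so take the numbers as orders of magnitude only. The function name and constant are made up for this illustration.

```python
# Rough conversion of monthly unavailability percentages into hours.
# Assumption: the percentages refer to the full calendar month (31 days).
HOURS_IN_AUGUST = 31 * 24  # 744 hours


def unavailable_hours(percent: float, total_hours: float = HOURS_IN_AUGUST) -> float:
    """Return the number of hours that a given percentage of the month represents."""
    return total_hours * percent / 100.0


if __name__ == "__main__":
    for label, pct in [
        ("scheduled downtime", 1.0),
        ("other unavailability", 2.0),
        ("total unavailability", 3.0),
    ]:
        print(f"{label}: {pct:.0f}% is about {unavailable_hours(pct):.1f} h")
```

Under that assumption, 1% of the month is roughly 7.4 hours and the full 3% roughly 22.3 hours.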
Besides that, on the 19th of August instabilities appeared in the OPN link, affecting the SRM service, especially outgoing transfers. The problem disappeared in about 24h, but we never found out what had really happened. The Spanish NREN did not answer our request for information. At first we thought this was an August effect, but later we realised the problem was that our e-mail contact for network operational issues was wrong. We have corrected this, and the e-mail address we have now should even open a ticket automatically.
The good news for August was that, despite the month being one of the hottest in several years, the cooling system of the PIC machine room coped perfectly. It seems that the new maintenance team did a good job of preparing the system for the summer campaign.