CPDN News and Announcements

mo.v
Member
Posts: 99
Joined: Mon Feb 19, 2007 7:31 pm
Location: Portsmouth UK

Post by mo.v »

On Wednesday 19 Sep the cpdn server for data-driven web pages was down again for a few hours. It was soon running again but there may be some delay in the awarding of credits.

BBC web pages (the BBC forum and results pages) are still down. Milo is taking action to avoid the server disks filling up.

There are links to the server status pages of all four ClimatePrediction projects further up this thread.

Post by mo.v »

All ClimatePrediction servers for all projects are now running again.

However, Milo says there was a problem with the database yesterday 20 Sep - a mysterious crash the cause of which could not be determined. This was entirely separate from the outage to upgrade the web pages, forums &c. It is likely that the credit generation script didn't run for that reason.

Credits should be up again by tomorrow; the script will run overnight. It may be a bit slow to catch up, as usual after an outage. So members may not receive their outstanding credits until Sunday or Monday.

A new file server will be installed in Oxford soon, probably within a week or two, easing the pressure on disk space and of course on Milo and Tolu.

Post by mo.v »

The UK Royal Meteorological Society is holding an afternoon of lectures on the topic 'Observing and detecting climate change' at London Zoo on Wednesday 17 October at 1.30pm.

http://www.rmets.org/event/detail.php?ID=268

As far as I know advance booking is not required and attendance is free. Anyone planning to attend may want to post in the Cafe of the cpdn forum so we know who to look out for.

http://www.climateprediction.net/board/ ... 8720#68720

Post by mo.v »

The boinc servers in Berkeley CA are currently down. This means boinc can't at the moment be downloaded or upgraded and the boinc_dev forum is down.

This has no effect on ClimatePrediction servers which are all working. Our trickles are continuing normally.

Post by mo.v »

The cpdn independent forum hosted on an Open University server in Milton Keynes is down. It's the middle link here. The other two links are to the cpdn-boinc forums which are unaffected.

The forum outage is not boinc-related and will not affect models, trickles or credits.

Post by mo.v »

Astro's discovered that while the Milton Keynes server is down it's impossible to attach to ClimatePrediction. The 'Attach' command from the computer to Oxford is routed through the MK server and at the moment can't get through.

It may also be impossible to detach from cpdn at the moment, but I'm not going to test this...

Milo has had no indication yet from the Open University about what is wrong with their server.

Post by mo.v »

The Open Uni server is now functioning again. So the independent forum is open and it's again possible to attach to CPDN. Hope nobody's been seriously inconvenienced.

Post by mo.v »

I'm afraid there's now a different problem with one of the Oxford servers. A CPDN HadCM coupled model has been unable to upload a 10-year zip file, getting the boinc manager message 'No space left on device/server'. This does not mean your computer disk is full. The server disk has filled up.

The problem may also affect HadCM trickles and HadSM slab zip uploads and trickles.

It would be a good idea for anyone running a CPDN model to suspend network activity in the Activity menu of boinc manager until the problem is resolved. The model can then safely run.

We realise this is difficult for multi-project crunchers who need to keep network activity enabled. They may prefer to avoid the problem by setting CPDN to No new work (using the button in the Projects tab) and suspending their model in the Tasks tab. Then crunch other projects' tasks until the problem is resolved.

It's best to avoid multiple failed uploads of zip files. These are produced at the beginning of December at the end of each decade (eg 1930, 2040) for HadCM and at the end of each 15-year phase for HadSM slabs.
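For anyone who wants to work out how close a HadCM model is to its next zip file, here is a rough Python sketch of the rule described above. It assumes the zip is created on 1 December of every model year divisible by 10 (as in the 1930 and 2040 examples) and treats model dates as ordinary calendar dates, which may not match the model's own calendar exactly. The function name `next_hadcm_zip_date` is made up for illustration and is not part of boinc or the CPDN software.

```python
from datetime import date


def next_hadcm_zip_date(model_date: date) -> date:
    """Return the next assumed HadCM zip point on or after model_date.

    Assumption: a zip file is produced on 1 December of each model
    year that is a multiple of 10 (e.g. 1930, 2040).
    """
    year = model_date.year
    # Round up to the nearest year divisible by 10 (keep it if already one).
    decade_year = year if year % 10 == 0 else year + (10 - year % 10)
    zip_date = date(decade_year, 12, 1)
    # If the model has already passed 1 December of that year,
    # the next zip point is a full decade later.
    if model_date > zip_date:
        zip_date = date(decade_year + 10, 12, 1)
    return zip_date


if __name__ == "__main__":
    today_in_model = date(1925, 6, 1)
    print(next_hadcm_zip_date(today_in_model))
```

Compare the result with the model date shown in the graphics or the Tasks tab to judge whether to suspend network activity soon. HadSM slabs are not covered here because the 15-year phase boundaries depend on the model's start year.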

BBC, CPDN beta and SAP models should not be affected by this problem.
Last edited by mo.v on Mon Dec 03, 2007 2:31 am, edited 1 time in total.

Post by mo.v »

Milo confirms that the cpdn server is short of space and says it will take a day or two to fix. At the moment cpdn model trickles are being accepted by the server, but not zip file uploads. This has now been confirmed for all cpdn models, both coupled and slab. Please see the previous post for when your model will create its next zip file.

The best plan of action is to either

* suspend network activity well before you reach a zip file point

* or if you can't suspend network activity, suspend the model instead so it doesn't reach a zip file point while the problem lasts

Beta, BBC and SAP models can upload files.

Post by mo.v »

For CPDN members - does not apply to beta, BBC or SAP.

Milo says the problematic server is up and running again 'for the moment'. My own zip file that couldn't upload has just uploaded. So boinc network activity can be allowed again.

But until the new server is delivered and installed in Oxford it would be a good idea to

* Check your boinc manager Transfers window regularly to ensure that no file is stuck there unable to upload

* Check your boinc manager messages each day if you can. If you have an up-to-date version of boinc, messages indicating a problem will be coloured red

* Check this news thread regularly

* If you only run climate models (not other projects with short tasks) remember you can if you wish keep boinc network activity suspended most of the time and only allow it once a day/week (or month in a few situations!)

Post by mo.v »

We've just discovered that CPDN credits haven't been sent to the stats sites for two days, though BBC and SAP credits seem to be getting through normally. Milo has been informed.

When there's a stats glitch everyone receives their correct credits eventually so there's no need to panic.

Post by mo.v »

Milo has fixed this and our CPDN credits on the stats sites have updated.

Post by mo.v »

  • On Wednesday, CPDN credits again did not export to the external stats sites, though CPDN beta, SAP and BBC credits did. We are receiving our credits within CPDN but they are not being reported to these sites. This problem could perhaps take until next week to resolve fully.
  • Again on Wednesday, some CPDN members reported that their HadCM 10-year zip files or HadSM end-of-phase zip files failed to upload to the CPDN server. They received the boinc manager message 'No space left on device'. The device in question is the server whose disk is full (not our home computer disks). Trickles appear to be accepted normally; the problem is the zip files.
  • Tolu has been informed.
  • Until the problem is resolved, if you have a zip file waiting to upload in the boinc manager Transfers window, suspend network activity. We realise that this is difficult for multi-project crunchers. The ideal is to avoid multiple failed upload attempts.

    If your HadCM model is approaching the end of a decade (eg Dec 1 2030), or your HadSM is approaching the end of a 15-year phase, suspend network activity before it creates the zip file. Multi-project crunchers who need network activity could suspend their model in the Tasks tab (while setting CPDN to No new work in the Projects tab) before the zip file is created.
  • Otherwise, CPDN models can safely be left running.
  • BBC, CPDN beta and SAP zip files are uploading normally.
  • The good news is that the new CPDN server, with many terabytes of disk space, has been delivered to Oxford. The less good news is that it hasn't yet been installed. After its installation these periodic server crises should be over.
  • Thank you to all crunchers for your patience and good humour.

Post by mo.v »

* The CPDN upload server is now back to normal and accepting zip files, so the special precautions are no longer necessary. A new server disk has been installed to provide plenty of space for these files.

* The CPDN credits export to the stats sites also appears to be functioning normally now.

* On an unrelated but hot topic, Milo says
'We have had two attempts to upgrade or alter the server software so that it will send the usual 32-bit apps to those who connect with a 64-bit client. Unfortunately, we haven't yet managed to make it work properly. All I can say is that we will manage it eventually.'

Post by mo.v »

The CPDN upload file server climateapps3 is down. This does not affect Beta, BBC or SAP uploads. Further up this thread there are links to all the ClimatePrediction server status pages.