Loobos using a suite

Loobos is a well-known eddy flux tower site in the Netherlands. The Loobos example dataset is familiar to most JULES users because it has been used as a standard example since JULESvn1.0 (since JULESvn4.3.1 the Loobos example dataset has been moved from the JULES model download to a separate ‘Doc’ area of the MOSRS).

At this point it’s good to highlight the following about the Loobos example dataset: “This example does a 1 year run at a single point, using the Loobos data. This is intended to be a test of the code, so that users can compare with a set of standard results - it is not necessarily the best set up for users who are interested in modelling the Loobos site.” (from the point_loobos_example.jin file in JULESvn2.x; my bold added). What this means is that Loobos is not a standard configuration in the sense of the note here: it's just a test run.

We're going to run this example using a Rose suite.

STEP 1: The first step is to get hold of an appropriate suite and then modify it for use with your system, which can be done in two ways:

   (1) Download from the Rosie Go system

   (2) Convert from a set of valid namelists using create_rose_app

What do I mean by "modify it for use with your system"? SUITES ALWAYS NEED TO BE MODIFIED BEFORE YOU CAN USE THEM. Rose suites are system-specific (unlike namelists): a suite for a particular run on JASMIN will not work on MONSooN even if all the science settings are the same. So if you are a JASMIN user and a MONSooN-using colleague sends you a suite (or you get it from Rosie Go), you will need to modify it before it will run (at the very least, you usually need to change the NetCDF library paths to the correct ones for your platform). I've explained how to do that in STEP 3 of links (1) or (2) above, using the point-and-click Rose Editor.

n.b. Even if you already have a Rose suite, please at least skim through the instructions of (1) or (2) - whichever is most appropriate - because (i) you may have a suite but may not have modified it appropriately for your system yet and (ii) your environment variables must be set up correctly, which you'll need in the steps below.

 

STEP 2: Next, quickly check that the output directory (exists and) is empty:
   ls $OUTPUT
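If you would rather have the check made for you than inspect the ls listing by eye, something like the following sketch could be used. The function name check_output_dir is my own invention (it is not part of Rose or JULES), and it assumes only a POSIX shell.

```shell
# Hypothetical helper (not part of Rose or JULES): report whether a
# directory exists and is empty, mirroring the manual "ls $OUTPUT" check.
check_output_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # ls -A also lists hidden files, but not "." and ".."
    if [ -n "$(ls -A "$dir")" ]; then
        echo "not empty: $dir"
        return 2
    fi
    echo "ok: $dir"
}

# Usage, assuming $OUTPUT and $RSUITE are already exported:
#   check_output_dir "$OUTPUT" && rose suite-run -C "$RSUITE"
```

The function only reports on the directory; it never creates or deletes anything, so it is safe to run before every suite-run.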
Running the model is now accomplished either by clicking the play button in the Rose Edit GUI or by typing:
   rose suite-run -C $RSUITE
and then expand the ‘linux      running’ line in the Cylc GUI window when it appears so that you can see the two extra lines fcm_make and jules. You don't see any [INFO] lines appearing in your terminal this time (they have been diverted to the job.out log file). After a minute or so (this is a small example of only a year’s data at a single point) it should finish and say “stopped with ‘succeeded’” in the bottom left corner of the Cylc GUI. Now close the Cylc GUI and you can check the output has appeared in $OUTPUT/ (three .dump. files).

Points for information:

  • Queuing does sometimes take a while with Cylc, but it should only be of the order of minutes.
  • Optionally, on a longer run, while the suite is running right-click on jules → View → job stdout to check on progress.
  • If you find that the fcm_make step works, but it hangs when it moves on to the jules app, check that you have correctly added the JULES command set to $PATH (step 5 here).
  • If you close the Cylc GUI accidentally, type cylc gscan & and double-click on the right Rose suite job to get it back.
  • A new directory ~/cylc-run/ will appear on your system when the run starts, if not there already. This is the runtime locations directory described here: when you run a Rose suite, a copy of the whole suite is put in this directory (specifically, at $CSUITE defined below) and it is this copy that is sent to Cylc rather than the original copy at $RSUITE. The directory ~/cylc-run/ is mostly just a scratch directory for Cylc, but it does also contain the JULES run logs you see in Rose Bush later.
  • Note that Rose suites run independently of the session you’re in (Cylc suites run as daemons), so you gain nothing by opening three separate shells and initiating one Rose suite-run in each rather than running all three in the same session.
  • Once you've done an initial run with rose suite-run, it is possible to avoid the fcm_make step on subsequent runs using rose app-run:

   rose app-run -C $RSUITE/app/jules >job.out

    (A copy of $RSUITE/app/jules/rose-app.conf mysteriously appears at $NAMELIST/rose-app-run.conf: just ignore it or see here for why.) HOWEVER, it must be said that the time saving using rose app-run is really minimal (remember fcm_make won’t recompile every time, but only after a change to the source code), so the advised way to run JULES with Rose is still to use the play button or rose suite-run.
  • How can you abort a run? If you're in Rose Edit, right-click on the “jules” job in the Cylc GUI, set its state to ‘failed’ and then click 'Stop Suite'. If on the command line, use:

   rose suite-shutdown --name=${RSUITE##*/} -- --now
   rose suite-clean ${RSUITE##*/}
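In case the ${RSUITE##*/} notation in those two commands is unfamiliar: it is standard shell parameter expansion that strips the directory part of the path, leaving the suite name that Rose and Cylc know the run by. A small illustration follows; the path is only an example and SUITE_NAME is a variable name I have made up for it.

```shell
# Example path only, for illustration.
RSUITE="$HOME/MODELS/iofiles/io_loobos/nmls/../rsLoobos"

# "##*/" removes the longest leading match of "*/", i.e. everything up to
# and including the last slash, leaving just the suite directory name.
SUITE_NAME="${RSUITE##*/}"
echo "$SUITE_NAME"    # prints rsLoobos
```

Note that a trailing slash on $RSUITE would make the expansion empty, so it is worth defining $RSUITE without one.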

I've found that some jobs are difficult to kill, however, so if you really need to (and please be careful if you do this), do the following:
   (i) cylc gscan & will open up a window to show all currently-running rose jobs (double-click on all ones you see to open windows for each and leave those open).
   (ii) For each job, reset the status of jules to "failed" and THEN click stop
   (iii) Back in the terminal, do top -u <username> and then kill -9 <PID> to kill any jules.exe jobs still running
   (iv) Finally, delete completely the large directory ~/cylc-run/ (although n.b. this will remove all logs of previous runs)
Those steps should kill any JULES runs you have in progress and then you can restart all from fresh.
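For step (iii), if you find top awkward for picking out process IDs, a sketch along these lines could be used to list them instead. The function name jules_pids is hypothetical, and it assumes the executable shows up under the command name jules.exe, as in the text above.

```shell
# Hypothetical filter: reads "PID COMMAND" lines on stdin and prints the
# PIDs whose command name is exactly jules.exe.
jules_pids() {
    awk '$2 == "jules.exe" { print $1 }'
}

# Usage (this only lists PIDs, it kills nothing):
#   ps -u "$(id -un)" -o pid= -o comm= | jules_pids
```

Keeping the listing separate from the kill -9 means you can look over the PIDs before deciding to kill anything.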

 

STEP 3: Look at the log files. The JULES screen output that used to appear in your terminal on running JULES (the [INFO] lines containing diagnostics of the JULES setup, spin-up and main run) has been redirected to a job.out log file (with the errors split off into a separate file job.err):
   export CSUITE=$HOME/cylc-run/${RSUITE##*/}
   echo $CSUITE
   more $CSUITE/log/job/1/jules/NN/job.out
   more $CSUITE/log/job/1/jules/NN/job.err
(By the way, the namelist files have also been extracted from the Rose suite and placed in $CSUITE/work/1/jules/, exactly as if you had used the command rose app-run -i -C $RSUITE/app/jules (see ss.3.5.2 here), along with a copy of rose-app-run.conf, so you can use them to run JULES without Rose later if needed.) The job.err file should only contain a few (ignorable) warnings because there were no errors in this job. If you prefer, you can also open these two log files through the Cylc GUI by right-clicking on the 'jules   running' line → View → job stdout/job stderr.
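The log paths above all follow one pattern, which can be wrapped in a small helper if you look at them often. This is only a sketch: job_log is a name I have made up, it assumes $RSUITE is set and that the run used cycle point 1 as on this page, and it relies on my understanding that NN is a link Cylc keeps pointing at the most recent submission of a task.

```shell
# Hypothetical helper: print the path of a Cylc job log for this suite.
#   $1 = task name (fcm_make or jules), $2 = job.out or job.err
# Assumes $RSUITE is set and the run used cycle point 1, as on this page.
job_log() {
    echo "$HOME/cylc-run/${RSUITE##*/}/log/job/1/$1/NN/$2"
}

# Usage: show the line count of every log that exists so far.
#   for t in fcm_make jules; do
#       for f in job.out job.err; do
#           [ -f "$(job_log "$t" "$f")" ] && wc -l "$(job_log "$t" "$f")"
#       done
#   done
```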

If running JULES in parallel (e.g. on JASMIN), it seems that those job.out and job.err files are not necessarily generated (or perhaps not generated until the run ends, which is much less useful). If so, modify lines 3-4 of your Rose suite's app/jules/rose-app.conf file, just under "[command]", from "default=rose-jules-run" to "default=<PATH_TO_MPIRUN_CMD> jules.exe > <PATH>/run.log 2> <PATH>/error.log", where <PATH> is where you want the logs to appear and <PATH_TO_MPIRUN_CMD> is /usr/local/bin/mpirun.lotus on JASMIN.
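If you would rather script that edit than make it by hand, the sketch below shows one way it might be done with sed. The function name patch_command is hypothetical, and it assumes the file contains the line default=rose-jules-run exactly as quoted above. It prints the modified file to standard output rather than editing in place, so you can inspect the result first.

```shell
# Hypothetical helper: print a copy of a rose-app.conf in which the default
# run command is replaced by an explicit mpirun call with redirected logs.
#   $1 = path to rose-app.conf, $2 = mpirun command, $3 = log directory
# "|" is used as the sed delimiter because the paths contain slashes.
patch_command() {
    sed "s|^default=rose-jules-run\$|default=$2 jules.exe > $3/run.log 2> $3/error.log|" "$1"
}

# Usage, with the JASMIN mpirun path from the text and an assumed log directory:
#   patch_command "$RSUITE/app/jules/rose-app.conf" \
#       /usr/local/bin/mpirun.lotus "$HOME/jules_logs" > rose-app.conf.new
```

The sketch would misbehave if either path contained a "|" or "&" character, since sed treats those specially in this command.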

An alternative, more user-friendly (but also usually slower, unfortunately) way of viewing these logs is to use Rose Bush:
   cd $RSUITE
   rose suite-log $RSUITE
   cd ~
(on MONSooN, instead launch firefox on exvmsrose and go to http://localhost/rose-bush/ - see here). Click on the links inside Rose Bush and make sure you can retrieve job.out for the jules app.

  • If the web browser opens but with a page starting “Index of file:///...”, just close it and try again
  • If you get a message "suite log not found", there is no record on your system of any run of that $RSUITE
  • If there is an error saying “can't establish a connection to the server at 0.0.0.0:8080” then check that there are two lines BUILD_HOST='localhost' and COMPUTE_HOST='localhost' just after [jinja2:suite.rc] in $RSUITE/rose-suite.conf (if those lines are there but you still get this error, it may be worth trying ‘127.0.0.1’ in place of ‘localhost’)
  • If Rose Bush doesn't work (or it's too slow), there is a simple alternative (see 'optional extra bit' below).
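The check for the two host lines mentioned in the connection-error bullet above can also be done from the command line. The following is a minimal sketch; check_hosts is a made-up name, and it only tests that both lines are present somewhere in the file, not that they sit directly under [jinja2:suite.rc].

```shell
# Hypothetical check: succeed only if both host settings appear in the
# given rose-suite.conf. $1 = path to the file.
check_hosts() {
    grep -q "^BUILD_HOST='localhost'" "$1" &&
    grep -q "^COMPUTE_HOST='localhost'" "$1"
}

# Usage:
#   check_hosts "$RSUITE/rose-suite.conf" || echo "host lines missing"
```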

 

CONGRATULATIONS: you have just succeeded in running JULES !!!

 

Summary:
   export M=$HOME/MODELS
   export JULES_ROOT=$M/jules-vn4.8
   export OUTPUT=$M/iofiles/io_loobos/output_Loobos1
   export NAMELIST=$M/iofiles/io_loobos/nmls
   export RSUITE=$NAMELIST/../rsLoobos
   export CSUITE=$HOME/cylc-run/${RSUITE##*/}
   echo $CSUITE
Check the Rose suite (either through Rosie Go or directly in Rose Edit):
   rosie go &
   rose edit -C $RSUITE &
(or, if necessary, more directly using nedit $RSUITE/app/jules/rose-app.conf &)
Compile and run JULES (copy of the namelists appears in $CSUITE/work/1/jules too, if you need them):
   rose suite-run -C $RSUITE          (or the Play button)
Check the output and logs:
   more $CSUITE/log/job/1/fcm_make/NN/job.err
   more $CSUITE/log/job/1/jules/NN/job.err
   more $CSUITE/log/job/1/jules/NN/job.out
   ls $OUTPUT
Or, using Rose Bush:
   cd $RSUITE
   rose suite-log $RSUITE
and finally, clean up previous run logs and scratch files:
   rose suite-clean ${RSUITE##*/}

 

OPTIONAL EXTRA BIT: I find that setting up environment variables helps quite a bit too:

 

(HISTORICAL NOTE for long-term JULES users: In JULESvn1.x a model simulation used to be initiated with:
   cd $JULES_ROOT
   ./xjules <jules_in_example
..., in JULESvn2.x with:
   cd $JULES_ROOT
   ./jules.exe <point_loobos_example.jin
...and in JULESvn3.x with:
   export NAMELIST=$JULES_ROOT/examples/point_loobos
   cd $NAMELIST
   $JULES_ROOT/jules.exe
It is only since JULESvn4.0 that the advice has been to use the Rose system instead, and the data for the Loobos example has been removed from the JULES download (and put on Rosie Go instead). There are a lot of advantages to this new system, but it is definitely quite a jump from before).