Understanding an error
How to understand an error that BlueSky has given during a run
Although we strive to make the BlueSky Framework easy to use, simple to configure, and reliable in operation, sometimes things do go wrong. Whenever something goes wrong, BlueSky will abort the current run and print an error message. Understanding what these error messages mean will help you to find and resolve the source of the problem.
BlueSky error-handling process
When an error occurs while running a BlueSky module, the module itself is given the first opportunity to resolve the error. If the module resolves it successfully, no error message is printed (although a note or warning may be logged to the run log). In some cases, the module will recognize the error and do some cleanup of its own, but then re-raise the error so that it still propagates.
If the module doesn't resolve the error, then it is propagated back to the BlueSky Kernel (the engine behind the BlueSky Framework). The BlueSky Kernel considers the module to have failed, and will begin to abort the run. At this point you will see a message like:
BlueSky: >>> Aborting execution
However, before the framework aborts, it tries to write out whatever data it has in whatever output formats the user has requested. So the BlueSky Kernel goes through each of the selected output modules in turn and tries to run it with whatever data it does have. Since these output modules may be getting an incomplete set of data, they may produce additional warning messages. Once BlueSky has tried to run all the output modules, it prints a more detailed error message. Finally, the framework terminates with an error code.
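The sequence described above can be sketched in a few lines of Python. This is only an illustration of the control flow, not the BlueSky Kernel's actual code: the function name run_pipeline, the callable-per-module design, the wording of the warning and error lines, and the exit status of 1 are all assumptions made for the example.

```python
def run_pipeline(modules, output_modules, log=print):
    """Run modules in order; on failure, still try every output module.

    Hypothetical sketch: returns 0 on success and 1 on failure.
    """
    data = []
    error = None
    for module in modules:
        try:
            module(data)
        except Exception as exc:
            # The module could not resolve the error itself, so the run aborts.
            error = exc
            log("BlueSky: >>> Aborting execution")
            break

    # Output modules run even after a failure, with whatever data exists.
    for output in output_modules:
        try:
            output(data)
        except Exception as exc:
            log(f"BlueSky: WARNING: output module failed: {exc}")

    if error is not None:
        # The detailed error is reported only after the outputs were attempted.
        log(f"BlueSky: >>> ERROR: {error}")
        return 1
    return 0


def failing_input(data):
    # Stand-in for an input module that cannot retrieve its data.
    raise RuntimeError("Invalid username or password")


def write_locations(data):
    # Stand-in for an output module that writes whatever it was given.
    print(f"StandardFiles: Successfully wrote {len(data)} fire locations")


if __name__ == "__main__":
    status = run_pipeline([failing_input], [write_locations])
    print("exit status:", status)
```

The point of the sketch is the ordering: the abort notice comes first, the output modules run next, and the detailed error appears last.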
Parts of a BlueSky Framework error
Here is an example BlueSky Framework run that was aborted by an error:
BlueSky: BlueSky Framework version 3.1.0 (rev 5953)
BlueSky: Using OUTPUT_DIR: /projects/BlueSky/dpryden/trunk/build/output/2008050100.1/
BlueSky: Using WORK_DIR: /projects/BlueSky/dpryden/trunk/build/working/2008050100.1/
BlueSky: Emissions period: 20080501 00Z to 20080502 00Z
BlueSky: Dispersion period: 20080501 00Z to 20080502 00Z
SMARTFIRE: Downloading fire data for 2008-05-01 from SMARTFIRE...
SMARTFIRE: Error retrieving data from SMARTFIRE: >>> Invalid username or password
SMARTFIRE: For more information, see http://www.getbluesky.org/smartfire/
BlueSky: >>> Aborting execution
StandardFiles: Writing fire locations to standard format file
StandardFiles: Successfully wrote 0 fire locations
SMOKEReadyFiles: Writing fire locations to SMOKE-ready files
SMOKEReadyFiles: WARNING: No fires were written to SMOKE-ready files
BlueSky: >>>
>>> ERROR: Invalid username or password for SMARTFIRE web service
>>> File "build/base/modules/smartfire.py", line 173, in run
>>>
BlueSky: Completed in 1.08 seconds
Notice that the framework starts up normally. Once it gets to the SMARTFIRE input step, however, there is an error downloading input fire data from the SMARTFIRE system. The SMARTFIRE input module recognizes the cause of the error, and writes:
SMARTFIRE: Error retrieving data from SMARTFIRE: >>> Invalid username or password
SMARTFIRE: For more information, see http://www.getbluesky.org/smartfire/
However, the SMARTFIRE module isn't able to usefully handle this error, so it propagates the error back to the BlueSky Kernel. The kernel, in turn, tries to run the selected output modules -- in this case, the StandardFiles and SMOKEReadyFiles outputs. Because the SMARTFIRE module wasn't able to retrieve any inputs, and because there were no other input data sources in this run, there is no data for either of these modules to output. So both the StandardFiles and SMOKEReadyFiles output modules print informational messages to this effect and do nothing:
StandardFiles: Writing fire locations to standard format file
StandardFiles: Successfully wrote 0 fire locations
SMOKEReadyFiles: Writing fire locations to SMOKE-ready files
SMOKEReadyFiles: WARNING: No fires were written to SMOKE-ready files
Now the error that the SMARTFIRE module raised is printed in its entirety:
BlueSky: >>>
>>> ERROR: Invalid username or password for SMARTFIRE web service
>>> File "build/base/modules/smartfire.py", line 173, in run
>>>
Notice that the log message starts with "BlueSky". This is because at this point the log message is coming from BlueSky itself, not from the SMARTFIRE module (even though the SMARTFIRE module is the one that raised the error).
Next, BlueSky prints the error message that the SMARTFIRE module provided. The message is "Invalid username or password for SMARTFIRE web service". In this case, the error is of a generic type, but the module could have provided an error type code that might indicate the kind of problem. In that case, there would be an additional line in the error output like:
>>> ERROR TYPE: ValueError
(An error of type ValueError would indicate that the cause of the problem is that, somewhere, a field has an invalid value.)
After the error description, a stack trace is printed. This is a step-by-step list of the subroutines that were executing when the error occurred. Each entry names the file containing the source code for that subroutine. If the source code is part of a module (and thus the source code file is locally available), then a line number is printed as well, corresponding to the line of the file where the error originated. Here, we can see that the error originates from line 173 of the file "build/base/modules/smartfire.py", inside a subroutine called run.
In this example, the error was raised by a module, so BlueSky prints the stack trace only until it reaches the BlueSky Kernel code -- which here leaves a single entry. However, if an error originates in the BlueSky Kernel itself, the stack trace will show a much more complicated flow of execution inside the framework. That might mean that there is a bug in the framework, or it may simply mean that the framework failed because of a system condition. For example, if the operating system is running low on disk space or memory, or if data files have been corrupted somehow, you may see errors originating inside the framework itself. In most cases, however, the error is caused by a module and not by the framework.
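The three parts described above (message, optional error type, and stack trace) correspond to information that any Python exception carries. The following is a minimal sketch using the standard traceback module. The helper describe_error and the stand-in run function are hypothetical, and the formatting only imitates the example output; it is not BlueSky's own reporting code.

```python
import traceback


def run():
    # Hypothetical stand-in for a module's run() subroutine.
    raise ValueError("Invalid username or password for SMARTFIRE web service")


def describe_error(exc):
    """Return report lines: message, error type, then one line per stack entry."""
    lines = [
        f">>> ERROR: {exc}",
        f">>> ERROR TYPE: {type(exc).__name__}",
    ]
    for frame in traceback.extract_tb(exc.__traceback__):
        lines.append(
            f'>>> File "{frame.filename}", line {frame.lineno}, in {frame.name}'
        )
    return lines


try:
    run()
except ValueError as exc:
    report = describe_error(exc)
    print("\n".join(report))
```

The last stack entry is the subroutine that raised the error, which is why the bottom of the trace is usually the most useful place to start reading.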
Finally, now that there are no modules left to run, the framework terminates:
BlueSky: Completed in 1.08 seconds
What to do next
If the error message indicates a problem with a simple solution, like "Invalid username or password for SMARTFIRE web service", then fixing that problem directly is probably the best step to take. If the error is less clear, like "Unable to convert value to specified type", then a good place to start is to look at your input data files. A misnamed CSV column or a malformed value in one of the fields may be the cause of the error.
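A quick way to rule out the two input problems mentioned above is to check the CSV header and the numeric fields before starting a run. The sketch below uses only the Python standard library. The helper check_csv and the column names in the example (id, latitude, longitude) are hypothetical and are not taken from the BlueSky input specification, so substitute the columns your own input files are expected to contain.

```python
import csv
import io


def check_csv(text, required_columns, numeric_columns=()):
    """Return a list of human-readable problems found in CSV text."""
    problems = []
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    for name in required_columns:
        if name not in header:
            problems.append(f"missing column: {name}")
    # The header is line 1, so the first data row is line 2
    # (assuming no quoted field spans several lines).
    for row_number, row in enumerate(reader, start=2):
        for name in numeric_columns:
            value = row.get(name)
            if value is None:
                continue  # column absent or row too short; reported above
            try:
                float(value)
            except ValueError:
                problems.append(
                    f"line {row_number}: {name}={value!r} is not a number"
                )
    return problems


# Illustrative data only: a misspelled header and a malformed latitude.
sample = "id,latitude,longitud\nf1,45.2,-120.1\nf2,north,-119.8\n"
for problem in check_csv(
    sample, ["id", "latitude", "longitude"], ["latitude", "longitude"]
):
    print(problem)
```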
If you're not sure what an error message means, it can be useful to try running BlueSky with the -v (verbose) or -vv (very verbose) command line switches. This will print more information to the console, so you can see what was going on at the time the error occurred. Alternatively, you can inspect the run log, which contains the verbose log messages. If the framework exited abnormally, this will be in a file called run.log in the WORK_DIR. BlueSky will print the location of this directory at the beginning of the run, like so:
BlueSky: Using WORK_DIR: /projects/BlueSky/dpryden/trunk/build/working/2008050100.1/
(If the framework didn't actually exit because of the error, but went on to clean up and exit normally, then the run.log file will get archived in the archive-YYYYMMDDHH.tar.gz file in the output directory. You can use the tar command to extract the run.log from this archive.)
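If the tar command is not convenient, the same extraction can be done with Python's standard tarfile module. This is a sketch under one assumption: the location of run.log inside the archive is not documented here, so the hypothetical helper extract_run_log searches every member for a file named run.log rather than relying on a fixed path.

```python
import tarfile


def extract_run_log(archive_path, dest_dir="."):
    """Extract every file named run.log from a .tar.gz archive.

    Returns the list of archive member names that were extracted.
    """
    extracted = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            # Match on the final path component, wherever the file is stored.
            if member.isfile() and member.name.split("/")[-1] == "run.log":
                archive.extract(member, path=dest_dir)
                extracted.append(member.name)
    return extracted
```

Call it with the path to the archive-YYYYMMDDHH.tar.gz file in your output directory. The extracted file keeps its path from inside the archive, relative to dest_dir.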
Look through the run.log file for the "Aborting execution" message. Just above it you should see some more detail about what was going on when the problems began.
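For a long run log, a small script can pull out just the lines leading up to the abort. The sketch below is one way to do that; the helper context_before_abort and the default of 20 context lines are arbitrary choices for illustration, and the sample lines stand in for a real run.log.

```python
from collections import deque


def context_before_abort(lines, marker="Aborting execution", context=20):
    """Return up to `context` lines before the first line containing `marker`,
    followed by the marker line itself. Returns [] if the marker never appears.
    """
    recent = deque(maxlen=context)  # keeps only the most recent lines
    for line in lines:
        if marker in line:
            return list(recent) + [line]
        recent.append(line)
    return []


# Illustrative stand-in for the contents of a run.log file.
sample_log = [
    "SMARTFIRE: Downloading fire data for 2008-05-01 from SMARTFIRE...\n",
    "SMARTFIRE: For more information, see http://www.getbluesky.org/smartfire/\n",
    "BlueSky: >>> Aborting execution\n",
    "StandardFiles: Writing fire locations to standard format file\n",
]
for line in context_before_abort(sample_log, context=5):
    print(line, end="")
```

Because the function accepts any iterable of lines, an open file object for run.log can be passed in place of sample_log, and the file is read only as far as the marker.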
Sometimes, the root cause of an error is a bug or crash in an underlying model. In this case, the run.log will contain information about exactly how the model was executed, including the complete command line and working directory. It may be useful to try re-running the model manually with the intermediate files in the working directory to see if that provides more information about what went wrong. Additionally, many models produce their own log or "list" files that may provide more information.
Finally, if you're still stuck, you can contact the BlueSky Framework team for support, and we will help as best we can.



