.. Marine user manual documentation master file, created by
   sphinx-quickstart on Mon Jun 22 22:10:11 2020.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to Marine user guide's documentation!
==============================================

.. toctree::
   :maxdepth: 3
   :caption: Contents:

Introduction
============

This project contains the necessary code to produce the data summaries that are included in the Marine User Guide (MUG). These help document the status of the marine in situ data in the CDS after every new data release.

The marine data available in the CDS is the result of a series of data releases that are stored in the marine data file system in different directories. This project uses the data in the marine file system, rather than accessing the CDS data.

The tools employed to create the individual source deck reports are also available in this project. These reports can be created for a single data release or for the combination of releases included in a Marine User Guide version.

Every new data release can potentially be created with a different version of the marine processing software. The current version of this project is compatible with the glamod-marine-processing [#gmp]_ code (https://glamod-marine-processing.readthedocs.io) up to version v8.0.0 (https://zenodo.org/records/17404810).

Tool set-up
===========

Code set up
-----------

To clone the latest available version of the Marine User Guide repository:

.. code-block:: bash

   git clone https://github.com/glamod/marine-user-guide

Build the Python environment using the requirements.txt.

.. code-block:: bash

   cd marine-user-guide
   python -m venv .venv/MUG
   source .venv/MUG/bin/activate
   pip install -r marine-user-guide/env/requirements.txt

Paths setup
-----------

Some directory paths and handles are used throughout this document and are summarized in Table 1.
.. table:: Some directory paths and handles used throughout the document.

   ============= ============================================ ==============================================================
   Shorthand     Description                                  Example
   ============= ============================================ ==============================================================
   \             Marine User Guide home directory             /ichec/work/glamod/glamod_marine/marine-user-guide
   \             Marine User Guide data directory             /ichec/work/glamod/data/marine/marine/marine-user-guide_202510
   \             Tag of the MUG version                       v10
   \             Directory for log files                      //level2/log
   \             ASCII file for sid-dck partitions to process /config//mug_list_full.txt
   \             Marine User Guide configuration JSON file    /config//mug_config.json
   ============= ============================================ ==============================================================

Marine User Guide
=================

Every C3S Marine User Guide version includes a series of figures that describe the marine in situ data holdings in the CDS. The following sections explain how these figures are created for every new version of the Marine User Guide.

Initializing a new user guide
-----------------------------

The data used by the tools in this project, and the products they create, are stored in the marine-user-guide data directory. This directory does not contain the actual data files, but links to the files in the data releases’ directories. This approach greatly simplifies the configuration of the different scripts and is followed even if a given Marine User Guide version is made up of a single data release.

The marine-user-guide data directory is then split into directories that host subsequent versions of the Marine User Guide (figure 1).

.. figure:: ../pics/in_data_space.png
   :width: 300
   :align: center

Edit and as needed for .

Every new version of the MUG needs to be initialized in the tools data directory as shown in figure 2.
.. figure:: ../pics/file_links.png
   :width: 300
   :align: center

   Marine User Guide data directory and its relation to the individual data releases' directories.

These steps initialize a new version:

1. Create the data configuration file (*mug_file*) by merging the level2 configuration files of the different data releases included in the new version (`level2`).

   .. code-block:: bash

      source /setenv.sh
      python /init_version/init_config.py

   See table 1 for the meaning of the shorthands.

   This step copies a and a to /. Hereinafter, these are referred to as **MUG_version_config** and **MUG_version_list**.

   An example of this step is as follows:

   .. code-block:: bash

      $ python init_version/init_config.py
      Input name of release (no path: release_8.0)

2. Create the directory tree for the version in the marine-user-guide data directory.

   .. code-block:: bash

      python /init_version/create_version_dir_tree.py

   Note that the first two lines do not need to be repeated if these steps are performed in one session. For completeness we will repeat them every time here.

3. Populate the directory tree with a view of the merged data releases. Rather than copying all the files, this is done by linking the corresponding files from the release directories to the marine-user-guide data directory. The linked data comprise the level2 data files and the level1a and level1c quicklook JSON files. A bash script links each data partition and logs to /sid-dck/merge_release_data.*ext*, with *ext* being ok or failed depending on the job termination status.

   .. code-block:: bash

      ./init_version/merge_release_data.slurm

   where:

   * mug_config: path to *mug_config* file
   * mug_list: path to *mug_list* file.

4. Check that the linked files really reflect the merge of the releases. Edit the following script to add the corresponding paths and run it. If any file does not match, the script reports an error.

   .. code-block:: bash

      ./init_version/merge_release_data_check.sh

   .. important:: This is not working yet!
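The linking and checking idea behind the last two steps can be illustrated with a minimal Python sketch. This is not the code of merge_release_data.slurm or merge_release_data_check.sh: the function names ``link_release_files`` and ``check_links`` and the ``*.psv`` file pattern are assumptions made only for this illustration.

```python
from pathlib import Path


def link_release_files(release_dir, mug_dir, pattern="*.psv"):
    """Symlink every file matching `pattern` from release_dir into mug_dir.

    Hypothetical helper: the pattern and the flat directory layout are
    assumptions, not the layout used by the project scripts.
    Returns the list of links that were newly created.
    """
    release_dir = Path(release_dir)
    mug_dir = Path(mug_dir)
    mug_dir.mkdir(parents=True, exist_ok=True)
    linked = []
    for source in sorted(release_dir.glob(pattern)):
        target = mug_dir / source.name
        if target.is_symlink() or target.exists():
            continue  # leave entries that are already present untouched
        target.symlink_to(source.resolve())
        linked.append(target)
    return linked


def check_links(release_dir, mug_dir, pattern="*.psv"):
    """Return the names of release files without a matching link in mug_dir."""
    release_dir = Path(release_dir)
    mug_dir = Path(mug_dir)
    missing = []
    for source in sorted(release_dir.glob(pattern)):
        target = mug_dir / source.name
        # A link counts as valid only if it points at this release file.
        if not (target.is_symlink() and target.resolve() == source.resolve()):
            missing.append(source.name)
    return missing
```

Calling ``link_release_files`` once per release directory and then ``check_links`` for each of them would report any release file that is not represented in the merged view; an empty list means the view is complete for that release.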
Data summaries
--------------

The data summaries are monthly aggregations over all the source-deck ID partitions in the data. They aggregate the data counts, the observation values and some relevant quality indicators, and they are the basis for the time series plots and maps included in the MUG.

Monthly grids
^^^^^^^^^^^^^

Aggregations in monthly lat-lon grids. The CDM table determines which aggregations are applied:

* header table: number of reports per grid cell per month.
* observations tables: number of observations and mean observed_value per grid cell per month.

Each aggregation is stored in an individual netCDF file.

Monthly time series of selected quality indicators
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Monthly time series of quality indicators' value counts aggregated over all the source-deck partitions. These are additionally split into counts by main platform type (ships and buoys) and include the total number of reports. They are stored in ASCII pipe-separated files.

The configuration file monthly_qi.json includes very limited parameterization, basically the data paths. The Python script only works on the CDM header table quality indicators.

Running the code: Data summaries
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Grid and time series aggregations are performed by monthly_grids.py and monthly_qi.py, respectively. However, for speed and ease of use, both scripts are configured and launched (in parallel) by monthly_agg_slurm.py. They use the common configuration file monthly_grids.json (Monthly grids).

The launcher script configures and queues a single SLURM job in the log directory, named monthly.slurm, which executes each line of the monthly.tasks file in the same directory individually. Depending on the job termination status, each aggregation creates an empty aggregation_name.success or aggregation_name.failure file in the log directory.
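The task-file and marker-file pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the content of monthly.slurm: the function name ``run_tasks`` is hypothetical, and the markers are named ``task_<n>`` after the line number because the sketch has no way of knowing the aggregation names used by the project.

```python
import subprocess
from pathlib import Path


def run_tasks(tasks_file, log_dir):
    """Run every non-empty line of tasks_file as a shell command.

    For each command an empty task_<n>.success or task_<n>.failure marker
    is written to log_dir (marker naming is an assumption of this sketch).
    Returns a dict mapping the task number to its exit code.
    """
    log_dir = Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)
    lines = Path(tasks_file).read_text().splitlines()
    commands = [line.strip() for line in lines if line.strip()]
    results = {}
    for number, command in enumerate(commands, start=1):
        # Each line is executed on its own; one failure does not stop the rest.
        completed = subprocess.run(command, shell=True)
        status = "success" if completed.returncode == 0 else "failure"
        (log_dir / f"task_{number}.{status}").touch()
        results[number] = completed.returncode
    return results
```

This sketch runs the commands one after another, whereas the launcher described above runs them in parallel inside a SLURM job. Writing one empty marker per task keeps the check for failed aggregations down to listing the log directory.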
The current configuration for the MUG excludes reports that do not pass all the quality checks. The same tool can be used to produce data summaries with different filter criteria by modifying the filter values in the configuration file.

.. code-block:: bash

   source /setenv.sh
   python /data_summaries/monthly_agg_slurm.py

See table 1 for the meaning of the shorthands.

This creates a text file containing Python commands in //level2/log/monthly.tasks. You can simply run it by:

.. code-block:: bash

   //level2/log/monthly.tasks

Or you can execute the individual Python commands in your terminal.

Figures
-------

The data summaries generated are used to create the maps and time series plots included in the Marine User Guide. The following sections give the necessary directives to create them, with references to the configuration files used.

Number of reports time series plot
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* Data summary used: Monthly time series of selected quality indicators (report_quality counts file: total number of reports field only)
* Configuration file: nreports_ts_plot.json

Duplicate status time series plot
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* Data summary used: Monthly time series of selected quality indicators (duplicate_status file)
* Configuration file: duplicate_status_ts_plot.json

Report quality time series plot
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* Data summary used: Monthly time series of selected quality indicators (report_quality file)
* Configuration file: report_quality_ts_plot.json

Number of reports Hovmoeller plots
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* Data summary used: Monthly grids (report counts files: header and observation tables)
* Configuration file: nreports_hovmoller_plot.json

ECV coverage time series plot grid
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* Data summary used: Monthly grids (report counts files: header and observation tables)
* Configuration file: ecv_coverage_ts_plot_grid.json

Number of reports and number of months maps
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

* Data summary used: Monthly grids (report counts files: header and observation tables)
* Configuration file: nreports_and_nmonths_maps.json

Mean observed value maps
^^^^^^^^^^^^^^^^^^^^^^^^

* Data summary used: Monthly grids (mean files: observation tables)
* Configuration file: mean_observed_value_maps.json

Running the code: Figures
^^^^^^^^^^^^^^^^^^^^^^^^^

The above figures can be created individually or all at once with the bash script /figures/plot_all.sh. For the syntax to run individual plotting scripts, we recommend looking into plot_all.sh. Each figure requires its own configuration file, located in //figures/, which might need some edits with new versions of the MUG. Where is defined in /setpath.sh.

.. code-block:: bash

   source /setenv.sh
   /figures/plot_all.sh

The optional argument can be `grid` or `ts`, to plot only gridded properties (`grid`) or only time series plots (`ts`). If it is not specified, all figures are created.

The bash script executes all plotting scripts in parallel on the login node. We consider this light post-processing, which is permitted on login nodes. However, if more data is added, it could become necessary to move this to a production node (see individual SID-DCK chapter/code).

Log files are written to the log directory (log_dir) and are named in accordance with the scripts. Figures are saved to //level2/reports/. There are no .success/.failure files in this case because the presence or absence of figures is already a good indicator of the exit status.

Appendix
========

The configuration files needed to run this project are maintained in the glamod GitHub repository (https://github.com/glamod/marine-user-guide) under the directory config/. Every Marine User Guide version has a dedicated directory within this repository, which is further subdivided by data summaries and figures, for the whole dataset and optionally for individual source-deck combinations (`sd`).
The Marine User Guide v10 was created with *MUG v10* of the GitHub Marine User Guide repository, without individual source-deck combinations.

.. rubric:: Footnotes

.. [#gmp] Lierhammer, L., Andersson, A., Leiding, T., Cornes, R., Kent, E., Siddons, J., and Kennedy, J. (2025). glamod-marine-processing: Toolbox for GLAMOD marine processing (v8.0.0). Zenodo. https://doi.org/10.5281/zenodo.17404810

.. [#fDDS] When producing data summaries and figures of individual source-decks of a single release, the data is accessed directly from the release data directory.