WP1 - Data collection and metadata compilation in sea regions
Sissy Iona (HCMR/HNODC)
EMODNET Chemistry 2 - 4th Steering Committee, 2-3 December 2014, Amsterdam, The Netherlands
Overview of last month activities
• Creation of metadata enriched ODV collections – Regional data set of 28 October 2014
• Data aggregations (not complete: conversions missing)
• QC (zero values, N:P) (not complete)
• No DIVA runs
Findings -1: mismatches
Total: 87024 odv txt files imported in ODV 4.6.3 linux 64-bit (Oct 2014)
• 4393 files with mismatches in local_cdi_ids between ODV txt and CDIs
• 30465 files with mismatches in edmo_codes between ODV txt and CDIs
• 15 files with both mismatches
• 26 edmo_codes in CDIs (csv files)
• 36 edmo_codes in ODV txt files
Summary of mismatches
EDMO (from csv file) | CDI-Partner | nb of CDIs | Remarks
43 | BODC | 18 | All OK
108 | CNR-Venezia-Italy | 280 | In all, cdi-partner=108 while originator=120
110 | ENEA-Frascati-Italy | 17 | In all, cdi-partner=110 while originator=120
120 | OGS | 39217 | 22716 had cdi-partner=120 and originator=108, 127, 128, 134, 144, 145, 149, 237, 238, 1009, 1010, 1130, 1338, 1339, 1710, 2259, 2431
127 | CNR-Trieste-Italy | 320 | In all, cdi-partner=127 while originator=120
134 | CNR-Lerici-Italy | 453 | In all, cdi-partner=134 while originator=120
144 | ISMAR-Ancona-Italy | 1739 | In 140, cdi-partner=144 while originator=120
145 | ISMAR-Bologna-Italy | 18 | In all, cdi-partner=145 while originator=120
164 | HCMR | 2200 | In all, cdi-partner=164 while originator=269
237 | SZN-Italy | 243 | In all, cdi-partner=243 while originator=120
269 | HNODC | 6426 | 505 empty, 4 zero file, 18 had differences between local_cdi_id in ODV files and in CDIs, 2947 had cdi-partner=269 and originator=164
353 | IEO-Spain | 5909 | All OK
486 | IFREMER-SISMER | 7770 | 1 empty, All OK
681 | RIHMI-WDC-Russia | 3302 | All OK
696 | IMS-METU-Turkey | 4247 | All had differences between local_cdi_id in ODV files and in CDIs
700 | IOF-Croatia | 1625 | All OK
708 | Malta | 128 | All had differences between local_cdi_id in ODV files and in CDIs
711 | OC-UCY-Cyprus | 512 | 4 empty, all other OK
730 | ICES | 40 | All OK
963 | IORL - Israel | 3623 | In 29, cdi-partner=963 while originator=710
1130 | ARPA Emilia Romagna | 1086 | In all, cdi-partner=1130 while originator=120
1229 | NIB-Slovenia | 6902 | All OK
1232 | INSTM-Tunisia | 148 | All OK
2432 | IMBK-Montenegro | 13 | All OK
3009 | ISPRA | 711 | All OK
3234 | PANGAEA | 77 | All OK
Total | | 87024 |
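The kind of cross-check behind these counts can be sketched in Python. This is only an illustration, not the procedure actually used for the slides: the `Ident` record, the `classify_mismatches` function and the idea of keying both sources by file name are assumptions, and the sketch counts a file with both mismatches only once, under "both".

```python
from typing import Dict, List, NamedTuple


class Ident(NamedTuple):
    """Identifiers of one data file (hypothetical record layout)."""
    edmo_code: int
    local_cdi_id: str


def classify_mismatches(odv: Dict[str, Ident],
                        cdi: Dict[str, Ident]) -> Dict[str, List[str]]:
    """Group file names by the kind of mismatch between ODV txt and CDI."""
    report: Dict[str, List[str]] = {"local_cdi_id": [], "edmo_code": [], "both": []}
    # Only files present in both sources can be compared.
    for name in sorted(odv.keys() & cdi.keys()):
        id_differs = odv[name].local_cdi_id != cdi[name].local_cdi_id
        edmo_differs = odv[name].edmo_code != cdi[name].edmo_code
        if id_differs and edmo_differs:
            report["both"].append(name)
        elif id_differs:
            report["local_cdi_id"].append(name)
        elif edmo_differs:
            report["edmo_code"].append(name)
    return report


# Invented example values, for illustration only.
odv_files = {"a.txt": Ident(120, "X1"), "b.txt": Ident(269, "X2")}
cdi_records = {"a.txt": Ident(108, "X1"), "b.txt": Ident(269, "X2")}
print(classify_mismatches(odv_files, cdi_records))
```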
Findings -2: empty files
• 510 empty files
• 4 zeros odv txt files
Findings -3: invalid code (?)
During import in ODV
• Error: Invalid parameter code 'SDN:P01::PRESPS02' of primary variable 'PRES (DECIBAR=10000 PASCALS)' detected
– Action: PRESPS02 changed to PRESPR01
Findings -4: semantic descriptions
During import in ODV: same label, same code, different unit *
• Warning: Duplicate variable name 'DISSOLVED OXYGEN' found in line 9. None of the 'DISSOLVED OXYGEN' variables will be imported.
• Error: Header Line. 'DISSOLVED OXYGEN [umol/l]' is not in the semantic header. Cannot import this variable.
• Error: Header Line. 'DISSOLVED OXYGEN [ml/l]' is not in the semantic header. Cannot import this variable.
* Same findings for other parameters, e.g. PARTICULATE ORGANIC NITROGEN
Action: change the user label in the semantic header (and in the column header)
Findings -5: semantic descriptions
During import in ODV: mismatch of labels in semantic and column headers
• Error: Header Line. 'NITRATE _NO3-N_ CONTENT [umol/kg]' is not in the semantic header. Cannot import this variable.
• Error: Semantic header entry 'NITRATE _NO3-N_ CONTENT ' not found in column header line
One name is written with 6 spaces, the other with 7 spaces.
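Label pairs that differ only in spacing are easy to miss by eye. A minimal Python sketch of a pre-import check is given below. The function names are hypothetical, the check assumes units have already been split off the column labels, and the position of the extra spaces in the example is invented (the slide only reports 6 versus 7 spaces).

```python
import re
from typing import Dict, List, Tuple


def normalise(label: str) -> str:
    """Collapse runs of whitespace to one space and strip both ends."""
    return re.sub(r"\s+", " ", label).strip()


def whitespace_only_mismatches(semantic_labels: List[str],
                               column_labels: List[str]) -> List[Tuple[str, str]]:
    """Return (semantic, column) pairs that differ only in their spacing."""
    by_normalised: Dict[str, List[str]] = {}
    for column in column_labels:
        by_normalised.setdefault(normalise(column), []).append(column)
    pairs = []
    for semantic in semantic_labels:
        for column in by_normalised.get(normalise(semantic), []):
            if column != semantic:  # same words, different spacing
                pairs.append((semantic, column))
    return pairs


# Invented example: the same name padded with 6 and with 7 spaces.
semantic = ["NITRATE _NO3-N_ CONTENT" + " " * 6]
columns = ["NITRATE _NO3-N_ CONTENT" + " " * 7]
for sem, col in whitespace_only_mismatches(semantic, columns):
    print(repr(sem), "vs", repr(col))
```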
Findings -6: semantic descriptions
• Missing closing brackets
• Action: corrected missing closing brackets
Findings -7: format errors
During import in ODV: wrong time series format
• Error: Invalid parameter code 'SDN:P01::YEARXXXX' of primary variable 'YEAR (yyyy)' detected
Action: format correction
Findings -8: Warnings
• Warning: Label of primary variable differs from standard: Expected 'time_ISO8601' but found 'time_ISO8601 [ISO8601]'
• Warning: Header Line. Incorrect meta-variable label: expected 'EDMO_code' - found 'EDMO_CODE'
• Warning: Header Line. Incorrect meta-variable label: expected 'Bot. Depth [m]' - found 'Bot.Depth [m]'
• Warning: Unexpected empty line (line 2).
• Warning: Unexpected empty line (line 16).
• Warning: Unexpected end of SDN semantic header in line 17.
Import to ODV V4.6.3.3 Linux 64-bit (unofficial version because current regional data are not corrected)
Metadata enriched ODV collections
Data aggregations
QC (zero values, N:P)
Substitution procedure of zero values with respective LOQ/2 (1/3)
• It should be applied after the typical QC procedure (selection of 0, 1, 2, 6 QC flags, data aggregation, search for out-of-range data, broad range checks, default values treatment, etc.).
• It should also be applied for each institute/NODC separately.
• A macro file that contains the expression and the LOQ/2 value must be created.
• The procedure described below in four steps refers to the simple case of substituting the zero values of one parameter.
In ODV 4.6.3:
STEP 1: Create the macro file:
• Tools > Macro Editor > Create new
• Complete the fields (bold and underlined): 'Label', 'Units' and 'Digits'; 'Comments' (optional)
• 'Input Variables': write a name in the 'New' field and then click << to move the variable to the 'Defined' field
• 'Expression in Postfix Notation': #1 0.000 <= #1 x.xxx + #1 IFTE
• Explanation of the notation: if the input variable #1 is less than or equal to 0.000, the result is #1 + x.xxx (where x.xxx is the respective LOQ/2 value); otherwise the result is #1
• Save the macro file (e.g. ForZeroChanges.mac)
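To make the postfix expression easier to follow, here is a small Python sketch of a stack evaluator that handles just the tokens used in the macro. It is not ODV code: `eval_postfix` is a hypothetical helper, the IFTE operand order (condition, then-value, else-value) is assumed from the explanation on the slide, and 0.025 is an invented LOQ/2 value standing in for x.xxx.

```python
from typing import Dict, List


def eval_postfix(expression: str, inputs: Dict[str, float]) -> float:
    """Evaluate a postfix expression limited to +, <=, IFTE, #n and numbers."""
    stack: List[float] = []
    for token in expression.split():
        if token.startswith("#"):
            stack.append(inputs[token])          # input variable, e.g. "#1"
        elif token == "+":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif token == "<=":
            right, left = stack.pop(), stack.pop()
            stack.append(1.0 if left <= right else 0.0)
        elif token == "IFTE":
            # Assumed order on the stack: condition, then-value, else-value.
            otherwise, then, condition = stack.pop(), stack.pop(), stack.pop()
            stack.append(then if condition != 0 else otherwise)
        else:
            stack.append(float(token))           # numeric constant
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


MACRO = "#1 0.000 <= #1 0.025 + #1 IFTE"  # 0.025 is an invented LOQ/2
for value in (0.0, 0.4):
    print(value, "->", eval_postfix(MACRO, {"#1": value}))
```

Because the expression adds LOQ/2 to #1 rather than replacing it, a negative input would come out shifted by LOQ/2, not set to LOQ/2; this is one reason to run the substitution only after the range checks listed above.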
Substitution procedure of zero values with respective LOQ/2 (2/3)
STEP 2: Identify the EDMO_Codes of zero-value sample data in the initial aggregated collection:
• Selection criteria > Availability (for the parameter of interest)
• Export > Station data > ODV Spreadsheet. Select variables: the primary variable and the variable you are working with. In Data Filter, apply sample range and quality filters. Range: select the working variable; Acceptable range fields: 0 - 0; OK.
• Close the initial aggregated collection
• Re-import the exported txt file in ODV and Export > Metadata. The EDMO codes of all data sets that contain zero values are in the exported metadata.odv file
• Note that when pressing F4 for data statistics in the zero-values collection, mean=0 and std=0 but minimum=-0.001 and maximum=0.001 (parameters with 3 digits) (bug?)
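Outside ODV, the same question (which EDMO codes contribute zero values for a parameter) can be answered with a few lines of Python. This is a sketch under assumptions: the samples are taken to be already parsed into dictionaries, and the key names `EDMO_code` and `phosphate`, like all the values, are invented for the example.

```python
from typing import Dict, Iterable, List, Optional


def edmo_codes_with_zeros(samples: Iterable[Dict[str, Optional[float]]],
                          variable: str) -> List[int]:
    """Return the sorted EDMO codes that have at least one zero value."""
    codes = set()
    for sample in samples:
        value = sample.get(variable)
        # Missing values (None) are skipped; only exact zeros count.
        if value is not None and value == 0:
            codes.add(int(sample["EDMO_code"]))
    return sorted(codes)


# Invented sample rows, for illustration only.
rows = [
    {"EDMO_code": 269, "phosphate": 0.0},
    {"EDMO_code": 269, "phosphate": 0.12},
    {"EDMO_code": 120, "phosphate": 0.0},
    {"EDMO_code": 353, "phosphate": 0.30},
    {"EDMO_code": 486, "phosphate": None},
]
print(edmo_codes_with_zeros(rows, "phosphate"))
```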
Substitution procedure of zero values with respective LOQ/2 (3/3)
STEP 3: Substitution of zero values per EDMO_Code per Parameter.
• Load the initial aggregated collection
• Station Selection Criteria > Metadata > EDMO Code > Range: type one EDMO_Code in both fields
• Station Selection Criteria > Availability (for the variable you are working with)
• Export > Data > Station data > ODV Spreadsheet. Select variables (the primary variable and the variable you are working with). Export all data. The data with the EDMO_code you want to work with are now in the new data set.
• Close the aggregated collection and import the data exported in the previous stage in ODV. View > 1 scatter-plot
• Right-click on the sample data fields: Derived Variables > Expression, Derivatives, Integrals > Macro File > Add.
• Choose the macro file of STEP 1 and identify (actually assign) the 'Input Variable' (see STEP 1): press Find and click on the variable you are working with. Click OK.
• The derived variable (STEP 1) is inserted in the variables. All zero values are replaced with LOQ/2.
• Export > Station data: choose the primary variable and the derived variable ONLY
STEP 4: Import the above dataset to the initial aggregated collection and select ‘Replace all’ for the stations.
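Since the substitution is done per institute, the net effect of STEPS 3 and 4 can be pictured as a lookup of each institute's LOQ followed by a replacement of its zero values. The Python sketch below only illustrates that logic; `substitute_zeros`, the key names and the LOQ figures are invented, and rows from institutes without a known LOQ are left untouched.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[float]]


def substitute_zeros(samples: List[Row], variable: str,
                     loq_by_edmo: Dict[int, float]) -> List[Row]:
    """Return copies of the rows with zero values replaced by LOQ/2."""
    result = []
    for sample in samples:
        row = dict(sample)  # copy, so the input rows stay unchanged
        loq = loq_by_edmo.get(int(row["EDMO_code"]))
        if row.get(variable) == 0 and loq is not None:
            row[variable] = loq / 2
        result.append(row)
    return result


# Invented LOQ values per EDMO code, for illustration only.
loq_table = {269: 0.05, 120: 0.02}
rows = [
    {"EDMO_code": 269, "phosphate": 0.0},
    {"EDMO_code": 120, "phosphate": 0.0},
    {"EDMO_code": 353, "phosphate": 0.0},   # no LOQ known: left as is
    {"EDMO_code": 269, "phosphate": 0.12},
]
for row in substitute_zeros(rows, "phosphate", loq_table):
    print(row)
```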
N:P ratio
• It should be applied after all other QC checks, e.g. flag changes, default values, substitution of zero values with LOQ/2, etc.
• All 3 parameters for DIN=NH4+NO2+NO3 must have values for the expression to be evaluated. If one of them is missing, the expression returns no value
• Export the water body phosphate, water body nitrite, water body nitrate and water body ammonium in separate ODV.txt files
• Create a new collection with parameters: depth, water body phosphate, water body nitrite, water body nitrate and water body ammonium
• Insert the ODV files in the new collection, selecting merge data in order to avoid replacing stations
• Insert again, selecting add/replace, and when asked to replace select NO
• Derived Variables > Expression, Derivatives, Integrals > Expression: #1 #2 + #3 + #4 /
• If one of the parameters used in the expression is missing, the N:P ratio value will not be calculated
• Apply the N:P ratio QC for each area separately by using Selection Criteria > Define Polygon
• Export Data
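The N:P expression is the sum of the three nitrogen parameters divided by phosphate, and it yields nothing when any input is missing. A minimal Python sketch of that behaviour follows. The function name and the example concentrations are invented, and returning no value for a zero phosphate is an extra guard added here, not something stated on the slides.

```python
from typing import Optional


def n_to_p_ratio(nh4: Optional[float], no2: Optional[float],
                 no3: Optional[float], po4: Optional[float]) -> Optional[float]:
    """Return DIN / phosphate, or None if the ratio cannot be computed."""
    if any(value is None for value in (nh4, no2, no3, po4)):
        return None          # a missing parameter gives no ratio
    if po4 == 0:
        return None          # assumption: guard against division by zero
    din = nh4 + no2 + no3    # DIN = NH4 + NO2 + NO3
    return din / po4


# Invented concentrations, for illustration only.
print(n_to_p_ratio(1.0, 0.5, 14.5, 1.0))
print(n_to_p_ratio(1.0, None, 14.5, 1.0))
```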