Different types of ingestion
GTS Ingestion Simdat Meeting 05-02-08

Different types of ingestion

Ingestion for replication
– Essential data
– Oriented replication and subscription

Added services
– Integration of the legacy databases
– Parsing of the GTS flow
  • Global products
  • Geo-localisation products

Added Services 1: Access to GTS legacy database

Advantages
– Not mandatory to parse and crack the GTS files
– Use of the legacy databases behind the switch (outside the VGISC area)
  o Database of flat files in folders mapped on the short header TTAAiiGGggg
  o Database of flat files in folders mapped on the hour or the minute

Disadvantages
– It is impossible to have a dynamic metadata state (no notification)
– Catalogue synchronization and replication are impossible
– Static representation of GTS products, no standard GTS database

Existing software in Meteo France
– Database of flat files in folders mapped on the short header TTAAiiGGggg
– FTP access (Jakarta Java Classes)
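
The two flat-file layouts mentioned above (folders mapped on the short header, folders mapped on the hour or the minute) can be illustrated with a minimal Python sketch. The folder layout, the root path and the function names `path_by_header` and `path_by_time` are assumptions made for illustration only; the slides do not describe the actual directory structure used at Meteo France.

```python
from pathlib import PurePosixPath


def path_by_header(root: str, ttaaii: str) -> PurePosixPath:
    """Hypothetical layout: one folder level per header group (TT / AA / ii)."""
    if len(ttaaii) != 6 or not ttaaii[:4].isalpha() or not ttaaii[4:].isdigit():
        raise ValueError(f"expected a TTAAii designator, got {ttaaii!r}")
    tt, aa, ii = ttaaii[:2], ttaaii[2:4], ttaaii[4:]
    return PurePosixPath(root) / tt.upper() / aa.upper() / ii


def path_by_time(root: str, hour: int, minute: int) -> PurePosixPath:
    """Hypothetical layout: one folder per hour, one sub-folder per minute."""
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError(f"invalid time {hour}:{minute}")
    return PurePosixPath(root) / f"{hour:02d}" / f"{minute:02d}"


if __name__ == "__main__":
    print(path_by_header("/gts", "SMFR01"))
    print(path_by_time("/gts", 6, 0))
```

With a layout of this kind a client can locate files over FTP from the header or the time alone, which is why no parsing is needed; the trade-off listed under Disadvantages is that nothing announces when a folder's content changes.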

[Architecture diagram. Labels shown: GTS Switch, DR, CN, NWP DB, GTS Database (three instances), Other DB, GTS Collections Metadata, NWP Metadata, Remote Switch, Remote CN]

Added Services 2: GTS database in the VGISC area

Advantages
• The Data Repository owns its GTS databases (data, metadata)
• Dynamic metadata state (what is present in the database)
• Individual messages database
• Real-time push harvesting of metadata is possible
• The GISC parser waits and stores. It works on solicitation.

Disadvantages
– Parsing and cracking the GTS files
– Management of the metadata: what structure?
– Hard work requiring strong knowledge of GTS messages
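
The idea of a dynamic metadata state with push notification can be sketched as a small in-memory store. This is an illustration under stated assumptions, not the VGISC design: the class `GtsRepository`, its methods and the use of a heading string as key are all hypothetical.

```python
from typing import Callable, Dict, List


class GtsRepository:
    """Hypothetical store whose metadata always reflects what it contains."""

    def __init__(self) -> None:
        self._messages: Dict[str, bytes] = {}
        self._listeners: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        """Register a harvester to be notified when a new key appears."""
        self._listeners.append(callback)

    def store(self, key: str, payload: bytes) -> None:
        """Store a message; push a notification only for keys not seen before."""
        is_new = key not in self._messages
        self._messages[key] = payload
        if is_new:
            for callback in self._listeners:
                callback(key)

    def metadata(self) -> List[str]:
        """Return the current metadata state: the sorted keys present."""
        return sorted(self._messages)


if __name__ == "__main__":
    repo = GtsRepository()
    repo.subscribe(lambda key: print("new entry:", key))
    repo.store("SMFR01 LFPW 051200", b"...")
    print(repo.metadata())
```

Because the metadata list is derived from the stored content, it cannot drift from the database, which is the property the legacy flat-file approach lacks.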

GTS information package '2 Flux GTS System' {1/5}

[UML class diagram. Recoverable elements:
– Names shown: FluxGTS, GTS_File, Document, TTACode, Heading (T1T2A1A2iiCCCCYYGGggBBB), Bulletin, Collection, IndividualMessage, Grib, BUFR, METAR (ID_OACI), SYNOP (ID_METEO), TAF (ID_OACI), TextBrut
– Association labels: _doc (1..*), header (1), _collectionOfMsg (1..*)
– Actor GTS_FileManager: manage, GetFile
– Notes: binary format, file extension b or ub; alphanumeric text, file extension a or ua; list of products: SA, SI, SM, FC, FT (to be completed)]
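
As an illustration of the Heading element in the diagram, the sketch below splits a T1T2A1A2ii CCCC YYGGgg (BBB) heading into its groups. The `Heading` dataclass, the `parse_heading` function and the sample heading are hypothetical and are not taken from the Simdat model. The pattern accepts the groups with or without separating spaces and is deliberately strict, so non-compliant headers are rejected rather than guessed at.

```python
import re
from dataclasses import dataclass
from typing import Optional

# T1T2 A1A2 ii, then CCCC, then YYGGgg, then an optional BBB indicator.
HEADING_RE = re.compile(
    r"^(?P<tt>[A-Z]{2})(?P<aa>[A-Z]{2})(?P<ii>\d{2})\s*"
    r"(?P<cccc>[A-Z]{4})\s*"
    r"(?P<day>\d{2})(?P<hour>\d{2})(?P<minute>\d{2})"
    r"(?:\s*(?P<bbb>[A-Z]{3}))?$"
)


@dataclass(frozen=True)
class Heading:
    tt: str             # data type designators T1T2
    aa: str             # geographical designators A1A2
    ii: str             # bulletin number
    cccc: str           # originating centre
    day: int            # YY
    hour: int           # GG
    minute: int         # gg
    bbb: Optional[str]  # optional indicator group, None when absent


def parse_heading(line: str) -> Heading:
    """Parse one heading line; raise ValueError if it does not match."""
    match = HEADING_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a recognised heading: {line!r}")
    return Heading(
        tt=match["tt"], aa=match["aa"], ii=match["ii"], cccc=match["cccc"],
        day=int(match["day"]), hour=int(match["hour"]),
        minute=int(match["minute"]), bbb=match["bbb"],
    )


if __name__ == "__main__":
    print(parse_heading("SMFR01 LFPW 051200 RRA"))
```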

Experience: GTS anomalies

Collections
– Stations that do not belong to the country
– A few TTAA headers not compliant with the tables (Analyses, for example)
– Different collections with the same header, BBB integrated (need rules to choose the right one)

Individual messages
– The same messages (same station, same type, same time) are different in different collections
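
The last anomaly (same station, same type, same time, but different content) can be detected mechanically before any rule is chosen to resolve it. The sketch below only reports conflicts; it does not decide which variant is correct. The function name `find_conflicts`, the tuple layout and the sample data are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (collection_id, station, message_type, time, text)
Message = Tuple[str, str, str, str, str]
Key = Tuple[str, str, str]


def find_conflicts(messages: Iterable[Message]) -> Dict[Key, Dict[str, Set[str]]]:
    """Group messages by (station, type, time) and keep the keys whose
    text differs between collections, after whitespace normalisation."""
    variants: Dict[Key, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for collection_id, station, msg_type, time, text in messages:
        normalised = " ".join(text.split())
        variants[(station, msg_type, time)][normalised].add(collection_id)
    return {
        key: dict(by_text)
        for key, by_text in variants.items()
        if len(by_text) > 1
    }


if __name__ == "__main__":
    sample = [  # invented sample data
        ("coll-A", "07149", "SYNOP", "051200", "AAXX 05121 07149 11/80"),
        ("coll-B", "07149", "SYNOP", "051200", "AAXX 05121 07149 11/82"),
        ("coll-A", "07150", "SYNOP", "051200", "AAXX 05121 07150 11/70"),
        ("coll-B", "07150", "SYNOP", "051200", "AAXX  05121 07150 11/70"),
    ]
    for key, by_text in find_conflicts(sample).items():
        print(key, by_text)
```

Differences in whitespace alone are not counted as conflicts here; whether that is appropriate depends on the message type and is a design choice of this sketch.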

[Architecture diagram. Labels shown: GTS Switch, DR, CN, NWP DB, Other DB, GTS Collections Metadata, NWP Metadata, Remote Switch, Remote CN, GTS Database (three instances), GTS GISC Parser, GTS Messages Metadata]