FAIR-EASE_D4.2_Landscaping exercise_The inclusion of special use case datasets in the data lake
Nydia Catalina Reyes Suarez; Alessandra Giorgetti
2023-01-01
Abstract
This document describes the landscaping exercise proposed for deliverable 4.2 (D4.2) within Work Package (WP) 4 of the FAIR Earth Sciences & Environment services project (FAIR-EASE, FE). The goal of this exercise is to analyse a selection of special use case (UC) datasets per pilot and the requirements they must meet to be included in the data lake infrastructure proposed in D4.1 (landscaping exercise: the (meta)data, software, and cloud needs for the data lake). The pilots per UC are: UC1 - Earth and Environmental Dynamics: Coastal waters dynamics (Pilot 5.1.1), Earth Critical zones observatory (Pilot 5.1.2), and Volcano Space Observatory (Pilot 5.1.3); UC2 - Environmental Bio-geochemical Assets: Ocean Bio-Geo-Chemical Observatory (Pilot 5.2.1); and UC3 - Biodiversity Observation: Marine Omics Observatory (Pilot 5.3.1). Datasets for each pilot were taken from Table 1 in Annex A of D5.1 (report on key requirements from Use Cases and Pilots, [1]), following a prior selection by a representative of each pilot. They were chosen to cover as much diversity as possible and to reflect the multidisciplinary nature of each UC. The deliverable aims to analyse and highlight the critical issues of the selected datasets, considering their current limitations and needs and how they could fit into the “data provider” view proposed for the “data lakes” architecture in D4.1. The datasets described should not be taken as the only source for data lake ingestion: as stated above, a few special UC datasets were selected using minimum-selection and maximum-diversity criteria in order to analyse the requirements for their ingestion into the proposed data lakes.

File | Size | Format | |
---|---|---|---|
FAIR-EASE_D4.2_Landscaping exercise_The inclusion of special use case datasets in the data lake.pdf (open access; type: publisher's version (PDF); licence: publisher's copyright) | 7.26 MB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.