This document describes the landscaping exercise proposed for deliverable 4.2 (D4.2) within Work Package (WP) 4 of the FAIR Earth Sciences & Environment services project (FAIR-EASE, FE). The goal of this exercise is to analyse different special use case (UC) datasets per pilot and the requirements they must meet to be included in the data lake infrastructure proposed in D4.1 (landscaping exercise: the (meta)data, software, and cloud needs for the data lake). The pilots per UC are: UC1 - Earth and Environmental Dynamics: Coastal waters dynamics (Pilot 5.1.1), Earth Critical zones observatory (Pilot 5.1.2), and Volcano Space Observatory (Pilot 5.1.3), UC2 - Environmental Bio-geochemical Assets: Ocean Bio-Geo-Chemical Observatory (Pilot 5.2.1) and, UC3 - Biodiversity Observation: Marine Omics Observatory (Pilot 5.3.1). Datasets from each pilot were selected from Table 1 in Annex A of D5.1 (report on key requirements from Use Cases and Pilots, [1]) and with a prior selection from a representative from each pilot. These datasets were selected to cover as much diversity as possible and to reflect the multidisciplinary nature of each UC. The deliverable aims to analyse and highlight the criticalities of the selected datasets considering their current limitations and needs and how they could fit into the “data provider” view proposed for the “data lakes” architecture in D4.1. Bear in mind that the datasets described should not be taken as the only source for the data lake ingestion. As stated, before a few special UCs datasets were selected using the minimum selection and maximum diversity criteria, to analyse the requirements for them to be ingested in the data lakes proposed.

FAIR-EASE_D4.2_Landscaping exercise_The inclusion of special use case datasets in the data lake

Nydia Catalina Reyes Suarez
;
Alessandra Giorgetti;
2023-01-01

Abstract

This document describes the landscaping exercise proposed for deliverable 4.2 (D4.2) within Work Package (WP) 4 of the FAIR Earth Sciences & Environment services project (FAIR-EASE, FE). The goal of this exercise is to analyse different special use case (UC) datasets per pilot and the requirements they must meet to be included in the data lake infrastructure proposed in D4.1 (landscaping exercise: the (meta)data, software, and cloud needs for the data lake). The pilots per UC are: UC1 - Earth and Environmental Dynamics: Coastal waters dynamics (Pilot 5.1.1), Earth Critical zones observatory (Pilot 5.1.2), and Volcano Space Observatory (Pilot 5.1.3), UC2 - Environmental Bio-geochemical Assets: Ocean Bio-Geo-Chemical Observatory (Pilot 5.2.1) and, UC3 - Biodiversity Observation: Marine Omics Observatory (Pilot 5.3.1). Datasets from each pilot were selected from Table 1 in Annex A of D5.1 (report on key requirements from Use Cases and Pilots, [1]) and with a prior selection from a representative from each pilot. These datasets were selected to cover as much diversity as possible and to reflect the multidisciplinary nature of each UC. The deliverable aims to analyse and highlight the criticalities of the selected datasets considering their current limitations and needs and how they could fit into the “data provider” view proposed for the “data lakes” architecture in D4.1. Bear in mind that the datasets described should not be taken as the only source for the data lake ingestion. As stated, before a few special UCs datasets were selected using the minimum selection and maximum diversity criteria, to analyse the requirements for them to be ingested in the data lakes proposed.
2023
File in questo prodotto:
File Dimensione Formato  
FAIR-EASE_D4.2_Landscaping exercise_The inclusion of special use case datasets in the data lake.pdf

accesso aperto

Descrizione: This document describes the landscaping exercise proposed for deliverable 4.2 (D4.2) within Work Package (WP) 4 of the FAIR Earth Sciences & Environment services project (FAIR-EASE, FE). The goal of this exercise is to analyse different special use case (UC) datasets per pilot and the requirements they must meet to be included in the data lake infrastructure proposed in D4.1 (landscaping exercise: the (meta)data, software, and cloud needs for the data lake). The pilots per UC are: UC1 - Earth and Environmental Dynamics: Coastal waters dynamics (Pilot 5.1.1), Earth Critical zones observatory (Pilot 5.1.2), and Volcano Space Observatory (Pilot 5.1.3), UC2 - Environmental Bio-geochemical Assets: Ocean Bio-Geo-Chemical Observatory (Pilot 5.2.1) and, UC3 - Biodiversity Observation: Marine Omics Observatory (Pilot 5.3.1). Datasets from each pilot were selected from Table 1 in Annex A of D5.1 (report on key requirements from Use Cases and Pilots, [1]) and with a prior selection from a representative from each pilot. These datasets were selected to cover as much diversity as possible and to reflect the multidisciplinary nature of each UC. The deliverable aims to analyse and highlight the criticalities of the selected datasets considering their current limitations and needs and how they could fit into the “data provider” view proposed for the “data lakes” architecture in D4.1. Bear in mind that the datasets described should not be taken as the only source for the data lake ingestion.
Tipologia: Versione Editoriale (PDF)
Licenza: Copyright dell'editore
Dimensione 7.26 MB
Formato Adobe PDF
7.26 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14083/34103
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact