Validation of Datasets Upon Ingestion
After data is uploaded to the ELIXIR Luxembourg platform, it undergoes a validation process before being officially released. This step ensures that the dataset matches its metadata description, meets basic quality standards, and maintains data integrity.
Integrity Checks
Each file is verified using the checksums provided during submission. This process confirms that the data has not been corrupted or altered during transfer.
If integrity issues are detected, the data provider will be asked to re-submit either the entire dataset or only the affected files, depending on the extent of the corruption.
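As an illustration, a per-file check of this kind can be sketched in Python. The use of SHA-256 digests and a manifest mapping file names to submitted checksums are assumptions for the example, not a description of the platform's actual tooling.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 8192) -> str:
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_manifest(dataset_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return the files whose digests do not match the submitted checksums.

    `manifest` maps relative file names to the checksums supplied at
    submission time (assumed here to be SHA-256 hex digests).
    """
    failed = []
    for name, expected in manifest.items():
        if sha256_of(dataset_dir / name) != expected:
            failed.append(name)
    return failed

# Hypothetical usage: only the listed files would need re-submission.
# corrupted = verify_manifest(Path("dataset"), {"samples.tsv": "ab12..."})
```

A check of this shape also makes the re-submission decision concrete: if only a few files fail, those alone are requested again; if most of the manifest fails, re-submitting the whole dataset is simpler.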
Dataset Transformation and Standardization
ELIXIR Luxembourg data stewards perform only minimal transformations to preserve the original structure and meaning of the data. For more complex modifications, collaboration with the data provider is required to ensure accuracy and context are maintained.
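A minimal transformation of this kind might, for instance, harmonize column headers while leaving the data values untouched. The sketch below is purely illustrative; the snake_case convention and the file names are assumptions, not the platform's documented rules.

```python
import csv
import re

def normalize_header(name: str) -> str:
    """Lowercase a column header and replace spaces/hyphens with underscores."""
    return re.sub(r"[\s\-]+", "_", name.strip()).lower()

def standardize_headers(src: str, dst: str) -> None:
    """Copy a CSV file, rewriting only its header row; data rows are untouched."""
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout)
        writer.writerow([normalize_header(h) for h in next(reader)])
        writer.writerows(reader)

# Hypothetical usage:
# standardize_headers("raw/clinical data.csv", "release/clinical_data.csv")
```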
Note
Any update to a released dataset results in a new version under the same catalogue entry. For major updates, it is recommended to publish a new dataset and reference the original in the description.
Re-pseudonymisation
Data providers may request re-pseudonymisation as an additional privacy safeguard. This process replaces existing pseudonyms with new ones, enhancing data protection.
ELIXIR Luxembourg retains the re-pseudonymisation mapping table to support traceability when needed.
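In outline, re-pseudonymisation amounts to generating a fresh pseudonym for each existing one and recording the old-to-new mapping. The sketch below assumes random hex pseudonyms and a single pseudonym column; the actual procedure and formats used by ELIXIR Luxembourg may differ.

```python
import secrets

def repseudonymise(records: list[dict], key: str = "pseudonym"):
    """Replace each pseudonym with a fresh random one.

    Returns the rewritten records plus the old-to-new mapping table,
    which the data host retains for traceability. The same old
    pseudonym always maps to the same new one, so links between
    records are preserved.
    """
    mapping: dict[str, str] = {}
    updated = []
    for record in records:
        old = record[key]
        if old not in mapping:
            mapping[old] = secrets.token_hex(8)  # assumed pseudonym format
        updated.append({**record, key: mapping[old]})
    return updated, mapping

# Hypothetical usage:
# new_records, mapping = repseudonymise([{"pseudonym": "P001", "age": 54}])
```

Keeping the mapping consistent across records is why the detailed metadata mentioned in the note above matters: without knowing which attributes carry pseudonyms and how records are linked, the replacement cannot preserve those links.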
Note
Re-pseudonymisation is only possible if the dataset includes detailed metadata, such as a data dictionary and clear documentation on how records are linked via pseudonym attributes.
Data Minimization
To support data minimization principles, ELIXIR Luxembourg can split large datasets into smaller, purpose-specific subsets. This helps reduce unnecessary data exposure and improves usability.
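One way to produce such subsets is to project each record onto only the attributes a given purpose requires, as sketched below. The purpose names and column lists are invented for the example.

```python
# Columns each (hypothetical) purpose is allowed to see.
SUBSETS = {
    "genomics": ["pseudonym", "variant", "gene"],
    "clinical": ["pseudonym", "diagnosis", "age"],
}

def split_dataset(records: list[dict]) -> dict[str, list[dict]]:
    """Split a dataset into purpose-specific subsets, keeping only the
    columns each purpose needs (data minimization)."""
    return {
        purpose: [{col: rec[col] for col in cols if col in rec} for rec in records]
        for purpose, cols in SUBSETS.items()
    }

# Hypothetical usage:
# subsets = split_dataset(records)
# subsets["clinical"] then contains no genomic attributes.
```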