v2.0, 2023-06-27

Yareta

FIRST VISIT

  • You have received the confirmation e-mail for the access to your ORGANIZATIONAL UNIT. You - or the person you have designated - are the MANAGER of your ORGANIZATIONAL UNIT (OU). See Organizational Unit, Members and Roles for more information about the role of MANAGER and read our Good practices.

If you do not have an OU, either ask to join an existing one (go to > PRESERVATION SPACE > ORGANIZATIONAL UNITS (OU) > Search for an OU > REQUEST MEMBERSHIP) or ask for the creation of a new Organizational Unit (> Getting started).
  • As MANAGER, log onto > Yareta and edit your Organizational Unit to complete its information and assign roles to your team members: Go to > PRESERVATION SPACE > ORGANIZATIONAL UNITS (OU) > Search for your OU > edit MEMBERS and assign ROLES.

Prior to being assigned a role by the MANAGER, each team member needs to log onto the Yareta Portal with their SWITCH login. Once this is done, the person’s user profile will automatically be created and the MANAGER can then add them to the team.

PORTAL OVERVIEW

The Yareta Portal allows you to SEARCH FOR AN ARCHIVE (public access) or CREATE A DEPOSIT (authenticated access). As an authenticated user, you may:

  1. have access to and download some non-open access data archives

  2. create deposits to generate data archives.

This guide focuses on the creation of data archives. Here are the main sections of the Yareta Portal:

  • HOME – The home page!

  • DOWNLOAD HISTORY – The place where you can see all previously downloaded data archives.

  • DEPOSIT – The place where you upload your datasets and generate new archives.

  • PRESERVATION SPACE – The place where you manage your organizational unit(s), team and deposits.


MAIN STEPS TO GENERATE A DATA ARCHIVE

Go to the DEPOSIT page > CREATE NEW DEPOSIT

1. Edit the METADATA

  • The metadata required in this form follow the DataCite requirements for metadata (see [The DataCite Metadata Schema]). This makes your research data findable and citable.

Some metadata are displayed by default in the form. Please modify and adapt them to your new deposit if necessary.
To help you, a tooltip is available for each form field: hover over a field to see information about the corresponding metadata.
  • SAVE your deposit to register your metadata

2. Add RESEARCH DATA

You have three options to add your research data to your deposit:

  1. UPLOAD PRIMARY DATA: upload new research data

  2. STRUCTURED UPLOAD: upload a zip file which will then be decompressed while preserving its contents' folder structure.

  3. ASSOCIATE ARCHIVE: add an existing data archive to create a COLLECTION.

Collections are useful for linking deposits that pertain to the same project but have been organized and archived separately. For example, if you have amassed large amounts of data over many years, you may choose to archive them in separate deposits labelled with different time periods for clarity’s sake, and then create a collection from the resulting archives to group them all in one convenient location.

Once you have either added your first data file, or uploaded a zip file, all uploading options will be unlocked:

  • UPLOAD PRIMARY DATA – upload one or more files containing observational, experimental, reference, simulation, derived or digitized data

  • UPLOAD SECONDARY DATA – upload one or more files containing publication, data paper or documentation

  • STRUCTURED UPLOAD – upload a zip file which will then be decompressed while preserving its contents' folder structure.

  • ADVANCED MODE – upload special data (e.g. software or Information Packages)

The size limit for uploading files through the interface is 1 GB per file, but several files may be uploaded at the same time. If you need to upload larger datasets, please contact Yareta IT Support for personalized assistance.

NOTE: Some files contain patterns that our system identifies as extraneous to your research data; these files are classified as "excluded" or "ignored". Excluded files must be deleted; ignored files must be either deleted or manually approved.
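The preparation of a STRUCTURED UPLOAD can be sketched locally before using the portal. The example below is only an illustration, not part of Yareta: it flags files above the per-file limit and zips a folder while keeping each file's path relative to that folder. The function names are hypothetical, and reading "1GB" as 10**9 bytes is an assumption.

```python
import zipfile
from pathlib import Path

# Assumption: "1GB" is read as 10**9 bytes; the portal may count differently.
SIZE_LIMIT_BYTES = 10**9


def oversized_files(folder: Path, limit: int = SIZE_LIMIT_BYTES) -> list[Path]:
    """Return the files under `folder` that are larger than `limit` bytes."""
    return sorted(
        p for p in folder.rglob("*") if p.is_file() and p.stat().st_size > limit
    )


def zip_folder(folder: Path, archive: Path) -> list[str]:
    """Zip `folder` into `archive`, storing paths relative to `folder`.

    Write `archive` outside `folder`, otherwise the zip would include itself.
    """
    names = []
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                arcname = path.relative_to(folder).as_posix()
                zf.write(path, arcname)  # the folder structure is kept in arcname
                names.append(arcname)
    return names
```

Running `oversized_files` first shows which files would need assistance from Yareta IT Support before you build the zip file.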

3. Data file statuses

Your data files may display the following statuses:

  • CHANGE RELATIVE LOCATION - The file’s relative location is being updated. For example, it is being moved from folder A to folder B

  • CLEANED - The system has purged the contents of your file.

  • CLEANING - The system is purging the contents of your file to make space.

  • EXCLUDED FILE - The system has identified the file as extraneous to your research data; it must therefore be deleted

  • FILE FORMAT IDENTIFIED - The system has identified the file’s format

  • FILE FORMAT UNKNOWN - The system was unable to identify the file’s format

  • IGNORED FILE - The system has identified the file as possibly extraneous to your research data; it must therefore be manually approved or deleted

Although they are not extraneous to your research data, compressed files will be marked as ignored because the system cannot automatically document the metadata of their contents. If you must upload compressed files, you will be required to upload a README file alongside them to describe their contents for end users.
  • IN ERROR - An error has occurred. Have a look at the error report and contact us (Yareta IT Support) if necessary.

  • PROCESSED - The system has processed the file

  • READY - The file is ready (All files in your deposit must be ready in order to submit it)

  • RECEIVED - The system has received your file and it will now be processed

  • TO PROCESS - The file is about to be processed

  • VIRUS CHECKED - The antivirus has analyzed your file
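The rule that every file must be READY before submission can be illustrated with a small check. This is a sketch for reasoning about the statuses listed above, not Yareta code: the function name and the sample file names are hypothetical, and the suggested actions simply restate this guide.

```python
# Actions this guide describes for statuses that need user intervention.
ACTIONS = {
    "EXCLUDED FILE": "delete the file",
    "IGNORED FILE": "approve manually or delete",
    "IN ERROR": "read the error report; contact support if necessary",
}


def pending_files(statuses: dict[str, str]) -> dict[str, str]:
    """Map each file that is not READY to the action suggested by this guide."""
    return {
        name: ACTIONS.get(status, "no user action listed in this guide")
        for name, status in statuses.items()
        if status != "READY"
    }


# Hypothetical deposit content.
deposit = {
    "survey.csv": "READY",
    "backup.zip": "IGNORED FILE",
    "Thumbs.db": "EXCLUDED FILE",
}
for name, action in pending_files(deposit).items():
    print(f"{name}: {action}")
```

An empty result means that, as far as file statuses are concerned, the deposit can be submitted.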

4. Generate the archive

Once your data files are all uploaded and your metadata is completed, SUBMIT your deposit to launch the archiving process which will create the archive. Please find more information about deposit statuses below.

Before submitting your deposit, please ensure that your data are ready and no longer need to be modified.

Throughout the archival process, your deposit may display different statuses:

  • APPROVED - The deposit has been validated by an APPROVER (Only if you select the submission policy "Deposit with approval", see Submission Policy).

  • CHECKED - An automatic step in the archival process.

  • CLEANED - The system has purged the contents of your deposit.

  • CLEANING - The system is purging the contents of your deposit to make space. Of course, this process is only launched once your deposit has been archived and secured on all of our storage nodes.

  • COMPLETED - The archive has been properly generated; an XML file has been created and a DOI is automatically assigned to your dataset.

At this stage, your deposit has become a data archive and can no longer be modified.
  • IN ERROR - An error has occurred. Have a look at the error report and contact us (Yareta IT Support) if necessary.

  • IN PROGRESS - The deposit has not been submitted and can be modified

  • IN VALIDATION - The deposit has to be approved or rejected by a member who has the role of APPROVER (Only if you select the submission policy "Deposit with approval", see Submission Policy) or a higher role (see Organizational Unit, Members and Roles).

  • PAUSED - The archiving process has been paused.

  • REJECTED - The deposit has been rejected by an APPROVER (Only if you select the submission policy "Deposit with approval", see Submission Policy).

If your deposit is rejected, ask your APPROVER (see Organizational Unit, Members and Roles) why it was rejected and modify it according to their feedback.
  • SUBMITTED - The user has submitted the deposit.

KEY CONCEPTS

DOI

A DOI (Digital Object Identifier) is automatically assigned to your deposit (archive) when the archiving process has ended (COMPLETED DEPOSIT). If you need a DOI before the archiving process is completed, you can reserve one while editing the deposit’s METADATA by using the RESERVE A DOI button.

If you delete a deposit for which you have reserved a DOI before triggering the archiving process (SUBMITTED status), the reservation will be cancelled and the DOI will not be recorded in the DOI® System.
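Once a DOI is registered, it can be cited as a link through the public resolver at https://doi.org/. The sketch below turns a DOI name into such a link after a basic shape check (a prefix starting with "10.", a slash, then a suffix). The function name and the sample DOI are hypothetical, and the check is deliberately loose.

```python
from urllib.parse import quote


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ link for a DOI name such as '10.1234/abc'."""
    doi = doi.strip()
    prefix, sep, suffix = doi.partition("/")
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a DOI name: {doi!r}")
    # Percent-encode characters that are unsafe in a URL, but keep the slashes.
    return "https://doi.org/" + quote(doi, safe="/")


# Hypothetical DOI, for illustration only.
print(doi_to_url("10.1234/example-dataset"))
```

A reserved DOI only resolves after the archive has been registered, so links built from a reservation may not work until the deposit is COMPLETED.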

The DataCite metadata schema

The DataCite Metadata Schema was designed to support dataset citation, discovery and persistence. The DataCite metadata are:

  • Intended to be generic to the broadest range of research datasets

  • Not intended to replace discipline-specific or community-specific metadata

Please ensure that the specific metadata are completed and relevant to your dataset to make it reusable. You need to add these specific metadata within your archive (included in datasets). Yareta only provides you with a metadata form for generic metadata (DataCite).
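To make the generic metadata more concrete, the sketch below checks a record for the six properties that version 4 of the DataCite Metadata Schema marks as mandatory: Identifier, Creator, Title, Publisher, PublicationYear and ResourceType. The dictionary keys and the sample record are hypothetical; they do not reproduce Yareta's form or any Yareta API.

```python
# Mandatory DataCite properties, written here as hypothetical dictionary keys.
MANDATORY = (
    "identifier",
    "creators",
    "titles",
    "publisher",
    "publicationYear",
    "resourceType",
)


def missing_mandatory(record: dict) -> list[str]:
    """Return the mandatory properties that are absent or empty in `record`."""
    return [key for key in MANDATORY if not record.get(key)]


# Hypothetical record with two gaps.
record = {
    "identifier": "10.1234/example-dataset",
    "creators": ["Doe, Jane"],
    "titles": ["Glacier runoff measurements"],
    "publisher": "",
    "publicationYear": 2023,
}
print(missing_mandatory(record))
```

Discipline-specific metadata are outside the scope of such a check and still need to be included in the dataset itself.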

Rich metadata

Uploading additional metadata files for a dataset, beyond the default DataCite metadata, is a common use case. To do this in Yareta, go to FILES > ADVANCED MODE, then select the Data category PACKAGE and the Data type METADATA.

The CUSTOM METADATA type gives you the ability to define a metadata scheme in a format that is unique and meaningful to your research field, such as XML or JSON. Yareta validates both the type (format) of the file and its content (structure) against the metadata scheme. Contact us (Yareta IT Support) to request a new metadata scheme to be uploaded to Yareta.
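The two levels of validation mentioned above (format, then structure) can be illustrated locally. The sketch below is an assumption-laden example, not Yareta's validator: the scheme, its field names and the function are all hypothetical, and a real metadata scheme would normally be expressed in a dedicated schema language.

```python
import json

# Hypothetical scheme: field name -> expected Python type(s).
SCHEME = {"instrument": str, "sampling_rate_hz": (int, float), "sites": list}


def validate_custom_metadata(text: str, scheme: dict = SCHEME) -> list[str]:
    """Return a list of problems; an empty list means the file passes."""
    # Step 1: format check (is the file valid JSON?).
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    # Step 2: structure check (are the expected fields present and well typed?).
    problems = []
    for field, expected in scheme.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            problems.append(f"wrong type for field: {field}")
    return problems


print(validate_custom_metadata('{"instrument": "CTD probe", "sampling_rate_hz": 4}'))
```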

The Creative Commons licences

The CC licences allow you to protect your copyrights when publishing your research data in open access (see [access level]). Creative Commons proposes one dedication tool and six licences that are briefly presented below.

You may ask us to assign any other licence to your data. Let us know at [Support IT].


  1. CC0: The Creative Commons Public Domain Dedication is a public dedication tool, which allows creators to give up their copyright and put their work into the worldwide public domain. CC0 allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, without conditions.

  2. CC BY: This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.

  3. CC BY-SA: This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use. If you remix, adapt, or build upon the material, you must license the modified material under identical terms.

  4. CC BY-NC: This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator.

  5. CC BY-NC-SA: This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format for noncommercial purposes only, and only so long as attribution is given to the creator. If you remix, adapt, or build upon the material, you must license the modified material under identical terms.

  6. CC BY-ND: This license allows reusers to copy and distribute the material in any medium or format in unadapted form only, and only so long as attribution is given to the creator. The license allows for commercial use.

  7. CC BY-NC-ND: This license allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.

    • BY – Credit must be given to the creator

    • NC – Only noncommercial uses of the work are permitted

    • ND – No derivatives or adaptations of the work are permitted

    • SA – Adaptations must be shared under the same terms

Note that with the CC0 public domain dedication, there is no legal obligation to credit the contributor (author). It nevertheless remains good scientific practice to indicate the source and original creator of a work when it is reused, as per research integrity guidelines.

CC licences are only suitable for data or datasets that are copyrightable works in the sense of Swiss copyright law, meaning that they fulfil three conditions: they are made by a human being, have an original character and are perceivable to the senses. If the data do not meet those criteria (e.g. they are too factual or were created by a measuring instrument), then CC0 is the only possibility.
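The seven options described above differ on four points: attribution, commercial use, adaptations and share-alike. The sketch below encodes the descriptions from this section as a lookup table and filters it. It is a reading aid, not legal advice, and the class and function names are hypothetical.

```python
from typing import NamedTuple


class Licence(NamedTuple):
    attribution: bool     # BY: credit must be given to the creator
    commercial_use: bool  # False when the licence carries NC
    adaptations: bool     # False when the licence carries ND
    share_alike: bool     # SA: adaptations must use identical terms


LICENCES = {
    "CC0": Licence(False, True, True, False),
    "CC BY": Licence(True, True, True, False),
    "CC BY-SA": Licence(True, True, True, True),
    "CC BY-NC": Licence(True, False, True, False),
    "CC BY-NC-SA": Licence(True, False, True, True),
    "CC BY-ND": Licence(True, True, False, False),
    "CC BY-NC-ND": Licence(True, False, False, False),
}


def matching_licences(*, commercial_use: bool, adaptations: bool) -> list[str]:
    """List the options whose terms match the two requested permissions."""
    return [
        name
        for name, terms in LICENCES.items()
        if terms.commercial_use == commercial_use and terms.adaptations == adaptations
    ]


print(matching_licences(commercial_use=False, adaptations=True))
```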

Access Level

The access level corresponds to the level of restriction/authorization for viewing and downloading the data archive. Please note that no matter which access option you select, the metadata itself always remains visible, in line with FAIR principles.

  • Select the suitable access level:

    • Open > public

    • Restricted > restricted to team members (See Organizational Unit, Members and Roles)

    • Closed > Restricted to the team’s manager, the deposit’s creator and any person designated by the creator/team manager

  • Assign an embargo to postpone the open publication of your data
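The three access levels can be read as a simple rule on who may download the data. The sketch below restates the list above under simplifying assumptions: embargoes are ignored, the flag names are hypothetical, and the actual rules enforced by Yareta may be more detailed.

```python
def can_download(
    access_level: str,
    *,
    is_team_member: bool = False,
    is_manager: bool = False,
    is_creator: bool = False,
    is_designated: bool = False,
) -> bool:
    """Tell whether a person may download an archive (metadata stay visible)."""
    if access_level == "OPEN":
        return True
    if access_level == "RESTRICTED":
        return is_team_member
    if access_level == "CLOSED":
        return is_manager or is_creator or is_designated
    raise ValueError(f"unknown access level: {access_level!r}")


print(can_download("RESTRICTED", is_team_member=True))
print(can_download("CLOSED", is_team_member=True))
```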


Data Compliance Level

The DATA COMPLIANCE LEVEL is automatically defined when the data files are uploaded. The data compliance level informs you about the compliance of data file formats with recommended file formats for preservation. The reference file formats were defined according to the recommendations of the Library of Congress and implemented through the technical registry PRONOM of The National Archives.

Sensitivity Level

The SENSITIVITY of data is defined according to the DataTags System (http://datatags.org/datatags-compliant) for sharing sensitive Data with Confidence (Sweeney et al, 2015, https://techscience.org/a/2015101601/). A DataTag is a set of security features and access requirements for file handling.

To contribute to the impact, transparency and reproducibility of scientific research, funders and publishers mandate the sharing of data where possible. However, data containing sensitive information about individuals cannot be shared openly without appropriate safeguards. An extensive body of statutes, regulations, institutional policies, consent forms, data sharing agreements, and best practices govern how sensitive data should be used and disclosed in different contexts. DataTags help navigate these complex issues, enabling data sharing in a secure and legal way while maximizing transparency. Each tag provides a well-defined prescription that defines how the data can be legally shared.

Six standardized DataTags levels were defined from public to most restricted level:

  • blue = public

  • green = controlled public

  • yellow = accountable

  • orange = more accountable

  • red = fully accountable

  • crimson = maximally restricted

Sharing Sensitive Data with Confidence: The DataTags System, from Sweeney et al. (2015). DUA = Data Use Agreement.
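Because the six DataTags levels are ordered from public to most restricted, they can be modelled as an ordered enumeration. The sketch below does this; the numeric values only encode the order, and the idea of summarising several files by their most restrictive tag is an assumption made for illustration, not a documented Yareta behaviour.

```python
from enum import IntEnum


class DataTag(IntEnum):
    """DataTags levels, from public (1) to most restricted (6)."""
    BLUE = 1     # public
    GREEN = 2    # controlled public
    YELLOW = 3   # accountable
    ORANGE = 4   # more accountable
    RED = 5      # fully accountable
    CRIMSON = 6  # maximally restricted


def most_restrictive(tags: list[DataTag]) -> DataTag:
    """Return the highest (most restricted) level among `tags`."""
    return max(tags)


print(most_restrictive([DataTag.GREEN, DataTag.RED, DataTag.BLUE]).name)
```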

Submission Policy

You can choose a submission policy when you ask for the creation of an Organizational Unit. In case you wish to modify it, please contact us > Yareta IT Support.

Available options:

  • DEPOSIT WITH APPROVAL means that the dataset needs to be validated by an APPROVER before being published. The APPROVER is part of your research team and/or has peer expertise, and is appointed by the MANAGER. See Organizational Unit, Members and Roles.

  • DEPOSIT WITHOUT APPROVAL means that no validation step is required in the archiving process for your Organizational Unit.

Preservation Policy

The preservation policy defines the period during which the archive will be preserved.

Available options:

  • KEEP IT FOR 5 YEARS

  • KEEP IT FOR 10 YEARS (recommended by the SNSF)

  • KEEP IT FOR 15 YEARS

  • KEEP IT FOREVER

The disposal process (General Policies – removal) determines the fate of the archive at the end of its retention period. There are two options to choose from once the retention period has ended:

  • Extend the retention period

  • Dispose of the data while preserving the metadata as a "tombstone"

We will contact you at the end of your archive’s retention period so you can decide which option to choose.
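To estimate when that contact is likely to happen, the end of the retention period can be computed from the chosen policy. The sketch below assumes that the period starts on the day the archive is completed, which this guide does not state; the function name is hypothetical.

```python
from datetime import date
from typing import Optional

# Retention in years for each policy; None means there is no end date.
RETENTION_YEARS = {
    "KEEP IT FOR 5 YEARS": 5,
    "KEEP IT FOR 10 YEARS": 10,
    "KEEP IT FOR 15 YEARS": 15,
    "KEEP IT FOREVER": None,
}


def retention_end(archived_on: date, policy: str) -> Optional[date]:
    """Return the last day of the retention period, or None for 'forever'."""
    years = RETENTION_YEARS[policy]
    if years is None:
        return None
    try:
        return archived_on.replace(year=archived_on.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year.
        return archived_on.replace(year=archived_on.year + years, day=28)


print(retention_end(date(2023, 6, 27), "KEEP IT FOR 10 YEARS"))
```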

Organizational Unit, Members and Roles

An Organizational Unit (OrgUnit) is a logical entity defined by security rules (see user roles below). It is used as an administrative entity to manage member roles. The OrgUnit may be associated with a research project, a laboratory, a department or any other organizational group of researchers.

There are no specific institutional requirements regarding the management of OrgUnits. However, the MANAGER is responsible for properly managing and monitoring their OrgUnit, as well as ensuring the quality of metadata and uploaded data. See Good practices for more information.

DLCM Roles

GOOD PRACTICES

The UNIGE website Research Data was created to help and support researchers in the management of their research data. It provides advice on best practices, from the planning to the publication of research data. Here are some of the key links to help you properly deposit your research data in Yareta.

  • Do you use personal and/or sensitive data? Please ensure your data are anonymized before archiving them in Yareta. > Help

  • All file formats are accepted in Yareta but the use of file formats suitable for preservation is strongly recommended. > Help

  • Select your data carefully before archiving them: not all the data you produced during your research may be relevant for archiving. > Help

  • Name and organize your files properly to make your data understandable and reusable. > Help

  • Choose the appropriate licence to publish your research data in open access. > Help

SUPPORT

Yareta IT Support

Provides IT support on Yareta as well as access rights to the Yareta Web portal

  • For UNIGE researchers > Support SI

  • For researchers from Geneva’s other Higher Education Institutions > Support

Research Data Points

Provides best practices on data management throughout the data lifecycle as well as guidance on legal, ethical and funding aspects.