# CoGDat Metadata

## Regular Metadata

The following metadata fields are currently defined for the CoGDat data portal:

### IMS-ID

The RKI [IMS](https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/DESH/IMS_Grafik.html) ID of the sample.

### Lab-ENA-ID (optional)

The submitter's [ENA](https://www.ebi.ac.uk/ena/) ID of the sample. If an ENA ID is specified here, CoGDat assumes that the submitter has already submitted the case to ENA and will not create a new ENA submission for this case. If *no* ENA ID is specified here, the submitter guarantees that they will *not* create an ENA submission and agrees that CoGDat will submit the case to ENA.

### SampleDate (optional)

The collection date of the sample.

*The date has to be specified in the format YYYY-MM-DD. If the metadata is submitted in an Excel sheet, the corresponding cells should either be formatted as text adhering to this format or be formatted as date values. In the latter case they will be converted automatically.*

### AGS-5 (optional)

The first five digits of the community identification number (AGS) of the domicile of the patient the sample was collected from. This corresponds to the resolution of "Landkreis" / "kreisfreie Stadt", or "Bezirk" in the case of a city with districts.

*The value has to contain exactly five digits, otherwise it will fail validation. When submitting or preparing metadata using Microsoft Excel or another spreadsheet program, make sure that the AGS code is formatted as plain text and not as a numeric value; otherwise AGS codes beginning with '0' will lose the leading zero and fail validation.*

### RawFQ1

The filename of the FASTQ raw file. If the data originates from paired-end sequencing and two FASTQ files have been produced, this refers to the filename of the `R1` file.

*This is a file field.
A file with the corresponding name will have to be uploaded and linked to the metadata in a joint submission.*

### RawFQ2 (optional)

The filename of the `R2` file, given that the data originates from paired-end sequencing.

*This is a file field. A file with the corresponding name will have to be uploaded and linked to the metadata in a joint submission.*

### AssemblyFA

The filename of the FASTA file containing the virus genome assembly.

*This is a file field. A file with the corresponding name will have to be uploaded and linked to the metadata in a joint submission.*

### SeqPlatform

The sequencing platform used. The field is constrained to a controlled vocabulary. The allowed values and the corresponding sequencing platforms are listed below:

| Value           | Platform                                                                                                         |
| --------------- | ---------------------------------------------------------------------------------------------------------------- |
| LS454           | 454 technology                                                                                                   |
| ILLUMINA        | Illumina                                                                                                         |
| PACBIO_SMRT     | Pacific Biosciences                                                                                              |
| ION_TORRENT     | Ion Torrent Personal Genome Machine (PGM)                                                                        |
| CAPILLARY       | Sequencers based on capillary electrophoresis technology manufactured by LifeTech (formerly Applied BioSciences) |
| OXFORD_NANOPORE | Oxford Nanopore platform type                                                                                    |
| BGISEQ          | BGI Next Generation Sequencing Platform                                                                          |
| DNBSEQ          | MGI DNBSEQ Platform                                                                                              |

### SeqInstrument (optional)

The sequencing instrument used.
The permitted values are:

* 454 GS
* 454 GS 20
* 454 GS FLX
* 454 GS FLX+
* 454 GS FLX Titanium
* 454 GS Junior
* HiSeq X Five
* HiSeq X Ten
* Illumina Genome Analyzer
* Illumina Genome Analyzer II
* Illumina Genome Analyzer IIx
* Illumina HiScanSQ
* Illumina HiSeq 1000
* Illumina HiSeq 1500
* Illumina HiSeq 2000
* Illumina HiSeq 2500
* Illumina HiSeq 3000
* Illumina HiSeq 4000
* Illumina iSeq 100
* Illumina MiSeq
* Illumina MiniSeq
* Illumina NovaSeq 6000
* NextSeq 500
* NextSeq 550
* PacBio RS
* PacBio RS II
* Sequel
* Ion Torrent PGM
* Ion Torrent Proton
* Ion Torrent S5
* Ion Torrent S5 XL
* AB 3730xL Genetic Analyzer
* AB 3730 Genetic Analyzer
* AB 3500xL Genetic Analyzer
* AB 3500 Genetic Analyzer
* AB 3130xL Genetic Analyzer
* AB 3130 Genetic Analyzer
* AB 310 Genetic Analyzer
* MinION
* GridION
* PromethION
* BGISEQ-500
* DNBSEQ-T7
* DNBSEQ-G400
* DNBSEQ-G50
* DNBSEQ-G400 FAST

### AmpKit

The amplification kit used. The field is constrained to a controlled vocabulary. The allowed values and the corresponding amplification kits are listed below:

| Value    | Kit                                                       |
| -------- | --------------------------------------------------------- |
| COVIDSeq | Illumina COVIDSeq Test                                    |
| NEBNext  | NEBNext® ARTIC SARS-CoV-2 FS Library Prep Kit (Illumina®) |
| Twist    | Twist-SARS-CoV-2 Research Panel                           |
| QIASeq   | Artic-Primer with QIASeq FX DNA Library Kit               |
| ArticV1  | Artic v1 primers + Quick protocol                         |
| ArticV3  | Artic v3 primers + Quick protocol                         |
| covseq   | DKFZ covseq protocol                                      |

### PCRCt (optional)

The integer part of the PCR Ct value.
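
### Example: checking a record before submission

The field constraints described in this document can be checked locally before uploading. The following is a minimal sketch, not CoGDat's actual validator: the function name `check_record`, the dictionary-based record layout, and the case-sensitive vocabulary matching are assumptions made for illustration. It covers the date format, the five-digit AGS rule, and the two controlled vocabularies from the tables in this document. The long `SeqInstrument` list is left out for brevity.

```python
import re
from datetime import date

# Controlled vocabularies copied from the SeqPlatform and AmpKit tables.
# Assumption: values are matched case-sensitively.
SEQ_PLATFORMS = {
    "LS454", "ILLUMINA", "PACBIO_SMRT", "ION_TORRENT",
    "CAPILLARY", "OXFORD_NANOPORE", "BGISEQ", "DNBSEQ",
}
AMP_KITS = {
    "COVIDSeq", "NEBNext", "Twist", "QIASeq",
    "ArticV1", "ArticV3", "covseq",
}

# Fields that are not marked "(optional)" in this document.
REQUIRED_FIELDS = ("IMS-ID", "RawFQ1", "AssemblyFA", "SeqPlatform", "AmpKit")


def check_record(record):
    """Return a list of problems found in one metadata record.

    Hypothetical helper: `record` maps field names to cell values.
    """
    problems = []

    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"{field}: required field is missing")

    sample_date = record.get("SampleDate")
    if sample_date is not None:
        text = str(sample_date)
        # Check the zero-padded YYYY-MM-DD shape first, then the calendar date.
        if not re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", text):
            problems.append("SampleDate: expected format YYYY-MM-DD")
        else:
            try:
                date.fromisoformat(text)
            except ValueError:
                problems.append("SampleDate: not a valid calendar date")

    ags = record.get("AGS-5")
    # str() also catches numeric cells where a leading zero was dropped.
    if ags is not None and not re.fullmatch(r"[0-9]{5}", str(ags)):
        problems.append("AGS-5: expected exactly five digits")

    platform = record.get("SeqPlatform")
    if platform and platform not in SEQ_PLATFORMS:
        problems.append("SeqPlatform: value not in controlled vocabulary")

    kit = record.get("AmpKit")
    if kit and kit not in AMP_KITS:
        problems.append("AmpKit: value not in controlled vocabulary")

    return problems
```

For instance, a record whose AGS code was stored as the number `1001` instead of the text `01001` would be reported with `AGS-5: expected exactly five digits`, which is the spreadsheet pitfall described in the AGS-5 section.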