Biological data

Biological data are data or measurements collected from biological sources. They are often stored or exchanged in digital form, commonly in files or databases. Examples include DNA base-pair sequences and population data used in ecology.

Data File Formats

Each file format has been designed with specific needs and outputs in mind.

  • AB1 – In DNA sequencing, chromatogram files used by instruments from Applied Biosystems
  • ACE – A sequence assembly format
  • BAM – Binary compressed SAM format
  • BED – The Browser Extensible Data format is used for describing genes and other features of DNA sequences
  • CAF – Common Assembly Format for sequence assembly
  • EMBL – The flatfile format used by the EMBL to represent database records for nucleotide and peptide sequences from EMBL databases
  • FASTA – The FASTA file format, for sequence data. Sometimes also given as FNA or FAA (Fasta Nucleic Acid or Fasta Amino Acid).
  • FASTQ – The FASTQ file format, for sequence data with quality scores. Quality scores are sometimes instead given in a separate QUAL file.
  • GenBank – The flatfile format used by the NCBI to represent database records for nucleotide and peptide sequences from the GenBank and RefSeq databases
  • GFF – The General feature format is used for describing genes and other features of DNA, RNA and protein sequences
  • GTF – The Gene transfer format is used to hold information about gene structure.
  • NEXUS – The Nexus file encodes mixed information about genetic sequence data in a block structured format.
  • NWK – The Newick tree format represents graph-theoretical trees with edge lengths using parentheses and commas, and is useful for holding phylogenetic trees.
  • PDB – Structures of biomolecules deposited in the Protein Data Bank. Also used for exchanging protein and nucleic acid structures.
  • PHD – Phred output, from the basecalling software Phred
  • SAM – Sequence Alignment/Map format, a text format for read alignments, in which the results of the 1000 Genomes Project are released.
  • SCF – Staden chromatogram files used to store data from DNA sequencing
  • SBML – The Systems Biology Markup Language is used to store biochemical network computational models
  • SFF – Standard Flowgram Format
  • Stockholm – The Stockholm format for representing multiple sequence alignments
  • Swiss-Prot – The flatfile format used to represent database records for protein sequences from the Swiss-Prot database
  • VCF – Variant Call Format, a standard created by the 1000 Genomes Project for listing and annotating sequence variants. The project uses it to publish its entire collection of human variants (with the exception of approximately 1.6 million variants).
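
To make the difference between two of the formats above concrete, the following is a minimal Python sketch of reading FASTA records (a header line starting with ">" followed by sequence lines) and FASTQ records (four lines: name, sequence, "+" separator, quality string). The function names parse_fasta and parse_fastq are hypothetical, chosen for illustration only. The sketch assumes Phred+33 quality encoding and unwrapped four-line FASTQ records, which is common but not universal; established libraries handle many more edge cases.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def parse_fastq(lines: Iterable[str],
                offset: int = 33) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (name, sequence, quality scores) from four-line FASTQ records.

    Assumption: qualities are ASCII characters encoded as score + offset
    (Phred+33 by default) and records are not line-wrapped.
    """
    it = (line.strip() for line in lines)
    for name_line in it:
        if not name_line:
            continue  # ignore blank lines between records
        seq = next(it, None)
        plus = next(it, None)
        qual = next(it, None)
        if (not name_line.startswith("@") or plus is None
                or not plus.startswith("+") or qual is None
                or len(qual) != len(seq)):
            raise ValueError(f"malformed FASTQ record near {name_line!r}")
        yield name_line[1:], seq, [ord(c) - offset for c in qual]


if __name__ == "__main__":
    # Small in-memory examples; real data would be read from files.
    fasta_text = ">seq1 demo\nACGT\nTTGA\n>seq2\nGGCC\n"
    for header, sequence in parse_fasta(fasta_text.splitlines()):
        print(header, len(sequence))

    fastq_text = "@read1\nACGT\n+\nII5!\n"
    for name, sequence, scores in parse_fastq(fastq_text.splitlines()):
        print(name, sequence, scores)
```

Both functions accept any iterable of lines, so an open file handle can be passed directly in place of the in-memory strings used here.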

Biological Data Sharing
