Sequence Formats & Conversions

FASTA Format
Description line starting with '>' followed by the name and then
the description;
Sequence in standard IUB/IUPAC amino acid and nucleic acid codes,
starting on the next line and continuing until the description line
of the next sequence or the end of the file is reached. '-' often
represents a gap of indeterminate length.

Example:
>albumin of human origin 
MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFEDHVKL
VNEVTEFAKTCVADESAENCDKSLHTLFGDKLCTVATLRETYGEMADCCAKQEPERNECFLQHKDD 
NPNLPRLVRPEVDVMCTAFHDNEETFLKKYLYEIARRHPYFYAPELLFFAKRYKAAFTECCQAADK 
[...]
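
The layout described above (a '>' line, then sequence lines up to the next '>' line or the end of the file) can be read with a few lines of Python. This is a minimal sketch, not a reference parser: the function name parse_fasta and the shortened sample sequences are made up for illustration, and it assumes blank lines can be ignored and that wrapped sequence lines are simply concatenated.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (description, sequence) pairs from FASTA-formatted lines.

    Hypothetical helper for illustration only.
    """
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines carry no information
        if line.startswith(">"):
            # A new description line closes the previous record.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines are joined; '-' gap characters are kept as-is.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Shortened, made-up sample data (not the full albumin sequence).
example = """>albumin of human origin
MKWVTF
ISLLFL
>second
AC-GT
"""
records = list(parse_fasta(example.splitlines()))
for description, sequence in records:
    print(description, len(sequence))
```

Sequence lines that appear before the first '>' line are silently skipped here; a stricter reader might prefer to raise an error instead.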

The description line (or header line) is often used to carry extra
information, but there is no clear consensus on its layout; here are a few common usages:
>name
>name description
>name accession description
>namespace|accession.version|name description
>gi|identifier|namespace|accession.version|name description (NCBI)

Example:
>gi|412163|emb|CAA00606.1| albumin [Homo sapiens]
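
Since header layouts vary, code that reads them usually has to guess. The sketch below splits a header at the first space into an identifier part and a description, and then, only for the NCBI-style 'gi|...' layout listed above, names the '|'-separated fields. The function name split_header and the dictionary keys are hypothetical, chosen for this illustration; other layouts are returned with just the identifier and description.

```python
from typing import Dict


def split_header(header: str) -> Dict[str, str]:
    """Split a FASTA description line into its parts.

    Hypothetical helper; the key names are illustrative, not a standard.
    """
    text = header.lstrip(">").strip()
    # Everything up to the first space is treated as the identifier part.
    identifier, _, description = text.partition(" ")
    parsed = {"identifier": identifier, "description": description.strip()}
    fields = identifier.split("|")
    # Assumed layout: gi|identifier|namespace|accession.version|name
    if len(fields) >= 4 and fields[0] == "gi":
        parsed["gi"] = fields[1]
        parsed["namespace"] = fields[2]
        parsed["accession"] = fields[3]
        parsed["name"] = fields[4] if len(fields) > 4 else ""
    return parsed


result = split_header(">gi|412163|emb|CAA00606.1| albumin [Homo sapiens]")
for key, value in result.items():
    print(key, "=", repr(value))
```

In the example header the name field after the last '|' is empty, so the sketch returns an empty string for it rather than guessing a value.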
ReadSeq online sequence format conversion tool at EBI
ReadSeq, biosequence conversion tool, at Indiana University
Converts an input DNA/AA sequence to a specified format (the input format is determined automatically).
Prettyseq @ EMBOSS reformats sequences as text in fixed-width columns with numbering
Convert Genbank to FASTA at Gene Infinity
Description of multiple sequence formats used by EMBOSS
Sequence format converter hosted by NIH, based on Seqret from the EMBOSS software suite.
Sequence formats are simply the way in which the amino acid or DNA sequence is recorded in a computer file. Different programs expect different formats, so if you are to submit a job successfully, it is important to understand what the various formats look like.