This README describes how to use the sequence flat files extracted from the
CR-EST database. In general, only the public sequences are stored. They are
organized into folders, one per organism:

  - barley (Hordeum vulgare)
  - pea (Pisum sativum)
  - petunia (Petunia hybrida)
  - potato (Solanum tuberosum)
  - tobacco (Nicotiana tabacum)
  - wheat (Triticum aestivum)

In each of those folders, the EST sequences are stored in one file:

  est_<#sequences>.fasta
  e.g. est_139724.fasta

Each EST sequence identification line consists of the prefix crest| followed
by the identifier of the EST, e.g.

>crest|HA01A02r
ATCCCAATCCGCGCACCCGAATTCCCAATCCCCCCCAAAACCCTAGCCCC
GATCCCGATCCCGATCCCGGCGAGTGAGATGGCCAACCCGAAGGTGTTCT
TCGACATGACGGTGGGCGGCGCCCCGGCCGGCCGCATCGTGATGGAGCTG
TACAAGGACGCGGTGCCGAGGACGGTGGAGAACTTCCGCGCGCTCTGCAC
CGGCGAGAAGGGCGTCGGCAAGAGCGGCAAGCCGCTGCACTACAAGGGCA
GCTCGTTCCACCGCGTCATCCCCGACTTCATGTGCCAGGGCGGCGACTTC
ACCAGGGGCAACGGCACCGGCGGCGAGTCCATCTACGGCGAGAAGTTCGC
CGACGAGAAGTTCGTCCACAAGCACACCAAGCCAGGGATCCTCTCCATGG
CCAACGCCGGGCCCAACACCAACGGCTCCCAGTTCTTCATCTGCACCGTC
CCCTGCAATTGGCTCGACGGCAAGCACGTGGTCTTCGGCGAGGTCGTCGA
GGGCATGGACGTCGTCAAGAACAT

Furthermore, a non-redundant set of consensus sequences is available,
resulting from clustering with the StackPack package:

  consensus_g.fasta
  e.g. consensus_g01.fasta

The format of each consensus sequence identification line is:

  >.fasta
  e.g.

>cl19ct48cn54g01
ACCAAACCCGACCATCTCGTTCCGTTCCAGCTCAGGCGAGCCAGCGACCA
CCGGCCTCCGGCGAGGGCAGCAGCGATGGAGGGCAAGGAGGAGGACGTGC
GCCTCGGGGCGAACAAGTACTCGGAGCGGCAGCCCATCGGCACGGCGGCG
CAGGGGTCCGAGGACAAGGACTACAAGGAGCCCCCGCCGGCGCCGCTGTT
CGAGCCCGGCGAGCTCAAGTCGTGGTCCTTCTACCGCGCCGGCATCGCCG
AGTTCATGGCCACCTTCCTCTTCCTCTACGTCACCATCCTCACCGTCATG
GGCTACAGCGGCGCCGCCTCCAAGTGCGCCACCGTCGGCATCCAGGGCAT
CGCCTGGTCCTTCGGCGGCATGATCTTCGCCCTCGTCTACTGCACCGCCG
GCATCTCTGGCGGGCACATCAACCCGGCGGTGACCTTCGGGCTGTTCCTG
GCGAGGAAGCTGTCGCTGACGAGGGCGGTGTTCTACATCATCATGCAGTG
CCTGGGCGCCATCTGCGGCGCCGGCGTGGTCAAGGGGTTCCAGCAGGGCC
TGTACATGGGCAACGGCGGCGGCGCCAACGTGGTGGCGTCCGGCTACACC
AAGGGCTCCGGGCTCGGCGCCGAGATCATCGGCACCTTCGTCCTTCGTCT
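
Reading the files

The flat files described above are ordinary FASTA, so any FASTA reader
should be able to handle them. The following is only a minimal sketch in
Python, not a tool shipped with CR-EST: the helper names read_fasta and
est_name are made up for this illustration, and the sketch assumes that every
record starts with a ">" line and that sequence data may be wrapped over
several lines, as in the examples above.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from FASTA-formatted lines.

    Hypothetical helper for illustration; not part of the CR-EST distribution.
    """
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # A new identification line closes the previous record.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Wrapped sequence lines are joined into one string.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def est_name(identifier: str) -> str:
    """Drop the leading 'crest|' prefix used in the EST files, if present."""
    prefix = "crest|"
    if identifier.startswith(prefix):
        return identifier[len(prefix):]
    return identifier


if __name__ == "__main__":
    # Shortened sample records taken from the examples in this README.
    sample = [
        ">crest|HA01A02r",
        "ATCCCAATCCGCGCACCCGAATTCCCAATCCCCCCCAAAACCCTAGCCCC",
        "GATCCCGATCCCGATCCCGGCGAGTGAGATGGCCAACCCGAAGGTGTTCT",
        ">cl19ct48cn54g01",
        "ACCAAACCCGACCATCTCGTTCCGTTCCAGCTCAGGCGAGCCAGCGACCA",
    ]
    for identifier, sequence in read_fasta(sample):
        print(est_name(identifier), len(sequence))
```

To process a real file, pass an open file handle instead of the sample list,
for example read_fasta(open(path)) where path points to one of the est_ or
consensus_ files in an organism folder. Consensus identifiers carry no
crest| prefix, so est_name returns them unchanged.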