Tutorial
Requirements
To use BACA, a browser with SVG support is suggested, but you can also
use a browser without SVG. We recommend a
recent version of Firefox.
Using BACA
Using BACA is very simple. The interface is as follows:

As a first example, let's retrieve, organize and visualize the mitochondrial
genomes for turtles.
The first task is to devise a query to fetch all mitochondrial genomes.
We start by visiting NCBI's
TaxBrowser,
then we simply search for turtles. We click Testudines, followed
by Genome Sequences (subtree links) (check the box on the right side of
the screen). In the search box you can find the rather cryptic string we are
looking for:
txid8459[Organism:exp]
This is the query you will have to input in BACA.
Before you click retrieve, you can also choose which annotations you want
organized (that is, with FASTA files produced per annotation).
Another example is the placentals. Using the steps above, you can find that
the query will be:
txid9347[Organism:exp]
Have a look.
In the case of placentals, there are also nuclear genomes available
(think Homo sapiens), so we need to restrict the query to mitochondrial
genomes. For that we add AND mitochondrion, that is:
txid9347[Organism:exp] AND mitochondrion
You might also want to exclude certain genomes. For instance, for Homo
sapiens there are 2 versions; let's exclude one:
txid9347[Organism:exp] AND mitochondrion NOT NC_001807
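The queries above follow a simple pattern, so they can also be composed
programmatically. The following is a minimal sketch in Python; build_query is
a hypothetical helper written for illustration and is not part of BACA. It
only assembles the text you would type into the query box.

```python
def build_query(taxid, mitochondrial_only=False, exclude=()):
    """Compose a query string in the style shown in this tutorial.

    build_query is a hypothetical helper for illustration only;
    it is not part of BACA.
    """
    # Taxonomy ID with the [Organism:exp] qualifier, as in the examples above.
    query = f"txid{taxid}[Organism:exp]"
    if mitochondrial_only:
        # Restrict the results to mitochondrial genomes.
        query += " AND mitochondrion"
    for accession in exclude:
        # Drop specific entries by accession number.
        query += f" NOT {accession}"
    return query


print(build_query(8459))
print(build_query(9347, mitochondrial_only=True, exclude=["NC_001807"]))
```

The two calls reproduce the turtle query and the final placental query from
the examples above.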
It might happen that BACA is not able to complete the work. The most probable
cause is either a network problem or performance problems
with GenBank. If you think there is an issue with GenBank performance, then
resubmitting the query in BACA might solve it; if not, just try again
during a GenBank non-peak hour.
This concludes the instructions on entering data in BACA.
The result of the retrieval and organization part is a zip archive with several files inside. An illustrative subset of the available files is described below:
- GENE_cDNA_ATP6.fasta Has all the ATP6 genes from all the genomes that have the said gene.
- GENE_cDNA_ATP6.stats Includes statistics related to 4 fold degenerate sites for ATP6.
- GENOME_NC_001807.fasta Has all the genes for the NC_001807 genome.
- GENOME_NC_001807.stats Includes statistics regarding 4 fold degenerate sites of genes in NC_001807.
- NC_001807.gbk The complete NC_001807 GenBank entry.
- NC_001807.meta Just the species name of NC_001807.
- NC_001807#gene#ND6.fasta A FASTA file with just ND6 from NC_001807.
- NC_001807#rRNA#l-rRNA.fasta A FASTA file with just 16S rRNA from NC_001807.
- NC_001807#D-Loop#D-Loop.fasta A FASTA file with just the D-Loop from NC_001807.
- NC_001807#tRNA#tRNA-Asp.fasta A FASTA file with just the Aspartic Acid tRNA from NC_001807.
- TRNA_tRNA#tRNA-Asp.fasta Includes all Aspartic Acid tRNAs from all genomes that have this tRNA.
- RRNA_rRNA#16S_ribosomal_RNA.fasta Includes all 16S rRNAs from all genomes that have 16S rRNA.
- DL_D-loop#D-loop.fasta Includes all D-loops existing in all genomes.
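Judging from the examples in this list, the per-genome feature files appear to
be named ACCESSION#TYPE#NAME.fasta. Under that assumption, the sketch below
splits such a name into its parts, which can help when selecting files from a
large archive. parse_feature_file is a hypothetical helper, not part of BACA,
and it deliberately ignores the other file kinds (GENE_*, GENOME_*, .gbk,
.meta and so on).

```python
import os


def parse_feature_file(filename):
    """Split 'ACCESSION#TYPE#NAME.fasta' into its three parts.

    Hypothetical helper for illustration; the pattern is inferred from the
    example file names in this tutorial. Returns None for any name that
    does not match it.
    """
    stem, extension = os.path.splitext(filename)
    parts = stem.split("#")
    if extension != ".fasta" or len(parts) != 3:
        return None
    accession, feature_type, name = parts
    return {"accession": accession, "type": feature_type, "name": name}


for name in ["NC_001807#gene#ND6.fasta", "GENE_cDNA_ATP6.fasta"]:
    print(name, parse_feature_file(name))
```

The first name is recognized as a per-genome feature file, while the second
belongs to another file kind and yields None.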
The zip file will, in most cases, contain many files; just choose the ones
you need according to the naming conventions above. So many ways of
organizing the data are provided in order to help as
many types of potential users as possible.
For all features, a file with data specifying the start, end and strand of
the feature is provided.
This information is available for all genes and all genomes that were
retrieved with the query.
Files of type GENE_* can, for instance, be processed by a multiple alignment tool
like ClustalW.
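Before passing a GENE_* file to an alignment tool, it can be useful to check
which sequences it contains. The following is a minimal sketch of a FASTA
reader in Python. read_fasta is a hypothetical helper, not part of BACA, and
the sample records are made-up placeholders, since the exact header format
that BACA writes is not described here.

```python
def read_fasta(lines):
    """Read FASTA-formatted lines into a dict of header -> sequence.

    Hypothetical helper for illustration; accepts any iterable of lines,
    such as an open file handle.
    """
    chunks = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            chunks[header] = []
        elif header is not None:
            chunks[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in chunks.items()}


# Made-up sample data standing in for the contents of a GENE_* file.
sample = """>record_one
ATGAAC
GAAAAT
>record_two
ATGCCC
""".splitlines()

sequences = read_fasta(sample)
for header, sequence in sequences.items():
    print(header, len(sequence))
```

To read a real file, pass an open file handle instead of the sample lines,
for instance one opened on GENE_cDNA_ATP6.fasta.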
Regarding visualizing the results of BACA, we suggest playing with our examples
which can be found on the main page.
Please note that if you click an annotation type which has no information, you
will only get the SVG map redrawn. For an example of annotations with extra
information, choose, e.g., cDNA.
If you have any questions or comments, please don't hesitate to
contact us.