Notes
- Sequences labelled 'scaffold', 'unlocalized', 'contig', 'unplaced', 'patch', or 'unknown' were excluded from the analysis and are therefore not available in the database. As a result, some organisms may have only mitochondrial data.
- If, for some organism, an fr_..._N column shows 0 while the corresponding bp_..._N column has a non-zero value, the frequency of N is so low compared to the other bases that the rounded value is displayed as 0.
- To improve the efficiency of the database, bacterial names are not included in the dropdown list. Please use the 'Bacteria' tab to access bacterial data.
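The rounding effect described in the note on fr_..._N can be illustrated with a small Python sketch. The counts and the display precision of four decimal places are hypothetical values chosen for illustration; the precision actually used by the database is not stated here.

```python
# Hypothetical counts for illustration only; not taken from the database.
bp_N = 12              # 'N' nucleotides in the sequence
bp_tot = 150_000_000   # total nucleotides in the sequence

# The true frequency is non-zero but very small (about 8e-08).
fr_N = bp_N / bp_tot

# Assumed display precision of 4 decimal places.
shown = round(fr_N, 4)

print("bp_N:", bp_N, "fr_N as displayed:", shown)
```

With these values the count column is non-zero while the displayed frequency is 0, which is the situation the note describes.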
GBRAP Software Tool
GBRAP is an open-source tool and is freely available at *GitHub link will be available after publication*. It runs efficiently even on a standard laptop, without specialized hardware or configuration. GBRAP includes all the libraries needed for the complete analysis, so no additional installations are necessary. Once you have the GBFF file to be analysed, run GBRAP from the command line/terminal with one of the following commands.
To analyse the whole genome together:
./GBRAP_command_line_tool.py -in input_file_name.gbff -out output_file_name.csv -g
To analyse the chromosomes separately:
./GBRAP_command_line_tool.py -in input_file_name.gbff -out output_file_name.csv -c
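When several GBFF files need to be processed, the commands above can be wrapped in a short script. The following is a minimal sketch, not part of GBRAP itself: the helper names gbrap_command and run_all and the directory layout are hypothetical, and it assumes the script GBRAP_command_line_tool.py is executable and located in the current working directory. Only the -in, -out, -g and -c options shown above are used.

```python
import subprocess
from pathlib import Path


def gbrap_command(gbff_path, csv_path, per_chromosome=False):
    """Build the GBRAP command described above as an argument list.

    per_chromosome=False selects -g (whole genome together),
    per_chromosome=True selects -c (chromosomes separately).
    """
    mode = "-c" if per_chromosome else "-g"
    return [
        "./GBRAP_command_line_tool.py",
        "-in", str(gbff_path),
        "-out", str(csv_path),
        mode,
    ]


def run_all(input_dir, output_dir, per_chromosome=False):
    """Run GBRAP once for every .gbff file found in input_dir (hypothetical layout)."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for gbff in sorted(Path(input_dir).glob("*.gbff")):
        out_csv = output_dir / (gbff.stem + ".csv")
        # check=True stops the loop if GBRAP exits with an error.
        subprocess.run(gbrap_command(gbff, out_csv, per_chromosome), check=True)
```

For example, run_all("genomes", "results", per_chromosome=True) would write one CSV per input file into a results directory, analysing the chromosomes separately.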
Each column of the data tables contains the following data:
- Assembly: Genome Assembly accession ID
- Locus ID: Sequence ID
- Version: Version number of the sequence
- Definition: Sequence description
- bp_chromo_A: Total number of 'A' nucleotides in the whole sequence
- bp_chromo_T: Total number of 'T' nucleotides in the whole sequence
- bp_chromo_C: Total number of 'C' nucleotides in the whole sequence
- bp_chromo_G: Total number of 'G' nucleotides in the whole sequence
- bp_chromo_N: Total number of 'N' nucleotides in the whole sequence
- bp_chromo_tot: Total number of nucleotides in the whole sequence
- fr_chromo_A: Frequency of 'A' nucleotides in the whole sequence
- fr_chromo_T: Frequency of 'T' nucleotides in the whole sequence
- fr_chromo_C: Frequency of 'C' nucleotides in the whole sequence
- fr_chromo_G: Frequency of 'G' nucleotides in the whole sequence
- fr_chromo_N: Frequency of 'N' nucleotides in the whole sequence
- GC_chromo: Percentage of ‘G’ and ‘C’ nucleotides in the whole sequence
- topo_entropy_chromo: Topological entropy of the whole sequence
- chargaff_pf_chromo: Chargaff's second parity rule score (method1) of the whole sequence
- chargaff_ct_chromo: Chargaff's second parity rule score (method2) of the whole sequence
- shannon_chromo: Shannon entropy score of the whole sequence
- n_gene_pos: Total number of genes located on the positive strand
- n_gene_neg: Total number of genes located on the negative strand
- n_gene_tot: Total number of genes in the sequence
- bp_gene_A: Total number of 'A' nucleotides in total genes
- bp_gene_T: Total number of 'T' nucleotides in total genes
- bp_gene_C: Total number of 'C' nucleotides in total genes
- bp_gene_G: Total number of 'G' nucleotides in total genes
- bp_gene_N: Total number of 'N' nucleotides in total genes
- bp_gene_tot: Total number of nucleotides in total genes
- fr_gene_A: Frequency of 'A' nucleotides in total genes
- fr_gene_T: Frequency of 'T' nucleotides in total genes
- fr_gene_C: Frequency of 'C' nucleotides in total genes
- fr_gene_G: Frequency of 'G' nucleotides in total genes
- fr_gene_N: Frequency of 'N' nucleotides in total genes
- GC_gene: Percentage of ‘G’ and ‘C’ nucleotides in total genes
- topo_entropy_gene: Topological entropy of total genes
- chargaff_pf_gene: Chargaff's second parity rule score (method1) of total genes
- chargaff_ct_gene: Chargaff's second parity rule score (method2) of total genes
- shannon_gene: Shannon entropy score of total genes
- bp_gene_overlap_tot: Total number of nucleotides in overlapping gene regions
- n_cds_pos: Total number of CDS located on the positive strand
- n_cds_neg: Total number of CDS located on the negative strand
- n_cds_tot: Total number of CDS in the sequence
- bp_cds_A: Total number of 'A' nucleotides in total CDS (coding sequences)
- bp_cds_T: Total number of 'T' nucleotides in total CDS
- bp_cds_C: Total number of 'C' nucleotides in total CDS
- bp_cds_G: Total number of 'G' nucleotides in total CDS
- bp_cds_N: Total number of 'N' nucleotides in total CDS
- bp_cds_tot: Total nucleotides in total CDS
- fr_cds_A: Frequency of 'A' nucleotides in total CDS
- fr_cds_T: Frequency of 'T' nucleotides in total CDS
- fr_cds_C: Frequency of 'C' nucleotides in total CDS
- fr_cds_G: Frequency of 'G' nucleotides in total CDS
- fr_cds_N: Frequency of 'N' nucleotides in total CDS
- GC_cds: Percentage of ‘G’ and ‘C’ nucleotides in total CDS
- topo_entropy_cds: Topological entropy of total CDS
- chargaff_pf_cds: Chargaff's second parity rule score (method1) of total CDS
- chargaff_ct_cds: Chargaff's second parity rule score (method2) of total CDS
- shannon_cds: Shannon entropy score of total CDS
- bp_cds_overlap_tot: Total number of nucleotides in overlapping CDS regions
- bp_cds_intron_A: Total number of 'A' nucleotides in total CDS introns
- bp_cds_intron_T: Total number of 'T' nucleotides in total CDS introns
- bp_cds_intron_C: Total number of 'C' nucleotides in total CDS introns
- bp_cds_intron_G: Total number of 'G' nucleotides in total CDS introns
- bp_cds_intron_N: Total number of 'N' nucleotides in total CDS introns
- bp_cds_intron_tot: Total nucleotides in total CDS introns
- fr_cds_intron_A: Frequency of 'A' nucleotides in total CDS introns
- fr_cds_intron_T: Frequency of 'T' nucleotides in total CDS introns
- fr_cds_intron_C: Frequency of 'C' nucleotides in total CDS introns
- fr_cds_intron_G: Frequency of 'G' nucleotides in total CDS introns
- fr_cds_intron_N: Frequency of 'N' nucleotides in total CDS introns
- GC_cds_intron: Percentage of ‘G’ and ‘C’ nucleotides in total CDS introns
- topo_entropy_cds_intron: Topological entropy of total intron sequences in CDS
- chargaff_pf_cds_intron: Chargaff's second parity rule score (method1) of total CDS introns
- chargaff_ct_cds_intron: Chargaff's second parity rule score (method2) of total CDS introns
- shannon_cds_intron: Shannon entropy score of total CDS introns
- bp_cds_intron_overlap_tot: Total number of nucleotides in overlapping CDS intron regions
- n_ncRNA_pos: Total number of ncRNA located on the positive strand
- n_ncRNA_neg: Total number of ncRNA located on the negative strand
- n_ncRNA_tot: Total number of ncRNA in the sequence
- bp_ncRNA_A: Total number of 'A' nucleotides in total ncRNA
- bp_ncRNA_T: Total number of 'T' nucleotides in total ncRNA
- bp_ncRNA_C: Total number of 'C' nucleotides in total ncRNA
- bp_ncRNA_G: Total number of 'G' nucleotides in total ncRNA
- bp_ncRNA_N: Total number of 'N' nucleotides in total ncRNA
- bp_ncRNA_tot: Total nucleotides in total ncRNA
- fr_ncRNA_A: Frequency of 'A' nucleotides in total ncRNA
- fr_ncRNA_T: Frequency of 'T' nucleotides in total ncRNA
- fr_ncRNA_C: Frequency of 'C' nucleotides in total ncRNA
- fr_ncRNA_G: Frequency of 'G' nucleotides in total ncRNA
- fr_ncRNA_N: Frequency of 'N' nucleotides in total ncRNA
- GC_ncRNA: Percentage of ‘G’ and ‘C’ nucleotides in total ncRNA
- topo_entropy_ncRNA: Topological entropy of total ncRNA
- chargaff_pf_ncRNA: Chargaff's second parity rule score (method1) of total ncRNA
- chargaff_ct_ncRNA: Chargaff's second parity rule score (method2) of total ncRNA
- shannon_ncRNA: Shannon entropy score of total ncRNA
- bp_ncRNA_overlap_tot: Total number of nucleotides in overlapping ncRNA regions
- bp_nc_intron_A: Total number of 'A' nucleotides in total ncRNA introns
- bp_nc_intron_T: Total number of 'T' nucleotides in total ncRNA introns
- bp_nc_intron_C: Total number of 'C' nucleotides in total ncRNA introns
- bp_nc_intron_G: Total number of 'G' nucleotides in total ncRNA introns
- bp_nc_intron_N: Total number of 'N' nucleotides in total ncRNA introns
- bp_nc_intron_tot: Total nucleotides in total ncRNA introns
- fr_nc_intron_A: Frequency of 'A' nucleotides in total ncRNA introns
- fr_nc_intron_T: Frequency of 'T' nucleotides in total ncRNA introns
- fr_nc_intron_C: Frequency of 'C' nucleotides in total ncRNA introns
- fr_nc_intron_G: Frequency of 'G' nucleotides in total ncRNA introns
- fr_nc_intron_N: Frequency of 'N' nucleotides in total ncRNA introns
- GC_nc_intron: Percentage of ‘G’ and ‘C’ nucleotides in total ncRNA introns
- topo_entropy_nc_intron: Topological entropy of total intron sequences in ncRNA
- chargaff_pf_nc_intron: Chargaff's second parity rule score (method1) of total ncRNA introns
- chargaff_ct_nc_intron: Chargaff's second parity rule score (method2) of total ncRNA introns
- shannon_nc_intron: Shannon entropy score of total ncRNA introns
- bp_nc_intron_overlap_tot: Total number of nucleotides in overlapping ncRNA intron regions
- n_tRNA_pos: Total number of tRNA located on the positive strand
- n_tRNA_neg: Total number of tRNA located on the negative strand
- n_tRNA_tot: Total number of tRNA in the sequence
- bp_tRNA_A: Total number of 'A' nucleotides in total tRNA
- bp_tRNA_T: Total number of 'T' nucleotides in total tRNA
- bp_tRNA_C: Total number of 'C' nucleotides in total tRNA
- bp_tRNA_G: Total number of 'G' nucleotides in total tRNA
- bp_tRNA_N: Total number of 'N' nucleotides in total tRNA
- bp_tRNA_tot: Total nucleotides in total tRNA
- fr_tRNA_A: Frequency of 'A' nucleotides in total tRNA
- fr_tRNA_T: Frequency of 'T' nucleotides in total tRNA
- fr_tRNA_C: Frequency of 'C' nucleotides in total tRNA
- fr_tRNA_G: Frequency of 'G' nucleotides in total tRNA
- fr_tRNA_N: Frequency of 'N' nucleotides in total tRNA
- GC_tRNA: Percentage of ‘G’ and ‘C’ nucleotides in total tRNA
- topo_entropy_tRNA: Topological entropy of total tRNA
- chargaff_pf_tRNA: Chargaff's second parity rule score (method1) of total tRNA
- chargaff_ct_tRNA: Chargaff's second parity rule score (method2) of total tRNA
- shannon_tRNA: Shannon entropy score of total tRNA
- bp_tRNA_overlap_tot: Total number of nucleotides in overlapping tRNA regions
- n_rRNA_pos: Total number of rRNA located on the positive strand
- n_rRNA_neg: Total number of rRNA located on the negative strand
- n_rRNA_tot: Total number of rRNA in the sequence
- bp_rRNA_A: Total number of 'A' nucleotides in total rRNA
- bp_rRNA_T: Total number of 'T' nucleotides in total rRNA
- bp_rRNA_C: Total number of 'C' nucleotides in total rRNA
- bp_rRNA_G: Total number of 'G' nucleotides in total rRNA
- bp_rRNA_N: Total number of 'N' nucleotides in total rRNA
- bp_rRNA_tot: Total nucleotides in total rRNA
- fr_rRNA_A: Frequency of 'A' nucleotides in total rRNA
- fr_rRNA_T: Frequency of 'T' nucleotides in total rRNA
- fr_rRNA_C: Frequency of 'C' nucleotides in total rRNA
- fr_rRNA_G: Frequency of 'G' nucleotides in total rRNA
- fr_rRNA_N: Frequency of 'N' nucleotides in total rRNA
- GC_rRNA: Percentage of ‘G’ and ‘C’ nucleotides in total rRNA
- topo_entropy_rRNA: Topological entropy of total rRNA
- chargaff_pf_rRNA: Chargaff's second parity rule score (method1) of total rRNA
- chargaff_ct_rRNA: Chargaff's second parity rule score (method2) of total rRNA
- shannon_rRNA: Shannon entropy score of total rRNA
- bp_rRNA_overlap_tot: Total number of nucleotides in overlapping rRNA regions
- ATG, AAG, GTA, etc.: Count of the respective codon in CDS (codon usage)
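The relationship between the bp_ count columns and the fr_, GC_ and shannon_ columns listed above can be sketched in Python. This is an illustration under stated assumptions rather than GBRAP's own implementation: it assumes that frequencies are fractions of the total including 'N', that GC is a percentage of that same total, and that Shannon entropy uses log base 2. The function name region_stats and the example row are hypothetical; the column names follow the list above.

```python
import math


def region_stats(row, region):
    """Recompute frequency, GC and Shannon values for one region of one row.

    row    : mapping of column name to value (e.g. a csv.DictReader row)
    region : middle part of the column name, e.g. "chromo", "gene" or "cds"
    """
    # Read the five count columns, e.g. bp_chromo_A ... bp_chromo_N.
    counts = {b: int(float(row[f"bp_{region}_{b}"])) for b in "ATCGN"}
    total = sum(counts.values())
    if total == 0:
        return None  # no nucleotides recorded for this region

    # Assumption: the denominator includes 'N'.
    freqs = {b: n / total for b, n in counts.items()}
    gc_percent = 100 * (counts["G"] + counts["C"]) / total
    # Assumption: Shannon entropy in bits; zero-frequency bases are skipped.
    shannon = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
    return {"freqs": freqs, "gc_percent": gc_percent, "shannon": shannon}


# Hypothetical row with equal base counts, for illustration only.
example_row = {
    "bp_chromo_A": "25", "bp_chromo_T": "25",
    "bp_chromo_C": "25", "bp_chromo_G": "25", "bp_chromo_N": "0",
}
stats = region_stats(example_row, "chromo")
print(stats["gc_percent"], stats["shannon"])
```

Values recomputed this way may differ slightly from the stored columns if GBRAP uses a different denominator, rounding or logarithm base.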