Bacterial Isolate Genome Sequence Database (BIGSdb)
Gene-by-gene population annotation and analysis
BIGSdb is software designed to store and analyse sequence data for bacterial isolates. Any number of sequences can be linked to an isolate record, ranging from small contigs assembled from dideoxy sequencing to whole genomes (either complete or as multiple contigs generated by parallel sequencing technologies such as Illumina or Oxford Nanopore).
BIGSdb extends the principle of multilocus sequence typing (MLST) to genomic data, where large numbers of loci can be defined, with alleles assigned by reference to sequence definition databases (which can also be set up with BIGSdb). Loci can also be grouped into schemes so that types can be defined by allelic profiles, that is, combinations of alleles across the loci in a scheme, a concept analogous to MLST.
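The relationship between loci, alleles, schemes and types can be illustrated with a minimal Python sketch. This is not BIGSdb code and does not reflect its database schema: the locus names, sequences, identifiers and the functions `assign_alleles` and `assign_type` are all hypothetical, chosen only to show how an exact sequence match yields an allele identifier and how a complete allelic profile yields a type.

```python
from typing import Dict, Optional, Tuple

# Hypothetical scheme: an ordered group of loci (names are illustrative only).
SCHEME_LOCI: Tuple[str, ...] = ("locusA", "locusB", "locusC")

# Hypothetical sequence definitions: locus -> {allele sequence: allele id}.
ALLELES: Dict[str, Dict[str, int]] = {
    "locusA": {"ATGC": 1, "ATGT": 2},
    "locusB": {"GGCA": 1},
    "locusC": {"TTAA": 1, "TTAG": 2},
}

# Hypothetical profile definitions: allelic profile -> type identifier.
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1): 1,
    (2, 1, 2): 2,
}


def assign_alleles(sequences: Dict[str, str]) -> Tuple[Optional[int], ...]:
    """Look up each locus sequence by exact match; None means no known allele."""
    return tuple(
        ALLELES[locus].get(sequences.get(locus, "")) for locus in SCHEME_LOCI
    )


def assign_type(profile: Tuple[Optional[int], ...]) -> Optional[int]:
    """Return the type for a complete, defined profile, otherwise None."""
    if None in profile:
        return None
    return PROFILES.get(profile)


isolate = {"locusA": "ATGT", "locusB": "GGCA", "locusC": "TTAG"}
profile = assign_alleles(isolate)
print(profile, assign_type(profile))
```

In this sketch an isolate with an unrecognised sequence at any locus, or with a combination of alleles that has no profile definition, simply receives no type.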
The software has been released under the GNU General Public Licence version 3. The latest version of this documentation can be found at https://bigsdb.readthedocs.org/.
- Concepts and terms
- BIGSdb dependencies
- Installation and configuration of BIGSdb
- Software installation
- Configuring PostgreSQL
- Setting global connection parameters
- Site-specific configuration
- Setting up the offline job manager
- Setting up the submission system
- Setting up a site-wide user database
- Periodically delete temporary files
- Prevent preference database getting too large
- Log file rotation
- Upgrading BIGSdb
- Running the BIGSdb RESTful interface
- Enabling database logging of web and API access
- Enabling isolate embargoes
- Database setup
- Creating databases
- Database-specific configuration
- XML configuration attributes used in config.xml
- Over-riding global defaults set in bigsdb.conf
- Over-riding values set in config.xml
- Setting field validation rules
- Sparsely-populated fields
- Kiosk mode
- User authentication
- Setting up the admin user
- Retrieving PubMed citations from NCBI
- Configuring access to remote contigs
- Setting up front-end and query dashboards
- Administrator’s guide
- Types of user
- User groups
- Curator permissions
- Locus and scheme permissions (sequence definition database)
- Controlling access
- Setting user passwords
- Setting the first user password
- Enabling plugins
- Temporarily disabling database updates
- Host mapping
- Improving performance
- Dataset partitioning
- Setting a site-wide users database
- Adding new loci
- Defining locus extended attributes
- Defining locus amino acid variants or single-nucleotide polymorphisms
- Defining schemes
- Organizing schemes into hierarchical groups
- Setting up client databases
- Workflow for setting up a MLST scheme
- Automated assignment of scheme profiles
- Scheme profile clustering - setting up classification schemes
- Defining new loci based on annotated reference genome
- Setting up LINcode definitions for cgMLST schemes
- Genome filtering
- Setting locus genome positions
- Defining composite fields
- Extended provenance attributes (lookup tables)
- Sequence bin attributes
- Checking external database configuration settings
- Exporting table configurations
- Authorizing third-party client software to access authenticated resources
- BLAST caches
- Config-specific file downloads
- Curator’s guide
- Adding new sender details
- Adding new allele sequence definitions
- Updating and deleting allele sequence definitions
- Retiring allele identifiers
- Un-retiring allele identifiers
- Updating locus descriptions
- Adding new scheme profile definitions
- Updating and deleting scheme profile definitions
- Retiring scheme profile identifiers
- Un-retiring scheme profile identifiers
- Adding isolate records
- Updating and deleting single isolate records
- Batch updating multiple isolate records
- Deleting multiple isolate records
- Retiring isolate identifiers
- Un-retiring isolate identifiers
- Setting alternative names for isolates (aliases)
- Linking isolate records to publications
- Uploading sequence contigs linked to an isolate record
- Batch uploading sequence contigs linked to multiple isolate records
- Linking remote contigs to isolate records
- Automated web-based sequence tagging
- Projects
- Isolate record versioning
- Populating geographic coordinate lookup values
- Curating data submitted via the automated submission system
- Offline curation tools
- Definition downloads
- Data records
- Querying data
- Querying sequences to determine allele identity
- Querying multiple sequences to identify allele identities
- Searching for specific allele definitions
- Browsing scheme profile definitions
- Querying scheme profile definitions
- Identifying allelic profile definitions
- Batch profile queries
- Investigating allele differences
- Browsing isolate data
- Querying isolate data
- Bookmarking an isolate query
- Retrieving isolates by linked publication
- User-configurable options
- User projects
- Private records
- Data analysis plugins
- Data export plugins
- Submitting data using the submission system
- RESTful Application Programming Interface (API)
- Frequently asked questions (FAQs)
- Appendix
- Database schema