Last updated: 9th October 2018
We have established a specimen database for the laboratory to keep track of specimens and sequences. This page provides a brief introduction to the database; more detailed how-to pages can be found in the side menu.
Log on to the EARTHCAPE DATABASE
Introduction and FAQs:
What is held on the database?
Information for over 20,000 samples, collected from 594 different locations, in 28 countries across 5 continents, with almost 600 different species listed, and sequencing data for over 1750 individuals.
How do I log on?
Follow the link above and a login page will open. If you are in the group or a collaborating group, we can give you a personal account with a username and password. There is also a ‘read-only’ option.
How can I get help with the database?
There is an ongoing effort to create and maintain how-to pages, which are listed in the adjacent sidebar. If you have particular problems or would like a new how-to page created, contact Ian Warren or Chris Jiggins. Authorised users will be invited to join a Redbooth working group, where it is possible to interact with the database administrator and where more detailed queries and bugs can be reported. Further information is available from the hosting company, Earthcape.
How do I find sequences associated with a published paper?
Papers and ongoing projects are linked as ‘Record Sets’. To find the specimens and sequences in a published paper, first click on ‘Record Sets’ in the left-hand menu. Then click on the relevant paper and a list of associated specimens will be generated. Click on ‘Show Sequencing’ at the top right of the page to generate a list of sequence entries associated with those specimens. The ENA accession codes link out to the online sequence database.
How do I upload a new set of samples?
We have a set of template files for uploading data. The process can be done either online or on a local installation of the database. Instructions can be found here: web version; and here: local installation.
How do I find a particular set of specimens?
Let’s say you want to find all specimens of H. melpomene from Ecuador. Click first on the Units link on the right-hand side. Right-click on the column header row and select ‘Filter Row’. A new row will appear at the top of the table. Type ‘Heliconius melpomene’ in the Taxonomic Name column and ‘Ecuador’ in the Country column. You can also select different search options by right-clicking again on the header row and selecting ‘Filter Row Menu’. A pull-down menu then appears on each row, where you can select search terms such as ‘Equals’ or ‘Contains’.
How do I search for many individuals in batch?
If you have a list of individual Unit IDs (e.g. CAM000654) in an Excel file (or any tab-delimited format): click on the Units table, click on [..] next to the search bar (top right), click on Filter list, paste the desired Unit IDs into the text box and press Enter. Note that the search matches partial IDs: if you search for “654”, all the Unit IDs ending in “654” will come up.
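If your Unit IDs sit in one column of a tab-delimited export, a short script can pull them out ready for pasting into the Filter list box. This is only a sketch: the column name `UnitID`, the sample rows, and the one-ID-per-line output are assumptions for illustration, not something the database requires.

```python
import csv
import io


def unit_ids_from_tsv(text, column="UnitID"):
    """Return the unique, non-empty values of one column of tab-delimited text.

    The column name "UnitID" is an assumption; change it to match your file.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    ids = []
    for row in reader:
        value = (row.get(column) or "").strip()
        # Keep the first occurrence of each ID and preserve file order.
        if value and value not in ids:
            ids.append(value)
    return ids


# Made-up rows for illustration only, not real database records.
sample = (
    "UnitID\tSpecies\n"
    "CAM000654\tHeliconius melpomene\n"
    "CAM000655\tHeliconius erato\n"
    "CAM000654\tHeliconius melpomene\n"
)

# Print one Unit ID per line, ready to copy and paste.
print("\n".join(unit_ids_from_tsv(sample)))
```

To use it on a real file, read the file contents into a string (for example with `open(path, encoding="utf-8").read()`) and pass that string to `unit_ids_from_tsv`.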
How do I upload a new set of sequences?
- First make sure all the samples are in the database (see above)
- Associate your samples with a ‘Record Set’ (instructions can be found here). Publications and ongoing projects are stored as Record Sets, and these can be used to group individuals. One individual can be in multiple Record Sets. First make a new Record Set if there isn’t one already for your project. Enter the title of the paper in the Name field and the DOI and authorship in Comments.
- Next, make a list of the UnitIDs of the individuals in your dataset as a text file that you can paste into the website. UnitIDs are usually of the format CAM000001. Then go to the Units tab in Earthcape and select ‘Filter List’ from the (…) button. Paste in the list of UnitIDs and click OK. Once you have a list with just your samples, select them all with the tick box in the left column and go to ‘Add to Record Set’, also under the (…) button. Select your project and click OK.
- Upload ‘Sample Accession Numbers’ to Units. These are the ENA accession numbers that are unique to a biological sample (not a sequence dataset). You can add these manually or import using the Windows Earthcape client on the laptop.
- Make a csv file with column titles: UnitID,AccessionEna (make sure UnitIDs match exactly with those in the database)
- Upload with Windows database
- Add sequencing data to the ‘Sequencing Table’. Make an upload file using Sarah’s template. Note that the accession number in this table is the ‘Run Accession’, which is specific to a particular sequence dataset. One sample could be sequenced in various ways, so a single sample accession number could be associated with several run accession numbers.
- Upload with Windows database
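The sample accession csv file described in the steps above could be generated along the following lines. This is a hedged sketch rather than an official tool: only the column titles `UnitID,AccessionEna` come from this page. The function name, the `known_ids` set (imagined here as a list of Unit IDs copied out of the Units table), and the placeholder accession values are all assumptions.

```python
import csv
import io


def build_accession_csv(pairs, known_ids):
    """Return CSV text with the header UnitID,AccessionEna.

    pairs     -- iterable of (unit_id, sample_accession) tuples
    known_ids -- set of Unit IDs already in the database
                 (hypothetical: e.g. copied from the Units table)
    """
    cleaned = [(unit_id.strip(), accession.strip()) for unit_id, accession in pairs]

    # UnitIDs must match the database exactly, so stop on any unknown ID.
    missing = [unit_id for unit_id, _ in cleaned if unit_id not in known_ids]
    if missing:
        raise ValueError("Unit IDs not found in database list: " + ", ".join(missing))

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["UnitID", "AccessionEna"])
    writer.writerows(cleaned)
    return buffer.getvalue()


# Placeholder values for illustration only, not real records.
known_ids = {"CAM000001", "CAM000002"}
pairs = [("CAM000001", "SAMPLE_ACCESSION_1"), ("CAM000002", "SAMPLE_ACCESSION_2")]
print(build_accession_csv(pairs, known_ids), end="")
```

Checking the IDs before writing the file means a typo is caught on your own machine rather than during the upload.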
Unit ID = Sample (individual butterfly). Generally 2-3 letters identifying the dataset, followed by 6 digits
Table = part of the database corresponding to one of the tabs on the left (e.g. units table, sequencing table)
Filter list = search in batch for Unit IDs
Column chooser = to add more columns to any table
Datasets = collections of individuals (usually by institution)
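As a quick check before a batch search or upload, the usual Unit ID shape described above (2-3 letters followed by 6 digits) can be tested with a regular expression. This is a sketch under assumptions: the page says "generally", so some valid IDs may not fit this pattern, and restricting the prefix to upper-case letters is a guess based on the example CAM000654.

```python
import re

# Assumed shape: 2-3 upper-case letters (dataset prefix) followed by 6 digits.
UNIT_ID_PATTERN = re.compile(r"[A-Z]{2,3}[0-9]{6}")


def looks_like_unit_id(value):
    """Return True if value has the usual Unit ID shape, e.g. CAM000654."""
    return UNIT_ID_PATTERN.fullmatch(value.strip()) is not None


# Flag entries worth checking by hand before pasting them into the database.
candidates = ["CAM000654", "CAM654", "cam000654"]
suspect = [c for c in candidates if not looks_like_unit_id(c)]
print(suspect)
```

Entries that fail the check are not necessarily wrong, but they are worth a second look, since UnitIDs must match those in the database exactly.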