We have established a specimen database for the laboratory to keep track of specimens and sequences. This page is a brief introduction to using the database.
Log on to the EARTHCAPE DATABASE
How do I log on?
Follow the link above and a login page will open. If you are in the group or a collaborating group, we can give you a personal account with a Username and password. However, there is also a ‘read-only’ option using the Username ‘Anonymous’ and a blank password.
How do I find sequences associated with a published paper?
Papers and ongoing projects are linked as ‘Record Sets’. If you want to find the specimens and sequences in a published paper, first click on ‘Record Sets’ in the left-hand menu. Then click on the relevant paper and a list of associated specimens will be generated. Click on ‘Show Sequencing’ on the top right of the page to generate a list of sequence entries associated with those specimens. The ENA accession codes link out to the online sequence database.
How do I upload a new set of samples?
We have a set of template files for uploading data. The actual upload process needs to be carried out on a local installation of the database (i.e. not the web version). If your specimens are collected from new localities you will need to first prepare a localities upload file, and upload this before uploading the specimens.
How do I find a particular set of specimens?
Let’s say you want to find all specimens of H. melpomene from Ecuador. Click first on the Units link on the right-hand side. Right-click on the column header row and select ‘Filter Row’. A new row will appear at the top of the table. Type ‘Heliconius melpomene’ in the Taxonomic Name column and ‘Ecuador’ in the Country column. You can also select different search options by clicking again on the header row and selecting ‘Filter Row Menu’. A pull-down menu then appears on each row where you can select search terms such as ‘Equals’ or ‘Contains’.
How do I search for many individuals in batch?
If you have a list of individual Unit IDs (e.g. CAM000654) in an Excel file (or any tab-delimited format): click on the Units table, click on [..] next to the search bar (top right), click on Filter list, copy the desired row of Unit IDs into the text box and press Enter. If you search for “654”, all the Unit IDs ending in “654” will come up.
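If the Unit IDs sit in a larger tab-delimited export, a short script can pull them out before pasting. The following is a minimal Python sketch, not part of Earthcape itself: the function name `extract_unit_ids`, the column header `UnitID` and the pattern (2-3 capital letters followed by 6 digits, as in CAM000654) are assumptions for illustration. Printing one ID per line is also an assumption, so adjust the separator to whatever the Filter list box accepts.

```python
import csv
import io
import re

# Assumed Unit ID pattern: 2-3 capital letters followed by 6 digits.
UNIT_ID_PATTERN = re.compile(r"[A-Z]{2,3}[0-9]{6}")


def extract_unit_ids(tab_delimited_text, column="UnitID"):
    """Return (matching, other): unique values from `column`, in input order.

    `matching` holds values that fit the assumed pattern; `other` holds the
    rest so they can be checked by hand rather than silently dropped.
    """
    reader = csv.DictReader(io.StringIO(tab_delimited_text), delimiter="\t")
    matching, other = [], []
    seen = set()
    for row in reader:
        value = (row.get(column) or "").strip()
        if not value or value in seen:
            continue  # skip blanks and repeated IDs
        seen.add(value)
        if UNIT_ID_PATTERN.fullmatch(value):
            matching.append(value)
        else:
            other.append(value)
    return matching, other


# Made-up example rows, e.g. copied from a spreadsheet.
example = (
    "UnitID\tSpecies\n"
    "CAM000654\tHeliconius melpomene\n"
    "CAM000655\tHeliconius melpomene\n"
    "CAM000654\tHeliconius melpomene\n"
    "cam656\tHeliconius melpomene\n"
)
matching, other = extract_unit_ids(example)
print("\n".join(matching))  # list to paste into the Filter list box
print("Check by hand:", other)
```

Returning the non-matching values separately is deliberate: the format is only a general rule, so an unusual ID may still be valid and is worth a manual look.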
How do I upload a new set of sequences?
- First make sure all the samples are in the database. There will be a separate post on this.
- Associate your samples with a ‘Record Set’. Publications and ongoing projects are stored as Record Sets and these can be used to group individuals. One individual can be in multiple Record Sets. First make a new Record Set if there isn’t one already for your project. Include the Title of the paper as the Name field and the DOI and authorship in Comments.
- Next, make a list of the UnitIDs of the individuals in your dataset as a text file that you can paste into the website. These are usually of the format CAM000001. Then go to the Units tab in Earthcape and select ‘Filter List’ from the (…) button. Paste in the list of UnitIDs and click OK. Once you have a list with just your samples, select them all with the tick box in the left column and go to ‘Add to Record Set’, also under the (…) button. Select your project and click OK.
- Upload ‘Sample Accession Numbers’ to Units. These are the ENA accession numbers that are unique to a biological sample (not a sequence dataset). You can add these manually or import using the Windows Earthcape client on the laptop.
  - Make a CSV file with the column titles UnitID,AccessionEna (make sure the UnitIDs match exactly with those in the database).
  - Upload with the Windows database.
- Add Sequencing data to the ‘Sequencing Table’. Make an upload file using Sarah’s template. Note that the accession number in this table is the ‘Run Accession’, which is specific to a particular sequence dataset. One sample could be sequenced in various ways, so a single sample accession number could be associated with several run accession numbers.
  - Upload with the Windows database.
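For the sample accession step above, the upload file can be typed by hand or generated. The following Python sketch shows one way to produce a file with the column titles UnitID,AccessionEna. The function name `build_accession_csv` and the accession values are placeholders made up for illustration, and the duplicate check assumes each UnitID should appear only once in this file, since a sample accession is unique to a biological sample.

```python
import csv
import io


def build_accession_csv(accessions):
    """Return CSV text with the header UnitID,AccessionEna.

    `accessions` is a list of (unit_id, sample_accession) pairs.
    Raises ValueError on empty values or a repeated UnitID.
    """
    seen = set()
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["UnitID", "AccessionEna"])
    for unit_id, accession in accessions:
        unit_id = unit_id.strip()
        accession = accession.strip()
        if not unit_id or not accession:
            raise ValueError(f"Missing value in row: {(unit_id, accession)!r}")
        if unit_id in seen:
            raise ValueError(f"Duplicate UnitID: {unit_id}")
        seen.add(unit_id)
        writer.writerow([unit_id, accession])
    return buffer.getvalue()


# Placeholder values for illustration only, not real records.
rows = [("CAM000001", "ERS0000001"), ("CAM000002", "ERS0000002")]
print(build_accession_csv(rows), end="")
```

The script only checks the file's internal consistency. Whether each UnitID matches an entry in the database still has to be confirmed against the database itself before uploading.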
Unit ID = Sample (individual butterfly). Generally 2-3 letters identifying the dataset + 6 digits
Table = part of the database corresponding to one of the tabs on the left (e.g. units table, sequencing table)
Filter list = search in batch for Unit IDs
Column chooser = to add more columns to any table
Datasets = collections of individuals (usually by institution)