
    4- Upload Matrix  


    At the bottom of this page is a sample of the input form for the Upload Matrix mode of the calculator. This is not a mock-up. The user can upload data, run the calculator, and view the output matrix. The input data expected is NOT in Excel format. For reasons described below, it is best to use a comma-separated values (CSV) formatted file to input the data. The Upload Matrix process expects to see all your data in a single matrix, in a single file.

    NOTE: If you have trouble starting up, this is very likely to be a data entry problem. We are very willing to review your first data set for format so that you can use the "Upload Matrix" data entry mode. If you have a problem, contact us here.

    The user should now take a look at the calculator's input form before continuing with the instructions.

    The user can use the first drop down list ("Select Process Type...") to select whether biogeographical affinity is to be computed in addition to the similarity metric. In this sample input form, the mode "Upload Matrix" has already been selected. A user of the Upload Matrix mode must be sure to select this mode from the drop down list labeled "Input Type."

    Note: The row (item name) divider symbol may, but does not have to, appear at the end of the last line of the input file.

    Note: Neither the row (item name) divider nor the column (list name) divider may be included in the text of any list item (row name) or list name (column header). E.g., the data entry person for an original spreadsheet may have split a column heading with a return character to make the column reasonably narrow. Before uploading a comma-separated or tab-separated form of the spreadsheet (say, the CSV output form from Microsoft Excel), such returns would have to be removed from the uploadable text. Continuing the example, the following could be the CSV representation of the first (heading) row of a presence-matrix spreadsheet comparing the taxa of four regions (return characters invisible):

          .., Central

          Asia, Southeastern

          Asia, Japan, China

    This would have to have some return characters removed before uploading because the return character is the end of row indicator for the similarity metric input program. Removing the return characters will produce the following:

          .., Central Asia, Southeastern Asia, Japan, China
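
    The clean-up described above can be sketched in a few lines of Python. This is only an illustration, not part of the calculator: the function name flatten_header_returns is hypothetical, and the sketch assumes that the exported CSV file puts quotation marks around any cell containing a return character, which lets Python's csv module recognize the return as part of the cell.

```python
import csv
import io


def flatten_header_returns(raw_csv: str) -> str:
    """Replace return characters inside quoted cells with single spaces.

    Assumption: cells that contain returns are quoted in the input.
    The output is written without quotes, so it is only valid when no
    cell contains the column divider (which the upload forbids anyway).
    """
    # newline="" keeps the original line endings so the csv module
    # can tell embedded returns from end-of-row returns.
    reader = csv.reader(io.StringIO(raw_csv, newline=""), skipinitialspace=True)
    out_lines = []
    for row in reader:
        # split() with no argument breaks on any white space, including
        # returns; joining with " " collapses each break to one space.
        cells = [" ".join(cell.split()) for cell in row]
        out_lines.append(", ".join(cells))
    return "\n".join(out_lines)


raw = '.., "Central\nAsia", "Southeastern\nAsia", Japan, China\n'
print(flatten_header_returns(raw))
```

    Run on the heading row of the example, the sketch prints the single-line form shown above.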

    Note: The similarity metric calculator treats entered data as case-sensitive -- "A" is different from "a" in the comparison phase of the calculator.

    Note: We have assumed that names of items in the table are not duplicated within the table.

    Note: The user will receive an error message when an attempt is made to compare a list that has no member items.
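
    The last three notes can be checked on your own computer before uploading. The following Python sketch is one hedged way to do so; the function name check_lists and the wording of its messages are hypothetical and are not the calculator's own error messages. It assumes the data has already been read into a sequence of item names and a dictionary mapping each list name to the set of items present in that list.

```python
from collections import Counter


def check_lists(item_names, lists):
    """Return a list of human-readable problems found in the data.

    item_names: item names in file order (first column of the matrix).
    lists: dict mapping each list name to the set of items present in it.
    """
    problems = []
    # Counter compares strings exactly, so "A" and "a" are different
    # names, matching the case-sensitive comparison described above.
    for name, count in Counter(item_names).items():
        if count > 1:
            problems.append(f"duplicate item name: {name!r} appears {count} times")
    # A list with no member items cannot be compared.
    for list_name, members in lists.items():
        if not members:
            problems.append(f"list {list_name!r} has no member items")
    return problems


names = ["diemii", "Diemii", "diemii"]
lists = {"Japan": {"diemii"}, "China": set()}
for problem in check_lists(names, lists):
    print(problem)
```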

    Data Entry: The user of this form of I/O is expected to have access to an uploadable form of a presence-absence matrix such as might be output from Microsoft Excel in its CSV output format or from a word processor imitating that format. (See the example above.) This matrix must have the list names either across the header row of the spreadsheet (naming the columns) or down the first column of the spreadsheet (naming the rows). For the sake of simplicity we'll assume the list names are in the header row beginning in the second column. The first column is used for the names of all the items that appear in one or more of the lists. Each cell in the matrix running from the 2nd row-2nd column to the last row-last column is assumed to contain text other than "0" if and only if the item named in the first column of the given row is present in the list named in the header cell of the given column. Absence is indicated either by the character "0" or by the absence of any text (only white space, or nothing at all) in a given cell. Two rows of such a four-list matrix might look like the following in CSV format:

          calyptratoides, 1, 1, 0, 1

          diemii, 0, 0, 0, 0
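
    The presence rule just described can be illustrated with a short Python sketch. It is a minimal reading of the rule, not the calculator's own code: the names is_present and parse_presence_matrix are hypothetical, list names are taken from the header row, and the sketch assumes that white space around names and cell values is ignored.

```python
def is_present(cell: str) -> bool:
    """A cell marks presence if it holds any text other than "0"."""
    text = cell.strip()  # assumption: surrounding white space is ignored
    return text != "" and text != "0"


def parse_presence_matrix(text, col_div=",", row_div="\n"):
    """Return a dict mapping each list name to the set of items in it."""
    # Dropping blank lines also tolerates a row divider after the last row.
    lines = [line for line in text.split(row_div) if line.strip()]
    header = [cell.strip() for cell in lines[0].split(col_div)]
    list_names = header[1:]  # the first header cell sits above the item names
    lists = {name: set() for name in list_names}
    for line in lines[1:]:
        cells = [cell.strip() for cell in line.split(col_div)]
        item, marks = cells[0], cells[1:]
        for name, mark in zip(list_names, marks):
            if is_present(mark):
                lists[name].add(item)
    return lists


sample = (
    ".., Central Asia, Southeastern Asia, Japan, China\n"
    "calyptratoides, 1, 1, 0, 1\n"
    "diemii, 0, 0, 0, 0\n"
)
for list_name, members in parse_presence_matrix(sample).items():
    print(list_name, sorted(members))
```

    With the two sample rows, calyptratoides lands in every list except Japan, and diemii lands in none.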

    The user will be expected to know the number of columns excluding the leftmost column with its item names and the number of rows excluding the header row with its list names. The first value is entered in the field labeled "Number of Lists." The second is entered in the field labeled "Item Rows." Wait after entering each of these values. The form will be redrawn each time. Haste makes waste. Redrawn parts of the form may lose data entered in them too hastily.
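
    If counting rows and columns by hand is inconvenient, the two values can be derived from the file itself. The Python sketch below is an illustration under stated assumptions: the function name count_lists_and_rows is hypothetical, the list names are in the header row, and the header row is complete (it contains every column divider).

```python
def count_lists_and_rows(text, col_div=",", row_div="\n"):
    """Return (number_of_lists, item_rows) for an upload file's text."""
    # Ignore blank lines, such as the one left by a trailing row divider.
    lines = [line for line in text.split(row_div) if line.strip()]
    # Columns in the header row, excluding the leftmost (item name) column.
    number_of_lists = len(lines[0].split(col_div)) - 1
    # Rows in the file, excluding the header row with its list names.
    item_rows = len(lines) - 1
    return number_of_lists, item_rows


sample = (
    ".., Central Asia, Southeastern Asia, Japan, China\n"
    "calyptratoides, 1, 1, 0, 1\n"
    "diemii, 0, 0, 0, 0\n"
)
number_of_lists, item_rows = count_lists_and_rows(sample)
print("Number of Lists:", number_of_lists)
print("Item Rows:", item_rows)
```

    For the sample above the sketch reports four lists and two item rows.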

    The user may wish to provide a title for the output matrix of similarity indices. A field is provided in which this optional datum may be entered.

    Next the user should enter the character that divides the column cells (column divider) within a line of the file to be uploaded (in the field labeled "Column Division") and the character (row divider) that separates the rows of the matrix in the file to be uploaded (in the field labeled "Row Division"). In the above examples, the column divider is a comma, and the row divider is a return character. We strongly recommend that our users adopt these dividers. We have found that some versions of Excel generate tab-separated values output that is missing all the tabs at the end of a row after the last non-empty cell. The similarity metric interprets this as corrupted input data.
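
    A user who must work from a tab-separated file can look for the missing-tab problem before uploading. The following Python sketch is one possible check, not a feature of the calculator: the function name pad_short_rows is hypothetical, and it assumes that the header row is complete and that rows are divided by a plain "\n" character.

```python
def pad_short_rows(text, col_div="\t", row_div="\n"):
    """Pad rows that have fewer column dividers than the header row.

    Returns the repaired text and the 1-based numbers of the rows
    that were short.
    """
    lines = text.split(row_div)
    if lines and lines[-1] == "":
        lines.pop()  # a row divider after the last row leaves an empty piece
    if not lines:
        return text, []
    expected = lines[0].count(col_div)  # assumption: header row is complete
    padded, short_rows = [], []
    for number, line in enumerate(lines, start=1):
        missing = expected - line.count(col_div)
        if missing > 0:
            short_rows.append(number)
            line += col_div * missing  # restore the dropped trailing dividers
        padded.append(line)
    return row_div.join(padded), short_rows


sample = "..\tA\tB\tC\nx\t1\ny\t\t1\t\n"
fixed, short_rows = pad_short_rows(sample)
print("rows that were short:", short_rows)
```

    Rows reported as short are exactly those that ended in empty cells in the original spreadsheet; padding them restores one cell per list in every row.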

    Next the orientation of the table to be uploaded must be specified by indicating which sorts of names appear in the header row of the table. In the above examples, the header row contained the list names. The orientation is selected from a drop down list labeled "File Columns Are." We suggest that the user follow the approach of our examples.

    The last step before running the similarity calculator is to select the uploadable input file. You can browse for this file on your computer and select it for upload. A sample data file is provided for you to upload as part of your learning process.

    Using Sample Data Provided from this Site: Sample data for use from this page will need to be downloaded to your system. It is available in compressed form (see buttons below). SIT files are often preferred by Apple users and ZIP files, by PC users. Save your choice to your system, then decompress it. This may happen automatically when you click the button you choose. Once the demo input file is on your system, you may use the sample form to upload the decompressed version of it.

    Running the Similarity Calculator: When you are ready to produce your matrix of similarity indices, click the "Process" button. The similarity calculator provides an approximate run time for the computation part of the processing.

    After your output similarity index matrix has been prepared it can be saved by printing the screen content or it can be downloaded to your computer.

    Downloading Output: To download the output matrix as a text list click the button on the Results page marked "download printable results file."

    To download the output matrix as lines of tab-separated cell entries -- ready for input to a word processor or a spreadsheet tool such as Excel -- click the button on the Results page marked "download tab separated results file."








     

    © 2003, Amanita-Bear Consulting.
    - Last modified: 07/31/2006