Bulk-loading, an Overview

[ moved to https://arctosdb.wordpress.com/documentation/bulkloader/bulkloader-tutorial/#overview ]

New specimen records are created from a single "flat file", usually a text file in which all data for a single cataloged item are in a single row. This file can be created with any convenient client-side application (often Microsoft Access). The file is then loaded into a similarly structured table on the server, and a server-side application (the bulk-loader) parses each row into the relational structure of the database.
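
As a rough illustration of what "parsing a row into the relational structure" means, here is a minimal Python sketch. It is not the bulk-loader's actual code: the column names, the collection name "DEMO:Mamm", and the three target tables are all assumptions made up for the example.

```python
import csv
import io

# Hypothetical flat file: one row per cataloged item.
# Column names are illustrative, not the real bulkloader template.
FLAT_FILE = """\
collection,cat_num,scientific_name,collector_1,collector_2,spec_locality
DEMO:Mamm,101,Sorex cinereus,A. Smith,B. Jones,Fairbanks
DEMO:Mamm,102,Sorex cinereus,A. Smith,,Fairbanks
"""


def parse_flat_rows(text):
    """Split each flat row into records for three illustrative tables."""
    specimens, collectors, localities = [], [], {}
    for row in csv.DictReader(io.StringIO(text)):
        # Keep one locality record per distinct locality string.
        loc_id = localities.setdefault(row["spec_locality"], len(localities) + 1)
        specimens.append({
            "collection": row["collection"],
            "cat_num": int(row["cat_num"]),
            "scientific_name": row["scientific_name"],
            "locality_id": loc_id,
        })
        # Each non-empty collector column becomes its own child record.
        for order, col in enumerate(("collector_1", "collector_2"), start=1):
            if row[col]:
                collectors.append({
                    "cat_num": int(row["cat_num"]),
                    "agent": row[col],
                    "order": order,
                })
    return specimens, collectors, localities


specimens, collectors, localities = parse_flat_rows(FLAT_FILE)
```

Two flat rows become two specimen records, three collector records, and one shared locality record.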

This approach makes keyboarding of data a client-side process, and thereby allows easy customization of data-entry applications. The process provides an independent layer of data checking before new information is incorporated into the database proper. Original data that are received in electronic format may require minimal manipulation; you can sometimes merely add the necessary columns to build a file in the bulk-loading format (download bulkloader templates or build your own with the Bulkloader Builder application).
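
Adding the necessary columns to data received electronically might look like the following Python sketch. The column list here is a made-up stand-in for a real template, and `to_template` is a hypothetical helper, not part of any bulkloader tool.

```python
import csv
import io

# Hypothetical template columns; the real bulkloader template defines its own.
TEMPLATE_COLUMNS = ["collection", "cat_num", "scientific_name",
                    "collector_1", "spec_locality"]


def to_template(source_text, defaults=None):
    """Return CSV text with every template column present, in template order."""
    defaults = defaults or {}
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=TEMPLATE_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in csv.DictReader(io.StringIO(source_text)):
        # Columns missing from the source get a default or an empty string.
        writer.writerow({col: row.get(col) or defaults.get(col, "")
                         for col in TEMPLATE_COLUMNS})
    return out.getvalue()


# Data as received: only two of the template columns are present.
received = "scientific_name,spec_locality\nSorex cinereus,Fairbanks\n"
converted = to_template(received, defaults={"collection": "DEMO:Mamm"})
```

Columns in the source that the template does not know about would need to be mapped by hand before a step like this.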


What the bulk-loader does:
  • The bulk-loader expects to find pre-existing values in the database for
    taxonomy, agent names, and higher geography.
    Data of these types that are not already in the system must be added prior
    to bulk-loading by a user who is privileged to modify those tables.
  • The content of several data fields is controlled by "code tables"
    which restrict the acceptable values for those fields.
    (The values you see in dropdowns in the editing screens come from these code tables.)
  • The bulk-loader evaluates each row in the submitted table.
    If the locality already exists in the database, then the bulk-loader
    uses the existing locality.
    If the locality does not exist, then (assuming that the row is otherwise acceptable)
    the bulk-loader creates the new locality and parses the row into the appropriate
    tables.
  • If the submitted table provides no catalog number, then the bulk-loader assigns the next sequential catalog number for the indicated collection.
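
The checks in the list above can be sketched in Python as follows. This is only an illustration under stated assumptions: the sets stand in for database lookups, the field names are invented, and the real bulk-loader's rules and error messages may differ.

```python
# All names and data below are hypothetical stand-ins for database lookups.
KNOWN_TAXA = {"Sorex cinereus"}
KNOWN_AGENTS = {"A. Smith"}
KNOWN_GEOGRAPHY = {"North America, United States, Alaska"}
CODE_TABLES = {"sex": {"male", "female", "unknown"}}


def evaluate_row(row, localities, last_cat_num):
    """Return (errors, locality_id, cat_num) for one flat row."""
    errors = []
    # Taxonomy, agents, and higher geography must already exist.
    if row["scientific_name"] not in KNOWN_TAXA:
        errors.append("unknown taxon: " + row["scientific_name"])
    if row["collector_1"] not in KNOWN_AGENTS:
        errors.append("unknown agent: " + row["collector_1"])
    if row["higher_geog"] not in KNOWN_GEOGRAPHY:
        errors.append("unknown higher geography: " + row["higher_geog"])
    # Code-table fields accept only the listed values.
    if row["sex"] not in CODE_TABLES["sex"]:
        errors.append("sex not in code table: " + row["sex"])
    if errors:
        # Reject the row without creating a locality or using a catalog number.
        return errors, None, None
    # Reuse an existing locality, or create one for an acceptable row.
    loc_id = localities.setdefault(row["spec_locality"], len(localities) + 1)
    # Use the supplied catalog number, else the next sequential one.
    cat_num = int(row["cat_num"]) if row.get("cat_num") else last_cat_num + 1
    return [], loc_id, cat_num


localities = {"Fairbanks": 1}
row = {
    "scientific_name": "Sorex cinereus",
    "collector_1": "A. Smith",
    "higher_geog": "North America, United States, Alaska",
    "sex": "female",
    "spec_locality": "Fairbanks",
    "cat_num": "",
}
result = evaluate_row(row, localities, last_cat_num=102)
```

Here the row passes every check, reuses the existing "Fairbanks" locality, and receives catalog number 103 because it supplied none.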

The template for this flat file handles the most common data but has limitations.
For example, if a specimen has five collectors, you could load the specimen and the
first three collectors with the bulk-loader, then add the other collectors by editing the online record. Or, you could request a change to the structure of the template.
(Ask Dusty nicely.)
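
When preparing such a file, the split between what fits the template and what must be added later can be made explicit. A minimal Python sketch, assuming three collector columns as in the example above and hypothetical column names `collector_1` through `collector_3`:

```python
def split_collectors(names, template_slots=3):
    """Separate collectors that fit the template from those added by hand later."""
    loadable = names[:template_slots]
    remaining = names[template_slots:]
    # Map the loadable names onto numbered template columns.
    columns = {f"collector_{i}": name for i, name in enumerate(loadable, start=1)}
    return columns, remaining


columns, remaining = split_collectors(
    ["A. Smith", "B. Jones", "C. Lee", "D. Park", "E. Diaz"]
)
```

The `remaining` list is what you would keep aside and enter through the online record after loading.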
