Duplicate Agents

[ moved to https://arctosdb.wordpress.com/documentation/agent/#duplicate ]

There are duplicate agents in Arctos. What am I supposed to do about that?

First, make sure that there are actually two separate agent records. Identical names shared between or among different agents are a separate issue and are not covered here. Duplicate agents are two or more agent records that refer to the same physical entity (THAT PARTICULAR John Smith; US Fish and Wildlife Service). Duplicate agents need not share a name; in fact, they are often introduced through misspellings. The "Agent Activity" link is a good place to confirm that you're dealing with real duplicates.

Once you're sure you have duplicate agents, flag one of them as a "bad duplicate of" the other in Agent Relations. Generally, you will want to flag the record with the least complete information and/or the least activity. Flagging alone changes nothing else: the application that actually performs the deletion must then be run manually.
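The rule of thumb above (flag the record with the least complete information and/or the least activity) can be sketched as a small comparison. This is only an illustration: the record layout, the field names, and the function pick_bad_duplicate are hypothetical and are not part of Arctos, and ranking completeness ahead of activity is an assumption. In practice the choice is a judgment call made while looking at Agent Activity.

```python
def pick_bad_duplicate(a, b):
    """Return (bad, good): the record to flag as a "bad duplicate of" the other.

    Hypothetical record layout: {"name": str, "fields": dict, "activity_count": int}.
    """
    def score(agent):
        # Count non-empty fields first, then use activity as the tie-breaker.
        filled = sum(1 for value in agent["fields"].values() if value)
        return (filled, agent["activity_count"])

    bad, good = sorted([a, b], key=score)
    return bad, good


# Two made-up records for the same person, one introduced by a misspelling.
misspelled = {
    "name": "Jon Smith",
    "fields": {"first_name": "Jon", "last_name": "Smith", "remarks": ""},
    "activity_count": 2,
}
original = {
    "name": "John Smith",
    "fields": {"first_name": "John", "last_name": "Smith", "remarks": "collector"},
    "activity_count": 40,
}

bad, good = pick_bad_duplicate(misspelled, original)
print(bad["name"], "is a bad duplicate of", good["name"])
```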

Once agents are flagged, the actual agent cleanup process (Manage Data/Agents/Merge Bad Dup Agent) often involves individual analysis of agents in roles. Arctos will not merge agents used in Publications, those with Addresses, or otherwise differentiated at the Agent Name level. You may have to address errors individually.

Bulk-loading, an Overview

[ moved to https://arctosdb.wordpress.com/documentation/bulkloader/bulkloader-tutorial/#overview ]

New specimen records are created from a single "flat file," usually a text file in which all data for a single cataloged item are in a single row. This file can be created with any convenient client-side application (often Microsoft Access). The file is then loaded into a similarly structured table on the server, and a server-side application (the bulk-loader) parses each row into the relational structure of the database.
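To make the flat-file idea concrete, here is a minimal sketch of one row being split into separate related pieces. The column names, the collection name EXAMPLE:Mamm, and the toy table layout are all invented for illustration; they are not the real bulkloader template or the real Arctos schema.

```python
import csv
import io

# Hypothetical flat file: every column for one cataloged item sits in one row.
FLAT_FILE = """collection,scientific_name,collector_1,collector_2,higher_geog,spec_locality
EXAMPLE:Mamm,Sorex cinereus,J. Smith,A. Jones,"North America, United States, Alaska",Fairbanks
"""


def split_row(row):
    """Split one flat row into per-table pieces (a toy relational layout)."""
    names = [row[key] for key in ("collector_1", "collector_2") if row.get(key)]
    return {
        "cataloged_item": {"collection": row["collection"]},
        "identification": {"scientific_name": row["scientific_name"]},
        # One flat row can fan out into several collector rows.
        "collectors": [
            {"agent_name": name, "order": position}
            for position, name in enumerate(names, start=1)
        ],
        "locality": {
            "higher_geog": row["higher_geog"],
            "spec_locality": row["spec_locality"],
        },
    }


rows = list(csv.DictReader(io.StringIO(FLAT_FILE)))
parsed = [split_row(row) for row in rows]
print(parsed[0]["collectors"])
```

The point of the sketch is only the shape of the transformation: data entry stays a simple one-row-per-item task, and the splitting into related records happens once, on the server.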

This approach makes keyboarding of data a client-side process, and thereby allows easy customization of data-entry applications. The process provides an independent layer of data checking before new information is incorporated into the database proper. Original data that are received in electronic format may require minimal manipulation; you can sometimes merely add the necessary columns to build a file in the bulk-loading format (download bulkloader templates or build your own with the Bulkloader Builder application).

What the bulk-loader does:
  • The bulk-loader expects to find pre-existing values in the database for
    taxonomy, agent names, and higher geography.
    Data of these types that are not already in the system must be added prior
    to bulk-loading by a user who is privileged to modify those tables.
  • The content of several data fields is controlled by "code tables"
    which restrict the acceptable values for those fields.
    (The values you see in dropdowns in the editing screens come from these code tables.)
  • The bulk-loader evaluates each row in the submitted table.
    If the locality already exists in the database, then the bulk-loader
    uses the existing locality.
    If the locality does not exist, then (assuming that the row is otherwise acceptable)
    the bulk-loader creates the new locality and parses the row into the appropriate
  • If no catalog number is provided by the submitted table, then the bulk-loader provides the next sequential catalog number for the indicated collection.
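The checks listed above can be sketched roughly as follows. This is a simplified illustration, not the actual bulk-loader: the function load_row, the in-memory db dictionary, and all column names are hypothetical. Matching a locality on two fields and taking "next sequential" to mean the highest existing number plus one are assumptions made for the example.

```python
def load_row(row, db):
    """Check one flat-file row and load it into `db` if it is acceptable.

    Returns a list of error strings; the list is empty when the row was loaded.
    """
    errors = []
    # Taxonomy, agent names, and higher geography must already exist.
    if row["scientific_name"] not in db["taxonomy"]:
        errors.append("unknown taxon: " + row["scientific_name"])
    if row["collector"] not in db["agent_names"]:
        errors.append("unknown agent: " + row["collector"])
    if row["higher_geog"] not in db["higher_geography"]:
        errors.append("unknown higher geography: " + row["higher_geog"])
    # Code-table fields accept only the values listed in their code table.
    for field, allowed in db["code_tables"].items():
        if row[field] not in allowed:
            errors.append("bad value for " + field + ": " + row[field])
    if errors:
        return errors

    # Reuse a matching locality; otherwise create a new one.
    loc_key = (row["higher_geog"], row["spec_locality"])
    if loc_key not in db["localities"]:
        db["localities"][loc_key] = len(db["localities"]) + 1

    # Supply the next catalog number when the row does not provide one.
    used = db["cat_nums"].setdefault(row["collection"], set())
    cat_num = int(row["cat_num"]) if row.get("cat_num") else max(used, default=0) + 1
    used.add(cat_num)
    db["items"].append((row["collection"], cat_num, db["localities"][loc_key]))
    return []


# Made-up starting state and rows.
db = {
    "taxonomy": {"Sorex cinereus"},
    "agent_names": {"J. Smith"},
    "higher_geography": {"North America, United States, Alaska"},
    "code_tables": {"sex": {"male", "female", "unknown"}},
    "localities": {},
    "cat_nums": {"EXAMPLE:Mamm": {1, 2}},
    "items": [],
}
good_row = {
    "collection": "EXAMPLE:Mamm",
    "cat_num": "",
    "scientific_name": "Sorex cinereus",
    "collector": "J. Smith",
    "higher_geog": "North America, United States, Alaska",
    "spec_locality": "Fairbanks",
    "sex": "female",
}
bad_row = dict(good_row, collector="Jon Smith", sex="F")

good_result = load_row(good_row, db)
bad_result = load_row(bad_row, db)
print(good_result, bad_result)
```

In this sketch a misspelled agent name or a value missing from a code table stops the row before anything is written, which mirrors the "independent layer of data checking" described earlier.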

The template for this flat file handles the most common data but has limitations.
For example, if a specimen has five collectors, you could load the specimen and the
first three collectors with the bulk-loader, then add the other collectors by editing the online record. Or, you could request a change to the structure of the template.
(Ask Dusty nicely.)