Discover Life in America

James Ashe - 23 July, 1999

Re: Unique identifiers & barcodes

Date: Fri, 23 Jul 1999 13:28:00 -0500
From: "James S. Ashe" <ashe@falcon.cc.ukans.edu>
To: "Robert K. Colwell" <colwell@uconnvm.uconn.edu>
CC: John Pickering <pick@pick.uga.edu>, sackley@compuserve.com,
        Steve Ashe <ksem@kuhub.cc.ukans.edu>, brianb@mizar.usc.edu,
        Gladys_Cotter@usgs.gov, christine.deal@intermec.com,
        faulzeitler@ascoll.org, mark_fornwall@usgs.gov,
        David Furth <Furth.David@NMNH.SI.EDU>,
        Winnie Hallwachs <whallwac@sas.upenn.edu>,
        Dan Janzen <djanzen@sas.upenn.edu>,
        "Norman F. Johnson" <Johnson.2@osu.edu>, mkaspari@ou.edu,
        Jack Longino <longinoj@elwha.evergreen.edu>,
        Scott Miller <scottm@bishop.bishop.hawaii.org>, becky_nichols@nps.gov,
        Chuck_Parker@nps.gov, msharkey@byron.ca.uky.edu, ctemple@intermec.com,
        cthompso@sel.barc.usda.gov, jugalde@inbio.ac.cr,
        Piotr Naskrecki <pin93001@uconnvm.uconn.edu>, windsord@tivoli.si.edu,
        dl@pick.uga.edu, "James S. Ashe" <ashe@falcon.cc.ukans.edu>,
        Rob Brooks <ksem@kuhub.cc.ukans.edu>
Subject: Re: Unique identifiers & barcodes

Hi Jack, Rob, and Everyone,

    Thanks for all your comments.  I wanted to respond to some of your
specific points and questions (in no particular order), and make a few
comments.  After looking over this e-mail I realize that my comments
have become long and burdensome.  I apologize, but these are complicated
issues that need careful discussion.

Rob, Thank you for clarifying some of the points about code 49 - and the
fact that a barcode reader can be set to recognize more than one code at
a time.  I was not aware of this, and it is very useful information.
Also, your point about the speed at which code 49 bar codes can be read
highlights a fundamental point that is often overlooked.  That is, the
protocols that one sets for how specimens are physically handled during
the databasing project are as important in facilitating rapid data entry
as are the tools that one chooses to use.  You point out that by
changing the timing of when the barcodes are read ("if reads are planned
for points in the protocol when specimens need to be "pulled" and
re-placed in the pinning trays anyway") the process is much more
efficient.  This is true of almost every aspect of the data entry
process.  We were also inexperienced when we were experimenting with
barcodes, so perhaps our relatively slow reading of code 49 labels was a
result of this.

    Your comments about the accuracy of code 49 reads are interesting.  A
persistent "rumor" about barcodes that developed early in their use in
entomology was that there were significant reading errors.  I saw
several demonstrations in which such reading errors occurred - all used
code 49 barcodes.  Upon reflection, I suspect that the problem was
related to the relatively inexpensive barcode readers that were in use
in those demonstrations - and was not directly the result of code 49
itself.

    Regarding our protocols for reading barcodes from above: it is true
that one cannot read barcodes of very tightly packed specimens from
above - in those instances, one needs to pick up the specimen and read
the barcode, then replace it. We usually do this when we are
transferring the specimens to a new unit tray, so the only advantage to
barcodes facing up, as opposed to down, is ergonomic (which basically
means that we are comfortable with it).  However, specimens in single
rows, or more loosely packed, are easily read from above at a slight
angle without removing the specimens from the tray.  We find that we
need to pick up the specimens to read them in about half the instances
in which the specimens are to remain in the same box.  We personally
like the fact that the specimen does not have to be turned over to read
the barcode even if it needs to be picked up - the motion seems very
natural.  However - and I want to emphasize this - I do not want to get
into a "my-method-is-better-than-your-method" exchange.  I do not doubt
that one can develop an efficient specimen handling protocol no matter
whether the label is placed up or down.  I think we seriously miss the
point when we place too much emphasis on this issue.  I offer an outline
of our protocols solely in the hopes that someone can benefit from our
experiences.  I also welcome suggestions and comments that will help us
improve our protocols.

Jack, In response to your question: "First I have a question for Steve:
as I understand it from your message, your label looks like

|||||||||||||||
SM1234567
KUNHM-ENT

So what exactly is the unique identifier in the database?
SM1234567KUNHM-ENT? SM1234567 KUNHM-ENT? KUNHM-ENT SM1234567?

It seems to me there should be nothing else on the label except the
symbology and the unique code, exactly as it should appear in a
database."

The answer is "Yes" - that is what they look like.  Our database
actually contains all three kinds of institutional identification
information.  The entries "KUNHM" and "entomology" are in separate
fields that are set to automatically fill in these as "default"
settings.  However, both can be changed via a user-modified "look-up"
table.  When the barcode is read, the "SM" is automatically stripped
from the number and written into a third field - this is, in effect, the
"barcode" institutional identifier (actually the database is designed to
strip all letters from the barcode - up to the first digit- and place
them in this field).  Your comment about the fact that only the unique
institutional identifier should be on the label is well taken.  We added
the additional text "KUNHM-ENT" to provide additional information for
anyone who was confused by "SM".  I'm not sure that our choices in this
matter were the best, and I welcome comments.
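
As a rough illustration of the stripping step described above, here is a
minimal Python sketch.  It is not the actual database code: the function
name split_barcode and the field names are hypothetical, and the two
defaults simply mirror the "KUNHM" and "entomology" default entries.

```python
import re

# Hypothetical defaults, mirroring the "KUNHM" and "entomology" default fields.
DEFAULT_INSTITUTION = "KUNHM"
DEFAULT_DIVISION = "entomology"


def split_barcode(raw, institution=DEFAULT_INSTITUTION, division=DEFAULT_DIVISION):
    """Split a scanned barcode into its leading letters and the remainder."""
    text = raw.strip()
    # Every letter up to the first digit goes into the prefix field;
    # everything from the first digit onward is kept as the specimen number.
    match = re.fullmatch(r"([A-Za-z]*)(\d.*)", text)
    if match is None:
        raise ValueError(f"no digits found in barcode: {raw!r}")
    prefix, number = match.groups()
    return {
        "institution": institution,
        "division": division,
        "barcode_prefix": prefix.upper(),
        # Stored as text so that leading zeros are preserved.
        "specimen_number": number,
    }


record = split_barcode("SM1234567")
print(record["barcode_prefix"], record["specimen_number"])
```

Keeping the number as text rather than as an integer is a design choice
made here so that codes with leading zeros survive a round trip.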

    Regarding "partial codes," it is probably worth revisiting the
rationale for institutional codes on the specimens.  The main reason is
to distinguish between specimens from different institutions that have
the same digital code, and, also to determine where the database that
includes that specimen resides (as pointed out by Chris Thompson).  In
effect, any number of institutions can have a specimen labeled number
"1234".  Obviously these are not the same individual specimens, so the
only way to ensure that each of these specimens (all the ones with
"1234" in our example) actually has a unique identifier is to include an
institutional identifier on the label (so that one can ascertain that the
specimen is "Smithsonian specimen 1234" rather than "Kansas specimen
1234").  This will become increasingly important as we begin to share
information among institutional databases.  Clearly, the institutional
identifier must be included as a part of the unique specimen
identification number, but I think that it can be either written, or
included in the barcode symbology (I think this is important because one
need not use barcodes to database specimens).  This institutional
identifier obviously needs to be distinctive, unambiguous and recognized
by all.  Upon reflecting on these requirements, I believe that we must
have a registry of such identifiers (with synonyms - for example, Jack
mentioned that INBio has used both INBIOCR000000, and IB000000 - and if
"SM" proves to be inadequate as an institutional identifier for the
Snow Entomological Collection, then "SM" will need to be included as a
synonym of whatever identifier is chosen).  The question is, who will
develop this registry and who will maintain it as an active and current
database (I think it should be a web-based database - rather than a
simple list).  Jack suggested earlier that we should take these issues
up at the ECN meetings in Atlanta - I think this is a great idea.  We
probably need to move quickly on this - more and more collections are
beginning to develop specimen databases - and the database sharing
protocols are being developed very rapidly.
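
To make the registry idea concrete, here is a minimal Python sketch of a
lookup with synonyms.  Everything in it is an assumption for illustration:
the names REGISTRY, build_lookup, and specimen_key are hypothetical, and
treating "INBIOCR" and "SM" as the canonical forms is an arbitrary choice,
not an agreed standard.

```python
# Hypothetical registry: canonical identifier -> set of known synonyms.
# The INBio codes are the two mentioned above; which one is "canonical"
# is an assumption made only for this example.
REGISTRY = {
    "INBIOCR": {"IB"},
    "SM": set(),
}


def build_lookup(registry):
    """Map every identifier and synonym to its canonical form."""
    lookup = {}
    for canonical, synonyms in registry.items():
        for name in {canonical, *synonyms}:
            key = name.upper()
            # An identifier claimed by two institutions would be ambiguous.
            if key in lookup and lookup[key] != canonical:
                raise ValueError(f"ambiguous identifier: {name}")
            lookup[key] = canonical
    return lookup


LOOKUP = build_lookup(REGISTRY)


def specimen_key(institution_code, number):
    """Return a (canonical institution, number) pair as the unique key."""
    canonical = LOOKUP.get(institution_code.upper())
    if canonical is None:
        raise KeyError(f"unregistered institutional identifier: {institution_code}")
    return (canonical, number)


# Synonyms resolve to the same specimen; equal numbers from different
# institutions remain distinct.
assert specimen_key("IB", "000123") == specimen_key("INBIOCR", "000123")
assert specimen_key("SM", "1234") != specimen_key("INBIOCR", "1234")
```

The point of the sketch is only that the institution and the number
together form the key, and that synonyms collapse to one institution
before any comparison is made.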

    I have got to stop for now - my best wishes to all.

Steve Ashe

Robert K. Colwell wrote:

> Steve,
>
> Thanks for the thoughtful and helpful contribution to the discussion.
> Just a couple of things about Code 49 and related matters...
>
> >As I mentioned in an earlier communication, we experimented a lot
> >with various codes before compromising on code 128 - including
> >considerable experimentation with code 49.  After these experiments, I
> >am very unenthusiastic about code 49 - we found that its limitations as
> >a data entry tool far outweighed its value as a tool for maintaining the
> >maximum amount of information.
>
> Of course, that depends on how much you value each of those two
> things.
>
> >Because it is triple stacked, one must
> >be able to scan all three lines of code before one can get an accurate
> >reading.  This means that the position of the barcode is limited to a
> >position on the specimen from which a very large portion of the label
> >can be "seen" by the scanner.  This is why most people who use code 49
> >place them upside down as the bottom label on the specimen.  This
> >increases the handling time, and we found it to be awkward.
>
> I could not agree more--in MY inexperienced hands. But it does not take
> long for "virtuoso" performance levels to develop that (e.g. for our ALAS
> parataxonomists) allow remarkably rapid, 100% first-try scans. And
> ergonomically, if reads are planned for points in the protocol when
> specimens need to be "pulled" and re-placed in the pinning trays anyway,
> the time cost of that part of the process is built in anyhow.
>
> We also use (adhesive) Code 49 barcodes on microscope slides for mites,
> in Project ALAS. In this case, there is certainly no advantage to
> single-row codes, since the slide could not be read in a slide box in
> any case.
>
> >In addition, triple-stacked codes take longer to read, and may produce
> >more errors.
>
> Once under the reader, they do not take any longer in experienced hands
> than single-row codes. As for errors, I think I can say that we have
> experienced absolutely NO reading errors with Code 49. If the reader beeps
> its confirmation, the code is always correctly read.
>
> But your comment made me wonder how accurately one can read the facing-up
> barcode of an INDIVIDUAL pinned specimen, in situ in a tightly packed
> drawer or unit tray. How can one be sure the correct specimen was "seen"
> by the reader?  If you have to take it out and scan it to make sure, then
> nothing is saved over Code 49. But I have no experience with scanning in
> situ in drawers/unit trays...so your comments would be helpful.
>
> >Probably the only way to achieve John's goal of being able to
> >instantly read all the barcodes in a tray of specimens from numerous
> >museums is to force everyone to use exactly the same barcode symbology.
> >If not, the reader will need to be recalibrated for each different
> >code.
>
> I believe you are mistaken. At least with the Intermec barcode readers we
> use (and I suspect they are representative of high-quality ones by other
> makers), the same reader can read any "mixed" series of codes that it
> has been enabled (programmed) to read. If enabled for Codes 39, 128, and
> 49, for example, it can read any mixture of labels with those codes, one
> after another.
>
> ALAS is not wedded to Code 49, except (as you are to 128) by prior
> investment, but I want to be sure that it is accurately represented.
>
> Best,
>
> Rob
>
> _________
> Robert K. Colwell, Dept. of Ecology & Evolutionary Biology, U-43
> University of Connecticut, Storrs, CT 06269-3042, USA
> Voice: 860-486-4395   Fax 860-486-6364
> colwell@uconn.edu
> Visit the Biota Website at http://viceroy.eeb.uconn.edu/biota
> & the EstimateS Website at http://viceroy.eeb.uconn.edu/estimates.



--
James S. Ashe
Division of Entomology
Snow Hall
KU Biodiversity Research Center/Natural History Museum
University of Kansas
Lawrence, KS 66045
U.S.A.

Phone: (785)-864-3030
Fax: (785)-864-5260
e-mail: ashe@falcon.cc.ukans.edu




