Discover Life in America

John Pickering - 23 July, 1999

Unique identifiers -- institution codes

Date: Fri, 23 Jul 1999 13:35:13 -0400
To: "Donald Windsor " <WINDSORD@tivoli.si.edu>
From: pick@pick.uga.edu (John Pickering)
Subject: Unique identifiers -- institution codes
Cc: sackley@compuserve.com, ashe@falcon.cc.ukans.edu, ksem@kuhub.cc.ukans.edu,
        brianb@mizar.usc.edu, colwell@uconnvm.uconn.edu,
        Gladys_Cotter@usgs.gov, christine.deal@intermec.com,
        faulzeitler@ascoll.org, mark_fornwall@usgs.gov,
        Furth.David@NMNH.SI.EDU, whallwac@sas.upenn.edu, djanzen@sas.upenn.edu,
        Johnson.2@osu.edu, mkaspari@ou.edu, longinoj@elwha.evergreen.edu,
        scottm@bishop.bishop.hawaii.org, becky_nichols@nps.gov,
        Chuck_Parker@nps.gov, msharkey@byron.ca.uky.edu, ctemple@intermec.com,
        cthompso@sel.barc.usda.gov, jugalde@inbio.ac.cr,
        pin93001@uconnvm.uconn.edu, windsord@tivoli.si.edu, dl@pick.uga.edu,
        idg@nhm.ac.uk

Don (and others),

Why do you think that we only need 7 or 9 bar-coded digits?  We currently
are using 4 letters and 6 digits, something that code 128 cannot do on a
small insect label.  Until we decide what to put on our labels, let's not
limit our options. It would be a big mistake for us (or STRI or anyone
else) not to use a unique institution code on museum labels.

Steve Ashe and KU have adopted an efficient LOCAL solution, using "KU" (or
"SM" ?) and 7 digits.  However, rather than following in KU's code 128
footsteps, we should all work toward developing an efficient GLOBAL
solution -- one that best allows us all to share specimens and data with
other institutions using unique identifiers.

For entomology collections, I recommend that we generally adopt the unique
institution identifiers in R. H. Arnett & G. A. Samuelson's publication
(1996.  Insect & Spider Collections of the World, E. J. Brill: Gainesville,
FL, pp. 220).  However, at this early stage in adopting unique identifiers
among collections, we should allow some exceptions to the Arnett &
Samuelson identifiers so as to protect the investment of anyone who has
already barcoded specimens.  Existing labels should take precedence over
Arnett & Samuelson's institution identifiers.  Thus, nobody else should use
INBIOCRI, KU (SM?), etc., as these identifiers are already used by
INBio/ALAS and KU.  I recommend that we ask the Association of Systematic
Collections (ASC) to maintain such a list and manage it in conjunction with
similar lists for herbaria, vertebrate, and other collections.
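
The precedence rule above (codes already printed on labels win, and nobody
else may reuse them) can be sketched as a small registry.  This is only an
illustration: the class and method names are hypothetical, and it does not
describe any list that ASC actually maintains.

```python
class InstitutionRegistry:
    """Toy registry of institution codes; the first registration wins."""

    def __init__(self):
        # Maps an upper-cased institution code to the institution that owns it.
        self._owners = {}

    def register(self, code, institution):
        """Claim a code, refusing if another institution already holds it."""
        key = code.strip().upper()
        owner = self._owners.get(key)
        if owner is not None and owner != institution:
            raise ValueError(f"{key} is already used by {owner}")
        self._owners[key] = institution
        return key

    def owner(self, code):
        """Return the institution holding a code, or None if it is free."""
        return self._owners.get(code.strip().upper())


registry = InstitutionRegistry()
# Codes already printed on barcoded specimens are registered first,
# so they take precedence over any later claim.
registry.register("INBIOCRI", "INBio/ALAS")
registry.register("KU", "University of Kansas")
```

With this in place, a later call such as registry.register("KU", "Another
Museum") raises ValueError instead of silently creating a duplicate code.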

Code 128 is an efficient local solution, but because it only allows 2
letters for an institution identifier, it is not a good global solution.
Code 49 labels could serve as a global solution.  Several issues to
consider:

1) Our first priority is to assure that each specimen receives a unique
identifier.  We must collectively come to a consensus on how to assign
these so that they do not conflict among institutions and projects.  If
code 128 works for a short acronym, such as KU, then that institution
should use code 128.  If an institution has a longer acronym, then they
should use code 49.  Let's standardize the unique identifiers, and let
institutions change their technology for making and reading them as they
see fit.

2) As Jack so aptly noted, there is a great danger in coding only part of a
unique identifier in the barcode symbology to save space.  What got left
out?  When we go to the Web to search for data on specimens, it becomes
difficult to share info if we are all using abbreviations for our unique
identifiers, either on our barcodes or in our databases.  As a society,
we're spending billions to solve the Y2K problem caused by "saving" the first 2
digits of the year.  Let's not repeat the folly of saving
characters -- disk space is cheap and code 49 can handle the larger
institutional identifiers that we need.  Yes, I too was guilty of cutting
my institutional "UGCA" out in some of my programs and data records.  It
saved disk space and was easier to work with just digits.  However, I now
have my own Y2K false economy, because I'm now mixing "INBio" specimens
with my "UGCA" records and specimens.  Like Jack, I'm now converting my
database to one that uses complete unique identifiers.  I now have no
abbreviations, no saving disk space, but complete compatibility with all
other collections that use unique identifiers and that don't economize on
digits!

3) Regarding reading code 49, in the hands of a trained operator it can be
exceedingly fast, on the order of a few seconds per specimen (if labels are
face up), a little longer if specimens need to be handled.  Error checking
is included in the symbology, so errors should occur very rarely.  Reading
with the Imager 1470, rather than with the old scanner, should be even
easier, as it works like a digital camera.  The better scanners and imagers
can be programmed to accept code 39, 49, 128, etc. so that we can read unit
trays of mixed codes.  In short, we should not all have to accept the same
technological solution, only one that keeps our identifiers unique.

4) The folks at Intermec are helping to develop and test the technology
that we need.  In a week or so, I expect to test their Imager 1470 in my
lab.  By next week Christy Deal hopes to have a set of proof labels printed
on their 3240 printer that I'll send to each of you who wishes to see them.
I recommend waiting on making any purchase decision until we are sure that
this new technology works.  It won't be long.

5) Because of error checking, we are likely to continue to use barcodes and
not eventually switch to reading alphanumerics on the labels.  Barcodes
include error checks to avoid mistaking an "F" plus a "leg" for an "E."
Hence, let's think long-term now and not assume that technology will bail
us out and let us read the rest of the label that is not in the barcode
symbol.

6) Finally, regarding putting barcodes face up versus face down, my vision
is that we will eventually have the hardware and software to scan a unit
tray of insects instantaneously if the labels are face up and the bugs are
not so large that they completely obscure the symbols.  Computer manufacturers
are already reading multiple barcodes on boards moving past sensors on
conveyor belts.  When Sprague was here we made a jpg image with my digital
camera of approximately 50 specimens in a unit tray.  He will test how well
his existing software can decode the barcodes in this image.  Although the
image's resolution appeared fine, we would need to develop some "leg
removing" algorithms before this becomes a practical solution to our needs.
Anyway, reading multiple barcodes simultaneously is closer to reality than
you may suspect.  After all, Ian Gauld's Daisy project is using artificial
intelligence to identify species from images of wing venation.
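
To make points 1 and 2 concrete, here is a minimal sketch of what goes wrong
when the institution code is dropped from an identifier.  The format (2 to 8
letters followed by 6 to 9 digits) is an assumption chosen to cover the
examples in this thread, the specimen numbers are made up, and the function
names are hypothetical.

```python
import re
from collections import Counter

# Assumed identifier format: an institution code of 2-8 letters
# followed by a specimen number of 6-9 digits.
ID_PATTERN = re.compile(r"^([A-Za-z]{2,8})(\d{6,9})$")


def parse_identifier(text):
    """Split a full identifier into (institution code, specimen number)."""
    match = ID_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a complete unique identifier: {text!r}")
    return match.group(1).upper(), match.group(2)


def collisions(keys):
    """Return the keys that occur more than once, sorted."""
    counts = Counter(keys)
    return sorted(key for key, count in counts.items() if count > 1)


# Made-up specimens from three collections sharing one database.
specimens = ["UGCA000123", "INBIOCRI000123", "KU0000123"]

# Keeping the complete identifier keeps every record distinct.
full = ["".join(parse_identifier(s)) for s in specimens]

# "Saving" characters by storing only the digits makes records collide.
digits_only = [parse_identifier(s)[1] for s in specimens]
```

Here collisions(full) is empty, while collisions(digits_only) reports
"000123" twice: the UGCA and INBIOCRI specimens can no longer be told apart.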

Must go to empty some traps in the Smokies.  More next week.

Cheers,
Pick





>Jorge,  It  seems we now need only 7 or 9 bar-coded digits
>on our labels. Thus the label can be made even smaller!  I
>definitely think our new labels should be code 128.  And, as
>you indicate, with the 1470 scanner we should have the
>ability to read our old code 49 labels.  I look forward to
>seeing your quotation for Todd.  Thanks,  Don
>
>>>> Jorge Enrique <jdiaz@pty.com> 07/21/99 03:19pm >>>
>Hi Donald!
>
>I have to say that we ran some tests with your 11 digits
>and could
>reduce the label to 2 cm x 1 cm. I am quoting the package to
>Dr. Capson.  On
>the other hand, the 1470 CCD imager can read code 49 now.
>This is a much
>more expensive scanner but I am sure it deserves to have one
>at your lab so
>you can handle both codes (49 and 128). I am working on the
>quotation to Dr.
>Capson with the 1470 too.
>
>Bye!
>
>At 10:12 AM 07/21/1999 -0400, you wrote:
>>Jorge,  Please read attached mail from a good friend who is
>>director of the Entomological collections at Univ. of
>>Kansas.
>>Using code 128 they can produce a very small label with all
>>the necessary information.  This is what we need to
>>duplicate here in Panama.  Cheers,  Don Windsor
>>
>>Received: from falcon.cc.ukans.edu
>> ([129.237.34.1])
>> by ic.si.edu; Wed, 21 Jul 1999 10:18:46 -0400
>>Received: from Steve.cc.ukans.edu by falcon.cc.ukans.edu
>(8.8.7/1.1.8.2/12Jan95-0207PM)
>> id JAA0000027897; Wed, 21 Jul 1999 09:14:24 -0500 (CDT)
>>Message-ID: <3795D615.1CC0FCF5@falcon.cc.ukans.edu>
>>Date: Wed, 21 Jul 1999 09:15:49 -0500
>>From: "James S. Ashe" <ashe@falcon.cc.ukans.edu>
>>X-Mailer: Mozilla 4.01 [en] (Win95; I)
>>To: John Pickering <pick@pick.uga.edu>
>>CC: sackley@compuserve.com, ksem@kuhub.cc.ukans.edu,
>brianb@mizar.usc.edu,
>>        colwell@uconnvm.uconn.edu, Gladys_Cotter@usgs.gov,
>>        christine.deal@intermec.com,
>faulzeitler@ascoll.org,
>>        mark_fornwall@usgs.gov, Furth.David@NMNH.SI.EDU,
>>        whallwac@sas.upenn.edu, djanzen@sas.upenn.edu,
>Johnson.2@osu.edu,
>>        mkaspari@ou.edu, longinoj@elwha.evergreen.edu,
>>        scottm@bishop.bishop.hawaii.org,
>becky_nichols@nps.gov,
>>        Chuck_Parker@nps.gov, msharkey@byron.ca.uky.edu,
>ctemple@intermec.com,
>>        cthompso@sel.barc.usda.gov, jugalde@inbio.ac.cr,
>>        pin93001@uconnvm.uconn.edu, windsord@tivoli.si.edu,
>dl@pick.uga.edu
>>Subject: Re: Unique identifiers & barcodes
>>X-Priority: 3 (Normal)
>>References: <v01540b1bb3b1211c46d3@[128.192.10.172]>
>>Mime-Version: 1.0
>>Content-Type: text/plain; charset=US-ASCII
>>Content-Disposition: inline
>>
>>Hi John and everyone,
>>
>>    This continuing discussion seems to indicate that we at
>>KU are the
>>only ones that like code 128.  I am surprised that it is so
>>uniformly
>>found to be inadequate.  Except for the compromise of
>>accepting only a
>>2-letter acronym, we find that code 128 exceeds other codes
>>in its
>>primary function of facilitating data entry.
>>
>>    I agree that if you want a long list of alpha
>characters
>>included in
>>
>>the barcode symbology then code 128 simply cannot be made
>>small enough.
>>We experimented with this considerably before deciding to
>>include a
>>shortened institutional identifier (we use "SM" for "Snow
>>Museum") in
>>the barcode symbology.   We also include the full
>>institutional
>>identifier (we use "KUNHM-ENT" for "University of Kansas
>>Natural History
>>
>>Museum - Division of Entomology") in an additional text
>>line.  The
>>additional text line is possible because a single stacked
>>barcode can be
>>
>>much narrower, so that one can get 3 lines of information
>on
>>the label
>>(we have the barcode symbology, the full alpha-numeric
>>written out, and
>>the full institutional identifier written out).  We
>>recognized that this
>>was a compromise.  However, we believed that it was
>>reasonable because
>>the single-stacked code 128 was small (our barcode labels
>>are 7 by 15
>>mm) and it is extremely easy, accurate and fast to read,
>and
>>allowed us
>>to place the barcode as the bottom label and still read it
>>from above.
>>Our primary goal was to make data entry as fast and
>accurate
>>as
>>possible, while at the same time preserving essential
>>information.  We
>>thought that the compromise of having a shortened
>>institutional
>>identifier in the symbology, and the full identifier
>written
>>out on the
>>label allowed one to easily determine the origin of the
>>specimen.  In
>>effect, we felt that the full written alpha-numeric
>specimen
>>identifier
>>is the only truly "archival" part of the label - the
>>barcode is
>>transitory technology.
>>
>>    As I mentioned in an earlier communication, we
>>experimented a lot
>>with various codes before compromising on code 128 -
>>including
>>considerable experimentation with code 49.  After these
>>experiments, I
>>am very unenthusiastic about code 49 - we found that its
>>limitations as
>>a data entry tool far outweighed its value as a tool for
>>maintaining the
>>
>>maximum amount of information.  Because it is triple
>>stacked, one must
>>be able to scan all three lines of code before one can get
>>an accurate
>>reading.  This means that the position of the barcode is
>>limited to a
>>position on the specimen from which a very large portion of
>>the label
>>can be "seen" by the scanner.  This is why most people who
>>use code 49
>>place them upside down as the bottom label on the specimen.
>
>>This
>>increases the handling time, and we found it to be awkward.
>
>>In
>>addition, triple-stacked codes take longer to read, and may
>>produce more
>>
>>errors.  I would be reluctant to change from code 128 to
>>code 49
>>symbology.  Still, if one can identify a barcode symbology
>>that combines
>>
>>the advantages of code 128 with the greater information
>>content of code
>>49, I would be glad to change our system.
>>
>>    We have considerable commitment to code 128 - a
>database
>>of over
>>180,000 specimens that have code 128 barcodes on them, and
>>an investment
>>
>>in about 100,000 additional code 128 labels.  It has served
>>our needs
>>well.  Nonetheless, I have no enthusiasm for being the only
>>collection
>>using code 128, if it fails to meet community needs for
>>institutional
>>identifier.  While I don't think that the specific
>symbology
>>used is
>>critical, the issue of consistency in institutional
>>identifiers is an
>>important one.  If we need to change to another barcode
>>symbology to
>>satisfy those needs, then we will.  But I strongly urge the
>>community
>>NOT to make the issue of barcode standards so stringent and
>>restrictive
>>that they serve as a disincentive for development of
>>accurate and
>>informative specimen databases.  These are our real goals -
>>barcodes
>>should only facilitate that end.   I am increasingly
>>concerned that
>>
>>the debate about barcode standards will discourage and
>>delay, rather
>>than energize people toward, development of specimen
>>databases.
>>
>>    Probably the only way to achieve John's goal of being
>>able to
>>instantly read all the barcodes in a tray of specimens from
>>numerous
>>museums is to force everyone to use exactly the same
>barcode
>>symbology.
>>If not, the reader will need to be recalibrated for each
>>different
>>code.  Still, I'm not sure that we can ever expect to
>impose
>>absolute
>>standards for barcode symbology on the community - and to
>>have that be a
>>
>>truly effective solution.  Who is going to police the
>>standards, and
>>how will they enforce them?  In addition, such a rigid
>>standard
>>discourages innovation, prevents incorporation of new
>>technology, traps
>>the community in a standard that will become archaic (note
>>that this
>>discussion began because of the possibility that code 49
>was
>>already in
>>danger of becoming outdated) and discourages people from
>>developing
>>databases who find it difficult to meet that standard.
>>
>>    The cost of barcoding specimens that John mentioned is
>>probably not
>>avoidable.  Good barcode scanners are essential for
>>efficient and
>>error-free reading - inexpensive ones don't work well
>enough
>>with the
>>small labels required for insect specimens.   We have saved
>>in the long
>>term by purchasing a bar-code printer (with the Division of
>>Botany), and
>>
>>we make our own.  We expect a long-term need for barcodes,
>>and the
>>potential use of hundreds of thousands, so such an
>>investment pays off
>>in the long run.  However, I would not recommend this
>>solution for all
>>users.  The initial cost for printer, barcode formatting
>>software,
>>scanner, label stock and printer ribbons was well over
>$5000
>>(this is
>>probably cheaper today than 4 years ago), not to mention
>the
>>considerable investment in time for learning to use the
>>difficult (to us
>>
>>anyway!) programs required to make customized barcodes.  In
>>addition,
>>both label stock and printer ribbons are a significant
>>continuing cost.
>>I agree that the entomological community is not likely to
>be
>>a large
>>enough market to significantly force the price down.  At
>>least on the
>>surface, there seems to be relatively little economy of
>>scale to be
>>gained because each institution requires that the barcodes
>>be
>>"customized" with their institutional identifier.
>>
>>    It is important for everyone to remember however, that
>>the cost of
>>barcodes should not be an excuse for failing to database
>>entomological
>>specimens.  It is easy to forget in this discussion that
>the
>>barcode is
>>only an identifier for a specimen.  Its sole purpose is to
>>increase the
>>
>>efficiency, speed and accuracy of entry of that identifier
>>into a
>>database.  The cost can be avoided by simply labeling
>>specimens with an
>>alpha-numeric label that identifies the specimen.  One
>>loses the
>>data-entry advantages of barcodes, but the goal of a unique
>>identifier
>>for each specimen is fully achieved by the "low-tech"
>>alpha-numeric
>>printed label.  Most vertebrate collections do not use
>>barcodes because
>>they handle so few specimens (relative to insect
>>collections) that the
>>simple alpha-numeric identifier on the specimen label is
>>adequate for
>>databasing standards - and many of these vertebrate
>>collections have
>>fully functioning databases of their collections that do
>not
>>suffer in
>>the slightest from the fact that their collection is not
>>barcoded.  On
>>the other hand, coding the specimen identifier into a
>>barcode improves
>>the handling rate (and accuracy) by several seconds per
>>specimen.  These
>>
>>seconds add up into very large blocks of time when the
>>number of
>>specimens that entomologists handle is taken into account.
>>
>>    I hope these random thoughts are useful additions to
>our
>>discussion.  My best wishes to all.
>>
>>Steve Ashe
>>
>>
>>
>>> >******************************************************
>>> >John T. Longino
>>> >Lab I, The Evergreen State College
>>> >Olympia WA 98505 USA
>>> >longinoj@evergreen.edu
>>> >Ants of Costa Rica on the Web at
>>http://www.evergreen.edu/ants
>>> >Project ALAS at
>>http://viceroy.eeb.uconn.edu/ALAS/ALAS.html
>>> >******************************************************
>>
>>
>>
>>--
>>James S. Ashe
>>Division of Entomology
>>Snow Hall
>>KU Biodiversity Research Center/Natural History Museum
>>University of Kansas
>>Lawrence, KS 66045
>>U.S.A.
>>
>>Phone: (785)-864-3030
>>Fax: (785)-864-5260
>>e-mail: ashe@falcon.cc.ukans.edu
>>
>>
>>
>>
>Jorge Enrique Diaz V.
>Gerente General
>BarCode de Panama, S.A.
>
>




