Progress report & proposal to NBII, November, 2004

Web tools to identify, report,
and map invasive species

Cooperative Agreement
between
USGS-NBII
and
The Polistes Foundation

John Pickering
University of Georgia, Athens

26 November, 2004

Rhamnus frangula, Glossy Buckthorn
an invasive species of New England
Illustration by Cheryl Reese, 2004

Overview
Our ultimate goal is to provide everyone with the technology they need to identify, report, and help monitor the distribution and movement of species worldwide. Our first objective is to build identification guides to the flora and fauna of North America so that the general public can find and accurately identify target species. Our second objective is to provide additional tools that will enable school teachers, students, and other citizens to learn the concepts and methods of biodiversity discovery and then help scientists study and monitor species in their local communities.

Invasive species threaten our health, food supply, economy, environment, and general well-being. Our priority is to develop, test, and implement the human and technical infrastructure needed to vastly improve our quarantine, rapid detection, monitoring, and response to noxious species. The following sections describe our technology, partnership, progress, and proposal for 2005.

Technology
Discover Life uses server-side technology to gather and share information over the Web. Its 20q and other software modules are licensed from The Polistes Corporation at no cost. They provide the power behind the IDnature guides, Global Mapper, and other tools on Discover Life. Topozone.com provides users with maps and aerial photographs for displaying the distribution of species. Other partners provide images, data, and Web services that 20q integrates. Discover Life's servers at the University of Georgia, www.discoverlife.org, and Missouri Botanical Garden, usmo4.discoverlife.org, make IDnature guides and other tools freely available to Web users. In October, 2004, they served over 1,045,000 pages and images to approximately 40,000 users. See www.discoverlife.org/pa/or/polistes/fe for more details of Discover Life's technology and other features.

Partnership
Since 2002, the U. S. Geological Survey's National Biological Information Infrastructure (NBII) has supported The Polistes Foundation and its partners to build Web-based identification guides to North America's flora and fauna. As described in our 2002 and 2003 proposals, the NBII-Polistes partnership started by developing guides to common North American butterflies, caterpillars, wildflowers, and invasive species. In May, 2004, NBII and The Polistes Foundation signed a 5-year cooperative agreement to use Web tools to identify, report, and map invasive species in North America. NBII funded the 2004 proposal to build guides to birds, frogs & toads, salamanders, snakes, turtles and trees & shrubs in North America.

Progress
The Polistes Foundation aims to complete IDnature guides to a million species worldwide within 10 years (see www.discoverlife.org/pa/or/polistes/business_plan.html). As part of this plan, we intend to build online guides to all the described species in North America. In addition to the guides previously proposed to NBII, The Polistes Foundation and its partners have begun working on North American ants, dragonflies, ferns, grasses, mammals, mosquitoes, Opuntia cacti, and ticks.

We build checklists and turn them into guides through the following 6-step process:

  1. Checklists
    We first assemble a list of valid scientific names, ideally vetted by a taxonomic expert for the group. To this we add the authorities and years species were described. If vernacular names and synonyms are in general use, we enter them into a 'common_name' field. We assign each taxon a 'path' that reflects the taxon's phylogeny. 20q software uses this 'path' to determine the directory in which it stores information about the taxon. For example, the 'path' for ants is 'Insect/Hymenoptera/Formicidae'. Users can access checklists independently from guides. For example, usmo4.discoverlife.org/mp/20q?act=x_checklist&guide=Ants is the checklist precursor upon which the ant guide, usmo4.discoverlife.org/mp/20q?guide=Ants, is built.
  2. Species pages
    Once a species is in a checklist, we link it to images, maps, and other information on Discover Life, in our partners' databases, and on other Websites. This step includes taking and processing original photographs, building maps from spatial databases, and finding and entering URL's of relevant images, pages, and maps from around the world. 20q outputs most of the information that we assemble for each species as a single page. It puts off-site information from other Websites into the page on the fly, thus avoiding serving stale information.
  3. Guides
    Checklists become guides after we assign character-state attributes to kinds within them. We select an initial set of attributes that divide kinds into small groups. Once we score kinds for these attributes, guide users can narrow what they are identifying to a small number of possibilities, but not necessarily to a single kind.
  4. Resolving
    In 2004, we added a 'resolve' tool to 20q. This tool enables us to determine which kinds are unresolved, i.e., have similar attributes to other kinds in the guide. We then fine-tune the guide by adding and scoring new attributes for the kinds that don't resolve. Once a guide contains no unresolved kinds, users can identify everything to a single kind.
  5. Illustrations
    We add illustrations, photographs, and explanations to help users understand character-states.
  6. Feedback
    Even when a guide is illustrated and fully resolves, it may be difficult to use. Our final step is to test and improve the guides. We get feedback from schools and other users and then refine the guide accordingly. This is an iterative process. Typically it involves simplifying guides by removing redundant and obscure character-states.
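The 'resolve' step (step 4 above) can be illustrated with a minimal sketch. This is not 20q's implementation, which this report does not describe. The function name, kind names, and attribute names are hypothetical, and the sketch assumes that two kinds are unresolved when they are scored identically on every attribute.

```python
from collections import defaultdict


def unresolved_kinds(scores):
    """Group kinds that share an identical set of character-state scores.

    `scores` maps a kind name to a dict of attribute -> state.
    Returns a sorted list of groups (each with two or more kinds) that a
    user could not tell apart with the attributes scored so far.
    """
    groups = defaultdict(list)
    for kind, attrs in scores.items():
        # A hashable "signature" of everything scored for this kind.
        signature = frozenset(attrs.items())
        groups[signature].append(kind)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)


# Hypothetical scores, for illustration only.
example_scores = {
    "Kind A": {"petiole nodes": "one", "color": "red"},
    "Kind B": {"petiole nodes": "one", "color": "red"},
    "Kind C": {"petiole nodes": "two", "color": "red"},
}

print(unresolved_kinds(example_scores))
```

In this toy example Kind A and Kind B remain unresolved. Scoring one further attribute that separates them would empty the list, which corresponds to a guide that fully resolves.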

The following table summarizes our progress on North American groups.

IDnature guides for North America
Columns: Group | Species (North America estimate) | Species (currently in checklist) | Species (proposed to NBII, for year) | Kinds (currently in guide) | Kinds (unresolved) | Illustrations (drawn) | Illustrations (photo) | Illustrations (none)
Ants 740 + 20% unknownall known ('05)819813 228021
Bees 4,500339 funded by USGS-BRD519211 55841,352
Beetles >20,00023
448 0646
Birds 995all 1,200 ('04)1,75017 720276
Butterflies 772all >600 ('02)1,2101034 950145
Caterpillars 15,000153 150 ('02)154129 02369
Damselflies 128 + unknown128

checklist only


Dragonflies 311all
316315 4130
Earthworms 150 + unknown118 ('05)118checklist only


Ferns 1,100471
150checklist only


Freshwater fish >9000 ('05)




Fungi 5,000144
15739 0085
Frogs & Toads 98all 102 ('04)10272 01204
Grasses 3,100615
616checklist only


Invasive species >1,0611,061 >350 ('02)1,1031,030 5648619
Liverworts 50024
24? 06296
Mammals 455all
468447 100400
Millipedes >9000





Mosquitoes 165 + unknown165
16643 820143
Moths >14,0000 ('05)




Opuntia Cacti 30all ('05)50checklist only


Salamanders 154all 70 ('04)18556 250165
Sawflies 1,0000 ('05)




Snakes 137all 300 ('04)304169 190146
Spiders 550 genera0 ('05)




Ticks 177 + unknown177
712540 600160
Turtles 57all 50 ('04)8414 400164
Trees 6,2001,057 common trees; important shrubs ('04)1,0551053 5247765
Wildflowers 17,8002,802 >600 ('02)2,790>1,000 4937329

The following explains the above columns:


Proposal for 2005

From: John Pickering <pick@discoverlife.org>
To: Annie Simpson <asimpson@usgs.gov>

26 November, 2004

Annie Simpson
National Biological Information Infrastructure
USGS

Annie,

Here is our proposal for FY2004-5 totaling $80,000. The first part is to further develop IDnature guides for North American species and potential alien invasive ones. In the second part, we propose to adapt and test the reporting and mapping tools on Discover Life and Topozone.com so that schools and the general public can better help us detect, map, and monitor invasive ants. If it meets your approval, please relay it to Gladys. If not, send me your recommendations, and I'll change it.

Cheers,
Pick
________________________________________________________
John Pickering Office: 706-542-1115
517 Biological Sciences Building FAX: 706-542-3344
University of Georgia Denise Lim: 706-542-6676
Athens, GA 30602-2602 Department: 706-542-3379
e-mail: pick@discoverlife.org Home: 706-353-7076
URL: www.discoverlife.org/who/Pickering,_John.html
________________________________________________________

Title: Web tools to identify, report, and map invasive species in North America -- 2nd year continuation.

Principal Investigator: John Pickering, University of Georgia, Athens

Administrative Contact: Kevin Weick, The Polistes Foundation, www.discoverlife.org/who/Weick,_Kevin.html

Summary: We are developing technology to overcome the two major hurdles that greatly impede citizens from contributing to the study and management of biological diversity and other natural resources. Many schools and volunteer organizations could help detect and manage invasive species, for example, simply by studying nature in their local communities and reporting what they find. However, most are prevented from contributing valuable data. They lack the ability to identify target species reliably and cannot easily share their findings with others. They need identification guides that are illustrated, non-technical, and can be successfully used by beginners with minimal training. They need intuitive data reporting tools that empower novice users to locate their study sites accurately, database their findings, and exchange high quality information through maps that filter data by source and reliability.

Here we propose to continue building interactive guides that will help users identify invasive species in North America. Specifically, in 2005, we propose to add at least 3,000 species to Discover Life's North American checklists and guides. Whenever possible we propose to link each of these species to high quality images, maps, and text from our partners' databases and other Websites. We propose to target the following groups for which we do not currently have checklists: freshwater fish, millipedes, moths (Arctiidae; Lymantriidae; Saturniidae; Sphingidae), sawflies, and spiders (Araneidae; Salticidae; Tetragnathidae). In addition, we propose to further develop the existing checklists and guides listed in the above table. Specifically we propose to further resolve, illustrate, simplify, and test the 10 guides that were previously funded by NBII (see Table, column 4) and guides to North American ants, earthworms, and Opuntia cacti, which were not previously proposed. The proposed checklists and guides will include native species, known alien invasive species, and exotic species that are most likely to invade the United States and its territories. We will make these products freely available to Web users through Discover Life's servers at the University of Georgia and Missouri Botanical Garden, www.discoverlife.org and usmo4.discoverlife.org, respectively. They will give teachers, students, citizen scientists, land managers, and scientists alike powerful tools to distinguish invasive species from each other and from their native look-alikes.

Discover Life, in partnership with Topozone.com, has developed Web tools to report and map information about species. Part II of this proposal is to learn how we should apply these tools so that they best meet the needs of agencies and organizations that want schools and the general public to help them detect, map, and monitor invasive species. We propose to hold a summer training workshop followed by multi-site collecting events to find and map invasive ants. Using Discover Life's identification, databasing, and mapping tools, the workshop will train teachers and other local leaders how to sample ants, identify and preserve specimens, database their findings, and build maps. We will invite participants from our network of partners, which includes organizations in 15 states and the District of Columbia, as listed below. Following the workshop, we will help participants organize classes and other events to sample ants in their schoolyards, parks, gardens, and other local areas. With the help of students and volunteers, we will test the IDnature guide to North American ants and the reporting tool. From their perspectives, we will learn what works, what needs to be fixed, and what new features they would like added.

We propose to do the following:

  1. Guides
    Using The Polistes Corporation's 20q software (see www.discoverlife.org/pa/or/polistes/fe and www.discoverlife.org/nh/id), we propose to build IDnature guides to distinguish invasive species from each other, from their native North American look-alikes, and from common exotic pets and ornamentals.
    • We propose to enhance existing checklists and guides so that users can more easily identify over 8,000 kinds of invertebrates, vertebrates, trees, and wildflowers in North America, including: Where appropriate, these guides will distinguish different sexes, seasonal morphs, and different life stages, such as eggs, juveniles, adults, fruits, seeds, flowers, etc. If resources permit, we will also enhance the mammal guide, usmo4.discoverlife.org/mp/20q?guide=Mammalia.
    • We propose to build checklists of freshwater fish, millipedes, moths, sawflies, and spiders. If available from our partners' databases and other Websites, we will link target species within these to high quality images, maps, and text. These annotated checklists will provide us with a sound foundation for building IDnature guides for these groups in 2006.
    • We will maintain an invasive species page, www.discoverlife.org/nh/tx/INVASIVES, and enhance our online annotated checklist of North American invasives, usmo4.discoverlife.org/mp/20q?guide=North_American_Invasives, adding new species as requested by NBII. Started in 2002, this checklist currently links to information on 1,103 kinds of invasives.
    • We will maintain an integrated plant guide, usmo4.discoverlife.org/mp/20q?guide=Plants, that includes the 2,700+ wildflowers that we have added with Kay Yatskievych and her volunteers at Missouri Botanical Garden, 1,000+ trees and shrubs, and eventually, ferns, grasses, and other plants. Our illustrators and technical staff will provide general support to Missouri Botanical Garden and other partner organizations that contribute to building this integrated guide.
    • The proposed guides' underlying data structure will be in XML, as specified by the schema for IDnature guides at
      www.discoverlife.org/ed/tg/Building_Web_Pages/20q_xml_tags.html
      These XML files will be put on Discover Life and made available to everyone through the software's export_xml function under Menu.
    • We will maintain one master guide to vertebrates and subguides as appropriate. The structure will follow the one we built for Butterflies that draws information from XML files for each butterfly family using the above schema's <include> tags. For an example of such an integrated guide, please see the Plant guide above.
    • The proposed guides will include scientific names, other names, character-state attributes used in identification, illustrations of character-states, and links to images and information about each species.
    • Cheryl Reese (www.discoverlife.org/who/Reese,_Cheryl.html) and other artists will illustrate the morphological character-states in the proposed guides. The guides will present their illustrations as thumbnails that link to higher resolution images. John Pickering will retain ownership and copyright of these illustrations. The public will be allowed to use them for non-profit purposes so long as they credit the illustrator and specify a link to Discover Life where used.
    • Technicians will support the proposal by processing images, scoring attributes, and linking the guides to images and information for each species. They will try to link each species in the guides to at least one high-quality photograph, a distribution map, textual information from fact sheets, such as from the Global Invasive Species Database, and taxonomy from ITIS.
    • Where appropriate the guides will distinguish kinds within species, allowing users to identify distinct kinds to sex, age, region, seasonal characters, and life stages. For species that cannot be distinguished by field markings, the guides will resolve identifications to all possible species or to a higher category that includes a species.
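The master-guide arrangement described above, in which one guide draws on per-family XML files through <include> tags, can be sketched as follows. This is only an illustration of the idea under stated assumptions: apart from the <include> tag named in this proposal, every element name, the `src` attribute, the file names, and the kind names are hypothetical and are not taken from the 20q schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical family-level files, held in memory for the example.
# Only the <include> tag name comes from the proposal text; <guide>,
# <kind>, and the `src` attribute are assumptions.
FAMILY_FILES = {
    "Papilionidae.xml": "<guide><kind name='Kind P1'/></guide>",
    "Pieridae.xml": "<guide><kind name='Kind Q1'/><kind name='Kind Q2'/></guide>",
}

MASTER = """
<guide name='Butterflies'>
  <include src='Papilionidae.xml'/>
  <include src='Pieridae.xml'/>
</guide>
""".strip()


def expand_includes(master_xml, files):
    """Replace each top-level <include> with the children of the file it names."""
    root = ET.fromstring(master_xml)
    for inc in list(root.findall("include")):
        sub = ET.fromstring(files[inc.get("src")])
        index = list(root).index(inc)
        root.remove(inc)
        # Insert the included kinds where the <include> tag used to be.
        for offset, child in enumerate(list(sub)):
            root.insert(index + offset, child)
    return root


merged = expand_includes(MASTER, FAMILY_FILES)
kind_names = [k.get("name") for k in merged.findall("kind")]
print(kind_names)
```

Keeping one file per family means a family can be revised without touching the master guide, which is the benefit the subguide structure is meant to provide.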

  2. Reporting & mapping
    Discover Life's new software tools can integrate time-sensitive, geo-spatial data from numerous sources into real-time maps. With these tools society could collect and share vast quantities of quality controlled information via the Web to help monitor and manage alien invasives. As a pilot project towards this vision, we wish to evaluate how well students, teachers, and other citizens can use our tools to identify invasive ants, report their findings, and map target species.

    As described below under "Global Mapper & institutional databases," Discover Life serves maps and aerial photographs overlaid with distributional information from our partners' databases. Unfortunately, these data are often out-of-date; typically years, even decades old. Up-to-date maps of fine-grain accuracy would greatly aid local managers to coordinate the early detection and rapid response to invasives. At larger scales, up-to-date global and regional maps would better alert quarantine inspectors and trade policy makers to potential dangers associated with movement across borders. One avenue to generating such maps is to harness the energy of schools and volunteer organizations. Here we propose a workshop and pilot project to determine how best to implement and test a real-time system that will let Web users report geo-referenced data on invasive ants and map their collective effort.

    • WORKSHOP

      Title -- How to map invasive ants in real-time

      Goals

      • evaluate the existing online reporting and mapping tools
      • prioritize new functions users need to report and map invasive species
      • train teachers and community leaders how to sample ants
      • test the IDnature guide to ants on them
      • start to organize ant collecting events in at least 10 states
      • anticipate pitfalls and possible solutions
      • develop evaluation process and criteria for success

      We propose to organize a 2-day workshop in the summer of 2005 for approximately 15-20 participants representing

      • Data users: agencies and organizations concerned with invasive species, including representatives from the U. S. Invasive Species Council and GISIN/Global Invasive Species Program
      • Educators & other data providers: teachers and organizations that could organize individuals to participate, contribute data, and learn from their experience, such as Balsam Mountain Trust, First Hand Learning, GLOBE, Great Smoky Mountains National Park, Biodiversity Days, Missouri Botanical Garden, Roots & Shoots
      • Technical experts & scientists with experience in ant research, Web reporting, data quality assurance, and mapping, including BioNET International, Nature Mapping, NatureServe, Topozone.com, and USGS (Breeding Bird Survey; Frog Watch)

      The agenda will address issues that include:

      • Needs of data users: research design and protocols, which fields of data to collect, accuracy required, etc.
      • Needs of data providers: learning experience, training required, data ownership, what data may be shared, what must be kept private (e.g., reports on rare and endangered species), etc.
      • Technical issues: security, data quality assurance, how to distribute information to users, integration with other efforts, etc.
      • Action plan: outline next steps

    • PILOT PROJECT

      Title -- Events to map invasive ants from localities across the United States

      Goals

      • determine how accurately students and citizen scientists can use the IDnature guide to identify invasive and native ants
      • design standard research protocols that will allow students and other participants to monitor and compare ant communities
      • evaluate standard protocols through replicate sampling over time and by different individuals
      • examine the accuracy of the data collected
      • assess the enjoyment and learning experience of participants
      • modify the identification, reporting, and mapping tools based on what we learn from this pilot project
      • make data widely available
      • share the modified system and methods with other schools and organizations wishing to monitor ants and other things of concern

      Events
      In August through October, 2005, we propose to organize ant collecting events (see Nature Days, www.discoverlife.org/pa/ev/me/Nature_Days.overview.html) for schools and organizations in at least 10 states. We will select lead organizations and sites after consulting with our network of partners and their local schools in

      • Arizona (Ants of Arizona)
      • California (California Academy of Sciences)
      • District of Columbia (Georgetown University; Smithsonian Institution)
      • Florida (Florida Atlantic University)
      • Georgia (University of Georgia)
      • Hawaii (Bishop Museum; NBII-PBIN)
      • Illinois (Chicago Field Museum; University of Illinois)
      • Maryland (Patuxent Wildlife Research Center; Roots & Shoots)
      • Massachusetts (Brandeis University; Harvard University; Nuttall Ornithological Club)
      • Missouri (Missouri Botanical Garden; Shaw Nature Reserve)
      • Pennsylvania (Philadelphia Academy of Natural Sciences)
      • New Hampshire (Antioch New England Graduate School)
      • New York (First Hand Learning)
      • North Carolina (Balsam Mountain Trust; Highlands Biological Station)
      • Oregon (Evergreen State College; Nature Mapping)
      • Tennessee (Great Smoky Mountains National Park; SAMAB; NBII-SAIN).
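One of the pilot project's goals is to determine how accurately participants can identify ants with the guide. A minimal sketch of one possible measure follows. It assumes that each specimen identified by a volunteer is later checked by an expert, and it reports the fraction of specimens on which the two agree. The function name and the species labels are hypothetical, and the evaluation criteria themselves are to be developed at the workshop.

```python
def identification_accuracy(pairs):
    """Return the fraction of specimens where volunteer and expert agree.

    `pairs` is an iterable of (volunteer_identification, expert_identification).
    """
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no identifications to compare")
    correct = sum(1 for volunteer, expert in pairs if volunteer == expert)
    return correct / len(pairs)


# Hypothetical specimens, for illustration only.
example_pairs = [
    ("Species A", "Species A"),
    ("Species B", "Species B"),
    ("Species A", "Species C"),  # a misidentification
    ("Species C", "Species C"),
]

print(identification_accuracy(example_pairs))
```

The same comparison could be repeated per site, per participant, or per sampling date to examine replicate sampling over time and by different individuals.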

  3. Global Mapper & institutional databases
    As part of our ongoing 2004 funding from NBII, we proposed to integrate spatial point data into distribution maps for up to 20 Web-enabled databases that NBII selects and gets authorization from their owners. In 2005, we propose to expand this process, putting up to 30 new spatial databases of NBII's choosing online. For examples of such maps, see In viewing these, click on a map point to see its individual data record; click elsewhere on the map or aerial photograph to zoom in.

    Technical details of the Global Mapper, Topozone.com, and linking database are as follows:

    • We enable Web users to build and display maps with 20q's Global Mapper (see usmo4.discoverlife.org/mp/20m). This mapper is developed by the partnership between Topozone.com (see http://topozone.com) and The Polistes Corporation. It plots points on a composite satellite image of the globe and then allows users to zoom in through various layers to see detailed maps. Currently its base maps include a 1:1,000,000 scale map of the world, 17 million topo maps of the United States, and aerial photographs of 89% of the United States to 1 meter per pixel resolution. In total, approximately 20 terabytes of data reside on Topozone's servers and are used by the Global Mapper.
    • In addition to clicking on the satellite image from livingearth.com to zoom in on a place on the globe, the Mapper's banner includes the following four options:
    • Topozone's technology is completely compliant with Open GIS Consortium Web Mapping Service (OGCWMS). The Mapper overlays points on these maps through a thin CGI request. It would be technically straightforward to combine layers from other NBII nodes into the Mapper, so long as they are OGCWMS compliant.
    • Topozone.com has the largest, most up-to-date, and accurate set of maps and aerial photographs anywhere on the Web. It serves on the order of 300,000 maps per day, many of them to the public for free. These maps are copyrighted to Maps a la Carte, Inc. Some of Topozone's premium services are subject to user fees.
    • For compliant databases that provide unique identifiers with their point data, we add links on the maps to enable Web users to query data records associated with individual points. Currently we are doing this for data provided by databases at the Chicago Field Museum, Invasive Plant Atlas of New England, Missouri Botanical Garden, Philadelphia Academy of Natural Sciences, Smithsonian Institution, University of Georgia, University of Illinois, Utah State University Herbarium, and elsewhere.
    • The Global Mapper uses the NAD83 standard to plot points. We will require contributing databases to provide points in NAD83 or NAD27 standards and to specify which one they are using. The Global Mapper automatically switches from latitude-longitude to UTM coordinates at finer resolutions. Currently we accept points in either of these coordinate systems, but not in other ones.
    • We would prefer that partnering databases allow 20q to get points and associated records from them on the fly using HTTP GET or POST requests. 20q processes the data returned in various formats, including HTML, XML, and plain text. However, for databases that are not Web-enabled, but that are connected to the Internet, 20q can integrate data into the Global Mapper through direct requests to DBI compliant databases, such as Oracle and MySQL, after authorizing with a login name and password.
    • We envision extending the use of the Global Mapper to additional databases and also allowing users to contribute their observations to participating databases. We plan to build a Web-based, real-time monitoring system of invasive species and other things of concern.
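The datum requirement and the switch from latitude-longitude to UTM coordinates described above can be sketched as follows. This is an illustration under stated assumptions, not the Global Mapper's code: the `PointRecord` fields and both function names are hypothetical. The zone formula is the standard 6-degree UTM convention and ignores the special zones around Norway and Svalbard, which do not affect North America.

```python
import math
from dataclasses import dataclass

# Datums the proposal says contributing databases may use.
ACCEPTED_DATUMS = {"NAD83", "NAD27"}


@dataclass
class PointRecord:
    record_id: str   # unique identifier used to link back to the data record
    latitude: float  # decimal degrees, north positive
    longitude: float # decimal degrees, east positive
    datum: str       # must be stated explicitly by the contributing database


def validate(point):
    """Raise ValueError if the point lacks an accepted datum or is off the globe."""
    if point.datum not in ACCEPTED_DATUMS:
        raise ValueError(f"unsupported datum: {point.datum!r}")
    if not -90 <= point.latitude <= 90:
        raise ValueError("latitude out of range")
    if not -180 <= point.longitude <= 180:
        raise ValueError("longitude out of range")


def utm_zone(longitude):
    """Return the standard UTM zone number (1-60) for a longitude in degrees."""
    return int(math.floor((longitude + 180) / 6)) % 60 + 1


# Hypothetical record near Athens, Georgia.
athens = PointRecord("example-1", 33.95, -83.38, "NAD83")
validate(athens)
print(utm_zone(athens.longitude))
```

Requiring the datum on every record matters because the same coordinates read under NAD27 and NAD83 can refer to ground positions that differ by tens of meters in parts of North America, which is visible at the 1 meter per pixel resolution of the aerial photographs.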

  4. Servers
    At no charge to users, we will serve the proposed guides, images, and maps through existing Discover Life servers into the foreseeable future.
    • Currently we have 11 Sun servers at the University of Georgia, Athens; 6 Sun servers, donated by Sun Microsystems Inc., at Missouri Botanical Garden, and one Linux server at the Agricultural Research Council, South Africa. Together they served over a million pages and images in October, 2004, and are capable of handling Discover Life's anticipated growth in load through 2005.
    • If at some point we are unable to continue this service, we will transfer the guides to a non-profit organization or government agency.

  5. Copyright
    The PI and other contributors will retain ownership and exclusive copyright, with all rights reserved, to any illustrations, photographs, maps, or text they place in the guides or elsewhere on the Discover Life or associated Websites. The Polistes Corporation does the same for the 20q software that serves the guides, and Topozone.com, for the mapping software and services it provides.

Budget requested from NBII:
Part I -- IDnature guides
$32,041 Technical support & scientific illustration ($7.00 - $20.00/hour)
$13,678 PI Salary (17% time)
$1,900 Travel, including NBII All Node Meeting
$2,381 5% indirect costs for The Polistes Foundation to manage the grant
$50,000 TOTAL
Part II -- Reporting & mapping tools
$10,000 Workshop
$8,571 Technical support for events ($7.00 - $20.00/hour)
$5,000 Organizational support for workshop & events, 1 month time, Peter Alden, www.discoverlife.org/who/CV/Alden,_Peter.html
$5,000 Services from Topozone.com, including adding street addresses to the gazetteer
$1,429 5% indirect costs for The Polistes Foundation to manage the grant
$30,000 TOTAL
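The budget arithmetic above can be checked with a short sketch. The dollar amounts are copied from the two budget parts. The dictionary keys are abbreviated labels, and the sketch assumes that the 5% indirect cost is computed on the direct costs of each part and rounded to the nearest dollar.

```python
# Direct costs in whole dollars, copied from the budget above.
PART_I_DIRECT = {
    "Technical support & scientific illustration": 32041,
    "PI salary": 13678,
    "Travel": 1900,
}
PART_II_DIRECT = {
    "Workshop": 10000,
    "Technical support for events": 8571,
    "Organizational support": 5000,
    "Topozone.com services": 5000,
}


def indirect_cost(direct_total, percent=5):
    """Indirect cost as a percentage of direct costs, rounded to the nearest dollar."""
    return (direct_total * percent + 50) // 100


def part_total(direct_items):
    """Return (direct, indirect, total) for one part of the budget."""
    direct = sum(direct_items.values())
    indirect = indirect_cost(direct)
    return direct, indirect, direct + indirect


part_i = part_total(PART_I_DIRECT)
part_ii = part_total(PART_II_DIRECT)
print(part_i, part_ii, part_i[2] + part_ii[2])
```

Under that assumption the computed indirect costs match the $2,381 and $1,429 listed, and the two parts sum to the $80,000 requested.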

Other in kind services:
As part of the partnership to develop the Global Mapper, Topozone.com donates over $35,000 per year in maps and aerial photographs to Discover Life.

Anticipated completion date:
9 months from completion of paper work between USGS-NBII and The Polistes Foundation.
