20q XML Tags & File Structure

Sharon Ballew & John Pickering
University of Georgia, Athens

9 February, 2003


Last updated: 29 January, 2006

Overview

20q software processes information from multiple file and database types that it accesses through the Internet. Here we explain the XML file structure and tags that 20q uses to build IDnature guides and checklists. These XML files contain information about organisms and their attributes. They store Web links to images, text, databases, and programs to interface the guides and checklists to additional information. Like HTML files, they are in human-readable text, can be placed anywhere on the Web, and are simple to edit manually. We are adding tools under the "Menu" feature of IDnature guides to help build and manage the content of 20q's XML files using any Web browser (see IDnature guides' Help).

Below we use <maroon> to show XML tags and green text to represent data within tags. Standard HTML tags, which can be nested as data within some of 20q's XML tags, are <fuchsia>.

XML files are used for two reasons:
  • to store a checklist, and,
  • to present characters.
All organisms listed in an XML file will appear in a checklist. Only those with attributes appear in a guide.

About XML

XML is a markup language used to structure data with tags.

Example:

   <character>Head color</character>

We use XML tags and some HTML tags in 20q files. See the HTML training guide for basic information about HTML tags and some simple examples. There are three major rules when using XML:

File structure

Except for top-level guide_name.xml files, such as "Plants.xml" and "Birds.xml", 20q's XML files should be named "20q.xml" and may be placed anywhere on the Web. Depending on its function, each XML file may have the following four types of information:
  • Guide controls (within <20q_controls>...</20q_controls> tags)
  • Checklist credit (text and HTML tags within <checklist_credit>...</checklist_credit> tags)
  • Guide credit (text and HTML tags within <guide_credit>...</guide_credit> tags)
  • Data (within <path>...</path>, <set>...</set>, and <include>...</include> tags)
If a line in a guide_name.xml or 20q.xml file begins with __END__, then 20q ignores all information after the __END__ line and does not use it in either guides or checklists.
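
As a minimal sketch of the __END__ rule, the fragment below reuses the Bombus genus entry from this guide. The note line and the second entry, with the placeholder name "Genus species", are illustrative only. Going by the rule above, 20q would use the first <set> and ignore the __END__ line and everything after it.

```xml
<path>Insecta/Hymenoptera/Apoidea/Apidae/Apinae</path>
<set type="taxon" level="genus">
   <name>Bombus</name><authority>Latreille</authority>
</set>
__END__
Working notes (hypothetical): nothing from the __END__ line onward reaches a guide or checklist.
<set type="taxon" level="species">
   <name>Genus species</name>
</set>
```

This makes the area after __END__ a convenient place to park unfinished entries without deleting them.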

<20q_controls> ...other XML tags & data... </20q_controls>

At the beginning of your file there will be a <20q_controls> section. This specifies who is building the file, their email, and other information that 20q uses to build and run a guide. We recommend that you indent the file with tabs as shown below, but this is not required and doesn't affect functionality.

Example:

<20q_controls>
   <who_default>Sharon Ballew</who_default>
   <email_default>sharonballew@discoverlife.org</email_default>
   <base_path>Insecta/Hymenoptera/Apoidea/Apidae/Bombus</base_path>
   <levels_default>class order superfamily family subfamily genus species</levels_default>
   <c_initial>
      <character>Range</character>
      <character>Thorax top pattern</character>
      <character>Face color</character>
      <character>Abdomen top front to rear color change number</character>
      <character>Thorax top front color</character>
      <character>Thorax top rear color</character>
      <character>Abdomen top texture</character>
   </c_initial>
</20q_controls>

Explanation of tags:

  • <who_default>value</who_default>
    The value is the name of the primary person working on the file.
  • <email_default>value</email_default>
    The value is the email for the primary person.
  • <base_path>value</base_path>
    This is the default path used if no other path is specified in the data section. Paths tell the computer where to look for a file. A base path is a good backup, but you should put a path before each entry in the data section since this will be required for future XML files. See <path> in the Data section for more information.
  • <levels_default>value</levels_default>
    This is a list of the taxonomic levels you are using in this particular file. You may have more or fewer than in the example given, depending on what level you are working with.
  • <c_initial>...other XML tags and data...</c_initial>
    The "initial characters" that appear on the first page of the guide before a user presses the "simplify" button. Initial characters appear in the order chosen by the guide builder, as sequentially listed between the <c_initial> and </c_initial> tags with pairs of <character> and </character> tags (see example above). After "simplify" is used, 20q software presents all remaining useful characters alphabetically.

    • <character>value</character>
      The characters listed here (Range, Thorax top pattern, Face color, etc.) will be displayed on the online guide in the order they are listed here. Initial characters are those which, when chosen, will separate the organisms into major categories. For example, in a guide containing swallowtail butterflies and brush-footed butterflies, you might want one of the initial characters to be <character>Tailed hindwings</character>. This particular character would pull out a major group, the swallowtails, from the rest of the butterflies in the guide.
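
For the butterfly case just described, a <c_initial> block might look like the sketch below. "Tailed hindwings" comes from the example above; "Range" is borrowed from the bumblebee controls and is only assumed to suit a butterfly guide. The comment line is an annotation for readers; this guide does not say whether 20q accepts XML comments, so leave it out of a real file.

```xml
<c_initial>
   <character>Tailed hindwings</character>
   <!-- "Range" is an assumed second character, copied from the Bombus example -->
   <character>Range</character>
</c_initial>
```

Because initial characters appear in the order listed, "Tailed hindwings" would be offered first on the guide's opening page.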

<checklist_credit>...data...</checklist_credit>

<guide_credit>...data...</guide_credit>

After the 20q controls section, you will need an area to cite credits and to put any other information (such as "Under construction"). Also, if you are doing a checklist, this is the place to title it. Note that HTML tags are shown as well. For more information on HTML tags, please go to the HTML training guide.

Example:

   <checklist_credit>
      <div align="center">
      <b>Checklist of American Bumblebees</b><p />
      John Pickering
      </div>
      Under construction. Compiled primarily from Bombus site of The Natural History Museum, London,
      Bumblebee Economics by Bernd Heinrich, and UGA Natural History Museum specimens.<p />
   </checklist_credit>
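
A <guide_credit> section can presumably be written the same way. The sketch below mirrors the checklist example above; the title and wording are hypothetical, and the comment line is a reader annotation that should be dropped from a real file, since this guide does not confirm that 20q accepts XML comments.

```xml
<guide_credit>
   <!-- Hypothetical title and text; replace with your own credits -->
   <div align="center">
   <b>Guide to American Bumblebees</b><p />
   John Pickering
   </div>
   Under construction.<p />
</guide_credit>
```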

Data

Data are presented between the following three types of tags:
  • <path>Insecta/Lepidoptera/Papilionidae</path>
  • <set type="taxon" level="value">...other XML tags and data...</set>
  • <include>http://... URL to another 20q.xml file </include>

Example:

   <path>Insecta/Hymenoptera/Apoidea/Apidae/Apinae</path>
   <set type="taxon" level="genus">
      <name>Bombus</name><authority>Latreille</authority><common_name language="english">Bumblebees; Humblebees</common_name>
      <easy_name>Bumblebee</easy_name>
   </set>
   <path>Insecta/Hymenoptera/Apoidea/Apidae/Apinae</path>
   <set type="taxon" level="species">
      <name>Bombus abnormis</name><authority>(Tkalcu), 1968:33</authority>
      <easy_name>Abby Bumblebee</easy_name><common_name language="english"></common_name>
      <attributes>
         <character>Range</character>
            <state>Oriental Region</state>
      </attributes>
   </set>
   <include>http://www.discoverlife.org/nh/tx/Insecta/Hymenoptera/Apoidea/Apidae/Apinae/Bombus/pensylvanicus/20q.xml</include>

Explanation of tags:

  • <path>value</path>
    The importance of paths has already been mentioned in the section explaining 20q controls. The first path in the data section above tells the system where the genus Bombus is located. The second path tells the system where the "Genus species" pair "Bombus abnormis" is located.
  • Note: If the path is incorrect and files cannot be found, error messages will appear telling you what cannot be found.

  • <set type="taxon" level="genus">...other XML tags & data...</set>
    When the taxon level is set to "genus," the data immediately following are about that genus. Each segment dealing with a genus or species must end with a corresponding </set> tag, which closes the set.
  • <name>value</name>
    The organism you are specifying the path for must be wrapped in the name tags.

    Example:

       <name>Genus species</name>

  • <authority>value</authority>
    Put the authority for the organism here. Make sure this authority is correct, adding parentheses if necessary. If possible, include a date with the authority.
  • <easy_name>value</easy_name>
    An easy name is an arbitrarily assigned name for a particular organism. It is meant to be a simplifying tool, especially for younger children.
  • <common_name>value</common_name>
    The common name is the most widely known name or names for the organism. Example: Papilio glaucus is better known as the Tiger Swallowtail. This tag can also hold names in languages other than English.
  • <attributes>...other XML tags and data...</attributes>
    The attributes section holds all the characters and states you are including for the organism.
  • <character>...other XML tags and data...</character>
    The most important part of your file will probably be what is contained within the characters and states. Choose characters carefully, trying to separate as many organisms as possible without including extraneous information or words.
  • <state>value</state>
    States define a precise characteristic of the organism in question.

    Example:

       <path>Insecta/Lepidoptera/Pieridae/Coliadinae</path>
       <set type="taxon" level="species">
          <name>Anteos clorinde</name><authority>(Godart), 1824</authority>
          <easy_name>Ghost Brimstone</easy_name>
          <common_name language="english">White Angled-Sulphur</common_name>
          <attributes>
             <character>Antenna shape</character>
                <state>Club, rounded</state>
             <character>Leg number</character>
                <state>6</state>
          </attributes>
       </set>

    As an example of how characters and states are used to exclude kinds within the system, suppose the file also includes another butterfly that does not have clubbed antennae. You will have to decide what to call this new kind of antenna and use that name in place of "Club, rounded" for the second butterfly. When the guide for this file is displayed on the Web, there will be two options beneath the character "Antenna shape," and choosing the correct option for your specimen will allow you to identify it properly. When you first develop the file, decide which characters are necessary; you can then continue developing states for the various kinds from that point.

  • <include>value</include>
    The value is the URL to another 20q.xml file that is to be included in the present file. You may have multiple <include> tags and values within one file.
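
To make the earlier point about characters and states acting as exclusions concrete, here is a minimal sketch of two kinds that share the character "Antenna shape" but differ in state. The first entry is trimmed from the Anteos clorinde example above. The second is a placeholder: its name, its path, and the state "Club, hooked" are all hypothetical. The comment line is a reader annotation; omit it from a real file, since this guide does not say whether 20q accepts XML comments.

```xml
<path>Insecta/Lepidoptera/Pieridae/Coliadinae</path>
<set type="taxon" level="species">
   <name>Anteos clorinde</name>
   <attributes>
      <character>Antenna shape</character>
         <state>Club, rounded</state>
   </attributes>
</set>
<!-- Placeholder kind: the name, path and state below are hypothetical -->
<path>Insecta/Lepidoptera/Pieridae/Coliadinae</path>
<set type="taxon" level="species">
   <name>Genus species</name>
   <attributes>
      <character>Antenna shape</character>
         <state>Club, hooked</state>
   </attributes>
</set>
```

With both entries in one file, the guide would offer the two states under "Antenna shape," and choosing one would exclude the other kind.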

<image>http://...URL... Explanation</image>

Illustrate character-states & add explaining text
Use <image>http://...URL...</image> tags after <state>state name</state> tags to illustrate a state in an IDnature guide. IDnature guides can link a low-resolution thumbnail image to a higher-resolution one. We recommend that the thumbnail image have a maximum height of 80 pixels and that the higher-resolution image have a maximum width of 240 pixels, but this is not a requirement. 20q recognizes jpg, gif and png images, although older browsers will not display png images. We recommend that you name image files so that the last part of the URL reflects their resolution and file type. For example, use http://www...image_name.80.jpg and http://www...image_name.240.jpg to name a pair of jpg images with maximum height or width of 80 and 240 pixels, respectively.

If a URL ends in .80.jpg, .80.gif, or .80.png, then 20q will automatically link it to a corresponding .240. higher resolution URL. If you wish to override this feature and link any first image to any second, use
<image>http://...thumbnail_URL... http://...higher_resolution_URL...</image>, where the two URLs are separated by a space. Alternatively, you can use the shorthand
<image>http://...thumbnail_URL.80.jpg 320</image>, where jpg and 320 are separated by a space, to link an .80.jpg image to a .320. resolution.
To override an 80 resolution image being automatically linked to a 240 resolution one, place a " 0" after the 80 resolution image's URL, thus
<image>http://...thumbnail_URL.80.jpg 0</image>. The " 0" is unnecessary if the thumbnail has any resolution other than 80.

If you wish to illustrate a state with two or more images, then list multiple <image>http://...URL...</image> tags after the <state>state name</state> tags.

Examples:
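
The forms described above can be sketched as follows. All URLs point to the placeholder host www.example.org and do not refer to real images, and the state names "Banded" and "Black" are hypothetical. The comment lines are annotations for readers; leave them out of a real file, since this guide does not say whether 20q accepts XML comments.

```xml
<character>Thorax top pattern</character>
   <state>Banded</state>
      <!-- .80.jpg thumbnail: linked automatically to the matching .240. URL -->
      <image>http://www.example.org/images/thorax_banded.80.jpg</image>
      <!-- shorthand: link the .80.jpg thumbnail to a .320. resolution -->
      <image>http://www.example.org/images/thorax_side.80.jpg 320</image>
<character>Face color</character>
   <state>Black</state>
      <!-- " 0" switches off the automatic link to a .240. image -->
      <image>http://www.example.org/images/face_black.80.jpg 0</image>
      <!-- explicit pair: thumbnail URL, a space, then the higher resolution URL -->
      <image>http://www.example.org/images/face_thumb.jpg http://www.example.org/images/face_large.jpg</image>
```

Each state here carries two <image> tags, which is how a state is illustrated with more than one image.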

<link>value</link>

Use <link> to link kinds to images, maps and other URLs. The format of the value between the tags depends on whether you link an image or some other URL, as follows:

Put these within a kind's <set ...>...</set> section, somewhere between the </name> tag and the <attributes> tag. Note there must be a space between the URL and the text describing the link.

Specify http:// at the beginning of the URL.
Use multiple sets of these tags to link to more than one URL.


  • Listing of _ALL_ Characters

    You can also list all characters and states at the beginning of an XML file by creating a subsection called _ALL_. This will help you to keep track of all your characters and states by listing them prior to the actual data.

    Example:

    <path>Insecta/Hymenoptera/Apoidea/Apidae/Apinae/Bombus</path>
    <set type="taxon" level="top">
       <name>_ALL_</name>
       <attributes>
          List all your characters and corresponding states here.
          We recommend putting all image tags for a guide in _ALL_ rather than under other kinds. However, they work within any set of attributes tags.
       </attributes>
    </set>

    _ALL_ is not c_initial. _ALL_ will have no effect on which characters are displayed first. It is only a list for your convenience.
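
    A filled-in _ALL_ section might look like the sketch below. It assumes that several states can be listed one after another beneath their character, which this guide implies but does not show. The states "Black" and "Yellow" and the www.example.org image URLs are hypothetical. The comment line is a reader annotation; drop it from a real file, since this guide does not confirm that 20q accepts XML comments.

```xml
<path>Insecta/Hymenoptera/Apoidea/Apidae/Apinae/Bombus</path>
<set type="taxon" level="top">
   <name>_ALL_</name>
   <attributes>
      <!-- Hypothetical states and placeholder image URLs -->
      <character>Face color</character>
         <state>Black</state>
            <image>http://www.example.org/images/face_black.80.jpg</image>
         <state>Yellow</state>
            <image>http://www.example.org/images/face_yellow.80.jpg</image>
   </attributes>
</set>
```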
