KEYING/CODING SPECIFICATIONS

The University of Michigan:
An Encyclopedic Survey (UMSurvey)

Introduction

Source data

The material to be keyed is taken from books--the Encyclopedic Survey of the University of Michigan, issued in nine parts from 1941 to 1958--as well as from two spiral-bound published typescript supplements and about 150 pages of unpublished typescript. The material will be sent to the data conversion firm on a CD in the form of 600-dpi bitonal TIFF files (one file per page).

Target data

The data-conversion vendor will return keyed and coded text files transcribed from the image files.

Transcriptional accuracy will be 99.995% or better (an error rate of no more than 1 character/byte in 20,000). We will test a 5% random sample of the returned data and, if necessary, reject data that fails to meet that specification.
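
The arithmetic behind the accuracy requirement can be sketched as follows. This is an illustration only: the function names are hypothetical, and the way errors are counted within the sample is an assumption not stated in this specification.

```python
# Sketch of the accuracy threshold: 1 error in 20,000 characters = 99.995%.
# All names here are hypothetical; they are not part of the specification.
CHARS_PER_ALLOWED_ERROR = 20_000


def accuracy_percent(errors, chars_checked):
    """Return transcriptional accuracy as a percentage of characters checked."""
    return 100.0 * (1 - errors / chars_checked)


def meets_spec(errors, chars_checked):
    """True if the error rate is no worse than 1 character in 20,000."""
    # Integer comparison avoids floating-point rounding at the threshold.
    return errors * CHARS_PER_ALLOWED_ERROR <= chars_checked
```

Under these assumptions, a sample of 100,000 characters passes with 5 errors and fails with 6.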

Coding will be valid SGML, validated against the supplied "vendor.dtd" or a true subset thereof. Vendor.dtd is an extract from TEI and uses TEI semantics.

Quantity of data

This is a small project: about 2000 printed pages, plus fewer than a thousand pages of typescript, yielding probably less than 10 MB of raw text.

Keying/Coding Guidelines

Text

Text to record

With a few standard exceptions noted below, the entire text will be recorded in the order it was intended to be read (top left to bottom right, left column before right column, etc.).

Text to record as attribute values only

Page numbers will be preserved only as the value of the "N" attribute of the <PB> (page-break) tag. Unnumbered pages in a sequence of numbered pages should receive a <PB> tag with a supplied page number in brackets as the value of the N attribute: <PB N="[47]">. Unnumbered pages for which it is difficult to determine the correct page number should receive a <PB> tag with the N attribute omitted.
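
The three page-number cases above can be sketched as a small helper. The function name and its parameters are hypothetical and serve only to illustrate the rule.

```python
def pb_tag(number=None, printed=True):
    """Build a <PB> tag for a printed, supplied, or undetermined page number.

    number:  the page number, or None if it cannot be determined.
    printed: False when the number is supplied by the keyer rather than
             printed on the page (it is then bracketed).
    """
    if number is None:
        # Page number cannot be determined: omit the N attribute entirely.
        return "<PB>"
    value = str(number) if printed else f"[{number}]"
    return f'<PB N="{value}">'
```

For example, pb_tag(47) gives <PB N="47"> and pb_tag(47, printed=False) gives <PB N="[47]">.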

Placement of <PB> tags. The rules are: (1) "pages always break at the top"; that is, <PB> tags will be inserted in the text at the actual location of the page break, regardless of the location of the page number on the printed page. (2) "Words cannot break at page breaks"; that is, if a hyphenated word straddles a page break, finish the word and any attached punctuation, then insert the tag. And (3) "Divisions begin at page breaks; they don't end there"; that is, if a structural break of some kind coincides with the page break (e.g., if a new section, paragraph, etc., begins at the head of the new page), the <PB> tag should be tucked inside the opening tag for the new division, NOT inside the closing tag for the old division.
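
The three placement rules can be sketched as below. This is a minimal illustration with hypothetical function names. It assumes the text of each page is available as a plain string, and it keeps any hyphen exactly as printed, since hyphen handling is not addressed by these rules.

```python
def join_pages(old_page_end, new_page_start, pb):
    """Join the tail of one page to the head of the next around a <PB> tag."""
    old_page_end = old_page_end.rstrip()
    pieces = new_page_start.split(None, 1)
    if old_page_end.endswith("-") and pieces:
        # Rule 2: finish the broken word (with attached punctuation) first.
        # Assumption: the hyphen is left as printed.
        rest = pieces[1] if len(pieces) > 1 else ""
        return old_page_end + pieces[0] + " " + pb + rest
    # Rule 1: otherwise the tag sits at the actual location of the break.
    return old_page_end + " " + pb + new_page_start.lstrip()


def start_division_at_break(close_tag, open_tag, pb):
    """Rule 3: the <PB> follows the opening tag of the new division."""
    return close_tag + open_tag + pb
```

With these definitions, start_division_at_break("</P>", "<P>", '<PB N="48">') yields </P><P><PB N="48">, so the page break belongs to the new paragraph rather than the old one.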

Provide other attribute values (e.g. the TYPE attribute of the DIV elements) only when instructed to and when there is specific information to supply. Do not supply values of this sort: TYPE="unknown" or TYPE="unspecified".

Text not to record at all

Running headers; text within illustrations (except captions); handwritten notes or additions; and purely decorative typographic markings should not be recorded at all.

Formatting

Character- and word-level formatting

Block-level formatting

Structure

Characteristic structures

Most of these books contain ordinary prose, interrupted by some lists, tables, and an occasional illustration or block quotation. The prose should be tagged with paragraph tags (<P>); the lists, tables, illustrations, and block quotations with the <LIST>, <TABLE>, <FIGURE>, and <Q> tags respectively. Captions attached to an illustration should be captured using the <HEAD> tag placed within an otherwise empty <FIGURE> tag: <FIGURE><HEAD>The Michigan Creed</HEAD></FIGURE>. Figures without captions should be recorded by empty <FIGURE> tags.

Do not attempt to use the special TEI tags for title pages; instead treat all the front matter for each volume as a text division (<DIV> [see below]), and record the text as if it consisted of ordinary paragraphs, lists, etc. The front and back sides of the title pages, in particular, can be encoded as single paragraphs, with <LB> (line-break) tags between the lines: <P>THE<LB>UNIVERSITY OF MICHIGAN<LB>AN ENCYCLOPEDIC SURVEY<LB><I>In Four Volumes</I><LB>WALTER A. DONNELLY, <I>Editor</I> (etc.)

Though few or no examples may appear, the DTD contains tags for other features that may be encountered, such as verse stanzas (<LG>s) and verse lines (<L>s). Use them if necessary, following the usual TEI semantics.

The books consist of signed articles, gathered by subject matter and often divided into subsections. Use headings (to be recorded with <HEAD>) as a guide to the existence of a text division (<DIV1>) or subdivision (<DIV2>, <DIV3>, <DIV4>, etc. down to <DIV7>).

Most articles contain a bibliography; some may contain several. Record a bibliography as a <DIV TYPE="bibliography"> containing a series of <BIBL> elements, each entry in the bibliography to be contained within a single <BIBL>. Italicized text within a bibliographic entry should be encoded with the <TITLE> tag; authors' names within a bibliographic entry (which are denoted by small caps in the print) should be encoded with the <AUTHOR> tag. All other text within the entry should be left as #PCDATA.
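
The encoding of a single bibliographic entry can be sketched as follows. The function name and the placeholder entry text are hypothetical, and no character escaping is attempted here.

```python
def bibl(parts):
    """Encode one bibliography entry as a <BIBL> element.

    parts is a list of (kind, text) pairs, where kind is:
      "author" - a name printed in small caps  -> <AUTHOR>
      "title"  - italicized text               -> <TITLE>
      "text"   - anything else, left as plain character data
    """
    tags = {"author": "AUTHOR", "title": "TITLE"}
    out = []
    for kind, text in parts:
        if kind == "text":
            out.append(text)
        else:
            tag = tags[kind]
            out.append(f"<{tag}>{text}</{tag}>")
    return "<BIBL>" + "".join(out) + "</BIBL>"
```

A placeholder entry such as [("author", "AUTHOR NAME"), ("text", ". "), ("title", "Title of Work"), ("text", ". Place, year.")] would come out as <BIBL><AUTHOR>AUTHOR NAME</AUTHOR>. <TITLE>Title of Work</TITLE>. Place, year.</BIBL>.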

Most articles are signed, usually with a single name; sometimes with two; sometimes with a descriptive paragraph. Encode the signature, including all attached text, with a <BYLINE> tag; the names of the authors themselves should be tagged with <DOCAUTHOR> tags.

Some authors' names are supplied with a death date in a note, e.g.:

Orma F. Butler*

*Deceased, June 16, 1938.

Since <NOTE> is not allowed within <BYLINE>, simply omit the asterisks and place the information within brackets inside the <BYLINE> tag: <BYLINE><DOCAUTHOR>Orma F. Butler</DOCAUTHOR> [Deceased, June 16, 1938]</BYLINE>
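
The single-author case above can be sketched as a small helper. The function name is hypothetical. Following the example, it drops the asterisks and the note's final period. Bylines with several names or a descriptive paragraph are not covered by this sketch.

```python
def byline(author, note=None):
    """Encode a signature, folding any asterisked note into brackets."""
    # Remove the asterisk that ties the printed name to its note.
    name = author.rstrip("*").rstrip()
    tag = f"<BYLINE><DOCAUTHOR>{name}</DOCAUTHOR>"
    if note:
        # Drop the leading asterisk and the final period, as in the example.
        cleaned = note.lstrip("*").strip().rstrip(".")
        tag += f" [{cleaned}]"
    return tag + "</BYLINE>"
```

Calling byline("Orma F. Butler*", "*Deceased, June 16, 1938.") reproduces the encoding shown above.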

Overall structure

In the edition that was scanned, the Encyclopedic Survey was issued as 6 volumes and a number of loose supplements.

Volumes 1-4 are subdivided into Parts (originally each Part was issued separately). Volumes 5 and 6 are spiral-bound additions issued without constituent Parts. The "Supplements" are further additions that have never been issued, but survive only as individual entries in typescript. The relationships of the parts to the volumes (and their correspondence with the files on the CD) are shown below:

Bibliographies

Many of the entries in this encyclopedic survey are accompanied by a brief bibliography, usually marked off by a heading: "SELECTED BIBLIOGRAPHY." Treat these bibliographies as constituent parts of the entry, encoding them with a <DIV> tag one level lower than that assigned to the entry itself. The main text of the entry, if it is not already divided into sections, should be placed in a corresponding <DIV> tag. The bibliography should have TYPE="bibliography" and the main text TYPE="article", like this (undivided article):

<DIV3 TYPE="entry">
  <DIV4 TYPE="article"></DIV4>
  <DIV4 TYPE="bibliography"> </DIV4>
</DIV3>

or like this (subdivided article):

<DIV3 TYPE="entry">
  <DIV4 TYPE="section"></DIV4>
  <DIV4 TYPE="section"></DIV4>
  <DIV4 TYPE="section"></DIV4>
  <DIV4 TYPE="bibliography"> </DIV4>
</DIV3>

To avoid over-complicating the chart below, the existence of discrete bibliographies has been indicated there only rarely. But bear in mind that every <DIV TYPE="entry"> potentially includes distinct <DIV>s for the article proper and for the bibliography.

DIV structure

<DIV1>

Assign eight <DIV1>s: one to each of the six "volumes," one to the "Index" at the end of volume 4, and one to the unnamed "volume" of Supplements at the end. All but the Index should have TYPE="volume"; the Index should use TYPE="index".

<DIV2>

Assign <DIV2> with TYPE="part" to each of the nine numbered "Parts."

Assign <DIV2> with TYPE="entry" to each of the entries in volumes 5 and 6 (which lack separate Parts). Entries are identified by headings in large caps and running headers.

Assign <DIV2> with TYPE="entry" to each of the entries in the Supplement (identified by headings and often by pagination that re-starts at "1").

Assign <DIV2> with TYPE="front" to the collected front matter at the head of each individual volume: title pages, forewords, tables of contents, lists of illustrations, etc. (The unpublished Supplement lacks such front matter.)

<DIV3>

Assign <DIV3> with TYPE="subpart" to large subdivisions within most of the Parts in volumes 1-4. These are distinguished by having headings that stand alone on an otherwise blank page.

A few Parts have no subparts; in that case, assign <DIV3> with TYPE="entry" to the individual entries within that Part. Entries are usually identified by headings in large caps and running headers.

Assign <DIV3> with various values of "TYPE" to distinguish the various parts of each volume's front matter, which was grouped as a <DIV2>. E.g., TYPE="title page", TYPE="contents", TYPE="preface", etc.

Assign <DIV3> with TYPE="section" to any heading-equipped subsection of an entry that was itself tagged as a <DIV2>. If the heading is numbered or lettered ("B. Employment Compensation"), record that number or letter as the value of the "N" attribute of the <DIV3> element: <DIV3 TYPE="section" N="B">. Bibliographies attached to entries count as sections of that entry (though their TYPE="bibliography", not TYPE="section"). See the note on Bibliographies above.

<DIV4>

Assign <DIV4> with TYPE="entry" to individual entries in the "subparts".

Assign <DIV4> with TYPE="section" to any heading-equipped subsection of an entry that was itself tagged as a <DIV3>. If the heading is numbered or lettered ("B. Employment Compensation"), record that number or letter as the value of the "N" attribute of the <DIV4> element: <DIV4 TYPE="section" N="B">. Bibliographies attached to entries count as sections of that entry (though their TYPE="bibliography", not TYPE="section"). See the note on Bibliographies above.

<DIV5> - <DIV7>

If necessary, assign <DIV5> with TYPE="section" to any heading-equipped subsection of an entry that was itself tagged as a <DIV4>. If the heading is numbered or lettered ("B. Employment Compensation"), record that number or letter as the value of the "N" attribute of the <DIV5> element: <DIV5 TYPE="section" N="B">. Bibliographies attached to entries count as sections of that entry (though their TYPE="bibliography", not TYPE="section"). See the note on Bibliographies above.

If further subdivision is necessary, use <DIV6> and <DIV7>.
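
The level rule that runs through the <DIV3> to <DIV7> instructions (a section or bibliography sits one level below its entry, and never deeper than <DIV7>) can be sketched as follows. The function names are hypothetical and the sketch does no validation against vendor.dtd.

```python
MAX_DIV_LEVEL = 7  # numbered divisions run from <DIV1> down to <DIV7>


def open_div(level, div_type=None, n=None):
    """Return the opening tag for a numbered division."""
    if not 1 <= level <= MAX_DIV_LEVEL:
        raise ValueError(f"DIV level {level} is outside 1-{MAX_DIV_LEVEL}")
    tag = f"<DIV{level}"
    if div_type:
        # TYPE is supplied only when there is specific information to give.
        tag += f' TYPE="{div_type}"'
    if n:
        # N records the number or letter of a numbered/lettered heading.
        tag += f' N="{n}"'
    return tag + ">"


def child_div(entry_level, div_type="section", n=None):
    """Opening tag for a section or bibliography one level below its entry."""
    return open_div(entry_level + 1, div_type, n)
```

For an entry tagged as <DIV2>, child_div(2, n="B") gives <DIV3 TYPE="section" N="B">. For an entry tagged as <DIV4>, child_div(4, "bibliography") gives <DIV5 TYPE="bibliography">.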

The following chart is intended to be merely illustrative; it is far from complete.

<BODY>
<DIV1 TYPE="volume">Vol. 1
  <DIV2 TYPE="front">
     <DIV3 TYPE="title page">
     <DIV3 TYPE="preface">
     <DIV3 TYPE="contents">
     <DIV3 TYPE="illustrations">
  <DIV2 TYPE="part">Part 1
     <DIV3 TYPE="entry">...State Education
        <DIV4 TYPE="article">
        <DIV4 TYPE="bibliography">
     <DIV3 TYPE="entry">Early history ...
        <DIV4 TYPE="article">
        <DIV4 TYPE="bibliography">
     <DIV3 TYPE="entry">The Administration of Henry Philip Tappan
     <DIV3 TYPE="entry">The Administration of Erastus Otis Haven
     [etc.]
  <DIV2 TYPE="part">Part 2
     <DIV3 TYPE="subpart">Organization
        <DIV4 TYPE="entry">University Senate...
        <DIV4 TYPE="entry">University Council
        <DIV4 TYPE="entry">Office of the Registrar
        [etc.]
     <DIV3 TYPE="subpart">Services
        <DIV4 TYPE="entry">School of Education...
        <DIV4 TYPE="entry">Bureau of Appointments...
        <DIV4 TYPE="entry">Bureau of Co-operation...
        [etc.]
     <DIV3 TYPE="subpart">Alumni
        <DIV4 TYPE="entry">The Alumni Association
        <DIV4 TYPE="entry">The Alumni Advisory Council
        [etc.]
     <DIV3 TYPE="subpart">Faculty Clubs
        <DIV4 TYPE="entry">Research Club...
        <DIV4 TYPE="entry">Junior Research Club...
        [etc.]
        
<DIV1 TYPE="volume">Vol. 2
  <DIV2 TYPE="front">
     <DIV3 TYPE="title page">
     <DIV3 TYPE="contents">
     <DIV3 TYPE="illustrations">
  <DIV2 TYPE="part">Part 3
     <DIV3 TYPE="entry">Administration and Curriculums
     <DIV3 TYPE="entry">University System
     <DIV3 TYPE="entry">Department of Anthropology
     <DIV3 TYPE="entry">Department of Astronomy
     <DIV3 TYPE="entry">Astronomical Observatories
     [etc.]
  <DIV2 TYPE="part">Part 4
     <DIV3 TYPE="subpart">College of Literature ... II
        <DIV4 TYPE="entry">The Department of Greek
        <DIV4 TYPE="entry">The Department of History
        <DIV4 TYPE="entry">The Department of Journalism
        [etc.]
     <DIV3 TYPE="subpart">The Summer Session
  <DIV2 TYPE="part">Part 5
     <DIV3 TYPE="subpart">The Medical School
        <DIV4 TYPE="entry">Administration and Curriculums
        <DIV4 TYPE="entry">The Department of Anatomy
        <DIV4 TYPE="entry">The Department of Bacteriology...
        <DIV4 TYPE="entry">The Department of Biological Chemistry...
        [etc.]
     <DIV3 TYPE="subpart">University Hospital
        <DIV4 TYPE="entry">The University Hospital
        <DIV4 TYPE="entry">The School of Nursing
        <DIV4 TYPE="entry">The Simpson Memorial Institute...
     <DIV3 TYPE="subpart">Homeopathic Medical College
        <DIV4 TYPE="entry">The Homeopathic Medical College
     <DIV3 TYPE="subpart">The Law School
        <DIV4 TYPE="entry">The Law School

<DIV1 TYPE="volume">Vol. 3
  <DIV2 TYPE="front">
     <DIV3 TYPE="title page">
     <DIV3 TYPE="contents">
     <DIV3 TYPE="illustrations">
  <DIV2 TYPE="part">Part 6
     <DIV3 TYPE="subpart">Horace H. Rackham School...
        <DIV4 TYPE="entry">The Horace H. Rackham School ...
        <DIV4 TYPE="entry">The Institute for Human Adjustment
        <DIV4 TYPE="entry">The Institute of Public and Social Administration
        <DIV4 TYPE="entry">The Bureau of Government
     <DIV3 TYPE="subpart">School of Business Administration
        <DIV4 TYPE="entry">The School of Business Administration
     <DIV3 TYPE="subpart">School of Education
        <DIV4 TYPE="entry">The School of Education
        <DIV4 TYPE="entry">The University High School
        [etc.]
     <DIV3 TYPE="subpart">School of Forestry and Conservation
        <DIV4 TYPE="entry">The School of Forestry and Conservation
     <DIV3 TYPE="subpart">University Musical Society...
        <DIV4 TYPE="entry">University Musical Society...
     <DIV3 TYPE="subpart">Institute of Fine Arts
        <DIV4 TYPE="entry">Institute of Fine Arts
        <DIV4 TYPE="entry">Research Seminary in Islamic Art
     <DIV3 TYPE="subpart">Division of Hygiene and Public Health
        <DIV4 TYPE="entry">Division of Hygiene and Public Health
  <DIV2 TYPE="part">Part 7
     <DIV3 TYPE="subpart">College of Engineering
        <DIV4 TYPE="entry">College of Engineering
        <DIV4 TYPE="entry">Department of Aeronautical Engineering
        <DIV4 TYPE="entry">Department of Chemical ... Engineering
        <DIV4 TYPE="entry">Department of Civil Engineering
        [etc.]
     <DIV3 TYPE="subpart">College of Architecture and Design
        <DIV4 TYPE="entry">College of Architecture and Design
     <DIV3 TYPE="subpart">School of Dentistry
        <DIV4 TYPE="entry">School of Dentistry
     <DIV3 TYPE="subpart">College of Pharmacy
        <DIV4 TYPE="entry">College of Pharmacy
     <DIV3 TYPE="subpart">Department of Military Science...
        <DIV4 TYPE="entry">Department of Military Science...

<DIV1 TYPE="volume">Vol. 4
  <DIV2 TYPE="front">
     <DIV3 TYPE="title page">
     <DIV3 TYPE="contents">
     <DIV3 TYPE="illustrations">
  <DIV2 TYPE="part">Part 8
     <DIV3 TYPE="subpart">The Libraries
        <DIV4 TYPE="entry">University Library
           <DIV5 TYPE="section">The University Library to 1941
           <DIV5 TYPE="section">Regulations for the Library
           <DIV5 TYPE="section">The University Library, 1941-1953
           <DIV5 TYPE="bibliography">Selected Bibliography
        <DIV4 TYPE="entry">Law Library
           <DIV5 TYPE="article">
           <DIV5 TYPE="bibliography">Selected Bibliography
        <DIV4 TYPE="entry">Clements Library
     <DIV3 TYPE="subpart">University of Michigan Press
        <DIV4 TYPE="entry">University of Michigan Press
           <DIV5 TYPE="section">The Press
           <DIV5 TYPE="section">University Publications
              <DIV6 TYPE="section">General series
              <DIV6 TYPE="section">Departmental Series
           <DIV5 TYPE="bibliography">Bibliography
           <DIV5 TYPE="section">Official Publications
           <DIV5 TYPE="section">Michigan Alumnus Quarterly Review
     <DIV3 TYPE="subpart">Museums and Collections
        <DIV4 TYPE="entry">University Museums
        <DIV4 TYPE="entry">Herbarium
        <DIV4 TYPE="entry">Kelsey Museum
        [etc.]
     <DIV3 TYPE="subpart">School of Public Health
        <DIV4 TYPE="entry">School of Public Health
     <DIV3 TYPE="subpart">The Institutes
        <DIV4 TYPE="entry">Institute of Human Biology
        [etc.]
     <DIV3 TYPE="subpart">Television Broadcasting Service
        <DIV4 TYPE="entry">Television at the University
        [etc.]
     <DIV3 TYPE="subpart">Buildings and Lands
        <DIV4 TYPE="entry">Administration Building
        <DIV4 TYPE="entry">Alumni Memorial Hall
        [etc.]
  <DIV2 TYPE="part">Part 9
     <DIV3 TYPE="subpart">Student Life and Organizations
        <DIV4 TYPE="entry">Enrollment Survey
        <DIV4 TYPE="entry">Fees and Expenses
        <DIV4 TYPE="entry">Student Traditions and Customs
        [etc.]
     <DIV3 TYPE="subpart">Campus Societies
        <DIV4 TYPE="entry">Campus Societies
        <DIV4 TYPE="entry">Honorary Scholastic Societies
        [etc.]
     <DIV3 TYPE="subpart">Athletics and Physical Education
        <DIV4 TYPE="entry">Board in Control of Inter-Collegiate Athletics
        <DIV4 TYPE="entry">Physical Education for Men
        [etc.]
     
<DIV1 TYPE="index">[Index to Parts 1-9.]
    
<DIV1 TYPE="volume">Vol. 5
  <DIV2 TYPE="front">
  <DIV2 TYPE="entry">The Ruthven Administration
  <DIV2 TYPE="entry">The Hatcher Administration
  <DIV2 TYPE="entry">Vice-President for University Relations ...
  <DIV2 TYPE="entry">Vice-President for State Relations ...
  <DIV2 TYPE="entry">Vice-President for Research
  <DIV2 TYPE="entry">College of Architecture and Design
  <DIV2 TYPE="entry">School of Business Administration
  [etc.]
  
<DIV1 TYPE="volume">Vol. 6
  <DIV2 TYPE="front">
     <DIV3 TYPE="title page">
     <DIV3 TYPE="contents">
     <DIV3 TYPE="foreword">    
  <DIV2 TYPE="entry">Vice-President and Chief Financial Officer
     <DIV3 TYPE="section">
     <DIV3 TYPE="section" N="I">Organization and Change
     <DIV3 TYPE="section" N="II">Financial Support and Activities
        <DIV4 TYPE="section" N="A">Current Operating Funds
        <DIV4 TYPE="section" N="B">State Appropriations
        <DIV4 TYPE="section" N="C">Federal Support
        <DIV4 TYPE="section" N="D">Student Fees
        [etc.]
     <DIV3 TYPE="section" N="III">Personnel
        <DIV4 TYPE="section" N="A">Employment
        <DIV4 TYPE="section" N="B">Employee Classifications
        <DIV4 TYPE="section" N="C">Employee Unions
           <DIV5 TYPE="section">
           <DIV5 TYPE="section">International Union of Operating Engineers
           <DIV5 TYPE="section">Washtenaw County Local Building Trades...
           <DIV5 TYPE="section">American Federation of State...Employees...
           [etc.]
        <DIV4 TYPE="section" N="D">Staff Benefits
           <DIV5 TYPE="section">Vacation and Holidays
           <DIV5 TYPE="section">Disability...
           [etc.]
           
     <DIV3 TYPE="section" N="IV">Physical Properties
        <DIV4 TYPE="section" N="C">Buildings
           <DIV5 TYPE="section" N="3">North Campus
           
  <DIV2 TYPE="entry">College of Literature, Science, and the Arts  
  
<DIV1 TYPE="volume">Supplementary files
  <DIV2 TYPE="entry">The Program in American Culture ...
  <DIV2 TYPE="entry">The Program in Comparative Literature
  <DIV2 TYPE="entry">The Honors Program
  <DIV2 TYPE="entry">Program on Studies in Religion
  <DIV2 TYPE="entry">Women's Study Program
  <DIV2 TYPE="entry">Student Life since 1945
  [etc.]

Character data

Illegible/indecipherable text