MEC digitization goals July 1999-June 2000

  1. CONTENT CREATION
    1. Add information to existing MEC components
      • Complete review of HB batches, including new OD-col batches.
      • Re-order entries in HB as necessary to accommodate 'page-turner'.
      • Add references in HB to new resources as they become available.
      • Add links in HB to e-texts in CME and elsewhere.
      • Ensure that new MED stencils are added to HB.
      • Add information generated by harmonization process.
      • Add HB entries for most or all MED OD-col citations and perhaps some of the bracketed ones.
      • Create policy, evaluation process, and maintenance process for external resources page.
      • Create links page.
    2. Streamline current MEC content-creation processes
      • Continue to document all processes.
      • Continue to alter / create DTDs to match new information.
      • Document DTDs for greater portability.
    3. Begin to plan further additions to existing MEC components
      • Plan minimal-level maintenance of MEC.
      • Plan addition of existing Supplement material to MED.
      • Consider costs of adding value: a new bibliography of source texts used by MED; cross-reference map to OED/DOST/DOE entries, etc.
      • Sketch out requirements of major revision, potentially including even a new reading program and 'distributed' editorial revision.
      • Attempt to arrive at cost figures for present and future content creation.
  2. DATA CONVERSION
    1. Production of new CME texts from PD print sources.
      1. Preparation (approx. one-two months)
        • Finalize list of desired texts; arrange texts in priority order.
        • Ascertain availability of texts; decide on source for every text.
        • Ascertain and document special requirements of certain texts (en face presentation, etc.).
        • Adapt HTI vendor specs to CME needs, especially for texts with special requirements. Specify treatment of characters and abbreviations. Annotate and illustrate a DTD with ME examples. Specify transcription accuracy. Quantify and specify markup accuracy. Quantify and specify markup completeness/fullness.
        • Contact vendor(s); negotiate and discuss these matters with vendor(s).
        • Recruit and train staff to proof texts and coding, do text preparation, etc.
        • Disbind books, scan, send copy to vendors (image files and/or page prints).
        • Create image/text ID system. Deploy it.
        • Create system of cover letters, mailing procedures, etc., including provision for exceptions. Deploy it.
        • Create system for 5% proofing (including error-correction, overlapping proofing, etc.). Deploy it.
        • Create system for code-proofing, markup guidelines, etc., including feedback provisions.
        • Create system for archiving, version control, process control.
        • Enlist cataloguers in creation of headers and Mirlyn records.
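
The 5% proofing step above could be driven by a small sampling routine. The following is a minimal sketch only, assuming pages carry simple IDs from the image/text ID system; the function name `proofing_sample` and its parameters are hypothetical, and overlapping proofing and error-correction are not modelled.

```python
import random

def proofing_sample(page_ids, percent=5, seed=0):
    """Return a reproducible random sample of pages to proof.

    percent and seed are illustrative defaults, not project decisions.
    """
    pages = list(page_ids)
    if not pages:
        return []
    # Ceiling division in integers, so a small batch still yields one page.
    count = max(1, (len(pages) * percent + 99) // 100)
    rng = random.Random(seed)  # fixed seed: the same batch gives the same sample
    return sorted(rng.sample(pages, count))

# Example: a 200-page shipment yields 10 pages to proof.
sample = proofing_sample(range(1, 201))
print(len(sample))
```

A fixed seed keeps the sample repeatable, which may help when a second proofer re-checks the same pages.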
      2. Test phase (approx. one-two months)
        • Send initial shipments through process: acquisition, disbinding, imaging, shipment, receipt, proofing, code-checking and validation, etc.
        • Evaluate every stage: accuracy, text preparation and mailing, coding, vendor specs, cost; adjust as necessary.
      3. Production phase (approx. six-seven months)
        • Establish production quota, probably in pages or bytes, not books, based on the grant-based budget and the cost as established in the test phase.
        • Establish a routine that will meet the quota.
        • Produce 50-75 or more new texts, depending on text size, for the Corpus of ME Verse and Prose.
        • Document (and circulate documentation for) all processes to ensure consistency, continuity, repeatability, and efficiency.
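
The quota arithmetic for the production phase can be sketched as follows. This is only an illustration: the function name `page_quota` is hypothetical, the figures in the example are invented placeholders, and the real cost per page would come from the test phase.

```python
def page_quota(budget, cost_per_page, weeks):
    """Return (total affordable pages, pages needed per week).

    All inputs are assumptions supplied by the caller.
    """
    total_pages = int(budget // cost_per_page)
    # Ceiling division: the weekly target that uses the whole allowance.
    per_week = -(-total_pages // weeks)
    return total_pages, per_week

# Placeholder figures only: budget 30000, cost 2 per page, 26 weeks.
print(page_quota(30000, 2, 26))
```

Stating the quota in pages rather than books keeps it independent of text size, as the goal above suggests.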
    2. Addition of fascicles to MED from SGML source.
      • Use existing process to convert new MED production (4 fascs.) to MEC use.
      • Develop processes to ensure that all changes undergone by other e-MED fascicles are shared by the new additions (enhanced tagging, entry IDs, ambiguous stencil resolution, stencil merger, etc.).
      • Make inclusion of new fascicles as nearly automatic as possible.
      • Consult with MED about introducing MEC changes to MED rules file.
    3. Addition of outside e-texts to CME.
      • Decide if it is a good idea.
      • Develop a policy to ensure that it stays a good idea.
      • Develop tagging guidelines, etc.
      • Develop a workflow to evaluate incoming data.
      • Evaluate MEC capacity to process these texts.
      • Proceed or not.
    4. Addition of other material.
      • Continue negotiations with EETS, e-IMEV, and others.
      • Attempt to arrive at cost figures for data conversion.
      • Maintain contacts with related projects with a view toward future collaboration.
  3. INTERPRETATION/EXPANSION/MODIFICATION of content and tagging
    1. MED
      • Expand "Ibid.'s" both automatically and (as necessary) manually, using a careful "stop-stencil" list to inhibit automatic replacement of stencils that may have started life as ambiguous.
      • Expand some forms (<ORTH>); automatically as far as feasible; manually as far as affordable.
      • Expand some LANG, USG, ?and POS elements; as many as useful.
      • Experiment with other expansions: swung dash; phrases; misc. abbreviations that are not tagged presently; even reticent etymologies.
      • Re-order quotations in date sequence.
      • Respond to error reports from users; resolve backlog, including ETH/YOGH problem in "S".
      • Solicit additional reports.
      • Add functional tagging to MED: XREF/POS, XR, LBL.
      • Add IDs to entries (?and senses).
      • Add target RIDs to XREFs.
      • Test existing tagging for completeness: LANG, POS.
      • Evaluate additional tagging allowed by plateau.dtd (cost/benefit), e.g. <ETY>, <NOT-ME>, <PHRASE>, <ABBR>, etc.
      • Evaluate additional linking (cost/benefit), e.g. from phrase elements to headwords.
      • Evaluate more consistent tagging (cost/benefit); e.g. tagging <USG> elements not signalled by italics.
      • Alter plateau.dtd as needed.
      • Determine how to add the tags judged cost-beneficial, and add them.
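
The automatic "Ibid." expansion with a stop-stencil list, as described in the first item above, might look roughly like this. It is a sketch under stated assumptions: stencils are treated as plain strings in quotation order, the name `expand_ibids` is hypothetical, and the stencil values in the example are invented rather than taken from MED.

```python
def expand_ibids(stencils, stop_stencils):
    """Replace each "Ibid." with the preceding stencil where that is safe.

    Returns the expanded list plus the positions left for manual review.
    """
    expanded = []
    needs_review = []
    previous = None  # most recent stencil that is not "Ibid."
    for index, stencil in enumerate(stencils):
        if stencil.strip() != "Ibid.":
            previous = stencil
            expanded.append(stencil)
        elif previous is None or previous in stop_stencils:
            # No antecedent, or the antecedent may be ambiguous: leave it alone.
            expanded.append(stencil)
            needs_review.append(index)
        else:
            expanded.append(previous)
    return expanded, needs_review

# Invented stencils; "Text B" stands in for a stop-stencil.
result = expand_ibids(["Text A", "Ibid.", "Text B", "Ibid."], {"Text B"})
print(result)
```

Positions returned for review correspond to the cases the plan reserves for manual expansion.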
    2. HB
      • Complete review of HB batches, including new OD-col batches.
      • Expand bibliographic abbreviations (journal titles, etc.).
      • Add new material to expansion scripts as needed.
      • Alter DTDs, etc., as needed to accommodate expansions.
      • Communicate expansions to InfoRetrieval staff.
      • Respond to error reports, internal and external; resolve backlog.
      • Re-order entries in HB as necessary to accommodate 'page-turner'.
    3. MED/HB harmonization
      • Map MED stencils to existing HB stencils throughout.
      • Create new (or expand old) HB entries to account for all undocumented MED stencils.
      • Modify ambiguous or irregular MED stencils to allow safe merger.
      • Map modified stencils to HB stencils.
      • Use merger process to regularize all MED stencils to canonical HB form and create links from MED to HB.
      • Go through it all again for remaining stencils (leftovers; products of Ibid.-removal).
      • Test and refine map/merge process toward transparency and routine execution.
      • Convert MED stencil tagging to HB standard.
      • Modify HB standard to provide MED-standard display.
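
The map/merge cycle above (map, merge, then repeat for leftovers) can be sketched in a few lines. This is an assumption-laden illustration: the mapping is modelled as a plain dictionary from MED stencil text to canonical HB form, the name `merge_stencils` is hypothetical, and the stencil strings in the example are invented.

```python
def merge_stencils(med_stencils, stencil_map):
    """Regularize MED stencils to their canonical HB form.

    Returns (merged pairs, leftovers). Leftovers are the stencils with no
    mapping yet, to be handled in a later pass.
    """
    merged = []
    leftovers = []
    for stencil in med_stencils:
        key = " ".join(stencil.split())  # collapse stray whitespace before lookup
        if key in stencil_map:
            merged.append((stencil, stencil_map[key]))
        else:
            leftovers.append(stencil)
    return merged, leftovers

# Invented data: one stencil maps cleanly, one is left over.
stencil_map = {"Text A (1)": "Text A (1)", "Txt B": "Text B"}
merged, leftovers = merge_stencils(["Txt  B", "Text C"], stencil_map)
print(merged, leftovers)
```

Returning leftovers explicitly makes the "go through it all again" pass a matter of extending the map and rerunning the same function.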
    4. CME
      • Resist the temptation to expand or interpret at all.
      • Put a correction mechanism in place that responds to the output of QC processes.
  4. GENERAL INFRASTRUCTURE MAINTENANCE