CME FAQ Sheet #7: Useful Strategies and Tips

These problems and corrections were encountered in texts already completed. They are listed here in case they prove useful in other texts.

  1. Fractions in Dates
  2. Search for fraction entities in Author/Editor. (Check the Entity Set: Local & Active under "Entities/Insert Entities." The fractions listed there, e.g. "frac12," "frac34," etc., are the only ones that need checking.) Some dates have been recorded in the original texts as, for example, 133¾, meaning 1333/4. The vendor has in some cases mimicked the ¾ with a fraction entity. These should be changed to the plain form, e.g. 3/4. Some dates where fractions occur have been marked with $x$. Change these to the correct form as above.
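
For files with many such dates, the same correction could be scripted outside Author/Editor. The following is only a sketch in Python, not an existing CME tool: the function name and the entity table are assumptions for illustration, and the table should be adjusted to match what the Entity Set list actually shows. It changes a fraction entity only when it directly follows a digit, so fractions elsewhere in the text are left for manual checking.

```python
import re

# Assumed mapping from fraction entity names to plain text (hypothetical;
# extend it to match the Entity Set: Local & Active list).
FRACTIONS = {"frac12": "1/2", "frac14": "1/4", "frac34": "3/4"}

# A fraction entity that directly follows a digit, as in 133&frac34;
PATTERN = re.compile(r"(?<=\d)&(" + "|".join(FRACTIONS) + r");")


def fix_date_fractions(text):
    """Replace digit-adjacent fraction entities with their plain form."""
    return PATTERN.sub(lambda m: FRACTIONS[m.group(1)], text)


if __name__ == "__main__":
    print(fix_date_fractions("Anno 133&frac34; he paid &frac12; a mark"))
```

Here 133&frac34; becomes 1333/4, while the free-standing &frac12; is not touched.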

  3. Some Abbreviations
  4. "dj." (with a swung dash (~) above the "j") =

    <ABBR>dj</ABBR>

    "&c9." (where "9" is a character that looks more or less like a superscript "9") =

    <ABBR EXPAN="et cetera">&c.</ABBR>

    "l." (with a loop like a backward "c" crossing the upright, then looping up and back over it) =

    <ABBR EXPAN="et cetera"><GAP></ABBR>

  5. pinarg.pl
  6. A Perl script called pinarg.pl exists for removing excess paragraph tags from <ARGUMENT> elements. (It was devised for the Oseney Register, but may prove useful elsewhere.)

    1. Using Command Prompt, go to the proper directory for the .sgm file. E.g.:
      cd C:\Markup\mecorp\text\XX\toproof
      where XX is the two-letter alphabetic identifier for the file.
    2. At the prompt
      C:\Markup\mecorp\text\XX\toproof>
      run the Perl script
      perl -i.bak C:\Markup\code\perl\cmeB\pinarg.pl xxxxxxx.sgm
      where xxxxxxx is the seven-letter/digit NOTIS identifier.

    This will produce a new .sgm file with a single <P> per <ARGUMENT> and save the previous .sgm file as xxxxxxx.sgm.bak.
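
The source of pinarg.pl is not reproduced here. As a rough illustration of the transformation it is described as performing, the Python sketch below merges all paragraphs inside each <ARGUMENT> into one <P>. The function name is hypothetical, and the handling of whitespace (runs of spaces and line breaks are collapsed to single spaces) is an assumption that may differ from the real script.

```python
import re

# Each <ARGUMENT>...</ARGUMENT> element, split into open tag, body, close tag.
ARGUMENT = re.compile(r"(<ARGUMENT\b[^>]*>)(.*?)(</ARGUMENT>)", re.DOTALL)
# Opening and closing <P> tags only (the \b keeps <PB> and similar safe).
P_TAG = re.compile(r"</?P\b[^>]*>")


def single_p_per_argument(sgml):
    """Return sgml with exactly one <P> wrapping the text of each <ARGUMENT>."""
    def collapse(match):
        open_tag, body, close_tag = match.groups()
        # Drop every <P> and </P>, then tidy the whitespace left behind.
        text = " ".join(P_TAG.sub(" ", body).split())
        return f"{open_tag}<P>{text}</P>{close_tag}"
    return ARGUMENT.sub(collapse, sgml)


if __name__ == "__main__":
    sample = "<ARGUMENT><P>Of the abbey.</P>\n<P>Of its lands.</P></ARGUMENT>"
    print(single_p_per_argument(sample))
```

Unlike perl -i.bak, this sketch does not write a backup; copy the .sgm file first if you try it on real data.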

  7. Proofing Tables
  8. For a quick look at tables in order to see if they are properly handled and generally reflect what is in the original text, do the following:

    1. Copy the tables from the .sgm file in TextPad.
    2. Open a new document and paste the copy there.
    3. Change all SGML <CELL> and </CELL> tags to HTML <td> and </td> tags.
    4. Change all SGML <ROW> and </ROW> tags to HTML <tr> and </tr> tags.
    5. Make the temporary file an .html file and open it for viewing and proofing in a browser.
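
The tag changes in the steps above can also be done in one pass. This Python sketch is an illustration only; the function name is hypothetical, and it assumes that any attributes on <CELL> and <ROW> can be discarded for the purpose of a quick visual check.

```python
import re


def tables_to_html(sgml_tables):
    """Turn copied SGML table markup into a minimal HTML page for proofing."""
    # <CELL ...> and </CELL> become <td> and </td>; attributes are dropped.
    html = re.sub(r"<(/?)CELL\b[^>]*>", r"<\1td>", sgml_tables)
    # <ROW ...> and </ROW> become <tr> and </tr>.
    html = re.sub(r"<(/?)ROW\b[^>]*>", r"<\1tr>", html)
    return "<html><body>\n" + html + "\n</body></html>"


if __name__ == "__main__":
    sample = "<TABLE><ROW><CELL>Item</CELL><CELL>xij d.</CELL></ROW></TABLE>"
    print(tables_to_html(sample))
```

Save the returned text under a name ending in .html and open it in a browser, as in step 5.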

  9. labinarg.pl
  10. A Perl script called labinarg.pl exists for removing <LIST>, <LABEL>, and <ITEM> tags from <ARGUMENT> elements.

    1. Using Command Prompt, go to the proper directory for the .sgm file. E.g.:
      cd C:\Markup\mecorp\text\XX\toproof
      where XX is the two-letter alphabetic identifier for the file.
    2. At the prompt
      C:\Markup\mecorp\text\XX\toproof>
      run the Perl script
      perl -i.bak C:\Markup\code\perl\cmeB\labinarg.pl xxxxxxx.sgm
      where xxxxxxx is the seven-letter/digit NOTIS identifier.

    This will produce a new .sgm file with <LIST>, <LABEL>, and <ITEM> tags removed from <ARGUMENT>s. It will save the previous .sgm file as xxxxxxx.sgm.bak.
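
As with pinarg.pl, the script itself is not shown here. The Python sketch below illustrates the described effect under the assumption that only the tags are removed and their text content is kept; the function name is hypothetical.

```python
import re

# Whole <ARGUMENT>...</ARGUMENT> elements, including their tags.
ARGUMENT = re.compile(r"<ARGUMENT\b[^>]*>.*?</ARGUMENT>", re.DOTALL)
# Opening and closing <LIST>, <LABEL>, and <ITEM> tags.
LIST_TAGS = re.compile(r"</?(?:LIST|LABEL|ITEM)\b[^>]*>")


def strip_list_tags_in_arguments(sgml):
    """Remove list markup inside <ARGUMENT>s, leaving other lists alone."""
    return ARGUMENT.sub(lambda m: LIST_TAGS.sub("", m.group(0)), sgml)


if __name__ == "__main__":
    sample = ("<ARGUMENT><P><LIST><LABEL>1.</LABEL> <ITEM>Of kings</ITEM>"
              "</LIST></P></ARGUMENT>")
    print(strip_list_tags_in_arguments(sample))
```

Lists outside <ARGUMENT> elements pass through unchanged.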

  11. Deleting Tags in A/E
  12. For quick removal of specific tags in Author/Editor: place the cursor directly to the right of the opening tag and press Ctrl+D.

  13. italand.pl: for removing italic ands
  14. A Perl script called italand.pl exists for removing italic "and"s from Roman texts.

    1. Using Command Prompt, go to the proper directory for the .sgm file. E.g.:
      cd C:\Markup\mecorp\text\XX\toproof
      where XX is the two-letter alphabetic identifier for the file.
    2. At the prompt
      C:\Markup\mecorp\text\XX\toproof>
      run the Perl script
      perl -i.bak C:\Markup\code\perl\cmeB\italand.pl xxxxxxx.sgm
      where xxxxxxx is the seven-letter/digit NOTIS identifier.

    This will produce a new .sgm file with italic "and"s removed; it will save the previous .sgm file as xxxxxxx.sgm.bak.

  15. Using Structure View in Author/Editor to Add TYPE Attributes to <DIV>s
  16. Show Structure View (hotkey: F11) is useful not only for seeing how a document has been organized into <DIV>s and other elements, but also for adding missing TYPEs (and other attributes) to those elements more quickly than is possible when working in the complete document. The additions can be made individually by placing the cursor to the right of the <DIV> and opening Edit Attributes (hotkey: F6) under Markup. Alternatively, additions (or changes) can be made globally by opening Find and Replace (hotkey: Ctrl-F) under Find. E.g.:

    Find: <DIV2

    Replace: <DIV2 TYPE="chapter"
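
A plain find and replace like the one above also touches any <DIV2> that already has a TYPE. If that is a concern, the same change could be scripted with a check first. The Python sketch below is an illustration only (the function name and parameters are hypothetical); it adds TYPE only to opening tags that lack one and keeps any other attributes.

```python
import re


def add_div_type(sgml, level, type_value):
    """Add TYPE="type_value" to each <DIVn> opening tag that has no TYPE."""
    # Match an opening <DIVn ...> tag and capture its attribute text.
    pattern = re.compile(rf"<DIV{level}\b([^>]*)>")

    def add_type(match):
        attrs = match.group(1)
        if re.search(r"\bTYPE\s*=", attrs):
            return match.group(0)  # already typed; leave untouched
        return f'<DIV{level} TYPE="{type_value}"{attrs}>'

    return pattern.sub(add_type, sgml)


if __name__ == "__main__":
    print(add_div_type('<DIV2 N="3"><HEAD>Cap. iii</HEAD></DIV2>', 2, "chapter"))
```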

  17. Find and Replace for Numbers: Basic Tool for TextPad
  18. In TextPad, to find and replace globally an item that includes a sequence of digits, capture the digits with a regular expression such as N="\([0-9]+\)" and reuse them in the replacement as \1. E.g.:

    Find: N="\([0-9]+\)"

    Replace: N="\1a"

    The replacement appends the letter "a" to each matched number, so that N="12" becomes N="12a".
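
For comparison, the same capture-and-reuse idea can be sketched in Python. Note one difference in syntax: the TextPad pattern above writes the group as \( \), whereas Python's re module uses bare parentheses. The function name and the suffix parameter are hypothetical.

```python
import re


def suffix_numbers(sgml, suffix="a"):
    """Append suffix to the digits of every N="..." attribute value."""
    # ([0-9]+) captures the run of digits; the replacement reuses it.
    return re.sub(r'N="([0-9]+)"',
                  lambda m: f'N="{m.group(1)}{suffix}"',
                  sgml)


if __name__ == "__main__":
    print(suffix_numbers('<DIV1 N="12"><P N="3">'))
```

Values that are not purely digits, such as N="12a", do not match and are left as they are, so running the function twice does not add a second suffix.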

  19. <GAP>s and <ABBR>s
  20. Use the <GAP> element to indicate printed marks that are otherwise unpresentable. The DESC attribute is available to describe the marks but in general is used only to name unpresentable languages, e.g., Greek, Hebrew, etc. Thus passages in Greek may be marked as <GAP DESC="Greek">.

    Use the <ABBR> element to indicate abbreviations. (These may include printed marks that are otherwise unpresentable, and which may be indicated with a <GAP> within the <ABBR> and </ABBR> tags.) The EXPAN attribute is available to spell out the unabbreviated word. Some examples:

    1. The mark for "dram" resembles the number "3". This should be presented as:
      <ABBR EXPAN="[dram]"><GAP></ABBR>
    2. A word in a text reads "malis". In a note to the line the word is given as "mal" with an additional unidentified mark. The word in the note should be presented as:
      <ABBR>mal</ABBR>
    3. Common abbreviations such as Ihc, Ihu, Jhc, and other variants should be presented as:
      <ABBR>Ihc</ABBR>
