( Title: Line-File Input File: lfinput.fs Author: David N. Williams License: LGPL Version: 0.5.6 Last revision: August 21, 2021 ) \ Copyright (C) 2004, 2006, 2021 by David N. Williams ( This library is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or at your option any later version. This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. If you take advantage of the option in the LGPL to put a particular version of this library under the GPL, the author[s] would regard it as polite if you would put any direct modifications under the LGPL as well, and include a copy of this request near the beginning of the modified library source. A "direct modification" is one that enhances or extends the library in line with its original concept, as opposed to developing a distinct application or library which might use it. ) \ *** PREAMBLE ( The words in this file are mostly ANS Forth compatible up to case dependence and [UNDEFINED]. The general idea is to support an analog of input stream parsing for lines read from a file not directly attached to the input stream by the normal ANS Forth include mechanism. This implementation is based on READ-LINE, with line input into a dedicated buffer required to be large enough for the whole line, plus line end overhead. An error is thrown when a line doesn't fit with a single read. A nice part of the specification for READ-LINE is that it efficiently handles the case where a buffer is shorter than a line. That gets in the way of the input stream analogy here, so we don't use it. The scheme was devised after digesting the enormous "ANS READ-LINE question" thread in comp.lang.forth, June, 1999. ) decimal \ *** USER PARAMETERS ( The default values of the constants USE-PFE-PARSING, MAX-FILE-DEPTH and MAX-/LINE may be changed by defining them prior to loading this code. When MAX-FILE-DEPTH has its default value one, code for nesting the line-file input stream is omitted. When its value is larger than one, a file stack is allocated and code to implement the saving and restoring of the line-file input stream gets compiled. ) [UNDEFINED] USE-PFE-PARSING \ default uses ANS definitions [IF] false CONSTANT USE-PFE-PARSING [THEN] [UNDEFINED] MAX-FILE-DEPTH \ default is no nesting [IF] 1 constant MAX-FILE-DEPTH [THEN] [UNDEFINED] MAX-/LINE \ non-eol chars, used for line-file input [IF] 256 constant MAX-/LINE [THEN] \ The following is for SKIP, SCAN, SEPARATE, S-AFTER, \ FIRST-WORD, and SEPARATE-WORD: USE-PFE-PARSING [IF] loadm parsing \ external module parsing-ext.c [ELSE] s" parsing.fs" included [THEN] \ *** WORDS ( Line-file input data: LFID LFPOS LFIB >LFIN #LFIB Line-file parse area: LFSOURCE LFPARSE-AREA@ LFPARSE-AREA! EMPTY-LFPARSE-AREA LFPARSE LFPARSE-NAME Line-file input stream: LFREFILL EMPTY-LFINPUT REWIND-LFINPUT NEXT-LFINPUT-NAME S-LFINPUT-AFTER ) \ *** LFINSTREAM NESTING WORDS \ (compiled if MAX-FILE-DEPTH > 1) ( Compiled if USE-PFE-PARSING is false: FILE-S0 FILE-SP FILE-STACK >FILE-STACK FILE-STACK> FILE-STACK-DEPTH FILE-SP0! SAVE-LFIN RESTORE-LFIN ) \ *** TERMINOLOGY AND NOTATION ( "fstring": ANS Forth string represented as [addr len] "s", or ".s" suffix: In stack comments, short for [addr len] for an fstring. "lfin", "lfinstream": Synonyms for line-file input or line-file input stream. "eol": End of line. "lfib": Line-file input buffer, a buffer with maximum logical size MAX-/LINE characters, actually followed by room for an overflow character and two possible eol characters. "lfin position": The character position at offset >LFIN from the beginning of the lfib. "exhausted lfin": The lfin position is just after the end of the text in the lfib; i.e., >LFIN = #LFIB. The naming policy for words is intended to be consistent with that in parsing-strings.fs and parsing-input.fs, with an extra "LF" prefix to distinguish parsing line-file input from parsing the ordinary input stream. ) \ *** LINE-FILE INPUT DATA 0 value lfid \ current line-file id 2variable lfpos \ line-file position 0. lfpos 2! 0 value >lfin \ lfib current character offset 0 value #lfib \ lfib number of non-eol characters ( Two of the extra characters ALLOTed below are for possible READ-LINE eols. The third is for a possible overflow character to accommodate the "complete line" signal used by LFREFILL: # non-eol chars read < # requested That is, LFREFILL requests a read of up to one more non-eol characters than the allowed line length, and throws an error if it actually gets it. ) create lfib MAX-/LINE 3 + allot \ *** LINE-FILE PARSE AREA : lfsource ( -- 'lfib #lfib ) lfib #lfib ; : lfparse-area@ ( -- unparsed.s ) ( Leave the line-file parse area portion of the lfib as an fstring. ) lfsource ( &inbuf #inbuf) >lfin ( #parsed) /string ; : lfparse-area! ( unparsed.s -- ) ( The fstring unparsed.s is assumed to lie at the end of the lfib. Advance the lfinstream so that unparsed.s is the line-file parse area. ) ( addr len) #lfib swap - to >lfin ( addr) drop ; : empty-lfparse-area ( -- ) ( Exhaust the line-file input buffer. ) #lfib to >lfin ; : lfparse ( char -- parsed.s flag ) ( The fstring parsed.s starts at the current lfin position and ends just before the first occurrence of char, or of whitespace in case char is BL, in the line file parse area. If char is found, flag is true, and the lfin position is set to just after the first instance of char. If not found, flag is false, parsed.s represents the initial parse-area, and the parse-area is left exhausted. ) lfparse-area@ rot separate ( after.s parsed.s true | parsed.s false) IF 2swap lfparse-area! true ELSE empty-lfparse-area false THEN ; : lfparse-name ( -- word.s ) lfparse-area@ separate-word 2swap lfparse-area! ; \ *** LINE-FILE INPUT STREAM : lfrefill ( -- success.flag ) ( Set LFPOS to the current LFID file position, for a possible SAVE-LFIN. Use READ-LINE to read up to MAX-/LINE + 1 non-eol characters into the lfib plus extra bytes at its end, starting at the file position for LFID. If READ-LINE returns a nonzero ior, throw an error. If more than MAX-/LINE non-eol characters are read, throw an error. Otherwise set #LFIB to the number of characters returned by READ-LINE [which does not include eols], and set >LFIN to zero. The flag is false if and only if the line was both empty and not followed by eol characters, so the end of the file was reached. #LFIB and >LFIN are still set [to zero]. ) lfid file-position throw lfpos 2! lfib [ MAX-/LINE 1+ ] literal lfid read-line throw swap ( flag #chars) \ At this point ANS says flag is false iff end of file was \ reached before any chars were read, including eols. ( #chars) dup [ MAX-/LINE 1+ ] literal = ABORT" ***Line too long." ( flag #chars) to #lfib 0 to >lfin ; : empty-lfinput ( -- ) lfid file-size throw lfid reposition-file throw ; : rewind-lfinput ( -- ) 0. lfid reposition-file throw ; : next-lfinput-name ( -- word.s ) ( Parse the next word from the lfinstream across lines. If word.s is empty, at most whitespace was found. Based on Wil Baden's NEXT-WORD. ) BEGIN lfparse-name dup IF EXIT THEN lfrefill WHILE 2drop REPEAT ; : s-lfinput-after ( s -- flag ) ( Advance the lfinstream to just after the next occurrence of the string across lines, and leave true if found. If the string is not found before the end of the file, leave the lfinstream positioned there with false. ) ( s) 2>r BEGIN lfparse-area@ 2r@ ( area.s s) search ( area.s false | &found #rem true) 0= WHILE \ not found ( area.s) 2drop lfrefill 0= IF \ end of file ( s) 2r> 2drop false EXIT THEN REPEAT \ found ( &found #rem) 2r> ( s) nip /string lfparse-area! true ; MAX-FILE-DEPTH 1 u> [IF] \ BEGIN LFINSTREAM NESTING \ *** LINE-FILE STACK USE-PFE-PARSING [IF] ( The following loads two pfe external modules, implemented in execution-ext.c, xstacks-ext.c. ) requires xstacks-pfe.fs MAX-FILE-DEPTH create-xstack file-stack file-stack >x: >file-stack file-stack x>: file-stack> file-stack xdepth: file-stack-depth file-stack xsp0!: file-sp0! \ DEBUG [ELSE] create file-stack \ file stack break MAX-FILE-DEPTH cells allot here constant file-s0 variable file-sp : file-sp0! ( -- ) file-s0 file-sp ! ; file-sp0! : >file-stack ( fid -- fs: fid ) file-sp @ dup file-stack = ABORT" ***File stack overflow." [ -1 cells ] literal + dup file-sp ! ( fid [file-sp]) ! ; : file-stack> ( fs: fid -- s: fid ) file-sp @ dup file-s0 = ABORT" ***File-stack underflow" [ 1 cells ] literal file-sp +! ( [file-sp].orig) @ ; : file-stack-depth ( -- u ) file-sp @ file-s0 - [ 1 cells ] literal / ; [THEN] \ *** SAVE/RESTORE LFINSTREAM ( In this implementation, SAVE-LFIN ... RESTORE-LFIN leaves the lfinstream parse area unchanged, but moves the file position to the beginning of the parse area, including the case where the parse area is empty. ) : save-lfin ( -- fs: lfid ) ( Set the LFID file position to correspond to >LFIN, and push LFID onto the file stack. Clear the lfib and update LFPOS. In normal use this word would be closely followed by words that switch to a new lfinstream, but it does leave the current lfinstream in a valid state. ) >lfin s>d lfpos 2@ d+ ( new.pos.d) 2dup lfid reposition-file throw lfid >file-stack ( new.pos.d) lfpos 2! 0 to #lfib 0 to >lfin ; : restore-lfin ( fs: lfid -- ) ( Pop the fid previously pushed by SAVE-LFIN and store it in LFID. Read or start reading into the lfib the rest of the line starting at the fid file position. Set #LFIB to the number of characters read, and set >LFIN to zero. It is the user's responsibility to close the file whose fid is written over by this word. ) file-stack> to lfid lfrefill ( flag) 0= ABORT" ***Can't restore lfinstream" ; [THEN] \ END LFINSTREAM NESTING