(       Title:  Line-File Input
         File:  lfinput.fs
       Author:  David N. Williams
      License:  LGPL
      Version:  0.5.6
Last revision:  August 21, 2021
)
\ Copyright (C) 2004, 2006, 2021 by David N. Williams
(
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or at your option any later version.

This software is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU Lesser General Public License for more details.

If you take advantage of the option in the LGPL to put a
particular version of this library under the GPL, the author[s]
would regard it as polite if you would put any direct
modifications under the LGPL as well, and include a copy of this
request near the beginning of the modified library source.  A
"direct modification" is one that enhances or extends the
library in line with its original concept, as opposed to
developing a distinct application or library which might use it.
)

\ *** PREAMBLE
(
The words in this file are mostly ANS Forth compatible up to
case dependence and [UNDEFINED].

The general idea is to support an analog of input stream parsing
for lines read from a file not directly attached to the input
stream by the normal ANS Forth include mechanism.

This implementation is based on READ-LINE, with line input into
a dedicated buffer required to be large enough for the whole
line, plus line end overhead.  An error is thrown when a line
doesn't fit with a single read.  A nice part of the
specification for READ-LINE is that it efficiently handles the
case where a buffer is shorter than a line.  That gets in the
way of the input stream analogy here, so we don't use it.

The scheme was devised after digesting the enormous "ANS
READ-LINE question" thread in comp.lang.forth, June, 1999.
)
decimal


\ *** USER PARAMETERS
(
The default values of the constants USE-PFE-PARSING,
MAX-FILE-DEPTH and MAX-/LINE may be changed by defining them
prior to loading this code.

When MAX-FILE-DEPTH has its default value one, code for nesting
the line-file input stream is omitted.  When its value is larger
than one, a file stack is allocated and code to implement the
saving and restoring of the line-file input stream gets
compiled.
)
  [UNDEFINED] USE-PFE-PARSING             \ default uses ANS definitions
  [IF] false CONSTANT USE-PFE-PARSING [THEN]

  [UNDEFINED] MAX-FILE-DEPTH  \ default is no nesting
  [IF] 1 constant MAX-FILE-DEPTH [THEN]

  [UNDEFINED] MAX-/LINE  \ non-eol chars, used for line-file input
  [IF] 256 constant MAX-/LINE [THEN]

\ The following is for SKIP, SCAN, SEPARATE, S-AFTER,
\ FIRST-WORD, and SEPARATE-WORD:
USE-PFE-PARSING [IF]
  loadm parsing  \ external module parsing-ext.c
[ELSE]
  s" parsing.fs" included 
[THEN]


\ *** WORDS
(
  Line-file input data:
  LFID  LFPOS
  LFIB  >LFIN  #LFIB

  Line-file parse area:
  LFSOURCE  LFPARSE-AREA@  LFPARSE-AREA!
  EMPTY-LFPARSE-AREA
  LFPARSE  LFPARSE-NAME

  Line-file input stream:
  LFREFILL  EMPTY-LFINPUT  REWIND-LFINPUT
  NEXT-LFINPUT-NAME  S-LFINPUT-AFTER
)

\ *** LFINSTREAM NESTING WORDS
\     (compiled if MAX-FILE-DEPTH > 1) 
(
  Compiled if USE-PFE-PARSING is false:
  FILE-S0  FILE-SP

  FILE-STACK  >FILE-STACK  FILE-STACK>
  FILE-STACK-DEPTH  FILE-SP0!
  SAVE-LFIN  RESTORE-LFIN
)

\ *** TERMINOLOGY AND NOTATION
(
"fstring":  ANS Forth string represented as [addr len]

"s", or ".s" suffix:  In stack comments, short for [addr len]
  for an fstring.

"lfin", "lfinstream":  Synonyms for line-file input or line-file
  input stream.

"eol":  End of line.

"lfib":  Line-file input buffer, a buffer with maximum logical
  size MAX-/LINE characters, actually followed by room for an
  overflow character and two possible eol characters.

"lfin position":  The character position at offset >LFIN from
  the beginning of the lfib.

"exhausted lfin":  The lfin position is just after the end of the
  text in the lfib; i.e., >LFIN = #LFIB.

The naming policy for words is intended to be consistent with
that in parsing-strings.fs and parsing-input.fs, with an extra
"LF" prefix to distinguish parsing line-file input from parsing
the ordinary input stream.
)

\ *** LINE-FILE INPUT DATA

  0 value lfid       \ current line-file id
2variable lfpos      \ line-file position
    0. lfpos 2!
  0 value >lfin      \ lfib current character offset
  0 value #lfib      \ lfib number of non-eol characters
(
Two of the extra characters ALLOTed below are for possible
READ-LINE eols.  The third is for a possible overflow character
to accommodate the "complete line" signal used by LFREFILL:

   # non-eol chars read < # requested

That is, LFREFILL requests a read of up to one more non-eol
characters than the allowed line length, and throws an error if
it actually gets it.
)
  create lfib MAX-/LINE 3 + allot


\ *** LINE-FILE PARSE AREA

: lfsource  ( -- 'lfib #lfib )  lfib #lfib ;

: lfparse-area@  ( -- unparsed.s )
(
Leave the line-file parse area portion of the lfib as an
fstring.
)
  lfsource ( &inbuf #inbuf) >lfin ( #parsed) /string ;

: lfparse-area!  ( unparsed.s -- )
(
The fstring unparsed.s is assumed to lie at the end of the lfib.
Advance the lfinstream so that unparsed.s is the line-file parse
area.
)
  ( addr len) #lfib swap - to >lfin ( addr) drop ;

: empty-lfparse-area  ( -- )
(
Exhaust the line-file input buffer.
)
  #lfib to >lfin ;

: lfparse  ( char -- parsed.s flag )
(
The fstring parsed.s starts at the current lfin position and
ends just before the first occurrence of char, or of whitespace
in case char is BL, in the line file parse area.  If char is
found, flag is true, and the lfin position is set to just after
the first instance of char.  If not found, flag is false,
parsed.s represents the initial parse-area, and the parse-area
is left exhausted.
)
  lfparse-area@ rot
  separate ( after.s parsed.s true | parsed.s false)
  IF   2swap lfparse-area! true
  ELSE empty-lfparse-area false THEN ;

: lfparse-name  ( -- word.s )
  lfparse-area@ separate-word 2swap lfparse-area! ;


\ *** LINE-FILE INPUT STREAM

: lfrefill  ( -- success.flag )
(
Set LFPOS to the current LFID file position, for a possible
SAVE-LFIN.

Use READ-LINE to read up to MAX-/LINE + 1 non-eol characters
into the lfib plus extra bytes at its end, starting at the file
position for LFID.

If READ-LINE returns a nonzero ior, throw an error.

If more than MAX-/LINE non-eol characters are read, throw an
error.

Otherwise set #LFIB to the number of characters returned by
READ-LINE [which does not include eols], and set >LFIN to zero.

The flag is false if and only if the line was both empty and not
followed by eol characters, so the end of the file was reached. 
#LFIB and >LFIN are still set [to zero].
)
  lfid file-position throw lfpos 2!
  lfib [ MAX-/LINE 1+ ] literal lfid read-line throw
  swap ( flag #chars)

\ At this point ANS says flag is false iff end of file was
\ reached before any chars were read, including eols.

  ( #chars) dup [ MAX-/LINE 1+ ] literal =
      ABORT" ***Line too long."
  ( flag #chars) to #lfib   0 to >lfin ;

: empty-lfinput  ( -- )
  lfid file-size throw
  lfid reposition-file throw ;

: rewind-lfinput  ( -- )
  0. lfid reposition-file throw ;

: next-lfinput-name  ( -- word.s )
(
Parse the next word from the lfinstream across lines.  If word.s
is empty, at most whitespace was found.  Based on Wil Baden's
NEXT-WORD.
)
  BEGIN lfparse-name dup IF EXIT THEN lfrefill
  WHILE 2drop REPEAT ;

: s-lfinput-after  ( s -- flag )
(
Advance the lfinstream to just after the next occurrence of the
string across lines, and leave true if found.  If the string is
not found before the end of the file, leave the lfinstream
positioned there with false.
)
  ( s) 2>r
  BEGIN
    lfparse-area@ 2r@ ( area.s s)
    search ( area.s false | &found #rem true)
  0= WHILE  \ not found
    ( area.s) 2drop
    lfrefill 0= IF  \ end of file
      ( s) 2r> 2drop false EXIT THEN
  REPEAT  \ found
  ( &found #rem) 2r> ( s) nip /string lfparse-area! true ;


MAX-FILE-DEPTH 1 u> [IF]  \ BEGIN LFINSTREAM NESTING
\ *** LINE-FILE STACK

USE-PFE-PARSING [IF]
(
The following loads two pfe external modules, implemented in
execution-ext.c, xstacks-ext.c.
)
  requires xstacks-pfe.fs

MAX-FILE-DEPTH create-xstack file-stack
file-stack     >x: >file-stack
file-stack     x>: file-stack>
file-stack xdepth: file-stack-depth
file-stack xsp0!: file-sp0!  \ DEBUG

[ELSE]

create file-stack  \ file stack break
    MAX-FILE-DEPTH cells allot
here constant file-s0
variable file-sp

: file-sp0!  ( -- )  file-s0 file-sp ! ;

file-sp0!

: >file-stack  ( fid -- fs: fid )
  file-sp @ dup file-stack = ABORT" ***File stack overflow."
  [ -1 cells ] literal + dup file-sp !
  ( fid [file-sp]) ! ;

: file-stack>  ( fs: fid -- s: fid )
  file-sp @ dup file-s0 = ABORT" ***File-stack underflow"
  [ 1 cells ] literal file-sp +!
  ( [file-sp].orig) @ ;

: file-stack-depth ( -- u )
  file-sp @ file-s0 - [ 1 cells ] literal / ;

[THEN]
 

\ *** SAVE/RESTORE LFINSTREAM
(
In this implementation, SAVE-LFIN ... RESTORE-LFIN leaves the
lfinstream parse area unchanged, but moves the file position to
the beginning of the parse area, including the case where the
parse area is empty.
)

: save-lfin  ( -- fs: lfid )
(
Set the LFID file position to correspond to >LFIN, and push LFID
onto the file stack.  Clear the lfib and update LFPOS.

In normal use this word would be closely followed by words that
switch to a new lfinstream, but it does leave the current
lfinstream in a valid state.
)
  >lfin s>d lfpos 2@ d+
  ( new.pos.d) 2dup lfid reposition-file throw
  lfid >file-stack
  ( new.pos.d) lfpos 2!   0 to #lfib   0 to >lfin ;

: restore-lfin  ( fs: lfid -- )
(
Pop the fid previously pushed by SAVE-LFIN and store it in LFID.
Read or start reading into the lfib the rest of the line
starting at the fid file position.  Set #LFIB to the number of
characters read, and set >LFIN to zero.

It is the user's responsibility to close the file whose fid is
written over by this word.
)
 file-stack> to lfid
 lfrefill ( flag) 0= ABORT" ***Can't restore lfinstream" ;

[THEN]  \ END LFINSTREAM NESTING