The following does not exhaust what one might want. However, it is a
restrained upgrade to the existing operations which provides most or
all of the functionality required by high-level users.

All of the following are scheme shell functions and can't be called
from inside co-processor functions.

================================================================
A draft specification for additional binary file operations.
================================================================

Claim: there is no good reason to allow users to overwrite positions
in the middle of files. Therefore

  -- There is no reason to open files for input/output access. It is
     sufficient to support input and output separately (with append
     and overwrite options for output).

  -- Seek operations can be restricted to INPUT file pointers, because
     there is no good reason to reposition the file pointer for an
     output file.

I can think of well-motivated examples for system-type database
applications. However, in our application areas at a high-level, it
seems more appropriate to read the file into memory, make the changes
(typically >> 1) and write it out. Bulk file I/O is very fast.

If we ever need to allow writes at random locations, it would be easy
to add a third function for opening files input/output and remove the
restriction on seek.

I. Interactions with file system

(remove-file filename)
    removes file (including symbolic links)

(move-file filename1 filename2)

(copy-file filename1 filename2)
    make sure to set right flags so this works on directories as well
    as files

(link-file filename1 filename2)
    makes filename2 a new additional name for filename1 by creating a
    symbolic link

(make-directory filename)

(remove-directory filename)
    errors if directory is not empty

(remove-directory-and-contents filename)

==> Perl manual may be helpful for names of relevant unix system calls.

(file-exists? filename)
    does filename exist (as either directory or ordinary file)?

(file-readable? filename)

(file-writable? filename)

(file-is-directory? filename)
    is filename the name of a directory?

(file-size filename)
    returns size of file, in bytes.

==> See the unix/perl stat, statx commands.

(directory-contents directoryname)
    returns a list of filenames in this directory
    error if filename is not a directory or cannot be listed

==> I'm not sure if there's a system call or if we have to read the
    directory as a file and parse it. There's an example of the latter
    in Kernighan and Ritchie.

II. Opening binary files

The function open-binary-output file is removed and replaced by the
following two functions. When combined with file-exists? (see above)
and error, scheme users should be able to create all the usual
alternative behaviors.

(open-file-for-overwrite filename)
    Opens file for output. If file already exists, prior contents are
    flushed. Returns file pointer at start of file. Errors if file
    cannot be opened.

(open-file-for-append filename)
    Opens file for output, returning a file pointer. If file already
    exists, existing contents are kept and returned file pointer
    points to end of file. Errors if file cannot be opened.

Notice that C/Unix explicitly allows two open append pointers for the
same file. The output is intermixed as it comes in from the two
pointers, which might even belong to different processes. Unlikely to
be what the users intended, but it won't upset the operating system.

III. Additional reading options for binary files

(read-binary-file filename)
    Reads entire file into a 1D integer-grid and returns the grid.

(seek-from-start file-pointer position)
(seek-from-end file-pointer position)
(seek-from-current file-pointer position)
    Moves file-pointer to specified position relative to start of
    file, end of file, or current file position. Position is a
    (signed) integer.

    It may be good (cf. the documentation for rewind in the C manual)
    for the C implementation to call clearerr(fp) after fseek, in case
    the previous location might have been past the end of the file.
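
As a rough illustration of the three seek operations, here is one way
the C side might be layered on fseek, with the clearerr call suggested
above. This is only a sketch: the C function names, the shared helper
seek_and_clear, and the 0 / -1 return convention are all assumptions
made for the example, not part of the draft.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical shared helper: reposition fp, then reset its indicators.
   Returns 0 on success and -1 if fseek reports a failure. */
static int seek_and_clear(FILE *fp, long position, int whence)
{
    assert(fp != NULL);
    if (fseek(fp, position, whence) != 0)
        return -1;
    /* A successful fseek already clears the end-of-file indicator;
       clearerr also resets the error indicator, as rewind does. */
    clearerr(fp);
    return 0;
}

/* C counterpart of (seek-from-start file-pointer position). */
int seek_from_start(FILE *fp, long position)
{
    return seek_and_clear(fp, position, SEEK_SET);
}

/* C counterpart of (seek-from-end file-pointer position). */
int seek_from_end(FILE *fp, long position)
{
    return seek_and_clear(fp, position, SEEK_END);
}

/* C counterpart of (seek-from-current file-pointer position). */
int seek_from_current(FILE *fp, long position)
{
    return seek_and_clear(fp, position, SEEK_CUR);
}
```

Routing all three through one helper keeps the clearerr policy in a
single place, so it can be changed or dropped without touching the
individual entry points.
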
    It's not clear what happens if the new position is outside the
    bounds of the file. Some obvious cases can be caught by our code,
    e.g. the input to seek-from-start must be non-negative and the
    input to seek-from-end must be non-positive. However, for other
    cases we may need to use C's error checking: catch whatever errors
    C reports and transmit them to scheme. I'm not sure if C flags an
    error when the fp moves outside the file, or only when one
    attempts to subsequently read from such an fp.

(report-current-position file-pointer)
    Returns a (non-negative) integer specifying the position of the
    file pointer relative to the start of the file. That is, it calls
    the C function ftell. This can be called on output as well as
    input file pointers.

    In case of error, C's ftell returns -1. Report-current-position
    should probably return a missing value in such cases.

================================================================
================================================================
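
The argument checks for the seek functions and the ftell handling for
report-current-position might look roughly like the sketch below. The
names checked_seek, report_current_position and the POS_* codes are
hypothetical, chosen only for this example. On the open question of
out-of-bounds seeks: as far as I know, POSIX systems let fseek move
past the end of the file and only a later read reports end-of-file,
while a position before the start of the file is rejected by fseek
itself. That is an assumption of this sketch and worth confirming on
the target platform.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical status codes to hand back to the scheme layer. */
enum { POS_OK = 0, POS_BAD_ARGUMENT = -1, POS_C_ERROR = -2 };

/* Reject the cases we can detect ourselves; otherwise defer to fseek
   and pass along whatever failure it reports. */
int checked_seek(FILE *fp, long position, int whence)
{
    assert(fp != NULL);
    if (whence == SEEK_SET && position < 0)
        return POS_BAD_ARGUMENT;   /* seek-from-start: non-negative */
    if (whence == SEEK_END && position > 0)
        return POS_BAD_ARGUMENT;   /* seek-from-end: non-positive */
    if (fseek(fp, position, whence) != 0)
        return POS_C_ERROR;        /* error reported by C */
    return POS_OK;
}

/* Stores the current offset in *out and returns 1 on success.
   Returns 0 when ftell fails, so that the scheme wrapper can produce
   a missing value instead of exposing the raw -1. */
int report_current_position(FILE *fp, long *out)
{
    long pos;

    assert(fp != NULL && out != NULL);
    pos = ftell(fp);
    if (pos == -1L)
        return 0;
    *out = pos;
    return 1;
}
```

Returning a success flag separately from the offset keeps the -1 error
value of ftell from leaking into scheme as if it were a position.
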