CHAPTER THREE

File manipulation

File manipulation in Software Tools typically follows the same pattern as standard input/output manipulation.

Basic file I/O

message reports a message to the user. Rather than reusing the functions above, I dug into the Prelude.

message s = putStrLn s

getcf is the equivalent of getc, operating on a file handle, IO.Handle.

getcf :: Handle -> IO (Maybe Char)
getcf h = do
          eof <- hIsEOF h
          if eof
             then return Nothing
             else do
                  c <- hGetChar h
                  return (Just c)

Likewise, putcf is the equivalent of putc, and an alias for hPutChar.

putcf = hPutChar

Somewhat like getcf, getline reads a file handle, returning Nothing if end-of-file has been reached and Just a string if there is input. The newline is tentatively included with the line.

Although I have heretofore avoided doing so, it is perfectly OK to call getc or getcf after the end of the file: the isEOF check ensures that it never tries to read past the end of the file or do anything else undefined. That is, assuming isEOF is acceptable to call repeatedly after it returns true. In any case, doing so is necessary since the first call to getline that reaches the end of a file will return the current line, while the second will return Nothing.

getline :: Handle -> IO (Maybe String)
getline h = getline' ""
    where
    getline' line = do
                    ch <- getcf h
                    case ch of Nothing -> if length(line) == 0
                                          then return Nothing
                                          else return (Just line)
                               Just '\n' -> return (Just (line ++ "\n"))
                               Just c -> getline' (line ++ [c])

An alternative getline, with better performance, would look like:

getline :: Handle -> IO (Maybe String)
getline h = do
            eof <- hIsEOF h
            if eof
               then return Nothing
               else do
                    l <- hGetLine h
                    return (Just (l++"\n"))

(Keep in mind, the original definition of getline is used when performance is discussed later. More improvement could be gained by not using Strings.)
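As a middle ground between the two definitions above, here is a minimal sketch that still reads one character at a time but avoids appending with (++) on every character. The name getlineAcc is hypothetical; it is meant to behave like the first getline: the newline is kept, a final unterminated line is returned once, and every later call returns Nothing.

```haskell
import System.IO

-- Hypothetical variant of the character-at-a-time getline. Each character
-- is consed onto the front of an accumulator, which is reversed once at
-- the end of the line, instead of appending with (++) per character.
getlineAcc :: Handle -> IO (Maybe String)
getlineAcc h = go []
  where
    go acc = do
      eof <- hIsEOF h
      if eof
        -- At end of file: report a pending partial line, else Nothing.
        then return (if null acc then Nothing else Just (reverse acc))
        else do
          c <- hGetChar h
          if c == '\n'
            then return (Just (reverse ('\n' : acc)))
            else go (c : acc)
```

Calling getlineAcc again after it has returned Nothing simply repeats the hIsEOF check and returns Nothing again.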

The inverse of getline is putstr. (More or less; it does not handle the newline itself.)

putstr :: Handle -> String -> IO ()
putstr h = mapM_ (putcf h)

mustopen opens a file, reporting a somewhat-useful error message if the operation fails.

mustopen :: String -> IOMode -> IO Handle
mustopen fn mode = openFile fn mode `catch` printError
    where
    printError = error $ fn ++ ": cannot open file"

The inverse of mustopen is an alias for hClose.

close = hClose

Higher-level operations

Aside from mustopen and close, the operations above provide a low-level, imperative interface to file operations. Haskell's higher-level alternatives are the lazy input functions getContents and hGetContents (the latter needing a Handle). getContents' type is IO String; it is an IO action that returns a string lazily read from the standard input. Likewise, hGetContents reads from a file handle. Both of these functions require some care to use correctly, since the file contents are read lazily but other file manipulations such as hClose take effect immediately. (See the discussion in the next chapter.)

File comparison

Given the basic file I/O operations, the first tool is file comparison: showing the differences between two files.

PROGRAM

  compare - compare files for equality

USAGE

  compare file1 file2

FUNCTION

  compare performs a line-by-line comparison of file1 and file2,
  printing each pair of differing lines, preceded by a line
  containing the offending line number and a colon.  If the files are
  identical, no output is produced.  If one file is a prefix of the
  other, compare reports end of file on the shorter file.

EXAMPLE

  compare old new

BUGS

  compare can produce voluminous output for small differences.

The first step is to identify the files to be compared from the program's command line arguments. Anything other than exactly two arguments produces a usage message and an error; failure to open either file also reports an error.

parseargs :: IO (String,String)
parseargs =
    do
      args <- getArgs
      case args of (f1:f2:[]) -> return (f1,f2)
                 _ -> error "usage: compare file1 file2"

diffmsg accepts a line number and two lines from the input files. Unlike the original, I have made diffmsg responsible for determining whether line1 and line2 differ as well as generating the output. This simplifies the program's body.

diffmsg :: Int -> String -> String -> String
diffmsg ln l r | l /= r = (show ln) ++ ":\n" ++ l ++ "\n" ++ r ++ "\n"
diffmsg _  _ _          = ""

The compare program puts together the operations by parsing the command line arguments, comparing the two file contents, and reporting whether one file ended before the other. compare' runs through the files' contents, calling diffmsg to compare lines and returning the appropriate message.

(The standard function compare prevents compare' from using the obvious name.)

compare' :: Int -> [String] -> [String] -> String
compare' _ [] (_:_)       = "compare: end of file on file1\n"
compare' _ (_:_) []       = "compare: end of file on file2\n"
compare' _ [] []          = ""
compare' ln (l:ls) (r:rs) = diffmsg ln l r ++ compare' (ln+1) ls rs
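To see how the clauses interact, the comparison can be tried on in-memory text without opening any files. The sketch below restates diffmsg and compare' so that it compiles on its own; compareTexts is a hypothetical helper that is not part of the program.

```haskell
-- Restated from the text so that this block is self-contained.
diffmsg :: Int -> String -> String -> String
diffmsg ln l r
  | l /= r    = show ln ++ ":\n" ++ l ++ "\n" ++ r ++ "\n"
  | otherwise = ""

compare' :: Int -> [String] -> [String] -> String
compare' _ [] (_:_)       = "compare: end of file on file1\n"
compare' _ (_:_) []       = "compare: end of file on file2\n"
compare' _ [] []          = ""
compare' ln (l:ls) (r:rs) = diffmsg ln l r ++ compare' (ln + 1) ls rs

-- Hypothetical helper: compare two texts held in memory.
compareTexts :: String -> String -> String
compareTexts a b = compare' 1 (lines a) (lines b)
```

For example, compareTexts "a\nb\n" "a\nc\nd\n" reports the differing pair at line 2 and then end of file on file1, since the first text runs out while the second still has a line left.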

main = do
       (file1, file2) <- parseargs
       infile1 <- mustopen file1 IO.ReadMode
       infile2 <- mustopen file2 IO.ReadMode
       text1 <- hGetContents infile1
       text2 <- hGetContents infile2
       putStr $ compare' 1 (lines text1) (lines text2)
       hClose infile1
       hClose infile2

File inclusion

Pascal, at least the portable subset used by K&P, requires the entire program to be in one source file. This is clearly suboptimal. In the spirit of RATFOR (from the original Software Tools), they built tools to act as extensions to Pascal, making programming easier.

Many of the main tools of Software Tools are the subroutines used to build the programs, which are frequently reused. To support reuse, K&P introduce include, a filter which implements the #include directive. (Ultimately, this is a predecessor of the C preprocessor, as well as several later macro tools.)

PROGRAM

  include - include copies of subfiles

USAGE

  include

FUNCTION

  include copies its input to its output unchanged, except that each
  line beginning

    #include "filename"

  is replaced by the contents of the file whose name is filename.  included
  files may contain further #include lines, to arbitrary depth.

EXAMPLE

  To piece together a Pascal (or Haskell) program such as include:

    #include "include.p"

BUGS

  A file that includes itself will not be diagnosed, but will
  eventually cause something to break.

The include program is:

include :: IO ()
include = finclude stdin

finclude is an IO action, since it processes a file handle and potentially recurses on subsequently opened file handles:

finclude :: Handle -> IO ()
finclude h =
    do
      text <- hGetContents h
      mapM_ doLine $ lines text
    where
      doLine :: String -> IO ()
      doLine line = if first == "#include"
                    then do
                          h' <- mustopen (dequote second) IO.ReadMode
                          finclude h'
                          hClose h'
                    else putStrLn line
          where
            (first, second) = twoWords $ words line

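            -- twoWords receives the list produced by words, not a String
            twoWords :: [String] -> (String, String)
            twoWords (a:b:_) = (a,b)
            twoWords _ = ("","")

            -- strip surrounding quotation marks; a lone '"' is left alone
            dequote :: String -> String
            dequote ('"':s) | not (null s) && last s == '"' = init s
            dequote s = s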

finclude gets the contents of the input and processes each line of the text. If the line begins "#include word ...", the line represents an include directive, and causes the file named by word to be opened and finclude to recurse on it. (The word can be quoted; quotation marks are stripped off if present.) Otherwise, doLine prints the line to its standard output.

If doLine were "String -> IO String" and returned the string read from the included file, that string would be corrupted by the hClose. Instead it uses an IO action to consume the string before closing the file.
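The recognition of an include directive described above is pure, so it can be separated from the I/O and tried on its own. The following is a minimal sketch; includeTarget is a hypothetical name, and the guard against an empty remainder in dequote is my own addition.

```haskell
-- Strip one pair of surrounding quotation marks, if present.
dequote :: String -> String
dequote ('"':s) | not (null s) && last s == '"' = init s
dequote s = s

-- Hypothetical helper: if the line is an include directive, return the
-- name of the file to include; otherwise return Nothing.
includeTarget :: String -> Maybe FilePath
includeTarget line =
  case words line of
    ("#include" : name : _) -> Just (dequote name)
    _                       -> Nothing
```

Because the line is split with words, a file name containing blanks cannot be expressed, quoted or not.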

File concatenation

Another classic program, copying a number of files to standard output.

PROGRAM

  concat - concatenate files

USAGE

  concat file ...

FUNCTION

  concat writes the contents of its file arguments in turn to
  its output, thus concatenating them into one larger file.
  Since concat performs no reformatting or interpretation of
  the input files, it is useful for displaying the contents of
  a file.

EXAMPLE

  To examine a file:

    concat file

My first attempt at code for concat looked something like:

-- main = do
--        args <- getArgs
--        mapM_ concat' args
--     where
--     concat' fn = do
--                  fd <- mustopen fn IO.ReadMode
--                  fcopy fd stdout
--                  close fd

However, the file printer program below shares the same basic layout. In fact, the original states:

The actual code for [the print program] is identical to [the concat program] except for calling fprint instead of fcopy.

so the common parts were factored out into processfiles. The current code uses that higher-order function.

This is one area where a higher-order language shines. The processfiles function is similar to the Template Method pattern in object-oriented design: it provides a socket to plug functionality into. processfiles becomes a main program that collects the command line arguments, assuming they are file names, and calls an argument function, fcn, on each file name and a file descriptor opened on it. With no arguments, it calls fcn once with an empty name and stdin.

processfiles :: (String -> Handle -> IO ()) -> IO ()
processfiles fcn = do
                   args <- getArgs
                   if (length args) > 0
                      then mapM_ processfile args
                      else fcn "" stdin
    where
    processfile fn = do
                     fd <- mustopen fn IO.ReadMode
                     fcn fn fd
                     close fd
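The same plug-in structure can be sketched with the argument list passed explicitly, which makes it easy to try from GHCi. This is an illustration, not the program's code: processNames and copyOut are hypothetical names, and withFile from System.IO stands in for the mustopen/close pair.

```haskell
import System.IO

-- Hypothetical variant of processfiles that takes the file names as a
-- parameter instead of calling getArgs.
processNames :: (String -> Handle -> IO ()) -> [FilePath] -> IO ()
processNames fcn []    = fcn "" stdin          -- no names: use standard input
processNames fcn names = mapM_ one names
  where
    -- withFile closes the handle when the plugged-in action finishes,
    -- even if that action raises an exception.
    one fn = withFile fn ReadMode (fcn fn)

-- A plug-in in the style of concat: copy the handle to standard output.
copyOut :: String -> Handle -> IO ()
copyOut _ h = hGetContents h >>= putStr
```

For instance, processNames copyOut ["a.txt", "b.txt"] would print the two files in turn, assuming both exist. The plugged-in action must consume the contents before it returns, because the handle is closed immediately afterwards.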

The fcopy utility function is a version of copy that can be provided with the input and output filehandles; it is used to provide the concat-guts to processfiles.

fcopy :: Handle -> Handle -> IO ()
fcopy hin hout = hGetContents hin >>= hPutStr hout

As a result, the main function of concat is very simple.

main = processfiles (\name handle -> fcopy handle stdout)

This implementation of fcopy uses the monadic bind operator (>>=) to shorten the equivalent do-block:

-- fcopy hin hout = do { text <- hGetContents hin; hPutStr hout text }

A third alternative implementation, based largely on the typings and higher-order equational thinking, is:

-- openstrings :: IO [String] -> IO [IO Handle]
-- openstrings = liftM (map ((flip mustopen) IO.ReadMode))
--
-- filehandles :: IO [IO Handle]
-- filehandles = openstrings getArgs
--
-- copynclose :: IO Handle -> IO (IO ())
-- copynclose = liftM (\h -> do { fcopy h stdout; close h })
--
-- copynclosehandles :: IO [IO Handle] -> IO [IO ()]
-- copynclosehandles = liftM (map (join . copynclose))
--
-- main = join ((liftM sequence_) (copynclosehandles filehandles))

  • openstrings constructs a function that maps a list of strings to a list of open filehandles, then lifts that function into the IO monad. (A good bit of the complexity involves the fact that the result is an action that returns a list of actions opening file handles.)
  • filehandles applies openstrings to the standard getArgs.
  • copynclose builds a function which takes a file handle, copies the contents to stdout, then closes it. Then, it lifts that function into the IO monad, resulting in a function that takes an action returning a file handle and produces an action which returns a void IO action.
  • copynclosehandles uses join to fix the nested actions of copynclose, and maps the resulting function across a list of IO Handles. Then, it lifts that function into the IO monad.
  • Finally, main applies copynclosehandles to filehandles, resulting in an action that produces a list of actions. Then sequence_ (which deals with lists of actions) is lifted into the IO monad and applied, and join is used to fix the resulting nesting.

After looking at this alternative (and seeing it work), and looking at the original Pascal (and the Haskell translation), I am now driving a truck for a living.

A fourth, less radical, alternative is:

-- openstrings :: [String] -> [IO Handle]
-- openstrings = map ((flip mustopen) IO.ReadMode)
--
-- copynclose :: Handle -> IO ()
-- copynclose h = do { fcopy h stdout; close h }
--
-- main = do
--        args <- getArgs
--        handles <- sequence (openstrings args)
--        sequence_ (map copynclose handles)

It is still pretty weird.

File printer

Printing (somewhat) prettily a number of files to stdout really is fairly similar to concatenating them.

PROGRAM

  fileprint - print files with headings

USAGE

  fileprint [ file ... ]

FUNCTION

  fileprint copies each of its argument files in turn to its
  output, inserting page headers and footers and filling the
  last page of each file to full length.  A header consists of
  two blank lines, a line giving the filename and page number,
  and two more blank lines; a footer consists of two blank
  lines.  Pages for each file are numbered starting at one.  If
  no arguments are specified, fileprint prints its standard
  input; the file name is null.

  The text of each file is unmodified---no attempt is made to
  fold long lines or expand tabs to spaces.

EXAMPLE

  fileprint print.p fprint.p

fprint lazily reads the contents of the incoming handle, then prints each page by first printing the page header followed by a page's worth of text followed by the footer; it recurses to collect all of the incoming text.

fprint :: String -> Handle -> IO ()
fprint name handle = do
                     text <- hGetContents handle
                     putStr $ onePage 1 $ lines text
    where
    onePage :: Integer -> [String] -> String
    onePage _ [] = ""
    onePage p lns = head ++ unlines this_page ++ foot ++ onePage (p+1) rest
        where
        (this_page,rest) = splitAt page_lines lns
        head = heading p
        foot = footing (footer_sz + page_lines - (length this_page))

    -- heading: a simple page heading
    heading pn = margin1 ++ name ++ " Page " ++ (show pn) ++ "\n" ++ margin2
        where
        margin1 = replicate fp_margin1 '\n'
        margin2 = replicate fp_margin2 '\n'

    -- footer: a simple page footer
    footing lns = replicate lns '\n'

    -- formatting constants
    page_lines = fp_bottom - (fp_margin1 + fp_margin2 + 1)
    footer_sz  = fp_pagelen - fp_bottom
    fp_margin1 = 2
    fp_margin2 = 2
    fp_bottom  = 64
    fp_pagelen = 66

Given fprint and processfiles, the main function is as simple as concat's.

main = processfiles fprint
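The page arithmetic is easier to check in a pure function. With the constants above, each page carries 64 - (2 + 2 + 1) = 59 lines of text, and every page, including the last, is padded to 66 lines. The sketch below is a pure counterpart of fprint under those constants; paginate and the camel-case constant names are hypothetical.

```haskell
-- Formatting constants, copied from the text.
fpMargin1, fpMargin2, fpBottom, fpPagelen :: Int
fpMargin1 = 2
fpMargin2 = 2
fpBottom  = 64
fpPagelen = 66

-- Lines of text per page: 64 - (2 + 2 + 1) = 59.
pageLines :: Int
pageLines = fpBottom - (fpMargin1 + fpMargin2 + 1)

-- Blank lines after the text area: 66 - 64 = 2.
footerSz :: Int
footerSz = fpPagelen - fpBottom

-- Hypothetical pure counterpart of fprint: lay out already-split lines.
paginate :: String -> [String] -> String
paginate name = go (1 :: Int)
  where
    go _ []  = ""
    go p lns = heading p ++ unlines thisPage ++ footing ++ go (p + 1) rest
      where
        (thisPage, rest) = splitAt pageLines lns
        -- A short last page gets extra blank lines to fill it out.
        footing = replicate (footerSz + pageLines - length thisPage) '\n'
    heading p = replicate fpMargin1 '\n'
                ++ name ++ " Page " ++ show p ++ "\n"
                ++ replicate fpMargin2 '\n'
```

A 60-line input therefore produces two full pages: 59 lines on the first, and one line plus padding on the second.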

Creating files dynamically

The distinction between open and create in the original is due to system dependencies. I have used mustopen for mustcreate as well.

mustcreate = mustopen

Putting it all together: archive

archive acts as a simple file archiver: it collects a group of files into a single archive file. This is the first fairly complex application in Software Tools.

PROGRAM

  archive - maintain file archive

USAGE

  archive -cmd aname [ file ... ]

FUNCTION

  archive manages any number of member files in a single file, aname,
  with sufficient information that members may be selectively added,
  extracted, replaced, or deleted from the collection.  -cmd is a code
  that determines the operation to be performed:

    -c  create a new archive with named members
    -d  delete named members from the archive
    -p  print named members on standard output
    -t  print table of archive contents
    -u  update named members or add at end
    -x  extract named members from archive

  In each case, the "named members" are the zero or more filenames
  given as arguments following aname.  If no arguments follow, then
  the "named members" are taken as all of the files in the archive,
  except for the delete command -d, which is not so rash.  archive
  complains if a file is named twice or cannot be accessed.

  The -t command writes one line to the output for each named member,
  consisting of the member name and a string representation of the
  file length, separated by a blank.

  The create command -c makes a new archive containing the named
  files.  The update command -u replaces existing named members and
  adds new files onto the end of an existing archive.  Create and
  update read from, and extract writes to, files whose names are the
  same as the member names in the archive.  An intermediate version of
  the new archive file is first written to the file artemp; hence this
  filename should be avoided.

  An archive is a concatenation of zero or more entries, each
  consisting of a header and an exact copy of the original file.  The
  header format is

    -h- name length

EXAMPLE

  To replace two files in an existing archive, add a new one, then print the
  table of contents:

    archive -u archfile old1 old2 new1
    archive -t archfile

K&P describe "left corner" construction in this section.

The idea is to nibble off a small, manageable corner of the program---a part that does something useful---and make that work. Once it does, more and more pieces are added until the whole thing is done....

The beauty of left-corner construction is that the program does some part of its job very early in the game. By implementing the most useful functions first, you get an idea of how valuable the program will be before investing any time in the difficult or esoteric services....

Working through this book is bringing back memories. I read the original Software Tools early in my career. (No, a great while after it was originally printed. I'm not that old.) It had a great influence on how I prefer to do things, and this is one of those instances. I do not remember reading this passage, and I will have to look it up in the RATFOR copy to make sure it is there, but this very well describes how I prefer to build systems: get something useful working, then elaborate it. Move from one working state to the next. Get feedback on how it is working.

The original Software Tools philosophy was agile before Agile was cool.

Tool functions

These functions provide a higher level of file manipulation useful to the archiver, including:

  • file renaming,
  • removal,
  • sizing,
  • moving within a file, and
  • copying part of a file to another file.

As the original discusses, it would be more convenient to use a filesystem operation to rename a file in order to move it, rather than copying it. The original does not do this in the name of portability, and I continue to replicate K&P's functionality:

fmove :: String -> String -> IO ()
fmove n1 n2 = do
              h1 <- mustopen n1 IO.ReadMode
              h2 <- mustcreate n2 IO.WriteMode
              fcopy h1 h2
              close h2
              close h1

This function could be implemented using copyFile from System.Directory, but it is likely that copyFile is not terribly different.

The remove function is implemented by removeFile from System.Directory, since there is no alternative based on the operations already seen.

remove :: String -> IO ()
remove = removeFile

fsize would normally be "a primitive, a service of the local file system," such as hFileSize. However, an implementation is interesting:

fsize :: String -> IO Int
fsize file = do
             fd <- mustopen file IO.ReadMode
             sz <- fsize' fd 0
             close fd
             return sz
    where
    fsize' fd n = do
                  ch <- getcf fd
                  case ch of Nothing -> return n
                             Just _  -> fsize' fd $! n + 1

This fsize implementation uses the low-level, imperative operations defined earlier. Using a higher-level IO operation such as hGetContents would lead to problems:

-- fsize :: String -> IO Int
-- fsize file =
--     do
--       fd <- mustopen file IO.ReadMode
--       text <- hGetContents fd
--       close fd
--       return $ length text

The specific problem is that nothing ever forces the file to be read (and the length computed) before the filehandle is closed.
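One way to repair the lazy version, sketched here under the assumption that Control.Exception is available, is to force the length before closing the handle. The name fsizeForced is hypothetical, and plain openFile and hClose stand in for mustopen and close.

```haskell
import Control.Exception (evaluate)
import System.IO

-- Hypothetical repair of the hGetContents-based fsize: evaluate forces
-- the length, and therefore the whole read, before the handle is closed.
fsizeForced :: FilePath -> IO Int
fsizeForced file = do
  fd   <- openFile file ReadMode
  text <- hGetContents fd
  n    <- evaluate (length text)   -- the file is fully read at this point
  hClose fd
  return n
```

Like the getcf-based fsize, this counts characters as delivered by the text-mode handle, which is the count that fskip and acopy consume.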

Given a file handle fd open for reading, fskip reads and skips n characters. This function is useful to skip through an archive file, bypassing uninteresting contents, since the length of each contained file is known from its header.

fskip :: Handle -> Int -> IO ()
fskip fd n = fskip' n
    where
      fskip' n | n == 0    = return ()
               | otherwise = do
                             ch <- getcf fd
                             case ch of Nothing -> return ()
                                        Just _ -> fskip' $ n-1

acopy copies at most size characters from src to dst, ending when size runs out or the end of src is reached. This is useful to copy a file from one archive to another, or to recover a file from the archive.

acopy :: Handle -> Handle -> Int -> IO ()
acopy src dst size | size <= 0 = return ()
                   | otherwise = do
                                 ch <- getcf src
                                 case ch of Nothing -> return ()
                                            Just c -> do
                                                      putcf dst c
                                                      acopy src dst $ size-1

Command line handling

Printing a usage message:

help = error "usage: archive -[cdptux] archname [ files ... ]"

Commands are represented by a data type, separating the command line parsing (which identifies the command) from the rest of the archive code, which executes it.

data Command = Create          [ String ]                        -- -c
             | Delete          [ String ]                        -- -d
             | Print           [ String ]                        -- -p
             | TableOfContents [ String ]                        -- -t
             | Update          [ String ]                        -- -u
             | Extract         [ String ]                        -- -x

The command line processing is rudimentary. It only recognizes single commands with hyphens: '-c', '-d', etc.

getcommand (cmd:arch:files) =
    if (length (nub files)) < (length files)
       then error "archive: duplicate file name"
       else if cmd == "-c" then (arch, Create files)
       else if cmd == "-d" then (arch, Delete files)
       else if cmd == "-p" then (arch, Print files)
       else if cmd == "-t" then (arch, TableOfContents files)
       else if cmd == "-u" then (arch, Update files)
       else if cmd == "-x" then (arch, Extract files)
       else help
getcommand _ = help
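Since getcommand is pure apart from its calls to error, the same parsing can be sketched with the failure made explicit in the type, which makes it simple to test. This is an alternative illustration, not the code the archiver uses: parseCommand, usage, and the lookup table are hypothetical.

```haskell
import Data.List (nub)

data Command = Create [String] | Delete [String] | Print [String]
             | TableOfContents [String] | Update [String] | Extract [String]
  deriving (Eq, Show)

usage :: String
usage = "usage: archive -[cdptux] archname [ files ... ]"

-- Hypothetical variant of getcommand: Left carries the message that the
-- original passes to error; Right carries the archive name and command.
parseCommand :: [String] -> Either String (String, Command)
parseCommand (cmd:arch:files)
  | length (nub files) < length files = Left "archive: duplicate file name"
  | otherwise =
      case lookup cmd table of
        Just mk -> Right (arch, mk files)
        Nothing -> Left usage
  where
    -- Each flag maps to the constructor that wraps the file list.
    table = [ ("-c", Create), ("-d", Delete), ("-p", Print)
            , ("-t", TableOfContents), ("-u", Update), ("-x", Extract) ]
parseCommand _ = Left usage
```

A main function built on this would pattern match on the Either and report the Left message itself, rather than relying on error.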

Temporary file handling

archive uses a temporary file when performing operations which update an archive. This temporary is "artemp" in the current directory. This is not at all safe, and really should use some approved temporary filename. But, it is what the original does, so....

tempfile = "artemp"

The standard library's bracket creates an IO action which executes the first IO action argument, capturing the result, then passes that result to the third and second IO action arguments, in that order. It ensures that the second argument is executed even if the third fails. with_temp originally used the standard library's bracket to create a function which takes an archive file name and a transformation, and performs that transformation on the archive file in a moderately safe manner:

-- with_temp :: String -> (Handle -> IO ()) -> IO ()
-- with_temp arch act = bracket opentemp closetemp action
--     where
--     opentemp = mustcreate tempfile IO.WriteMode
--     action tfd = do
--                  act tfd
--                  fmove tempfile arch
--     closetemp tfd = do
--                     close tfd
--                     remove tempfile

The problem is that the call to fmove attempted to re-open an already open file, tempfile. That fails on Windows. So, I created a custom version of bracket to do the same basic operation, but safely do the move after a successful result.

This function is one of the keys to the archiver. It encapsulates the operation of creating, operating on, and closing the temporary file, as well as updating the archive; all of the operations below (create, update, delete) which change the archive are based on it.

with_temp :: String -> (Handle -> IO ()) -> IO ()
with_temp arch action = do
                        tfd <- mustcreate tempfile IO.WriteMode
                        res <- try (action tfd)
                        close tfd
                        case res of
                                 Right r -> do
                                            fmove tempfile arch
                                            remove tempfile
                                            return r
                                 Left e -> do
                                           remove tempfile
                                           ioError e
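With the current Control.Exception module, try needs a type annotation saying which exceptions to catch. The following is a hedged restatement of with_temp under that assumption. The name withTemp is hypothetical, the temporary file name is passed as a parameter rather than fixed, and copyFile from System.Directory stands in for fmove.

```haskell
import Control.Exception (IOException, throwIO, try)
import System.Directory (copyFile, removeFile)
import System.IO

-- Hypothetical restatement of with_temp. Only IOException is caught; any
-- other exception propagates and leaves the temporary file behind.
withTemp :: FilePath -> FilePath -> (Handle -> IO ()) -> IO ()
withTemp temp arch action = do
  tfd <- openFile temp WriteMode
  res <- try (action tfd) :: IO (Either IOException ())
  hClose tfd                        -- close before the file is re-opened
  case res of
    Right () -> copyFile temp arch >> removeFile temp
    Left e   -> removeFile temp >> throwIO e
```

The archive is replaced only after the action has succeeded and the temporary file has been closed; on failure the archive is left untouched and the exception is re-thrown.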

Archive utilities

Create a file header:

makehdr :: String -> IO String
makehdr file = do
               size <- fsize file
               return (makehdr' file size)

makehdr' :: String -> Int -> String
makehdr' file size = "-h- " ++ file ++ " " ++ (show size) ++ "\n"

Read a file header from an archive:

gethdr :: Handle -> IO (Maybe (String,Int))
gethdr fd = do
            ln <- getline fd
            case ln of
                    Nothing -> return Nothing
                    Just l ->
                        case words l of
                          ["-h-", file, sz] -> return $ Just (file,read sz)
                          _ -> error "archive not in proper format"
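The header format can be exercised as a pure round trip. This sketch restates makehdr' as makeHdr and pairs it with a hypothetical parseHdr, which differs from gethdr in that it returns Nothing for a malformed line (including a size that is not a number) instead of calling error.

```haskell
-- Restatement of makehdr' from the text.
makeHdr :: String -> Int -> String
makeHdr file size = "-h- " ++ file ++ " " ++ show size ++ "\n"

-- Hypothetical pure counterpart of gethdr for a single header line.
parseHdr :: String -> Maybe (String, Int)
parseHdr l =
  case words l of
    ["-h-", file, sz] ->
      case reads sz of
        [(n, "")] -> Just (file, n)   -- the whole field must be a number
        _         -> Nothing
    _ -> Nothing
```

Because the fields are separated with words, a member name containing a blank produces a header that does not parse back, which is a limitation of the format itself.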

Add file to an archive represented by a file handle, tfd:

addfile :: Handle -> String -> IO ()
addfile tfd file = do
                   nfd <- mustopen file IO.ReadMode
                   hdr <- makehdr file
                   putstr tfd hdr
                   fcopy nfd tfd
                   close nfd

Copy a file from one archive to another. The order of the arguments is a little strained, but it puts copy_file in the same format as the operator parameter to with_archive below.

copy_file :: Handle -> String -> Int -> Handle -> IO ()
copy_file dest filename size src = do
                                   putstr dest (makehdr' filename size)
                                   acopy src dest size

filearg is true if

  1. files is empty, or
  2. file is in files.

This is an alternative, slightly strange version of list membership, but it is actually used in several archive operations.

filearg :: [String] -> String -> Bool
filearg files file = (length files) == 0 || isJust (elemIndex file files)

The alternate predicate is also useful. not_given is true if filearg is not true (this is not properly defined in the case where files is empty).

not_given :: [String] -> String -> Bool
not_given files = not . (filearg files)
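The behaviour of the two predicates, including the empty-list case, can be tabulated directly. In this sketch filearg is written with null and elem, which I believe is equivalent to the definition above; notGiven and table are hypothetical names.

```haskell
filearg :: [String] -> String -> Bool
filearg files file = null files || file `elem` files

notGiven :: [String] -> String -> Bool
notGiven files = not . filearg files

-- Hypothetical helper: tabulate both predicates for some member names.
table :: [String] -> [String] -> [(String, Bool, Bool)]
table files names = [ (n, filearg files n, notGiven files n) | n <- names ]
```

With files = ["a"], table reports ("a", True, False) and ("b", False, True). With an empty files list, filearg is True and notGiven is False for every name, so an operation that copies members selected by notGiven would copy nothing at all.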

tprint prints a table line, when printing the table of contents of the archive.

tprint :: String -> Int -> IO ()
tprint file size = putstr stdout (file ++ " " ++ (show size) ++ "\n")

notfound prints messages for files in arguments that have not been found in the archive.

notfound :: [String] -> IO ()
notfound files =
    mapM_ (\f -> message (f ++ ": not in archive")) files

Given:

  • A predicate on filenames,
  • An operation (taking a filename, file size, and file descriptor pointing to the file in the archive),
  • A "notfound" action (see notfound above),
  • The filename of an archive, and
  • A list of file names,

with_archive performs the operation on the files in the archive for which the predicate returns true. After processing, the notfound action is passed those names from the list of file names that were not seen in the archive.

with_archive is the read-only counterpart to with_temp above, for those operations which do not modify the archive.

The combination of with_archive and with_temp is exceptionally powerful: See the delete and update operations below.

with_archive ::
    (String -> Bool)                                             -- test
        -> (String -> Int -> Handle -> IO ())                    -- operator
        -> ([String] -> IO ())                                   -- notfound
        -> String                                                -- archive
        -> [String]                                              -- files
        -> IO ()
with_archive test op nf ar files = bracket open' close do_loop
    where
    open' = mustopen ar IO.ReadMode
    do_loop fd = do
                 seen <- loop fd []
                 nf (files \\ seen)
    loop fd seen = do
                   hdr <- gethdr fd
                   case hdr of
                            Nothing -> return seen
                            Just (file,size) -> do
                                                if test file
                                                   then op file size fd
                                                   else fskip fd size
                                                loop fd $ file:seen
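The traversal that loop performs, reading a header and then consuming exactly size characters, can be modelled without any I/O. The sketch below assumes the whole archive is held in a String and that sizes are counted in characters; members is a hypothetical name.

```haskell
-- Hypothetical pure model of the archive traversal: split an in-memory
-- archive into (name, contents) pairs, or report a format error.
members :: String -> Either String [(String, String)]
members "" = Right []
members s =
  let (hdr, afterHdr) = break (== '\n') s
  in case words hdr of
       ["-h-", name, sz]
         | [(n, "")] <- reads sz
         , (body, rest) <- splitAt n (drop 1 afterHdr)  -- drop 1 skips '\n'
         -> fmap ((name, body) :) (members rest)
       _ -> Left "archive not in proper format"
```

For example, members "-h- a 3\nabc-h- b 0\n" yields the member a with contents "abc" followed by an empty member b. Where this model keeps each body, the real loop either hands the handle to op or skips the body with fskip.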

Operations

The original conflates update and addition, extracting and printing. This is, to my mind, undesirable.

Further, the error checking is different: the original reports as many errors as possible before terminating, using the temporary file to avoid mangling the archive file in case of errors. This code fails with an exception on the first problem it encounters.

Archive creation

create_archive :: String -> [String] -> IO ()
create_archive arch files = with_temp arch addfiles
    where
    addfiles :: Handle -> IO ()
    addfiles tfd = mapM_ (addfile tfd) files

Archive contents

table_archive :: String -> [String] -> IO ()
table_archive arch files =
    with_archive (filearg files) printskip notfound arch files
    where
    printskip :: String -> Int -> Handle -> IO ()
    printskip file size fd = do
                             tprint file size
                             fskip fd size

Extract files

extract_archive :: String -> [String] -> IO ()
extract_archive arch files =
    with_archive (filearg files) extract notfound arch files
    where
    extract :: String -> Int -> Handle -> IO ()
    extract file size fd = do
                           dest <- mustcreate file IO.WriteMode
                           acopy fd dest size
                           close dest

Delete files

This next gets hairy. delete_archive calls with_temp to safely update the archive, using delete_files as the operation. delete_files uses with_archive to open the existing archive and process the contents. It uses not_given as the test: this reverses the normal filearg predicate and invokes copy_file on any file that is not listed in the command line files. copy_file copies the file from the existing archive to the new temporary file. When completed successfully, with_temp updates the archive from the temporary.

delete_archive arch files = if (length files) == 0
                            then error "archive: -d requires file names"
                            else with_temp arch delete_files
    where
    delete_files :: Handle -> IO ()
    delete_files tfd =
        with_archive (not_given files) (copy_file tfd) notfound arch files

Archive update

Again, with the hair.

Updating an archive, as the original observes,

breaks cleanly into two stages: replacing existing members with new versions, and adding to the end any files named as arguments but not present in the archive.

This is correct: it breaks into two stages. However, I have identified those stages somewhat differently. Rather than replacing existing members when they are named on the command line, I merely copy those members which are not listed. The second stage then becomes adding all of the listed members.

K&P's method retains the original ordering of the updated files in the archive; my method does not. However, I am willing to make that trade-off since it simplifies updating considerably.

So, here is update_archive. It uses with_temp to safely update the archive, with update_files as the action. update_files in turn uses with_archive to access the existing archive, copying the files to the new archive which are not listed on the command line. Unlike the functions above which used with_archive, update_files does not specify notfound as the listed-files-not-seen handler. Instead, it uses additions, which performs roughly the same operation as create_archive above; it adds all of the listed files to the archive.

To sum up:

  • Existing members that are not named are copied from the archive.
  • Existing members that are named are not copied.
  • All named files are added to the end of the new archive. This step updates any named, existing members as well as adding new members.

update_archive :: String -> [String] -> IO ()
update_archive arch files = with_temp arch update_files
    where
    update_files tfd =
        with_archive (not_given files)
                     (copy_file tfd)
                     (additions tfd)
                     arch files
    additions tfd _ = mapM_ (addfile tfd) files
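The effect of the two stages on member ordering can be shown with a small in-memory model. This is a sketch only: Member and updateModel are hypothetical, contents are plain strings, and the model assumes at least one file is named, since not_given is not properly defined for an empty list.

```haskell
type Member = (String, String)   -- (name, contents)

-- Hypothetical model of update_archive. 'named' pairs each file named on
-- the command line with its current contents on disk; 'archive' is the
-- existing archive in order.
updateModel :: [Member] -> [Member] -> [Member]
updateModel named archive = kept ++ named
  where
    -- Stage one: copy the existing members that were not named.
    kept = [ m | m@(name, _) <- archive, name `notElem` map fst named ]
    -- Stage two is the (++ named): every named file goes at the end.
```

Updating b and adding d in an archive holding a, b, c gives a, c, b, d: the replaced member b moves to the end instead of keeping its original position.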

Main

Finally, the main function. This has the payoff for getcommand way back up and the individual operations above, and is simply a branch to the appropriate operation.

main = do
       args <- getArgs
       let (arch, cmd) = getcommand args
       case cmd of
                Create files          -> create_archive arch files
                Delete files          -> delete_archive arch files
                Print files           -> print_archive arch files
                TableOfContents files -> table_archive arch files
                Update files          -> update_archive arch files
                Extract files         -> extract_archive arch files

gloria i ad inferni
faciamus opus

Last edited Sat Aug 8 03:29:10 2009.
Copyright © 2005-2016 Tommy M. McGuire