Reading Binary Files with FileIO

June 12, 2002
by David Pollock

I recently had a project which required reading directly from a database stored as a binary file. At the time, it appeared that I would need an Xtra such as BinaryIO or BinaryMaster in order to even read data on disk, but it turns out that FileIO, with a little work, is capable of reading and writing data to binary files.

FileIO was built with one thing in mind: to read and write ASCII-based text files. The main reason FileIO doesn't seem suited to reading binary file data is that whenever a byte with a value of zero is encountered, it becomes ASCII character 0 -- which is the null character -- and is treated as an end-of-file marker. And because binary files tend to have plenty of zeroes in them, you won't get much data out of a file if FileIO thinks its job is done. That, at least, is what happens if you are using FileIO's readFile function.

FileIO's getLength and readChar Functions

Fortunately, FileIO contains a couple of other functions that allow us to get around this limitation. The first is the getLength function. This allows us to see exactly how many bytes are stored in a file. (In a text file, one byte represents one ASCII character. In a binary file, one byte can represent different things, but we'll worry about interpreting binary data later.)

The second FileIO function we'll be using, which will replace the readFile function, is readChar. The basic function of readChar is to read one byte -- or character -- of a file, then move to the next byte and wait. The important thing for us about readChar is that it doesn't care if it encounters a byte whose value is zero; it will just return an empty string to Lingo, and move to the next byte in the file.

Binary Data

So now we have a way to determine how many bytes are in a file, and we can read each byte of the file. The next thing we need to consider is what kind of data we are looking to retrieve from a binary file. If we try to store all of the retrieved bytes as a string in Lingo, we run into a problem similar to the one we have with the readFile function, in that all of the bytes with a zero value are equivalent to EMPTY, or "", in Lingo, and wouldn't add anything to a string variable. All of the zero-value data would be lost, which isn't helpful.

The obvious alternative to storing all of our data as a string in Lingo is to use a linear list (though there are other options, such as using an image object), and to convert each string returned from FileIO's readChar into its integer equivalent using Lingo's charToNum function. Because charToNum returns 0 for an empty string, zero-value bytes survive the conversion. The value of every byte of the file can then be appended to our linear list, where we can manipulate it later.
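
To see why charToNum suits this job, here is a quick check you could type into the Message window. It is only a sketch, and the values shown assume a single-byte character set.

```lingo
put charToNum("A")
-- 65
put charToNum(EMPTY)
-- 0
put [charToNum("A"), charToNum(EMPTY)]
-- [65, 0]
```

The empty string that readChar hands back for a zero byte therefore ends up in the list as the integer 0, rather than disappearing as it would in a string.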

Here is the resulting Lingo handler so far:

on readBinaryFile filePath
  -- Call handler using the full file path

  byteList = []
  -- This list will contain all of the byte values retrieved
  fileObj = new(xtra "fileIO")
  fileObj.openFile(filePath, 1)
  -- Open the file with read-only access

  if fileObj.status() = 0 then
    fileLength = fileObj.getLength()
    -- Find the length of the file in bytes
    repeat with index = 1 to fileLength
      byteList.append(charToNum(fileObj.readChar()))
      -- Append the value of each byte of the file to byteList
    end repeat
  end if

  fileObj.closeFile()
  -- Close the file
  fileObj = 0
  -- Clear the FileIO instance from memory
  return byteList

end

The key thing to be aware of is that the repeat loop, in conjunction with the readChar function, forces FileIO to keep moving through the file, regardless of the value of the byte it encounters. Now we have a way to read any file in its entirety.
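
As a rough illustration of working with the returned list, the sketch below reads a file and combines two adjacent bytes into a 16-bit value. The file name data.bin and the handler names getInt16LE and testReadBinaryFile are hypothetical, as is the assumption that the file stores the value low byte first; check the format you are actually reading.

```lingo
on getInt16LE byteList, pos
  -- Hypothetical helper: combine two bytes, low byte first, into one integer
  return byteList[pos] + (byteList[pos + 1] * 256)
end

on testReadBinaryFile
  -- "data.bin" is a placeholder; substitute a real file next to the movie
  byteList = readBinaryFile(the moviePath & "data.bin")
  put "Bytes read:" && byteList.count
  if byteList.count >= 2 then
    put "First 16-bit value:" && getInt16LE(byteList, 1)
  end if
end
```

For the list [1, 2], getInt16LE(byteList, 1) returns 1 + 2 * 256, or 513.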

Reading Large Files

One problem you may have thought of with the readBinaryFile handler above is that large files would return an enormous byteList variable, and would use up a lot of memory. If we knew which portion of the binary file we wanted, we could retrieve just that portion by slightly modifying the handler like this:

on readBinaryFile filePath, startByte, endByte
  -- startByte and endByte are optional, 1-based, inclusive positions
  byteList = []
  fileObj = new(xtra "fileIO")
  fileObj.openFile(filePath, 1)

  if fileObj.status() = 0 then
    fileLength = fileObj.getLength()
    
    -- Fall back to the start or end of the file for any missing parameter
    if voidP(startByte) then startByte = 1
    if voidP(endByte) then endByte = fileLength
    
    -- Only read when the requested range lies inside the file
    if (startByte >= 1) and (startByte <= endByte) and (endByte <= fileLength) then
      fileObj.setPosition(startByte - 1)
      -- FileIO positions count from zero, so byte 1 is at position 0
      repeat with index = startByte to endByte
        byteList.append(charToNum(fileObj.readChar()))
      end repeat
    end if
  end if

  fileObj.closeFile()
  fileObj = 0
  return byteList
end
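
A short usage sketch: reading only the first four bytes of a file, which is often enough to recognize a file type by its signature. The file name data.bin and the handler name showFileSignature are placeholders, not part of the example movie.

```lingo
on showFileSignature
  -- "data.bin" is a placeholder file name
  filePath = the moviePath & "data.bin"
  signature = readBinaryFile(filePath, 1, 4)
  if signature.count = 4 then
    put "First four bytes:" && string(signature)
  else
    -- An empty list means the file couldn't be opened or is too short
    put "Could not read four bytes from" && filePath
  end if
end
```

Because the handler returns an empty list when the file can't be opened or the range falls outside it, checking the count of the result is a simple way to detect failure.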

Example: Reading MP3 Tags

A real-world use of this handler is reading tags from an MP3 file. "Version 1" tags (ID3v1) are stored in the last 128 bytes of the file. The example file contains a Director movie that illustrates retrieving tags from an MP3 file (it requires the FileXtra and FileIO Xtras). (A great resource for finding out how different file types are formatted is the Programmers File Format Collection.)

This handler from the example movie uses the final bytes of the selected MP3 file to build the tags property list containing the information extracted for the song title, artist, album title, year, and genre (extracted from the pre-defined list of genres in the MP3 spec). Once the data's been read in from the file, it's really a simple matter of building a 128-character string (tagStr) from the binary data, then accessing the parts of the string that are used for each item.

on readMP3tags filePath

  fileLength = getFileSizeInBytes (filePath)
  if fileLength > 0 then
    byteList = readBinaryFile (filePath, fileLength - 127, fileLength)
    tagStr = ""
    repeat with index = 1 to byteList.count
      tagStr = tagStr & numToChar (byteList[index])
    end repeat
    
    tags = [#title: tagStr.char[4..33], #artist: tagStr.char[34..63], #album: tagStr.char[64..93], #year: tagStr.char[94..97], #genre: getGenre (charToNum (tagStr.char[128]) + 1)]
    
    repeat with index = 1 to tags.count
      memberName = string (tags.getPropAt (index))
      newText = clipString (tags[index])
      member (memberName).text = newText
    end repeat
  end if

end
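
readMP3tags relies on helper handlers from the example movie that aren't reproduced here. As a rough idea of what some of the pieces might look like, the sketch below gives one possible getFileSizeInBytes built on FileIO's getLength, plus a hypothetical hasID3v1Tag check. The check relies on the ID3v1 layout, in which the 128-byte block begins with the three characters "TAG" (byte values 84, 65, 71), followed by the title, artist, album, year, comment, and genre fields. The example movie's own handlers may well differ.

```lingo
on getFileSizeInBytes filePath
  -- Sketch only: returns the file length in bytes, or 0 if it can't be opened
  fileLength = 0
  fileObj = new(xtra "fileIO")
  fileObj.openFile(filePath, 1)
  if fileObj.status() = 0 then
    fileLength = fileObj.getLength()
    fileObj.closeFile()
  end if
  fileObj = 0
  return fileLength
end

on hasID3v1Tag filePath
  -- Hypothetical helper: TRUE if the last 128 bytes start with "TAG"
  fileLength = getFileSizeInBytes(filePath)
  if fileLength < 128 then return FALSE
  header = readBinaryFile(filePath, fileLength - 127, fileLength - 125)
  if header.count <> 3 then return FALSE
  return (header[1] = 84) and (header[2] = 65) and (header[3] = 71)
end
```

Calling a check like hasID3v1Tag before readMP3tags would avoid filling the text members with audio data from a file that has no tag.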

A sample Director 8.5 movie is available for download in Mac or Windows format. This movie uses the older version of the FileXtra, available from Kent Kersten at http://kblab.net/xtras/old/

All colorized Lingo code samples have been processed by Dave Mennenoh's brilliant HTMLingo Xtra, available from his site at http://www.crackconspiracy.com/~davem/

David Pollock has been working with Director and Lingo for about nine years. He currently lives in Atlanta, and works for Allure Fusion Media, a company specializing in digital signage.

Copyright 1997-2024, Director Online. Article content copyright by respective authors.