Reading Binary Files with FileIO
June 12, 2002
by David Pollock
FileIO was built with one thing in mind: to read and write ASCII-based text files. The main reason FileIO doesn't seem suited to reading binary file data is that whenever a byte with a value of zero is encountered, it becomes ASCII character 0 -- which is the null character -- and is treated as an end-of-file marker. And because binary files tend to have plenty of zeroes in them, you won't get much data out of a file if FileIO thinks its job is done. That, at least, is what happens if you are using FileIO's readFile function.
FileIO's getLength and readChar Functions
Fortunately, FileIO contains a couple of other functions that allow us to get around this limitation. The first is the getLength function. This allows us to see exactly how many bytes are stored in a file. (In a text file, one byte represents one ASCII character. In a binary file, one byte can represent different things, but we'll worry about interpreting binary data later.)
The second FileIO function we'll be using, which will replace the readFile function, is readChar. The basic function of readChar is to read one byte -- or character -- of a file, then move to the next byte and wait. The important thing for us about readChar is that it doesn't care if it encounters a byte whose value is zero; it will just return an empty string to Lingo, and move to the next byte in the file.
Binary Data
So now we have a way to determine how many bytes are in a file, and we can read each byte of the file. The next thing to consider is how to store the data we retrieve from a binary file. If we try to store all of the retrieved bytes as a string in Lingo, we run into a problem similar to the one we have with the readFile function: every byte with a zero value comes back as EMPTY, or "", in Lingo, and adds nothing to a string variable. All of the zero-value data would be lost, which isn't helpful.
The obvious alternative to storing all of our data as a string in Lingo is to use a linear list (though there are other options, such as using an image object), and to convert all of the string data returned from FileIO's readChar into its integer equivalent using Lingo's charToNum function. Now the value of every byte of the file can be appended to our linear list, where we can manipulate it later.
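As a quick check of the conversion described above, the following lines can be typed into Director's Message window. This is only a sketch for experimenting; the result for EMPTY is stated as an expectation rather than a guarantee, since the approach in this article depends on it.

```lingo
-- Typed into the Message window
put charToNum("A")
-- 65
put numToChar(65)
-- "A"
put charToNum(EMPTY)
-- Expected to be 0, which is what lets a zero byte survive as a list entry
```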
Here is the resulting Lingo handler so far:
on readBinaryFile filePath
  -- Call handler using the full file path
  byteList = []
  -- This list will contain all of the byte values retrieved
  fileObj = new(xtra "fileIO")
  fileObj.openFile(filePath, 1)
  -- Open the file with read-only access
  if fileObj.status() = 0 then
    fileLength = fileObj.getLength()
    -- Find the length of the file in bytes
    repeat with index = 1 to fileLength
      byteList.append(charToNum(fileObj.readChar()))
      -- Append the value of each byte of the file to byteList
    end repeat
  end if
  fileObj.closeFile()
  -- Close the file
  fileObj = 0
  -- Clear the FileIO instance from memory
  return byteList
end
The key thing to be aware of is that the repeat loop, in conjunction with the readChar function, forces FileIO to keep moving through the file, regardless of the value of the byte it encounters. Now we have a way to read any file in its entirety.
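To see what the handler returns, a small test handler such as the one below can print the first few byte values to the Message window. The name showFirstBytes and its howMany parameter are hypothetical and are not part of the example movie; treat this as a sketch for experimenting.

```lingo
on showFirstBytes filePath, howMany
  -- Hypothetical helper: lists the first howMany byte values of a file
  byteList = readBinaryFile(filePath)
  -- Don't run past the end of a short (or unreadable) file
  if byteList.count < howMany then howMany = byteList.count
  repeat with index = 1 to howMany
    put index & ": " & byteList[index]
  end repeat
end
```

Bytes with a value of zero should appear in the output as 0 rather than being skipped, which is the whole point of reading with readChar.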
Reading Large Files
One problem you may have thought of with the readBinaryFile handler above is that large files would return an enormous byteList variable, and would use up a lot of memory. If we knew which portion of the binary file we wanted, we could retrieve just that portion by slightly modifying the handler like this:
on readBinaryFile filePath, startByte, endByte
  -- Optional parameters for specifying positions in the file
  byteList = []
  fileObj = new(xtra "fileIO")
  fileObj.openFile(filePath, 1)
  if fileObj.status() = 0 then
    fileLength = fileObj.getLength()
    -- Default to the start and/or end of the file if a parameter isn't supplied
    if voidP(startByte) then startByte = 1
    if voidP(endByte) then endByte = fileLength
    -- Only read if the requested range lies inside the file
    if (startByte >= 1) and (startByte <= endByte) and (endByte <= fileLength) then
      fileObj.setPosition(startByte - 1)
      -- Position FileIO to read from the correct point in the file
      repeat with index = startByte to endByte
        byteList.append(charToNum(fileObj.readChar()))
      end repeat
    end if
  end if
  fileObj.closeFile()
  fileObj = 0
  return byteList
end
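Reading a small range is handy for checking a file's signature without loading the whole file. As an illustration, GIF files begin with the ASCII characters "GIF", so a check might look like the sketch below. The handler name looksLikeGIF is hypothetical and is not part of the example movie.

```lingo
on looksLikeGIF filePath
  -- Hypothetical helper: only the first three bytes of the file are read
  byteList = readBinaryFile(filePath, 1, 3)
  -- An unreadable or too-short file yields an empty list
  if byteList.count < 3 then return FALSE
  -- "GIF" in ASCII is 71, 73, 70
  return (byteList[1] = 71) and (byteList[2] = 73) and (byteList[3] = 70)
end
```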
Example: Reading MP3 Tags
A real-world use of this handler is in reading tags from an MP3 file. The tags in an MP3 file are stored in the last 128 bytes (for "version 1" tags). The example file contains a Director movie that illustrates retrieving tags from an MP3 file (the movie requires both the FileXtra and FileIO Xtras). (A great resource for finding out how different file types are formatted is the Programmers File Format Collection.)
This handler from the example movie uses the final bytes of the selected MP3 file to build the tags property list containing the information extracted for the song title, artist, album title, year, and genre (extracted from the pre-defined list of genres in the MP3 spec). Once the data's been read in from the file, it's really a simple matter of building a 128-character string (tagStr) from the binary data, then accessing the parts of the string that are used for each item. The getFileSizeInBytes, getGenre, and clipString calls refer to helper handlers that are not listed in this article.
on readMP3tags filePath
  fileLength = getFileSizeInBytes (filePath)
  if fileLength > 0 then
    -- Read the last 128 bytes of the file
    byteList = readBinaryFile (filePath, fileLength - 127, fileLength)
    -- Rebuild the bytes as a string so the fields can be picked out by position
    tagStr = ""
    repeat with index = 1 to byteList.count
      tagStr = tagStr & numToChar (byteList[index])
    end repeat
    tags = [#title: tagStr.char[4..33], #artist: tagStr.char[34..63], #album: tagStr.char[64..93], #year: tagStr.char[94..97], #genre: getGenre (charToNum (tagStr.char[128]) + 1)]
    -- Each tag is displayed in a text member named after its property
    repeat with index = 1 to tags.count
      memberName = string (tags.getPropAt (index))
      newText = clipString (tags[index])
      member (memberName).text = newText
    end repeat
  end if
end
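Not every MP3 file carries a version 1 tag. When one is present, the 128-byte block starts with the three characters "TAG", which is why the title above begins at character 4. A check along the following lines could be run on byteList before the fields are extracted. The handler name hasVersion1Tag is hypothetical and is not part of the example movie.

```lingo
on hasVersion1Tag byteList
  -- byteList: the final 128 bytes of the file, as returned by readBinaryFile
  if byteList.count <> 128 then return FALSE
  -- "TAG" in ASCII is 84, 65, 71
  return (byteList[1] = 84) and (byteList[2] = 65) and (byteList[3] = 71)
end
```

Comparing the numeric byte values directly avoids building a string just to test the marker.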
A sample Director 8.5 movie is available for download in Mac or Windows format. This movie uses the older version of the FileXtra, available from Kent Kersten at http://kblab.net/xtras/old/
All colorized Lingo code samples have been processed by Dave Mennenoh's brilliant HTMLingo Xtra, available from his site at http://www.crackconspiracy.com/~davem/
Copyright 1997-2024, Director Online. Article content copyright by respective authors.