Using the BinaryIO Xtra to create binary files
April 20, 1999
by Michael Geary
Have you perhaps seen some of the recent Direct-L postings referring to the BinaryIO Xtra and thought to yourself, "I wonder how the heck that helps multimedia development"? Perhaps you're not terribly comfortable with the thought of dabbling with binary files? Until recently, both of these statements described me. With a bit of experimentation, however, and some great support from Gretchen Macdowall and Glenn Picher, I have been able to significantly improve the functionality of some of my projects by working with binary files.
What are binary files?
Let's cover some of the groundwork first. We've all heard that computers, deep down, all really just know zeros and ones, right? All computers are really just huge (and I mean huge) arrays of switches. Switches can only be ON or OFF, right? That's how computers keep track of things. Every switch has an address, and every switch is either on or off. Now, this might seem a bit limiting, but by looking at switches in a base-2 format (binary), we can group switches together and draw some greater meaning from them.
In order to make sense of these ones and zeros, we've taught computers to turn certain combinations of zeros and ones into letters. For Roman-Alphabet languages, it takes a computer eight zeros or ones to form a character (some languages require double-byte alphabets, but let's not talk about that here). These eight bits are called a byte. For example, the letter 'a' looks like this to a computer:
bits:           0    1    1    0    0    0    0    1
place values: 128   64   32   16    8    4    2    1
so: 64+32+1 = 97
Each digit is a placeholder in a column. This is actually exactly what we do when we look at the number 97 in base-10, or decimal format:
tens  ones
  10     1
   9     7
So (9 * 10) + (1 * 7) = 97
Because 01100001 takes up a lot of space and is hard to read, it is often more useful to convert this base-2 (binary) number into a friendlier base. Because we (most of us) have ten fingers, it is comfortable for us to use base-10. Some computer programmers apparently have sixteen fingers, because they like to look at things in base-16, or as hexadecimals (hex). I suspect they are just trying to make things complicated. ;)
To recap, then: as a base-10 (decimal) number 01100001 looks like this: 97. Now you might be thinking to yourself, "hmm, that number seems familiar." On a hunch, you pull up Director and type the following into the message window:
put numToChar(97)
-- "a"
Yup, that is the ASCII code for the letter 'a'. What does all this have to do with reading and writing binary files? Well, in an ASCII text file, every eight bits build a character. When working with text files, Director can't take those eight zeros and ones to mean anything but their literal character equivalent. This is actually quite a restriction.
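If you don't have Director handy, the same round trip can be tried elsewhere. Here is a minimal sketch in Python (used purely for illustration; the variable names are my own) that adds up the place values by hand, lets the language do the base-2 conversion, and then looks up the character:

```python
# Bit pattern for the letter 'a', as shown earlier.
bits = "01100001"

# Place values for eight bits: 128, 64, 32, 16, 8, 4, 2, 1.
place_values = [2 ** p for p in range(7, -1, -1)]

# Add up the place values wherever a switch is ON.
by_hand = sum(v for v, b in zip(place_values, bits) if b == "1")

# The built-in base-2 conversion arrives at the same number.
built_in = int(bits, 2)

letter = chr(built_in)             # same job as numToChar(97) in Lingo
as_hex = format(built_in, "02X")   # the base-16 spelling of the same byte

print(by_hand, built_in, letter, as_hex)
```

The last value printed is the hexadecimal form of the same byte, which is how most file specifications will write it.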
In a binary file, however, we've got a lot more freedom. We can tell the computer to look at just one byte, we can take two bytes together to represent a 'short' integer, or we can tell the computer that four consecutive bytes represent one potentially very big number, rather than four letters. The chart below displays how one sequence of bytes could be interpreted:
Decimal byte sequence:
    68 79 85 71
In a text editor:
    D O U G
As individual bytes:
    68 79 85 71
As two 'short' integers, the bytes are grouped in pairs, and the first byte of each pair counts 256 times its value:
    68 79 is evaluated as (68 * 256) + (79 * 1) = 17487 and
    85 71 is evaluated as (85 * 256) + (71 * 1) = 21831
As one 'long' integer, all four bytes are grouped and evaluated:
    (68 * 16777216) + (79 * 65536) + (85 * 256) + (71 * 1) = 1146049863
As you can see, we can store information much more efficiently, and we don't have to work in eight-bit blocks.
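The chart can be checked with a few lines of code. This is a sketch in Python rather than Lingo, using the standard struct module. The ">" in the format strings means "most significant byte first", which is an assumption on my part, chosen because it is the byte order that reproduces the totals in the chart:

```python
import struct

# The four bytes from the chart.
data = bytes([68, 79, 85, 71])

as_text = data.decode("ascii")          # how a text editor shows them
as_bytes = list(data)                   # four separate small numbers

# ">" = big-endian (assumed), "H" = unsigned 16-bit, "I" = unsigned 32-bit.
as_shorts = struct.unpack(">2H", data)  # two 'short' integers
as_long = struct.unpack(">I", data)[0]  # one 'long' integer

# The same sums, written out by hand.
short_by_hand = 68 * 256 + 79
long_by_hand = 68 * 256 ** 3 + 79 * 256 ** 2 + 85 * 256 + 71

print(as_text, as_bytes, as_shorts, as_long)
```

Reading the same four bytes with "<" instead of ">" gives different numbers, which is exactly why a file specification has to say how multi-byte values are stored.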
File Specifications
Now let's tie this back to the real world. Binary files, such as GIF graphics, AIFF sound files, or MOV movies, rely upon a file specification to store their information. Programs like Director 'know' these file specifications, and can therefore make sense of these and other binary files when you link or import them.
A file specification can be thought of as a map to a binary file format. It indicates what information can be found at various byte 'addresses' in the file. Without this map, the file looks like only so much garbage. With this map, however, the world is at your feet. The file specifications are published for many common file formats. This means that we mere mortals can also read, create and alter binary files. To illustrate this, I'll discuss the project which prompted me to learn about binary files.
Real World Example
I have created a tool in Director which automatically generates HTML front-end web pages for ColdFusion/Access databases. I simply enter my database specs, and my tool creates nicely formatted ColdFusion pages which I can simply upload to our server. Problem was, I had to go through the hassle of defining a database BOTH on my Macintosh, in order to generate these administration pages, and on the Windows NT server, where the actual database was created. If you're making a database with ten tables, each of which has fifteen fields, you're talking about a big, brain-numbing waste of time, where it's easy to make a million mistakes.
"You know," I thought to myself, "If I could have this tool of mine create some file on my mac which contained all the information my server-side database (Access) needed, I'd only have to go into that freezing cold server room once, and I'd only have to stay there for the couple of seconds it would take to import this file."
Yes, I could have programmed my tool to generate tab-delimited text files, but these files don't actually contain any information about the database. Database programs need to know how large a field is, whether or not it is indexed, whether that index is unique, and what sort of information (text, numbers, dates) can go in that field. None of this is communicated in a tab-delimited text file. I needed a binary file.
Enter the DBF file specification
After doing a bit of research, I concluded that if I could somehow turn the database table descriptions in my Director tool into DBF files, I would be able to import these into Microsoft Access (or any other database program, for that matter).
DBF files are binary files. The format was originally developed by Borland back in 1491 on a special request from Queen Isabella of Spain. Since then, the specification has been opened, and there are a couple of versions. DBF files are a bit primitive: field names can only be up to 10 characters long, one DBF file can only describe one database table (it is not relational), and memo fields (text fields which contain more than 255 characters) have to be attached in an external file.
However, despite these limitations, its antiquity has led DBF to become one of the only file formats that is supported by just about every database program under the sun. Because I was only trying to get my database description from one machine to another, DBF was the answer for me.
There's a great website which contains files and links relating to all sorts of standard file formats. It is http://www.wotsit.org. From this site I downloaded the file spec for DBF files. It took a bit of figuring, but it is essentially a simple format. Here it is for your reference:
Thanks to Peter Mikalajunas for this specification.
DBF File Structure

BYTES   DESCRIPTION
00      FoxBase+, FoxPro, dBaseIII+, dBaseIV, no memo - 0x03
        FoxBase+, dBaseIII+ with memo - 0x83
        FoxPro with memo - 0xF5
        dBaseIV with memo - 0x8B
        dBaseIV with SQL Table - 0x8E
01-03   Last update, format YYMMDD
04-07   Number of records in file (32-bit number)
08-09   Number of bytes in header (16-bit number)
10-11   Number of bytes in record (16-bit number)
12-13   Reserved, fill with 0x00
14      dBaseIV flag, incomplete transaction
        Begin Transaction sets it to 0x01
        End Transaction or RollBack reset it to 0x00
15      Encryption flag, encrypted 0x01 else 0x00
        Changing the flag does not encrypt or decrypt the records
16-27   dBaseIV multi-user environment use
28      Production index exists - 0x01 else 0x00
29      dBaseIV language driver ID
30-31   Reserved fill with 0x00
32-n    Field Descriptor array
n+1     Header Record Terminator - 0x0D

FIELD DESCRIPTOR ARRAY TABLE

BYTES   DESCRIPTION
0-10    Field Name ASCII padded with 0x00
11      Field Type Identifier (see table)
12-15   Displacement of field in record
16      Field length in bytes
17      Field decimal places
18-19   Reserved
20      dBaseIV work area ID
21-30   Reserved
31      Field is part of production index - 0x01 else 0x00

FIELD IDENTIFIER TABLE

ASCII   DESCRIPTION
C       Character
D       Date, format YYYYMMDD
F       Floating Point
G       General - FoxPro addition
L       Logical, T:t,F:f,Y:y,N:n,?-not initialized
M       Memo (stored as 10 digits representing the dbt block number)
N       Numeric
P       Picture - FoxPro addition
Note: all DBF records begin with a deleted flag field, 0x2A (asterisk) if the record is deleted, else 0x20 (space). End of file is marked with 0x1A.
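To make the first table concrete, here is a rough sketch of the 32-byte file header in Python (for illustration only; it is not the Lingo from my tool). The function name and parameters are hypothetical. The specification as quoted does not state a byte order for the 16-bit and 32-bit numbers, so the sketch assumes little-endian (least significant byte first), which is how dBase-family files are commonly written. It also assumes every flag in bytes 12-31 can stay 0x00:

```python
import struct
from datetime import date

# "<" = little-endian (assumption). One version byte, three date bytes,
# a 32-bit record count, two 16-bit lengths, then 20 zero bytes that
# cover addresses 12-31 (reserved fields and flags, all left at 0x00).
HEADER_FORMAT = "<B3BIHH20x"

def build_dbf_header(field_count, record_length, updated,
                     record_count=0, version=0x03):
    """Pack addresses 00-31 of a DBF file (hypothetical helper)."""
    # Header = 32 fixed bytes + 32 bytes per field + 1 terminator byte.
    header_length = 32 + 32 * field_count + 1
    return struct.pack(
        HEADER_FORMAT,
        version,             # address 00: file type
        updated.year % 100,  # address 01: YY
        updated.month,       # address 02: MM
        updated.day,         # address 03: DD
        record_count,        # addresses 04-07
        header_length,       # addresses 08-09
        record_length,       # addresses 10-11
    )

# Two 30-byte character fields plus the one-byte deleted flag = 61.
header = build_dbf_header(2, 61, date(1999, 4, 20))
print(len(header), list(header[:12]))
```

The field descriptors and the 0x0D terminator would follow these 32 bytes in the file.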
How to use a File Specification
This file specification is, as mentioned above, a map to a DBF file. It tells us what information needs to be in what positions so a database program can understand it. Look at the first address:
00      FoxBase+, FoxPro, dBaseIII+, dBaseIV, no memo - 0x03
        FoxBase+, dBaseIII+ with memo - 0x83
        FoxPro with memo - 0xF5
        dBaseIV with memo - 0x8B
        dBaseIV with SQL Table - 0x8E
This means that the first byte, at address 00, needs to contain either 03, 83, F5, 8B or 8E, depending on the kind of DBF file we are describing. (Note: most file specs refer to byte values in hexadecimal. See the sixteen-finger comment above.) No, I don't know what significance the various numbers have, and the beauty of it is, I don't need to know. All I need to know is that if I am working with a fairly generic DBF file (which I am), I just need to start my file with the byte 03.
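Reading the map works in the other direction too. As a small sketch in Python (illustrative only; the dictionary and function names are my own), the first byte of a file can be looked up against the five values the specification lists:

```python
# Values for address 00, copied from the specification.
DBF_TYPES = {
    0x03: "FoxBase+, FoxPro, dBaseIII+, dBaseIV, no memo",
    0x83: "FoxBase+, dBaseIII+ with memo",
    0xF5: "FoxPro with memo",
    0x8B: "dBaseIV with memo",
    0x8E: "dBaseIV with SQL Table",
}

def describe_dbf_type(first_byte):
    """Return the spec's description for the byte at address 00."""
    # Anything not in the table is reported rather than guessed at.
    return DBF_TYPES.get(first_byte, "unknown (0x%02X)" % first_byte)

print(describe_dbf_type(0x03))
print(describe_dbf_type(0x8B))
```

With a real file, the byte would come from reading the file in binary mode and taking the first element of its contents.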
By working through the address descriptions, I was able, with some trial and error, to come up with a Director handler which can take a property list database description and generate a healthy DBF file. Here is a list of resources you'll need to build your own binary files:
Requirements:
Glenn Picher's BinaryIO Xtra
Without this puppy you won't get past the 12th byte, because that byte requires the hex value 00, which is the same as NumToChar(0) in Director, which is a character meaning "text file stops here." Director just isn't capable of dealing with binary files on its own. Get the demo version (or better yet, buy the full version -- you'll be amazed at what you can do!) at UpdateStage.
A simple Hex/Decimal converter
There are a million of these out there. Simply go to your favorite shareware site and do a search. For my project I used Josh Goldfoot's Hex/Dec (Macintosh, available at the Info-Mac Hyperarchive).
A sample file in the format you are creating or editing
Consider this your case study. If you are trying to create a TIFF file, export a typical (or not) TIFF from Photoshop and compare it with your own. A picture (even in hexadecimal code) is worth a thousand bits. For my project, I first exported a database from FileMaker Pro that looked exactly like the one I was trying to create with my tool. Then I compared.
A Decimal- or Hex-Editor
This will allow you to compare the files you create with those that 'real' programs produce. Trust me, without this, you won't get far. Most of the freeware/shareware programs out there are hex editors, yet the BinaryIO Xtra only understands decimal values. Depending on how much you are writing, and what your source is, it may make sense to write a Lingo routine that will do the number conversion for you. Again, search your favorite shareware site. I use HexEdit by Jim Bumgardner and Lane Roathe (Macintosh, available at the Info-Mac Hyperarchive).
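The hex-to-decimal conversion itself is simple enough to sketch. The example below is in Python rather than Lingo, and both function names are hypothetical; a Lingo routine would follow the same idea of reading the digits in base 16:

```python
def hex_to_dec(text):
    """Convert a hex string such as '8B' or '0x8B' to a decimal integer."""
    # int() with base 16 accepts the digits with or without a 0x prefix.
    return int(text, 16)

def dec_to_hex(number):
    """Format a byte value the way the DBF specification writes it."""
    return "0x%02X" % number

# Hex values that appear in the DBF specification.
for spec_value in ["0x03", "0x83", "0xF5", "0x8B", "0x8E", "0x0D", "0x1A"]:
    print(spec_value, "=", hex_to_dec(spec_value))
```

Going the other way, dec_to_hex is handy when comparing the decimal values you wrote against what a hex editor displays.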
Gettin' your fingers dirty
Here is a sample property list that I use to describe a database. This is a simple definition of a database with two tables, Table1 and Table2, each of which contains two fields, Name & Address and Phone & Fax respectively. All fields are string fields.
["Table1": [ ¬
    [#name: "name", #type: "string", #size: "", #indexed: 0, ¬
     #Duplicates: 0, #range: [], #display: 0], ¬
    [#name: "address", #type: "string", #size: "", #indexed: 0, ¬
     #Duplicates: 0, #range: [], #display: 0]], ¬
 "Table2": [ ¬
    [#name: "phone", #type: "string", #size: "", #indexed: 0, ¬
     #Duplicates: 0, #range: [], #display: 0], ¬
    [#name: "fax", #type: "string", #size: "", #indexed: 0, ¬
     #Duplicates: 0, #range: [], #display: 0]]]
I can now loop through each item in this list and create an appropriate DBF file. Below is some code to give you an idea of how to do this:
--this is our database definition shown above
global gTableArray

on WriteDBFfile TargetDir
  --initialize a BIO instance
  set BFile = new(xtra "binaryio")
  --we start with the first table and loop through the database
  set TableOffset = 1
  repeat with thisTable in gTableArray
    --get the tablename
    set TableName = getPropAt(gtablearray,TableOffset)
    --make a name for the DBF file
    set FilePath = TargetDir & TableName & ".dbf"
    --overwrite this file if it already exists (*caution*)
    if FileExists(FilePath) = 0 then
      DeleteFile(FilePath)
    end if
    openFile(BFile,2,FilePath,"FMP3","DBF ")
    --address 00
    --set our database type to DBASE IV with no memo
    set DBtype = 3
    --write it
    writeChar(BFile,DBtype)
    --address 01-03: the date (today)
    set WholeYear = the year of (the systemDate)
    set LastTwo = value(char 3 to 4 of string(WholeYear))
    writeChar(BFile,LastTwo)
    set theMonth = (the month of (the systemDate))
    writeChar(BFile,theMonth)
    set theDay = (the day of (the systemDate))
    writeChar(BFile,theDay)
    --address 04-07 (32-bit number)
    --how many records in the file?
    --since we just want a database definition with no records, this is zero
    set RecordCount = 0
    writeUnsignedLong(BFile,RecordCount)
    .
    .
    .
end
This code, along with the entire DBF creator, is available for download in Mac or PC format. Note that this tool also relies upon Paul Ferry's fine OSUtil Xtra and Yair Sageev's excellent TextCruncher Xtra.
As you can see, all you need to do to write your own binary files is to step through each address in the file specification and write your own information in the format that is expected.
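For readers who want to see the whole walk from address 00 to the end-of-file marker in one place, here is a compact sketch in Python. It is not the downloadable Lingo tool, and the helper names are hypothetical. It assumes little-endian numbers, a field length of 30 bytes for the string fields (the property list leaves #size empty), and a field displacement that counts the one-byte deleted flag; none of these are spelled out in the specification as quoted:

```python
import struct
from datetime import date

def field_descriptor(name, type_code, length, displacement):
    """Pack one 32-byte entry of the field descriptor array."""
    raw_name = name.encode("ascii")[:10]  # names hold 10 characters at most
    return struct.pack(
        "<11scIBB2xB10xB",
        raw_name,                    # 0-10: name, padded with 0x00
        type_code.encode("ascii"),   # 11: field type identifier
        displacement,                # 12-15: displacement in record
        length,                      # 16: field length in bytes
        0,                           # 17: decimal places
        0,                           # 20: dBaseIV work area ID
        0,                           # 31: not part of a production index
    )

def build_dbf(fields, updated):
    """Build an empty DBF table definition as bytes (no records)."""
    descriptors = b""
    displacement = 1  # byte 0 of each record is the deleted flag (assumed)
    for name, type_code, length in fields:
        descriptors += field_descriptor(name, type_code, length, displacement)
        displacement += length
    header_length = 32 + len(descriptors) + 1
    header = struct.pack(
        "<B3BIHH20x",
        0x03,                                            # 00: no memo
        updated.year % 100, updated.month, updated.day,  # 01-03: YYMMDD
        0,                                               # 04-07: records
        header_length,                                   # 08-09
        displacement,                                    # 10-11: record size
    )
    # 0x0D ends the header; 0x1A marks the end of the file.
    return header + descriptors + b"\x0D" + b"\x1A"

table1 = build_dbf([("name", "C", 30), ("address", "C", 30)],
                   date(1999, 4, 20))
print(len(table1))
```

Writing the result to disk is then a matter of opening a file in binary mode and writing these bytes out unchanged. As with my own tool, the real test is whether your database program accepts the file, so compare the output against a sample DBF in a hex editor before trusting it.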
A few final words
Few things will strengthen your understanding of how computers work with data better than trying to write a binary file. This is real nitty-gritty stuff. The beauty of it is that we now have the power to read and interpret any file whose specification we know.
As we all find useful ways of creating and modifying binary files, our capabilities as Director developers will grow significantly. The world is leaning more and more towards Open File Formats (even Al Gore's web site is Open Source), which means all we need are a map and some smarts to be able to alter and create these files to meet our own needs. I am anxious to find out what binary files you want to create, and how it goes for you. Feel free to drop me a line at mgeary / at / antwerpes.de.
Now Grasshopper, go forth and conquer!
Special thanks to Gretchen Macdowall and Zac for some great article feedback.
Copyright 1997-2024, Director Online. Article content copyright by respective authors.