Using the BinaryIO Xtra to create binary files

April 20, 1999
by Michael Geary

Have you perhaps seen some of the recent Direct-L postings referring to the BinaryIO Xtra and thought to yourself, "I wonder how the heck that helps multimedia development"? Perhaps you're not terribly comfortable with the thought of dabbling with binary files? Until recently, both of these statements described me. With a bit of experimentation, however, and some great support from Gretchen Macdowall and Glenn Picher, I have been able to significantly improve the functionality of some of my projects by working with binary files.

What are binary files?

Let's cover some of the groundwork first. We've all heard that computers, deep down, all really just know zeros and ones, right? All computers are really just huge (and I mean huge) arrays of switches. Switches can only be ON or OFF, right? That's how computers keep track of things. Every switch has an address, and every switch is either on or off. Now, this might seem a bit limiting, but by looking at switches in a base-2 format (binary), we can group switches together and draw some greater meaning from them.

In order to make sense of these ones and zeros, we've taught computers to turn certain combinations of zeros and ones into letters. For Roman-Alphabet languages, it takes a computer eight zeros or ones to form a character (some languages require double-byte alphabets, but let's not talk about that here). These eight bits are called a byte. For example, the letter 'a' looks like this to a computer:

0       1       1       0       0       0       0       1
128     64      32      16      8       4       2       1

so: 64+32+1 = 97

Each digit is a placeholder in a column. This is actually exactly what we do when we look at the number 97 in base-10, or decimal format:

tens    ones
10       1
9        7

So (9 * 10) + (7 * 1) = 97
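The column arithmetic above works the same way in any base, and a few lines of code can confirm it. This is a minimal sketch in Python rather than Lingo, since only the arithmetic matters here; the helper name `place_value_sum` is made up for illustration.

```python
def place_value_sum(digits, base):
    """Add up each digit times the weight of its column.

    The rightmost column weighs 1, and every column to the left
    weighs `base` times more than its right-hand neighbour.
    """
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** position
    return total


binary_a = place_value_sum("01100001", 2)   # 64 + 32 + 1
decimal_97 = place_value_sum("97", 10)      # (9 * 10) + (7 * 1)
print(binary_a, decimal_97, chr(binary_a))
```

Both calls arrive at the same number, which is the character code of the letter 'a'.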

Because 01100001 takes up a lot of space and is hard to read, it is often more useful to convert this base-2 (binary) number into a friendlier base. Because we (most of us) have ten fingers, it is comfortable for us to use base-10. Some computer programmers apparently have sixteen fingers, because they like to look at things in base-16, or as hexadecimals (hex). I suspect they are just trying to make things complicated. ;)

To recap, then: as a base-10 (decimal) number 01100001 looks like this: 97. Now you might be thinking to yourself, "hmm, that number seems familiar." On a hunch, you pull up Director and type the following into the message window:

put numToChar(97)
-- "a"

Yup, that is the ASCII code for the letter 'a'. What does all this have to do with reading and writing binary files? Well, in an ASCII text file, every eight bits build a character. When working with text files, Director can't take those eight zeros and ones to mean anything but the character with that decimal code. This is actually quite a restriction.

In a binary file, however, we've got a lot more freedom. We can tell the computer to look at just one byte, we can take two bytes together to represent a 'short' integer, or we can tell the computer that four consecutive bytes represent one potentially very big number, rather than four letters. The chart below displays how one sequence of bytes could be interpreted:

Decimal Byte Sequence

68 79 85 71

In a text editor:

D O U G

As individual bytes:

68 79 85 71

As two 'short' integers, each pair of bytes is grouped and evaluated, with the first byte of the pair worth 256 times the second:
68 79 is evaluated as (68 * 256) + (79 * 1) = 17487 and
85 71 is evaluated as (85 * 256) + (71 * 1) = 21831

As one 'long' integer, all four bytes are grouped and evaluated:
(68 * 16777216) + (79 * 65536) + (85 * 256) + (71 * 1) = 1146049863

As you can see, the same four bytes can hold four characters, two medium-sized numbers, or one very large number, so we don't have to work in eight-bit blocks.
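The four readings of the byte sequence 68 79 85 71 can be reproduced in code. The sketch below uses Python's standard `struct` module rather than Lingo, and it assumes the most significant byte comes first (big-endian), which is the byte order the numbers in the chart imply. The variable names are illustrative only.

```python
import struct

data = bytes([68, 79, 85, 71])

as_text = data.decode("ascii")            # what a text editor shows
as_bytes = list(data)                     # four separate 8-bit numbers
as_shorts = struct.unpack(">2H", data)    # two 16-bit unsigned integers
as_long = struct.unpack(">I", data)[0]    # one 32-bit unsigned integer

# The same results by hand: each byte further left is worth 256 times more.
first_short = 68 * 256 + 79
second_short = 85 * 256 + 71
whole_long = ((68 * 256 + 79) * 256 + 85) * 256 + 71

print(as_text, as_bytes, as_shorts, as_long)
```

Nothing in the file itself says which reading is the right one. That decision belongs to whoever reads the bytes.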

File Specifications

Now let's tie this back to the real world. Binary files such as GIF graphics, AIFF sound files, or MOV movies rely upon a file specification to store their information. Programs like Director 'know' how to interpret these file specifications, and can therefore make sense of these and other binary files when you link or import them.

A file specification can be thought of as a map to a binary file format. It indicates what information can be found at various byte 'addresses' in the file. Without this map, the file looks like only so much garbage. With this map, however, the world is at your feet. The file specifications are published for many common file formats. This means that we mere mortals can also read, create and alter binary files. To illustrate this, I'll discuss the project which prompted me to learn about binary files.

Real World Example

I have created a tool in Director which automatically generates HTML front-end web pages for ColdFusion/Access databases. I simply enter my database specs, and my tool creates nicely-formatted ColdFusion pages which I can simply upload to our server. Problem was, I had to go through the hassle of defining a database BOTH on my Macintosh, in order to generate these administration pages, and on the Windows NT server, where the actual database was created. If you're making a database with ten tables, each of which has fifteen fields, you're talking about a big, brain-numbing waste of time, where it's easy to make a million mistakes.

"You know," I thought to myself, "If I could have this tool of mine create some file on my mac which contained all the information my server-side database (Access) needed, I'd only have to go into that freezing cold server room once, and I'd only have to stay there for the couple of seconds it would take to import this file."

Yes, I could have programmed my tool to generate tab-delimited text files, but these files don't actually contain any information about the database. Database programs need to know how large a field is, whether or not it is indexed, whether that index is unique, and what sort of information (text, numbers, dates) can go in that field. None of this is communicated in a tab-delimited text file. I needed a binary file.

Enter the DBF file specification

After doing a bit of research, I concluded that if I could somehow turn the database table descriptions in my Director tool into DBF files, I would be able to import these into Microsoft Access (or any other database program, for that matter).

DBF files are binary files. The format was originally developed by Borland back in 1491 on a special request from Queen Isabella of Spain. Since then, the specification has been opened, and there are a couple of versions. DBF files are a bit primitive: field names can only be up to 10 characters long, one DBF file can only describe one database table (it is not relational), and memo fields (text fields which contain more than 255 characters) have to be attached in an external file.

However, despite these limitations, its antiquity has led DBF to become one of the only file formats that is supported by just about every database program under the sun. Because I was only trying to get my database description from one machine to another, DBF was the answer for me.

There's a great website which contains files and links relating to all sorts of standard file formats. It is http://www.wotsit.org. From this site I downloaded the file spec for DBF files. It took a bit of figuring, but it is essentially a simple format. Here it is for your reference:

Thanks to Peter Mikalajunas for this specification.

DBF File Structure
BYTES   DESCRIPTION
00      FoxBase+, FoxPro, dBaseIII+, dBaseIV, no memo - 0x03
        FoxBase+, dBaseIII+ with memo - 0x83
        FoxPro with memo - 0xF5
        dBaseIV with memo - 0x8B
        dBaseIV with SQL Table - 0x8E
01-03   Last update, format YYMMDD
04-07   Number of records in file (32-bit number)
08-09   Number of bytes in header (16-bit number)
10-11   Number of bytes in record (16-bit number)
12-13   Reserved, fill with 0x00
14      dBaseIV flag, incomplete transaction
        Begin Transaction sets it to 0x01
        End Transaction or RollBack reset it to 0x00
15      Encryption flag, encrypted 0x01 else 0x00
        Changing the flag does not encrypt or decrypt the records
16-27   dBaseIV multi-user environment use
28      Production index exists - 0x01 else 0x00
29      dBaseIV language driver ID
30-31   Reserved fill with 0x00
32-n    Field Descriptor array
n+1     Header Record Terminator - 0x0D

FIELD DESCRIPTOR ARRAY TABLE
BYTES   DESCRIPTION
0-10    Field Name ASCII padded with 0x00
11      Field Type Identifier (see table)
12-15   Displacement of field in record
16      Field length in bytes
17      Field decimal places
18-19   Reserved
20      dBaseIV work area ID
21-30   Reserved
31      Field is part of production index - 0x01 else 0x00

FIELD IDENTIFIER TABLE
ASCII   DESCRIPTION
C       Character
D       Date, format YYYYMMDD
F       Floating Point
G       General - FoxPro addition
L       Logical, T:t,F:f,Y:y,N:n,?-not initialized
M       Memo (stored as 10 digits representing the dbt block number)
N       Numeric
P       Picture - FoxPro addition

Note: all DBF records begin with a deleted flag field. If the record is deleted, the flag is 0x2A (asterisk); otherwise it is 0x20 (space). The end of the file is marked with 0x1A.
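One way to read the first table is as a fixed 32-byte record layout. The sketch below, in Python rather than Lingo, unpacks the first twelve bytes of such a header and skips the remaining twenty. Two things are assumptions not stated in the table: multi-byte numbers are taken to be stored least significant byte first (Intel order, which is how DBF files are written), and the names `HEADER` and `read_dbf_header` are invented for this example.

```python
import struct

# 00: type, 01-03: YY MM DD, 04-07: record count (32-bit),
# 08-09: header size (16-bit), 10-11: record size (16-bit),
# 12-31: reserved bytes and flags, skipped in this sketch.
HEADER = struct.Struct("<B3BIHH20x")


def read_dbf_header(first_32_bytes):
    """Unpack the fixed part of a DBF header into a dictionary."""
    (file_type, yy, mm, dd,
     record_count, header_bytes, record_bytes) = HEADER.unpack(first_32_bytes)
    return {
        "file_type": file_type,
        "last_update": (yy, mm, dd),
        "record_count": record_count,
        "header_bytes": header_bytes,
        "record_bytes": record_bytes,
    }


# A hand-made header: type 0x03, dated 99/04/20, no records,
# a 97-byte header and 101-byte records.
sample = bytes([0x03, 99, 4, 20, 0, 0, 0, 0, 97, 0, 101, 0]) + bytes(20)
info = read_dbf_header(sample)
print(info)
```

The field descriptors that follow the header can be read the same way, 32 bytes at a time, until the 0x0D terminator turns up.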

How to use a File Specification

This file specification is, as mentioned above, a map to a DBF file. It tells us what information needs to be in what positions so a database program can understand it. Look at the first address:

00  FoxBase+, FoxPro, dBaseIII+, dBaseIV, no memo - 0x03
    FoxBase+, dBaseIII+ with memo - 0x83
    FoxPro with memo - 0xF5
    dBaseIV with memo - 0x8B
    dBaseIV with SQL Table - 0x8E

This means that the first byte, 00, needs to contain either 03, 83, F5, 8B or 8E, depending on the kind of DBF file we are describing. (Note: most file specs refer to byte values in hexadecimal. See the sixteen-finger comment above.) No, I don't know what significance the various numbers have, and the beauty of it is, I don't need to know. All I need to know is that if I am working with a fairly generic DBF file (which I am), I just need to start my file with the byte 03.
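In code, the first entry of the map becomes a simple lookup table. The sketch below is Python rather than Lingo, and the names `DBF_VERSIONS` and `describe_version` are hypothetical; the values and descriptions come straight from the specification quoted above.

```python
# Byte 00 of a DBF file, as listed in the specification.
DBF_VERSIONS = {
    0x03: "FoxBase+, FoxPro, dBaseIII+, dBaseIV, no memo",
    0x83: "FoxBase+, dBaseIII+ with memo",
    0xF5: "FoxPro with memo",
    0x8B: "dBaseIV with memo",
    0x8E: "dBaseIV with SQL Table",
}


def describe_version(first_byte):
    """Return the spec's description for the first byte of a DBF file."""
    return DBF_VERSIONS.get(first_byte, "unknown (0x%02X)" % first_byte)


file_start = bytes([0x03])   # the generic "no memo" file described above
print(describe_version(file_start[0]))
```

When writing a file the lookup runs the other way: pick the description that fits and write its byte value at address 00.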

By working through the address descriptions, I was able, with some trial and error, to come up with a Director handler which can take a property list database description and generate a healthy DBF file. Here is a list of resources you'll need to build your own binary files:

Requirements:

Glenn Picher's BinaryIO Xtra

Without this puppy you won't get past the 12th byte, because that byte requires the hex value 00, which is the same as NumToChar(0) in Director, which is a character meaning "text file stops here." Director just isn't capable of dealing with binary files on its own. Get the demo version (or better yet, buy the full version -- you'll be amazed at what you can do!) at UpdateStage.

A simple Hex/Decimal converter

There are a million of these out there. Simply go to your favorite shareware site and do a search. For my project I used Josh Goldfoot's Hex/Dec (Macintosh, available at the Info-Mac Hyperarchive).

A sample file in the format you are creating or editing

Consider this your case study. If you are trying to create a TIFF file, export a typical (or not) TIFF from Photoshop and compare it with your own. A picture (even in hexadecimal code) is worth a thousand bits. For my project, I first exported a database from FileMaker Pro that looked exactly like the one I was trying to create with my tool. Then I compared.

A Decimal- or Hex-Editor

This will allow you to compare the files you create with those that 'real' programs produce. Trust me, without this, you won't get far. Most of the freeware/shareware programs out there are hex editors, yet the BinaryIO Xtra only understands decimal values. Depending on how much you are writing, and what your source is, it may make sense to write a Lingo routine that will do the number conversion for you. Again, search your favorite shareware site. I use HexEdit by Jim Bumgardner and Lane Roathe (Macintosh, available at the Info-Mac Hyperarchive).
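The hex-to-decimal conversion mentioned above is small enough to script. The sketch below shows the idea in Python rather than Lingo; the three function names are made up for this example, and a Lingo routine would follow the same logic.

```python
def hex_to_decimal(hex_text):
    """Convert a hex string such as "8B" or "0x8B" to a decimal integer."""
    # int() with base 16 accepts an optional "0x" prefix.
    return int(hex_text.strip(), 16)


def decimal_to_hex(number):
    """Format a byte value the way most hex editors show it, e.g. 13 -> "0D"."""
    return "%02X" % number


def hex_dump_to_decimals(dump):
    """Turn a row copied from a hex editor into a list of decimal byte values."""
    return [hex_to_decimal(pair) for pair in dump.split()]


print(hex_to_decimal("0x8B"), decimal_to_hex(13), hex_dump_to_decimals("03 63 04 14"))
```

With a helper like `hex_dump_to_decimals`, a row copied out of a hex editor can be compared directly against the decimal values your own code writes.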

Gettin' your fingers dirty

Here is a sample property list that I use to describe a database. This is a simple definition of a database with two tables, Table1 and Table2, each of which contains two fields, Name & Address and Phone & Fax respectively. All fields are string fields.


["Table1": [[#name: "name", #type: "string", #size: "", #indexed: ¬
  0, #Duplicates: 0, #range: [], #display: 0], [#name: "address", ¬
  #type: "string", #size: "", #indexed: 0, #Duplicates: 0, #range: ¬
  [], #display: 0]], "Table2": [[#name: "phone", #type: "string", ¬
  #size: "", #indexed: 0, #Duplicates: 0, #range: [], #display: 0], ¬
  [#name: "fax", #type: "string", #size: "", #indexed: 0, #Duplicates: ¬
  0, #range: [], #display: 0]]]

I can now loop through each item in this list and create an appropriate DBF file. Below is some code to give you an idea of how to do this:

--this is our database definition shown above
global gTableArray

on WriteDBFfile TargetDir
  
  --initialize a BIO instance
  set BFile = new(xtra "binaryio")
  
  --we start with the first table and loop through the database
  set TableOffset = 1
  
  repeat with thisTable in gTableArray
    
    --get the tablename
    set TableName = getPropAt(gTableArray, TableOffset)
    
    --make a name for the DBF file
    set FilePath = TargetDir & TableName & ".dbf"
    
    --overwrite this file if it already exists (*caution*)
    if FileExists(FilePath) = 0 then
      DeleteFile(FilePath)
    end if
    
    openFile(BFile,2,FilePath,"FMP3","DBF ")
    
    --address 00
    --set our database type to DBASE IV with no memo
    set DBtype = 3
    
    --write it
    writeChar(BFile,DBtype)
    
    --address 01-03: the date (today)
    set WholeYear = the year of (the systemDate)
    set LastTwo = value(char 3 to 4 of string(WholeYear))
    writeChar(BFile,LastTwo)
    
    set theMonth = (the month of (the systemDate))
    writeChar(BFile,theMonth)
    
    set theDay = (the day of (the systemDate))
    writeChar(BFile,theDay)
    
    --address 04-07 (32-bit number)
    --how many records in the file?
    --since we just want a database definition with no records, this is zero
    set RecordCount = 0
    writeUnsignedLong(BFile,RecordCount)
.
.
.
end

This code, along with the entire DBF creator, is available for download in Mac or PC format. Note that this tool also relies upon Paul Farry's fine OSUtil Xtra and Yair Sageev's excellent TextCruncher Xtra.

As you can see, all you need to do to write your own binary files is to step through each address in the file specification and write your own information in the format that is expected.
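To show the whole walk from address 00 to the end-of-file marker in one place, here is a minimal sketch that builds an empty DBF file (field definitions, no records) in memory. It is written in Python rather than Lingo and is not the downloadable tool. Several things are assumptions: the function name `build_empty_dbf`, the `(name, type, length)` tuple format, little-endian byte order for multi-byte numbers, and leaving the field displacement and all reserved bytes at zero.

```python
import struct
from datetime import date


def build_empty_dbf(fields, today):
    """Return the bytes of a DBF file with field definitions but no records.

    fields is a list of (name, type_code, length) tuples, e.g. ("NAME", "C", 50).
    """
    # 32-byte file header + one 32-byte descriptor per field + 0x0D terminator
    header_length = 32 + 32 * len(fields) + 1
    # every record starts with a one-byte deleted flag
    record_length = 1 + sum(length for _, _, length in fields)

    out = bytearray()
    out += struct.pack("<B", 0x03)                       # 00: file type, no memo
    out += struct.pack("<3B", today.year % 100,          # 01-03: YY MM DD, last two
                       today.month, today.day)           #        digits of the year
    out += struct.pack("<I", 0)                          # 04-07: number of records
    out += struct.pack("<H", header_length)              # 08-09: bytes in header
    out += struct.pack("<H", record_length)              # 10-11: bytes in record
    out += bytes(20)                                     # 12-31: reserved and flags

    for name, type_code, length in fields:
        if len(name) > 10:
            raise ValueError("DBF field names are limited to 10 characters")
        out += struct.pack(
            "<11scIBB14x",
            name.encode("ascii"),        # 0-10: name, padded with 0x00 by struct
            type_code.encode("ascii"),   # 11: field type identifier
            0,                           # 12-15: displacement, left at 0 here
            length,                      # 16: field length in bytes
            0,                           # 17: decimal places
        )                                # 18-31: zero-filled by the 14x padding

    out += b"\x0D"                       # header record terminator
    out += b"\x1A"                       # end-of-file marker
    return bytes(out)


dbf_bytes = build_empty_dbf([("NAME", "C", 50), ("ADDRESS", "C", 50)],
                            date(1999, 4, 20))
print(len(dbf_bytes), list(dbf_bytes[:12]))
```

Writing the result to disk is then a matter of opening a file in binary mode and writing `dbf_bytes` to it. Whether a particular database program accepts the file is best checked the way described earlier, by comparing it in a hex editor with one that the program exported itself.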

A few final words

Few things will strengthen your understanding of how computers work with data better than trying to write a binary file. This is real nitty-gritty stuff. The beauty of it is that we now have the power to read and interpret any file whose specification we know.

As we all find useful ways of creating and modifying binary files, our capabilities as Director developers will grow significantly. The world is leaning more and more towards Open File Formats (even Al Gore's web site is Open Source), which means all we need are a map and some smarts to be able to alter and create these files to meet our own needs. I am anxious to find out what binary files you want to create, and how it goes for you. Feel free to drop me a line at mgeary / at / antwerpes.de.

Now Grasshopper, go forth and conquer!

Special thanks to Gretchen Macdowall and Zac for some great article feedback.

Michael Geary started working with Director at version 4. His pet technologies include multimedia databases, dynamic PDF generation, Binary file generation and XML. After tromping around the world for a while, he has settled down in Utah again. Boy, those mountains sure are big.... Michael can be reached at michael.geary@seranova.com

Copyright 1997-2024, Director Online. Article content copyright by respective authors.