CS211 Lesson 24

Binary (unformatted) File Input/Output

Quote:

Some succeed because they are destined to; most succeed because they are determined to. (Unknown)

Lesson Objectives:

Be able to read (input) and write (output) data using the load and save MATLAB functions
Understand how to write data to a binary file using fwrite
Understand how to read data from a binary file using fread
Understand the syntax differences between issuing commands and calling functions

Lesson:

I. MATLAB Concepts

A. Saving/Loading Workspace Variables

save writes (outputs) the contents of one or more workspace variables to a specified file in binary format.
For example:

    save('mydata.mat', 'Alpha', 'Beta')

stores the contents of the variables Alpha and Beta in a file called mydata.mat.

load reads (inputs) variable data values from a specified file.
For example:

    load('mydata.mat', 'Beta')

creates a variable called Beta in the current workspace and initializes its value from the data in the file mydata.mat. Variable names are case sensitive, so the name must match the one that was saved.
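As a rough illustration of the save/load pair described above, the sketch below saves two variables, clears them, and loads only one back. The values assigned to Alpha and Beta are made up for the example; the output form S = load(...) is the one listed in the help text for load.

```matlab
% Create two workspace variables and save both to a MAT-file.
Alpha = [1 2 3];
Beta  = 'hello';
save('mydata.mat', 'Alpha', 'Beta');

% Remove them from the workspace, then load only Beta back.
clear Alpha Beta
load('mydata.mat', 'Beta');

Beta_was_loaded  = exist('Beta', 'var')  == 1;   % true: Beta was requested
Alpha_was_loaded = exist('Alpha', 'var') == 1;   % false: Alpha stayed in the file

% The output form returns a structure instead of creating variables.
S = load('mydata.mat');
disp(S.Alpha);
```

Loading into a structure (S) can be a safer habit in larger programs because it does not silently overwrite variables that already exist in the workspace.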

B. Binary File I/O - fwrite and fread

fwrite(File_id, Variable, 'Precision') writes (outputs) the contents of a variable to a file in binary format. A file must be opened with fopen() before executing any fwrite() commands.

The precision (data type) of the values written to the file is controlled by the third argument. Possible data types include:

Precision   Meaning
'schar'     signed character (1 byte)
'uchar'     unsigned character (1 byte)
'char'      character (bytes per character in the file depend on the file encoding selected in fopen)
'int8'      8-bit integer (1 byte)
'int16'     16-bit integer (2 bytes)
'int32'     32-bit integer (4 bytes)
'int64'     64-bit integer (8 bytes)
'uint8'     unsigned 8-bit integer (1 byte)
'uint16'    unsigned 16-bit integer (2 bytes)
'uint32'    unsigned 32-bit integer (4 bytes)
'uint64'    unsigned 64-bit integer (8 bytes)
'float32'   single-precision fractional number with about 7 significant digits (4 bytes)
'float64'   same as double
'double'    double-precision fractional number with about 15 significant digits (8 bytes)

For example,

File_id = fopen('mydata.dat', 'w');
fwrite(File_id, Alpha, 'double')
fclose(File_id);

stores the contents of the variable Alpha to a file called "mydata.dat" in double precision format (8 bytes per value).
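To make the "bytes per value" idea concrete, here is a small sketch that writes the same five values under several precisions and reports how many bytes each choice produced. The values in Alpha and the list of precisions are chosen only for illustration; fwrite returns the number of values written and ftell reports the current position in the file, which right after writing equals the number of bytes written.

```matlab
Alpha = [10 20 30 40 50];
Precisions = {'int8', 'int16', 'int32', 'float32', 'double'};

Bytes_written = zeros(1, length(Precisions));
for K = 1:length(Precisions)
    File_id = fopen('mydata.dat', 'w');
    if File_id == -1
        error('Could not open mydata.dat for writing.');
    end
    Count = fwrite(File_id, Alpha, Precisions{K});  % number of values written
    Bytes_written(K) = ftell(File_id);              % file position = bytes so far
    fclose(File_id);
    fprintf('%-8s %d values, %d bytes\n', Precisions{K}, Count, Bytes_written(K));
end
% Bytes_written is [5 10 20 20 40]
```

Checking the value returned by fopen (it is -1 on failure) is a cheap way to catch a bad path or a read-only folder before any fwrite call runs.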

fread(File_id, Number_of_elements, 'File_precision=>Memory_precision') reads (inputs) the binary data stored in a file into a variable. File_precision is the data type of the values as stored in the file, and Memory_precision is the data type of the variable created in memory; each element is converted from the first to the second as it is read. The possible precision options are shown in the table above. A file must be opened with fopen() before executing any fread() commands.

IMPORTANT: You must know two (2) things about the binary data in the file before you can read the data:
         1) what binary format (precision) was used to store the data
         2) the exact number of values in the file to read.

For example,

File_id = fopen('mydata.dat', 'r');
Alpha = fread(File_id, 5, 'int32=>double')
fclose(File_id);

reads five 32-bit integers from the file mydata.dat and stores the values as double precision numbers in the variable Alpha.
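The read example above assumes the file already holds 32-bit integers. A minimal round trip, with made-up values, might look like the sketch below. It also uses the optional second output of fread, which reports how many values were actually read.

```matlab
Values = [7 -3 1000 42 0];

% Write five values as 32-bit integers.
File_id = fopen('mydata.dat', 'w');
fwrite(File_id, Values, 'int32');
fclose(File_id);

% Read them back, converting each one to double in memory.
File_id = fopen('mydata.dat', 'r');
[Alpha, Count] = fread(File_id, 5, 'int32=>double');
fclose(File_id);

% Alpha is a 5-by-1 column vector of class double; Count is 5.
disp(Alpha');
```

Note that fread returns a column vector when the size argument is a single number, even though Values was written as a row.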

Use fread and fwrite when:

You need to store data in different data types (e.g., int8, int16, etc.) to save memory and/or disk space for large data sets.

You need to store only part of a vector or matrix in a file. For example:

File_id = fopen('MyApp.dat', 'w');
fwrite(File_id, Alpha(2:end,3:5), 'double');
fclose(File_id);
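As a sketch of how such a partial matrix could be read back, the code below writes the same sub-matrix and then passes a [rows, columns] size to fread so the result keeps its shape. Using magic(5) for Alpha and the names Part, Rows, Cols, and Beta are assumptions made for the example.

```matlab
Alpha = magic(5);
Part  = Alpha(2:end, 3:5);          % 4 rows, 3 columns
[Rows, Cols] = size(Part);

File_id = fopen('MyApp.dat', 'w');
fwrite(File_id, Part, 'double');    % values are written column by column
fclose(File_id);

File_id = fopen('MyApp.dat', 'r');
Beta = fread(File_id, [Rows, Cols], 'double=>double');  % refill column by column
fclose(File_id);

disp(isequal(Beta, Part));          % displays 1
```

The file itself stores no shape information, so the reader has to know (or be told) the number of rows and columns.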

IMPORTANT: A binary file is a series of bits. The bits will be misinterpreted if you write the data in one format and then read the data back using a different format. It is your job as the programmer to remember the structure of data in a binary file. If you read a binary file incorrectly you will end up with "garbage values" as a result. Consider the following example of what NOT to do!

A = rand(1,5); % 5 random fractions -- all double precision
disp(A);

File_id = fopen('test.dat', 'w');
fwrite(File_id, A, 'double'); % writes 5 double precision values
fclose(File_id);

File_id = fopen('test.dat', 'r');
A = fread(File_id, 40, 'int8=>int8'); % reads 40 8-bit integers
fclose(File_id);

disp(A);
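For contrast with the what-NOT-to-do example above, the following sketch reads the same file twice: once with the precision that was used to write it and once with the wrong one. The names B and Garbage are introduced only for this illustration.

```matlab
A = rand(1, 5);                      % 5 double precision values

File_id = fopen('test.dat', 'w');
fwrite(File_id, A, 'double');        % 5 values x 8 bytes = 40 bytes
fclose(File_id);

% Correct: same precision and count as the write.
File_id = fopen('test.dat', 'r');
B = fread(File_id, 5, 'double=>double');
fclose(File_id);

% Incorrect: the same 40 bytes interpreted as 40 separate 8-bit integers.
File_id = fopen('test.dat', 'r');
Garbage = fread(File_id, 40, 'int8=>int8');
fclose(File_id);

Same_values = isequal(A(:), B);      % true: the original values are recovered
disp(Same_values);
disp(length(Garbage));               % 40 values, none of them meaningful
```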

C. Commands vs. Functions (unrelated to file I/O)

Many MATLAB "commands" have a command line format and a function format.

Commands use the format:

    commandName arg1 arg2 arg3 ...

Functions use the format:

    functionName(arg1, arg2, arg3, ...)

For example - the help system defines the "load" command as shown below. The first 7 lines describe the function format, while the last line describes the command format.

Function format:

    load
    load('filename')
    load('filename', 'X', 'Y', 'Z')
    load('filename', '-regexp', exprlist)
    load('-mat', 'filename')
    load('-ascii', 'filename')
    S = load(...)

Command format:

    load filename -regexp expr1 expr2 ...

The arguments to commands and functions sometimes require different formats. For example, both of the lines below load the variable Alpha from the MATLAB binary data file MyData.mat. Note that the function call format requires quotes around the variable name while the command format does not.

    load('MyData.mat', 'Alpha');   % function call

    load 'MyData.mat' Alpha        % command

It is recommended that you always use the function call format of MATLAB built-in functions to avoid confusion. In addition, in many cases, the function call format also allows for greater control over how a particular MATLAB procedure performs its work.
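One practical reason to prefer the function call format is that it accepts variables as arguments. The sketch below, which assumes a file name stored in a variable called File_name, shows the function format working with that variable; the comment describes what the command format would do instead.

```matlab
File_name = 'MyData.mat';     % file name held in a variable
Alpha = 1:3;

save(File_name, 'Alpha');     % function format: the value of File_name is used
clear Alpha
load(File_name, 'Alpha');     % loads Alpha from MyData.mat

% In command format every argument is passed as literal text, so
%     load File_name Alpha
% would look for a file named File_name (or File_name.mat), not MyData.mat.
disp(Alpha);
```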

II. Good Programming Practices

Use save() to write the entire contents of one or more variables from a program (actually workspace) to a file.
Use load() to read the entire contents of one or more variables from a file into a program.
Use fwrite() to write the partial contents of one or more variables from a program to a file, or to save data to a file in a different data type.
Use fread() to read the partial contents of one or more variables from a file into a program, or to load data from a file and change its data type while it is being read.
If you write data in a binary format to a file, you should typically write the number of values "in front of" the actual data. This allows the code that reads the data to know how many values are stored in the file. (See the example in the algorithms section below.)
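A minimal sketch of the "count in front" practice from the last item is shown below. The file name counted.dat, the sample values, and the choice of int32 for the count are assumptions made for illustration.

```matlab
Data = [3.5 2.25 8 1.125];

% Write the number of values first, then the values themselves.
File_id = fopen('counted.dat', 'w');
fwrite(File_id, length(Data), 'int32');
fwrite(File_id, Data, 'double');
fclose(File_id);

% The reader recovers the count first and uses it to size the second read.
File_id = fopen('counted.dat', 'r');
N = fread(File_id, 1, 'int32=>double');
Values = fread(File_id, N, 'double=>double');
fclose(File_id);

disp(N);         % 4
disp(Values');   % the original four values
```

The reader still has to know the precision of the count and of the data; only the number of values is carried by the file.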

III. Algorithms

Suppose you need to store a series of "values", each of a different size and data type, to a binary file. To be able to read the data from the binary file, the size of each "value" must be known. Therefore, when you write binary data to a file, you typically need to write the size of each "value" as well.
For example, the following code writes a series of "values" related to cadets to a binary file.
```
Cadets = { { 'Fred Smith ', '1234567890', [96 87 76 89]  }, ...
           { 'Mary Taylor', '5678901234', [100 94 89 93] }, ...
           { 'Red Brown  ', '8901234567', [95 100 85 92] } };

File_id = fopen('Cadets.dat', 'w');
for j = 1:length(Cadets)
  fwrite(File_id, length(Cadets{j}{1}), 'int16'); % writes data "size"
  fwrite(File_id, Cadets{j}{1}, 'char');          % writes cadet name

  fwrite(File_id, length(Cadets{j}{2}), 'int16'); % writes data "size"
  fwrite(File_id, Cadets{j}{2}, 'char');          % writes cadet ID

  fwrite(File_id, length(Cadets{j}{3}), 'int16'); % writes data "size"
  fwrite(File_id, Cadets{j}{3}, 'int8');          % writes cadet grades
end
fclose(File_id);
```

The following code reads the binary file created by the code above.

File_id = fopen('Cadets.dat', 'r');
Cadets = {};        % start with an empty list of cadets
Number_cadets = 0;
while true
  % Read the number of characters in the cadet name
  N = fread(File_id, 1, 'int16=>double');
  if isempty(N)
    break; % No more data (end of file) - stop reading the file
  end
  % Read the cadet name ([1 N] returns a row, so the text displays on one line)
  Number_cadets = Number_cadets + 1;
  Cadets{Number_cadets}{1} = fread(File_id, [1 N], 'char=>char');

  % Read the cadet ID
  N = fread(File_id, 1, 'int16=>double');
  Cadets{Number_cadets}{2} = fread(File_id, [1 N], 'char=>char');

  % Read the cadet grades
  N = fread(File_id, 1, 'int16=>double');
  Cadets{Number_cadets}{3} = fread(File_id, [1 N], 'int8=>double');

end
fclose(File_id);

for J = 1:length(Cadets)
  for K = 1:length(Cadets{J})
    disp(Cadets{J}{K})
  end
end

Lab Work: Lab 24

References: Chapman Textbook: sections 8.2, 8.5
