6 Files
6.4 Random Access Files
Records are written to and retrieved from disk in a direct, or random, way.
Random file organisation requires direct access storage (DAS) media.
The program that stores and retrieves the records must first specify the address
of the record.
A field is selected to be the key for each record.
An algorithm (a set of instructions) turns the value of the record's key into an
address for the record.
6.4.1 Address Generation
The simplest method uses the value of the key field as the record address.
Example
Record Address | Customer Number | Customer Name
1              | 1               | <empty record>
2              | 2               | <empty record>
104            | 104             | Davis
208            | 208             | Peterson
405            | 405             | Franks
408            | 408             | Black
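However, with this method the records are often too spread out: most of the
addresses between the smallest and largest key are never used, which wastes
disk space.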
6.4.2 Hashing Algorithms
Hashing algorithms convert a record's key into an address for the record.
With numeric keys a common hashing algorithm is:
The Division-Remainder Method
- Estimate the number of records to store.
- Find the first prime number greater than the number of records.
- The key of the record to be stored is divided by this prime number
and the remainder is used as the address.
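As a rough sketch of the method, the program below hashes the four customer numbers from the earlier table. It assumes an estimate of 100 records, so the divisor is 101 (the first prime greater than 100). The names Prime and customerKey are made up for this illustration.

```basic
CONST Prime = 101            ' assumed: first prime above an estimated 100 records

DIM customerKey AS INTEGER
DIM i AS INTEGER

FOR i = 1 TO 4
    READ customerKey
    ' The remainder after division by the prime is used as the address
    PRINT customerKey; "->"; customerKey MOD Prime
NEXT i

DATA 104, 208, 405, 408
```

The four keys hash to addresses 3, 6, 1 and 4, so they now sit close together instead of being spread over 408 addresses. A remainder of 0 is also possible (for the key 202, say). QuickBASIC numbers the records of a random file from 1, so a real program would probably add 1 to the remainder before using it as a record address.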
For alpha-numeric key fields a common way of hashing the string to a record
address is to add up the ASCII values of the characters and find the remainder
when the sum is divided by a prime number, chosen as above.
Pseudocode
INPUT record details
sum = 0
FOR letter = 1 TO length of record's key
    extract character
    find ASCII code
    add it to sum
NEXT letter
address = sum MOD nearest prime
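The pseudocode above might be written in QuickBASIC roughly as follows. This is only a sketch: the divisor 101 and the key "Davis" are assumed values, and the variable names are chosen for illustration.

```basic
CONST Prime = 101                 ' assumed divisor (first prime above 100 records)

DIM recordKey AS STRING
DIM sum AS LONG
DIM letter AS INTEGER
DIM address AS INTEGER

recordKey = "Davis"               ' hypothetical key value
sum = 0
FOR letter = 1 TO LEN(recordKey)
    ' Extract one character, find its ASCII code and add it to the sum
    sum = sum + ASC(MID$(recordKey, letter, 1))
NEXT letter

address = sum MOD Prime
PRINT "Address for "; recordKey; " is"; address
```

For "Davis" the ASCII codes are 68, 97, 118, 105 and 115, which add up to 503, and 503 MOD 101 gives an address of 99.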
When two record keys are hashed to the same address, we say that they are
synonyms.
Possible Solutions
- Put the second in the next available space.
- Use a separate overflow area for such records.
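The first solution is often called linear probing. It can be sketched in memory with an array standing in for the file. The array slot, its size of 7 and the use of an empty string to mark a free address are all assumptions made for this illustration.

```basic
CONST Slots = 7                    ' assumed number of addresses
DIM slot(1 TO Slots) AS STRING     ' "" marks an empty address
DIM home AS INTEGER
DIM addr AS INTEGER
DIM tries AS INTEGER

slot(3) = "Davis"                  ' address 3 is already occupied

home = 3                           ' suppose "Franks" also hashes to address 3
addr = home
tries = 0
DO WHILE slot(addr) <> "" AND tries < Slots
    addr = addr MOD Slots + 1      ' move to the next address, wrapping 7 -> 1
    tries = tries + 1
LOOP

IF slot(addr) = "" THEN
    slot(addr) = "Franks"
    PRINT "Franks stored at address"; addr
ELSE
    PRINT "No free address left"
END IF
```

Here "Franks" ends up at address 4. When the record is retrieved later, the same search has to be repeated from the hashed address until the wanted key is found.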
As with indexed sequential files, at some point we may need to reorganise the
file.
6.4.4 Composite Data Types
As well as the standard data types in QuickBASIC we can also define our own data
types using the TYPE statement.
Example
TYPE MyRecord
    Aname AS STRING * 12
    Phone AS STRING * 12
    Units AS INTEGER
    Price AS SINGLE
    Amount AS DOUBLE
END TYPE
This is a composite data type.
We can now dimension variables and arrays as this new data type.
DIM details AS MyRecord
We can now store several items of data in one variable.
details.Aname = "James Bond"
details.Amount = 0.07
Likewise, we can do the same for arrays.
Example
DIM detailsarray (1 TO 10) AS MyRecord
Each element of the 1D array would have the composite parts as defined in the
TYPE statement.
detailsarray(1).Aname = "Another Person"
detailsarray(1).Amount = 0.95
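Note that a fixed-length string field always holds exactly the declared number of characters. Shorter values are padded with spaces and longer values are cut off, so a 14-character name such as "Another Person" does not fit into a STRING * 12 field. The short sketch below shows this. The type ShortRecord and the variable item are made up for the illustration.

```basic
TYPE ShortRecord
    Aname AS STRING * 12
END TYPE

DIM item AS ShortRecord

item.Aname = "James Bond"
PRINT "["; item.Aname; "]"        ' shows [James Bond  ] (padded to 12 characters)

item.Aname = "Another Person"
PRINT "["; item.Aname; "]"        ' shows [Another Pers] (cut to 12 characters)
```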
6.4.5 Data Storage In Random Access Files
Although it is possible to have variable length records with random access files
it is simpler to work with fixed length records.
Random access records are stored in a different way to sequential records.
Example
Field Name       | Data Type | Example
Customer's Name  | $         | Jones P.
Telephone Number | $         | 01503 123456
Phone Units Used | %         | 428
Price Per Unit   | !         | 8.0
Total To Pay     | #         | 3424.0
These data types are combined into a composite data type.
We need to decide how many characters to allow for each field.
The customer's name and telephone number are simple as they are strings. We
decide that no customers have names over 20 characters long. We allow 12 characters
for the phone number.
Storing Numeric Data
In sequential files numbers were stored as a series of ASCII characters. For
example, 17002 is stored using five bytes, one for each digit.
This wastes storage space, so in random access files numbers are saved in a
compact binary format.
In general:
Integers take:                     | 2 bytes
Long integers take:                | 4 bytes
Floating point (single precision): | 4 bytes
Double precision floating point:   | 8 bytes
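These sizes can be checked with the LEN function, which returns the number of bytes a variable occupies. The variable names in this sketch are chosen only for illustration.

```basic
DIM units AS INTEGER
DIM bigCount AS LONG
DIM price AS SINGLE
DIM amount AS DOUBLE

PRINT "INTEGER:"; LEN(units); "bytes"       ' 2
PRINT "LONG:   "; LEN(bigCount); "bytes"    ' 4
PRINT "SINGLE: "; LEN(price); "bytes"       ' 4
PRINT "DOUBLE: "; LEN(amount); "bytes"      ' 8
```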
So we can now complete our example:
Field Name       | Data Type | Bytes Required | Example
Customer's Name  | $         | 20             | Jones P.
Telephone Number | $         | 12             | 01503 123456
Phone Units Used | %         | 2              | 428
Price Per Unit   | !         | 4              | 8.0
Total To Pay     | #         | 8              | 3424.0
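Adding the byte counts gives 20 + 12 + 2 + 4 + 8 = 46 bytes per record. As a quick check, a sketch along these lines asks QuickBASIC for the record size using LEN. The type below uses the 20-character name from the table and borrows the name MyRecord from section 6.4.4 purely for illustration.

```basic
TYPE MyRecord
    Aname AS STRING * 20     ' customer's name
    Phone AS STRING * 12     ' telephone number
    Units AS INTEGER         ' phone units used
    Price AS SINGLE          ' price per unit
    Amount AS DOUBLE         ' total to pay
END TYPE

DIM phonebill AS MyRecord

PRINT "Record length:"; LEN(phonebill); "bytes"    ' 20 + 12 + 2 + 4 + 8 = 46
```

This is the value that has to be supplied as the record length when the random file is opened.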
6.4.6 Inserting Data Into A Random Access File
Step 1
First, the field structure of each record is defined by means of the TYPE ... END TYPE
statement.
TYPE MyRecord
    Aname AS STRING * 20
    Phone AS STRING * 12
    Units AS INTEGER
    Price AS SINGLE
    Amount AS DOUBLE
END TYPE
Step 2
An array or variable is explicitly declared.
DIM phonebill AS MyRecord
Step 3
Open our random file.
OPEN "G:Raphone.dat" FOR RANDOM AS #n LEN = L%
where n is the channel number and L% is the length of each record in bytes.
Note that random files are opened for input and output simultaneously.
Step 4
Now we assign data to our variable, phonebill.
phonebill.Aname = "Adams M"
phonebill.Phone = "01802 123456"
Step 5
We can now store this data in our file using
PUT #n, m, phonebill
where n is the channel number and m is the record address.
Step 6
Finally the file is closed.
CLOSE #n
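Putting the six steps together, a complete program might look something like the sketch below. The file name PHONE.DAT, the record address 5 and the use of FREEFILE to pick an unused channel number are assumptions for this example. The name field is given 20 characters, as in the table in section 6.4.5.

```basic
' Step 1: define the field structure
TYPE MyRecord
    Aname AS STRING * 20
    Phone AS STRING * 12
    Units AS INTEGER
    Price AS SINGLE
    Amount AS DOUBLE
END TYPE

' Step 2: declare a variable of the new type
DIM phonebill AS MyRecord
DIM channel AS INTEGER

' Step 3: open the random file (hypothetical file name)
channel = FREEFILE                         ' next unused channel number
OPEN "PHONE.DAT" FOR RANDOM AS #channel LEN = LEN(phonebill)

' Step 4: assign data to the variable
phonebill.Aname = "Adams M"
phonebill.Phone = "01802 123456"
phonebill.Units = 428
phonebill.Price = 8
phonebill.Amount = phonebill.Units * phonebill.Price

' Step 5: store the record at record address 5 (an arbitrary choice)
PUT #channel, 5, phonebill

' Step 6: close the file
CLOSE #channel
```

Using LEN(phonebill) for the record length means it does not need to be worked out by hand and stays correct if the TYPE definition changes.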
6.4.7 Retrieving Data From A Random Access File
As with inserting data, we need to have declared a variable or array with a composite
data type that matches the file's field structure.
As before, we open the file using
OPEN "G:Raphone.dat" FOR RANDOM AS #n LEN = L%
where n is the channel number and L% is the length of each record in bytes.
Now we can retrieve the data using
GET #n, m, phonebill
where n is the channel number and m is the record address.
When we've finished retrieving data we close the channel.
CLOSE #n
If we wish to find out how many bytes are in a random file we can use
LOF(n)
If this is divided by the byte length of each record, the number of records can be
calculated.
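A minimal sketch of retrieval and record counting follows. It assumes a file called PHONE.DAT already exists, that it holds records with the field structure shown, and that record 5 is the one wanted. The file name, record address and variable names are all hypothetical.

```basic
TYPE MyRecord
    Aname AS STRING * 20
    Phone AS STRING * 12
    Units AS INTEGER
    Price AS SINGLE
    Amount AS DOUBLE
END TYPE

DIM phonebill AS MyRecord
DIM channel AS INTEGER
DIM total AS LONG

channel = FREEFILE
OPEN "PHONE.DAT" FOR RANDOM AS #channel LEN = LEN(phonebill)

' Number of records = bytes in the file divided by bytes per record
total = LOF(channel) \ LEN(phonebill)
PRINT "Records in file:"; total

' Only read record 5 if the file is long enough to contain it
IF total >= 5 THEN
    GET #channel, 5, phonebill
    PRINT phonebill.Aname; " "; phonebill.Phone; phonebill.Amount
END IF

CLOSE #channel
```

The count includes any empty records that lie between the stored ones, because every address up to the highest one used takes up space in the file.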
6.4.8 Variable Length Records In A Random Access File
If the number of characters in a string field varies greatly or the number of fields
in the record varies then the use of variable length records is appropriate.
There are two ways of implementing them:
- Pick your own end of field and/or record markers.
- The first byte of each field is a character count.
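The second approach can be sketched in memory as follows. Each field is packed into a string with a one-byte character count in front of it, and then unpacked again by reading the count and skipping that many characters. The variable names are chosen only for illustration, and writing the packed string to disk is left out.

```basic
DIM custName AS STRING
DIM phone AS STRING
DIM packed AS STRING
DIM p AS INTEGER
DIM fieldLen AS INTEGER

custName = "Jones P."
phone = "01503 123456"

' Pack: a count byte followed by the characters of each field
packed = CHR$(LEN(custName)) + custName + CHR$(LEN(phone)) + phone

' Unpack: read the count byte, then that many characters
p = 1
DO WHILE p <= LEN(packed)
    fieldLen = ASC(MID$(packed, p, 1))
    PRINT MID$(packed, p + 1, fieldLen)
    p = p + 1 + fieldLen
LOOP
```

Because the count is held in a single byte, no field can be longer than 255 characters with this scheme.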