6 Files
6.5 Overview of File Processing
In this final part we shall consider operations on files and their uses.
The following terms are often used:
Interrogating/Referencing - Searching to find a particular
key.
Maintenance - Updating various records plus adding and
deleting records.
Sorting - Changing the sequence of records.
6.5.1 Updating Files
Updating By Overlay
Records in indexed sequential files and random files can be accessed directly,
modified and written back to their original locations.
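A minimal sketch of updating by overlay in Python, assuming fixed-length
records so that a record's position in the file can be calculated from its slot
number. The record layout (an integer ID, a 20-byte name and a balance) and the
function names are hypothetical, chosen only for illustration. An in-memory
io.BytesIO object stands in for a real file opened in "r+b" mode.

```python
import io
import struct

# Hypothetical fixed-length layout: 4-byte id, 20-byte name, 8-byte balance.
RECORD = struct.Struct("<i20sd")

def read_record(f, slot):
    """Read the record stored in position `slot` (0-based)."""
    f.seek(slot * RECORD.size)
    rec_id, name, balance = RECORD.unpack(f.read(RECORD.size))
    return rec_id, name.rstrip(b"\x00").decode(), balance

def overlay_record(f, slot, rec_id, name, balance):
    """Write a record over the location it occupies in the file."""
    f.seek(slot * RECORD.size)
    f.write(RECORD.pack(rec_id, name.encode(), balance))

# Build a small file, then update the second record in place.
f = io.BytesIO()
for slot, rec in enumerate([(1, "Ann", 10.0), (2, "Bob", 20.0), (3, "Cy", 30.0)]):
    overlay_record(f, slot, *rec)

rec_id, name, balance = read_record(f, 1)
overlay_record(f, 1, rec_id, name, balance + 5.0)
```

Only the modified record is rewritten; the other records are never read or
copied, and the previous contents of the updated record are lost.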
Updating By Copying
This method involves copying the records one by one to a new file, making
modifications as needed.
The result is two versions (or generations) of the file.
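Updating by copying might be sketched as follows. This is an illustration
only: it assumes comma-separated records whose first field is the key, and the
function name and the `changes` dictionary are hypothetical. In-memory
io.StringIO objects stand in for the old and new files.

```python
import csv
import io

def update_by_copying(old_file, new_file, changes):
    """Copy records from old_file to new_file, modifying them as needed.

    `changes` maps a key to its replacement record (a list of fields),
    or to None if the record is to be deleted.
    """
    writer = csv.writer(new_file, lineterminator="\n")
    for record in csv.reader(old_file):
        key = record[0]
        if key in changes:
            if changes[key] is None:
                continue              # deleted: do not copy it across
            record = changes[key]     # modified: copy the new version
        writer.writerow(record)

old = io.StringIO("100,Ann,10\n101,Bob,20\n102,Cy,30\n")
new = io.StringIO()
update_by_copying(old, new, {"101": ["101", "Bob", "25"], "102": None})
```

The old file is left untouched, which is why this method produces a second
generation of the file.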
6.5.2 File Backup And Generations
Each time a master file is updated by copying, the previous version is left
behind as an older, out-of-date generation.
It is common to keep three generations:
- Grandfather
- Father
- Son (Current version)
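One way the three generations could be rotated is sketched below. The file
names (grandfather.dat, father.dat, son.dat) and the function name are
assumptions made for this example, not part of any standard scheme.

```python
import os

GENERATIONS = ["grandfather", "father", "son"]  # oldest first

def rotate_generations(directory, new_master_path):
    """Install a newly written master file as the son.

    The old son becomes the father, the old father becomes the
    grandfather, and the old grandfather is overwritten.
    """
    paths = [os.path.join(directory, name + ".dat") for name in GENERATIONS]
    # Work from the oldest pair upwards so nothing is overwritten too early.
    for older, newer in zip(paths, paths[1:]):
        if os.path.exists(newer):
            os.replace(newer, older)
    os.replace(new_master_path, paths[-1])
```

After four updates the files hold the fourth, third and second versions; the
first version is no longer kept.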
6.5.3 Choosing Between Serial And Direct Access Files
The choice of file organisation is a vital consideration. The following questions
need to be answered:
- What is the most suitable storage medium for the volume of data
involved?
- Must the information always be up-to-date?
- Do users require immediate access to data?
- Can requests for information be grouped together and be batch
processed?
- Are reports required in a particular sequence?
- What is the hit rate?
- How volatile is the file?
6.5.4 Hit Rate
This is the measure of how many records are accessed out of the total number,
usually expressed as a percentage.
Example
Updating a payroll master file.
During the process 190 out of 200 employee records are updated.
(190 / 200) × 100 = 95%
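The same calculation as a small Python function (the function name is chosen
for this example only):

```python
def hit_rate(records_accessed, total_records):
    """Return the hit rate as a percentage of the records in the file."""
    if total_records <= 0:
        raise ValueError("total_records must be positive")
    return 100 * records_accessed / total_records

payroll_hit_rate = hit_rate(190, 200)   # 95.0
```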
6.5.5 Volatility
This is the frequency at which records are added to or deleted from a file.
If this frequency is high, then the file is said to be volatile.
6.5.6 Uses Of Different File Organisations
Serial Files
Serial file organisation is mainly used for transaction files. As events in the
real world take place, relevant data records are written to a transaction file.
Mainly used in:
- Sales in a shop.
- Customers withdrawing money from an ATM.
- Postal orders arriving at a mail order company.
The transactions may be batched and the master file updated later. Alternatively,
the master file may be updated as soon as each event occurs (in real time). The
transaction file is then kept as a record of what occurred, in case the master
file is corrupted and its father needs to be updated again.
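The role of the transaction file can be sketched as follows. This is a
simplified illustration: a Python list stands in for the serial transaction
file, a dictionary of account balances stands in for the master file, and the
function names are hypothetical.

```python
def record_transaction(log, account, amount):
    """Write one event to the end of the serial transaction file."""
    log.append((account, amount))

def apply_transactions(master, log):
    """Return a new master with every transaction applied in arrival order."""
    updated = dict(master)            # the original master is left unchanged
    for account, amount in log:
        updated[account] = updated.get(account, 0) + amount
    return updated

father = {"A1": 100, "B2": 50}
log = []
record_transaction(log, "A1", -30)    # withdrawal
record_transaction(log, "B2", 20)     # deposit
record_transaction(log, "A1", 5)      # deposit

son = apply_transactions(father, log)
```

If the son is later found to be corrupted, running apply_transactions on the
father with the same transaction file produces it again.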
Sequential Files
Sequential file organisation is used for master files in high hit-rate
applications.
May be used in:
- Payroll
- Direct mailing (a.k.a. Junk mailing)
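A sequential master file is typically updated in a single pass against a
transaction file sorted into the same key order. The sketch below assumes that
both files are in ascending key order, that there is at most one transaction
per key, and that every transaction key exists in the master file. Lists of
(key, value) pairs stand in for the files, and the names are hypothetical.

```python
def sequential_update(master, transactions):
    """Merge sorted transactions into a sorted master in one pass."""
    new_master = []
    pending = iter(transactions)
    current = next(pending, None)           # next transaction to apply
    for key, value in master:
        if current is not None and current[0] == key:
            value = current[1]              # apply the update
            current = next(pending, None)   # move on to the next transaction
        new_master.append((key, value))     # copy the record to the new file
    return new_master

payroll = [(101, 1500), (102, 1800), (103, 2100)]
changes = [(101, 1550), (103, 2200)]
new_payroll = sequential_update(payroll, changes)
```

Every master record is read once, so the method suits applications where most
records are updated on each run.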
Indexed Sequential Files
Indexed sequential files can be processed either sequentially or randomly. This
is very useful: when most of the records need to be processed, the file can be
processed sequentially, and when only a few records need to be updated, they can
be accessed directly.
May be used in:
- Stock control in a shop.
The stock file would be directly accessed when a customer makes a purchase.
The master file would be accessed using a multi-level index to find the relevant
record. The description and the price would then be printed on the receipt and
the quantity in stock would be updated right away.
The file would be sequentially processed if a report of all the stock or sales
is needed in stock code sequence. Processing the file this way is fast, but it is
not as fast as processing a sequential file.
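The two ways of processing an indexed sequential file can be sketched as
follows. This is a simplification: it uses a single-level index rather than the
multi-level index described above, holds the records in memory in blocks of two,
and all names are hypothetical.

```python
from bisect import bisect_left

class IndexedSequentialFile:
    def __init__(self, records, block_size=2):
        records = sorted(records, key=lambda r: r[0])
        # Records are held in key order, split into fixed-size blocks.
        self.blocks = [records[i:i + block_size]
                       for i in range(0, len(records), block_size)]
        # The index holds the highest key in each block.
        self.index = [block[-1][0] for block in self.blocks]

    def lookup(self, key):
        """Direct access: use the index to find the block, then search it."""
        i = bisect_left(self.index, key)
        if i == len(self.index):
            return None                     # key is higher than any in the file
        for k, value in self.blocks[i]:
            if k == key:
                return value
        return None

    def in_key_order(self):
        """Sequential access: read every record in key sequence."""
        for block in self.blocks:
            yield from block

stock = IndexedSequentialFile(
    [(30, "tea"), (10, "milk"), (20, "bread"), (40, "jam"), (50, "rice")])
item = stock.lookup(20)
report = list(stock.in_key_order())
```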
Random Files
Random files are used when extremely fast access is required to individual
records. Because the hashing algorithm generates the record address when it is
applied to the record's key, no time is taken looking through various levels of
index.
May be used in:
- Utility programs to validate user names and passwords (on a
network).
- Airline booking systems.
If reports are needed containing all the records in key sequence, these will
take a long time to generate.
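A minimal sketch of a random file is given below. The hashing algorithm (key
modulo the number of slots) and the handling of collisions (trying the next
slot along) are assumptions made for this example; the notes do not specify
either. A Python list stands in for the file, records cannot be deleted, and
all names are hypothetical.

```python
class RandomFile:
    def __init__(self, slots=11):
        self.slots = [None] * slots

    def _address(self, key):
        """Hashing algorithm: turn the key into a record address."""
        return key % len(self.slots)

    def store(self, key, value):
        start = self._address(key)
        for step in range(len(self.slots)):
            i = (start + step) % len(self.slots)
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return i                    # address actually used
        raise RuntimeError("file is full")

    def fetch(self, key):
        start = self._address(key)
        for step in range(len(self.slots)):
            i = (start + step) % len(self.slots)
            if self.slots[i] is None:
                return None                 # empty slot: key is not in the file
            if self.slots[i][0] == key:
                return self.slots[i][1]
        return None

    def report_in_key_order(self):
        """Every slot must be read and the records sorted afterwards."""
        used = [slot for slot in self.slots if slot is not None]
        return sorted(used, key=lambda r: r[0])

users = RandomFile()
users.store(23, "alice")
users.store(12, "bob")      # hashes to the same address as 23
users.store(5, "carol")
```

A fetch normally goes straight to the calculated address, whereas the report
has to read the whole file and sort it because the records are not stored in
key sequence.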