Friday 30 July 2010

SQL Defragmentation Technique






The nature of an SQL database file

A database file is a data storage centre that consists of an organized collection of data for one or multiple users. The size of this data centre grows as the user's gradually introduce new data into it. When a database file is first created, the operating system will assign an area of the hard disk or RAID large enough to store this file. Due to the nature of the database, this initial area will not be large enough for the gradually rising amount of data. The operating system will then assign a new area to store the new data belonging to the database. Obviously, this operation will be continued as typically, the user(s) are adding to the database everyday. As a result, file fragments are generated.


Why existing methods of recovery do not work

As we explained in the last article, when a database file is accidentally deleted, the metadata of this file is deleted at the same time. The fragments belonging to this deleted database will become anonymous without the associated metadata. To recover this file, the IT manager/DR technician will typically use a conventional recovery application. However, these applications are designed to retrieve files that have been accidentally deleted - they will be useless when being used to recover fragmented files, especially *.mdf file fragments.


MDF Page Structure and Deframentation Technique

MDF file uses a standard database file format to store the tables and contents. The minimum storage unit is a Page. A page is an integer multiple of sectors. The number of sectors in a page is determined by different database application. For example, a Microsoft MDF file has 16 sectors in a page. So the page size is 8KB. Within a MDF file, it has a page structure as shown below:

Page Sector Offset

Page ID

0

Page ID = 0

16

Page ID = 1

32

Page ID = 2

48

Page ID = 3

64

Page ID = 4

16*N

Page ID = N



MDF pages will be arranged in a sequential order based on the Page ID stored in a page header within a MDF file. Obviously, pages will be fragmented because of the nature of a database file as mentioned above. The SQL Defragmentation Technique is going to locate each piece of fragments by looking at the Page IDs in the page headers and then resemble them in a correct order.


Written by: Zijian Xie (R&D Manager, BEng, MSc)








No comments:

Post a Comment