RSS

Category Archives: Compression

Compression

Three kinds of compression

  • Row level data compression
  • Page level data compression
  • Backup compression

Benefits of compression

  1. Reduces the amount of physical disk space required to store data—and the disk I/O needed to write and read it
  2. reduces the amount of Buffer Cache memory needed to store data in the Buffer Cache. This in turn allows more data to be stored in the Buffer Cache, reducing the need for SQL Server to access the disk to get data, as the data is now more likely to be in memory than disk, further reducing disk I/O.

Disadvantage

Using compression uses up additional CPU cycles. If your server has plenty to spare, then you have no problem. But if your server is already experiencing a CPU bottleneck, then perhaps compression is better left turned off.

Row Level Compression works as follows:

  • Reducing the amount of metadata used to store a row
  • Storing fixed length numeric data types as if they were variable-length data types. For example, if you store the value 1 in a bigint data type, storage will only take 1 byte, not 8 bytes, which the bigint data types normally takes.
  • Storing CHAR data types as variable-length data types. For example, if you have a CHAR (100) data type, and only store 10 characters in it, blank characters are not stored, thus reducing the space needed to the store data.
  • Not storing NULL or 0 values

Note:offers less compression than page-level data compression, but it also incurs less overhead, reducing the amount of CPU resources required to implement it.

Page Level Compression Works in 3 stages follows:

  • First does a row-level data compression to get as many rows as it can on a single page.
  • Next, prefix compression is run. Essentially, repeating patterns of data at the beginning of the values of a given column are removed and substituted with an abbreviated reference that is stored in the compression information (CI) structure that immediately follows the page header of a data page.
  • And last, dictionary compression is used. Dictionary compression searches for repeated values anywhere on a page and stores them in the CI. One of the major differences between prefix and dictionary compression is that prefix compression is restricted to one column, while dictionary compression works anywhere on a data page.

Note: The amount of compression provided by page-level data compression is highly dependent on the data stored in a table or index. If a lot of the data repeats itself, then compression is more efficient. If the data is more random, then little benefits can be gained using page-level compression. Page-level data compression offers greater compression, but at the expense of greater CPU utilization

Backup Compression

Backup compression does not use row-level or page-level data compression. Instead, backup compression occurs only at the time of a backup, and it uses its own proprietary compression technique. Backup compression can be used when using, or not using, data compression, although using backup compression on a database that is already compressed using data compression may not offer additional benefits.

Important
Remember, compression is based on a per object basis. If you want to compress the non-clustered indexes for a table, you will have to compress each one, one at a time. Once you have compressed a table (a clustered index), keep in mind that any non-clustered indexes that this table may have are not automatically compressed for you.

Source: http://www.bradmcgehee.com/2010/03/an-introduction-to-data-compression-in-sql-server-2008/

 
Leave a comment

Posted by on October 2, 2011 in Compression, SQL Server