
Category Archives: High Availability

Basic High Availability summary points

Failover clustering provides failure transparency. Although the inventory database is unavailable for approximately 10-15 seconds during a failover, the database comes back online under the same instance name. The failover process proceeds as follows:

1. The first step is to restart SQL Server on the same node, because the cluster initially assumes that a transient error caused the health check to fail.
2. If the restart does not respond immediately, the SQL Server group fails over to another node in the cluster (the secondary node).
3. The network name of the server running SQL Server is unregistered from DNS.
4. The SQL Server IP address is bound to the network interface card (NIC) on the secondary node.
5. The disks associated with the SQL Server instance are mounted on the secondary node.
6. After the IP address is bound to the NIC on the secondary node, the network name of the SQL Server instance is registered in DNS.
7. After the network name and disks are online, the SQL Server service is started.
8. After the SQL Server service is started, SQL Server Agent and Full-Text indexing are started.

Failover clustering provides automatic failover without affecting transaction performance during normal operation. However, it takes 10-15 seconds on average for a failover to occur, and a cluster failover requires a restart of the Microsoft SQL Server instance. This restart causes all caches to start empty and affects performance until the cache is repopulated.
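
As a quick way to observe this behavior from the database side, the sketch below (instance and database names depend entirely on your environment) shows one way to confirm that an instance is clustered and which physical node currently owns it after a failover:

-- Minimal sketch: confirm the instance is clustered and see which node currently owns it.
SELECT
    SERVERPROPERTY('IsClustered')                 AS IsClustered,        -- 1 = failover cluster instance
    SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS CurrentOwningNode,  -- changes after a failover
    SERVERPROPERTY('ServerName')                  AS VirtualServerName;  -- stays the same across failovers

-- List the nodes that can own the instance (returns no rows on a non-clustered instance).
SELECT NodeName FROM sys.dm_os_cluster_nodes;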

Database mirroring can fail over to the mirror automatically. However, transactions have to be committed on both the principal and the mirror, which affects the performance of applications. Although database mirroring can fail over to the mirror automatically, the mirror database is on a separate instance: during a failover, the server name used to reach the inventory database changes, which breaks the drop shipping service until connections can be moved and re-created. Database mirroring configured in high-availability operating mode provides automatic detection and automatic failover, which generally occurs within 1-3 seconds. By using the capabilities of the Microsoft Data Access Components (MDAC) libraries that ship with Microsoft Visual Studio 2005 and later, applications can use the transparent client redirect capability to mask failures from users. Database mirroring also contains technology that keeps the cache on the mirror in a semi-hot state, so application performance is not appreciably affected during a failover.

Replication and log shipping require manual detection as well as manual failover. In a log shipping configuration, users cannot be connected to the secondary database while a log backup is being restored to it.

 
 

Replication

Transactional replication begins with an initial snapshot being applied to the subscriber to ensure that the two databases are synchronized. As subsequent transactions are issued against the publisher, the replication engine applies them to the subscriber. The incremental transaction flow from publisher to subscriber makes transactional replication a good choice for maintaining a secondary copy of a database for availability or to offload reporting operations. The most common configuration for transactional replication is in a server-to-server environment.

Transactional replication uses the Log Reader Agent to move transactions from the transaction log on the publisher to the distribution database. The Distribution Agent then moves transactions from the distribution database to each subscriber.

Transactional replication propagates data in one direction – from publisher to subscriber. You can configure transactional replication with two optional modes (immediate updating subscribers and queued updating subscribers) that enable transactions to be issued against a subscriber. Transactional replication can be configured in three different architectures: the default architecture is one publisher with one or more subscribers; alternatively, transactional replication can be configured using either a bidirectional or a peer-to-peer architecture.
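
As a rough illustration of the default one-publisher architecture, the sketch below creates a minimal transactional publication. All names (Inventory, InventoryPub, Products, SUBSCRIBERSRV) are placeholders, and the Distributor and distribution database are assumed to be configured already:

-- Enable the database for transactional publishing (run at the Publisher).
USE master;
EXEC sp_replicationdboption
    @dbname = N'Inventory', @optname = N'publish', @value = N'true';

USE Inventory;
-- Create the publication; the Log Reader Agent picks up changes continuously.
EXEC sp_addpublication
    @publication = N'InventoryPub',
    @repl_freq   = N'continuous',
    @status      = N'active';

-- Snapshot Agent for the initial synchronization of subscribers.
EXEC sp_addpublication_snapshot @publication = N'InventoryPub';

-- Publish one table as an article.
EXEC sp_addarticle
    @publication   = N'InventoryPub',
    @article       = N'Products',
    @source_owner  = N'dbo',
    @source_object = N'Products';

-- Push subscription; the Distribution Agent delivers commands to the subscriber.
EXEC sp_addsubscription
    @publication       = N'InventoryPub',
    @subscriber        = N'SUBSCRIBERSRV',
    @destination_db    = N'Inventory',
    @subscription_type = N'push';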

Immediate Updating Subscriber Option
The Immediate Updating Subscriber option allows changes to be made at the subscriber that are propagated back to the publisher. The change is then picked up by the transactional engine and sent to all other subscribers, but it does not get reapplied to the subscriber that originated the change. To prevent the change from being sent back to the originating subscriber, a timestamp column is required on each table that is participating in replication.

The process of applying changes at a subscriber configured with the immediate updating subscriber option is as follows:
1. The application issues the transaction at the subscriber.
2. The trigger is fired.
3. The trigger enlists the Microsoft Distributed Transaction Coordinator (MS DTC) to connect to the publisher and reissue the transaction.
4. The transaction commits on the publisher.
5. The trigger commits on the subscriber.
6. The transaction commits on the subscriber.

The main issue in high-availability architectures with immediate updating subscribers is that changes must be applied to the publisher. If the publisher is not available, the distributed transaction fails. Because the distributed transaction is being executed from a trigger, the initiating transaction also fails and rolls back. Therefore, the immediate updating subscriber option is an incompatible replication option for high availability.
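
For reference, the hedged fragment below shows the options that typically enable this mode when the publication and subscription are created; the publication, database, and server names are placeholders:

-- Allow immediate updating subscribers on the publication.
EXEC sp_addpublication
    @publication     = N'InventoryPub',
    @repl_freq       = N'continuous',
    @status          = N'active',
    @allow_sync_tran = N'true';      -- immediate updating via MS DTC

-- Subscription whose changes are applied synchronously back to the publisher.
EXEC sp_addsubscription
    @publication       = N'InventoryPub',
    @subscriber        = N'SUBSCRIBERSRV',
    @destination_db    = N'Inventory',
    @subscription_type = N'push',
    @update_mode       = N'sync tran';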

Queued Updating Subscriber Option
The Queued Updating Subscriber option also allows changes to be made at the subscriber and then propagated back to the publisher, but it does so via a much different mechanism than the immediate updating subscriber option.

The process by which a change is applied to a subscriber using this option is as follows:
1. The application issues a transaction.
2. The trigger is fired.
3. The trigger logs the transaction into a queue (a table within the database).
4. The trigger commits.
5. The transaction commits.
The Queue Reader Agent periodically transfers the queue back to the publisher and issues all the transactions. A change transferred to the publisher is not reapplied to the subscriber that initiated the transaction. This requires a timestamp column on the table.

Because of the asynchronous processing of a queued update, data conflicts can occur. To mitigate data conflicts when you use the queued updating subscriber option in high-availability architectures, changes are applied to the subscriber only during a failure scenario. This ensures that changes are always made to a single copy of the data, thereby enabling normal application processing to handle conflicting changes. The advantage of the queued updating subscriber option is that when the publisher is brought back online, all the changes that occurred at the subscriber during the outage can be flushed to the publisher automatically. All the planning and processes required to fail back with other architectures are eliminated, because the queued updating subscriber option has the built-in capability to bring the publisher up to date again following a failover. The only step required to fail back to the publisher is to repoint the applications.
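
A hedged sketch of the corresponding options (placeholder names again); the only differences from the immediate updating setup are the publication option and the subscription update mode:

-- Allow queued updating subscribers when creating the publication.
EXEC sp_addpublication
    @publication       = N'InventoryPub',
    @repl_freq         = N'continuous',
    @status            = N'active',
    @allow_queued_tran = N'true';    -- subscriber changes are queued, not applied via MS DTC

-- Subscription whose changes are logged to the queue and moved by the Queue Reader Agent.
EXEC sp_addsubscription
    @publication       = N'InventoryPub',
    @subscriber        = N'SUBSCRIBERSRV',
    @destination_db    = N'Inventory',
    @subscription_type = N'push',
    @update_mode       = N'queued tran';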

The push Distribution Agent requires the following permissions to function properly:

  • Membership in the db_owner role of the subscription database
  • Read permission on the snapshot folder
  • Membership in the publication access list (PAL)
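
A hedged sketch of granting those permissions follows; the agent login shown is a placeholder, and read permission on the snapshot folder is a share/NTFS permission rather than a T-SQL grant:

-- On the Subscriber, in the subscription database: make the agent login db_owner.
USE Inventory;
CREATE USER [CONTOSO\repl_agent] FOR LOGIN [CONTOSO\repl_agent];
EXEC sp_addrolemember @rolename = N'db_owner', @membername = N'CONTOSO\repl_agent';

-- On the Publisher, in the publication database: add the login to the publication access list.
EXEC sp_grant_publication_access
    @publication = N'InventoryPub',
    @login       = N'CONTOSO\repl_agent';

-- Read permission on the snapshot folder is granted at the file-system level.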

Tracer tokens enable an administrator to gather timing statistics within the replication topology, from publisher to distributor and from distributor to subscriber, to monitor point-to-point latency.
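
A minimal sketch of posting and reading a tracer token at the publisher (the publication and database names are placeholders, and the latency results only appear once the token has flowed through to the subscribers):

-- Run in the publication database at the Publisher.
USE Inventory;

-- Post a tracer token into the transaction log; it travels the normal replication path.
DECLARE @token_id int;
EXEC sys.sp_posttracertoken
    @publication     = N'InventoryPub',
    @tracer_token_id = @token_id OUTPUT;

-- Publisher-to-distributor and distributor-to-subscriber latency for that token.
EXEC sys.sp_helptracertokenhistory
    @publication = N'InventoryPub',
    @tracer_id   = @token_id;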

Source: MCTS Training Kit – Mike Hotek

 
 

Log Shipping

Log shipping allows you to automatically send transaction log backups from a primary database on a primary server instance to one or more secondary databases on separate secondary server instances.

Log Shipping Operations
  • Back up the transaction log at the primary server instance.
  • Copy the transaction log file to the secondary server instance.
  • Restore the log backup on the secondary server instance.
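
The same three operations can be performed by hand, which is a useful way to understand what the log shipping jobs automate. The sketch below uses placeholder database names and paths:

-- 1. Back up the transaction log at the primary server instance.
BACKUP LOG Inventory
    TO DISK = N'\\PRIMARYSRV\LogShip\Inventory_001.trn';

-- 2. Copy the .trn file to the secondary server (normally done by the copy job,
--    e.g. with robocopy); no T-SQL is involved in this step.

-- 3. Restore the log backup on the secondary server instance, leaving the database
--    able to accept further restores (NORECOVERY) or readable between restores (STANDBY).
RESTORE LOG Inventory
    FROM DISK = N'D:\LogShip\Inventory_001.trn'
    WITH STANDBY = N'D:\LogShip\Inventory_undo.tuf';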

Important: A log shipping configuration does not automatically fail over from the primary server to the secondary server. If the primary database becomes unavailable, any of the secondary databases can be brought online manually.

Configuration consists of:
  • A primary server and database – the database must use the full or bulk-logged recovery model
  • One or more secondary servers and databases
  • An optional monitor server

The optional monitor server tracks all of the details of log shipping, including:

  • When the transaction log on the primary database was last backed up
  • When the secondary servers last copied and restored the backup files.
  • Information about any backup failure alerts.
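
When a monitor server is configured, these details can be inspected through the log shipping objects in msdb, roughly as sketched below:

-- Run on the monitor server instance (or on the primary/secondary if no monitor is used).
USE msdb;

-- Summarized status of every log shipping primary and secondary known to this server.
EXEC sp_help_log_shipping_monitor;

-- Raw monitor tables: last backup on the primary, last copy/restore on each secondary.
SELECT * FROM msdb.dbo.log_shipping_monitor_primary;
SELECT * FROM msdb.dbo.log_shipping_monitor_secondary;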

Log Shipping Jobs – handled by dedicated SQL Server Agent jobs

  • Backup job – created on the primary server instance for each primary database. It performs the backup operation, logs history to the local server and the monitor server, and deletes old backup files and history information
  • Copy job – created on each secondary server instance in a log shipping configuration. This job copies the backup files from the primary server to a configurable destination on the secondary server and logs history on the secondary server and the monitor server
  • Restore Job – created on the secondary server instance for each log shipping configuration. This job restores the copied backup files to the secondary databases. It logs history on the local server and the monitor server, and deletes old files and old history information
  • Alert job – created on the monitor server instance (if used). This alert job is shared by the primary and secondary databases of all log shipping configurations using this monitor server instance. Any change to the alert job (such as rescheduling, disabling, or enabling the job) affects all databases using that monitor server. If a monitor server is not used, alert jobs are created locally on the primary server instance and each secondary server instance.

Enabling and configuring Log Shipping

Source http://msdn.microsoft.com/en-us/library/ms190640.aspx

 
 

Database Mirroring

  • Works only if the database uses the full recovery model.
  • The master, msdb, tempdb, and model databases cannot be mirrored.
  • Two copies of a single database that must reside on different server instances of SQL Server Database Engine.
  • One server instance serves the database to clients (the principal server).
  • The other instance acts as a hot or warm standby server (the mirror server), depending on the configuration and state of the mirroring session.
  • When a database mirroring session is synchronized, database mirroring provides a hot standby server that supports rapid failover without a loss of data from committed transactions.
  • When the session is not synchronized (asynchronous operation), the mirror server is typically available as a warm standby server (with possible data loss).
  • Under asynchronous operation (high-performance mode), transactions commit without waiting for the mirror server to write the log to disk, which maximizes performance. Under synchronous operation (high-safety mode), a transaction is committed on both partners, but at the cost of increased transaction latency.

Transaction safety is controlled by the SAFETY property of the ALTER DATABASE statement. On a database that is being mirrored, SAFETY is either FULL or OFF.

  • If the SAFETY option is set to FULL, database mirroring operation is synchronous, after the initial synchronizing phase. If a witness is set in high-safety mode, the session supports automatic failover.
  • If the SAFETY option is set to OFF, database mirroring operation is asynchronous. The session runs in high-performance mode, and the WITNESS option should also be OFF.
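
A hedged sketch of establishing the partners and setting SAFETY with ALTER DATABASE; the database name and TCP addresses are placeholders, the DATABASE_MIRRORING endpoints are assumed to exist already, and the mirror copy is assumed to have been restored WITH NORECOVERY:

-- 1. On the mirror server instance: point at the principal.
ALTER DATABASE Inventory
    SET PARTNER = N'TCP://principal.contoso.com:5022';

-- 2. On the principal server instance: point at the mirror (this starts the session).
ALTER DATABASE Inventory
    SET PARTNER = N'TCP://mirror.contoso.com:5022';

-- Optional witness (required for automatic failover), set from the principal.
ALTER DATABASE Inventory
    SET WITNESS = N'TCP://witness.contoso.com:5022';

-- Transaction safety, set on the principal.
ALTER DATABASE Inventory SET PARTNER SAFETY FULL;   -- synchronous, high-safety mode
-- ALTER DATABASE Inventory SET PARTNER SAFETY OFF; -- asynchronous, high-performance mode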

Role Switching

Within the context of a database mirroring session, the principal and mirror roles are typically interchangeable in a process known as role switching. Role switching involves transferring the principal role to the mirror server. In role switching, the mirror server acts as the failover partner for the principal server. When a role switch occurs, the mirror server takes over the principal role and brings its copy of the database online as the new principal database. The former principal server, if available, assumes the mirror role, and its database becomes the new mirror database. Potentially, the roles can switch back and forth repeatedly.

The following three forms of role switching exist.

  • Automatic failover – This requires high-safety mode and the presence of the mirror server and a witness. The database must already be synchronized, and the witness must be connected to the mirror server.

    The role of the witness is to verify whether a given partner server is up and functioning. If the mirror server loses its connection to the principal server but the witness is still connected to the principal server, the mirror server does not initiate a failover.

  • Manual failover – This requires high-safety mode. The partners must be connected to each other, and the database must already be synchronized.
  • Forced service (with possible data loss) – Under high-performance mode and high-safety mode without automatic failover, forcing service is possible if the principal server has failed and the mirror server is available.
    Important: High-performance mode is intended to run without a witness, but if a witness exists, forcing service requires that the witness is connected to the mirror server.

In any role-switching scenario, as soon as the new principal database comes online, the client applications can recover quickly by reconnecting to the database.
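
The three forms of role switching map to T-SQL roughly as follows (the database name is a placeholder):

-- Manual failover: issued on the current principal (high-safety mode, database synchronized).
ALTER DATABASE Inventory SET PARTNER FAILOVER;

-- Forced service with possible data loss: issued on the mirror when the principal is lost.
ALTER DATABASE Inventory SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS;

-- Automatic failover has no command: the witness and mirror arbitrate it when they
-- both lose contact with the principal in high-safety mode.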

 
 

Database Snapshots

What is it:
  • A read-only, static view of a database
  • Consistent with the source database at the moment of the snapshot’s creation.
  • When a database snapshot is created, the source database will typically have open transactions. Before the snapshot becomes available, those open transactions are rolled back in the snapshot to make it transactionally consistent.
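
A minimal creation sketch follows; the database, logical file name, and path are placeholders, and the NAME must match the logical name of each data file in the source database:

-- Create a snapshot of the Inventory database; one sparse file per source data file.
CREATE DATABASE Inventory_Snapshot_0800 ON
    ( NAME = Inventory_Data,                        -- logical name of the source data file
      FILENAME = N'D:\Snapshots\Inventory_0800.ss' )
AS SNAPSHOT OF Inventory;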

Uses:

  • Maintaining historical data for report generation
  • Safeguarding data against administrative error
  • Managing a test database – Before running the first round, an application developer or tester can create a database snapshot on the test database. After each test run, the database can be quickly returned to its prior state by reverting the database snapshot.
  • Safeguarding data against user error – To recover from a user error, you can revert the database to the snapshot immediately before the error. Reverting is potentially much faster for this purpose than restoring from a backup; however, you cannot roll forward afterward
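
Reverting to a snapshot is a single RESTORE statement, sketched below with placeholder names; note that any other snapshots of the source database must be dropped first:

-- Revert the source database to the state captured by the snapshot.
RESTORE DATABASE Inventory
    FROM DATABASE_SNAPSHOT = 'Inventory_Snapshot_0800';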
Restrictions:
  • You can’t back up, restore, or detach a database snapshot
  • The snapshot must be on the same instance as the source database.
  • Full-text indexes are not supported, though you can create a snapshot against a database that contains full-text indexes.
  • FILESTREAM is not supported, and any FILESTREAM data is inaccessible through the database snapshot, though you can create a snapshot against a database that is enabled for FILESTREAM.
  • You can't create a snapshot against a system database.
  • You can't drop, restore, or detach a source database that has a snapshot created against it.
  • You can't reference filegroups that are offline, defunct, or restoring.
When a Database Snapshot is created, SQL Server doesn't allocate space on disk equivalent to the current size of the data files in the source database. Instead, SQL Server takes advantage of an operating system feature called sparse files. A sparse file is essentially an entry in the file allocation table and consumes almost no space on disk. As data is added to the file, the file automatically grows on disk. By using sparse files, the creation time for a Database Snapshot is independent of the size of the source database.
Tip – Database Snapshot maximum size: Because SQL Server maintains the Database Snapshot at the point in time that the Database Snapshot was created, the maximum size of the Database Snapshot is the amount of data that existed in the source database at the time of creation.
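
You can see how little space the sparse files actually consume with a query along these lines (the snapshot name is a placeholder):

-- Actual on-disk size of the snapshot's sparse files versus their maximum (source) size.
SELECT
    DB_NAME(database_id)           AS snapshot_db,
    file_id,
    size_on_disk_bytes / 1048576.0 AS size_on_disk_mb
FROM sys.dm_io_virtual_file_stats(DB_ID(N'Inventory_Snapshot_0800'), NULL);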
Sources: SQL Server Books Online (BOL); MCTS 2008 Training Kit – Mike Hotek