Hosting SQL Server on VMWare

One of our clients was investigating the logistics of running SQL Server on VMWare. In discussions with Microsoft, the response included four (4) important links. I personally have seen various SQL Server performance issues when virtualizing especially in a SharePoint environment and I would never trust that configuration. There have been many enhancements with SQL Server 2005 and now SQL Server 2008 in order to consolidate servers and take advantage of reducing hardware needs and license costs. A good SQL Server 2008 consolidation white paper can be found here.

Here is the info from the Microsoft technician:

If you are running ESX V3.5 the latest hotfixes from VMware to disable write-caching are included in update3 and update 4.


1. The Microsoft Virtual Server support policy


2. The VMware knowledge base article that you can contact VMware about for hosting SQL Server


3. This article that explains the issue and discusses the problem domain


4. Here is an presentation from VMWare about how to disable the disk write issue.

To MERGE or not to MERGE in SQL Server 2008

I was reviewing some demo material for the MERGE statement that Jim Pletscher put together. It was a standard scenario where we wanted to UPDATE, DELETE, or INSERT rows from a source table to a target table. A general overview of this is available here.

Inserting is harmless but what if sometimes we don't want to update or delete something from our target table?

My first step was to add a new column to the target table. I named this column "DoNotTouch" and made it a bit defaulting to zero (0). I then set this value to one (1) on one of the target table rows.

So now how do we tell the MERGE statement to not touch anything where this bit is flipped on? You cannot use a WHERE clause in the MERGE statement. The answer is within the WHEN clause of the MERGE statement. All you need to do is add an additional condition to each WHEN for the update and delete operations:

(TargetTable is the name of the target table and SourceTable is the name of the source table)

WHEN MATCHED AND TargetTable.DoNotTouch<> 1 THEN


               UPDATE SET TargetTable.FirstName = SourceTable.firstname,

                                        TargetTable.LastName = SourceTable.lastname

WHEN NOT MATCHED BY SOURCE AND TargetTable.DoNotTouch<> 1 THEN
DELETE


So that worked! It did not touch the row where I set the DoNotTouch flag to one (1).

Introduction to Change Data Capture

The Change Data Capture feature is a new feature in SQL Server 2008 that allows processes to
quickly and easily identify changes to a table in SQL Server 2008. It primarily works by watching the SQL server transaction logs, and running processes that pull changes from the logs and moving them into tables.

Change Data Capture Setup

Enabling Change Data Capture is a two step process. First, you must enable it at the database level, then you must enable it for each table for which changes must be recorded.

The first step in setting up Change Data Capture is to enable the option at the Database Level. The easiest way to do this is to use the system stored procedure sys.sp_cdc_enable_db.
Simply execute this stored procedure from the desired database, and it will enable change data capture for the database. This stored procedure requires sysadmin privilege in order to execute it.

(click to enlarge)














Once it is enabled at the database level, the Capture Change Data option must be enabled for each table for which changes should be captured. Use the sys.sp_cdc_enable_table stored procedure to enable the feature, passing the table name as an argument in order to enable the feature. By default, all of the columns in the source table are identified as captured columns. If only a subset of columns need to be tracked, such as for privacy or performance reasons, use the @captured_column_list parameter to specify the subset of columns.

(click to enlarge)




















Once Change Data Capture (CDC) is enabled for a table, several things occur. First, an associated capture instance is created to support the collection of the change data in the source table. The capture instance consists of a change table and up to two query functions.

All inserts, updates, and deletes to the source table are stored in the change table. The two query functions are specific to the source table, and can be used to fetch changes from the change table.

The first five columns of a change data capture change table are metadata columns. These provide additional information that is relevant to the recorded change. The remaining columns mirror the identified captured columns from the source table in name. These columns hold the captured column data that is gathered from the source table.

Each insert or delete operation that is applied to a source table appears as a single row within the change table. The data columns of the row that results from an insert operation contain the column values after the insert. The data columns of the row that results from a delete operation contain the column values before the delete. An update operation requires one row entry to identify the column values before the update, and a second row entry to identify the column values after the update.

In additon to the Change instance, two SQL Agent Jobs are created: one that is used to populate the database change tables, and one that is responsible for change table cleanup. Both jobs consist of a single step that runs a Transact-SQL command. The Transact-SQL command that is invoked is a change data capture defined stored procedure that implements the logic of the job.

(click to enlarge)






















The jobs are created when the first table of the database is enabled for change data capture. The Cleanup Job is always created. The capture job will only be created if there are no existing transactional publications for the database.

Using Change Capture Data


Change data is made available to change data capture prcoesses via the two functions that are created when the change instance is created. Both functions require Log Sequence Numbers (LSNs) to help define the date range of change data to return in the resultset.


The function cdc.fn_cdc_get_all_changes_ returns all changes that occurred for the specified interval. Entries are always returned sorted, first by the transaction commit LSN of the change, and then by a value that sequences the change within its transaction.

The function cdc.fn_cdc_get_net_changes_ returns one change per modified source table row. If more than one change is logged for the row during the specified time period, the column values will reflect the final contents of the row.

Several functions are provided to help determine appropriate LSN values for use in querying the change data capture functions. For example, the functions sys.fn_cdc_map_time_to_lsn and sys.fn_cdc_map_lsn_to_time are available to help place LSN values on a conventional timeline.

To illustrate, insert a single record into sample table created earlier, then update the columns.




































Now we can use the cdc.fn_cdc_get_all_changes to set how the change table is populated:

















Note that the resulting change table contains four records, with the source table columns populated with the table data surrrounding the change. The first record reflects the initial insert. The 2nd and 3rd records reflect the update (a delete followed by new insert), while the fourth record reflects the final delete.
Note that running the cdc.fn_cdc_get_net_changes returns only a single row, which represents the newest version of any modified row. This procedure will only return one change table record per source record changed, regardless of how many changes were made to the source record.