Tips for Optimizing XML in SQL Server

by: Donabel Santos

I’ve worked on a project that used XML heavily inside SQL Server. We really utilized SQL Server’s XML support almost to the full extent, but with some repercussions. As we did our load testing, performance did degrade and we had to step back and adjust how we were using our XML data.

If you are using XML in your database, you may want to consider these few tips to optimize your overall queries and performance related to your XML data. 1. Promote frequently used elements and attributes into relational columns

If you find yourself always pulling out scalar values from your XML column in order to join to another table, you should consider “promoting” this value into a column.

Pro: This value can now be indexed, therefore can boost performance of your queries.

Con: It’s an additional management overhead. If you need to change the value in the XML document, you will also need to change the value in your relational column. You can consider doing this in the application level (ie change the value in both places at once), or creating a persisted calculated column that uses a UDF that extracts the scalar value for you, or even triggers (be careful! make sure you test before you implement in production). 2. Add a schema to your XML columns

XML is really just a text document. This poses an overhead to SQL Server because everytime you do operations, SQL Server will need to “guess” which data types may be appropriate for your operation before it does an implicit conversion. You can eliminate this step by creating typed XML columns, or XML that is bound to a schema (XSD).

Pro: Processing your XML will be faster than if you’re using untyped XML. This eliminates the data type guess work from SQL Server.

Con: This can create a management overhead. Your XML now suddenly becomes not-so-flexible anymore. Everytime you need to add a new element or attribute or new nested fragment, you will need to ALTER the schema first before you can make changes. This change may need to cascade down to all your stored procedures that use this column.
3. Use XML Indexes

You must first create a Primary XML index, then create secondary index(es).

– sample primary XML indexCREATEPRIMARY XML INDEX invoiceidx ON[Sales].[salesxml](xmlcontent) GO

There are 3 main secondary XML indexes:

– PATH Secondary XML Index– useful if you use the path, and if you do not have wildcardsSELECT xmlcontent.value(‘(sales/order[@ord_num="6871"])[1]‘,’varchar(20)’), othercols FROM [Sales].[salesxml]WHERE xmlcontent.exist(‘(sales/order[@ord_num="6871"])’)=1– PROPERTY Secondary XML Index– useful if searching for multiple properties, but may not have the full path– sampleSELECT othercols FROM [Sales].[salesxml]WHERE xmlcontent.exist(‘(//title_id)’)=1– VALUE Secondary XML Index– useful if you know exact value looking for, but may not have full pathSELECT othercols FROM [Sales].[salesxml]WHERE xmlcontent.exist(‘/sales/order[@ord_num="6871"]/text()[. = "Something"]‘)=1SELECT othercols FROM [Sales].[salesxml]WHERE xmlcontent.exist(‘(//title_id/@*[. = "special"])’)=1

Pro: As with regular indexes, faster searches.

Con: As with regular indexes, occupies more storage, and needs more resources needed to reprocess your indexes.

..to view more http://sqlserverperformance.idera.com/tsql-optimization/tips-foroptimizing-xml-sql-server/

This entry was posted in Technology and tagged , , , . Bookmark the permalink.

Comments are closed.