DBA

The Database Countryside Code: Best Practices for BI & SQL Users

Those ‘City Folk’ among you may not be aware, but in rural England we have what is called The Countryside Code: a set of guidelines that everyone should follow in order to keep the countryside clean, tidy and a nice place to visit.  You may be asking – what does this have to do with Business Intelligence and Database Administration?  Well, I think it’s vital – if we all follow a fairly simple but broad set of guidelines then all classes of database user will have a better experience, from Developers to DBAs and Analysts to CIOs.  This isn’t really about making your databases perform better, it’s about working better with each other and taking other people’s perspectives on board.  Having been in most of the related roles over the years, this is what I’d put into The Database Countryside Code…

1. Enjoy the countryside and respect its life and work
Whether your application is an ‘out of the box’ software suite, a Business Intelligence package that can be tweaked on implementation or a hand-crafted bespoke solution, if you’re running against a database maintained by someone else or shared with other applications you need to take heed of this point.  Remember that cooperation is key: if you build a good relationship with the DBA and the other key users of the database you’ll have a much better time of things, and if there are any critical issues you’ll be included in the remediation process and may even be able to help your own users get back online faster.  It’s easy to see DBAs as grouchy, narrowly focused sorts who tend to view all user activity as bothersome (I can say that as I’ve been one myself), but generally speaking if the DBA is aware of user activity at all then the chances are that there’s already a problem, as it’s the long-running, resource-intensive activity that stands out in alerts and performance reports.  Before your application goes live you should do some testing, run your designs and SQL statements / stored procedures past the DBA for some advice (but remember, you don’t have to take it) and establish some sort of procedure for reporting issues.  Remember too that an SLA can work both ways, as you may need the DBA’s help as much as they might need yours.

 

2. Guard against all risk of fire
Security is a huge issue, and as exploit frameworks and toolkits become more prevalent and feature-rich, the discovery of vulnerabilities in our applications should be treated as a certainty rather than a likelihood.  If you’re developing bespoke applications, and especially web apps, you’ll need to pay close attention to the OWASP Top 10 application security risks, but from a database perspective the most notable threat is SQL Injection: the art of passing SQL into an application so that it might be executed by the database (as a good starting point check out OWASP’s SQL Injection Prevention Cheat Sheet).  If you’re deploying packaged apps or BI tools don’t think that you’ve gotten away with it.  The primary responsibility may be on software developers to avoid exploits, but if they’re baked into an application you’re implementing they will still affect your users and your business, which is where the next point comes in.
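To make the SQL Injection risk concrete, here is a minimal sketch in Python using the standard library’s sqlite3 module as a stand-in for whatever database you actually run against.  The customers table and the function names are made up for illustration; the point is only the difference between pasting user input into the SQL text and passing it as a bound parameter.

```python
import sqlite3


def make_demo_db():
    # Hypothetical in-memory table, used purely for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers (name) VALUES (?)", [("Alice",), ("Bob",)]
    )
    return conn


def find_customer_unsafe(conn, name):
    # Vulnerable: the user's input becomes part of the SQL statement itself.
    query = "SELECT id, name FROM customers WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_customer_safe(conn, name):
    # Safer: the value is bound as a parameter and is never parsed as SQL.
    return conn.execute(
        "SELECT id, name FROM customers WHERE name = ?", (name,)
    ).fetchall()
```

Passing something like x' OR '1'='1 as the name makes the unsafe version return every row, whereas the parameterised version simply finds no customer with that odd name.  Most database drivers offer an equivalent placeholder mechanism, though the placeholder syntax varies.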

 

3. Protect wildlife, plants and trees
The most important security contribution we as implementers can bring to the table is to review and limit the privileges required by our applications.  Many install guides and expensive external consultants ask for a ‘dbo’ (database owner) level user and some even ask for ‘sa’ (system administrator) or ‘root’ level privileges, but don’t hand these out like candy on Halloween.  In most cases these high-level privileges are only required during setup and install and can be removed afterwards, and often basic read/write access is all that is required (for BI tools, often read-only).  It may take a few frustrating rounds of trial and error, but if you assign your applications the lowest possible permissions you will significantly reduce the risk of compromise in the future.  Another important step during implementation is to make sure that your permissions are segregated: where possible have a separate user for each service, and an entirely separate user for accessing each database that is not shared with any other application.  Whilst it may seem excessive, this setup will allow you to audit any security issues and identify which user was compromised and exactly what they had access to.
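On a real server the privileges themselves are granted by the DBA, but the read-only idea can be sketched on the application side too.  The example below is only an analogy, assuming Python’s sqlite3 module and a throwaway file called reporting_demo.db (a made-up name): the reporting connection is opened read-only, so even a buggy or compromised report cannot modify the data.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_read_only(path):
    # mode=ro asks SQLite to refuse any write made through this connection.
    return sqlite3.connect(Path(path).as_uri() + "?mode=ro", uri=True)


def demo():
    # Set up a small database with a normal read/write connection.
    path = Path(tempfile.mkdtemp()) / "reporting_demo.db"
    setup = sqlite3.connect(str(path))
    setup.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, amount REAL)")
    setup.execute("INSERT INTO sales (amount) VALUES (9.99)")
    setup.commit()
    setup.close()

    # The reporting side only ever gets the read-only connection.
    reader = open_read_only(path)
    try:
        rows = reader.execute("SELECT amount FROM sales").fetchall()
        try:
            reader.execute("DELETE FROM sales")
            write_blocked = False
        except sqlite3.OperationalError:
            write_blocked = True
    finally:
        reader.close()
    return rows, write_blocked
```

Here demo() reads the row back but has its DELETE rejected.  This complements, rather than replaces, a properly restricted database login.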

 

4. Fasten all gates
Many Business Intelligence tools include some degree of control over connection management, and if you’re developing your own application you’ll have complete control over all database connections.  The decision to be made is whether connections are ‘pinned’ open, closed after x minutes or closed at the end of each transaction.  The preference will vary depending on the load and the usage.  In most Business Intelligence use cases there tend to be a large number of users, not always connected concurrently, issuing fairly large queries against the database followed by periods of quiet whilst a report is read – in this case there is usually no need to keep the connection open for long.  On the other hand, if you have users issuing a constant stream of small transactions (e.g. a Point of Sale system) the overhead of creating and dropping connections might actually add load to the database, so in this scenario it would be more effective to maintain the connection.
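For the reporting-style workload, the ‘close it when you’re done’ approach might look something like the sketch below.  It assumes Python and uses sqlite3 purely as a stand-in driver; short_lived_connection and run_report are hypothetical names.  The connection is opened for one unit of work and is always closed afterwards, whether the work succeeds or fails.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def short_lived_connection(database):
    # Open a connection for a single unit of work.
    conn = sqlite3.connect(database)
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        # Always hand the connection back, even if the query failed.
        conn.close()


def run_report(database):
    # Stand-in for a large report query followed by a quiet period.
    with short_lived_connection(database) as conn:
        return conn.execute("SELECT 1 + 1").fetchone()[0]
```

For the Point of Sale style of workload you would do the opposite and hold a connection (or a small pool of them) open across transactions.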

 

5. Keep your dogs under close control
This applies more to developers and BI architects, where your dogs are your users.  If you are deploying an application that creates load on somebody else’s database you should do whatever you can to limit each user’s ability to cause long-running queries – some BI tools give you the option to let a query time out after x minutes and perhaps to limit the number of rows returned.  If you are developing your own application you should include both of these options, but make sure that you kill the query at the database level rather than just killing the thread in your application that made the request; otherwise the query keeps running on the server, and the result can be equally bad if not worse since the user may simply re-issue the offending query.  The actual limits are bound to vary from database to database, but that’s where the first point comes in: discuss this with both your users and the DBA.
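As a rough sketch of both limits, the Python below uses sqlite3’s progress handler so that the database engine itself abandons a statement once a deadline has passed, and fetchmany to cap the rows returned.  SQLite is only a stand-in here, and run_limited with its default limits is a hypothetical helper; on a server platform you would use that platform’s own statement timeout or cancel mechanism instead.

```python
import sqlite3
import time


def run_limited(conn, sql, params=(), timeout_s=5.0, max_rows=1000):
    deadline = time.monotonic() + timeout_s

    def check_deadline():
        # A non-zero return value tells SQLite to abort the running statement.
        return 1 if time.monotonic() > deadline else 0

    # Ask SQLite to call check_deadline every 1,000 virtual machine steps.
    conn.set_progress_handler(check_deadline, 1000)
    try:
        cursor = conn.execute(sql, params)
        rows = cursor.fetchmany(max_rows)  # never return more than max_rows
        cursor.close()
        return rows
    finally:
        # Remove the handler so later statements are not affected.
        conn.set_progress_handler(None, 0)
```

A query that runs past the deadline raises sqlite3.OperationalError inside the engine rather than being left to run while the application merely stops waiting for it.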

 

6. Keep to public paths across farmland / Use gates and stiles to cross fences, hedges and walls
When it comes to solving problems try to stick within the basic and simple boundaries of an ordinary user: avoid undocumented stored procedures, excessive use of user defined functions, custom data types, plugins and extended stored procedures, or anything else that strays too far from a standard install of the database platform.  Obviously you’ve got an app to deploy and you want to solve your problems in whatever way is best for your users, but the further you are from a standard deployment the more issues you’re likely to encounter.  Both you and the DBA might be fully aware of this amazing new setting you tweaked to make things run better, but a couple of years down the line during a disaster recovery will it all come flooding back quite as easily?  What if one or both of you who set up the application have moved on to other roles?  Thinking outside the box is great but be conscious of introducing risk, and if you do feel that it is necessary then make sure that it’s well documented in the Run Book or the corporate wiki.

 

7. Leave livestock, crops and machinery alone
Since you may already have elevated privileges on your own database, a shared database or even the server, you may be tempted from time to time to perform maintenance tasks or make minor ‘improvements’ to indexes or configuration settings – do not do so without the DBA’s blessing.  If you’re following the rules above you’ll probably have a fairly good rapport with the DBA already, so it’s likely that you’ll be granted some level of trust not to mess things up, but be careful not to overreach.  The DBA will be ‘in the loop’ on many changes and other requirements (e.g. critical deadlines, disaster recovery tests, unplanned maintenance) that you may not be aware of, so before you make any changes run them past the DBA – just in case.

 

8. Take your litter home / Help to keep all water clean
If you’ve ever been a DBA you’ll have seen, on more than one occasion, tables popping up called tmpSomethingorOther, tblToBeDeleted or TableName_bak, but when it comes to the key questions (How long have these been around?  Are they still required?) nobody seems to have a straight answer.  I know myself that whilst I’ve been developing data warehouses I’ve created these sorts of tables and subsequently forgotten what they were used for.  That’s not too much of a problem if you’re ‘the guy’, but in a large team or with personnel changes over time it can be hard to know what is required and what isn’t – I came to a database once with temporary tables over five years old which had not been deleted out of fear that they were important.  The moral here is an obvious one: clean up after yourself, or if the table must exist for some short period of time put a note in your diary to come back and cull it.

 

9. Make no unnecessary noise
Be mindful of what errors you raise and what you write to public logs.  If your application causes a large amount of data to be written to database or other centrally collated logs you may inadvertently make it harder to detect genuine issues, which will hurt both you and other users of the database.  If you do occasionally need exhaustive logs consider adding a ‘debug mode’ to your application which can be turned on or off via a configuration setting; that way you can turn it on whilst you’re tracing a fault and need more verbose logging, then turn it off when you’re done.
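A ‘debug mode’ switch of this kind can be very small.  The sketch below assumes Python’s standard logging module; the logger name "app" and the debug_mode flag are hypothetical, and in practice the flag would be read from your application’s configuration.

```python
import logging


def configure_logging(debug_mode=False):
    # "app" is a made-up logger name used only for this illustration.
    logger = logging.getLogger("app")
    # Verbose output only when debug mode is on; otherwise warnings and above.
    logger.setLevel(logging.DEBUG if debug_mode else logging.WARNING)
    return logger
```

With debug mode off, calls such as logger.debug(...) are discarded cheaply, so the tracing statements can stay in the code without flooding shared logs.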

 

10. Take special care on country roads
There can be plenty of unexpected hazards on country roads so don’t always rush around everywhere at 60mph.  Acknowledge that whilst you might want everything to go as fast as possible you could be causing some other critical process to slow or stop.  Driving at night can be treacherous too as you might come across an unexpected backup window or import/export process, so talk to your DBA and coordinate the major tasks.  If it’s a shared server make sure you have access to the task list so that you know where to slot in your jobs and that those jobs get put back into the master list.

Really it comes down to one thing, as the great and wise Jerry Springer oft said, “take care of yourselves, and each other”.

Posted by Ash - 20111230 at 14:01

Categories: Business Intelligence, DBA

Quick Tip – PostgreSQL Equivalent of ISNUMERIC()

Very much like my previous MySQL ISNUMERIC() post I have recently been setting up a data source to collect records with telephone numbers from a Postgres database and one of the essential validation tests is to make sure that the field really does contain a number.

Despite the fact that many regard Postgres as the best open source database platform I find myself frustrated by its lack of standard functions.  I understand that Postgres is designed to be extensible and that user defined functions can be built, but I need my code to be both portable and read-only so I have to work with what I’m given.  Ideally what I’d be looking for is an equivalent of Microsoft SQL Server’s ISNUMERIC() or Excel’s ISNUMBER() functions, but very much like MySQL I had to turn to regular expressions, although as you’ll see, Postgres does not have a clean and clear REGEXP() function…

SELECT DISTINCT contact_number
FROM customers
WHERE (contact_number ~ '^[0-9]+$');
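If you also want to apply the same digits-only test in the application that consumes these records, the pattern carries over directly.  This is a small sketch in Python, with is_all_digits being a made-up helper name; like the query above it accepts only strings made up entirely of the digits 0 to 9.

```python
import re

# Same character class as the Postgres pattern '^[0-9]+$'.
DIGITS_ONLY = re.compile(r"[0-9]+")


def is_all_digits(value):
    # fullmatch requires the whole string to match, like ^ and $ together.
    return value is not None and DIGITS_ONLY.fullmatch(value) is not None
```

Note that, just like the SQL version, this rejects spaces, a leading plus sign and empty strings, so it is a strict ‘digits only’ test rather than a general telephone number validator.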

I hope that helps any of you out there that encounter the same problem, thanks to the poster here for my original answer.

Posted by Ash - 20111201 at 10:03

Categories: DBA, PostgreSQL

Using MySQL BLOB Data via ODBC in SSIS, SQL Server & Business Objects

Whilst trying to build a centralised cross-platform alerting system I spotted a peculiar issue when trying to move the output of a SHOW FULL PROCESSLIST command on MySQL via ODBC.  It seems that the SHOW FULL PROCESSLIST command returns both integer and binary (BLOB) data types even though to the eye (that is, in the MySQL Query Browser) most of the columns appear to be short text fields.

Despite the fact that the data looks like text, whenever I tried to return it into an application (I tried SSIS, SQL Server Linked Servers and Business Objects) the data would come back unusable or an error would be returned.  Business Objects gave me the key by declaring “This is a BLOB.” as you can see in the following screenshots…

[Screenshots: SQL Server Integration Services, SQL Server Linked Server, Business Objects Desktop Intelligence]

As with my recent post about loading data into MySQL with SSIS the saviour turns out to be an ODBC configuration setting, this time in the Metadata tab of the MySQL ODBC driver.  All you have to do is check the “Always handle binary function results as character data” option and instantly your problems will be solved…

Out of a crazy fit of completeness I also took screenshots of the final results and it’d be a shame to waste them so here they are…

[Screenshots of the final results: SQL Server Integration Services, SQL Server Linked Server, Business Objects Desktop Intelligence]

 

Posted by Ash - 20110310 at 21:58

Categories: Business Objects, DBA, Microsoft SQL Server, MySQL, Open Source, SSIS

Using SQL Server 2008 R2 Linked Servers with PostgreSQL 64-bit

Having set up a Linked Server in Management Studio talking to a PostgreSQL 8 database, I encountered the following error when attempting to run any valid query:

Msg 7399, Level 16, State 1, Line 1
The OLE DB provider "MSDASQL" for linked server "PG_SERVER" reported an error. The provider reported an unexpected catastrophic failure.
Msg 7350, Level 16, State 2, Line 1
Cannot get the column information from OLE DB provider "MSDASQL" for linked server "PG_SERVER".

After some digging I came across a handy article on Microsoft Connect describing the same issue, with thanks to Nenea Nelu here’s the solution…

  • Expand Server Objects > Linked Servers > Providers. 
  • Right-click on MSDASQL and select Properties…
  • In the Properties dialogue un-check “Allow inprocess”.
  • Click OK and re-run your query. 

Hopefully that should solve your problem.  Please note that this will affect all Linked Servers using that provider; however, as the Connect article points out, this is best practice for linked servers anyway.

1 comment.  Posted by Ash - 20110228 at 15:48

Categories: DBA, Microsoft SQL Server, PostgreSQL

SQL Server Backup Compression vs. Quest Litespeed Engine

As more and more functionality is built into products like SQL Server it’s always worthwhile reviewing third-party tools and utilities when you’re considering an upgrade to see (a) if they’re still required and (b) if the tools themselves need to be upgraded.  With the introduction of Backup Compression in SQL Server 2008 R2 Standard Edition you could begin to think that the future is grim for Quest’s backup compression software LiteSpeed so I thought I’d do some testing to see exactly how it stacks up against the native compression.

LiteSpeed Engine

I’ve been using LiteSpeed on and off for a few years now and it has always been a great tool, but I’ve always found it a bit of a drag to have to use the GUI to administer and set up jobs.  However, in January 2010 Quest launched the LiteSpeed Engine for SQL Server which allows you to administer jobs using the native SQL Server tools.  The LiteSpeed Engine acts as a driver and the configuration tool allows you to define a variety of configuration profiles based on file extension.  From that point onwards you can use the Management Studio to set up backup jobs, maintenance plans, etc. and all you have to do is specify the file extension of the profile you wish to use.

The configuration tool allows you to specify the compression level from 1 to 8 and the encryption level, including various bit-length versions of RC2, RC4, 3DES and AES, though as you’ll see later the overhead of adding the highest level (256-bit AES) isn’t that great so I’d always shoot for the maximum.

Benchmark Structure

The test is relatively unscientific since I used only one database, but it was carried out systematically.  The data comes from a transactional billing system which I chose as it has a mix of structured tables and raw transactions and comes in at about 6.5GB, so it wouldn’t take too long to test.  I used the following configurations…

Benchmark Results

On my test database the baseline SQL Server native compression reduced the 6.2GB database to 765MB (12.2% of the original size) and took less than half the time (43%).  To achieve the same level of compression using LiteSpeed I had to use Level 2, which gave me 12.2% of the original size and 40% of the original duration.

At first this doesn’t look great for the third-party tool but the benefit of using a mature backup compression engine is the flexibility and LiteSpeed’s configurations allow you to tweak the performance to solve whatever problem you have in your environment whether that be the absolute size of the backup, the backup window time or a mixture of the two.

If it’s size you’re after then Level 8 really did seem to work wonders on my test DB, bringing the size down to 5.6% of the original at only 352MB, though it did take 2.6 times the original duration.  If it’s the backup window you’re looking to reduce then the basic Level 1 did manage to improve on the native compression by taking only 37% of the original duration whilst still compressing to 13% of the original size.  If like most people you’re looking to have your cake and eat it (i.e. reducing size and backup window) I’d suggest that Level 3 is the best compromise, giving 10.9% of the original size at 77% of the original duration so you get some benefit in both areas, though Level 4 takes compression a bit further and still gave a slight time reduction.

Clearly, the real answer is testing, and since I’m at the beginning of a data warehousing project I’m not in a position to make any firm decisions.  I think that even if you don’t run out and purchase it now LiteSpeed is a very valuable tool to have in your mental arsenal, so that if you come up against backup size/window issues or you’re faced with older versions of SQL Server you’ve got a solution in mind already.  Quest have an odd policy of keeping pricing quite opaque but I believe that the full Enterprise version (including the LiteSpeed Engine) retails for around £1,800 ($2,800), which isn’t too bad if you need that level of flexibility.

SQL Server Native Compression

Compression   Size (MB)   Time (s)   Size (%)   Time (%)
Disabled      6,261       70         100%       100%
Enabled       765         30         12.2%      43%

LiteSpeed Compression (No Encryption)

Compression   Size (MB)   Time (s)   Size (%)   Time (%)
None          6,262       70         100%       100%
Level 1       813         26         13.0%      37%
Level 2       761         28         12.2%      40%
Level 3       680         54         10.9%      77%
Level 4       649         61         10.4%      87%
Level 5       596         122        9.5%       174%
Level 6       586         151        9.4%       216%
Level 7       387         178        6.2%       254%
Level 8       352         185        5.6%       264%

LiteSpeed Compression (With Encryption)

Compression   Size (MB)   Time (s)   Size (%)   Time (%)
Level 1       813         46         13.0%      66%
Level 2       761         31         12.2%      44%
Level 3       680         60         10.9%      86%
Level 4       649         67         10.4%      96%
Level 5       596         126        9.5%       180%
Level 6       586         156        9.4%       223%
Level 7       387         182        6.2%       260%
Level 8       352         190        5.6%       271%
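If you want to repeat the comparison on your own database, the Size (%) and Time (%) columns appear to be nothing more than each run’s figures divided by the uncompressed baseline.  A small Python sketch of that arithmetic follows; relative_to_baseline is a hypothetical helper name and the rounding is my assumption, chosen to match the tables.

```python
def relative_to_baseline(size_mb, time_s, base_size_mb, base_time_s):
    # Size as a percentage of the uncompressed backup, to one decimal place.
    size_pct = round(100 * size_mb / base_size_mb, 1)
    # Duration as a percentage of the uncompressed backup, to the nearest 1%.
    time_pct = round(100 * time_s / base_time_s)
    return size_pct, time_pct
```

For example, the native compression row (765MB in 30 seconds against a baseline of 6,261MB in 70 seconds) works out at 12.2% of the size and 43% of the time.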

3 comments.  Posted by Ash - 20101231 at 14:50

Categories: DBA, Microsoft SQL Server, Tools & Utilities

SQLBits 7 – Friday Conference Rundown

Many people only attend the free ‘Community Day’ of SQLBits and I can understand why given the cost (£125) for the Friday sessions but if SQL Server is how you make your living I really do think it’s worth the money.  It’s not even that the Friday sessions are significantly different in content, it’s really just more of the same high level of quality you get on Saturday but when it comes to SQLBits more is definitely better.

It’s always a tough choice picking which sessions to attend so it’s often best to go with speakers you know will be good, so despite having spent the entire previous day with Maciej Pilecki in the SQLBits Training Day I made my first session Maciej’s SQL Server Statistics talk.  Despite a few initial technical gremlins the talk went well and gave a few insights into how statistics are used by the query optimiser.  The key takeaways were to always keep both AUTO_CREATE_STATISTICS and AUTO_UPDATE_STATISTICS turned on, to consider turning on AUTO_UPDATE_STATISTICS_ASYNC (this does not force queries to wait for stats to be updated but subsequent queries will benefit) and to run sp_updatestats after any major updates or to reindex your tables periodically.

My next session was Brent Ozar’s Virtualisation and SAN talk, which gave me a whole load of questions to take back to my SAN Administrator as well as a whole load of tests I intend to perform before I deploy my next Data Warehouse on a Hyper-V guest.  One concept that was completely new to me was the Balloon Driver that hypervisors use to encourage Windows to free up RAM.  Since SQL Server is a good citizen it can end up flushing the entire Buffer Pool and wrecking your performance – the solution is to ensure that Dynamic Memory is disabled in the Hyper-V Manager.  Some great related resources can be found at…

The lunchtime sponsor talk I chose was the one from Quest that covered IT Horror Stories.  It was a brilliant session with plenty of audience interaction and steered clear of pimping any specific Quest products, instead just showing that the people who work there are experienced, pragmatic and generally just nice guys.  I think this approach is far better than the extended product demos that many software companies tend to give as their lunchtime sessions, as they’ll only be of interest if you’re genuinely considering the product and if you’re not they’ll do little to increase brand awareness with a room full of bored people on Twitter or Facebook.

After lunch I went for Buck Woody’s talk on Business Continuity which provided a few simple paths and the crucial tasks to help get people started on a business-relevant disaster recovery strategy.  I was particularly impressed with one of the central themes of the talk which was (I’m reading between the lines a little) that even if you think it’s ‘not your job’ to put a DR plan in place, it’s likely that as the company’s ‘Data Professional’ people will still look to you in times of failure.  If you’ve already done all of the planning you’ll be the guy with a calm head solving the problem, and if you’re not that guy – start getting your CV ready.  Despite having heard the name and having read a few of his blog posts over the years I’d never heard Buck speak and he’s great, so if you get the chance to see him you definitely should.

Well, that wraps up the day nicely.  I’ll be posting Saturday’s round-up soon after I’ve written it!

Posted by Ash - 20101002 at 18:00

Categories: DBA, Events, Microsoft SQL Server
