
T-SQL Tuesday #73 – Those Darn Elves

Since I have not blogged in a while, and I saw it was T-SQL Tuesday, I thought I would participate in this FOR THE FIRST TIME EVER! This month SQLBalls asks us if our SQL Servers are on the naughty or nice list. Since I have recently transitioned to a new role, I am still getting familiar with the servers and the environments I am working with. The servers of my past are but a memory, but a fresh one. I cannot even begin to think of all the different “naughty” things that were done on those servers, and how some days it felt like a losing battle. How can you fight a security or code change when you have been overridden by management, only to have that change come back and bite you weeks or months later? The “I told you so” you might feel like popping off will fall on deaf ears and you will just be stuck fixing the problem…until it gets to be too much and you decide it is time for action.

I have done this before. The day had been a long one and I thought I was finally going home when I got dragged into a call on a data issue – something was changed that shouldn’t have been. When and by whom? I didn’t know, and I had no way of finding out. (Days later, when I tried to get a backup from several months earlier to validate the data as it was when deployed, there was no backup – at all – but that is a completely different issue for another time.) I had my suspicions but no proof. It could have been a naughty developer elf logging in with an elevated SQL account they knew the password for. Changing the password? “Out of the question,” they say. It is everywhere.

If only this were the only event that had happened THAT DAY. It wasn’t. This one was production (hence the conference call). The others (yes, more than one) were pre-production: messes that had to be cleaned up because the developer elves thought they knew better than the DBA and that they could do it on their own. This production issue was the last straw. Elves were running amok and had to be reined in, and they weren’t going to like it.

All the elevated permissions in the pre-production servers – gone. I didn’t care if it was a DEV server. Am I the meanest DBA in the world? So say some. Scrooge? Well, if you are into name calling and want to go there, then ok, but I get to call you names too. In this case I did not care – I was fixing things that were only broken because someone abused a privilege. It should also be said that there was some relation in the names of all the tables involved with all the issues that occurred on this day.

If I had to wag my finger at any naughty part of the SQL Server instances, it would have to be at security…and I am partially to blame. It can be difficult to keep up with all the changes that happen across a large environment when it comes to assigning permissions, and if you have more than one DBA, the situation is compounded by the fact that you might not always know what the other is doing and vice versa. They might grant something that you would otherwise veto for cause. You might take care of a permissions issue one way when they would handle it differently.

While I worked on some PowerShell code to pull back users from specific AD groups and incorporate alerts for some of those groups, sadly the bandwidth was not there to fully roll this out. I did, however, create some triggers that would send email alerts when a change was made at the server and database levels, and I made them nameless and encrypted.
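The original trigger code did not survive on this page, so what follows is only a rough sketch of what a server-level alert trigger of this kind might look like, not the code that was actually deployed. The trigger name, the Database Mail profile name, and the recipient address are all placeholders made up for illustration, and the sketch assumes Database Mail is already configured.

```sql
-- Illustrative sketch only; not the original trigger.
-- Placeholders (assumptions): the trigger name, the 'DBA Alerts' mail
-- profile, and the dba-team@example.com recipient.
CREATE TRIGGER [trg_alert_server_security]   -- the post used a whitespace-only name instead
ON ALL SERVER
WITH ENCRYPTION                              -- obfuscates the trigger definition
FOR DDL_SERVER_SECURITY_EVENTS               -- logins, server roles, server-level permissions
AS
BEGIN
    SET NOCOUNT ON;

    -- EVENTDATA() returns XML describing the statement that fired the trigger.
    DECLARE @event XML = EVENTDATA();

    DECLARE @body NVARCHAR(MAX) =
          N'Event: '     + ISNULL(@event.value('(/EVENT_INSTANCE/EventType)[1]', 'NVARCHAR(128)'), N'?')
        + N'; Login: '   + ISNULL(@event.value('(/EVENT_INSTANCE/LoginName)[1]', 'NVARCHAR(128)'), N'?')
        + N'; Command: ' + ISNULL(@event.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'NVARCHAR(MAX)'), N'?');

    -- Send the alert through Database Mail.
    EXEC msdb.dbo.sp_send_dbmail
         @profile_name = N'DBA Alerts',
         @recipients   = N'dba-team@example.com',
         @subject      = N'Server-level security change detected',
         @body         = @body;
END;
```

A database-level version would presumably use ON DATABASE with an event group such as DDL_DATABASE_SECURITY_EVENTS. One caveat worth testing: if the mail call fails inside the trigger (for example, because the caller lacks rights in msdb), the statement that fired the trigger can be rolled back along with it.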

 

 

The lack of a name for each of these is intentional, as is the encryption. The last thing I wanted was someone seeing these and what they were doing, and if they had permissions to do so, disabling or dropping them to avoid having their nefarious behavior tracked. Even better would have been to put additional triggers in place to prevent the dropping of these no matter what, but I decided not to go there.

Note there is nothing there for the name for each of these – this is courtesy of the devious mind of Rob Volk. He might have too much time on his hands but this is pretty darn crafty. What you name these triggers is up to you but you have to MAKE NOTE OF WHAT THE NAMES ARE!!! When I did these they were a combination of a few tabs and spaces – like “space space space tab tab” but with those actual characters. The result looks like this:

 

[Image: triggers_noname]

It should go without saying: use this code at your own risk, and always thoroughly vet and test anything before applying it to a production environment.

If this helps further cement my meanest DBA creds then I guess I am doing it right. Sometimes the elves (developers) can get out of hand and it is up to Santa (the DBA) to make sure they know they are being watched.

 

Have you tried restarting your SQL Server?

Everyone at one point or another has had this experience – you have to call technical support for assistance with your operating system or an application and get posed with the question “Have you tried restarting your computer?”

“Ummm…yeah. I did that. I also power-cycled my modem, searched the internet, hacked the registry and walked counter clockwise around my chair three times. I just thought I would call you to chat.”

Anyways…

Occasionally, the same logic is what is needed to resolve issues with SQL Server – restarting the SQL Server service. While we want to think that being on a fairly up-to-date version of SQL Server means things like this don’t happen, this is still software written by humans and every possible situation cannot be accounted for in testing. Sometimes you might find documentation that this is needed. Other times, the solution to the issue you are trying to solve is not as clear.

A couple of examples of a documented situation would be when updates to the server require a reboot or when a known issue in SQL Server can be immediately resolved by a restart of the service. Undocumented situations can range from changes to server settings that are not taking effect to changes in database mail. I have seen both of these in the past year – a server setting change was affecting replication (resolved by a restart) and database mail changes were not taking effect, causing mail not to be sent and the mail queue to become backed up (another lesson learned here – after the restart of the service all those messages in the queue will be sent; if you don’t want this to happen the queue needs to be cleared out prior to restart).
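For the mail queue lesson, one way the cleanup might be done is with the Database Mail objects in msdb. This is a sketch under the assumption that every unsent message really is unwanted. It is not necessarily how the queue was cleared in the case described above, and the delete is permanent, so look at the queue first.

```sql
-- Review what is still waiting to be sent.
SELECT mailitem_id, recipients, subject, send_request_date
FROM msdb.dbo.sysmail_unsentitems
ORDER BY send_request_date;

-- Permanently remove unsent messages so they are not delivered after the restart.
EXEC msdb.dbo.sysmail_delete_mailitems_sp @sent_status = 'unsent';
```

Messages in a 'retrying' state may need the same treatment; running the procedure a second time with @sent_status = 'retrying' would cover those.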

Unlike rebooting a computer, restarting this service (or the server itself) is typically something that is done only where the circumstances dictate that it is required. Outside of this, restarting SQL Server is typically done as a last resort and is the last thing that the DBA might think to do. Depending on the server and the processes that communicate with the SQL Server instance, the business or other resources may need to be not only notified of the restart, but also told why it is needed in the first place. This is made more difficult by the fact that it may be unclear whether or not the restart will solve the problem at hand.

While it might not be documented that certain changes may require a service restart, that doesn’t mean that they don’t, and those situations can be difficult to identify. In the two cases I described, all reasonable measures had been taken without yielding the desired results. Both cases involved changes to a SQL Server instance. Making the decision to restart in cases like this is not something that should be done flippantly, but where do you draw the line at time and effort spent before pulling the trigger? My experience with these issues will dictate how I handle similar cases going forward, but as for resolving issues that I have not had experience in dealing with yet, there very well may be more painful times…followed by the sound of my head hitting the desk over and over out of frustration that the solution was something so simple.

Starting with something weird

For my first post I thought I would start things out with a bang and detail an issue that I recently encountered:

Location: “qstopsrt.cpp”:404
Expression: fFalse
SPID: 51
Process ID: 2908
Location: qxcntxt.cpp:1221
Expression: cref == 0
SPID: 51
Process ID: 2908
Msg 3624, Level 20, State 1, Line 1
A system assertion check has failed. Check the SQL Server error log for details. Typically, an assertion failure is caused by a software bug or data corruption. To check for database corruption, consider running DBCC CHECKDB. If you agreed to send dumps to Microsoft during setup, a mini dump will be sent to Microsoft. An update might be available from Microsoft in the latest Service Pack or in a QFE from Technical Support.
Msg 0, Level 20, State 0, Line 0
A severe error occurred on the current command. The results, if any, should be discarded.

In case you were wondering, CHECKDB was a waste of time (in this situation). I had never seen something like this before, but who doesn’t like a good challenge?

If you have seen this error, and tried to research it, you know there is not much out there. Most of what I found had to do with issues with the msdb or master databases. Those did not apply to my situation.

Anyways, on to the code:

SELECT TOP 50
[personid] as [personid],
[shoeid] as [shoeid],
[shoe_name] as [shoe_name],
DATEDIFF(day, CAST(purchase_date AS DATE), GETDATE()) as [days_since_purchase]
FROM vw_yabba WITH (NOLOCK)
where [Paid_in_full] = '1'
ORDER BY [days_since_purchase] desc

SELECT TOP 50
[personid] as [personid],
[shoeid] as [shoeid],
[shoe_name] as [shoe_name],
[shoe_type] as [shoe_type],
[shoe_size] as [shoe_size],
DATEDIFF(day, CAST(purchase_date AS DATE), GETDATE()) as [days_since_purchase],
dateadd(day,datediff(day,0,delivery_date),0) as [delivery_date],
CASE WHEN [Paid_in_full] = 1 THEN 'YES' ELSE 'NO' END as [Paid],
[purchase_date] as [purchase_date]
FROM vw_yabba WITH (NOLOCK)
WHERE 0 = 0
AND ((((1=1))))
AND ( ( ( (
([shoe_type] = 'Stiletto')
AND ([Paid_in_full] = '1')
AND ( (DATEDIFF(day, CAST(purchase_date AS DATE), GETDATE()) = '23')
OR (DATEDIFF(day, CAST(purchase_date AS DATE), GETDATE()) = '24')
OR (DATEDIFF(day, CAST(purchase_date AS DATE), GETDATE()) = '25') )
AND ( ([shoe_rating] < '100') OR ([shoe_rating] IS NULL)))
OR ( ([Paid_in_full] = '1')
AND ( (DATEDIFF(day, CAST(purchase_date AS DATE), GETDATE()) = '23')
OR (DATEDIFF(day, CAST(purchase_date AS DATE), GETDATE()) = '24')
OR (DATEDIFF(day, CAST(purchase_date AS DATE), GETDATE()) = '25') )
AND ( ([shoe_rating] < '50') OR ([shoe_rating] IS NULL ) )
AND ( ([shoe_type] = 'mule') OR ([shoe_type] = 'athletic') OR ([shoe_type] = 'flat') ) ) ) ) )
ORDER BY [days_since_purchase] asc



Other useful background info:

  • Both queries produce the same error.
  • 3rd party application code - dynamically generated; cannot be changed.
  • Happened on a production reporting server.
  • View queries two tables with a union; tables are refreshed hourly.
  • Not all data is pulled from the tables - some fields are defined in the view.

Researched the error, but like I said - didn't find much that was of any use. Picked someone else's brain on the issue - he said there was an additional option that could be added to the TOP 50 - WITH TIES. I changed the first query to a TOP 1 WITH TIES and got five records back. Kept working with both queries, trying different things. Eventually found that if I left the Paid_in_Full field out of the WHERE clause the query would complete.

Examined the view - Paid_in_Full was defined as '1' in the view for one table and '0' for the other. Surmised from this that the implied data type could be the cause of the problem.
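To make the "implied data type" idea concrete, here is a hypothetical reconstruction of the view's shape, along with one way to ask SQL Server what type it settled on for each column. The view name vw_yabba_sketch and the two table names are invented for illustration (the real view definition is not shown in this post), and the sketch is not claimed to reproduce the assertion failure.

```sql
-- Hypothetical reconstruction; object names are invented for illustration
-- and the two tables are assumed to exist with personid and shoeid columns.
CREATE VIEW dbo.vw_yabba_sketch
AS
SELECT personid, shoeid, '1' AS Paid_in_full   -- literal: the type is inferred, not declared
FROM dbo.shoes_paid
UNION
SELECT personid, shoeid, '0' AS Paid_in_full
FROM dbo.shoes_unpaid;
GO

-- Check which data type was inferred for each column of the view.
SELECT c.name,
       TYPE_NAME(c.user_type_id) AS data_type,
       c.max_length
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID(N'dbo.vw_yabba_sketch');
```

With a character literal like '1', the column's type comes from the literal itself rather than from any table definition, which is the "implied" part. Note also that the generated queries compare the column both to the string '1' and to the integer 1.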

Solution that resolved the issue - modified the process that populates the tables to include this field (meaning the datatype would now be explicit).

Funny how something so simple could cause an issue so obscure.