SQL DBA Interview Questions with Answers – 1
SQL DBA Interview Questions with Answers
Q. What are the common issues a SQL DBA should deal with as a part of DBA daily job?
- Backup Failure
- Restore Failure
- Log Full Issues
- Blocking Alerts
- Deadlocks Alerts
- TEMPDB full issues
- Disk Full Issues
- SQL Connectivity Issues
- Access issues
- Installation and Upgrade Failures
- SQL Agent Job failures
- Performance Issues
- Resource (Memory/IO/CPU etc.) Utilization Alerts
- High-Availability and Disaster Recovery related issues
Q. “model” system DB is down and we are trying to create a new database. Is it possible to create a new database when model DB is down?
We can’t create a new database when model database is down. SQL Server restart will be unsuccessful when model database is down as TEMPDB creation failed. TEMPDB is created based on model DB configurations, since model DB is down TEMPDB will not be created.
Q. Which operation is faster COMMIT or ROLLBACK? Why?
It’s obviously COMMIT is faster than ROLLBACK. Let me explain with an example: Let’s say we opened a transaction and updated 8000 records:
Commit: It’s completed quickly as the operation is already completed and it just marks those dirty pages as committed and when checkpoint happens all those dirty pages will be written to disk.
Rollback: The operation is already updated 8000 records if we need to rollback then again all these updates has to be rolled back which means there are another 8000 log records will be written to LDF which will take time when compared to commit.
Q. What are the different ways available to insert data from a file into SQL Server database table?
These are the different ways:
- LINKED SERVER
- IMPORT/EXPORT WIZARD
Q. What is the scope of different temp objects?
Local Temp Table: “CREATE TABLE #TempTable”
Local temporary tables are visible only in the current session, and can be shared between nested stored procedure calls
Table Variable: “DECLARE TABLE @TempTable”
The scope of a local variable is the batch, stored procedure, or statement block in which it is declared. They can be passed as parameters between procedures. They are not subject to transactions and will retain all rows following a rollback.
Derived Table: “SELECT * FROM (SELECT * FROM Customers) AS TempTable”
Is visible to the current query only
Global Temp Table: “CREATE TABLE ##TempTable”
This differs from a #temp table in that it is visible to all processes. When the creating process ends, the table is removed (but will wait until any current activity from other processes is done).
CTE: Common Table Expression
;WITH YourBigCTE AS
big query here
SELECT * FROM YourTable1 WHERE ID IN (SELECT ID FROM YourBigCTE)
SELECT * FROM YourTable2 WHERE ID IN (SELECT ID FROM YourBigCTE)
Scope is next immediate select command. Can be used multiple times within the same CTE command, even recursively, and will last for the duration of the CTE command.
Q. What is the maximum limit of SQL Server instances for a standalone computer?
50 instances on a stand-alone server for all SQL Server editions. SQL Server supports 25 instances on a failover cluster.
Q. What is the cluster node limitation?
The number of allowable nodes in the SQL Server cluster depends on your SQL Server version and your Windows Server version. For SQL Server 2008 Standard edition, you can only have two cluster nodes. If you have SQL Server Enterprise, the limit depends on your Windows Server version, 8 cluster nodes for 2003 and 16 nodes for 2008.
Q. Can we install SQL Server using a configure file?
Yes! We can prepare a configuration file. While installing SQL Server the path to the configuration file is specified in the “Ready to Install” page in the configuration file path section. Cancel the setup without actually completing the installation, to generate the INI file.
File Location and Name:
%programfiles%\Microsoft SQL Server\110\Setup Bootstrap\Log\<YYYYMMDD_HHMM>\ConfigurationFile.ini.
Q. How to install a SQL Server using configuration file?
From Command prompt locate the setup.exe file location and can install using config file.
Instead of specifying passwords inside the config file specify them explicitly as below.
Setup.exe /SQLSVCPASSWORD=”************” /AGTSVCPASSWORD=”************” /ASSVCPASSWORD=”************” /ISSVCPASSWORD=”************” /RSSVCPASSWORD=”************” /ConfigurationFile=MyConfigurationFile.INI
Q. What are the top performance counters to be monitor in Performance Monitor?
Processor\%Processor Time: Monitoring CPU consumption allows you to check for a bottleneck on the server (indicated by high sustained usage).
High percentage of Signal Wait: Signal wait is the time a worker spends waiting for CPU time after it has finished waiting on something else (such as a lock, a latch or some other wait). Time spent waiting on the CPU is indicative of a CPU bottleneck. Signal wait can be found by executing DBCC SQLPERF (waitstats) on SQL Server 2000 or by querying sys.dm_os_wait_stats on SQL Server 2005.
Physical Disk\Avg. Disk Queue Length: Check for disk bottlenecks: if the value exceeds 2 then it is likely that a disk bottleneck exists.
MSSQL$Instance: Buffer Manager\Page Life Expectancy: Page Life Expectancy is the number of seconds a page stays in the buffer cache. A low number indicates that pages are being evicted without spending much time in the cache, which reduces the effectiveness of the cache.
MSSQL$Instance: Plan Cache\Cache Hit Ratio: A low Plan Cache hit ratio means that plans are not being reused.
MSSQL$Instance:General Statistics\Processes Blocked: Long blocks indicate contention for resources.
Q. Task manager is not showing the correct memory usage by SQL Server. How to identify the exact memory usage from SQL Server?
To know the exact memory usage relay on column “physical_memory_in_use_kb” from DMV “sys.dm_os_process_memory”.
Using performance counters also we can find the usage.
Performance object: Process
Counter: Private Bytes
Performance object: Process
Counter: Working Set
The Private Bytes counter measures the memory that is currently committed. The Working Set counter measures the physical memory that is currently occupied by the process.
For 64-bit sql servers we can also check the current memory usage using the below performance counter.
Performance object: SQL Server:Memory Manager
Counter: Total Server Memory (KB)
Q. What is the option ”Lock Pages in Memory”?
Lock Pages in Memory is a setting that can be set on 64-bit operating systems that essentially tell Windows not to swap out SQL Server memory to disk. By default, this setting is turned off on 64-bit systems, but depends on various conditions this option needs to be turned on.
We must be very careful in dealing with this option. One can enable this after a detailed analysis of current environment.
Following issues may rise when “Lock Pages in Memory” is not turned on:
- SQL Server performance suddenly decreases.
- Application that connects to SQL Server may encounter timeouts.
- The hardware running SQL Server may not respond for a short time periods.
Q. How do you know how much memory has been allocated to sql server using AWE?
We can use DBCC MEMORYSTSTUS command to know the memory allocation information. But it’s trick to understand the results.
We can use a DMV called “sys.DM_OS_Memory_Clerks”. Sample query to calculate total AWE memory allocated is “SELECT SUM(awe_allocated_kb) FROM sys.dm_os_memory_clerks”
From 2008 onwards we can get all memory related information using DMV “sys.dm_os_process_memory”.
Q. How to apply service pack on Active / Passive cluster on 2008 and 2012?
1. Freeze the service groups on Node A (active node).
2. Confirm all SQL services are stopped on Node B.
3. Upgrade the SQL Server 2008 instance on Node B.
4. Reboot node B.
5. Unfreeze the service group on node A.
6. Fail over the service group to Node B.
7. After the service group comes online, freeze the service group on Node B.
8. Confirm all SQL services are stopped on Node A.
9. Upgrade the SQL Server 2008 instance on Node A.
10. Reboot Node A.
11. Unfreeze the service group on node B.
12. Fail back the service group to Node A.
Q. How to apply a SP on SQL Server 2005 Active / Passive cluster?
1. Login to the Console on the target node
a. RDP to the console is ok, but a standard RDP connection is not recommended.
2. Copy the Service Pack to a local drive on the target node
3. Move all instances to the target node
a. You can only install on the Active Node.
4. Move the Cluster Resource to the target node
5. Move the MSDTC Resource to the target node
6. Verify all users are logged out from all other nodes (RDP and Console sessions)
7. Start the Service Pack install
a. Use a domain account with admin rights to all servers.
b. Ignore locked files
8. Reboot current server
a. You should not need to perform the install on any other nodes, nor reboot them. The service pack will update the passive nodes first.
Q. You find SP is not applied on all the nodes across the cluster. How to apply SP only on required nodes?
If you find that the product level is not consistent across all the nodes, you will need to fool the 2005 patch installer into only patching the nodes that need updating. To do so, you will have to perform the following steps:
- Fail Instance, Cluster, and MSDTC groups to an unpatched node
- Remove any successfully patched nodes from failover candidates of the SQL Server Service of the instance group (do this using Cluster Admin tool)
- Run the patch
- After the patch installs successfully, add the Nodes removed in Step 2 back to the SQL Server Service of the Instance group
Why do you need to do this? Well when the patch installer determines that not all nodes in the cluster are at the same patch level, a passive node operation will fail and will prevent you from moving forward with any further patching.
Q. How to change the sql server service account in a cluster environment?
Method 1: (No failover required)
1. Freeze the service group on active node from cluster administrator and then restart the service.
1. Offline the SQL resources
2. Update the service account at SSCM and restart the service as needed
3. Add the SQL resources back to online
Note: Don’t forget to update service account at the remaining nodes on the cluster.
1. Node 2 (inactive node) change the SQL startup account in SQL Studio or SCM
2. Fail over the SQL service group from node 1 to node 2.
3. Node 1 (now the inactive node) change the SQL startup account in SQL Studio or SCM
Q. How to apply service pack on Active / Active cluster Nodes?
1. Make a note of all node names (and/or IP addresses), SQL Server virtual names along with preferred nodes. If there are more than three nodes you may need to also take note of possible owners for each SQL resource group. For my example assume that I have a cluster with node1 and node2, SQL1 normally lives on node1 and SQL2 normally lives on node2.
2. To start with a clean slate and ensure any previous updates are completed both nodes should be restarted if possible. Choose the physical node that you you want to patch second and restart that node (in my example node2).
3. Restart the node you want to patch first (node1). This will mean that both active SQL instances are now running on node2. Some restarts will be essential, but you could avoid the first two restarts if you need to keep downtime to a minimum and just fail SQL1 over to node2. The main point here is to always patch a passive node.
4. In cluster administrator remove node1 from the possible owners lists of SQL1 and SQL2. This means that neither SQL instance can fail over to node1 while it is being patched.
5. Run the service pack executable on node1.
6. Restart node1.
7. Add node1 back into the possible owners lists of SQL1 and SQL2 and fail both instances over to node1.
8. Repeat steps 4 – 6 on node2.
9. Add node2 back into the possible owners lists of SQL1 and SQL2 and fail both instances over to node2. Check that the build level is correct and review the SQL Server error logs.
10. Fail SQL1 over to node1. Check build levels and SQL Server error logs
Q. What are the main events and columns helpful in troubleshooting performance issues using profiler?
Event Group: Performance
Event: ShowPlan_ALL (BinaryData column must be selected)
Event Group: T-SQL
Event Group: Stored Procedures
Event Group: Locks
Event: Lock: Deadlock Graph
Event: Lock: Lock Deadlock Chain (Series of events that leaads to a deadlock)
Event Group: Sessions
Event: Existing Connection
Event Group: Security Audit
Event: Audit Login
Event: Audit Log Out
Below are the most common columns that help us in understanding the trace file to troubleshoot the problems.
All these columns need not be available for all of the events, but depends on the event select we have to choose the appropriate columns.
Q. What are the agents in replication?
Snapshot Agent: Copy Schema+Data to snapshot folder on distributer. Used in all types of replication.
Log reader Agent: Sends transactions from Publisher to Distributor. Used in transactional replication
Distribution Agent: Applies Snapshots / Transactions to all subscribers’ runs at distributer in PUSH and Runs at Subscriber in PULL. Used in transactional and transactional with updatable subscriptions.
Queue reader Agent: Runs at distributer send back transactions from subscriber to publisher. Used in Transactional With updatable subscriptions.
Merge Agent: Applies initial snapshot to subscribers, from the next time synchronize by resolving
Q. Can we configure log shipping in replicated database?
Replication does not continue after a log shipping failover. If a failover occurs, replication agents do not connect to the secondary, so transactions are not replicated to Subscribers. If a failback to the primary occurs, replication resumes. All transactions that log shipping copies from the secondary back to the primary are replicated to Subscribers.
For transactional replication, the behavior of log shipping depends on the sync with backup option. This option can be set on the publication database and distribution database; in log shipping for the Publisher, only the setting on the publication database is relevant.
Setting this option on the publication database ensures that transactions are not delivered to the distribution database until they are backed up at the publication database. The last publication database backup can then be restored at the secondary server without any possibility of the distribution database having transactions that the restored publication database does not have. This option guarantees that if the Publisher fails over to a secondary server, consistency is maintained between the Publisher, Distributor, and Subscribers. Latency and throughput are affected because transactions cannot be delivered to the distribution database until they have been backed up at the Publisher.
Q. What are the best RAID levels to use with SQL Server?
Before choosing the RAID (Redundant Array of Independent Disks) we should have a look into usage of SQL Server files.
As a basic thumb rule “Data Files” need random access, “Log files” need sequential access and “TempDB” must be on a fastest drive and must be separated from data and log files.
We have to consider the below factors while choosing the RAID level:
As an Admin we have to consider all of these parameters in choosing the proper RAID level. Obviously the choice is always between RAID-5 and RAID-10
Q. How to monitor latency in replication?
There are three methods.
- Replication monitor
- Replication commands
- Tracer Tokens
1. Replication Monitor: In replication monitor from the list of all subscriptions just double click on the desired subscription. There we find three tabs.
- Publisher to Distributor History
- Distributor to Subscriber History
- Undistributed commands
2. Replication Commands:
Publisher.SP_ReplTran: Checks the pending transactions at p
Distributor.MSReplCommands and MSReplTransactions: Gives the transactions and commands details. Actual T_SQL data is in binary format. From the entry time we can estimate the latency.
Distributor.SP_BrowseReplCmds: It shows the eaxct_seqno along with the corresponding T-SQL command
sp_replmonitorsubscriptionpendingcmds: It shows the total number of pending commands to be applied at subscriber along with the estimated time.
3. Tracer Tokens:
Available from Replication Monitor or via TSQL statements, Tracer Tokens are special timestamp transactions written to the Publisher’s Transaction Log and picked up by the Log Reader. They are then read by the Distribution Agent and written to the Subscriber. Timestamps for each step are recorded in tracking tables in the Distribution Database and can be displayed in Replication Monitor or via TSQL statements.
When Log Reader picks up Token it records time in MStracer_tokens table in the Distribution database. The Distribution Agent then picks up the Token and records Subscriber(s) write time in the MStracer_history tables also in the Distribution database.
Below is the T-SQL code to use Tracer tokens to troubleshoot the latency issues.
–A SQL Agent JOB to insert a new Tracer Token in the publication database.
EXEC sys.sp_posttracertoken @publication = <PublicationName>
–Token Tracking Tables
SELECT Top 20 * FROM MStracer_tokens Order by tracer_id desc
SELECT Top 20 * FROM MStracer_history Order by parent_tracer_id desc
Q. Can we perform a tail log backup if .mdf file is corrupted?
Yes we can perform a tail log as long as the ldf if not corrupted and no bulk logged changes.
A typical tail log backup is having two options, 1. WITH NORECOVERY 2.Continue After Error.
1. WITH NORECOVERY: To make sure no transactions happens after the tal log backup
2. CONTINUE AFTER ERROR: Just to make sure log backup happens even though some meta data pages corrupted.
Q. Let’s say we have a situation. We are restoring a database from a full backup. The restore operation ran for 2 hours and failed with an error 9002 (Insufficient logspace). And the database went to suspect mode. How do you troubleshoot this issue?
In that case we can actually add a new log file on other drive and rerun the restore operation using the system stored procedure “sp_add_log_file_recover_suspect_db”. Parameters are the same as while creating a new log file.
Q. Let’s say we have a situation. We are restoring a database from a full backup. The restores operation runs for 2 hours and failed with an error 1105 (Insufficient space on the file group). And the database went to suspect mode. How do you troubleshoot this issue?
In that case we can actually add a new data file on another drive and rerun the restore operation using the system stored procedure “sp_add_data_file_recover_suspect_db”. Parameters are the same as while creating a new data file.
Q. Can you describe factors that causes the logfile grow?
- CHECKPOINT has not occurred since last log truncation
- No log backup happens since last full backup when database is in full recovery
- An active BACKUP or RESTORE operation is running from long back
- Long running active transactions
- Database mirroring is paused or mode is in high performance
- In replication publisher transactions are not yet delivered to distributer
- Huge number of database snapshots is being created
Q. How do you troubleshoot a Full transaction log issue?
Columns log_reuse_wait and log_reuse_wait_desc of the sys.databases catalog view describes what is the actual problem that causes log full / delay truncation.
- Backing up the log.
- Freeing disk space so that the log can automatically grow.
- Moving the log file to a disk drive with sufficient space.
- Increasing the size of a log file.
- Adding a log file on a different disk.
- Completing or killing a long-running transaction.
Q. Does “Truncate” works in transactional replication?
No! As per MSDN blogs information we can’t use TRUNCATE on published database against the published article instead we have to use “DELETE” without where clause.
Q. Consider a situation where publisher database log file has been increasing and there there is just few MB available on disk. As an experienced professional how do you react to this situation? Remember no disk space available and also we can’t create a new log file on other drive
Essentially we have to identify the bottleneck which is filling the log file.
As a quick resolution check all possible solutions as below:
- Resolve if there are any errors in log reader agent / distribution agent
- Fix if there are any connectivity issues either between publisher – distributor or distributor
- Fix if there are any issues with I/O at any level
- Check if there is any huge number of transactions pending from publisher
- Check if there are any large number of VLF’s (USE DBCC Loginfo)which slows the logreader agent work.
- Check all database statistics are up-to-date at distributer. Usually we do siwtch off this “Auto Update Stats” by default.
- To find and resolve these issues we can use “Replication Monitor”, “DBCC Commands”, “SQL Profiler”, “System Tables / SP / Function”.
If incase we can’t resolve just by providing a simple solution we have to shrink the transaction log file. Below are two methods.
To shrink the transaction log file:
1. Backup the log — So transactions in vlf’s are marked as inactive
2. Shrink the logfile using DBCC SHRINKFILE – Inactive VLF’s would be removed
3. If you find no difference in size repeat the above steps 1 and 2
To truncate the transaction log file:
In any case we are not able to provide the solution against the increasing logfile the final solution is disable the replication, truncate the log and reinitialize the subscribers.
1. Disable replication jobs
2. Execute SP_ReplDone procedure. It disable the replication and mark as “Replicate done” for all pending transactions at publisher.
3. Backup the transaction log “WITH TRUNCATE” option.
4. Shrink the log file using “DBCC SHRINKFILE”
5. Flues the article cache using “sp_replflush”.
6. Go to distributor database and truncate the table MSRepl_Commands
7. Connect to replication monitor and reinitialize all subscriptions by generating a new snapshot.
8. Enable all replication related jobs.
Q. Can we add an article to the existing publication without generating a snapshot with all articles?
Yes! We can do that. Follow the below steps to publish a new article to the existing publication.
There are two parameters that we need to change to “False”. 1. Immediate Sync and 2. Allow_Ananymous.
Both the fields were set to ON by default. If the Immediate_sync is enabled every time you add a new article it will cause the entire snapshot to be applied and not the one for the particular article alone.
1. Change the values to “True” for publication properties “Immediate_Sync” and “Allow_Anonymous” using SP_CHANGEPUBLICATION
2. Add a new article to the publication using SP_AddArticle. While executing this procedure along with the required parameters also specify the parameter “@force_invalidate_snapshot=1”.
3. Add the subscriptions to the publication for the single table/article uisng “SP_ADDSUBSCRIPTION”. While executing this proc specify the parameter “@Reserved = Internal”. Generate a new snapshot which only includes newly added article.
Q. How MAXDOP impacts SQL Server?
The Microsoft SQL Server max degree of parallelism (MAXDOP) configuration option controls the number of processors that are used for the execution of a query in a parallel plan. This option determines the computing and threads resources that are used for the query plan operators that perform the work in parallel.
For servers that use more than eight processors, use the following configuration:
For servers that use eight or fewer processors, use the following configuration:
MAXDOP=0 to N
Q. How distributed transactions works in SQL Server?
Distributed transactions are the transactions that worked across the databases, instances in the given session. Snapshot isolation level does not support distributed transactions.
We can explicitly start a distributed transaction using “BEGIN DISTRIBUTED TRANSACTION <TranName>”
For example, if BEGIN DISTRIBUTED TRANSACTION is issued on ServerA, the session calls a stored procedure on ServerB and another stored procedure on ServerC. The stored procedure on ServerC executes a distributed query against ServerD, and then all four computers are involved in the distributed transaction. The instance of the Database Engine on ServerA is the originating controlling instance for the transaction.
When a distributed query is executed in a local transaction, the transaction is automatically promoted to a distributed transaction if the target OLE DB data source supports ITransactionLocal. If the target OLE DB data source does not support ITransactionLocal, only read-only operations are allowed in the distributed query.
In order to work with these transactions, make sure below settings are done.
1. MSDTC must be running on all supported instances
2. Choose the option “No authentigation required” from MSDTC properties
3. Turn on random options at linked server properties like “RPC”, “RPC Out”, “Data Access” etc.
Q. Can you give some examples for One to One, One to Many and Many to Many relationships?
One to One: Citizen – UID
A citizen can have only one UID – A UID can represent only one citizen
One to Many: Customer – Products
A customer can sale number of products – A product can be brought by only one customer
Many to Many: Book – Author
A book can be written by more than one author – An author can write more than one book
Q. What are the phases of sql server database restore process?
1. Copy Data: Copies all data,log and index pages from backup file to database mdf, ndf and ldf files
2. REDO: Rollfoward all committed transactions to database and if it finds any uncommitted transactions it goes to the final phase UNDO.
3. UNDO: Rollback any uncommitted transactions and make database available to users.
Q. I wanted to know what are the maximum worker threads setting and active worker thread count on sql server. Can you tell me how to capture this info? What’s the default value for max thread count?
We can check the current settings and thread allocation using the below queries.
select max_workers_count from sys.dm_os_sys_info
select count(*) from sys.dm_os_threads
Default value is 255.
Increasing the number of worker threads may actually decrease the performance because too many threads causes context switching which could take so much of the resources that the OS starts to degrade in overall performance.
Q. Can you explain sql server transaction log architecture?
We need to spend some time on this as every SQL DBA must aware of this concept.
Q. See I have an environment, Sunday night full backup, everyday night diff backup and every 45 min a transactional backup. Disaster happened at 2:30 PM on Saturday. You suddenly found that the last Sunday backup has been corrupted. What’s your recovery plan?
When you find that the last full backup is corrupted or otherwise unrestorable, making all differentials after that point useless. You then need to go back a further week to the previous full backup (taken 13 days ago), and restore that, plus the differential from 8 days ago, and the subsequent 8 days of transaction logs (assuming none of those ended up corrupted!).
If you’re taking daily full backups, a corrupted full backup only introduce an additional 24 hours of logs to restore.
Alternatively, a log shipped copy of the database could save your bacon (you have a warm standby, and you know the log backups are definitely good).
Q. Full backup size is 300 GB, usually my diff backup size varies between 300 MB and 5 GB, one day unfortunately diff backup size was increased to 250 GB? What might be the reason any idea?
Are you the kind of DBA who rebuilds all indexes nightly? Your differential backups can easily be nearly as large as your full backup. That means you’re taking up nearly twice the space just to store the backups, and even worse, you’re talking about twice the time to restore the database.
To avoid these issues with diff backups , ideally schedule the index maintenance to happen right before the full backup.
Q. What is .TUF file? What is the significance of the same? Any implications if the file is deleted?
.TUF file is the Transaction Undo File, which is created when performing log shipping to a server in Standby mode.
When the database is in Standby mode the database recovery is done when the log is restored; and this mode also creates a file on destination server with .TUF extension which is the transaction undo file.
This file contains information on all the modifications performed at the time backup is taken.
The file plays a important role in Standby mode… the reason being very obvious while restoring the log backup all uncommited transactions are recorded to the undo file with only commited transactions written to disk which enables the users to read the database. So when we restore next transaction log backup; SQL server will fetch all the uncommited transactions from undo file and check with the new transaction log backup whether commited or not.
If found to be commited the transactions will be written to disk else it will be stored in undo file until it gets commited or rolledback.
If .tuf file is got deleted there is no way to repair logshipping except reconfiguring it from scratch.