The post Latest SQL DBA Interview Questions and Answers takes you through the latest SQL DBA interview experiences shared by blog followers. You can also share your interview experience and we can provide the answers:
Q. Can we perform a log backup with COPY ONLY option?
Yes, of course we can perform COPY_ONLY log backups. A copy-only log backup does not truncate the log and does not break the log backup chain.
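A minimal sketch of a copy-only log backup; the database name and backup path here are placeholders:

```sql
-- Copy-only log backup: does not truncate the log or affect the log chain
BACKUP LOG [SalesDB]                          -- hypothetical database name
TO DISK = N'D:\Backup\SalesDB_log_copy.trn'   -- hypothetical path
WITH COPY_ONLY, INIT;
```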
Q. What are the marked transactions in SQL Server?
When we are dealing with related databases, which means there is a dependency between databases, we need a recovery strategy that brings all databases back to the same consistent point. With a normal point-in-time restore we may lose the most recently committed transactions, or end up with databases that are each consistent individually but inconsistent with one another. Marking transactions in the log is beneficial when you are dealing with multiple databases and want to restore all of them to the same logical point in time.
This is used in rare cases and is suitable for systems like TFS: when a disaster occurs we need the data in all databases restored to the same consistent state. If you don't mark the transaction logs and restore using normal point-in-time recovery, you might get the data back but with inconsistency between the databases. For example, when a new branch is created in TFS for new development work and somebody checks in and modifies code, these activities are recorded across multiple databases, so when we restore we should get back exactly the same consistent point in every database.
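A minimal sketch of marking a transaction and restoring to that mark; the database names, table names and backup files are hypothetical:

```sql
-- Mark the same transaction across the related databases
BEGIN TRANSACTION NightlyDeploy WITH MARK 'NightlyDeploy';
    UPDATE DB1.dbo.Config SET DeployVersion = DeployVersion + 1;  -- hypothetical tables
    UPDATE DB2.dbo.Config SET DeployVersion = DeployVersion + 1;
COMMIT TRANSACTION NightlyDeploy;

-- Later, restore every database to the same mark
RESTORE DATABASE DB1 FROM DISK = N'D:\Backup\DB1_full.bak' WITH NORECOVERY;
RESTORE LOG DB1 FROM DISK = N'D:\Backup\DB1_log.trn'
    WITH STOPATMARK = 'NightlyDeploy', RECOVERY;
-- Repeat the same RESTORE ... WITH STOPATMARK for DB2
```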
Q. Why is a Primary Key mandatory for Transactional Replication?
Yes, this is a great question. I tried to find the answer in the SQL Server documentation but didn't get a proper explanation; Brent answered it this way:
"When we perform an operation (UPDATE / DELETE) on the publisher, the same operation should be applied at all subscribers by matching / looking up a unique column between the source (Publisher) and the destination (Subscriber). That's why a Primary Key is mandatory to configure Transactional Replication."
Q. Then how merge replication works without primary key?
We can configure one-way Merge replication and include articles without a Primary Key. Merge replication uses a globally unique identifier (GUID) column to uniquely identify each row during the replication process. Merge replication automatically adds a uniqueidentifier (GUID) column in case the published table does not have one, but this might impact replication performance as a side effect.
Q. What are the best practices to perform a backup for a VLDB (2TB) in SQL Server?
We can't give the correct answer unless we know the exact business requirement, but these are the generic rules:
- Perform a full backup at a low-traffic time (e.g. Sunday late night)
- Plan for a differential backup on every weekday and a full backup at the weekend
- Choose a dedicated drive for backup files instead of backing up to a disk that contains DATA, LOG or TempDB files.
- Use backup compression; it costs some CPU but saves a lot of space and time
- Perform a split backup to multiple drives. Let's say a full backup takes 4 hours; a split backup with 4 files can finish in about 1 hour. But it puts pressure on I/O, which is why we should choose a low-traffic time.
- When we perform a backup of a VLDB, make sure no other backup jobs are running in parallel.
- Do not back up to a network / remote drive; instead perform the backup to a local drive and then schedule a job to copy it.
- Choose a suitable backup disk type that supports the write speeds required for bulk volumes
- If a lot of your data / tables are unchanging, consider hosting those tables in a separate database with the simple recovery model, or at least in a separate filegroup in the same database.
- Try third-party tools (LiteSpeed, Redgate etc.) which may speed up the backup process
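A sketch of a compressed full backup split across multiple drives, combining two of the tips above; the database name and file paths are placeholders:

```sql
-- Split the full backup into 4 files on separate drives, with compression
BACKUP DATABASE [VLDB]                     -- hypothetical database name
TO DISK = N'E:\Backup\VLDB_1.bak',
   DISK = N'F:\Backup\VLDB_2.bak',
   DISK = N'G:\Backup\VLDB_3.bak',
   DISK = N'H:\Backup\VLDB_4.bak'
WITH COMPRESSION, CHECKSUM, STATS = 10;
```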
Q. How do you find the long running queries in SQL Server?
There are multiple ways to find the long running queries:
Use Queries: Prepare a list of useful queries and keep the script ready; it can serve as a quick tool when you find something suspicious on your server / database
Activity Monitor: Quick way to identify the problematic queries
Server Standard Reports: We can see the top costliest queries using the server native reports
Database Standard Reports: We can find the database level long running queries
Profiler: Run a profiler trace and find the queries with bad performance
Extended Events: Lightweight tool to get the trace
Server Side Trace: Using T-SQL commands we can run server side trace and capture the trace for a certain time period and we can analyse it to find the long running queries
DMV: We can use DMVs that expose the plan cache to identify the long running queries, e.g. sys.dm_exec_query_stats, sys.dm_exec_query_plan etc.
DBCC: Sometimes DBCC commands can help us identify long running transactions, e.g. DBCC OPENTRAN
Brent's sample query to find the top 50 long running queries (execution time > 5 minutes). Note that the worker / elapsed time columns are reported in microseconds, so 5 minutes is 300,000,000:
SELECT TOP 50 *
FROM (SELECT TOP 50 *
      FROM sys.dm_exec_query_stats
      ORDER BY total_worker_time DESC) AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qs.max_worker_time > 300000000   -- 5 minutes in microseconds
   OR qs.max_elapsed_time > 300000000
Q. Application is running slow and the support team involved you to check the bottleneck. How do you handle that? Can you explain the standard procedure to resolve a slow running application?
Usually this question is asked to test how you have experienced the situation. These are the generic steps we follow, because in enterprise environments with hundreds of applications running, when some app is running slow we may not immediately be able to identify the exact object / query / procedure / bottleneck causing the issue. Therefore we should follow a standard procedure to identify the bottleneck:
- Before concluding the problem is from SQL Server first we should make sure that the bottleneck is from Database side.
- That is, we should make sure the problem is not coming from the other parts:
- Web Service
- Network / Firewall
- Middleware / Data Layers
- Once the app team identifies that the application is running slow, they will reach the technology desk; then we open an incident and involve all teams.
- The first step is usually a quick (<= 5 min) check that all services are up and running (IIS / Tomcat, network bandwidth, firewall, database servers, middleware services etc.)
- The DBA team usually takes the lead and drives the incident call
- From the DB perspective we have to make sure the below are running fine:
- Database Servers are up and running
- SQL Service is responding and all DB’s are online
- CPU Usage
- Memory Usage
- Disk I/O Usage
- Performance counters
- As mentioned, this should be a quick test that takes no more than 5 minutes. We can use a predefined SQL script: run a health-check report and check all parameters in one go.
- If all of that is fine, then we can quickly check through the database for long running queries and transactions:
- Check if any Deadlocks / Blocking – Use Sysprocesses
- Check long running transactions – DBCC opentran
- Check top 10 long running queries / procedures – Use standard database reports or take DMV help
- If you find anything suspicious, check with the app / dev team to confirm whether that query / proc could be causing the page / module in the app to become slow
- If you still can't find anything suspicious, we may need to run a trace through Extended Events or a Profiler / server-side trace, filtered by the database name and application login.
- Once you find the block then it will be easy to solve the problem.
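The quick database-side checks above can be sketched as a short triage script; this is just one illustrative way to run them:

```sql
-- Quick triage: blocking, open transactions, top CPU queries

-- Sessions currently blocked (blocked column holds the blocking spid)
SELECT spid, blocked, status, loginame, hostname, cmd, waittime
FROM sys.sysprocesses
WHERE blocked <> 0;

-- Oldest active transaction in the current database
DBCC OPENTRAN;

-- Top 10 CPU consumers from the plan cache
SELECT TOP 10 qs.total_worker_time, qs.execution_count, st.text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_worker_time DESC;
```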
Once you identify the query / stored procedure that is causing the problem, you can fine-tune it using these tips:
- Check if there is any blocking or deadlocks during the execution
- Below are the key areas we usually focus on in the Execution Plan:
- Bookmark/RID/KEY Lookup
- Table Scan/Index Scan
- Data/Line Density
- HASH/Nested Loop/Merge Joins
- Tables/Index statistics
- Warning Signs
- Logical Reads
- Number of Executions
- Query Statistics – Logical and Physical Reads
- Cardinality Estimates, Histogram
- Check if there are any temp objects in use that can be replaced by a better option. For example, a "Table Variable" can be replaced with a "Temp Table" when dealing with huge data sets.
- Check how the data processing is happening. Apply all required data filters before joining with other data sets.
- Check if any user defined functions are used in the SELECT or WHERE clause and think of alternatives if possible.
- Check if any query hints are used and see if those are really required.
- Check if any proper SET options can help
- Check if bad parameter sniffing is causing the issue; if so, use local variables or the OPTIMIZE FOR UNKNOWN hint
- Try to avoid dynamic SQL; instead split the stored procedure into multiple procedures
- Check if there is any implicit data conversion happening in the query and fix it
- Check if any cross-database queries fetch data from another database while joining tables from the current database. If so, we can split the processing logic: fetch only the filtered data from the other database into a temp object and then apply the join.
- Quickly check the index usage:
- See if the query is using the proper indexes
- Check if we can avoid a sort by using a clustered index column
- Check the column order, as it might impact the index usage
- Check if statistics are up to date
- Based on the frequency of query usage we may consider creating a covering index
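To illustrate the parameter sniffing tip above, a minimal sketch; the procedure, table and column names are hypothetical:

```sql
-- Option 1: copy the parameter into a local variable so the optimizer
-- estimates from average statistics instead of the sniffed value
CREATE PROCEDURE dbo.GetOrdersByRegion @Region INT
AS
BEGIN
    DECLARE @LocalRegion INT = @Region;
    SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE RegionID = @LocalRegion;
END
GO

-- Option 2: alternative procedure body that tells the optimizer
-- explicitly to ignore the sniffed parameter value
-- SELECT OrderID, OrderDate
-- FROM dbo.Orders
-- WHERE RegionID = @Region
-- OPTION (OPTIMIZE FOR UNKNOWN);
```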
Q. What kind of an algorithm we can specify while creating an End Point while configuring Database Mirroring in SQL Server?
RC4 was the default algorithm in older versions, but we should be careful as RC4 is marked as a deprecated feature and Microsoft suggests using the AES algorithm.
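A sketch of creating a mirroring endpoint with AES encryption; the endpoint name and port number are placeholders:

```sql
-- Mirroring endpoint using AES instead of the deprecated RC4
CREATE ENDPOINT Mirroring_EP               -- hypothetical endpoint name
STATE = STARTED
AS TCP (LISTENER_PORT = 5022)              -- hypothetical port
FOR DATABASE_MIRRORING (
    ROLE = PARTNER,
    ENCRYPTION = REQUIRED ALGORITHM AES
);
```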
Q. We are getting an Excel sheet from a customer on a daily basis, and we need to update data in a SQL Server database table based on the Excel sheet. This is a simple requirement: we just need to update data from Excel to the table, and no extra transformations are required. Can you suggest your thoughts on it?
Use OPENDATASOURCE / OPENROWSET:
We can simply insert data from Excel to SQL Server table and Update data:
1. Create a staging table
2. First truncate the staging table
3. Insert data from Excel to Staging
4. Update the base / main table from the staging table using an UPDATE statement comparing the key columns
Alternatively, create a simple ETL package that accomplishes the same list of steps:
1. Control Flow: Execute SQL Statement – Truncate / Delete the staging table
2. Data Flow: Source Excel File, Destination: SQL Server Table
3. Control Flow: Execute SQL Statement: Update the main / base table by joining the staging table and comparing the key columns
If the data volume is going to be huge, the best approach is an SSIS ETL package. If it's a small amount of data and works fine, simply put the T-SQL code in a procedure and schedule a job to execute the proc.
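A sketch of the T-SQL approach; the file path, sheet name and table names are hypothetical, and OPENROWSET assumes the 'Ad Hoc Distributed Queries' option is enabled and the ACE OLE DB provider is installed:

```sql
-- Steps 1-3: refresh the staging table from the Excel sheet
TRUNCATE TABLE dbo.CustomerStaging;        -- hypothetical staging table

INSERT INTO dbo.CustomerStaging (CustomerID, Amount)
SELECT CustomerID, Amount
FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
                'Excel 12.0;Database=D:\Feeds\Customer.xlsx;HDR=YES',
                'SELECT * FROM [Sheet1$]');

-- Step 4: update the base table by comparing the key columns
UPDATE b
SET    b.Amount = s.Amount
FROM   dbo.CustomerBase AS b               -- hypothetical base table
JOIN   dbo.CustomerStaging AS s
       ON b.CustomerID = s.CustomerID;
```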
Q. I am trying to understand how temp table works in SQL Server. I have created a simple stored procedure that creates a temp table, inserts and retrieves data from local / global temp tables as below:
-- Local Temp Test
CREATE PROCEDURE LTemp_Test
AS
BEGIN
    CREATE TABLE #LTemp (Id INT);
    INSERT INTO #LTemp VALUES (10);
    SELECT * FROM #LTemp;
END
GO
-- Global Temp Test
CREATE PROCEDURE GTemp_Test
AS
BEGIN
    CREATE TABLE ##GTemp (Id INT);
    INSERT INTO ##GTemp VALUES (10);
    SELECT * FROM ##GTemp;
END
GO
Now my question is: when I execute these procedures, LTemp_Test executes successfully every time, but GTemp_Test executes fine the first time and fails on the second execution with the error "Msg 2714, Level 16, State 6, Procedure GTemp_Test, Line 5 There is already an object named '##GTemp' in the database.".
Can you explain me why?
There is nothing wrong in that and this is expected behaviour of Global and Local Temporary tables. First let me explain about the scope of a Temporary table:
Local Temp Table:
- A local temporary table is accessible in the current session only, which means a local temp table created in session 1 can't be used / referenced in any other session / connection.
- Local temp tables are automatically dropped when the current session is closed. For example, a local temporary table created in a stored procedure is dropped automatically when the stored procedure completes.
Global Temp Table:
- A global temporary table is visible to all sessions, which means a global temp table created in session 1 is accessible in other sessions as well.
- Global temp tables are automatically dropped when the session that created the table ends and the last active Transact-SQL statement referencing the table in other sessions completes. For example, say a global temporary table ##GTT is created in session 1 and the same table is being used in sessions 5 and 8. The global temp table is automatically dropped when you close session 1 and no statements in sessions 5 and 8 are still referencing it.
Now coming to your question:
- LTemp_Test executes successfully every time because the local temp table is automatically dropped when the stored procedure execution completes. The next execution creates a new local temp table.
- GTemp_Test executes successfully the first time but fails on the second execution because of its scope. As we learnt, a global temporary table is available to all sessions / connections; therefore when you execute the procedure a second time it fails because the global table created by the previous execution still exists and is visible to all sessions.
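One common way to make a procedure like GTemp_Test re-runnable is to drop the global table if a previous execution left it behind; a sketch:

```sql
-- Drop the global temp table if it already exists before recreating it
IF OBJECT_ID('tempdb..##GTemp') IS NOT NULL
    DROP TABLE ##GTemp;

CREATE TABLE ##GTemp (Id INT);
INSERT INTO ##GTemp VALUES (10);
SELECT * FROM ##GTemp;
```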
Q. I have a production instance where we hosted nearly 100 databases for multiple applications. Our requirement is to hide unnecessary databases from users that means users should not be able to see databases where they don’t have access. Can we do that?
Unfortunately there is no direct way / feature to hide databases from users in SQL Server. But it's possible if the user can be the DB_OWNER of all required databases; here is the way:
For example, we have a SQL Server instance with 65 databases. We have created a login called "SalesAD" and mapped it to 2 databases, "SalesDB" and "SalesDB_Archive", as DBO. The requirement is to hide the remaining 63 databases from this user.
- Deny View Any Database permission for the login “SalesAD”
DENY VIEW ANY DATABASE TO [SalesAD];
- Remove / drop the user "SalesAD" from those 2 databases
USE [SalesDB]; DROP USER [SalesAD];
USE [SalesDB_Archive]; DROP USER [SalesAD];
- Assign ownership of the database to user “SalesAD”
ALTER AUTHORIZATION ON DATABASE::[SalesDB] to [SalesAD]
ALTER AUTHORIZATION ON DATABASE::[SalesDB_Archive] to [SalesAD]
- Now if you login to SQL Server using “SalesAD” login then you can see only two databases “SalesDB” and “SalesDB_Archive”.
Q. My database contains A, B, C, D, E, F tables and I have a user called USER1. How can I hide tables B, D and E from user USER1? The user USER1 should be able to see / access A, C, and F tables only.
Ok, let's say in database DB1, USER1 needs to access tables A, C and F and should not be able to access / view tables B, D and E. Here is the process:
- Map USER1 to database DB1 – Only to Public Role
- Give required access on A, C, F tables to USER1 as below:
- GRANT SELECT ON A TO USER1;
- GRANT INSERT ON A TO USER1;
- GRANT DELETE ON A TO USER1;
- GRANT UPDATE ON A TO USER1;
- Same applies for Table C and F.
- Now USER1 can see / access only the A, C and F tables
Q. How can I move a database data / log file to a new drive when database is participating in AlwaysOn?
There are two ways to do it:
First Way: Removing database from AlwaysOn Group and Detach- Attach method
- Remove the database from AlwaysOn Availability Group
- Detach the database from instance
- Move the physical files to the required / new location
- Attach the database back to instance
- Re-join the database to Availability Group
Second Way: Suspend data movement and move the data files. This needs a SQL Server restart and a failover between replicas.
- Suspend the data movement for the database to all secondary databases. This stops sending log records to the secondary databases; therefore we can expect log file growth on the primary database.
- Change the readable secondary to No for all secondary replicas
- Logically move the data / log files to the new location: modify the file locations on a secondary replica using the ALTER DATABASE … MODIFY FILE command.
- On secondary replica stop SQL Server instance
- Physically move data / log files to the new location: Cut and Paste data and log files to new location
- Start the SQL Server instance on secondary replica and check the database file locations
- Resume the database in Primary Replica
- Change the readable secondary to Yes for the completed secondary replica
- Failover and repeat above steps to complete data / log file movement on all partner servers
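The logical file move in the second approach can be sketched as follows; the database name, logical file name and new path are placeholders:

```sql
-- Run on the secondary replica: updates the file location in metadata only.
-- The physical file is moved while the instance is stopped.
ALTER DATABASE [SalesDB]                   -- hypothetical database name
MODIFY FILE (NAME = SalesDB_Data,          -- hypothetical logical file name
             FILENAME = N'E:\NewData\SalesDB.mdf');
```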
Q. What are the consequences when we lose the principal in database mirroring?
If the principal fails, the failover scenario depends on the transaction safety level and whether you have a witness.
Scenario 1: Safety FULL with a witness
This scenario provides high safety with automatic failover. In the event of the failure of the principal, the mirror forms a quorum with the witness and automatic failover takes place, thereby minimizing the database downtime. Once the old principal becomes operational again, it automatically assumes the role of the mirror.
Scenario 2: Safety FULL without a witness
This scenario provides high safety, but automatic failover is not allowed. In the event of failure of the principal, the database service becomes unavailable. We need to execute the following on mirrored server to make the database service available:
ALTER DATABASE <database name> SET PARTNER OFF
RESTORE DATABASE <database name> WITH RECOVERY
Once the principal becomes available again, you need to re-establish the mirroring session.
Scenario 3: Safety OFF
If the safety level is OFF, there may be transactions that didn't make it to the mirror at the time of the failure of the principal. These transactions will be lost. Therefore, a forced failover with safety OFF involves acknowledging the possibility of data loss. We need to execute the following on the mirror to make the database service available:
ALTER DATABASE <database_name> SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS
Once the database on principal becomes operational, it automatically assumes the role of the mirror. However, the mirroring session remains SUSPENDED, and you will need to manually RESUME the mirroring session.
Q. What are the consequences if we lose the mirror in database mirroring?
If the mirror fails, the principal continues functioning, but the mirroring state is DISCONNECTED and the principal is running exposed. Once the mirror database becomes operational, it automatically assumes the role of the mirror and starts synchronizing with the principal. As long as the mirroring state stays DISCONNECTED, the transaction log space on the principal cannot be reused, even if you back up the transaction log. If the log file grows and reaches its maximum size limit or runs out of disk space, the complete database comes to a halt. To prevent this we have options:
- Plan for enough disk space for the transaction log to grow and bring back the mirror database before the space fills up
- Break the database mirroring session
Q. What happens if we lose the witness in database mirroring?
If the witness server fails, database mirroring continues functioning without interruption, except that automatic failover is not possible. Once the witness becomes operational, it automatically joins the database mirroring session.
Q. What happens if we lose both mirror and the witness in database mirroring?
Assume you have configured database mirroring with a witness. When the mirror is unavailable, the principal runs exposed. While the mirror is unavailable, if the witness is also lost, the principal becomes isolated and can’t service the clients. Even though the principal database is running, it is not available to the clients. If you attempt to connect to the database, you get the message “Database <dbname> is enabled for database mirroring, but neither the partner nor witness server instances are available: the database cannot be opened.” If the mirror or the witness cannot be brought back online quickly, then the only way to resume database service is to terminate the database mirroring session. To do this, we need to execute the following command after connecting to the master database of the principal server:
ALTER DATABASE <db_name> SET PARTNER OFF
Q. Can I perform point in time recovery if recovery model is Bulk Logged recovery model?
If a log backup taken under the bulk-logged recovery model contains bulk-logged changes, point-in-time recovery within that backup is not allowed; you can only restore the log backup as a whole. Trying to perform point-in-time recovery on a log backup that contains bulk-logged changes will cause the restore operation to fail.
Q. I have a table called Member with the columns "MemberID" (unique), "DOB" etc. As per the application requirement we mostly fetch data based on MemberID and DOB. What is your index strategy?
We can create a clustered index on MemberID and a non-clustered index on the DOB column. There are a few things we should remember to make sure that the index gets utilized:
- You should understand how the indexes are created on MemberID and DOB; then you can understand how they are utilized in search queries
- When you search the Member table by MemberID, SQL Server seeks the clustered index on MemberID and fetches all required columns from the clustered index leaf node
- When searching based on DOB, it traverses the non-clustered index on DOB and gets the values from the leaf node of the non-clustered index. If the SELECT clause contains more columns, it has to go back to the clustered index leaf node to fetch the other columns along with MemberID and DOB. This process is shown as a Key Lookup in the execution plan.
- Make sure you are selecting only required data from the table and never use * in SELECT
- Use appropriate data types. Ex: use DATE instead of DATETIME for DOB if that suits the requirement
- Avoid using conversion functions in where clause
- Always execute a proper set of test cases and make sure the index improves performance before applying it to production
- If your application needs the same set of columns for almost all queries, consider creating a covering index; make sure you follow the correct column order
- Even after all this, if you still find performance issues, analyze the execution plan and see where the performance can be improved.
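To illustrate the suggested strategy, a sketch of the two indexes; the columns in the INCLUDE clause are hypothetical examples of frequently selected columns:

```sql
-- Unique clustered index on the primary search column
CREATE UNIQUE CLUSTERED INDEX CIX_Member_MemberID
    ON dbo.Member (MemberID);

-- Non-clustered index on DOB; the INCLUDE list avoids a Key Lookup
-- for queries selecting these (hypothetical) columns
CREATE NONCLUSTERED INDEX IX_Member_DOB
    ON dbo.Member (DOB)
    INCLUDE (FirstName, LastName);
```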
Q. Your backup strategy has lost some data. How do you convince your client?
First check the SLA (Service Level Agreement) and make sure the data loss is within acceptable limits as per the SLA. If not, you might have missed something in designing a backup strategy that matches the client SLA. However, the first step should be to "identify the possible ways" to get the data back. There are a few places from which we may recover the data, and of course it always depends on the situation: check if we can take a tail-log backup, see whether the data is static or dynamic (if it's static / master data we may get it from a mirror / partner database), or see if we can match it with any Pre-Prod environment etc.
Even after checking all possible ways, if you still failed to get the data back, approach your manager with a detailed RCA (Root Cause Analysis) covering:
- Why did this happen?
- Where was the mistake?
- Can we provide any quick / temporary solution to get the data back?
- How do we prevent future failures?
- Provide all possible logs / event information that you collected during the RCA
In the post "Latest SQL DBA Interview Questions and Answers" we have listed a few random SQL Server DBA interview questions with answers. These questions were shared by blog followers and we would like to thank everyone who shared their experiences.