SQL Server DBA Responsibilities
Key Responsibilities – DBA
We can categorize SQL Server DBA Responsibilities into 7 types
- Capacity Management
- Security Management
- High Availability Management
- Backup and Recovery Management
- Performance Tuning
- Process Improvements
- Daily, Weekly and Monthly maintenance
- Installations / Upgrades / Patching
- Health Checks / Report Analysis
DBA Daily Responsibilities
On daily basis DBA should monitor the below
- Confirm that backups have been made and successfully saved to a secure location
- Check the backup failure alerts, correct the errors and rerun the backups
- Review the average duration of backup, any significant changes occurred investigates on this. Most of the time it happens due to networking low bandwidth
- Validate the backup files using restore verify only. We can create jobs to take care of the task and to send a notification if it fails to verify any backup.
- Monitor all backup and log history is cleaning if it is designed.
- Find out the newly added databases and define the backup plan
- Verify the free space on each drive on all servers. If there is significant variance in free space from the day before, research the cause of the free space fluctuation and resolve if necessary. Often times, log files will grow because of monthly jobs.
- Automate through a job. The job runs for every one hour and reports any drive which is having less than 15 % of free space. We can design a SSRS report to showcase and review the delta values.
Jobs / Maintenance plans
- Check for the failed jobs, investigate the route cause and resolve the issue. Native SQL notification alert sends a mail to DBA team and also we can design a customized SSRS report which reports all failed/success job details with delta value.
- Check for the job execution duration and if any significant changes find the root cause and resolve the issue.
- Make sure that all process related jobs/ maintenance plans completed. That includes data pulling jobs, data update jobs etc.
- All cleanup jobs are running fine, temp tables, logs, history, backup files etc
- Confirm all servers/databases are up and running fine.
- Usually in an Enterprise Database Environment Third Party Tools are used to monitor Servers (Ex: “What’s Up”)
- For database monitoring we can design a native SQL Server solution using T-SQL code and a maintenance plan, it run min by min and send an email to DBA team if it is not able to connect to any of the database in the instance.
- Regularly monitor and identify blocking issues. We can design a procedure that continuously run on all PROD servers and notifies DBA team if any blockings, long running quires/transactions.
- Check Performance counters on all production servers and verify that all counters are within the normal range. We can design a SSRS metrics report to review the counters for every one hour.
- Throughout the day, periodically monitor performance using both System Monitor and DMV.
- Check the fragmentation and rebuild/ reorganize the indexes. We can use a native procedure which takes care of the task.
- Make sure all Stats Update / nightly_checkalloc / Index_Rebuild jobs are completed without any issue.
- Have a look at both SQL Server logs and Windows logs. If you find any strange issues notify the network or storage teams. Most of the times we find Network related or I/O related issues.
- Check the centralized error logs if any.
- Check the error logs for failed logins and notify the audit team if necessary
- Security Logs – Review the security logs from a third party solution or from the SQL Server Error Logs to determine if you had a breach or a violation in one of your policies.
- High Availability or Disaster Recovery Logs – Check your high availability and/or disaster recovery process logs. Depending on the solution (Log Shipping, Clustering, Replication, Database Mirroring, CDP, etc.) that you are using dictates what needs to be checked.
- We can design a native scripts using T-SQL to monitor Replication, Mirroring, Log shipping
- Monitor logshipping and mirroring using the customized stored procs.
- In most of the environments we see third party tools in monitoring Clusters or we can design our own native scripts using Windows Batch Programming , Powershell and T-SQL.
- Check the escalated issues first
- Check the current queue for requests and identify requests to be processed and work on the issue.
- We usually process the requests based on the SLA.
Weekly / Monthly Checklist
- Backup Verification (Comprehensive)- Verify your backups and test on a regular basis to ensure the overall process works as expected. Contact your off site tape vendor and validate the type does not have any restore errors
- Check the logins, service accounts for expire dates
- Backup Verification – Verify your backups on a regular basis. Randomly choose one or two backups and try to restore verify.
- Windows, SQL Server or Application Updates – Check for service packs/patches that need to be installed on your SQL Server from either a hardware, OS, DBMS or application perspective
- Capacity Planning – Perform capacity planning to ensure you will have sufficient storage for a specific period of time such as for 3, 6, 12 or 18 months.
- Fragmentation – Review the fragmentation for your databases to determine if you particular indexes must be rebuilt based on analysis from a backup SQL Server.
- Maintenance – Schedule an official maintenance, do all required health checks on all premium databases and servers.
- Security – Remove unneeded logins and users for individuals that have left the organization, had a change in position, etc.
- Moving data from production to archive: If your environment requires any specific DBA related data for a long time, plan for an archival procedure. We can achieve data from actual OLTP system to other dedicated DBA Server / Database
- We need to work with the other teams to make sure that all servers are at health condition and to balance the infrastructure.
- Usually we approach other teams for below (CR/IR/SR – Change Request/ Incident / Service)
- Found I/O errors
- Networking issues
- F/R – Flatten and Rebuild
- Adding space / drives – SAN (Storage Area Network)
- Starting a Server – When a manual interaction needed
- Backup Tapes – Archived backups
- Escalating an issue
- Document all changes you make to the environment that includes:
- Installations / Upgrades
- Service packs / HotFixes applied
- New databases added
- Logins \ Roles added / removed
- Check lists
- Infrastructure changes
- Process improvements
- Maintain a centralized inventory with all details for below items
- Database Servers
- Application Servers
- Web Servers
- Logins (Both SQL Server and Windows)
- Database Instances
- Databases (Dev / Stag / QA / PROD)
Note: Document as much as information. It really helps in critical situations. Try to add owner at each level, for example application owners, server owners etc. We can easily reach them in-case of any emergencies / Maintenance.
Third Party Tools
- MOM 2005 – Microsoft Operational Manager for monitoring IT infrastructure
- ApexSQL Monitor
- SQL Spotlight by Quest
- Microsoft Systems Center 2012 (SCOM)
- SQL Monitor by RedGate
- RoboCopy – Copying data, files across servers
- Heartbeat – Database monitoring tool
- What’Up – Server monitoring tool
- Octopus – Application deployment tool
- Team Viewer – Remote server connector
- VNC – Remote server connector
- SQL DOC2 – Reporting database architecture