Introducing: “AI The One” for Data Professionals

A Practical Path to AI for SQL Developers, DBAs, Data Engineers & Data Analysts


You’ve mastered databases. You speak fluent SQL. You can handle big data and build data pipelines in your sleep. But what if the next edge isn’t about writing better queries, but about making your data… intelligent?

Why This Blog Series? Why Now?

The world of data is changing — fast.

AI is no longer just a buzzword. It’s baked into the tools you’re already using:

  • Azure AI & Copilot Integrations (SQL Server Management Studio, VS Code, Power BI, Microsoft Fabric Notebooks, etc.)
  • Intelligent Query Recommendations
  • Automated data preparation pipelines
  • AI-based Data Governance
  • Intelligent code assistants such as GitHub Copilot, Copilot with Azure SQL, OpenAI Codex, Anthropic Claude Code, Cursor AI, Kite, and AI2SQL, which are transforming the way SQL Developers, DBAs, Python Developers, and Data Engineers write code — helping them code faster, smarter, and with fewer errors.

Soon, understanding AI won’t be optional — it will be expected.

But here’s the problem:

Most AI courses are built for data scientists, ML engineers, or Python-first developers.
They’re full of math, jargon, and abstract theory.

What about us?

The SQL Developers, DBAs, Data Engineers, and Data Analysts who already understand data better than most?

That’s exactly why I created “AI The One”: an AI learning path made for people like us.

What Is “AI The One”?

A practical, blog-style course that teaches AI and Agentic AI from your perspective as a data professional.

  • No fluff.
  • No heavy math.
  • Just clean, structured, step-by-step learning — grounded in the world you already know: Data, SQL, data modelling, indexes, pipelines, storage, and logic.

What You’ll Learn — Without Getting Overwhelmed

  • The real-world meaning of AI, ML, LLMs, Gen AI, and Agentic AI
  • How AI systems relate to what you already know in SQL & Data Management
  • Simple hands-on examples with your data
  • Python for SQL minds — just what’s needed, nothing more
  • Concepts like vectors, embeddings, and prompt engineering (a short SQL sketch follows this list)
  • How tools like LangChain, LangGraph, AutoGen, and Copilots will reshape your daily work
  • And how you can go from data expert → AI-empowered professional
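To give a flavour of how these concepts map onto familiar SQL, here is a minimal, hypothetical sketch: if document embeddings were stored one row per (document, dimension), cosine similarity between a query vector and every document is just a join plus an aggregate. The table names below are invented for illustration only.

    -- Hypothetical tables:
    --   dbo.DocEmbeddings(DocId, Dim, Value)  -- one row per dimension of each document's embedding
    --   dbo.QueryEmbedding(Dim, Value)        -- the embedding of the search query
    SELECT TOP (5)
           d.DocId,
           SUM(d.Value * q.Value)
             / ( SQRT(SUM(d.Value * d.Value)) * SQRT(SUM(q.Value * q.Value)) ) AS CosineSimilarity
    FROM dbo.DocEmbeddings AS d
    JOIN dbo.QueryEmbedding AS q
      ON q.Dim = d.Dim
    GROUP BY d.DocId
    ORDER BY CosineSimilarity DESC;  -- the five documents most similar to the query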

Who Should Follow This?

  • SQL Developers who want to future-proof their skills
  • SQL DBAs who want to understand what AI is doing to their data landscape
  • Data Engineers who want to extend their pipelines into intelligent automation
  • Data Analysts who want to elevate their insights using AI-powered data exploration
  • Anyone from the data world who has zero AI experience but big curiosity

Why You’ll Love It

  • It speaks your language (SQL, tables, indexes, functions, pipelines, execution plans)
  • Every concept connects to your existing knowledge
  • No need to be a Python ninja or statistics wizard
  • You’ll learn not just how, but why and where AI fits in
  • It’s fun, practical, and focused on real impact in your daily work

Coming Up Next…

Stay tuned for the first blog in the series, where we’ll explore:

Why Every DBA, SQL Developer and Data Engineer Should Care About AI Now

#learn #ai #sql #dba #llm #dataengineer #dataanalyst #genai #agenticai #openai #llama


A discussion between a CxO and a senior Data Architect Part 5


.

Links to other parts

A discussion between a CxO and a senior Data Architect Part 1

A discussion between a CxO and a senior Data Architect Part 2

A discussion between a CxO and a senior Data Architect Part 3

A discussion between a CxO and a senior Data Architect Part 4

.

Background: We have been going through a discussion that took place between senior leadership and a data architect. The final part of the series continues here.

.

Discussion Follows:

.

Alison: We currently hold stakeholder financial portfolios, customer personal identities, and other sensitive information classified as confidential and restricted. When I think about moving to the cloud, the first thing that comes to mind is data security. We are going to store our corporate data in a public cloud data center such as Azure or AWS. Since you are the data owner, you need to convince me about the cloud migration by explaining the public cloud security capabilities. Assuming I have zero knowledge of cloud security, can you list all possible security risks and how cloud providers handle them?

Vasumat: Security is the top concern for any business. In a public cloud platform, security is a shared responsibility between the cloud provider (Azure, AWS, Google Cloud, etc.) and the customer. The three fundamental objectives of data security are: A) Confidentiality – ensuring data privacy; B) Integrity – protecting data from accidental or intentional alteration or deletion without proper authorization; C) Availability / Data Resiliency – despite incidents, data continues to be available at the required level of performance.

Things to be protected: We need to protect everything that belongs to our enterprise infrastructure. Broadly, it is categorized as: cloud endpoints, network, data, applications, resources, keys & identities, backups, logs, and the cloud datacenter (physical device protection).

.

Possible security risks, reasons, solutions / preventive measures:

 Account hijacking: Compromised login credentials can put our entire enterprise at risk.

Reason: Weak credentials, unchanged passwords, keylogging (monitoring keystrokes), sharing credentials, etc.

Prevention: Use strong credentials, never share them, define expiry times for tokens, enable a password policy, enable multi-factor authentication, do not write passwords in clear text, store keys and certificates in Azure Key Vault, allow access only from specific IP addresses, do not use public computers or Wi-Fi to connect to the cloud portals, etc.
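For example, on Azure SQL Database the “allow access only from specific IP addresses” point can be enforced with the built-in database-level firewall procedures. A minimal sketch; the rule name and IP range below are purely illustrative:

    -- Allow connections only from a known office range (illustrative addresses).
    EXECUTE sp_set_database_firewall_rule
        @name             = N'OfficeNetwork',
        @start_ip_address = '203.0.113.10',
        @end_ip_address   = '203.0.113.20';

    -- Review the database-level firewall rules currently in place.
    SELECT name, start_ip_address, end_ip_address
    FROM sys.database_firewall_rules;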

.

 Human error: It is an indirect threat to our cloud workloads. Ex: Unknowingly deleting a resource, downloading insecure applications, misconfigurations, etc.

Reason: Low clarity of goals, untrained resources, unclear policies, not having a proper data handover process as part of resource exit formalities, etc.

Prevention: Train the resources, strengthen IT policies (e.g., password expiry; restricting risky apps, games, pirated software downloads, and internet gateways), create tight monitoring controls, etc.

.

Application Security failures: Web applications are increasingly targeted by malicious attacks that exploit commonly known vulnerabilities. Ex: SQL injection (a code injection technique in which malicious SQL statements are inserted into application fields to access information that was never intended to be displayed), cross-site scripting (the attacker sends malicious code/payloads to the server via a feedback form, comment section, etc.), and so on.

Reasons: Not sanitizing inputs, not implementing a timeout policy, displaying session IDs in URLs, not using SSL/TLS, not encrypting passwords, failing to verify the source of incoming requests, exposing object references (table/view/function, database, file, storage, server, etc.), exposing error-handling information to the end client, running unnecessary services, using outdated software and plugins, not having a standard audit policy, etc.

Prevention: Properly sanitize user inputs; configure session timeouts based on requirements; do not expose unnecessary information (error details, object references, session IDs, app metadata, etc.) to the end client; always make sure the underlying app components are updated with the latest patches; avoid redirects altogether, and if they are necessary, keep a static list of valid locations to redirect to; equip apps with SSL/TLS, multi-factor authentication, etc.; establish a strong security clearance layer, meaning that every time new code is deployed we review, scan, and identify security loopholes; enable a Web Application Firewall, which acts as a layer between the application and the internet, filters traffic, and protects the app from common attacks such as cross-site request forgery (CSRF), cross-site scripting (XSS), file inclusion, and SQL injection (a cloud-based WAF is recommended so it updates automatically to handle the latest threats); schedule periodic audits on application code. We use vulnerability scanners such as Grabber, which performs automated black-box testing and identifies security vulnerabilities.
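To make the input-sanitization point concrete on the database side, here is a minimal T-SQL sketch (the Customers table and column names are hypothetical) contrasting vulnerable string concatenation with a parameterized call to sp_executesql:

    -- Vulnerable: user input concatenated straight into the statement (SQL injection risk).
    DECLARE @UserInput nvarchar(100) = N'O''Brien'' OR 1=1 --';
    DECLARE @BadSql    nvarchar(max) =
        N'SELECT CustomerId, LastName FROM dbo.Customers WHERE LastName = ''' + @UserInput + N'''';
    -- EXEC (@BadSql);  -- never execute statements built this way

    -- Safer: the value travels as a typed parameter, never as executable SQL text.
    EXEC sp_executesql
         N'SELECT CustomerId, LastName FROM dbo.Customers WHERE LastName = @LastName',
         N'@LastName nvarchar(100)',
         @LastName = @UserInput;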

.

Data Breach & Theft of Intellectual Property: Altering, deleting, uploading, or downloading our corporate data without authorization is called a data breach. If it happens to sensitive data (patents, trade secrets, PII – Personally Identifiable Information, financial info, etc.), we need to notify the victims, and it can critically damage our organization’s image, sometimes leading to legal action and heavy penalties. Ex: Cyber attacks, phishing attacks, malware injections, etc.

Reason: A data breach and theft of IP are implications of a failed security framework; typically, any security weak point can cause them. Ex: Leaked credentials, human error, application loopholes, weak or missing IT policies, storing encryption keys along with the encrypted data, etc.

Prevention: We must be able to control the entire workflow and data flow in our cloud workload. When a request enters or leaves our cloud network, we (our policies, standards, and security posture) must drive the flow: who can enter our network, how the request is sanitized based on its source and access pattern, the network route it can take, the resource it can reach, the data it can access, the actions it can perform, and the results it can carry back to the request initiator (service, app, browser, etc.).

To implement this, we need a strong authentication and authorization mechanism: grant the least possible permissions, enable threat detection, restrict access to specific IP addresses, apply data protection features (classification, data masking, encryption, etc.), secure backup files and log files, run frequent audits (data and application) and fix the findings, take patching (IaaS) seriously, define clear data boundary standards and implement policies accordingly, store encryption keys and certificates separately (using a key vault), etc.
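As a small T-SQL sketch of the data-protection and least-privilege points above (the table, column, and user names are hypothetical): Dynamic Data Masking hides sensitive values from non-privileged readers, and a narrow GRANT keeps permissions to the minimum needed.

    -- Mask the email column so non-privileged users see an obfuscated value.
    ALTER TABLE dbo.Customers
        ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

    -- Least privilege: the reporting user may only read the table, nothing more.
    CREATE USER ReportingUser WITHOUT LOGIN;
    GRANT SELECT ON dbo.Customers TO ReportingUser;

    -- Only explicitly trusted principals get to see unmasked data.
    -- GRANT UNMASK TO ComplianceAuditor;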

.

Data Loss: Data loss is any process or event that results in data being corrupted, deleted, and/or made unreadable by a user, software, or application.

Reason: Hardware failure, power failure, datacenter failures, natural disasters, accidental deletion, not understanding or having proper agreements (data retention period), not having proper backups, no or weak disaster recovery plan, not checking backup health, not having tight protection controls for backups, etc.

Prevention: Understand the SLA (Service Level Agreement) covering the Data Retention Policy (how long data must be kept and how it is disposed of), Recovery Point Objective (RPO – maximum allowed data loss), and Recovery Time Objective (RTO – maximum allowed downtime), and plan your backup and disaster recovery accordingly. Depending on data volume and operations, perform DR drills to ensure that backups are healthy. Wherever possible, keep secondary copies across regions, utilize long-term backup retention features, etc.
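For the backup-health point, a minimal T-SQL sketch (the database name and storage URL are illustrative, and backing up to a URL assumes a matching credential already exists):

    -- Back up directly to Azure Blob Storage (database name and URL are illustrative).
    BACKUP DATABASE SalesDB
    TO URL = 'https://mystorageacct.blob.core.windows.net/backups/SalesDB_full.bak'
    WITH COMPRESSION, CHECKSUM;

    -- Part of a DR drill: confirm the backup file is readable and internally consistent.
    RESTORE VERIFYONLY
    FROM URL = 'https://mystorageacct.blob.core.windows.net/backups/SalesDB_full.bak'
    WITH CHECKSUM;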

.

Compliance issues: Regulatory compliance exists to mitigate risk and protect our data (both enterprise and customer). Our cloud infrastructure must comply with the regulatory standards defined by our enterprise business team. If we fail to follow the standards and a data breach or loss happens, our organization will be in a difficult position both legally and financially (penalties can run as high as $21 million). The most common regulations are GDPR (General Data Protection Regulation) and data privacy (CCPA); likewise, we have regulators for health insurance (HIPAA), payment cards (PCI DSS), financial information (SOX), etc. Sample compliance rules: a user must not have access to both prod and non-prod servers, block public internet access to VMs, restrict the number of administrators, enable a password policy, store keys and certificates separately from data, missing security patches is a compliance issue, etc.

Reason: Companies do not take regulatory compliance seriously; many are still in the awareness stage; others hesitate at the investment in implementation effort, which requires collaboration, strategy, and skill sets. Programmers and IT developers often treat regulatory compliance as the lowest priority.

Prevention: It is the responsibility of IT decision makers and cloud/data architects to insist on compliance with regulatory standards. On-premises we may need third-party tools to audit our infrastructure and validate regulatory compliance, but in the cloud we have built-in support. We can use Azure Policy to implement the required standards. The “Regulatory compliance dashboard” in Azure Security Center is one of my favorite features: it monitors, validates, and reports non-compliant issues so that we can fix them and stay compliant with the regulatory standard. It validates almost all aspects, e.g. network, cloud endpoints, data protection, threat detection, vulnerability management, privileged access, backup & recovery, etc.
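Several of the sample rules above can also be spot-checked with plain T-SQL; for instance, a minimal sketch that flags SQL logins where the password policy or expiration check has been switched off:

    -- SQL logins that bypass the Windows password policy or password expiration rules.
    SELECT name,
           is_policy_checked,
           is_expiration_checked
    FROM sys.sql_logins
    WHERE is_policy_checked = 0
       OR is_expiration_checked = 0;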



A discussion between a CxO and a senior Data Architect Part 4


Links to other parts

A discussion between a CxO and a senior Data Architect Part 1

A discussion between a CxO and a senior Data Architect Part 2

A discussion between a CxO and a senior Data Architect Part 3

A discussion between a CxO and a senior Data Architect Part 5

.

Background: We have been going through a discussion that took place between senior leadership and a data architect. Part 4 continues.

.

Discussion Follows:

.

Alison: Can you tell me about the most complex migration you have done to date?

Vasumat: Sure! We migrated workloads for a European engineering and construction firm. They operate from 85 offices across the globe with a staff of 13,000 (IT staff 4,300), revenues over $6.5 billion, and a customer count of 0.4 million. We built the business case around two major challenges. A) Data management on expensive storage – they were generating enormous amounts of data and struggling to maintain and manage it and to handle disaster recovery at their data centers. B) Scalability – systems were not scalable enough to handle heavy workloads. We proposed the cloud solution to address these two challenges and projected Azure as the suitable target, considering TCO, ROI, and compatibility (the majority of their workloads use Microsoft products: .NET, Windows OS, SQL Server, and the Office suite including email and SharePoint).

.

Alison: Great! Can you summarize the migration?

Vasumat: Sure!

• Migration challenges included legacy systems (SQL Server 2000, Windows 2003), migrating huge data sets, Oracle to Azure SQL Database (a customer requirement), different migration strategies for various applications (refactoring, re-platforming, lift-and-shift, phased), critical application migration (with near-zero downtime), hybrid cloud for 5 applications (some components on-premises and some in Azure), heterogeneous database systems (SQL Server, Oracle, MySQL, MongoDB), diversified feature selection (PaaS, SaaS, IaaS, Serverless), etc.
• Tools used: Azure Migrate service (to migrate VMs & SQL Server – Discovery & Assessment, Server Migration Services); Azure Data Box (shipping over 40 TB of data on a physical device); DMA/DMS (Data Migration Assistant / Database Migration Service for migrating databases); Azure Data Factory (migrating huge datasets); AzCopy (migrating storage); Microsoft Virtual Machine Converter (MVMC) / Virtual Machine Manager (VMM) (for converting VMs on VMware hosts or physical computers to VMs running on Microsoft Hyper-V); Azure Site Recovery (ASR, Azure DRaaS – Disaster Recovery as a Service for VMs; in some scenarios we also used it for VM migration); Recovery Services vault (for storing VM backups); VPN Gateway / ExpressRoute (to establish a proper communication channel between on-premises and Azure); Azure Synapse Pathway (when migrating a data warehouse to Synapse, it converts DDL and DML statements to be compliant with Azure Synapse Analytics); SQL Server Migration Assistant (SSMA – to migrate from other RDBMSs such as MySQL and Oracle to SQL Server and Synapse Analytics); Azure AD Connect (to synchronize the on-premises Active Directory with Azure Active Directory); etc.
• Migration outcome:
  • As part of the analysis, we identified unused/unnecessary applications and retired 180 on-premises VMs. We successfully migrated 260 applications and 1,400+ servers and VMs (1,100 Windows, 300+ Linux),
  • 650+ database instances (SQL Server, Oracle, MySQL, MongoDB),
  • project environments – DEV, TEST, STAGE, PRE-PROD, PROD,
  • 38 file shares, 12,000+ active users, 25+ domains, 30+ messaging services,
  • data warehouses, reporting, and ETL solutions (SQL, Informatica, SSIS, SSRS),
  • and 1.2 petabytes (about 1,200 TB) of data.
  • Finally, we closed/decommissioned 97% of the on-premises legacy data center resources.
  • Auto-scaling cloud infrastructure helped them handle peak loads with zero issues. On-premises they used to experience an average of 3 to 4 outages per month due to infrastructure issues; after the cloud migration they had zero outages in the first 8 months and one outage in the first 12 months.
  • Based on utilization frequency, we also leveraged the storage tiers (Hot, Cool & Archive), which saved the business a lot of cost and enabled faster, more efficient application development, management, and support.

