Overview

Learning Objectives

Write and interpret subqueries, correlated subqueries, derived tables, and CTEs — and choose the right structure for each analytical situation.
Apply advanced filtering logic using EXISTS, ANY, ALL, CASE, COALESCE, and NULL-safe patterns to handle complex conditions cleanly and correctly.
Use string, type, and date functions — TRIM, UPPER, LOWER, CAST, DATEADD, DATEDIFF, EXTRACT — to standardize and transform raw data into analysis-ready form.
Detect and resolve data quality issues — NULL values, inconsistent categories, duplicate records, and messy strings — using SQL directly, without external tools.

SQL Intermediate bridges the gap between foundational SQL skills and the techniques required in real-world analytics environments. Each part addresses a distinct category of intermediate work — how to structure complex queries, how to filter data precisely, and how to clean and prepare it before it reaches reporting or analysis.

Part 1

Advanced Retrieval Patterns

Subqueries, CTEs, derived tables, and correlated logic — choosing the right structure for any query problem.

Part 2

Advanced Filtering & Logic

EXISTS, ANY/ALL, CASE patterns, COALESCE, and NULL-safe filtering for precise conditional query design.

Part 3

Data Cleaning with SQL

String functions, type conversion, date logic, NULL imputation, and duplicate detection — all in SQL.

Course Goal

By the end of all three parts, learners will write SQL with the precision and structure of an analyst — choosing the right retrieval pattern, applying clean conditional logic, and producing data that is ready for reporting without manual cleanup.

Where SQL Intermediate Sits in the Curriculum

SQL Intermediate builds on the six-keyword workflow introduced in SQL Basics — SELECT, FROM, WHERE, JOIN, GROUP BY, HAVING — and introduces the patterns that make those keywords genuinely powerful at scale and in real data environments.

Course	Focus	Key Skills
SQL Basics	Core workflow	SELECT, WHERE, JOIN, GROUP BY, HAVING, ORDER BY
SQL Intermediate	Query patterns & real-world logic	Subqueries, CTEs, EXISTS, CASE, COALESCE, data cleaning
SQL Advanced	Performance & engineering	Window functions, indexes, execution plans, partitioning

SQL Intermediate uses the same four tables introduced in SQL Basics. Keeping the dataset consistent means all mental energy goes toward mastering the new technique — not toward understanding unfamiliar data. The same rows now serve as the basis for correlated subqueries, EXISTS filters, CASE logic, date calculations, and duplicate detection.

students Person-level One row per student

Key Fields

student_idfirst_namelast_namestart_year

Used For

Correlated subquery practice (per-student calculations), EXISTS-based filtering (students who are enrolled), CASE-based cohort labelling, TRIM/UPPER/LOWER on name fields, NULL detection, and duplicate identification.

courses Course-level One row per course

Key Fields

course_idcourse_namedepartmentcredits

Used For

Subqueries in SELECT (course-level counts), derived table practice, ANY/ALL comparisons against credits, CASE-based credit classification, and CAST on numeric fields.

instructors Instructor-level One row per instructor

Key Fields

instructor_idfirst_namelast_namedepartment

Used For

EXISTS-based filtering, subqueries against department, messy string handling on name fields, COALESCE for missing department values, and IS NULL / IS NOT NULL validation.

enrollments Enrollment-level One row per student per course registration

Key Fields

enrollment_idstudent_idcourse_idgradeterm

Used For

The most-used table across all three parts. Subqueries in WHERE (filtered enrollment sets), correlated subqueries (per-student counts), CASE-based grade flags, COALESCE on NULL grades, date functions on term fields, and duplicate detection using ROW_NUMBER().

Part 1 introduces the retrieval structures that power real-world SQL — going beyond simple SELECT statements to build queries that reference other queries, compute values dynamically, and break complex logic into named, reusable steps. Every technique is demonstrated using the four course tables.

Topic 1

Subqueries in SELECT, WHERE & FROM

Embedding a query inside another query at three different locations. A subquery in SELECT adds a computed column. A subquery in WHERE filters rows against a dynamic result set. A subquery in FROM acts as a temporary table.

Topic 2

Correlated Subqueries

Subqueries that reference the outer query — recalculating once per outer row. Used for row-by-row comparisons and per-entity calculations that cannot be expressed with a simple JOIN or GROUP BY.

Topic 3

Derived Tables

Subqueries placed in the FROM clause and treated as a named temporary table. Derived tables allow filtered, aggregated, or transformed result sets to be joined as if they were regular tables.

Topic 4

CTEs (Common Table Expressions)

Named temporary result sets defined using WITH at the start of a query. CTEs separate complex logic into readable steps — each with a clear name — making multi-step queries easier to write, read, and debug.

Topic 5

Nested Subqueries

Subqueries placed inside other subqueries — multiple levels of nesting. Understanding when nesting is necessary versus when a CTE or JOIN would produce cleaner, more maintainable logic.

Topic 6

Subquery vs. CTE vs. Derived Table

A direct comparison of three ways to express the same logic — when each approach is preferred, how they differ in readability and reuse, and the practical rules for choosing between them.

Choosing the Right Structure

Understanding which retrieval pattern to use — and why — is one of the most practically valuable skills in Part 1. Each structure has a distinct strength and the right choice depends on reuse, readability, and complexity.

Structure	Defined Where	Best Used When	Key Strength
Subquery	Inline inside SELECT, WHERE, or FROM	The logic is simple and used only once	Concise for single-use calculations and filters
Correlated Subquery	Inside WHERE, referencing the outer query	The calculation depends on each outer row	Row-by-row comparisons that JOINs cannot express
Derived Table	Inside FROM, given an alias	A filtered or aggregated result needs joining	Acts like a table — supports JOINs on the result
CTE	Before the main query using WITH	Logic is complex, multi-step, or referenced more than once	Readable, named steps — easiest to debug and extend

Key Concepts — Part 1

Subquery in SELECT · Subquery in WHERE · Subquery in FROM · Correlated subquery with outer reference · Derived table with alias · WITH … AS CTE · Chained CTEs · Nested subquery depth.

Part 2 introduces the filtering and conditional tools that allow SQL to handle the ambiguity, incompleteness, and complexity of real-world data. Rather than filtering on simple equality conditions, learners apply logic that checks for existence, compares against sets, responds to NULL, and classifies records into meaningful categories.

Topic 1

EXISTS

Filters rows based on whether a correlated subquery returns any results — without needing to know what those results are. More efficient than IN for large subquery result sets and better at expressing presence-or-absence logic.

Topic 2

ANY & ALL

Comparing a value against every result returned by a subquery. ANY returns true if at least one comparison passes. ALL returns true only if every comparison passes. Used for threshold and range checks against dynamic sets.

Topic 3

CASE — Advanced Patterns

Moving beyond simple categorization into multi-condition CASE logic — nested CASE statements, CASE inside aggregations, and CASE used to pivot data from rows into columns within a single query.

Topic 4

COALESCE & IFNULL

Replacing NULL values with meaningful defaults in query results. COALESCE evaluates a list of expressions and returns the first non-NULL. Applied to enrollment grades, instructor departments, and any field that may legitimately be missing.

Topic 5

CASE with Aggregations

Embedding CASE inside SUM and COUNT to count or total only the rows meeting a specific condition — producing multi-metric summary reports in a single query without multiple separate GROUP BY passes.

Topic 6

NULL-Safe Filtering

Using IS NULL and IS NOT NULL correctly — understanding why NULL cannot be compared with = or !=, and applying the right filtering pattern to include, exclude, or flag records with missing values.

EXISTS vs. IN — The Key Distinction

Two of the most commonly confused filtering patterns in intermediate SQL are EXISTS and IN. Each checks whether a value appears in a set — but they operate differently and have different performance characteristics on large datasets.

Pattern	How It Works	Best Used When
WHERE id IN (subquery)	Evaluates the subquery once, produces a list, checks membership	The subquery returns a small, simple list of values
WHERE EXISTS (correlated subquery)	Evaluates the subquery once per outer row, stops at first match	Checking presence or absence — especially in large tables
WHERE NOT EXISTS (correlated subquery)	Returns outer rows where the subquery finds no match	Finding records with no related rows — students with no enrollments

NULL and Equality

NULL cannot be compared using = or !=. A condition like WHERE grade = NULL will never match any row — not because no grades are NULL, but because NULL is not equal to anything, including itself. Always use IS NULL or IS NOT NULL when filtering for missing values.

Key Concepts — Part 2

EXISTS · NOT EXISTS · ANY · ALL · CASE WHEN THEN ELSE END (multi-condition) · CASE inside SUM/COUNT · COALESCE(expr, default) · IFNULL(expr, default) · IS NULL · IS NOT NULL.

Part 3 addresses one of the most practical and frequently underestimated skills in data work — preparing raw data for analysis directly in SQL, without external tools. Messy strings, inconsistent types, date format issues, NULL values, and duplicate records are all handled in-query before results reach a report or dashboard.

String Functions

TRIM

Remove Leading & Trailing Spaces

Cleans hidden whitespace from the beginning or end of a string — a common issue in data imported from spreadsheets, forms, and legacy systems. Prevents join failures caused by invisible spaces.

UPPER & LOWER

Standardize Case

Converts string values to a consistent case — ensuring that “Math”, “MATH”, and “math” are treated as the same value in comparisons, GROUP BY operations, and joins.

Messy String Handling

Combined Cleaning Patterns

Combining TRIM, UPPER, LOWER, REPLACE, and SUBSTRING to standardize free-text fields — removing unwanted characters, correcting formatting, and extracting the useful portion of a string.

Type Conversion & Date Functions

CAST & CONVERT

Change Data Types

Converting values from one data type to another — text to number, number to string, string to date. Essential when source data stores numbers as text or dates in non-standard formats.

Date Functions

DATEADD · DATEDIFF · EXTRACT

Calculating time differences between dates, adding or subtracting intervals, and extracting specific date parts (year, month, day). Used for tenure calculations, age derivation, and period-based filtering.

NULL Handling & Duplicate Detection

NULL Imputation

Replace Missing Values

Using COALESCE and IFNULL to substitute meaningful defaults for NULL values — replacing NULL grades with “Not Graded”, NULL departments with “Unknown”, or NULL dates with a reference date.

Duplicate Detection

Find Repeated Records

Using DISTINCT, GROUP BY + HAVING COUNT(*) > 1, and ROW_NUMBER() to identify records that appear more than once in a table — a critical step before any aggregation or reporting.

Duplicate Removal

Keep Only One Version

Using ROW_NUMBER() inside a CTE to assign a sequence number per group, then filtering for only the first record — the standard SQL pattern for de-duplicating a table while keeping the most recent or most relevant row.

The Data Cleaning Workflow

A consistent cleaning workflow applied in SQL before any analysis begins prevents errors from propagating into reports, dashboards, and aggregations. Part 3 teaches learners to work through cleaning problems in a structured, repeatable order.

Step	Problem Addressed	SQL Tools
1. Check for NULLs	Missing values that will affect counts, joins, and aggregations	IS NULL · COALESCE · IFNULL
2. Standardize strings	Inconsistent case and whitespace preventing correct grouping and joins	TRIM · UPPER · LOWER
3. Fix data types	Numbers stored as text, dates in wrong format	CAST · CONVERT
4. Handle dates	Inconsistent date formats, missing date parts, derived date calculations	DATEADD · DATEDIFF · EXTRACT
5. Detect duplicates	Repeated records inflating counts and totals	DISTINCT · GROUP BY HAVING COUNT(*) > 1
6. Remove duplicates	Keeping only one version of each duplicated record	ROW_NUMBER() OVER (PARTITION BY … ORDER BY …)

Key Concepts — Part 3

TRIM · UPPER · LOWER · REPLACE · SUBSTRING · CAST · CONVERT · DATEADD · DATEDIFF · EXTRACT · COALESCE · IFNULL · IS NULL · DISTINCT · GROUP BY + HAVING COUNT(*) > 1 · ROW_NUMBER() for deduplication.

Overview — Key Takeaways

Five foundational principles from SQL Intermediate.

Four Tables

Students, courses, instructors, and enrollments — the same dataset from SQL Basics, now supporting correlated subqueries, EXISTS filters, COALESCE logic, date functions, and duplicate detection.

Three Parts

Retrieval patterns, filtering logic, and data cleaning — three distinct intermediate skill sets that together produce the analytical foundation needed for real-world SQL work.

Right Structure

Subquery, correlated subquery, derived table, or CTE — each retrieval structure has a specific strength. Knowing which to choose for a given problem is the most important decision in Part 1.

NULL Awareness

NULL cannot be compared with =. EXISTS and NOT EXISTS express presence and absence more clearly than IN and NOT IN for large datasets. COALESCE and IFNULL replace missing values with meaningful defaults.

Clean Before Analyzing

Raw data rarely arrives in analysis-ready form. Applying string standardization, type conversion, date normalization, and duplicate removal directly in SQL — before any aggregation — produces results that can be trusted.

Learning Objectives

Where SQL Intermediate Sits in the Curriculum

Choosing the Right Structure

EXISTS vs. IN — The Key Distinction

The Data Cleaning Workflow

Overview — Key Takeaways

Sign in

Sign up