Chapter 4 FILTERING

  1. LIKE: Filters results based on a pattern match within a string column.

  2. LIKE ANY: Filters results based on a pattern match against any of a list of patterns.

  3. IN: Filters results where a column’s value matches any value in a list of values.

  4. NOT IN: Filters results where a column’s value does not match any value in a list of values.

  5. BETWEEN: Filters results where a column’s value falls within a specified range (inclusive).

  6. IS NULL and IS NOT NULL: Filters results based on whether a column’s value is NULL or not NULL.

  7. =and != (or <>): Filters results for exact matches or non-matches.

  8. ILIKE and ILIKE ANY: perform pattern matching without worrying about the case (upper or lower) of the letters in the string.

SELECT TOP 5 *
FROM employees
WHERE NAME LIKE 'A%B.'
  AND LAST_NAME LIKE ANY ('ASD', '%A', 'V_')
  AND CITY IN ('CHICAGO', 'LAWRENCE')
  AND CITY IN (SELECT * FROM CITY_TABLE)
  AND STATE NOT IN ('TEXAS')
  AND SALARY BETWEEN 100 AND 10000
  AND DEPARTMENT IS NOT NULL
  AND SALARY > 1000
  AND NAME <> 'DAVID'
  AND NAME ILIKE `david'
  AND LOWER(NAME) LIKE 'david'    -- for those ILIKE is not supported

4.1 Wildcards

%: Matches any sequence of characters (including zero characters). _: Matches any single character. [] (SQL Server): Matches any single character within the specified set or range. [^] (SQL Server): Matches any single character not in the specified set or range. - (SQL Server): Defines a range of characters within square brackets.

SELECT TOP 5 *
FROM employees
WHERE NAME LIKE 'A%B'               -- starts with A and ends with B
  AND NAME LIKE 'ALE_'              -- 4 character and starts with ALE
  AND NAME LIKE '[JM]ason'          -- jason or mason
  AND NAME LIKE '[^JM]ason'         -- not jason or mason
  AND NAME LIKE '[A-D]%'            -- all names that start with any letter between 'A' and 'D'

4.2 DISTINCT keyword

When DISTINCT is applied to a single column, it filters out duplicate values in that specific column, returning only unique values for that column.

  • Single Column: Returns unique values for that column.

  • Multiple Columns: Returns unique combinations of the specified columns.

4.3 Advanced Filtering

4.3.1 HAVING Clause

  • The HAVING clause is used to filter records that work on aggregated data, as opposed to the WHERE clause which filters rows before any aggregation is applied.

  • Use HAVING when you need to filter groups after applying GROUP BY and aggregate functions (like SUM, COUNT, AVG, etc.).

Example: This query returns departments with more than 10 employees.

SELECT department, COUNT(employee_id) AS num_employees
FROM employees
GROUP BY department
HAVING COUNT(employee_id) > 10

4.3.2 QUALIFY Clause

  • The QUALIFY clause filters the result set after window functions have been applied. It’s particularly useful when you want to filter results based on the output of window functions like ROW_NUMBER, RANK, or DENSE_RANK.

  • Use QUALIFY when you need to apply filtering conditions after using window functions, making it easier to handle complex queries involving ranking or ordering within groups.

SELECT 
    employee_id, department, 
    RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS salary_rank
FROM employees
QUALIFY salary_rank = 1;