Introduction

Welcome to Edition 9 of Making SAS Accessible to Everyone. After mastering report automation with ODS in Edition 8, we now turn to data manipulation and querying using SQL within SAS.

SAS offers full support for SQL through the PROC SQL procedure, blending the power of SQL with the flexibility of SAS datasets. Whether you are already familiar with SQL or just starting out, this edition will help you efficiently query, combine, and summarize your data using PROC SQL.


1. Getting Started with PROC SQL

At its core, PROC SQL enables SQL-like querying on SAS datasets:

PROC SQL;
    SELECT column1, column2
    FROM dataset;
QUIT;

This is useful for quick queries, simple reports, or more advanced manipulation without writing multiple DATA steps.

Example:

PROC SQL;
    SELECT Name, Age, Gender
    FROM work.demographics;
QUIT;

This displays the Name, Age, and Gender columns from the dataset work.demographics.


2. Filtering Rows with WHERE

The WHERE clause narrows down the rows returned based on conditions.

PROC SQL;
    SELECT *
    FROM work.demographics
    WHERE Age > 40 AND Gender = 'F';
QUIT;

You can use logical operators like AND, OR, NOT, as well as pattern matching with LIKE, and inclusion using IN.

WHERE Name LIKE 'A%'    /* starts with A */
WHERE Age IN (25, 30, 35)

3. Sorting Rows with ORDER BY

The ORDER BY clause controls the order of displayed results:

PROC SQL;
    SELECT Name, Age
    FROM work.demographics
    ORDER BY Age DESC;
QUIT;

You can sort by multiple columns and specify ascending (default) or descending (DESC) order.


4. Creating New Tables

You can save query results as a new SAS dataset:

PROC SQL;
    CREATE TABLE work.seniors AS
    SELECT *
    FROM work.demographics
    WHERE Age >= 65;
QUIT;

This is equivalent to using DATA and SET with a WHERE clause in a DATA step, but often more concise.


5. Deleting Tables

SQL allows dropping a dataset (table) from your SAS workspace:

PROC SQL;
    DROP TABLE work.temp_data;
QUIT;

Only use this once you’re certain the table is no longer needed.


6. Joining Tables

One of SQL’s most powerful features is the ability to join datasets.

Inner Join Example:

PROC SQL;
    SELECT A.ID, A.Name, B.Salary
    FROM work.employees AS A
    INNER JOIN work.payroll AS B
    ON A.ID = B.ID;
QUIT;

This merges employees and payroll by matching ID. You can also use LEFT JOIN, RIGHT JOIN, or FULL JOIN depending on your data needs.


7. Summary and Aggregation

SQL supports aggregation functions, allowing you to summarize data:

PROC SQL;
    SELECT Gender, COUNT(*) AS N, AVG(Age) AS AvgAge, MAX(Age) AS MaxAge
    FROM work.demographics
    GROUP BY Gender;
QUIT;

Functions available:

  • AVG(), SUM(), MIN(), MAX(), COUNT()
  • GROUP BY groups data by one or more variables

Use HAVING to filter groups after aggregation:

HAVING AVG(Age) > 40

8. Combining Queries with UNION and EXCEPT

UNION: Combine rows from two queries

SELECT Name FROM table1
UNION
SELECT Name FROM table2;

EXCEPT: Keep values from the first query that aren’t in the second

SELECT Name FROM table1
EXCEPT
SELECT Name FROM table2;

These are useful for comparing datasets.


Conclusion

In Edition 9, we learned to:

  • Use PROC SQL to query and manipulate SAS data
  • Filter and sort results using WHERE and ORDER BY
  • Create and delete datasets dynamically
  • Join datasets to enrich and combine information
  • Aggregate and summarize data with SQL functions
  • Use UNION and EXCEPT for dataset comparison

SQL provides a powerful, readable, and flexible way to work with SAS datasets, especially when handling complex transformations and joins.


What’s Next

In Edition 10, we’ll transition from querying data to Controlling DATA Step Processing.


Stay analytical and SQL-savvy with 3 D Statistical Learning.

Special thanks to Dr. Dany Djeudeu for promoting clarity and accessibility in statistical programming.