Introduction

Welcome to Edition 12 of Making SAS Accessible to Everyone. In the previous edition, we explored techniques for summarizing data using RETAIN, BY-group processing, and subsetting. In this edition, we delve into one of the most powerful and flexible tools in the SAS programming language: functions.

SAS functions allow users to perform data cleaning, transformation, and enhancement operations efficiently. These functions are the foundation of data wrangling, which is an essential part of any analytics pipeline. Whether you are standardizing date formats, cleaning up strings, or performing complex mathematical operations, SAS provides a rich library of built-in functions to get the job done.

This edition introduces:

  • Numeric and date/time functions for calculating values and intervals.

  • Character functions for string manipulation.

  • Type conversion functions for switching between numeric and character values.

  • CALL routines for side-effect operations that don’t return values.


1. Understanding SAS Functions and CALL Routines

SAS functions and CALL routines differ primarily in how they operate:

  • A function takes inputs (arguments) and returns a single value.

  • A CALL routine performs a task or action but does not return a value. Instead, it affects variables directly.

Syntax:

new_variable = function(arg1, arg2, ...);
CALL routine(arg1, arg2, ...);

Functions and routines can be used in DATA steps, PROC SQL, and macro expressions.


2. Numeric and Date Functions

These are essential for quantitative calculations and time-based analysis.

Generating Random Values

RAND('INTEGER', lower, upper)

Generates a uniformly distributed random integer between two bounds. Useful for sampling and simulations.

Finding Extremes

LARGEST(k, val1, val2, ...)
SMALLEST(k, val1, val2, ...)

Used to find the k-th largest or smallest value among several.

Rounding Numbers

ROUND(value, round_to)
CEIL(value)
FLOOR(value)
INT(value)

These help format or normalize numeric data:

  • ROUND(27.34, 0.1) → 27.3

  • CEIL(3.1) → 4

  • FLOOR(3.9) → 3

  • INT(3.9) → 3 (truncates decimals)

Date and Time Extraction

DATEPART(datetime)
TIMEPART(datetime)

Split datetime into separate date and time components for analysis or formatting.

Calculating Intervals Between Dates

INTCK('interval', start_date, end_date, <'method'>)

Counts intervals (e.g., months or years) between two dates.

Shifting Dates

INTNX('interval', start_date, increment, <'alignment'>)

Calculates a new date that is a set number of intervals away from the starting date. Great for rolling periods and forecasts.


3. Character Functions

SAS offers powerful tools for handling text-based data.

Cleaning Text

COMPBL(string)          * Reduce multiple spaces to one;
COMPRESS(string, chars) * Remove specified characters;
STRIP(string)           * Remove leading and trailing spaces;

Parsing and Case Transformation

SCAN(string, n, 'delims')  * Extracts the n-th word based on delimiters;
PROPCASE(string, 'delims') * Converts to Proper Case (Title Case);

Searching Strings

FIND(string, substr, 'mod')
INDEX(string, substr)
LENGTH(string)

These help locate substrings or measure string lengths.

Pattern Recognition

ANYDIGIT(string), ANYALPHA(string), ANYPUNCT(string)

Return the position of the first occurrence of a digit, letter, or punctuation.

String Replacement and Construction

TRANWRD(source, from, to)    * Replace all occurrences;
CAT(string1, ..., stringn)   * Concatenate without trimming;
CATS(...)                    * Trim + concatenate;
CATX('delim', ...)           * Concatenate with delimiter;

Use these to build IDs, merge names, or standardize formats.


4. Type Conversion Functions

Converting Character to Numeric

numeric_var = INPUT(char_var, informat.);

Safely converts string numbers into numeric form.

Converting Numeric to Character

char_var = PUT(numeric_var, format.);

Useful for labeling outputs or formatting numbers as strings.

Conversion Example

DATA convert;
    char_val = "200";
    num_val = INPUT(char_val, 8.);
    label_val = PUT(num_val, 6.1);
RUN;

Avoids common issues associated with implicit conversions.


5. Use Case: Standardizing Customer Identifiers

DATA customers_clean;
    SET customers_raw;
    CleanName = PROPCASE(COMPRESS(Name, '0123456789'), ' ');
    CustomerID = CATX('-', STRIP(City), STRIP(ID));
RUN;

This standardizes names and constructs unique customer IDs.


Conclusion

In Edition 12, you explored how to:

  • Apply numeric, date/time, and string functions to transform data.

  • Leverage type conversion functions to safely change variable formats.

  • Use string construction and pattern functions for text normalization.

  • Implement INTCK and INTNX for time-based interval operations.

  • Combine multiple functions to achieve real-world data cleaning tasks.

Mastering functions is crucial to transitioning from basic data preparation to robust analytical pipelines in SAS.


What’s Next

In Edition 13, we begin Creating and using custom formats.


Keep transforming intelligently with 3 D Statistical Learning.

Special thanks to Dr. Dany Djeudeu for continuously translating SAS concepts into accessible learning journeys.