Chapter 2: Working with SAS Programs

Introduction

In this lesson, you’ll learn how to work with SAS code. First, you’ll learn the main components of SAS programs. Then you’ll learn the syntax rules and formatting guidelines for writing SAS programs. As you work with SAS programs, you’ll add descriptive comments, and identify and correct common syntax errors.

Objectives

  • list the components of a SAS program
  • identify the characteristics of SAS statements
  • define SAS syntax rules
  • document a program using comments
  • identify common syntax errors
  • diagnose and correct syntax errors in a SAS program

Exploring SAS Programs

Let’s investigate the main components of SAS programs. Generally speaking, a SAS program is a sequence of steps that you submit to SAS for execution. Each step in the program performs a specific task.

Only two kinds of steps make up SAS programs:

  • DATA steps and
  • PROC steps.

A SAS program can contain a DATA step, or a PROC step, or any combination of DATA steps and PROC steps. The number and kind of steps depend on what tasks you need to perform. 

A DATA step typically reads data from an input source, processes it, and creates a SAS data set, which is data in a form that SAS understands. So, one of the primary purposes of a DATA step is to create a SAS data set. In addition, you can use a DATA step to create new variables that were not in your original data. In SAS terminology, variables are the columns in your data. 

For example, suppose your raw data file contains the fields Cost Price Per Unit and Quantity Sold. In a DATA step, you can multiply these two fields and assign the result to a new variable named Total_Retail_Price.
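
As a minimal sketch of that calculation, the DATA step below reads two made-up records in-stream rather than from a raw data file. The data set name work.unit_sales, the variable names Cost_Price_Per_Unit and Quantity_Sold, and the sample values are assumptions chosen for illustration; only Total_Retail_Price comes from the text above.

```sas
data work.unit_sales;
   /* Read two numeric fields from the in-stream records below */
   input Cost_Price_Per_Unit Quantity_Sold;
   /* Create a new variable that was not in the input data */
   Total_Retail_Price = Cost_Price_Per_Unit * Quantity_Sold;
   datalines;
12.50 4
3.75 10
;
run;
```

The resulting data set has three columns: the two fields that were read and the computed total.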

A PROC or procedure step typically processes a SAS data set. Various PROC steps generate reports and graphs, manage data, and sort data. 

One way to use these two steps together is to use a DATA step to create a SAS data set, and then use a PROC step to create a report. Remember, though, that this is just one possible combination of steps in a SAS program.
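
One such combination might look like the following sketch. It assumes the sashelp.class sample table that ships with SAS; the data set name work.teens and the Age cutoff are illustrative only.

```sas
/* DATA step: create a temporary data set from an existing one */
data work.teens;
   set sashelp.class;
   where Age >= 13;
run;

/* PROC step: summarize the data set that the DATA step created */
proc means data=work.teens;
   class Sex;
   var Height;
run;
```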

Your SAS programs might perform other tasks. Now let’s learn more about what makes up a SAS step.

SAS Programming Steps

A SAS program consists of a sequence of steps, and a step consists of a sequence of statements. Every step has a beginning and an ending boundary. These are called step boundaries. SAS compiles and executes each step independently based on the step boundaries.

A DATA step begins with a DATA statement, and a PROC step begins with a PROC statement. SAS detects the end of a step when it encounters one of the following: a RUN statement for most steps, a QUIT statement for some procedures, or the beginning of another step. Occasionally, a user might omit a RUN or QUIT statement, and the step will end implicitly when the next step begins. It is a best practice to include a RUN or QUIT statement to explicitly end each step in a SAS program. 
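
The sketch below marks each step boundary explicitly. It assumes the sashelp.class sample table that ships with SAS, and it uses PROC SQL as one example of a procedure that ends with QUIT instead of RUN; work.class_copy is a made-up name.

```sas
/* Step 1 begins at DATA and ends at RUN */
data work.class_copy;
   set sashelp.class;
run;

/* Step 2 begins at PROC and ends at RUN */
proc print data=work.class_copy;
run;

/* Step 3 begins at PROC and ends at QUIT */
proc sql;
   select Name, Age
      from work.class_copy;
quit;
```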

Take a look at this program.
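
Judging by the description that follows, the program in question is presumably the same one that is submitted in the demonstration later in this section. It is repeated here so that the line references can be followed, and it assumes that the orion library is already assigned.

```sas
data work.newsalesemps;
   set orion.sales;
   where Country='AU';
run;

title 'New Sales Employees';

proc print data=work.newsalesemps;
run;

proc means data=work.newsalesemps;
   class Job_Title;
   var Salary;
run;

title;
```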

Can you tell how many steps it contains?

This program contains three steps: one DATA step and two PROC steps.

In the first line of code, the DATA step creates a temporary SAS data set named work.newsalesemps by reading the orion.sales data set. In line 8, the PROC PRINT step creates a list report of the work.newsalesemps data set. In line 11, the PROC MEANS step creates a summary report of work.newsalesemps with statistics for the variable Salary for each value of Job_Title.

In addition to DATA and PROC steps, this SAS program also contains global statements. These statements can lie outside DATA and PROC steps, and they can affect more than one step. For example, the first TITLE statement, located before the PROC statements, specifies a title that appears on both reports. The second TITLE statement, located at the end of the SAS program, turns off all titles for subsequent output.

You’ll learn several global statements in this course.

Submitting a SAS Program

In this demonstration, you submit a program and examine the log and results.

  1. Copy and paste the following program into the editor.
data work.newsalesemps;
      set orion.sales;
      where Country='AU';
run;

title 'New Sales Employees';

proc print data=work.newsalesemps;
run;

proc means data=work.newsalesemps;
      class Job_Title;
      var Salary;
run;

title;
  2. Submit the code and check the log. It’s a good programming practice to first check the log, even if the program appears to produce results. You want to ensure that the code ran successfully before you look at any reports SAS created. Notice that SAS processed the code without warnings or errors.
  3. View the results. The first report is the PROC PRINT report. Recall that this type of report simply lists your data. You can see columns for the various variables and all of their values. Notice that the title you specified appears at the top of the report.

The next report is the PROC MEANS report. The MEANS procedure provides data summarization tools to compute descriptive statistics on your data, and displays output by default. Here, SAS calculated statistics for the analysis variable Salary.

Business Scenario

Orion Star management encourages its programmers to write well-formatted, clearly documented SAS programs. So, you need to know the syntax rules and recommended structure for SAS programming statements, as well as how to use comments in your SAS programs.

Characteristics of SAS Programs

SAS statements usually begin with an identifying keyword, and they always end with a semicolon. Keywords identify the type of statement, and semicolons end the statement. For example, in the following SAS program, the second statement is a SET statement, and the fourth statement is a RUN statement.

data work.newsalesemps;
      set orion.sales;
      where Country='AU';
run;

proc print data=work.newsalesemps;
run;

proc means data=work.newsalesemps;
      class Job_Title;
      var Salary;
run;

Take a look at this program. Can you tell how many statements make up this DATA step?
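
Judging by the answer that follows, the DATA step in question is presumably the one used in the syntax-error demonstration later in this chapter. The sketch below repeats it; it assumes that the macro variable path points to a folder that contains newemps.csv.

```sas
data work.newsalesemps;
   length First_Name $ 12
          Last_Name $ 18
          Job_Title $ 25;
   infile "&path/newemps.csv" dlm=',';
   input First_Name $
         Last_Name $
         Job_Title $
         Salary;
run;
```

Counting semicolons is a quick way to count statements: a statement that spans several lines, such as LENGTH or INPUT here, still ends with a single semicolon.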

This step contains five statements: a DATA statement, a LENGTH statement, an INFILE statement, an INPUT statement, and a RUN statement. Each statement has an identifying keyword and ends in a semicolon.

SAS Program Structure

In the following program, the statements are pretty easy to read.

data work.newsalesemps;   
      set orion.sales;   
      where Country='AU';
run; 

proc print data=work.newsalesemps;
run; 

proc means data=work.newsalesemps;   
      class Job_Title;   
      var Salary;
run; 

The DATA, PROC, and RUN statements begin in column one, and the other statements are indented. Each statement begins on a new line, and a blank line separates each step. Using conventional formatting (that is, structured, consistent spacing) makes a SAS program easy to read. 

However, SAS statements are free format. In other words, they can begin and end anywhere.

In SAS, you can have as much or as little white space as you want. You can begin or end a statement in any column and span multiple lines. You can also place multiple statements on one line, and unquoted values can be lowercase, uppercase, or mixed case. 

The following program takes advantage of the free-format style that SAS permits, but at a cost of being difficult to read:

  data work.newsalesemps;
         set orion.sales;
         where Country='AU';
   run;    

   proc print data=work.newsalesemps;
   run;

   proc means  data=work.newsalesemps;
            class  Job_Title;
            var Salary;
   run;   

Remember the old saying: “Just because you can do something, doesn’t mean that you should.” Again, in this program, the SAS syntax rules have been followed, but this unconventional formatting might be especially difficult for other programmers to read. 

Using conventional formatting can take the guesswork out of your programs, so it’s recommended that you use a conventional programming style. SAS Enterprise Guide and other SAS environments can also format code automatically.

Using SAS Comments

In addition to using conventional formatting, another way to make your program easier for others to follow is to add comments to the program. A comment is text in your program that SAS ignores during processing but writes to the SAS log.

You can use comments anywhere in a SAS program to document the purpose of the program, explain segments of the program, or mark SAS code as non-executing text. Using comments to mark SAS code as non-executing text is also called commenting out code. 

Comments can also help you test your SAS programs in stages. By commenting out your error-free code, you can use comments to submit only the steps that you’re testing. When your entire program is error-free, you can remove the comment symbols without damaging the SAS program. 

Types of Comments 

Let’s take a closer look at comments. In SAS, you can create comments in two ways. The first method is called a block comment. It begins with a forward slash and an asterisk (/*), followed by your comment text, and ends with an asterisk and a forward slash (*/).

These comments can be any length, and can contain semicolons. They cannot be nested. You should avoid placing block comment symbols in the first or second columns. In some operating environments, SAS might interpret block comment symbols in columns 1 and 2 as a request to end the SAS job or session. 
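
For instance, a block comment might look like the following sketch. The data set names are placeholders that assume the sashelp.class sample table, and the first comment is indented to keep its symbols out of columns 1 and 2.

```sas
   /* This block comment spans more than one line
      and contains a semicolon; SAS ignores all of it. */
data work.class_copy;
   set sashelp.class;   /* a block comment can also follow a statement */
run;
```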

The second method is called a comment statement. It begins with an asterisk, followed by the comment text, and ends with a semicolon. 
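
A sketch of both uses follows: the first comment statement documents the program, and the second comments out the WHERE statement so that it doesn't execute. The data set names and the Age condition are illustrative and assume the sashelp.class sample table.

```sas
*This program copies sashelp.class and prints the copy.;
data work.class_copy;
   set sashelp.class;
   *where Age >= 13;
run;

proc print data=work.class_copy;
run;
```

Removing the asterisk in front of WHERE turns the statement back into executable code.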

Comment statements can begin in columns 1 and 2. To comment out a statement in one of these steps, you simply add an asterisk to the beginning of the statement, as shown above in the WHERE statement. Comments in this form are complete statements, and they can’t contain internal semicolons.

Adding Comments to Your SAS Programs

In this demonstration, you add comments to a program to make sure that another programmer understands it.

  1. Copy and paste the following code into the editor.
data work.newsalesemps;   
      set orion.sales;   
      where Country='US';
run; 

proc print data=work.newsalesemps;
run; 

proc means data=work.newsalesemps;   
      class Gender;   
      var Salary;
run;

  2. At the beginning of the program, add a comment statement stating that you’re using orion.sales to create work.newsalesemps.

*This program uses the data set orion.sales to create work.newsalesemps.;

  3. In the PROC MEANS step, add a block comment stating that the variable Salary is numeric. Place the comment immediately following the variable name. Remember that SAS ignores any text between the comment symbols.

proc means data=work.newsalesemps;   
      class Gender;  
      var Salary/*numeric variable*/;
run;

  4. Next, comment out the PROC PRINT step so that it doesn’t run when you submit the code.

 /*proc print data=work.newsalesemps;
run;*/

  5. Submit this code and examine the log. Notice that SAS didn’t process the portions of code that were commented out. You can see that the data set was created and that the PROC MEANS step created output, but the PROC PRINT step that was commented out produced no messages or output.

Diagnosing and Correcting Syntax Errors

Business Scenario

As an Orion Star programmer, you work with a lot of code…some that’s yours and some that’s not. You need to be able to diagnose and correct syntax errors in any of these SAS programs.

What Is a Syntax Error?

Syntax errors occur when program statements do not conform to the rules of the SAS language. Some common syntax errors are misspelled keywords, missing semicolons, and invalid options. The editor uses the color red to indicate a potential error in your SAS code. For example, when the keyword DATA in the first line of a program is misspelled as DAAT, the editor displays the misspelled word in red.

This misspelling affects the statements that follow it. Although the remaining statements in the DATA step are syntactically correct, they are only permitted in a DATA step. Because of the misspelled keyword, the editor doesn’t recognize this as a DATA step, so it also displays the other statements in the step in red.

SAS finds syntax errors during the compilation phase, before it executes the program. So, when you submit a SAS program, SAS scans each statement for syntax errors. If no errors are found, SAS executes the step when it reaches the step boundary. Then SAS goes to the next step and repeats the process. 

When SAS encounters a syntax error, it writes the following to the SAS log: the word ERROR or WARNING, the location of the error, and an explanation of the error. SAS continues the syntax scan until it reaches the step boundary, but the step doesn’t execute if errors are found.

Then SAS continues scanning the rest of the program, and reports any additional errors as needed. When you check the log, as all good SAS programmers do, and find a warning or error message, you need to correct your code.

Viewing and Correcting Syntax Errors

In this demonstration, you diagnose and correct syntax errors in your program.

Copy and paste the following program into the editor. As you know, the DATA step keyword is misspelled. Also, the semicolon is missing from the PROC PRINT statement, and the PROC MEANS step includes an option that is not valid. As you can see, SAS color-codes the program to indicate the errors.

daat work.newsalesemps;   
      length First_Name $ 12            
             Last_Name $ 18 
             Job_Title $ 25;   
      infile "&path/newemps.csv" dlm=',' ;      
      input First_Name $ 
            Last_Name $
            Job_Title $ 
            Salary;
run; 

proc print data=work.newsalesemps
run; 

proc means data=work.newsalesemps average max;
     class Job_Title;
     var Salary;
run;

Submit the program and check the log. You should always check the log to make sure that the program ran successfully, even if output is generated.

Notice that there is a WARNING message and the word DAAT is underlined. In this case, SAS resolved the issue by assuming that DAAT was simply DATA misspelled. A warning means that SAS was able to perform the action, so here SAS processed the DATA step. But this is a rare situation, as SAS might not always be able to interpret your misspelled words.

Next, notice that the RUN statement is underlined. In this case, the previous line is missing the semicolon. The message ‘Syntax error, expecting one of the following…’ indicates that something was missing. Consider how SAS processed this step. SAS started with the PROC PRINT statement and kept going until it reached the semicolon at the end of the RUN statement. So, SAS thought that the PROC PRINT and the RUN statements were all one statement. SAS interpreted RUN as an option for PROC PRINT and printed an error message about an invalid option. Notice that SAS did list the semicolon as one of the expected options.

You might be thinking, “Why did SAS report an error in the RUN statement? There’s nothing wrong with the RUN statement.” When you encounter this type of error, always check the statement before the underlined statement. In many cases, you will find that the statement before the error is missing a semicolon.

Now look at the next error message. SAS did not recognize the word AVERAGE as a valid option in the PROC MEANS statement, so the PROC MEANS step didn’t execute. Notice that SAS lists the valid options. The word MEAN is listed as a valid option and should be used to calculate an average. 

In the editor, correct the program. First, correct the spelling of DATA, and then add a semicolon to the end of the PROC PRINT statement. Lastly, change the word AVERAGE to MEAN in the PROC MEANS statement.

data work.newsalesemps;
      length First_Name $ 12
             Last_Name $ 18
             Job_Title $ 25;
      infile "&path/newemps.csv" dlm=',';
      input First_Name $
            Last_Name $
            Job_Title $
            Salary;
run;

proc print data=work.newsalesemps;
run;

proc means data=work.newsalesemps mean max;
      class Job_Title;
      var Salary;
run;

Submit the revised code and check the log. The log shows that the code ran successfully. No errors or warnings appear. Also, SAS produced the reports you requested. As demonstrated, you can easily view and correct syntax errors in SAS.

Business Scenario

Another common mistake that programmers make is leaving off a matching quotation mark. For example, suppose you write a program that creates a data set and generates two reports. You submit the program, but it doesn’t produce results. The program might have unbalanced quotation marks.

Unbalanced Quotation Marks

In SAS, a quotation counter keeps count of the quotation marks in your code. SAS expects an even number, or matching number, of quotation marks.

If SAS detects an uneven number of quotation marks, the code won’t execute properly. Also, although SAS allows either single or double quotation marks, you can’t mix the types.

If you begin with a single quotation mark, you must end with a single quotation mark; otherwise, SAS considers the quotation marks unbalanced. When your program contains unbalanced quotation marks, whether from an uneven number or mismatched quotation marks, SAS misreads both the statement containing the error and any following statements.
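
The sketch below shows balanced quotation marks of each type. Apart from the first title, which comes from the earlier program, the title text is made up for illustration, and the step assumes the sashelp.class sample table.

```sas
title  'New Sales Employees';      /* opens and closes with single quotation marks */
title2 "Listing of Sample Data";   /* opens and closes with double quotation marks */
title3 "Orion Star's Sample Data"; /* an apostrophe is fine inside double quotation marks */

proc print data=sashelp.class(obs=3);
run;

title;   /* clear all titles */
```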

You should notice that there’s a problem because much of the program will be colored purple in the editor. Purple represents a quoted string.

In this example, the string begins with a single quotation mark followed by a comma, a semicolon, and then all the remaining statements in the program. Because the string does not contain a matching or ending quotation mark, SAS reads all of this text as a quoted string.
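
The statement described here probably resembled the INFILE statement shown inside the block comment in the sketch below. It is kept in a block comment so that it can't run (block comments can safely contain an unmatched quotation mark). The DATA step underneath shows the balanced form; it reads made-up in-stream records, and the data set name and sample values are assumptions.

```sas
   /* Unbalanced: the closing quotation mark after the comma is missing,
      so the semicolon and everything after it become part of the string.
         infile datalines dlm=',;
   */

data work.newemps_demo;
   length First_Name $ 12 Last_Name $ 18;
   /* Balanced: the delimiter string opens and closes with a single quotation mark */
   infile datalines dlm=',';
   input First_Name $ Last_Name $ Salary;
   datalines;
Ann,Lee,30000
Raj,Patel,32500
;
run;
```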

When you submit a program with unbalanced quotation marks in the SAS windowing environment, the program doesn’t stop running, and the log includes only the code you submitted. You won’t see any error or warning messages, nor will you see any indication that any of the steps executed.

You’ll also see a message in the banner of the editor stating that the step is still running. You have to stop the executing program by canceling the submitted statements. You can then correct your program by adding the missing quotation mark.

When you submit a program with unbalanced quotation marks in client applications such as SAS Enterprise Guide and SAS Studio, SAS writes messages to the log to alert you to the error.

A warning in the SAS log stating that a quoted string has become too long, or that a statement containing quotation marks is ambiguous, sometimes indicates unbalanced quotation marks. In fact, any log message about a quoted string should alert you to the possibility of unbalanced quotation marks.

In client applications, SAS submits additional code, or wrapper code, including a single and double quotation mark. SAS is attempting to repair any potential unbalanced quotes in a submitted program. The wrapper code balances quotation marks and the code stops running, but your results will still contain errors and you must correct the program. To do this, you either add the missing quotation mark, or match the quotation mark, and then resubmit the program.
