Getting Started with SAS/Part 2


Examine and Display a Variable (Column of Data)
When you prepare data for analysis in SAS, you often do more than simply type in numbers or words. The data in a dataset are organized for efficient processing by the SAS System, and for the user's convenience additional labels or formatting are also stored with the data. To manipulate the data appropriately, it is often necessary to examine the precise technical description of a dataset's variables (to see, for instance, how many decimals can be displayed, or how long a word will fit into a character string variable). Note: When its length is not explicitly set, a character string variable will sometimes truncate textual data to only the first 8 characters of the text. Also, a floating point number with several digits after the decimal point may appear in very different-looking forms depending on a variable's format; e.g. the value 0.0078126 can appear as 0, as $ 0.01, as '<.01', or as 0.007813 with the appropriate formats.
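As a minimal sketch of how one stored value can be displayed in different forms, the step below writes the same number to the log with four formats. The particular formats and widths (1., dollar6.2, pvalue6.2, 8.6) are chosen here for illustration and are assumptions, not necessarily the ones the author had in mind.

```sas
/* Write one value to the log under several formats.        */
/* The formats and widths below are illustrative choices.   */
data _null_;
   x = 0.0078126;
   put x 1.;         /* w.d format with no decimal places   */
   put x dollar6.2;  /* currency format with two decimals   */
   put x pvalue6.2;  /* p-value format with two decimals    */
   put x 8.6;        /* w.d format with six decimal places  */
run;
```

The stored value of x is the same in every case; only its displayed form changes.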

There are two general methods for looking at the data itself and for understanding the technical attributes (characteristics) of the variables that hold the data. Data can be examined by using the Explorer tool or by running short SAS programs containing procedure steps such as proc print or proc contents.

Using the Explorer to View Data
Datasets in the WORK or any other SAS library are viewed as follows:

Step 1: Activate the Explorer window by clicking on the Explorer tab (lower left) if the Results window obscures it;



The main toolbar has different buttons, and certain commands such as Save and Open have different effects, when the Explorer rather than the Editor window is active. Note: If an Explorer window is not open, choose View and then Explorer from the main menu to open one.

Step 2: Double click on the Libraries Icon;



When you first start your SAS session in Windows, only a few libraries will show in the Explorer window, as illustrated below. (We can ignore the Sashelp and Gismaps libraries for now.) The WORK library, which is where all temporary datasets you create are stored by default, will be empty when you first start SAS. If you have run the sample program, however, the WORK area will contain a single dataset called Htwt (see the picture under Step 3 below).



Step 2a: (optional if you are viewing a dataset in the WORK library) In the Explorer window, right click on any blank space and choose New to define a new library. Fill in a name for the library (any name of up to eight characters, such as 'localdat') and type in (or Browse to) the folder location for your dataset (e.g. c:\data);





Note: Libraries and the LIBNAME statement are discussed further in the Saving your work section of this document. Alternatively, when the Explorer window is active, you can choose File and then New from the main menu, or click on the New Library button of the toolbar, to open a New Library dialog window.



Step 2b: (still optional if examining the scratch WORK library area) Click OK to create the new library (i.e. to make localdat point to some folder on your machine).

Step 3: Double click on the newly created library icon (or the WORK library icon). All SAS datasets (files that end with .sas7bdat) that exist in the library folder will now appear in the Explorer window. If you've run the simple program, the HTWT dataset icon will appear as follows.



Step 4: Double click on the dataset icon. The ViewTable window will show the data.



Buttons for sorting the data, for Column attributes and table attributes (number of rows etc.) appear above the data.

Step 4a: You can examine or change the labels, names or formatting characteristics of each variable after double clicking its column header. For example, if you double click the Sex variable column you'll see that it is a character string (it's non-numeric) and one character in length (as signified by $1.):



In the ViewTable window, when a variable has been given a label, it is the label (not the actual variable name) that heads the column of data. Labels are often longer, more meaningful names for the columns of data.
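Labels can also be assigned in a program rather than through the ViewTable dialog. The following is a minimal sketch, assuming the HTWT dataset from the sample program is in the WORK library; the label text is hypothetical.

```sas
/* Attach a label to the Sex variable of WORK.HTWT          */
/* without rewriting the data. The label text is made up.   */
proc datasets library=work nolist;
   modify htwt;
   label sex = 'Sex of subject';
run;
quit;
```

After this runs, reopening the dataset in ViewTable should show the label at the head of the Sex column.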

Using PROC Steps to View Data
Step 1: (optional) Generate a libname which will be used to refer to the folder that contains the dataset you'd like to examine (if it's different from the WORK library).

(e.g. Type and run the statement: libname Localdat 'c:\data';)

Step 2: View the variable names, attributes (type and length), and labels by running a proc contents step. (The example for the sample program follows.)

proc contents data=htwt; run;

If your HTWT dataset were saved permanently in a folder (other than the scratch WORK library) and you've run the libname statement above, you would type:

proc contents data=localdat.htwt; run;

The output below shows: technical information about where and when the dataset was created, the count of observations (cases or rows), that the variables Name and Sex contain text information (they are both of Type Char, meaning “character”), etc.

Step 3: If a dataset is not too large (<50 cases), a simple proc print step that names only the dataset will print all variables and all cases of the htwt dataset.
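A minimal sketch of such a step, assuming the HTWT dataset from the sample program is in the WORK library:

```sas
/* Print every variable and every observation of WORK.HTWT */
proc print data=htwt;
run;
```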

Step 3a: (optional) It is common to limit the number of cases printed (especially for large datasets) by using the obs= dataset option, and to name explicitly the variables to be printed with a var statement. For example, one can print only the Name and Age information for the first 5 observations (rows) in the dataset.
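One way to write this, as a sketch that assumes the HTWT dataset contains variables called Name and Age:

```sas
/* Print only Name and Age for the first 5 observations */
proc print data=htwt (obs=5);
   var name age;
run;
```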

OR … To examine cases near the end or in the middle of a dataset, use a combination of the obs and the firstobs options. For example, combining firstobs=10 with obs=15 shows six rows of data beginning from row 10, because obs= names the last row to process rather than a count of rows. By default, when there is no var statement, all variables (columns) in the dataset are printed.
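A sketch of the firstobs and obs combination, again assuming the HTWT dataset in the WORK library (and that it has at least 15 rows; with fewer, printing simply stops at the last row):

```sas
/* Print rows 10 through 15 (six rows) with all variables. */
/* obs=15 is the number of the LAST row, not a row count.  */
proc print data=htwt (firstobs=10 obs=15);
run;
```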



Words, Character Strings, Non-numeric Data
Entering textual data is not much different from entering numeric data. An illustration of reading textual data is provided by the example SAS program above. On an input statement, instead of listing just the variable name, as you would for numeric variables, character variables (like Name) are followed by a dollar sign ($). Special attention to the details of the input statement and the precise alignment of columns is needed when the names may contain spaces. In the introductory example above, the input could have been simplified to:

The following example illustrates a slightly more complex textual data situation:
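A minimal sketch of one such situation, assuming hypothetical data in which the names contain spaces. Column input is used, so each value must sit in exactly the columns named on the input statement; the dataset name, variable names and values are all made up for illustration.

```sas
/* Column input: Name occupies columns 1-12, Sex column 14, */
/* Age columns 16-17. Embedded spaces in Name are kept      */
/* because the columns, not the spaces, separate the fields.*/
data people;
   input name $ 1-12 sex $ 14 age 16-17;
   datalines;
Mary Ann Lee F 34
Bob Jones    M 41
;
run;

proc print data=people;
run;
```

The data lines start in column 1; indenting them would shift every value out of its expected columns.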



Reading Data From Files
Data that you encounter may have been saved in a variety of formats. The most common formats that SAS can read are: SAS datasets saved as such (.sas7bdat files), text files (.txt or .csv), SAS datasets saved as transport files, and Excel spreadsheets (.xls files).

To Read a SAS Dataset
You first define which library (a folder location designated by a LIBNAME statement) contains the dataset; you may then use the dataset directly in a PROC step or in a SET statement within a data step. An example below prints the first ten observations of a dataset which has been saved earlier.
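As a sketch, assuming the HTWT dataset was saved earlier in the folder c:\data and reusing the localdat library name from the earlier section:

```sas
/* Point the library name at the folder holding the dataset */
libname localdat 'c:\data';

/* Print the first ten observations of the saved dataset */
proc print data=localdat.htwt (obs=10);
run;
```

The same two-level name (localdat.htwt) could equally be used on a SET statement inside a data step.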



To Read a Text File
Reading data often involves a data step. To put text files into SAS, it is helpful to understand the INPUT and the INFILE statements (used in a data step, i.e. between the Data and Run; statements).

Specifically, when a text file has variables that are delimited consistently (perhaps each value is separated from the next by a space or by a comma, both common 'delimiters'), the following code illustrates reading the data in the file and defining a SAS dataset from it. Note that the columns won't always line up, but the variables on a line WILL always be separated in a consistent manner.
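A minimal sketch of reading comma-delimited data. To keep the example self-contained the data lines are placed in the program itself (infile datalines); for a real file, the INFILE statement would name the file path instead. The dataset name, variable names and values are hypothetical.

```sas
/* Read two numeric variables separated by commas.          */
/* Values need not line up in columns, only be separated    */
/* consistently by the delimiter.                           */
data scores;
   infile datalines delimiter=',';
   input VarA VarB;
   datalines;
1,20
15,3
7,100
;
run;

proc print data=scores;
run;
```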



See also, '''input, infile, truncover…'''

Sometimes, to save space (though such formats are murder on a human eye trying to read the data), data are placed in a text file using a fixed format and fixed width, which you'd use when the data for certain variables are found in specific columns and are not necessarily separated by spaces or other means. Such data are well illustrated by the htwt data in the Simple Program Example above.

To read data in fixed format
use code such as:
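One possible form, as a minimal sketch: the layout below (Name in columns 1-10, Sex in column 11, Age in columns 12-13) is hypothetical and is not the layout of the htwt example. Column pointers (@n) and informats tell SAS where each field starts and how wide it is, so no separator is needed between fields.

```sas
/* Formatted input for fixed-width data with no delimiters. */
/* @n moves to column n; $10., $1. and 2. give the widths.  */
data htwt_fixed;
   input @1 name $10. @11 sex $1. @12 age 2.;
   datalines;
Alfred    M14
Alice     F13
;
run;

proc print data=htwt_fixed;
run;
```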



'''Working with .csv files''' is very much like working with simple delimited data. CSV stands for Comma Separated Values. If you know that the file will consistently have a certain number of variables whose names you know, it is easy enough to write a data step directly that will read the data. Use the following example as a guide. Here let's assume that the commadata file (from page 7) does not have a first line that names the variables.

data scores; infile "C:\Temp\commadata.csv" delimiter="," missover; input VarA VarB; run;

In addition, the following point and click procedure will allow you to import a standard file format such as .CSV, just as it does for Excel files.

Suppose you have already created an Excel file and saved it as “commadata.csv”, using the data set commadata as follows:



Step 1: Click the File menu and select Import Data … to start the Import Wizard.



Step 2: Choose Standard data source, then select Comma Separated values from the list of datasource types.



Step 3: Browse to the location of the file you want to import. Click Options to open the options window.





Step 4: In Choose the SAS destination, choose the library and fill in the name of the Member (another word for data set).



Step 5: The Import Wizard can create a file containing PROC IMPORT statements that can be used in SAS programs to import this data again. If you want these statements to be generated, enter the filename where they should be saved. Otherwise, click on Finish to import the file. A dataset named Scores will then be in your WORK library, where it can be opened in a ViewTable window.
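For orientation, a PROC IMPORT step for this file might look roughly like the sketch below. It is written by hand under the assumptions used earlier (the file is C:\Temp\commadata.csv, it has no header line, and the result goes to WORK.Scores); the statements the wizard actually generates may differ in detail.

```sas
/* Import a CSV file with no header row into WORK.SCORES.   */
/* With getnames=no, SAS names the variables VAR1, VAR2, ...*/
proc import datafile="C:\Temp\commadata.csv"
            out=work.scores
            dbms=csv
            replace;
   getnames=no;
run;
```

The replace option overwrites an existing WORK.Scores dataset; omit it if that is not wanted.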



Note: If the file is open in Excel, you need to close it before you import it into SAS. Otherwise an ERROR message will appear in the Log window, telling you that the file is in use and the import was cancelled.

Other Options
We can use the firstobs option on the INFILE statement to skip arbitrary text (such as titles) at the top of the data file. Other INFILE and INPUT options can help us handle files whose data spread over multiple lines.

data scores; infile "C:\Temp\commadata.csv" delimiter="," firstobs=2 missover; input VarA VarB; run;

Step 4: Make sure comma is selected as the delimiter

Note: It is fairly easy to convert tables of data saved in a Word document or found on the web into raw text formats; however, neither .htm(l) nor .doc files can be read directly by SAS. Some intermediate steps, in addition to choosing Save As … Text only in Word, may be necessary to prepare data to be read by SAS. The handling of dates and times also requires special care.

Managing output
You may have noticed that SAS has five basic windows: the Results and Explorer windows, and three programming windows (Program Editor, Log, and Output). The Results window is like a table of contents for your Output window; the results tree lists each part of your results in outline form. The Explorer window gives you easy access to your SAS files and libraries.

Using the Results Window to View Results
Suppose you have run the simple program mentioned above and you want to read its results:

Step 1: Activate the Results window by clicking on the Results tab (lower left) if another window obscures it;



Step 2: Double click on “Print: The SAS System” to expand it;



Step 3: Then double click “Data Set SASLIB.HTWT” to open this result file;



You can open all the result files with the same method.

Using the Results Window to Print Results
Step 1: Open a result file you want to print using the method above, for example, “Data Set SASLIB.HTWT”;

Step 2: Click on the File menu and select Print.



Note: Suppose there are three result files in “Corr: The SAS System” and you want to print all of them. Click on “Corr: The SAS System”, then click on the File menu and select Print.

Redoing a program submission / run from scratch
If you want to redo a program submission / run from scratch, follow this procedure:

Step 1: Click on Results;

Step 2: Click on Edit, select Clear All to clear all existing results;

Step 3: Activate Program Editor which contains the program you want to re-run;

Step 4: Submit your program again.

How to clear your log window
If you want to clear your existing Log window, follow this procedure:

Step 1: Activate Log window;

Step 2: Click on Edit, select Clear All to clear all existing information in Log window.

HTML
The SAS System makes it easy for you to create output as HTML files that can be displayed with a web browser such as Netscape or Internet Explorer.
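A minimal sketch of sending procedure output to an HTML file with the Output Delivery System (ODS). The output path is hypothetical, and the HTWT dataset from the sample program is assumed to be in the WORK library.

```sas
/* Open an HTML destination, run a procedure, then close    */
/* the destination so the file is completely written.       */
ods html file="C:\Temp\htwt.html";

proc print data=htwt;
run;

ods html close;
```

The resulting file can then be opened in a web browser.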



Here is the output:


