Getting Started with SAS/Part 2
Printing and Listing Data
Examine and Display a Variable (Column of Data)
When you prepare data for analysis in SAS, you often do more than simply type in numbers or words. The data in a dataset are organized for efficient processing by the SAS System, and for the user's convenience additional labels and formatting are stored with the data. To manipulate the data appropriately, it is often necessary to examine the precise technical description of a dataset's variables (to see, for instance, how many decimals can be displayed, or how long a word will fit into a character string variable). Note: When its length is not explicitly set, a character string variable will sometimes truncate textual data to only the first 8 characters of the text. Also, a floating point number with several digits after the decimal point may appear in very different-looking forms depending on a variable's format. For example, the value 0.0078126 can appear as 0, as $0.01, as '<.01', or as 0.007813 with the appropriate formats.
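As a minimal sketch of how one stored value can be displayed in different ways, the data step below writes the same number to the log with four formats. The particular formats and widths (1., dollar6.2, pvalue6.2 and 8.6) are assumptions chosen for illustration; they are not taken from the sample program.

```sas
/* Write one value to the log using four different formats.  */
/* The stored value never changes; only its display does.    */
data _null_;
   x = 0.0078126;
   put x 1.;          /* numeric, width 1, no decimals        */
   put x dollar6.2;   /* currency style with two decimals     */
   put x pvalue6.2;   /* p-value style with two decimals      */
   put x 8.6;         /* numeric, width 8, six decimals       */
run;
```

Running this and reading the log is a quick way to check how a format will treat your data before attaching it to a variable.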
There are two general methods for looking at the data itself and for understanding the technical attributes (characteristics) of the variables that hold the data. Data can be examined by using the Explorer tool or by running short SAS programs containing procedure steps such as proc print or proc contents.
Using the Explorer to View Data
Datasets in the WORK or any other SAS library are viewed as follows:
Step 1: Activate the Explorer window by clicking on the Explorer Tab (lower left) if the Results window obscures it.
The main toolbar has different buttons and certain commands such as Save and Open will have different effects when the Explorer instead of the Editor window is active. Note: if an Explorer window is not open, then choosing View and then Explorer from the main menu can open one.
Step 2: Double click on the Libraries Icon;
When you first start your SAS session in Windows there will be only a few libraries showing in the Explorer window as illustrated below. (We can ignore the Sashelp and Gismaps libraries for now.) The WORK library, which is where all temporary datasets you create are stored by default, will be empty when you first start SAS. If you have run the sample program, however, the WORK area will contain a single dataset called Htwt (see the picture under Step 3 below).
Step 2a: (optional if you are viewing a dataset in the WORK library) In the Explorer window, right click on any blank space and choose New to define a new library. Fill in a name for the library (any name of up to eight characters, such as 'localdat') and type in (or Browse to) the folder location for your dataset (e.g. c:\data).
Note: Libraries and the LIBNAME statement are discussed more in the Saving your work section of this document. Alternatively, when the Explorer window is active you can choose File and then New from the main menu or click on the New Library button of the Toolbar to open a New Library Dialog window.
Step 2b: (still optional if examining the scratch WORK library area) Click OK to create the new library (i.e. to make localdat point to some folder on your machine).
Step 3: Double click on the newly created library icon (or the WORK library icon). If you've run the simple program, the HTWT dataset icon will appear. All SAS datasets (files that end with .sas7bdat) that exist in the library folder will now appear in the Explorer window.
Step 4: Double click on
the dataset icon.
The ViewTable window will show the data.
Buttons for sorting the data, for Column attributes and table attributes (number of rows etc.) appear above the data.
Step 4a: You can examine or change the label, name or formatting characteristics of each variable after double clicking its column header.
For example if you
double click the Sex variable column you’ll see that it is a
character string (it’s non-numeric), and one character in
length (as signified by $1.):
In the ViewTable
window, when the variable has been given a label, it is the labels
(not the actual Variable names) that head each column of data. Labels
for variable names are often more meaningful, longer names for the
columns of data.
Using PROC Steps to View Data
Step 1: (optional) Define a libref with a libname statement; it will be used to refer to the folder that contains the dataset you'd like to examine (if it's different from the WORK library).
(e.g. Type and run the statement: libname Localdat 'c:\data';)
Step 2:
View the variable
names, attributes (type and length), and labels by running a
proc contents step:
(The example for the sample program follows)
proc contents data=htwt; run;
If your HTWT dataset
were saved permanently in a folder (other than the scratch WORK
library) and you’ve run the libname statement above you would
type
proc contents data=localdat.htwt; run;
The output below shows: technical information about where and when the dataset was created, the count of observations (cases or rows), that the variables Name and Sex contain text information (they are both of Type Char (meaning “character”)), etc.
Step 3: If a dataset is not too large (fewer than 50 cases), a simple proc print step that names only the dataset (data=htwt) will print all variables and all cases of the htwt dataset.
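A minimal sketch of such a step, assuming the htwt dataset from the sample program is in the WORK library:

```sas
/* Print every variable and every observation of WORK.HTWT */
proc print data=htwt;
run;
```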
Step 3a: (optional) It is common to limit the number of cases printed (especially for large datasets) by using the obs= dataset option, and to explicitly name the variables to be printed on a var statement. For example, setting obs=5 and listing Name and Age on the var statement prints the Name and Age information for the first 5 observations (rows) in the dataset.
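A sketch of this limited listing, again assuming the htwt dataset from the sample program:

```sas
/* Print only Name and Age for the first 5 observations */
proc print data=htwt(obs=5);
   var Name Age;
run;
```

The obs= option is written in parentheses directly after the dataset name because it is a dataset option rather than a proc print statement.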
OR … To examine cases near the end or in the middle of a dataset, use a combination of the obs and the firstobs options. Because obs gives the number of the last row to process, firstobs=10 together with obs=15 shows six rows of data beginning from row 10. By default, when there is no var statement, all variables (columns) in the dataset are printed.
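A sketch of that combination, assuming the htwt dataset has at least 15 rows:

```sas
/* Print rows 10 through 15 (six rows); all variables are shown */
/* because there is no var statement.                           */
proc print data=htwt(firstobs=10 obs=15);
run;
```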
Words, Character Strings, Non-numeric Data
Entering textual data
is not much different from entering Numeric data. An illustration of
reading textual data is provided by the example SAS program above. On
an input statement, instead of listing just the variable name, as you
would for numeric variables, character variables (like Name) are
followed by a dollar sign ($). Special attention to the details of
the Input statement and the precise alignment of columns are needed
when the names may contain spaces. In the introductory example above,
the input could have been simplified to:
The following example
illustrates a slightly more complex textual data situation:
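A minimal sketch of reading names that contain spaces, under stated assumptions: the dataset name people, the data values, and the choice of a 20-character length are all invented for illustration and are not the listing from the sample program. The & modifier tells SAS that a character value may contain single embedded blanks, so each name must be followed by at least two blanks.

```sas
/* Hypothetical example: names with embedded single blanks.      */
/* In the data lines, two blanks separate Name from Sex.         */
data people;
   length Name $ 20;           /* avoid truncation at 8 characters   */
   input Name $ & Sex $ Age;   /* $ marks character variables        */
   datalines;
Mary Ann Smith  F 34
John Doe  M 41
;
run;

proc print data=people;
run;
```

Declaring the length before the input statement is one way to avoid the 8-character truncation mentioned earlier; an informat on the input statement would be another.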
Other Methods of Entering Data
Reading Data From Files
Data that you encounter may have been saved in a variety of formats. The most common formats that SAS can read are: SAS datasets saved as such (.sas7bdat files), text files (.txt or .csv), SAS datasets saved as transport files, and Excel spreadsheets (.xls files).
Special Input Formats
To Read a SAS Dataset
You first define which library (a folder location designated by a LIBNAME statement) contains the dataset; you may then use the dataset directly in a PROC step or on a SET statement within a data step. For example, a proc print step with the obs=10 dataset option prints the first ten observations of a dataset that was saved earlier.
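A sketch of both uses, assuming the htwt dataset was saved earlier in c:\data (the folder used in the libname example above); the dataset name htwt_copy is hypothetical:

```sas
/* Point the libref localdat at the folder holding htwt.sas7bdat */
libname localdat 'c:\data';

/* Use the saved dataset directly in a PROC step */
proc print data=localdat.htwt(obs=10);
run;

/* Or read it into a new temporary dataset with a SET statement */
data htwt_copy;
   set localdat.htwt;
run;
```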
To Read a Text File
Reading most data often involves a data step. To put Text files into SAS it is helpful to understand the INPUT and the INFILE statements (used in a data step, i.e. between the Data and Run; statements).
Specifically, when the variables in a text file are delimited consistently (each value is separated from the next by a space or by a comma, both common delimiters), a data step with an INFILE statement that names the file and its delimiter and an INPUT statement that lists the variables will read the data in the file and define a SAS dataset from it. Note the columns won't always line up, but the variables on a line WILL always be separated in a consistent manner.
See also, input, infile, truncover…
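A minimal sketch of reading a space-delimited file. The file path C:\Temp\spacedata.txt and the variable names are assumptions made for illustration; substitute your own.

```sas
/* Hypothetical file: one record per line, values separated by blanks */
data scores;
   infile "C:\Temp\spacedata.txt" delimiter=" " truncover;
   input Name $ VarA VarB;   /* Name is character, VarA and VarB numeric */
run;
```

The truncover option keeps SAS from moving to the next line when a record has fewer values than the input statement expects; the missing variables are set to missing instead.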
Sometimes to save space
(though such formats are murder to a human eye trying to read the
data), data is placed in a text file using a fixed format and fixed
width (which you’d use if the data for certain variables are
found on specific columns and not necessarily separated by spaces or
other means). Such data is well illustrated by the htwt data in the
Simple Program Example above.
To read data in fixed format, give the column range that each variable occupies on the INPUT statement.
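A sketch of column input under stated assumptions: the file path and the column positions below are hypothetical and must be matched to the actual layout of your file.

```sas
/* Hypothetical layout: Name in columns 1-10, Sex in column 12, */
/* Age in columns 14-15.                                        */
data htwt_fixed;
   infile "C:\Temp\htwt.txt" truncover;
   input Name $ 1-10
         Sex  $ 12
         Age    14-15;
run;
```

Because each variable is located by position, values may contain blanks and need not be separated from their neighbors.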
Working with .csv files is very much like working with simple delimited data. CSV stands for Comma Separated Values. If you know that the file will consistently have a certain number of variables whose names you know, it is easy enough to write a data step directly that will read the data. Use the following example as a guide.
Here let's assume that the commadata file (from page 7) does not have a first line that names the variables.
data scores; infile "C:\Temp\commadata.csv" delimiter="," missover; input VarA VarB; run;
In addition, the following point and click procedure will allow you to import a standard file format such as .CSV as it does for Excel files.
Suppose you have already created an Excel file and saved it as “commadata.csv” by using the data set commadata as follows:
Step 1: Click the File menu, select Import Data … to start Import Wizard.
Step 2: Choose Standard data source, then select Comma Separated Values from the list of data source types.
Step 3: Browse to the location of the file you want to import. Click Options to open the options window.
Step 4: In Choose the SAS destination, choose the library and fill in the name of the Member (another word for data set).
Step 5: The Import Wizard can create a file containing PROC IMPORT statements that can be used in SAS programs to import this data again. If you want these statements to be generated, enter the filename where they should be saved. Otherwise, click on Finish to import the file. A dataset named Scores, which you can open in a VIEWTABLE window, will then be in your WORK library.
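The statements the wizard saves will resemble the following sketch. This is an approximation written by hand, not the wizard's exact output; it assumes the file path used in the data step example and a file with no header row.

```sas
/* Import a comma separated file into WORK.SCORES */
proc import datafile="C:\Temp\commadata.csv"
            out=work.scores
            dbms=csv
            replace;          /* overwrite WORK.SCORES if it exists */
   getnames=no;               /* first line holds data, not names   */
run;
```

With getnames=no, SAS assigns default variable names (VAR1, VAR2, and so on), so a data step with an explicit input statement remains the better choice when you want to control the names yourself.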
Note: If the file is open in Excel, you need to close it before you import it into SAS. Otherwise an ERROR message will appear in the log window, telling you that the file is in use and that the import was cancelled.
Other Options
We can use the firstobs
option on the INFILE statement to handle arbitrary text (such as
titles) at the top of the data file. Other INFILE and INPUT options
can help us handle files that have data that spread over multiple
lines.
infile "C:\Temp\commadata.csv" delimiter="," firstobs=2 missover; input VarA VarB;
Note: In the Import Wizard's options window (opened under Step 3 above), make sure comma is selected as the delimiter.
Note: It is fairly easy to convert tables of data saved in a Word document or found on the web into raw text formats; however, neither .htm(l) nor .doc files can be read directly by SAS. Some intermediate steps in addition to choosing Save As … Text only in Word may be necessary to prepare data to be read by SAS. The handling of dates and times also requires special care.
Managing output
You may have noticed that in SAS there are five basic windows: the Results and the Explorer windows and three programming windows: Program Editor, Log, and Output. The Results window is like a table of contents for your Output window; the results tree lists each part of your results in an outline form. The Explorer window gives you easy access to your SAS files and libraries.
Using the Result Window to View Results
Suppose you have run the simple program mentioned above and you try to read those results:
Step 1: Activate the Results window by clicking on the Results Tab (lower left) if the Explorer window obscures it;
Step 2: Double click on “Print: The SAS System” to expand it;
Step 3: Then double click “Data Set SASLIB.HTWT” to open this result file;
You can open all the result files with the same method.
Using the Result Window to Print Results
Step 1: Open a Result file you want to print using the method above, for example, “Data Set SASLIB.HTWT”;
Step 2: Click on the
File menu and select Print.
Note: Suppose there are three result files in “Corr: The SAS System” and you want to print all of them. Click on “Corr: The SAS System”, then click on the File menu and select Print.
Redoing a program submission / run from scratch
If you want to redo a
program submission / run from scratch, then you can follow these
procedures:
Step 1: Click on Results;
Step 2: Click on Edit, select Clear All to clear all existing results;
Step 3: Activate Program Editor which contains the program you want to re-run;
Step 4: Submit your program again.
How to clear your log window
If you want to clear
your existing log window, then you can follow these procedures:
Step 1: Activate Log window;
Step 2: Click on Edit,
select Clear All to clear all existing information in Log
window.
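The same clearing can also be done from within a program. The sketch below uses the DM statement to send display manager commands; it assumes you are running SAS interactively in the windowing environment, since these commands have no effect in batch mode.

```sas
/* Clear the Log window, then the Output window */
dm 'log; clear;';
dm 'output; clear;';
```

Placing these lines at the top of a program gives you a clean log each time you resubmit it.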
HTML
The SAS system makes it
easy for you to create output as HTML files that can be displayed
with a web browser such as Netscape or Internet Explorer.
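A minimal sketch using the Output Delivery System (ODS). The output path C:\Temp\htwt.html is hypothetical, and the example assumes the htwt dataset from the sample program is in the WORK library.

```sas
/* Send procedure output to an HTML file */
ods html file="C:\Temp\htwt.html";

proc print data=htwt;
run;

/* Close the HTML destination so the file is completely written */
ods html close;
```

The resulting file can then be opened in any web browser.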
Here is the output:


