How to use indexes with SAS datasets

I just came across some old SAS tutorials that I wrote a few years ago, so I’ll be posting them here when I’ve got nothing else to say.

Indexes are an alternative to sorting your dataset or building a format. They speed up any where or by processing.

Creating indexes

You can create an index in a data step like this:

data clm2 (index=(mbrno empssn mi=(mbrno iyymm) ei=(empssn iyymm)));
set clm1;
run;

The mi and ei are compound indexes, which behave as if you sorted your dataset mbrno iyymm or by empssn iyymm.

You can use proc datasets to add an index to a dataset that already exists:

proc datasets library=saslib;
modify clm2;
index create mbrno empssn mi=(mbrno iyymm) ei=(empssn iyymm);

Using indexes

Indexes allow you to merge datasets that aren’t sorted. In the above example, now you can use clm2 just like it was sorted by any of the indexed vars:

data clm_plus_elig;
merge clm2 mem;
by empssn iyymm;
run;

This is another example of how to do a lookup.

data d1;
infile datalines;
input col1 8.;
datalines;
101
106
102
102
103
103
104
105
;

data d2 (index=(col1));
infile datalines;
input col1 col2 $8.;
datalines;
104 ddd
102 bbb
103 ccc
101 aaa
;

data d3;
set d1;
set d2 key=col1 / unique;

/* This block handles bad lookups. */
if _IORC_ eq %sysrc(_DSENOM) then do;
col2 = "xxx";
end;

run;

proc print heading=h data=d3;
run;

And this is the output:

this is the d3 output dataset. 14:43 Friday, March 11, 2005 1

Obs col1 col2

1 101 aaa
2 106 xxx
3 102 bbb
4 102 bbb
5 103 ccc
6 103 ccc
7 104 ddd
8 105 xxx

Posted in SAS