Category Archives: SAS

How to use indexes with SAS datasets

I just came across some old SAS tutorials that I wrote a few years ago, so I’ll be posting them here when I’ve got nothing else to say.

Indexes are an alternative to sorting your dataset or building a format. They speed up any where or by processing.

Creating indexes

You can create an index in a data step like this:

data clm2 (index=(mbrno empssn mi=(mbrno iyymm) ei=(empssn iyymm)));
    set clm1;

The mi and ei are compound indexes, which behave as if you sorted your dataset mbrno iyymm or by empssn iyymm.

You can use proc datasets to add an index to a dataset that already exists:

proc datasets library=saslib;
    modify clm2;
    index create mbrno empssn mi=(mbrno iyymm) ei=(empssn iyymm);

Using indexes

Indexes allow you to merge datasets that aren’t sorted. In the above example, now you can use clm2 just like it was sorted by any of the indexed vars:

data clm_plus_elig;
    merge clm2 mem;
    by empssn iyymm;

This is another example of how to do a lookup.

data d1;
    infile datalines;
    input col1 8.;

data d2 (index=(col1));
    infile datalines;
    input col1 col2 $8.;
    104 ddd
    102 bbb
    103 ccc
    101 aaa

data d3;
    set d1;
    set d2 key=col1 / unique;

    /* This block handles bad lookups. */
    if _IORC_ eq %sysrc(_DSENOM) then do;
        col2 = "xxx";


proc print heading=h data=d3;

And this is the output:

this is the d3 output dataset.       14:43 Friday, March 11, 2005   1

Obs    col1    col2

 1      101    aaa
 2      106    xxx
 3      102    bbb
 4      102    bbb
 5      103    ccc
 6      103    ccc
 7      104    ddd
 8      105    xxx