Tuesday, June 17, 2014

What is the difference between Primary key and Unique index in DB2

Primary key uniquely identifies  the row of a table. We declare the primary key on one(or more) column of a table. Defining the primary key we ensure, that column will not have any duplicate or NULL values. It is not a mandatory feature to be added to every table, having one is good. We define the primary key while creating the table or altering the table like below.
CREATE TABLE XMT00.CONTACT_TABL
   (CNTCT_ID             TIMESTAMP NOT NULL ,
    DATA_EFF_DT          DATE NOT NULL ,
    DATA_XPIR_DT         DATE WITH DEFAULT NULL ,
    NM_TP_CD             CHAR(1) FOR SBCS DATA NOT

    ENDG_TRN_ID          TIMESTAMP WITH DEFAULT NULL ,
    CONSTRAINT CNTCT_ID PRIMARY KEY (CNTCT_ID,START_TRN_ID))
  IN AMPXXDB.AMPAXXTS
Or like below in the ALTER table command

ALTER TABLE AXX00.ACCUM_INV_KEY
      PRIMARY KEY (ACCUM_INV_KEY_ID)
 
The uniqueness feature of primary key is achieved through the use of UNIQUE INDEX on its primary columns.We can say, Primary key, under the cover  will use UNIQUE or NON-UNIQUE index. If INDEX key allows NULLS, then we need to use WHERE NOT NULL clause to ensure non-null values are unique.
If we do not create unique index explicitly, then DB2 will create one index. But we wont be able to alter the automatically created index(like if we want to alter the features likes PCTFREE,Clustering etc).So it is always if we can create unique index and then build the primary key on that.

Technically, when we create a table with primary key, we must create a UNIQUE index on the Primary key. A very important point to remember " DB2 can not process that table on which primary key has been defined  but no indexes. It simply marks the table as unavailable or puts it in incomplete status."
CREATE INDEX XMT00.AMPAPX
  ON XMT00.CONTACT_TABL
   (START_TRN_ID          ASC)
  USING STOGROUP AXX001SG
  ....
Even If  we want to drop the unique index for that table, DB2 will allow to drop it, but the table will be unusable and it will be in incomplete status.
So, in conclusion this primary key and unique index goes hand-in-hand.!!!

Monday, June 9, 2014

Pass data through PARM in SAS. "Build output filenames dynamically through SAS"

Suppose we want to build a SAS output file name which will have filename along with the current date,
Example: FNAME_D040414.CSV or .txt.
In this scenario, we need to pass the file name to SAS through PARM.

//STP10 EXEC SAS,PARM='SYSPARM="&NAME"'
//WORK      DD UNIT=WORK,SPACE=(CYL,(50,50),RLSE), 
//SASLOG    DD SYSOUT=*
//OUTCARD   DD DSN=&&OUTCRD,DISP=(NEW,PASS),
//SYSIN     DD DSN=TEST.SAS.PROGRAM(SAS1)


So here we are passing the data through PARM statement on SAS.

So inside the SAS program we need to use the SCAN keyword to retrieve the parameter passed from the jcl and modify accordingly.
'Check the SCAN syntax for more details in the manuals'

OPTION SORTLIB='';               
TITLE;                           
                                 
DATA _NULL_;                     
  NAME = SCAN(SYSPARM(),1,','); 
  CALL SYMPUT("TNAME", TNAME);   
  LENGTH FN $50;                
  INFILE INPUTDATA FILENAME=FN;   
  FILE OUTCARD NOPRINT NOTITLES;
....

We can use the Date combination along with the parameter passed to build the output .txt or .csv dynamically.

Sunday, June 8, 2014

How to define KSDS, ESDS ,RRDS VSAM with IDCAMS.. Explanation of the VSAM parameters

VSAM is one file storage system used to store and retrieve data. It stores data, but not like a relational model or hierarchical model like DB2 or IMS. In fact, there are many structures in DB2, which are implemented using VSAM .We can use VSAM files in both batch and online(CICS) operations.
How do we create a vsam file?
The IBM Access Methods Services (AMS) utility,popular as IDCAMS is the utility to create and maintain VSAM files. In this context,we should know that VSAM is totally catalog driven.When ever we create a VSAM dataset,ie a VSAM file, an entry is recorded in the system catalog which stores the cluster informtion.
Vsam Cluster? what is it ? ::
In this VSAM world,simple VSAM files are often called clusters. A cluster is the set of catalog entries that represent a file.
So, how many types of VSAM can we create, rather how many types of VSAM are there?
The answer is 4 types:
1.ESDS    or Entry Sequenced dataset
2.KSDS    or Key Sequenced dataset
3.RRDS    or Relative record dataset
4.LDS     or Linear dataset
Frankly,till date i never came across the use of LDS or RRDS. The most Widely used VSAM are KSDS and ESDS.
Lets see the JCL to create a KSDS:
How to create a KSDS and the description of the parameters.

//STEP0020 EXEC PGM=IDCAMS
//SYSPRINT  DD SYSOUT=*           
//SYSIN     DD *

  DEFINE CLUSTER(             -                
       NAME(TEST.FILE.MYVSAM) -              
       RECSZ(300 2000)      -                
       KEYS(41 20)          -                
       MANAGEMENTCLASS(xxxx) -           
       INDEXED              -                
       SPEED                -                
       SHR  (2 3)           -                
       FSPC (24,7)          -                
       BUFSP (39936)        -                
       DATACLAS (DCVSEXT))  -                
       DATA(NAME(
TEST.FILE.MYVSAM.DATA) -    
         CISZ (18432)         -              
         CYL  (200,100))       -             
       INDEX (NAME(
TEST.FILE.MYVSAM.INDEX) - 
         CISZ (1024)          -              
         CYL  (10,10))

As we see, the CLUSTER have two components here. The DATA and the INDEX part.
The keywords are marked and highlighted in red.
Explanation of the VSAM parameters:

1. RECORDSIZE(300  2000)
Specifies the record length of the file. The first parameter specifies the average length of the record and second parameter tells the maximum record length.
RECORDSIZE(avg  max)
avg : Average length of records
max : Maximum length of records
e.g. :
 RECORDSIZE(300  2000) [Fixed Length records]
 RECORDSIZE(300  300)    [Variable Length records]
Note :  RECORDSIZE or RECSZ  can also be used in the IDCAMS define step.
We can use this parameter in the cluster level as well as in the data level.
If not used, VSAM will use its default recordsize which is  RECORDSIZE(4086 4086)
 
2. KEYS(41 20)
The first parameter defines the key length and the second parameter defines the offset, ie, the beginning of the key position from the starting of the record
Thus the general format is like :
        KEYS(length-of-the-key  offset-from-begining)

3. MANAGEMENTCLAS : Optional Clause

4. INDEXED: This specifies that the VSAM is a KSDS VSAM

5. SPEED or RECOVERY option is coded at the CLUSTER or DATA level.It just governs how the VSAM load data into the file.
   With the RECOVERY option, the VSAM pre-formats each Control area with binary  zeros before it  writes any control interval to it. When SPEED is specified, no such additional activity is performed and therefore the data loading is much faster with SPEED option.

6..FSPC (24,7): FSPC or FREESPACE(24,7) is same thing.
FREESPACE parameter applies to the KSDS.This free space can be used for adding new records or for expanding existing variable records.FREESPACE applies only to the data component. For the FREESPACE parameter, the first value defines the percentage of free space to be reserved within each control interval, and the second defines the number of control intervals within each control area that should be reserved as free space.
Default is FREESPACE(0 0)

7. SHR  (2 3)OR SHAREOPTIONS ( 2 3):
This parameter tells VSAM whether you want to let two or more jobs to process your file at the same time. It specifies how a VSAM dataset can be shared. This is a big topic to deal with .Read here to explore this feature.
8. BUFSP(39936):  The number of buffers that can be used to hold INDEX and DATA records.
9. CYLINDERS(Pri Sec)or CYL(200,100):Primary : Number of units of primary space to allocate.This amount is allocated once when the dataset is created.Secondary : Number of units of secondary space to allocate.

In the DEFINE CLUSTER command we need to mention the type of VSAM we are creating like:

INDEXED   for KSDS
NONINDEXED  for  ESDS
NUMBERED for  RRDS
LINEAR for  LDS

So, Here is the sample JCL for ESDS VSAM

DEFINE CLUSTER(                   -               
           NAME(TEST.RM3452.FILE) -             
           RECSZ(300 2000)        -                     
           NONINDEXED             -               
           SPEED                  -               
           SHR  (2 3)           
      DATA(NAME(TEST.RM3452.FILE.DATA) -   
             CISZ (18432)              -             
             CYL  (200,100))           -

Friday, June 6, 2014

Output statements using IF ELSE WHEN Clause in SAS - Creating multiple outputs in SAS

Sometimes we need to create multiple outputs in SAS depending on the IF ELSE conditions.The below JCL shows how to achieve this.
//SAS       EXEC SAS                    
//RYAN1     DD DSN=TEST.INPUT.SAS

//SYSOUT    DD SYSOUT=*                     
//SYSIN     DD *                

OPTION NOCENTER;                            
DATA DATA1;                                 
  INFILE RYAN1 MISSOVER;                    
  INPUT @6    POLNO     $CHAR10.            
        @10   CODE      $CHAR04.            
        ;                                   
  DATA EX1 EX2;                             
  SET DATA1;                                
  IF CODE='3992' THEN OUTPUT EX1;        
  IF CODE='T100' THEN OUTPUT EX2;       
  PROC PRINT DATA=EX1;                      
    TITLE 'EX1';                            
  PROC PRINT DATA=EX2;                      
    TITLE 'EX2'; 

Points to note: Both the output datasets, which we want to create should be mentioned in the DATA step. Here EX1 and EX2 are the ones.
Now, we can directly route these outputs to the output datasets as well.

Just a try with SELECT  WHEN clause in SAS  to get the same result.
OPTION NOCENTER;                 
DATA DATA1;                      
  INFILE RYAN1 MISSOVER;          
  INPUT @6    POLNO     $CHAR10. 
        @10   CODE   $CHAR04. 
        ;                        
  DATA EX1 EX2 EX3;              
  SET DATA1;                     
   SELECT (CODE);             
       WHEN  ('3992')  OUTPUT EX1;
       WHEN  ('1002')  OUTPUT EX2;
       OTHERWISE OUTPUT EX3;     
   END;                          
  PROC PRINT DATA=EX1;           
    TITLE 'EX1';                 
  PROC PRINT DATA=EX2;           
    TITLE 'EX2';