Saturday, July 30, 2016

Zoned Decimal and packed Decimal (COMP- 3) fields. How Does COMP 3 saves storage space?

Two computer codes which are used  for internal representation of data are EBCDIC and ASCII.
Extended Binary coded decimal interchange(EBCDIC)  code is a 8 bit character encoding standard used in IBM environment.
Most of the other Non IBM machines use ASCII standard.
Below is the EBCDIC representation of the alphabets and numbers followed in IBM environment.
EBCDIC represenation of Aphabets and Numerals

Following the chart, we can see 'a' (2nd row in above chart)  is represented in binary in 1000000001 and has the hex value of  81.

 All characters & numbers are represented in 8 bits. These 8 bits can be  broken into two 4 bit segments. For historical reasons, the  high order 4 bits are called zone part and low order 4 bits are called digits part.
Each of these 4 bits are represented by one hexadecimal character. Thus each 8 bit EBCDIC can be represented by  two character hexadecimal value. The high order 4 bits are used to represent whether the value stored is a letter,positive or negative number. Low order 4 bits represents numbers 0-9.

Let us see how the letter D is stored internally.(Map with the chart above)

In similar fashion, we can see how the letter 8 will be stored in the system( Map with the chart above)

Thus we see that in zoned decimal each byte represents one digit.

So, how will the positive number 62587 be stored in the system?
Zoned decimal number in EBCDIC repreentaion
In  zoned decimal, the zone portion of the rightmost byte represents the sign of the number.
For positive number, the zone portion is always 1111.
For negative number the zone portion becomes  1101.
Thus -6 can will be represented  as: 1101(Zone part) 0110(Digit part)

Packed Decimal fields: How does it save spaces?  Let's see how it works.

In  packed decimal format, the zone portion is stripped from each byte, so that two digits can be packed together in one byte.This way the space for zone portion of each byte can be used to represent another digit. Thus two digits are represented in single byte there by saving the space.
Only the zone portion of the low order or the rightmost byte is kept for storing the sign of the field.
Considering the same example, see how the number 62587 will be stored in packed decimal format.
Space utilized will be reduced from 5 bytes to 3 bytes.

Number :                                6               2              5             8                7
Packed decimal representation
We can see that two digits are packed in single byte by stripping of the zone portion of all the bytes except the last byte (marked in brown).
Had it been a negative number the last byte would have contained 1101 in place of 1111.

This way, a packed decimal number saves space by packing two digits in one byte.


Friday, June 24, 2016

COALESCE function in DB2 . Alternative to NULL indicator variable in DB2 ; SQLCODE - 305.

If a table column is not defined with NOT NULL, then it implies that the column value can be NULL sometimes. For such columns,if we do not use null handling technique, then the application program will give sqlcode -305.
To avoid that we use indicator variables.However , we can also use the DB2 COALESCE  function.

COALESCE is a function that allows you to substitute any numeric or character value for a null, based on the column data type.
Suppose in DEPT table, i have one column for optional subject(optsubj). It can have a value of 'Y'or 'N'. However since the column is not defined with NOT NULL keyword,it can fetch null values.
If we simply use the query below, our application program will return sqlcode -305.
SELECT fname
              ,lname
              ,optsubj
into       :hv-fname
             :hv-lname
             :hv-optsubj
from DEPT

However, we can tweak the same query in the below way using COALESCE function to get rid of -305 without using indicator variable

SELECT fname
       ,lname
       ,coalesce(optsubj,space(1))
into   :hv-fname
       :hv-lname
       :hv-optsubj
from DEPT

What it will do is, it will replace NULL value with space of 1 byte whenever the column optsubj fetches a null value for any record;
This function can be used in many ways to get desired results. 

Thursday, June 16, 2016

RENT compiler option in cobol - Understanding reentrant programming

For understanding RENT compiler option, we need to know  what is a  re-entrant program.
A re-entrant is a program that doesn’t change itself. “How can a program change itself?” Does the word "Multi threading " looks familiar?  Reentrant and mutithreading are related but somewhat different topics.
We can divide our program(logically) into  three parts:
1. Program code – the instructions that are executed when a program runs.
2. Constants – constants used in a program. These are set when the program is written, and never change.
3.Working  Storage area – or variables. This area changes when different user fires the same program.

"A reentrant Cobol program gives each user their own copy of working storage. " 
For every user,MVS supplies the GETMAIN and STORAGE macros to support this and they can be
used by subsystems such as CICS or IMS or whatever to support reentrant programs.

For each user the copy of the working storage is dynamically acquired and initialized from the original copy. When a user interacts with the program , he only modifies his copy of the variables.

When the program is interrupted to give another user a turn to use the program, information about the data area associated with that user is saved.

So, what will happen if we write a CICS program as non-renetrant?
When one transaction waits, a second transaction can come and modify the working storage variables, and thus providing unstable or undesirable results.

Thus, Re-entrant programs allow a single copy of a program or routine to be used concurrently by two or more processes.


For cobol, we use the compiler option RENT to generate reentrant object program.

Sunday, March 27, 2016

FETCH FIRST N ROWS Vs OPTIMIZE FOR N ROWS in DB2

Very Often we use these two statements in our SQL query,ie, OPTIMIZE FOR N ROWS and FETCH FIRST N ROWS ONLY.
Although it may sound similar,but both of them have specific usage when it comes to the optimizer. This post we will try to see difference between the two.

As the name suggests FETCH FIRST 5 ROWS ONLY, it actually puts a limitation on the number of rows the query is returning.I am assuming here n=5. So, here user gets only 5 rows even though there can be 100 qualifying rows for that query.

OPTIMIZE FOR 5 ROWS will straight way influence the optimizer.It does not limit the number of rows SQL will return. So using this , we will get all the qualifying rows,may be incrementally.Intent here is to give priority to retrieve first few rows.Once optimizer understands this ,it will give preference to access plan that will minimize the response time.So,DB2 will send the rows in a single block. Consider the below scenario where this will come handy.

You have an online application where the screen can hold details of 5 customers at a time. Suppose you have the below query to fetch the customer details

Select Fname, Lname, Dept, Sal,Rollno
  From Dept table 
Order by Sal Desc;

We are assuming the index to be defined on Rollno. If a descending index also exists on SAL column, its likely to have low cluster ratio.
Without using the OPTIMIZE clause, DB2 will do a full table space scan and sort on sal column to fetch you the details.

Now on Adding  'ÓPTIMIZE FOR 5 ROWS' to the query, DB2 will use SAL index directly because it knows you need  the details of 5 highest paid employees.
So, it displays 5 rows to the screen and depending on user input will process the  next 5 or do something else.Here DB2 encourage matching index scan and would not prefer list or sequential prefetch. Access plan selected for 5 records may not be good for 100 rows.

Thus, this OPTIMIZE clause comes handy for queries where you can process incrementally.We can use it to get rows directly from table without going through any buffering operations like sorting. 

Saturday, March 19, 2016

DB2 unload utility using the program DSNTIAUL

IBM provides two methods of unloading data from a DB2 table. One is DSNTIAUL program that is run under IKJEFT01 or IKJEFT1A, or 1B and the other being  DSNUTILB program. DSNUTILB is a fairly new and is considered IBM’s Fast Unload Utility. Each has unique functionality that you may want to explore prior to choosing one. 

Outlined below are functions of DSNTIAUL process. 

1. DSNTIAUL Program: The DSNTIAUL program allows two type of unloads to be invoked against a DB2 table. When invoking DSNTIAUL you may specify an entire table unload or a selective SQL unload. The key DD statements that are being used are the SYSIN and the SYSTSIN. Both types of unloads are being run under IKJEFT1B.

Basic JCL example:
//UNLOAD EXEC PGM=IKJEFT1B, DYNAMNBR=20
//STEPLIB DD DSN=SYSDB2.DQA0.SDSNLOAD, DISP=SHR
// DD DSN=SUBSYS.DB2.DSNEXIT, DISP=SHR
//*************************************************************
//SYSPRINT DD SYSOUT=*
//SYSTSPRT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//SYSREC00 DD DSN=userid.tablerecords.dataset,
//          DISP=(,CATLG,DELETE),
//       UNIT=SYSDA,SPACE=(CYL,(300,300),RLSE)
//*
//SYSPUNCH DD DSN=user-id.tablelayout.dataset,
//     UNIT=SYSDA,SPACE=(CYL,(1,1),RLSE),DISP=(,CATLG,DELETE)
//SYSTSIN DD *
SYSTEM (SUBSYS) RUN PROGRAM (DSNTIAUL) PLAN (DSNTIAUL) -
LIBRARY (‘SUBSYS.DB2.RUNLIB.LOAD’)
END
//SYSIN DD *
DBCREATOR.TABLENAME 
/*
//*


For selective Unload, like when you are using a where clause or using select statements (like SELECT * from user_table;)  we need to use PARM('SQL') in SYSTSIN statement; else you may endup with SQLCODE -104.
Your SYSTSIN would look like

//SYSTSIN DD *
SYSTEM (SUBSYS) RUN PROGRAM (DSNTIAUL)
         PLAN (DSNTIAUL) -
LIBRARY (‘SUBSYS.DB2.RUNLIB.LOAD’) PARM('SQL')
END 
//SYSIN DD *
SELECT * FROM schema.Tablename ;

Thursday, February 18, 2016

TSO WHOHAS code in REXX

Sometimes we face job abends, or job waiting for resource owing to the fact that some datasets get into contention with some other jobs or some user is using that particular file. Most of the organizations use TSO WHOHAS command to check the user holding the dataset. In this post we will see the piece of code in REXX which makes the command work.

/* REXX*/
DSNAME = ARG(1)
IF DSNAME = ' ' THEN DO
          SAY  'ENTER DATASET NAME'
PARSE UPPER EXTERNAL DSNAME
END
ADDRESS TSO
IF SYSDSN("'"DSNAME"'") <>  'OK' THEN DO
    SAY "DATASET DOES NOT EXIST"
    ADDRESS ISREDIT
    EXIT
END
ADDRESS TSO "ISRDDN E  ' "DSNAME" ' "
EXIT
Simple piece of code and it uses ISRDDN utility. It is an IBM utility which lists all the allocated DDNAME for your current TSO session.

Without the rexx code also , we even can execute the command
TSO ISRDDN ENQ 'Data-set Name'
and it will show the corresponding information

Tuesday, January 12, 2016

How to find the plan and package name in DB2 from program name

Suppose we have a cobol-db2 module named 'PGM-ABC'
We want to bind it and run the program. But we are not sure about the package and plan details.So where do we find it from? SYSIBM tables will give us the details we need.Lets check it out.
With the program name we can query SYSIBM.SYSPACKAGE table giving the name of the program(PGM-ABC) against the column 'NAME'.  We can use DB2 file-aid tool or use the below  query in SPUFI.

Select location, collid,name ,contoken,timestamp,bindtime from SYSIBM.SYSPACKAGE where NAME = 'PGM-NAME' ;

Once we get the COLLID,we need to query the SYSIBM.SYSPACKLIST table to get the plan name

SELECT PLANNAME,LOCATION,COLLID FROM SYSIBM.SYSPACKLIST WHERE COLLID='colld-id-name-we-got-above';

From experience, we should try to select the latest rows from the tables checking the TIMESTAMP fields; or check what other ppl have used in recent times.

Wednesday, January 6, 2016

SAS in Mainframes(z/Os) - Connecting to DB2 from SAS in mainframes

This post is all about connecting DB2 from SAS in mainframe environment and using the result set for further processing. To call and connect to DB2 session we need the SQL proc to be called first using the statement PROC SQL. When used in a program we need to begin  the code with PROC SQL and end it with QUIT; In Between these two we can write our SQL codes to fetch the required data.
Without wasting much time lets check the first JCL used to connect to db2.



//TTYYTST  JOB(TEST),CLASS=I,MSGCLASS=X,MSGLEVEL=(1,1)
//                NOTIFY=&SYSUID
//STEP01  EXEC PROC=SAS,OPTIONS='  '
//WORK    DD SPACE=(CYL,(,500)),VOL=(,,,59)
//OUTFILE DD DSN=TEST.OUTPUT.SAS,
//                   DISP=(,CATLG,DELETE),
//                  DCB=(RECFM=FB,LRECL=122,BLKSIZE=000),
//                  UNIT=SYSDA,SPACE=(TRK,(500,500),RLSE)
//SYSIN     DD *
OPTION NOCENTER;
OPTION SORTLIB;
PROC SQL;                                     <== Starts here
CONNECT TO DB2 (SSID=DQAT);    <== DQAT is the db2 subsystem here  
   CREATE TABLE test AS                  <== test is the sas datset here.
        SELECT * 
         FROM CONNECTION TO DB2 (
select fname,
         lastname,
         dept_name,        
from  test01.dept
where student_id = 1 fetch first 5 rows only) ;

%PUT SQLXRC=&SQLXRC;
%PUT SQLMSG=&SQLMSG;
QUIT;

DATA STEP1;
SET TEST;
FILE OUTFILE DLM=',' ;
IF _N_=1 THEN DO;
PUT   'FIRST NAME,'
          'LAST NAME,'
          'DEPT NAME,'
;
END;
PUT  FNAME
          LASTNAME
         DEPT_NAME
;
//SYSPRINT DD SYSOUT=*
//*

JCL is self explanatory. %PUT SQLXRC and SQLMSG are the macros used to capture the DB2 return codes and error messages if any.

Saturday, January 2, 2016

Invoke CA7 commands from Batch terminal - use of SASBSTR

CA7 is a product of Computer Associates and is a scheduling tool for batch jobs in mainframe. When the number of jobs to be executed are astronomical, it is difficult to manage the jobs manually. So we need a tool like CA7 to manage the same. Manipulation of jobs, predecessor and triggers are done through CA7 panels. But sometimes we need to define hundreds of jobs to CA7, it is advisable to use batch terminal to do the same. Also batch terminals are less prone to errors. Batch terminals are a set of CA7 commands in a member of a partitioned dataset which is input to a CA7 batch program and submitted using a JCL

The program which executes the CA7 commands in batch is called SASSBSTR. The list of commands are put into a member of a partitioned dataset and given as an input to the program through SYSIN DD statement in the JCL. When executing the SASSBSTR program, the SYSIN input statements are copied into the input dataset called CA7. Communication dataset informs CA7 that there are commands waiting in the input dataset. These are read and processed by CA7. Any output is written to the output dataset. When all the commands have been processed, SASSBSTR copies the contents of the output dataset to SYSPRINT. Have a look into the below diagram
CA7 commands from batch terminal


Now, lets have the JCL to execute the CA7 from batch terminal.

//@SASBSTR EXEC PGM=SASBSTR
//UCC7CMDS DD DSN=TTOS.NCA7A01.COMMDS,DISP=SHR
//SYSDUMP DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//SYSIN DD *
LJOB,JOB=SJABCD7,LIST=ALL     <==  CA7 command goes here
/*
//*

All relevant information for the particular job SJABCD7 will be listed in SYSOUT since we gave SYSPRINT DD SYSOUT=*.

Check the commonly used CA7 commands in this page. We can use these commands in the SYSIN DD* statements.

Explanation of JCL:
SASSBSTR is the program which executes the batch terminal
UCC7CMDS DD statement is a communication dataset which is used to read the CA7 commands from SYSIN DD Statements into the input dataset.

The CA7 commands are embedded within the SYSIN DD statement.All valid commands can be used here. Usually the programs are written in a cataloged procedure and the batch terminal is given as an input to the program. The output can be stored in a dataset.