Thursday, September 3, 2015

Splitting Input Files using Sort: Use of the SPLIT, SPLITBY, and SPLIT1R commands

SPLIT Command:

SPLIT splits the output records one record at a time, in rotation among the datasets named in the OUTFIL statement, until all the output records are written.
The first output record is written to the first dataset in the OUTFIL group, the second output record is written to the second dataset in the OUTFIL group, and so on.
When each OUTFIL dataset has received one record, the rotation starts again with the first dataset in the OUTFIL group.
As a result, the records in each OUTFIL dataset are not contiguous: consecutive output records end up in different datasets.

Consider the following contents of the input file INFILE:

1111111111111111111111111111
1211111111111111111111111111
1311111111111111111111111111
1411111111111111111111111111
1511111111111111111111111111
1611111111111111111111111111
1711111111111111111111111111
1811111111111111111111111111
1911111111111111111111111111
2011111111111111111111111111
2111111111111111111111111111
Let us use the commands and see the outputs.

The JCL below splits the data in INFILE between OUTFILE1 and OUTFILE2 as described above.
//STEP01 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=INFILE,DISP=SHR
//SORTOUT1 DD DSN=OUTFILE1,DISP=SHR
//SORTOUT2 DD DSN=OUTFILE2,DISP=SHR
//SYSIN DD *
  SORT FIELDS=COPY
  OUTFIL FNAMES=(SORTOUT1,SORTOUT2),SPLIT
/*

The contents of OUTFILE1 and OUTFILE2 would be as follows:
OUTFILE1:
1111111111111111111111111111
1311111111111111111111111111
1511111111111111111111111111
1711111111111111111111111111
1911111111111111111111111111
2111111111111111111111111111
OUTFILE2:
1211111111111111111111111111
1411111111111111111111111111
1611111111111111111111111111
1811111111111111111111111111
2011111111111111111111111111
OUTFILE1 contains records 1, 3, 5, 7, 9, and 11.
OUTFILE2 contains records 2, 4, 6, 8, and 10.
Note that the records in each output dataset are not contiguous.
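
The rotation described above can be modeled in a few lines of Python. This is only an illustrative sketch of the distribution rule, not DFSORT itself; the function name split_round_robin is hypothetical, and record numbers 1 to 11 stand in for the eleven INFILE records.

```python
def split_round_robin(records, n_outputs):
    """Model of OUTFIL SPLIT: deal records one at a time, in rotation."""
    outputs = [[] for _ in range(n_outputs)]
    for i, record in enumerate(records):
        # Record i goes to output 0, 1, ..., n_outputs-1, then back to 0.
        outputs[i % n_outputs].append(record)
    return outputs


# Record numbers 1..11 stand in for the eleven INFILE records.
out1, out2 = split_round_robin(list(range(1, 12)), 2)
print(out1)  # [1, 3, 5, 7, 9, 11]
print(out2)  # [2, 4, 6, 8, 10]
```

The two printed lists match the record numbers shown for OUTFILE1 and OUTFILE2.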

SPLITBY Command:

SPLITBY splits the output records M records at a time, in rotation among the datasets named in the OUTFIL statement, until all the output records are written.
The first set of M records is written to the first dataset in the OUTFIL group, the second set of M records is written to the second dataset in the OUTFIL group, and so on.
When each OUTFIL dataset has received its set of M records, the rotation starts again with the first dataset in the OUTFIL group.
The syntax is SPLITBY=M, where M is a positive integer (1, 2, 3, and so on).
The sets of records in each OUTFIL dataset are not contiguous with one another.
SPLITBY=1 is equivalent to SPLIT.
The JCL below splits the data in INFILE between OUTFILE3 and OUTFILE4 as described above.
//STEP01 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=INFILE,DISP=SHR
//SORTOUT1 DD DSN=OUTFILE3,DISP=SHR
//SORTOUT2 DD DSN=OUTFILE4,DISP=SHR
//SYSIN DD *
  SORT FIELDS=COPY
  OUTFIL FNAMES=(SORTOUT1,SORTOUT2),SPLITBY=3
/*
The contents of OUTFILE3 and OUTFILE4 would be as follows:
OUTFILE3:
1111111111111111111111111111
1211111111111111111111111111
1311111111111111111111111111
1711111111111111111111111111
1811111111111111111111111111
1911111111111111111111111111
OUTFILE4:
1411111111111111111111111111
1511111111111111111111111111
1611111111111111111111111111
2011111111111111111111111111
2111111111111111111111111111
OUTFILE3 contains records (1, 2, 3), (7, 8, 9).
OUTFILE4 contains records (4, 5, 6), (10, 11).
Note that each set of 3 records is contiguous, but the sets within an output dataset are not contiguous with one another.
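
The same distribution can be modeled in Python as a rough sketch. This is an illustration of the rule described above, not DFSORT code; the function name split_by is hypothetical, and record numbers 1 to 11 stand in for the INFILE records.

```python
def split_by(records, n_outputs, m):
    """Model of OUTFIL SPLITBY=m: deal records m at a time, in rotation."""
    outputs = [[] for _ in range(n_outputs)]
    for i, record in enumerate(records):
        # i // m is the index of the m-record set; sets rotate over the outputs.
        outputs[(i // m) % n_outputs].append(record)
    return outputs


records = list(range(1, 12))  # stand-ins for the eleven INFILE records
out3, out4 = split_by(records, 2, 3)
print(out3)  # [1, 2, 3, 7, 8, 9]
print(out4)  # [4, 5, 6, 10, 11]

# With m=1 the model reduces to the one-record-at-a-time rotation of SPLIT.
print(split_by(records, 2, 1))  # [[1, 3, 5, 7, 9, 11], [2, 4, 6, 8, 10]]
```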

SPLIT1R Command:

SPLIT1R splits the output records M records at a time among the datasets named in the OUTFIL statement, but the rotation happens only once. This continues until all the records are written.
If records remain after the last OUTFIL dataset has received its M records, all of the remaining records are also written to the last OUTFIL dataset.
If the input has M or fewer records, all of them are written to the first OUTFIL dataset and the remaining OUTFIL datasets are empty.
The syntax is SPLIT1R=M, where M is a positive integer (1, 2, 3, and so on).
The records in each OUTFIL dataset are contiguous.
The two JCLs below split the data in INFILE:
JCL1:
//STEP01 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=INFILE,DISP=SHR
//SORTOUT1 DD DSN=OUTFILE5,DISP=SHR
//SORTOUT2 DD DSN=OUTFILE6,DISP=SHR
//SYSIN DD *
  SORT FIELDS=COPY
  OUTFIL FNAMES=(SORTOUT1,SORTOUT2),SPLIT1R=5
/*

The contents of the output files are shown below:
OUTFILE5:
1111111111111111111111111111
1211111111111111111111111111
1311111111111111111111111111
1411111111111111111111111111
1511111111111111111111111111
OUTFILE6:
1611111111111111111111111111
1711111111111111111111111111
1811111111111111111111111111
1911111111111111111111111111
2011111111111111111111111111
2111111111111111111111111111
There are two output files, and M=5. The input file INFILE contains 11 records.
OUTFILE5 contains records 1, 2, 3, 4, and 5.
OUTFILE6 contains records 6, 7, 8, 9, 10, and 11 (i.e., all the remaining records).

JCL2:

//STEP01 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=INFILE,DISP=SHR
//SORTOUT1 DD DSN=OUTFILE7,DISP=SHR
//SORTOUT2 DD DSN=OUTFILE8,DISP=SHR
//SYSIN DD *
  SORT FIELDS=COPY
  OUTFIL FNAMES=(SORTOUT1,SORTOUT2),SPLIT1R=11
/*
The output file contents are shown below:

OUTFILE7:
1111111111111111111111111111
1211111111111111111111111111
1311111111111111111111111111
1411111111111111111111111111
1511111111111111111111111111
1611111111111111111111111111
1711111111111111111111111111
1811111111111111111111111111
1911111111111111111111111111
2011111111111111111111111111
2111111111111111111111111111

OUTFILE8:
Empty, as expected, because all 11 input records fit in the first set of M=11 records written to OUTFILE7.
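
Both SPLIT1R results can be reproduced with a small Python model. This is a sketch of the rule as described in this post, not DFSORT code; the function name split_one_rotation is hypothetical, and record numbers 1 to 11 stand in for the INFILE records.

```python
def split_one_rotation(records, n_outputs, m):
    """Model of OUTFIL SPLIT1R=m: m records per output, one rotation only."""
    outputs = [[] for _ in range(n_outputs)]
    last = n_outputs - 1
    for i, record in enumerate(records):
        # Once the last output is reached, every further record stays there.
        outputs[min(i // m, last)].append(record)
    return outputs


records = list(range(1, 12))  # stand-ins for the eleven INFILE records

out5, out6 = split_one_rotation(records, 2, 5)
print(out5)  # [1, 2, 3, 4, 5]
print(out6)  # [6, 7, 8, 9, 10, 11]

out7, out8 = split_one_rotation(records, 2, 11)
print(out7)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(out8)  # []
```

Because each output receives one unbroken run of records, the contents of every output dataset stay contiguous, unlike SPLIT and SPLITBY.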
