What is Sorting?
Sorting and data transformation are essential operations in enterprise batch processing. IBM's DFSORT utility is the backbone for sorting, merging, filtering, and formatting large datasets efficiently on z/OS.
It plays a critical role in automating business workflows—such as payroll, billing, and report generation—by organizing and manipulating data before further processing. With built-in support, high performance, and rich features, DFSORT is a reliable tool for handling data-heavy jobs in mainframe environments.
This handout introduces DFSORT's core concepts, its syntax, and its practical usage. In short, sorting means arranging records in a specific order: alphabetically, numerically, or by date.
IBM Docs - DFSORT Introduction
Why Sorting Matters in Enterprise Systems
- Makes large datasets easier to manage
- Speeds up searches and report generation
- Reduces data errors in billing, payroll, and reporting
- Essential for tasks like file merging, deduplication, and analytics
💡 Without sorting, enterprise systems become slow and inefficient.
DFSORT is IBM’s built-in z/OS utility for:
- Sorting, merging, and reformatting datasets
- Filtering and summarizing data efficiently
- Processing large volumes of data in batch JCL
DFSORT JCL example
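A minimal sketch of a complete sort step, assuming a fixed-length input whose first 10 bytes hold a character key. The step name SORTSTEP and the dataset names YOUR.INPUT.DATA and YOUR.OUTPUT.DATA are placeholders, the JOB card is omitted, and the SPACE values are arbitrary; adjust all of them for your system.

```jcl
//SORTSTEP EXEC PGM=SORT
//* Messages and the control statement listing go here
//SYSOUT   DD SYSOUT=*
//* Input dataset (placeholder name)
//SORTIN   DD DSN=YOUR.INPUT.DATA,DISP=SHR
//* Output dataset (placeholder name and space values)
//SORTOUT  DD DSN=YOUR.OUTPUT.DATA,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(5,5),RLSE)
//* Control statements start in column 2 or later
//SYSIN    DD *
  SORT FIELDS=(1,10,CH,A)
/*
```

The single SORT statement orders the records by bytes 1-10, treated as character data, in ascending sequence.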
DFSORT has more than a dozen control statements. Here are a few commonly used ones.
IBM Docs - DFSORT Control Statement
Required DD Statements
- SYSOUT – Log and messages
- SYSIN – DFSORT control statements
- SORTIN – Input dataset
- SORTOUT – Output dataset
Execution Flow
INCLUDE / OMIT → INREC → SORT / SUM → OUTREC → OUTFIL
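As a rough illustration of the flow, the sketch below combines several statements in one step. The record layout is an assumption made for the example: customer ID in bytes 1-10, region code in bytes 11-12, and a zoned-decimal amount in bytes 21-28.

```jcl
//SYSIN    DD *
* 1. Filter: keep only region NY (input record positions)
  INCLUDE COND=(11,2,CH,EQ,C'NY')
* 2. Sort the remaining records by customer ID
  SORT FIELDS=(1,10,CH,A)
* 3. Total the amount for records with the same customer ID
  SUM FIELDS=(21,8,ZD)
* 4. Reformat the output: customer ID, a blank, then the total
  OUTREC FIELDS=(1:1,10,12:21,8)
/*
```

DFSORT applies the statements in its own processing order, not in the order they appear in SYSIN, so listing them this way is only for readability.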
- SORT FIELDS=COPY performs no sort; it copies records, optionally with filtering or formatting.
- OPTION COPY does the same thing when no sort is needed; the two forms are equivalent.
- SUM FIELDS=NONE is a great way to deduplicate records that share the same sort key.
- Use INCLUDE or OMIT to filter early, and INREC to shorten records, reducing data before sorting.
- OUTFIL SPLIT is handy for splitting one dataset into multiple outputs.
SORT
SORT – Specifies the key fields and order (ascending or descending) used to sort the input records.
//SYSIN DD *
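* General form: A = ascending, D = descending
  SORT FIELDS=(start,length,format,A/D)
* Alternative: no sort; skip the first 5 records, copy at most 100
  SORT FIELDS=COPY,SKIPREC=5,STOPAFT=100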
/*
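A concrete instance of the general form may help. This sketch assumes a hypothetical layout with a character customer ID in bytes 1-10 and a zoned-decimal amount in bytes 21-28; neither position comes from a real dataset.

```jcl
//SYSIN    DD *
* Major key: customer ID, ascending
* Minor key: amount, descending
  SORT FIELDS=(1,10,CH,A,21,8,ZD,D)
/*
```

Each key is a group of four values (start, length, format, order), and keys are listed from most to least significant.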
OPTION
OPTION – Sets general processing options, such as copying without sorting, skipping records, or limiting the number of records processed.
//SYSIN DD *
* No sort; skip the first 10 records, copy at most 100
  OPTION COPY,SKIPREC=10,STOPAFT=100
/*
INCLUDE
INCLUDE – Filters records to include only those that match specific conditions.
//SYSIN DD *
* OP is a comparison operator: EQ, NE, GT, GE, LT, or LE
  INCLUDE COND=(start,length,format,OP,value)
/*
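For illustration, here is a filled-in condition that joins two comparisons with AND. The layout is assumed for the example only: a region code in bytes 11-12 and a zoned-decimal amount in bytes 21-28.

```jcl
//SYSIN    DD *
  OPTION COPY
* Keep region NY records whose amount is greater than 1000
  INCLUDE COND=(11,2,CH,EQ,C'NY',AND,21,8,ZD,GT,1000)
/*
```

Character constants are written as C'...', while numeric formats such as ZD are compared against plain decimal numbers.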
OMIT
OMIT – Filters out records that match certain conditions, effectively the opposite of INCLUDE.
//SYSIN DD *
* OP is a comparison operator: EQ, NE, GT, GE, LT, or LE
  OMIT COND=(start,length,format,OP,value)
/*
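A short sketch of OMIT with two conditions joined by OR, assuming (hypothetically) that an asterisk in byte 1 marks a comment record and that bytes 11-12 hold a region code.

```jcl
//SYSIN    DD *
  OPTION COPY
* Drop records that start with * or have a blank region code
  OMIT COND=(1,1,CH,EQ,C'*',OR,11,2,CH,EQ,C'  ')
/*
```

A step can contain either an INCLUDE statement or an OMIT statement, but not both, so pick whichever expresses the condition more simply.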
OUTFIL
OUTFIL – Sends different output records to multiple files, or controls formatting and splitting of output.
//SYSIN DD *
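* Split records in rotation between SORTOF01 and SORTOF02
  OUTFIL FILES=(01,02),SPLIT
* Alternative: select from input records 5 through 20 only
  OUTFIL FILES=01,INCLUDE=(...),STARTREC=5,ENDREC=20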
/*
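OUTFIL needs matching DD statements: FILES=01 writes to the DD named SORTOF01, FILES=02 to SORTOF02, and so on. The sketch below routes records to two outputs; the step name, the dataset name YOUR.INPUT.DATA, and the region code in bytes 11-12 are assumptions for the example.

```jcl
//SPLITSTP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=YOUR.INPUT.DATA,DISP=SHR
//SORTOF01 DD SYSOUT=*
//SORTOF02 DD SYSOUT=*
//SYSIN    DD *
  OPTION COPY
* Region NY records go to SORTOF01
  OUTFIL FILES=01,INCLUDE=(11,2,CH,EQ,C'NY')
* Everything not selected above goes to SORTOF02
  OUTFIL FILES=02,SAVE
/*
```

Both outputs are sent to the spool here to keep the sketch short; in practice each would usually point to its own dataset.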
SUM
SUM – Totals numeric fields, or removes duplicates, for records with the same key values.
//SYSIN DD *
* Total a numeric field across records with equal sort keys
  SUM FIELDS=(start,length,format)
* Remove duplicate records
  SUM FIELDS=NONE
* Move duplicates to another file (not supported here)
  SUM FIELDS=NONE,XSUM
/*
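SUM works on the keys named in a SORT (or MERGE) statement, so the two are used together. A minimal deduplication sketch, assuming the key is a character customer ID in bytes 1-10:

```jcl
//SYSIN    DD *
* EQUALS keeps equal-keyed records in their original order,
* so the first record of each group is the one retained
  OPTION EQUALS
  SORT FIELDS=(1,10,CH,A)
* Keep one record per customer ID
  SUM FIELDS=NONE
/*
```

Without EQUALS, one record per key is still kept, but which of the duplicates survives is not predictable.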
INREC vs OUTREC
INREC/OUTREC – Reformats records before (INREC) or after (OUTREC) sorting, merging, or copying.
//SYSIN DD *
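* Alternative forms; code only one OUTREC statement per step
* Build the record: input bytes 1-10, then '**' at column 11
  OUTREC FIELDS=(1:1,10,11:C'**')
* Replace every occurrence of OLD with NEW
  OUTREC FINDREP=(IN=C'OLD',OUT=C'NEW')
* Overwrite columns 5-8 with ABCD, leaving the rest unchanged
  OUTREC OVERLAY=(5:C'ABCD')
* Apply the overlay only to records that meet the conditions
  OUTREC IFTHEN=(WHEN=(conditions),OVERLAY=(...))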
/*
📝 Same syntax applies to INREC and OUTREC.
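The practical difference is which record layout later statements see. In this sketch INREC shortens each record before the sort, so SORT and OUTREC must use the positions of the reformatted record. The input layout (customer ID in bytes 1-10, zoned-decimal amount in bytes 21-28) is assumed for the example.

```jcl
//SYSIN    DD *
* Before sorting: keep the ID (now 1-10) and amount (now 11-18)
  INREC FIELDS=(1:1,10,11:21,8)
* Keys use the INREC layout, not the original input positions
  SORT FIELDS=(1,10,CH,A,11,8,ZD,D)
* After sorting: ID, a ' | ' separator, then the amount
  OUTREC FIELDS=(1:1,10,11:C' | ',14:11,8)
/*
```

Dropping unneeded bytes with INREC means less data passes through the sort, which is the main reason to reformat early.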