Introduction to GDG (Generation Data Group)
On IBM mainframes, a Generation Data Group (GDG) is a collection of related datasets that share a common base name and are distinguished by a generation number. These datasets typically hold versions of data created over time, such as daily reports, backups, or job outputs. Each generation has a unique suffix (e.g., G0001V00, G0002V00) and is managed as part of the group.
GDGs manage dataset versions automatically, so you don’t need to name and track them manually. You can refer to generations using relative numbers like (0), (+1), or (-1) instead of their full names. The system handles cataloging, deletion, and roll-off based on the GDG definition.
They are very useful when working with time-based or versioned data, such as logs or incremental backups, and simplify access to the most recent or previous dataset without hardcoding dataset names.
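As a minimal sketch of where a GDG comes from: the base is usually defined once with IDCAMS before any generation is created. The job card, the step name, the base name MY.DATA, and the LIMIT(3) NOEMPTY SCRATCH values below are illustrative assumptions, not site standards.

```jcl
//DEFGDG   JOB (ACCT),'DEFINE GDG',CLASS=A,MSGCLASS=X
//* Hypothetical job card; adjust accounting and classes for your site.
//* Defines the GDG base only; generations are created later via (+1).
//STEP01   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE GDG (NAME(MY.DATA) -
              LIMIT(3)      -
              NOEMPTY       -
              SCRATCH)
/*
```

Once the base exists, jobs can create and read generations through relative references such as MY.DATA(+1) and MY.DATA(0).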
Relative vs Absolute GDG Referencing
Many learners get confused about what (0), (+1), and (-1) really mean, especially when they are used across different jobs or steps.
Simplified Explanation
- Relative GDG Reference (e.g. MY.GDG(+1)): based on the latest cataloged generation at the moment the job starts.
  - (0) → Latest generation
  - (-1) → Previous generation
  - (+1) → A new generation to be created
- Absolute GDG Name (e.g. MY.GDG.G0002V00): refers to a specific generation, regardless of timing.
Important Behavior – Within a Single Job
If you write:
//DD1 DD DSN=MY.DATA(+1),DISP=(NEW,CATLG,DELETE)
//DD2 DD DSN=MY.DATA(+1),DISP=(NEW,CATLG,DELETE)
This does not create two generations. Both DD1 and DD2 refer to the same generation (e.g., G0001V00), so the second request to create it will typically fail or conflict with the first.
To create a second new generation in the same job, you must do:
//DD1 DD DSN=MY.DATA(+1),DISP=(NEW,CATLG,DELETE)
//DD2 DD DSN=MY.DATA(+2),DISP=(NEW,CATLG,DELETE)
If you want to re-access the same generation later in the job (e.g., for further processing in a following step), keep using the same relative number:
//DD1 DD DSN=MY.DATA(+1),DISP=(NEW,CATLG,DELETE)
//DD2 DD DSN=MY.DATA(+1),DISP=OLD
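To make the create-then-reuse pattern concrete, here is a hedged two-step sketch. The job card, step names, UNIT, SPACE, and DCB values are assumptions chosen for illustration, and IEBGENER simply stands in for whatever program writes and reads the data.

```jcl
//GDGJOB   JOB (ACCT),'GDG DEMO',CLASS=A,MSGCLASS=X
//* STEP01 creates the new generation (+1) from in-stream data.
//STEP01   EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//SYSUT1   DD *
SAMPLE RECORD
/*
//SYSUT2   DD DSN=MY.DATA(+1),DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(1,1)),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=0)
//* STEP02 re-reads that generation; in this job it is still (+1).
//STEP02   EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//SYSUT1   DD DSN=MY.DATA(+1),DISP=OLD
//SYSUT2   DD SYSOUT=*
```

Only in a later job does this generation become (0), because relative numbers are fixed for the life of the job that created it.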
Important Behavior – Across Jobs
Let’s say Job A creates:
//DD1 DD DSN=MY.DATA(+1) ← becomes G0001V00
//DD2 DD DSN=MY.DATA(+2) ← becomes G0002V00
After Job A finishes, these generations are cataloged. Then in Job B:
//CURRENT DD DSN=MY.DATA(0) ← points to G0002V00
//PREV DD DSN=MY.DATA(-1) ← points to G0001V00
✅ So (0) always means the latest generation in the catalog when the job starts.
Common Misconception
People often think they can create multiple generations in one job by repeating the same (+1), but you must increment the relative number for each new generation. Also, (0) in another job refers to the latest generation in the catalog, which is the one your previous job created only if that job actually cataloged it.
Tip
Use relative references only when:
- You’re okay with pointing to the latest or previous generation dynamically
Use absolute names when:
- You need to ensure you're referring to one fixed version (like for audit or reprocessing)
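The two styles can sit side by side in the same step. In this small sketch the DD names LATEST and AUDIT are hypothetical, and G0002V00 is assumed to still be cataloged:

```jcl
//* Relative: resolves to whatever is newest when the job starts.
//LATEST   DD DSN=MY.DATA(0),DISP=SHR
//* Absolute: always this exact generation, if it is still cataloged.
//AUDIT    DD DSN=MY.DATA.G0002V00,DISP=SHR
```

Rerunning the job next week would move LATEST to a newer generation, while AUDIT would keep reading G0002V00 until it rolls off or is deleted.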
Skipping GDG Numbers and Understanding References
When you work with GDGs, new generations normally follow the existing sequence one number at a time. Skipping numbers (for example, going from (+1) straight to (+10) in the same job) is allowed, but it creates gaps that can confuse anyone reading the generation names or writing cleanup routines. Here’s what you need to know:
How Skipping Works
- Sequential Count vs. Name Gaps: The GDG limit counts how many generations you have, regardless of their numeric suffix. So even if they are named G0001V00, G0010V00, and G0011V00, your GDG still holds three generations.
- Gaps Don’t Fill: Numbers you skip (e.g. G0002V00 through G0009V00) are never created later unless you explicitly use (+2), (+3), etc.
Why Gaps Matter
- Relative References
  - (0) always points to the highest-numbered valid generation in the catalog (here G0011V00).
  - (-1) points to the next lower valid generation (G0010V00).
  - Gaps are simply ignored; they don’t become “empty slots” you can use later.
- Cleanup Behavior
  - A GDG defined with LIMIT(3) EMPTY SCRATCH will only delete old generations when you allocate the 4th generation.
  - Because you still have only three cataloged datasets (G0001V00, G0010V00, G0011V00), cleanup won’t happen until you add a fourth, no matter how large the numeric gap is.
Example Scenario
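- Start with no generations and a GDG defined with LIMIT(3) EMPTY SCRATCH.
- In a single job (so every relative number is resolved from the same starting point):
  - Allocate (+1) → creates G0001V00
  - Allocate (+10) → creates G0010V00
  - Allocate (+11) → creates G0011V00
- At this point, the GDG shows three entries:
  - (0) → G0011V00
  - (-1) → G0010V00
  - (-2) → G0001V00
- If you now allocate a fourth generation (coded as (+12) in that same job, or as (+1) in a later job), it becomes G0012V00 and the system will:
  - Exceed the limit of 3
  - Delete all older generations (because of EMPTY)
  - Scratch them from disk
  You’ll end up with only G0012V00 in the GDG.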
Best Practices
- Use small, ordered increments (+1, then +2, then +3) when creating multiple generations in the same job.
- Avoid large jumps unless there’s a clear reason; gaps make troubleshooting harder.
- Rely on relative references (0), (-1), etc., to pick up the correct datasets; gaps will be skipped automatically.
- Remember cleanup only occurs when you exceed the defined limit, not when you reach it.
Understanding EMPTY vs NOEMPTY and SCRATCH vs NOSCRATCH
When defining a GDG base, you control how old generations are removed with two parameter pairs: EMPTY/NOEMPTY and SCRATCH/NOSCRATCH. Getting these right ensures you don’t accidentally wipe out data or leave orphaned files.
EMPTY vs NOEMPTY
- EMPTY
  - When you exceed the GDG limit, all previous generations are deleted at once.
  - Use this if you want a “clean slate” whenever you create a new generation beyond your retention limit.
- NOEMPTY
  - When you exceed the limit, only the oldest generation is deleted, preserving the rest up to your limit.
  - Use this to maintain a rolling window of the last n generations.
Example
With LIMIT(3) EMPTY:
- You have G0001, G0002, G0003.
- On creating G0004, you end up with only G0004.
With LIMIT(3) NOEMPTY:
- You have G0001, G0002, G0003.
- On creating G0004, you end up with G0002, G0003, G0004 (G0001 deleted).
SCRATCH vs NOSCRATCH
- SCRATCH
  - Physically removes the dataset from disk when it’s deleted.
  - Use this when you want to free up space immediately.
- NOSCRATCH
  - Deletes the GDG entry from the catalog but leaves the dataset on disk (uncataloged).
  - Use this if you might need to recover or examine the old file manually later.
Example
With EMPTY SCRATCH:
- Old datasets are wiped from both catalog and disk.
With NOEMPTY NOSCRATCH:
- Oldest dataset is uncataloged but still resides on disk; you must delete it manually if you no longer need it.
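These attributes are not fixed at definition time. As a sketch, assuming a base named MY.DATA already exists and you are authorized to change it, the IDCAMS ALTER command can switch it to a rolling window that frees disk space (the step name is hypothetical):

```jcl
//ALTGDG   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  ALTER MY.DATA NOEMPTY SCRATCH
/*
```

The change affects future roll-offs only; generations that were already removed under the old settings are not brought back.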
Common Pitfalls & Tips
- Pitfall: Expecting EMPTY to remove only one generation. It clears all when the limit is exceeded.
- Pitfall: Assuming NOSCRATCH preserves catalog info—you must re-catalog if you need to re-use it.
- Tip: Combine NOEMPTY SCRATCH for a rolling window that automatically frees disk space.
- Tip: Before running a job, verify your GDG definition with IDCAMS:
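//LISTGDG  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  LISTCAT ENTRIES('YOUR.GDG.BASE') ALL
/*
This shows your current limit, the EMPTY/NOEMPTY and SCRATCH/NOSCRATCH attributes, and the cataloged generations.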
Maximum Length Limits and Name Truncation Issues
GDG base names must leave room for the generation/version suffix (.GaaaaVxx), or allocation will fail with a “name too long” error. Here’s what you need to know:
z/OS Dataset Name Limits
- The total length of a dataset name (including periods and all qualifiers) cannot exceed 44 characters, and each qualifier is limited to 8 characters.
- The GDG suffix .GaaaaVxx always takes 9 characters (including the period).
Calculating Your Base Name Length
- Subtract 9 from 44 to get 35.
- Max base name length = 35 characters.
- If your base name is 36 or more characters, the system cannot append .G0001V00 without exceeding the 44-character limit.
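The arithmetic above can be checked against a concrete name. The base name below is a made-up example that is exactly 35 characters long, and the UNIT, SPACE, and DCB values are placeholders:

```jcl
//* Base name: PAYROLL.MONTHLY.ARCHIVE.BACKUP.SET1   = 35 characters
//* Suffix:    .G0001V00                             =  9 characters
//* Full name: PAYROLL.MONTHLY.ARCHIVE.BACKUP.SET1.G0001V00 = 44
//NEWGEN   DD DSN=PAYROLL.MONTHLY.ARCHIVE.BACKUP.SET1(+1),
//            DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(TRK,(1,1)),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=0)
```

Adding even one more character to the base would push the full generation name to 45 characters, past the limit.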
Common Error
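//DD1 DD DSN=VERY.LONG.DATASET.NAME.EXCEEDS.THE.LIMIT(+1),...
- VERY.LONG.DATASET.NAME.EXCEEDS.THE.LIMIT is 40 characters, which is over the 35-character maximum for a GDG base, so the job fails with a JCL error reporting an excessive DSNAME length (the exact message ID and text depend on your system).
- This happens even before any dataset is created.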
Best Practices for Naming
- Count characters in your planned base name before coding JCL.
- Use ISPF option 3.4 (DSLIST) or a simple text editor column count to verify.
- If you need a long identifier, consider using an alias or shortened prefix.
- Review existing naming standards; sites often define short prefixes like PRD. or DEV. to stay within limits.
Quick Tip
- A handy rule of thumb: “Always keep your base name under 30 characters.”
- This gives extra buffer space for qualifiers or unexpected suffixes.
By respecting the 44-character maximum and reserving 9 characters for the GDG suffix, you’ll avoid frustrating allocation errors and keep your JCL running smoothly.
GDG Cataloging Gotchas
GDG generations must be cataloged to be discoverable by JCL’s relative references (0), (-1), etc.
Tricky DISP Combinations
A dataset created with DISP=(NEW,DELETE,DELETE) is built and then deleted when the step ends, so it never stays in the catalog. More generally, any generation whose disposition leaves it uncataloged is invisible to relative references.
Effect: No catalog entry ⇒ no change in GDG count ⇒ (0) doesn’t move.
Misconception: (0) Always Advances
Many assume that once you run a job with (+1), the next (0) will point to that new generation.
Reality: If the new generation never makes it into the catalog, (0) remains stuck on the prior one—even across jobs.
Tip: Always use DISP=(NEW,CATLG,...) when you need the generation available later.
Manual Generation Deletion
If you remove a specific generation’s catalog entry—say you created (+1), (+2), (+3) and then manually deleted MY.GDG.G0002V00—the GDG chain simply skips over it when resolving relative numbers.
What you’ll see next time:
- (0) points to the highest remaining generation (+3).
- (-1) points to the next lower cataloged generation (+1), because +2 is gone.
Why it happens: Relative references build their list from whatever is in the catalog. Deleting +2 removes it from that list, so (-1) jumps straight to (+1).
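As a sketch of that scenario, assuming the base MY.GDG exists with G0001V00 through G0003V00 cataloged, an IDCAMS step can delete the middle generation and then list what the catalog still holds (the step name is hypothetical):

```jcl
//DELGEN   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DELETE MY.GDG.G0002V00
  LISTCAT ENTRIES('MY.GDG') ALL
/*
```

The LISTCAT output should then show only G0001V00 and G0003V00 under the base, which is the list that (0) and (-1) resolve against in the next job.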