Introduction to GDG (Generation Data Group)

On IBM mainframes, a Generation Data Group (GDG) is a collection of related datasets that share a common base name and are distinguished by a generation number. These datasets typically hold versions of data created over time, such as daily reports, backups, or job outputs. Each generation carries a unique suffix of the form GxxxxVyy (e.g., G0001V00, G0002V00) and is managed as part of the group.

GDGs help in automatically managing dataset versions, so you don’t need to manually name and track them. You can refer to them using relative numbers like (0), (+1), or (-1) instead of their full names. The system handles cataloging, deletion, and roll-off based on the GDG definition.

They are very useful when working with time-based or versioned data, such as logs or incremental backups, and simplify access to the most recent or previous dataset without hardcoding dataset names.

IBM Docs - GDG

GDG management is handled using IDCAMS.
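
As a rough sketch of what that looks like, the step below defines a GDG base with IDCAMS. The base name MY.DATA, the step name DEFGDG, and the attribute values are placeholders chosen for illustration, not values taken from any particular system.

```jcl
//DEFGDG   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* Hypothetical base: keep 3 generations, roll off the oldest */
  DEFINE GDG (NAME(MY.DATA) -
              LIMIT(3)      -
              NOEMPTY       -
              SCRATCH)
/*
```

The base must exist before any job can allocate MY.DATA(+1). IDCAMS commands start in column 2 or later, and the trailing hyphen continues the command onto the next line.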

Relative vs Absolute GDG Referencing

Many learners get confused about what (0), (+1), and (-1) really mean, especially when they are used across different jobs or steps. Here is a simplified explanation:

  • Relative GDG Reference (e.g. MY.GDG(+1)):
    • Based on the latest cataloged generation at the moment the job starts.
    • (0) → Latest generation
    • (-1) → Previous generation
    • (+1) → A new generation to be created
  • Absolute GDG Name (e.g. MY.GDG.G0002V00):
    • Refers to a specific generation, regardless of timing

Important Behavior – Within a Single Job

If you write:

//DD1 DD DSN=MY.DATA(+1),DISP=(NEW,CATLG,DELETE)
//DD2 DD DSN=MY.DATA(+1),DISP=(NEW,CATLG,DELETE)

This does not create two generations. Both DD1 and DD2 refer to the same generation (e.g., G0001V00), so the second DISP=NEW request typically fails because that dataset is already being created.

To create a second new generation in the same job, you must do:

//DD1 DD DSN=MY.DATA(+1),DISP=(NEW,CATLG,DELETE)
//DD2 DD DSN=MY.DATA(+2),DISP=(NEW,CATLG,DELETE)

If you want to re-access the same generation within the job (e.g., for further processing), you can use:

//DD1 DD DSN=MY.DATA(+1),DISP=(NEW,CATLG,DELETE)
//DD2 DD DSN=MY.DATA(+1), DISP=OLD
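
A fuller sketch of that pattern, spread across two steps, might look like the job below. It assumes the GDG base MY.DATA already exists and that new datasets are SMS-managed (otherwise UNIT, VOL, and a DCB or model DSCB would be needed); the job card, MY.INPUT.FILE, and the SPACE values are hypothetical.

```jcl
//GDGDEMO  JOB (ACCT),'GDG DEMO',CLASS=A,MSGCLASS=X
//* STEP1 copies an input file into a new generation.
//* MY.INPUT.FILE is a placeholder input dataset.
//STEP1    EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//SYSUT1   DD DSN=MY.INPUT.FILE,DISP=SHR
//SYSUT2   DD DSN=MY.DATA(+1),
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(TRK,(5,5),RLSE)
//* STEP2 reads the generation STEP1 created. Inside the same
//* job it is still referenced as (+1), not (0).
//STEP2    EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//SYSUT1   DD DSN=MY.DATA(+1),DISP=SHR
//SYSUT2   DD SYSOUT=*
```

Note that the operands contain no blanks after the commas: in JCL, a blank ends the operand field and everything after it is treated as a comment.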

Important Behavior – Across Jobs

Let’s say Job A creates:

//DD1 DD DSN=MY.DATA(+1)   ← becomes G0001V00
//DD2 DD DSN=MY.DATA(+2)   ← becomes G0002V00

After Job A finishes, these generations are cataloged. Then in Job B:

//CURRENT DD DSN=MY.DATA(0)    ← points to G0002V00
//PREV    DD DSN=MY.DATA(-1)   ← points to G0001V00

✅ So (0) always means the latest cataloged generation at the time the job starts.

Summary - Referencing

A common mistake is trying to create multiple generations in one job by repeating (+1); each new generation needs its own increment: (+1), (+2), and so on. Also, (0) in a later job refers to the latest generation in the catalog, so it picks up what the previous job created only if that generation was actually cataloged.

tip

Use relative references only when: You’re okay with pointing to the latest or previous generation dynamically

Use absolute names when: You need to ensure you're referring to one fixed version (like for audit or reprocessing)
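
For instance, a reprocessing step that must always read one fixed generation could be sketched as follows. The step name and the choice of G0002V00 are illustrative only; IEBGENER is used here simply to print the dataset to SYSOUT.

```jcl
//REPROC   EXEC PGM=IEBGENER
//SYSPRINT DD SYSOUT=*
//SYSIN    DD DUMMY
//* Absolute name: always this generation, however many
//* newer generations have been cataloged since.
//SYSUT1   DD DSN=MY.DATA.G0002V00,DISP=SHR
//SYSUT2   DD SYSOUT=*
```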


Do You Need to Worry About GDG Number Gaps?

Mostly no. GDG handles numbering and referencing automatically, but a few points are worth knowing to avoid surprises:

Key Points

  • GDG counts cataloged generations, not their numeric gaps. If you create G0001V00, G0010V00, and G0011V00, GDG says “3 generations,” even with skipped numbers.
  • Skipped numbers are ignored, and won’t be auto-filled later.
  • Relative references like (0), (-1), etc., still work correctly. (0) points to the most recent generation, even if there are gaps.
  • Cleanup happens only when the count exceeds the limit, not when numbers get high. So if LIMIT(3) is set, cleanup starts only after you catalog a 4th dataset, no matter the gaps.

Quick Example

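  1. Define a GDG base with LIMIT(3) EMPTY SCRATCH, then run one job against the still-empty base that does the following:
  2. Allocate (+1) → G0001V00
  3. Allocate (+10) → G0010V00
  4. Allocate (+11) → G0011V00
  5. Allocate (+12) → G0012V00. The limit of 3 is now exceeded, so (because of EMPTY) the three older generations are deleted and only G0012V00 remains after the job completes.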

Best Practice

Stick to ordered increments (+1, then +2, then +3), unless there's a strong reason to jump—gaps don’t break anything, but can confuse humans reading the list.


Understanding EMPTY vs NOEMPTY and SCRATCH vs NOSCRATCH

When defining a GDG base, you control how old generations are removed with two parameter pairs: EMPTY/NOEMPTY and SCRATCH/NOSCRATCH. Getting these right ensures you don’t accidentally wipe out data or leave orphaned files.

EMPTY vs NOEMPTY

  • EMPTY

    • When you exceed the GDG limit, all previous generations are deleted at once.
    • Use this if you want a “clean slate” whenever you create a new generation beyond your retention limit.
  • NOEMPTY

    • When you exceed the limit, only the oldest generation is deleted, preserving the rest up to your limit.
    • Use this to maintain a rolling window of the last n generations.

Example

With LIMIT(3) EMPTY:

  • You have G0001, G0002, G0003.
  • On creating G0004, you end up with only G0004.

With LIMIT(3) NOEMPTY:

  • You have G0001, G0002, G0003.
  • On creating G0004, you end up with G0002, G0003, G0004 (G0001 deleted).

SCRATCH vs NOSCRATCH

  • SCRATCH

    • Physically removes the dataset from disk when it’s deleted.
    • Use this when you want to free up space immediately.
  • NOSCRATCH

    • Deletes the GDG entry from the catalog but leaves the dataset on disk (uncataloged).
    • Use this if you might need to recover or examine the old file manually later.

Example

With EMPTY SCRATCH:

  • Old datasets are wiped from both catalog and disk.

With NOEMPTY NOSCRATCH:

  • Oldest dataset is uncataloged but still resides on disk; you must delete it manually if you no longer need it.
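
If an existing base turns out to have the wrong combination, its attributes can be changed in place rather than redefined. The sketch below uses the IDCAMS ALTER command on a hypothetical base MY.DATA to switch it to a rolling window of three generations that frees disk space on roll-off; the step name and values are assumptions for illustration.

```jcl
//ALTGDG   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* Hypothetical base name; adjust LIMIT to your retention */
  ALTER MY.DATA -
        LIMIT(3) -
        NOEMPTY  -
        SCRATCH
/*
```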

Common Pitfalls & Tips

  • Pitfall: Expecting EMPTY to remove only one generation. It clears all when the limit is exceeded.
  • Pitfall: Assuming NOSCRATCH preserves catalog info—you must re-catalog if you need to re-use it.
  • Tip: Combine NOEMPTY SCRATCH for a rolling window that automatically frees disk space.
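  • Tip: Before running a job, verify your GDG definition with IDCAMS:
//LISTGDG  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  LISTCAT ENTRIES('YOUR.GDG.BASE') ALL
/*

This shows the base’s LIMIT, its EMPTY/NOEMPTY and SCRATCH/NOSCRATCH attributes, and the generations currently cataloged.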


Maximum Length Limits and Name Truncation Issues

GDG base names must leave room for the generation/version suffix (.GaaaaVxx), or allocation will fail with a “name too long” error. Here’s what you need to know:

z/OS Dataset Name Limits

  • Total length of a dataset name (including periods and all qualifiers) is 44 characters.
  • The GDG suffix .GaaaaVxx always takes 9 characters (including the period).

Calculating Your Base Name Length

  • Subtract 9 from 44 to get 35.
    • Max base name length = 35 characters.
  • If your base name is 36 or more, the system cannot append .G0001V00 without truncation.

Common Error

//DD1 DD DSN=VERY.LONG.DATASET.NAME.EXCEEDING.THE.LIMIT(+1),...
  • VERY.LONG.DATASET.NAME.EXCEEDING.THE.LIMIT is 42 characters, well over the 35-character maximum, so the job fails with an error reporting that the dataset name is too long (the exact message depends on where the name is rejected).
  • This happens before any dataset is created.

Best Practices for Naming

  • Count characters in your planned base name before coding JCL.
  • Use ISPF option 3.4 (DSLIST) or a simple text editor column count to verify.
  • If you need a long identifier, consider using an alias or shortened prefix.
  • Review existing naming standards—often sites define prefixes like PRD. or DEV. to stay within limits.
tip
  • A handy rule of thumb:
    • “Always keep your base name under 30 characters.”
    • This gives extra buffer space for qualifiers or unexpected suffixes.
  • By respecting the 44-character maximum and reserving 9 characters for the GDG suffix, you’ll avoid frustrating allocation errors and keep your JCL running smoothly.

GDG Cataloging Gotchas

GDG generations must be cataloged to be discoverable by JCL’s relative references (0), (-1), etc.

Tricky DISP Combinations

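A dataset created with DISP=(NEW,DELETE,DELETE) is allocated and then deleted when the step ends, so it is never cataloged. More generally, any disposition that does not result in the new generation being cataloged leaves the GDG unchanged.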

Effect: No catalog entry ⇒ no change in GDG count ⇒ (0) doesn’t move.

Misconception: (0) Always Advances

Many assume that once you run a job with (+1), the next (0) will point to that new generation.

Reality: If the new generation never makes it into the catalog, (0) remains stuck on the prior one—even across jobs.

Tip: Always use DISP=(NEW,CATLG,...) when you need the generation available later.

Manual Generation Deletion

If you remove a specific generation’s catalog entry (say you created G0001V00, G0002V00, and G0003V00 and then manually deleted MY.GDG.G0002V00), the system simply skips over it when resolving relative numbers.

What you’ll see next time:

  • (0) points to the highest remaining generation, G0003V00.
  • (-1) points to the next lower cataloged generation, G0001V00, because G0002V00 is gone.

Why it happens: Relative references are resolved against whatever is in the catalog. Deleting G0002V00 removes it from that list, so (-1) jumps straight to G0001V00.
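
To see this for yourself, a step along the following lines deletes one generation by its absolute name and then lists what is left under the base. It is only a sketch: the step name is made up, and MY.GDG with generation G0002V00 is the hypothetical example from above.

```jcl
//DELGEN   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  /* Remove the middle generation by its absolute name */
  DELETE MY.GDG.G0002V00
  /* List the base to confirm which generations remain */
  LISTCAT ENTRIES(MY.GDG) ALL
/*
```

The LISTCAT output should then show only G0001V00 and G0003V00 associated with the base, which is the list that (0) and (-1) are resolved against.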