Introduction to Files in Mainframe

If you’re coming from a PC or Unix background, you’re used to working with "files." On the mainframe, the equivalent object is the data set, and it is organized differently. This handout explains what a data set is, how it differs from a regular file, and why that difference matters when working with z/OS.

What Is a Data Set?

  • A data set is the mainframe equivalent of a file—but structured to hold records.
  • A record is like a line or entry of structured data—similar to a row in a spreadsheet or a line in a log file.
  • Records can be fixed-length or variable-length, and programs read or write them one record at a time.
  • Some data sets store text (e.g., source code), while others store binary data (e.g., executable load modules).
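
To make the idea of a record concrete, here is a minimal Python sketch that splits a buffer of fixed-length records into individual records. It does not run on z/OS or call any mainframe API; the function name `split_fixed_records`, the `lrecl` parameter name, and the sample data are assumptions made for illustration only.

```python
def split_fixed_records(data: bytes, lrecl: int) -> list[bytes]:
    """Split a byte buffer into fixed-length records of lrecl bytes each."""
    if lrecl <= 0:
        raise ValueError("lrecl must be positive")
    if len(data) % lrecl != 0:
        raise ValueError("buffer length is not a multiple of the record length")
    # Each record starts at a multiple of lrecl; no delimiter is needed.
    return [data[i:i + lrecl] for i in range(0, len(data), lrecl)]


# Three 8-byte records stored back to back, with no newline between them.
buffer = b"REC00001REC00002REC00003"
records = split_fixed_records(buffer, lrecl=8)
print(records)
```

Because every record has the same length, a program can find record boundaries by arithmetic alone, which is why fixed-length data sets need no line-ending characters.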

Key Differences

Concept | PC/Unix File | Mainframe Data Set (z/OS)
Basic unit | File | Data set
Structure | Byte stream | Record-based
File types | .txt, .exe, .csv, etc. | SEQ, PDS, PDSE, VSAM, etc.
Storage method | Stored in folders | Managed by z/OS, often cataloged
Access method | Byte-based (e.g., stream) | Record-based (fixed or variable)
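
The byte-stream versus record-based distinction can be sketched in Python. A Unix text file marks line ends with a delimiter byte, whereas a z/OS variable-length record carries its own length in a 4-byte record descriptor word (RDW) at the front. The sketch below assumes a buffer in which each record is prefixed by its RDW and ignores block descriptor words; the helper names `build_vb_record` and `split_variable_records` are hypothetical.

```python
import struct


def build_vb_record(payload: bytes) -> bytes:
    """Prefix a payload with an RDW: a 2-byte big-endian length
    (payload plus the 4-byte RDW itself) followed by 2 zero bytes."""
    return struct.pack(">HH", len(payload) + 4, 0) + payload


def split_variable_records(data: bytes) -> list[bytes]:
    """Walk a buffer of RDW-prefixed records and return the payloads."""
    records = []
    pos = 0
    while pos < len(data):
        if pos + 4 > len(data):
            raise ValueError("truncated record descriptor word")
        length, _reserved = struct.unpack_from(">HH", data, pos)
        if length < 4 or pos + length > len(data):
            raise ValueError("corrupt or truncated record")
        records.append(data[pos + 4:pos + length])  # skip the RDW
        pos += length
    return records


# Records of different lengths, including an empty one.
buffer = b"".join(build_vb_record(p) for p in (b"A", b"LONGER", b""))
payloads = split_variable_records(buffer)
print(payloads)
```

Here the reader never scans for a newline; the length prefix says exactly where each record ends, so a record may contain any byte value.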

Cataloged vs. Non-Cataloged

Term | What It Means | Why It Matters
Cataloged | z/OS knows where the data set is stored | You can access it using just its name
Non-Cataloged | You need volume and device info | Harder to locate manually

Think of the catalog like a search index on your PC—it tells the system where each data set is. In z/OS, the catalog is a core system feature that makes it easy for programs, JCL, and utilities to locate datasets without needing physical storage details.

For a deeper dive, see: Catalog Handout

Ways to Manage Data Sets in Mainframe

You can work with data sets in several ways, depending on whether you're working interactively or through automation:

  • TSO Commands: Command-line interface for allocating, deleting, or copying data sets in real time (e.g., with ALLOCATE and DELETE).
  • JCL (Job Control Language): Batch method where you define data set actions in a job script and submit it for processing; ideal for repeated or scheduled tasks.
  • ISPF Panels: Menu-driven, interactive environment where you can browse, edit, or manage data sets without typing full commands; user-friendly for beginners.

Each method has its use: TSO is quick for ad hoc work, JCL is best for automation, and ISPF is great for hands-on exploration. We have separate handouts for each of them.

Common Misconceptions

Misconception | Reality
“All files are readable” | Load modules (executables) aren’t meant to be opened
“Data sets are folders like on Windows” | Only PDS/PDSE have members; other datasets don't
“Catalog = Folder” | Catalog = Index; it helps find the data, not hold it
“You can read records like lines in Notepad” | Only if it's a text-based dataset with readable encoding
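
The last point about readable encoding can be illustrated in Python. Text on z/OS is commonly stored in an EBCDIC code page rather than ASCII, so the same bytes look like gibberish when interpreted with a PC encoding. This sketch assumes code page 037 (Python's `cp037` codec); which EBCDIC code page a given system actually uses is an assumption here.

```python
# The five bytes below spell "HELLO" in EBCDIC code page 037.
record = b"\xc8\xc5\xd3\xd3\xd6"

as_ebcdic = record.decode("cp037")    # interpreted with the right encoding
as_latin1 = record.decode("latin-1")  # interpreted as a PC encoding

print(as_ebcdic)
print(as_latin1)
```

The bytes are identical in both cases; only the decoding differs, which is why a text data set downloaded in binary mode often appears unreadable in a PC editor.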