Skip to main content

Introduction to SPUFI in DB2

SPUFI (SQL Processor Using File Input) is a utility in IBM DB2 for z/OS that allows users to enter and execute SQL statements directly from an ISPF panel. Although commonly viewed as a basic SQL interface, SPUFI offers capabilities and limitations that are often missed in typical usage.

One of the most overlooked aspects of SPUFI is that, despite its name, it doesn't require file input—you can type SQL statements directly in the interface. It's primarily designed for developers and DBAs working in the TSO/ISPF environment, and it's ideal for ad hoc queries, testing, and even viewing catalog information. However, it’s not suited for production-grade SQL execution or for use with embedded SQL programs.

Useful Details

  • SPUFI accepts multiple SQL statements in one session, but each must be terminated with a semicolon (;) or another character defined by the user.
  • Unlike programmatic SQL environments, SPUFI does not support host variables, making it unsuitable for simulating COBOL or C applications.
  • The output format can be toggled between formatted (column-aligned) and unformatted (raw), which can greatly impact how readable your results are, especially when copying outputs into documents or reports.

Another key feature is the row limit setting, which restricts how many rows are displayed. This is often set too low by default (e.g., 100), causing users to mistakenly believe their query has returned fewer results. Raising this limit in the SPUFI setup panel helps ensure you're seeing the full output.

Temporary tables and changes to special registers like CURRENT SCHEMA exist only within the SPUFI session. This session-scoped behavior allows repeatable testing but means that no state is preserved afterward.

Finally, it’s worth noting that SPUFI does not perform syntax validation unless the SQL is actually executed. So a malformed SQL statement won’t trigger any warning until runtime.

Understanding SPUFI and able to execute will help in testing your queries well.

IBM Docs - SPUFI

Excellent — here is your full web-based handout for "How IBM Db2 Organizes Data: Logical and Physical Structures", structured in the long-term learning format:


DB2 Keys

Keys define how data in a table stays unique, linked, and valid. Db2 supports:

  • Primary Keys → Uniquely identify each row in a table.
  • Foreign Keys → Define relationships between tables.
  • Unique Keys → Prevent duplicate values in specific columns.

Primary Key

  • Uniquely identifies each row.
  • Must be NOT NULL.
  • Requires a manually created unique index in Db2 for z/OS.
-- Define a table with a Primary Key
CREATE TABLE HR.EMPLOYEES (
EMP_ID INT NOT NULL,
NAME VARCHAR(50),
PRIMARY KEY (EMP_ID)
);

-- You MUST define the unique index yourself
CREATE UNIQUE INDEX EMP_IDX ON HR.EMPLOYEES (EMP_ID);

Foreign Key

  • Creates a parent-child relationship between tables.
  • Refers to a primary or unique key in another table.
CREATE TABLE HR.PAYROLL (
PAY_ID INT PRIMARY KEY,
EMP_ID INT,
FOREIGN KEY (EMP_ID) REFERENCES HR.EMPLOYEES(EMP_ID)
);
  • You should create an index on foreign key columns for performance, though it’s not enforced.

Unique Key

  • Prevents duplicates in one or more columns.
  • Useful for business rules (like unique emails or SSNs).
ALTER TABLE HR.EMPLOYEES 
ADD CONSTRAINT EMP_UK_EMAIL UNIQUE (EMAIL);

IBM Docs - DB2 Keys

Constraint

Constraints enforce rules on the data in your tables to maintain accuracy, consistency, and reliability.

Db2 supports these types of constraints:

  • Entity Integrity → Ensured via Primary Keys
  • Referential Integrity → Enforced using Foreign Keys
  • Data Uniqueness → Guaranteed using Unique Constraints
  • Valid Values → Checked using Check Constraints
  • Mandatory Fields → Controlled with Not Null Constraints

IBM Docs - Constraints

Data types

A data type defines what kind of data a column can store: numbers, text, dates, etc. Choosing the correct type improves storage efficiency, ensures data accuracy, and boosts query performance.

IBM Docs - String

IBM Docs - Numeric

IBM Docs - Data time

🔢 Numeric Data Types

INTEGER – Whole Numbers

  • Stores integers (no decimal places).
  • Uses 4 bytes of storage.
  • Ideal for: IDs, counters, ages, quantities.
CREATE TABLE Employees (
EmployeeID INTEGER PRIMARY KEY,
Age INTEGER
);
example

✅ Use when you’re storing whole numbers only.

DECIMAL(p, s) – Fixed-Precision Numbers

  • Used for financial or high-precision numeric values.
  • p = precision (total digits), s = scale (digits after decimal).
  • Accurate and prevents rounding errors.
CREATE TABLE Accounts (
AccountID INTEGER PRIMARY KEY,
Balance DECIMAL(10, 2)
);
example

💡 Balance allows values like 12345678.90 — 10 digits total, 2 after the decimal.


🔤 Character Data Types

CHAR(n) – Fixed-Length Strings

  • Reserves exactly n characters.
  • Pads with spaces if the string is shorter.
  • Good for: country codes, fixed IDs, and status flags.
CREATE TABLE Countries (
CountryCode CHAR(2) PRIMARY KEY,
CountryName CHAR(50)
);
example

🛠 Use for data that’s always the same length.

VARCHAR(n) – Variable-Length Strings

  • Stores strings up to n characters, uses only what’s needed.
  • Reduces storage overhead for inconsistent text lengths.
  • Best for: names, emails, addresses.
CREATE TABLE Customers (
CustomerID INTEGER PRIMARY KEY,
Name VARCHAR(100),
Email VARCHAR(100)
);
example

🧠 Efficient for text with variable lengths.


📅 Date and Time Data Types

📆 DATE – Calendar Dates Only

  • Stores dates like YYYY-MM-DD.
  • Used for birthdates, deadlines, event dates.
CREATE TABLE Events (
EventID INTEGER PRIMARY KEY,
EventDate DATE
);
example

🎯 Perfect when time isn’t relevant.

TIMESTAMP – Date + Time

  • Captures full datetime: YYYY-MM-DD HH:MM:SS.
  • Records exact moments — useful for audit trails.
CREATE TABLE Orders (
OrderID INTEGER PRIMARY KEY,
OrderTimestamp TIMESTAMP
);
example

🔍 Use for tracking creation or modification times.


✅ Quick Reference Table

Data TypePurposeKey Point
INTEGERWhole numbers only4 bytes, good for IDs, counts
DECIMAL(p,s)Precise numeric valuesIdeal for money
CHAR(n)Fixed-length textPadded with spaces if short
VARCHAR(n)Variable-length textSaves space, flexible input
DATEDate without timeFormat: YYYY-MM-DD
TIMESTAMPDate with timeFormat: YYYY-MM-DD HH:MM:SS

🏗️ How IBM Db2 Organizes Data: Logical and Physical Structures

Db2 uses a layered architecture to manage memory, storage, and logical database objects. This guide breaks each concept into real-world understanding, code examples, lab exercises, and gotchas.

IBM Docs - Defining DB2 Objects

IBM Docs - SQL DDL Statement


📦 Buffer Pool (Handled by DBA)

A buffer pool is a section of memory that caches recently accessed data pages. It helps avoid reading from disk repeatedly, speeding up performance.

IBM Docs - Importance of Buffer pool

  • Real-World Usage
    • Used to optimize I/O performance. DBAs assign different buffer pools to tablespaces based on how frequently data is accessed.

Syntax:

CREATE BUFFERPOOL BP1 SIZE 1000 PAGESIZE 8K;

Above command will create buffer pool BP1 with 1000 pages, each 8KB in size.

note

Generally DBA / DB2 Admins only will have access to create Buffer pools

tip
  • Page size must match the tablespace.
  • Buffer pools consume real memory.
  • Use only needed pools to avoid memory exhaustion.

💽 Storage Group (STOGROUP) (Handled by DBA)

A storage group is a collection of disk volumes where Db2 physically stores data. It lets DBAs control where tablespaces and data reside.

IBM Docs - Storage group

  • Real-World Usage
    • Used to group specific disk volumes for high-speed or low-cost storage tiers.

Syntax:

CREATE STOGROUP STG1 VOLUMES ('VOL1', 'VOL2') VCAT MYCAT;
note

Generally DBA / DB2 Admins only will have access to create Storage groups.

tip
  • VCAT refers to the catalog that indexes the volumes.
  • Ensure volumes exist and are assigned properly.
  • STOGROUP is required for tablespace creation.

🗃️ Database

The database is the top-level container that holds all schemas, tables, indexes, and views.

IBM Docs - Database

  • Real-World Usage
    • In Db2/zOS, databases are mandatory and allow grouping of related objects for performance and administration.

Syntax:

CREATE DATABASE DB1 
BUFFERPOOL BP1
INDEXBP IDXBP1
STOGROUP STG1
CCSID EBCDIC;
note

Generally DBA only will have access to create database

tip
  • CCSID (character set) must match your system/application.

📁 Tablespace (Handled by DBA)

A tablespace maps logical tables and indexes to physical storage. It connects storage groups with buffer pools.

IBM Docs - Tablespace

  • Real-World Usage
    • DBAs use different tablespaces for different workloads — OLTP, reporting, archive — for better control and performance.

Syntax:

CREATE TABLESPACE TS1 
IN DB1
USING STOGROUP STG1
BUFFERPOOL BP1;

Try It Yourself

  • Try to create a tablespace after getting STOGROUP and BUFFERPOOL from DBA (See if you have access)

  • Query it:

    SELECT * FROM SYSIBM.SYSTABLESPACE WHERE NAME = 'TS1';
tip
  • Cannot use a buffer pool with a mismatched page size.
  • Always link to a valid storage group.

🧾 Schema

A schema organizes related database objects like tables, views, and indexes. It prevents name conflicts.

IBM Docs - Schema

  • Real-World Usage
    • Each developer or team can use their own schema to avoid clashing over table names.

Syntax:

SELECT * FROM HR.EMPLOYEES;

Here, HR is the schema qualifier.

Try It Yourself

  • Try selecting from a table with and without schema:
tip
  • Default schema = your TSO ID.
  • You can’t access other schemas without authorization (SQLCODE -551).

📊 Table

A table holds user data in rows and columns. Each column has a type like VARCHAR, INT, etc.

IBM Docs - Table

  • Real-World Usage
    • Tables are the foundation of all business data: employee records, transactions, logs, etc.

Syntax:

CREATE TABLE EMPLOYEES (
EMP_ID INT PRIMARY KEY,
NAME VARCHAR(100),
SALARY DECIMAL(10,2)
) IN DB1.TS1;

Try It Yourself

  • Create your own table.

  • Then run:

    SELECT * FROM <Your Schema>.EMPLOYEES;
tip
  • You must create a UNIQUE INDEX for primary keys.
  • Db2 does not auto-create indexes like Oracle/PostgreSQL.

📈 Index

An index improves query performance by allowing faster row lookup based on key values.

  • Real-World Usage
    • Indexes are essential on large tables with WHERE clauses or JOIN conditions.

Syntax:

CREATE UNIQUE INDEX EMP_IDX ON HR.EMPLOYEES (EMP_ID) 
USING STOGROUP STG1;

Try It Yourself

  • Create both a unique and non-unique index.
  • Try running a query using the indexed column.
tip
  • Unique indexes are required for primary keys.
  • Only one clustering index per table.
  • Indexes consume disk space.

👓 View

A view is a saved SQL query that acts like a virtual table.

IBM Docs - VIEW

  • Real-World Usage
    • Used to simplify complex joins or hide sensitive columns.

Syntax:

CREATE VIEW HIGH_EARNERS AS
SELECT NAME, SALARY FROM HR.EMPLOYEES WHERE SALARY > 50000;

Try It Yourself

  • Create a view using a real table.
  • Query the view created previously
tip
  • Views do not store data.
  • Cannot use indexes directly on views.

🔢 Sequences

A sequence generates a unique series of numbers, often used for primary keys.

IBM Docs - Sequences

  • Real-World Usage
    • Used when inserting new rows and you need a reliable unique ID.

Syntax:

CREATE SEQUENCE EMP_SEQ START WITH 1 INCREMENT BY 1;

Try It Yourself

  • Insert the row in the EMP table using the Sequence EMP_SEQ for EMP_ID column.
tip
  • You must manually fetch sequence value during insert.
  • Use IDENTITY column to automate this.

🆔 Identity Column

An identity column auto-generates a number when inserting a row — simplifying primary key creation.

IBM Docs - Identity Column

  • Real-World Usage
    • Used when you don’t want to manage sequences manually.

Syntax:

CREATE TABLE EMP_AUTO (
EMP_ID INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
NAME VARCHAR(100)
);

Try It Yourself

  • Insert the row in EMP_AUTO table with name Alice.
  • Try to insert the row providing both EMP_ID and NAME.

🔁 Triggers

A trigger is a procedure that runs automatically on INSERT, UPDATE, or DELETE.

IBM Docs - Trigger

  • Real-World Usage
    • Used to enforce rules or do background updates.

Syntax:

CREATE TRIGGER CHECK_SALARY
BEFORE INSERT OR UPDATE ON HR.EMPLOYEES
REFERENCING NEW AS N
FOR EACH ROW
WHEN (N.SALARY < 30000)
SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Salary too low!';

Try It Yourself

  • Insert a row with low salary and watch the trigger fire.
  • Then insert a valid row.
tip
  • Triggers fire row-by-row — may affect performance.
  • Debugging triggers can be tricky.
  • Avoid recursive logic unless carefully handled.

DB2 Catalog

Db2 catalog tables store metadata about your database objects — tables, indexes, schemas, buffer pools, and more. They act as a dictionary or directory that Db2 uses to manage and locate objects.

Catalog tables help DBAs and developers understand what objects exist, their definitions, storage info, and security settings. They are essential for troubleshooting, auditing, and performance tuning.

SELECT TABNAME, CREATOR, TYPE 
FROM SYSIBM.SYSTABLES
WHERE CREATOR = 'HR';

Important Db2 Catalog tables are:

Catalog TableDescription
SYSIBM.SYSTABLESInformation about tables and views
SYSIBM.SYSCOLUMNSDetails of columns in tables
SYSIBM.SYSINDEXESIndex information
SYSIBM.SYSTABLESPACETablespace metadata
SYSIBM.SYSBUFFERPOOLBuffer pool info
SYSIBM.SYSSTOGROUPStorage groups
SYSIBM.SYSSCHEMASSchemas and their owners
SYSIBM.SYSSEQUENCESSequence objects info
SYSIBM.SYSTRIGGERSTrigger definitions

IBM Docs - Catalog