DB2 and SQL Concepts | Tiny Sparks Guru

Introduction to SPUFI in DB2

SPUFI (SQL Processor Using File Input) is a utility in IBM DB2 for z/OS that allows users to enter and execute SQL statements directly from an ISPF panel. Although commonly viewed as a basic SQL interface, SPUFI offers capabilities and limitations that are often missed in typical usage.

One of the most overlooked aspects of SPUFI is that, despite its name, it doesn't require file input—you can type SQL statements directly in the interface. It's primarily designed for developers and DBAs working in the TSO/ISPF environment, and it's ideal for ad hoc queries, testing, and even viewing catalog information. However, it’s not suited for production-grade SQL execution or for use with embedded SQL programs.

Useful Details

SPUFI accepts multiple SQL statements in one session, but each must be terminated with a semicolon (;) or another character defined by the user.
Unlike programmatic SQL environments, SPUFI does not support host variables, making it unsuitable for simulating COBOL or C applications.
The output format can be toggled between formatted (column-aligned) and unformatted (raw), which can greatly impact how readable your results are, especially when copying outputs into documents or reports.

Another key feature is the row limit setting, which restricts how many rows are displayed. This is often set too low by default (e.g., 100), causing users to mistakenly believe their query has returned fewer results. Raising this limit in the SPUFI setup panel helps ensure you're seeing the full output.

Temporary tables and changes to special registers like CURRENT SCHEMA exist only within the SPUFI session. This session-scoped behavior allows repeatable testing but means that no state is preserved afterward.

Finally, it’s worth noting that SPUFI does not perform syntax validation unless the SQL is actually executed. So a malformed SQL statement won’t trigger any warning until runtime.

Understanding SPUFI and able to execute will help in testing your queries well.

IBM Docs - SPUFI

Excellent — here is your full web-based handout for "How IBM Db2 Organizes Data: Logical and Physical Structures", structured in the long-term learning format:

DB2 Keys

Keys define how data in a table stays unique, linked, and valid. Db2 supports:

Primary Keys → Uniquely identify each row in a table.
Foreign Keys → Define relationships between tables.
Unique Keys → Prevent duplicate values in specific columns.

Primary Key

Uniquely identifies each row.
Must be NOT NULL.
Requires a manually created unique index in Db2 for z/OS.

-- Define a table with a Primary Key
CREATE TABLE HR.EMPLOYEES (
  EMP_ID INT NOT NULL,
  NAME VARCHAR(50),
  PRIMARY KEY (EMP_ID)
);

-- You MUST define the unique index yourself
CREATE UNIQUE INDEX EMP_IDX ON HR.EMPLOYEES (EMP_ID);

Foreign Key

Creates a parent-child relationship between tables.
Refers to a primary or unique key in another table.

CREATE TABLE HR.PAYROLL (
  PAY_ID INT PRIMARY KEY,
  EMP_ID INT,
  FOREIGN KEY (EMP_ID) REFERENCES HR.EMPLOYEES(EMP_ID)
);

You should create an index on foreign key columns for performance, though it’s not enforced.

Unique Key

Prevents duplicates in one or more columns.
Useful for business rules (like unique emails or SSNs).

ALTER TABLE HR.EMPLOYEES 
ADD CONSTRAINT EMP_UK_EMAIL UNIQUE (EMAIL);

IBM Docs - DB2 Keys

Constraint

Constraints enforce rules on the data in your tables to maintain accuracy, consistency, and reliability.

Db2 supports these types of constraints:

Entity Integrity → Ensured via Primary Keys
Referential Integrity → Enforced using Foreign Keys
Data Uniqueness → Guaranteed using Unique Constraints
Valid Values → Checked using Check Constraints
Mandatory Fields → Controlled with Not Null Constraints

IBM Docs - Constraints

Data types

A data type defines what kind of data a column can store: numbers, text, dates, etc. Choosing the correct type improves storage efficiency, ensures data accuracy, and boosts query performance.

IBM Docs - String

IBM Docs - Numeric

IBM Docs - Data time

🔢 Numeric Data Types

`INTEGER` – Whole Numbers

Stores integers (no decimal places).
Uses 4 bytes of storage.
Ideal for: IDs, counters, ages, quantities.

CREATE TABLE Employees (
  EmployeeID INTEGER PRIMARY KEY,
  Age INTEGER
);

example

✅ Use when you’re storing whole numbers only.

`DECIMAL(p, s)` – Fixed-Precision Numbers

Used for financial or high-precision numeric values.
p = precision (total digits), s = scale (digits after decimal).
Accurate and prevents rounding errors.

CREATE TABLE Accounts (
  AccountID INTEGER PRIMARY KEY,
  Balance DECIMAL(10, 2)
);

example

💡 Balance allows values like 12345678.90 — 10 digits total, 2 after the decimal.

🔤 Character Data Types

`CHAR(n)` – Fixed-Length Strings

Reserves exactly n characters.
Pads with spaces if the string is shorter.
Good for: country codes, fixed IDs, and status flags.

CREATE TABLE Countries (
  CountryCode CHAR(2) PRIMARY KEY,
  CountryName CHAR(50)
);

example

🛠 Use for data that’s always the same length.

`VARCHAR(n)` – Variable-Length Strings

Stores strings up to n characters, uses only what’s needed.
Reduces storage overhead for inconsistent text lengths.
Best for: names, emails, addresses.

CREATE TABLE Customers (
  CustomerID INTEGER PRIMARY KEY,
  Name VARCHAR(100),
  Email VARCHAR(100)
);

example

🧠 Efficient for text with variable lengths.

📅 Date and Time Data Types

📆 `DATE` – Calendar Dates Only

Stores dates like YYYY-MM-DD.
Used for birthdates, deadlines, event dates.

CREATE TABLE Events (
  EventID INTEGER PRIMARY KEY,
  EventDate DATE
);

example

🎯 Perfect when time isn’t relevant.

⏰ `TIMESTAMP` – Date + Time

Captures full datetime: YYYY-MM-DD HH:MM:SS.
Records exact moments — useful for audit trails.

CREATE TABLE Orders (
  OrderID INTEGER PRIMARY KEY,
  OrderTimestamp TIMESTAMP
);

example

🔍 Use for tracking creation or modification times.

✅ Quick Reference Table

Data Type	Purpose	Key Point
`INTEGER`	Whole numbers only	4 bytes, good for IDs, counts
`DECIMAL(p,s)`	Precise numeric values	Ideal for money
`CHAR(n)`	Fixed-length text	Padded with spaces if short
`VARCHAR(n)`	Variable-length text	Saves space, flexible input
`DATE`	Date without time	Format: `YYYY-MM-DD`
`TIMESTAMP`	Date with time	Format: `YYYY-MM-DD HH:MM:SS`

🏗️ How IBM Db2 Organizes Data: Logical and Physical Structures

Db2 uses a layered architecture to manage memory, storage, and logical database objects. This guide breaks each concept into real-world understanding, code examples, lab exercises, and gotchas.

IBM Docs - Defining DB2 Objects

IBM Docs - SQL DDL Statement

📦 Buffer Pool (Handled by DBA)

A buffer pool is a section of memory that caches recently accessed data pages. It helps avoid reading from disk repeatedly, speeding up performance.

IBM Docs - Importance of Buffer pool

Real-World Usage
- Used to optimize I/O performance. DBAs assign different buffer pools to tablespaces based on how frequently data is accessed.

Syntax:

CREATE BUFFERPOOL BP1 SIZE 1000 PAGESIZE 8K;

Above command will create buffer pool BP1 with 1000 pages, each 8KB in size.

note

Generally DBA / DB2 Admins only will have access to create Buffer pools

tip

Page size must match the tablespace.
Buffer pools consume real memory.
Use only needed pools to avoid memory exhaustion.

💽 Storage Group (STOGROUP) (Handled by DBA)

A storage group is a collection of disk volumes where Db2 physically stores data. It lets DBAs control where tablespaces and data reside.

IBM Docs - Storage group

Real-World Usage
- Used to group specific disk volumes for high-speed or low-cost storage tiers.

Syntax:

CREATE STOGROUP STG1 VOLUMES ('VOL1', 'VOL2') VCAT MYCAT;

note

Generally DBA / DB2 Admins only will have access to create Storage groups.

tip

VCAT refers to the catalog that indexes the volumes.
Ensure volumes exist and are assigned properly.
STOGROUP is required for tablespace creation.

🗃️ Database

The database is the top-level container that holds all schemas, tables, indexes, and views.

IBM Docs - Database

Real-World Usage
- In Db2/zOS, databases are mandatory and allow grouping of related objects for performance and administration.

Syntax:

CREATE DATABASE DB1 
  BUFFERPOOL BP1 
  INDEXBP IDXBP1 
  STOGROUP STG1 
  CCSID EBCDIC;

note

Generally DBA only will have access to create database

tip

CCSID (character set) must match your system/application.

📁 Tablespace (Handled by DBA)

A tablespace maps logical tables and indexes to physical storage. It connects storage groups with buffer pools.

IBM Docs - Tablespace

Real-World Usage
- DBAs use different tablespaces for different workloads — OLTP, reporting, archive — for better control and performance.

Syntax:

CREATE TABLESPACE TS1 
  IN DB1 
  USING STOGROUP STG1 
  BUFFERPOOL BP1;

Try It Yourself

Try to create a tablespace after getting STOGROUP and BUFFERPOOL from DBA (See if you have access)

Query it:

SELECT * FROM SYSIBM.SYSTABLESPACE WHERE NAME = 'TS1';

tip

Cannot use a buffer pool with a mismatched page size.
Always link to a valid storage group.

🧾 Schema

A schema organizes related database objects like tables, views, and indexes. It prevents name conflicts.

IBM Docs - Schema

Real-World Usage
- Each developer or team can use their own schema to avoid clashing over table names.

Syntax:

SELECT * FROM HR.EMPLOYEES;

Here, HR is the schema qualifier.

Try It Yourself

Try selecting from a table with and without schema:

tip

Default schema = your TSO ID.
You can’t access other schemas without authorization (SQLCODE -551).

📊 Table

A table holds user data in rows and columns. Each column has a type like VARCHAR, INT, etc.

IBM Docs - Table

Real-World Usage
- Tables are the foundation of all business data: employee records, transactions, logs, etc.

Syntax:

CREATE TABLE EMPLOYEES (
  EMP_ID INT PRIMARY KEY,
  NAME VARCHAR(100),
  SALARY DECIMAL(10,2)
) IN DB1.TS1;

Try It Yourself

Create your own table.
Then run:
```
SELECT * FROM <Your Schema>.EMPLOYEES;
```

tip

You must create a UNIQUE INDEX for primary keys.
Db2 does not auto-create indexes like Oracle/PostgreSQL.

📈 Index

An index improves query performance by allowing faster row lookup based on key values.

Real-World Usage
- Indexes are essential on large tables with WHERE clauses or JOIN conditions.

Syntax:

CREATE UNIQUE INDEX EMP_IDX ON HR.EMPLOYEES (EMP_ID) 
USING STOGROUP STG1;

Try It Yourself

Create both a unique and non-unique index.
Try running a query using the indexed column.

tip

Unique indexes are required for primary keys.
Only one clustering index per table.
Indexes consume disk space.

👓 View

A view is a saved SQL query that acts like a virtual table.

IBM Docs - VIEW

Real-World Usage
- Used to simplify complex joins or hide sensitive columns.

Syntax:

CREATE VIEW HIGH_EARNERS AS
SELECT NAME, SALARY FROM HR.EMPLOYEES WHERE SALARY > 50000;

Try It Yourself

Create a view using a real table.
Query the view created previously

tip

Views do not store data.
Cannot use indexes directly on views.

🔢 Sequences

A sequence generates a unique series of numbers, often used for primary keys.

IBM Docs - Sequences

Real-World Usage
- Used when inserting new rows and you need a reliable unique ID.

Syntax:

CREATE SEQUENCE EMP_SEQ START WITH 1 INCREMENT BY 1;

Try It Yourself

Insert the row in the EMP table using the Sequence EMP_SEQ for EMP_ID column.

tip

You must manually fetch sequence value during insert.
Use IDENTITY column to automate this.

🆔 Identity Column

An identity column auto-generates a number when inserting a row — simplifying primary key creation.

IBM Docs - Identity Column

Real-World Usage
- Used when you don’t want to manage sequences manually.

Syntax:

CREATE TABLE EMP_AUTO (
  EMP_ID INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  NAME VARCHAR(100)
);

Try It Yourself

Insert the row in EMP_AUTO table with name Alice.
Try to insert the row providing both EMP_ID and NAME.

🔁 Triggers

A trigger is a procedure that runs automatically on INSERT, UPDATE, or DELETE.

IBM Docs - Trigger

Real-World Usage
- Used to enforce rules or do background updates.

Syntax:

CREATE TRIGGER CHECK_SALARY
BEFORE INSERT OR UPDATE ON HR.EMPLOYEES
REFERENCING NEW AS N
FOR EACH ROW
WHEN (N.SALARY < 30000)
SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'Salary too low!';

Try It Yourself

Insert a row with low salary and watch the trigger fire.
Then insert a valid row.

tip

Triggers fire row-by-row — may affect performance.
Debugging triggers can be tricky.
Avoid recursive logic unless carefully handled.

DB2 Catalog

Db2 catalog tables store metadata about your database objects — tables, indexes, schemas, buffer pools, and more. They act as a dictionary or directory that Db2 uses to manage and locate objects.

Catalog tables help DBAs and developers understand what objects exist, their definitions, storage info, and security settings. They are essential for troubleshooting, auditing, and performance tuning.

SELECT TABNAME, CREATOR, TYPE 
FROM SYSIBM.SYSTABLES 
WHERE CREATOR = 'HR';

Important Db2 Catalog tables are:

Catalog Table	Description
`SYSIBM.SYSTABLES`	Information about tables and views
`SYSIBM.SYSCOLUMNS`	Details of columns in tables
`SYSIBM.SYSINDEXES`	Index information
`SYSIBM.SYSTABLESPACE`	Tablespace metadata
`SYSIBM.SYSBUFFERPOOL`	Buffer pool info
`SYSIBM.SYSSTOGROUP`	Storage groups
`SYSIBM.SYSSCHEMAS`	Schemas and their owners
`SYSIBM.SYSSEQUENCES`	Sequence objects info
`SYSIBM.SYSTRIGGERS`	Trigger definitions

IBM Docs - Catalog

Introduction to SPUFI in DB2​

Useful Details​

DB2 Keys​

Primary Key​

Foreign Key​

Unique Key​

Constraint​

Data types​

🔢 Numeric Data Types​

INTEGER – Whole Numbers​

DECIMAL(p, s) – Fixed-Precision Numbers​

🔤 Character Data Types​

CHAR(n) – Fixed-Length Strings​

VARCHAR(n) – Variable-Length Strings​

📅 Date and Time Data Types​

📆 DATE – Calendar Dates Only​

⏰ TIMESTAMP – Date + Time​

✅ Quick Reference Table​

🏗️ How IBM Db2 Organizes Data: Logical and Physical Structures​

📦 Buffer Pool (Handled by DBA)​

💽 Storage Group (STOGROUP) (Handled by DBA)​

🗃️ Database​

📁 Tablespace (Handled by DBA)​

🧾 Schema​

📊 Table​

📈 Index​

👓 View​

🔢 Sequences​

🆔 Identity Column​

🔁 Triggers​

DB2 Catalog​

Introduction to SPUFI in DB2

Useful Details

DB2 Keys

Primary Key

Foreign Key

Unique Key

Constraint

Data types

🔢 Numeric Data Types

`INTEGER` – Whole Numbers

`DECIMAL(p, s)` – Fixed-Precision Numbers

🔤 Character Data Types

`CHAR(n)` – Fixed-Length Strings

`VARCHAR(n)` – Variable-Length Strings

📅 Date and Time Data Types

📆 `DATE` – Calendar Dates Only

⏰ `TIMESTAMP` – Date + Time

✅ Quick Reference Table

🏗️ How IBM Db2 Organizes Data: Logical and Physical Structures

📦 Buffer Pool (Handled by DBA)

💽 Storage Group (STOGROUP) (Handled by DBA)

🗃️ Database

📁 Tablespace (Handled by DBA)

🧾 Schema

📊 Table

📈 Index

👓 View

🔢 Sequences

🆔 Identity Column

🔁 Triggers

DB2 Catalog