Relational database management system fun. Who could ask for more? Really?
Here are some relational database notes in pdf and below in html, that explain more of the theoretical nature of relational dbs.
- All 13 of Codd’s 12 commandments
- Terminology, lingo and jargon
- Data types for all
- Primary keys and Foreign keys
- Normalization including First, Second and Third Normal Forms explained
- Why Queries are CRUD, and
- a simple way to design a data model
OK, so the notes are a little on the theoretical side, but each flavor of database is an implementation of the theoretical requirements–a reconciliation of the ideal with the real, the possible, warts and all. And that is why each SQL DBMS has slightly different SQL.
Codd’s 12 commandments
A relational database management system must manage its stored data using only its relational capabilities.
Rule 1 Information Rule
All information in the database should be represented in one and only one way – as values in a table, in a column.
Rule 2 Guaranteed Access Rule
Each and every datum (atomic value) is guaranteed to be logically accessible by resorting to a combination of table name, primary key value and column name.
Rule 3 Systematic Treatment of Null Values
Null values (distinct from empty character string or a string of blank characters and distinct from zero or any other number) are supported in the fully relational DBMS for representing missing information in a systematic way, independent of data type.
Rule 4 Dynamic On-line Catalog Based on the Relational Model
The database description is represented at the logical level in the same way as ordinary data, so authorized users can apply the same relational language to its interrogation as they apply to regular data. [Or the system must support an online, inline, relational catalog that is accessible to authorized users by means of their regular query language.] Database users should be able to access the meta data with the same query language that they access the data in the database.
Rule 5 Comprehensive Data Sublanguage Rule
A relational system may support several languages and various modes of terminal use. However, there must be at least one language whose statements are expressible, per some well-defined syntax, as character strings and whose ability to support all of the following is comprehensible:
The system must support at least one relational language that:
- Has linear syntax
- Can be used both interactively and within applications
- Supports data operations:
- Data definition operations (including data view definitions),
- Data manipulation operations (update as well as retrieval)
- Security and integrity constraints
- Transaction management operations (begin, commit, and rollback)
Rule 6 View Updating Rule *
All views that are theoretically updateable must be updateable by the system.
Rule 7 High-level Insert, Update, and Delete
The system must support set-at-a-time insert, update, and delete operators. Multiple rows and tables
The capability of handling a base relation or a derived relation as a single operand applies nor only to the retrieval of data but also to the insertion, update, and deletion of data.
Rule 8 Physical Data Independence
Changes to the physical level must not require a change to an application based on the structure.
Rule 9 Logical Data Independence *
Changes to the logical level must not require a change to an application based on the structure.
Rule 10 Integrity Independence *
Integrity constraints must be specified separately from application programs and stored in the catalog. [Validation rules that ensure the accuracy and consistency of the data should be stored in the meta data.]
Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs.
Rule 11 Distribution Independence *
The distribution of portions of the database to various locations should be invisible to users of the database.
The data manipulation sublanguage of a relational DBMS must enable application programs and terminal activities to remain logically unimpaired whether and whenever data are physically centralized or distributed.
Rule 12 Nonsubversion Rule *
If a relational system has or supports a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules or constraints expressed in the higher-level (multiple-records-at-a-time) relational language.
[Note that based on these rules there is no fully relational database management system available today. In particular, rules 6, 9, 10, 11 and 12 are difficult to satisfy.]
Relation/ Table: Every relation has two parts: 1) heading, 2) body
Heading has attributes (theory name) or columns (db name). Order does not matter. The number of attributes in a heading is the degree, so a heading with three columns has a degree of 3. Attributes have:
- Data type
Body, or rows, is made up of tuples. The number of tuples in the body (rows) is known as the cardinality.
Relvar: (relational variable) a set of results returned from a query.
A named, finite set of all possible values of a given kind
Integrity Rule 1: Domain integrity: We can only put data in our database if it is one of our predetermined data types.
- CHARACTER(n)/CHAR(n): character of length n, back end padded with spaces
- VARCHAR(n): variable length with no padded spaces
- INT: may have a few types depending on range
- FLOAT/ DOUBLE/REAL: for decimals
- DATE/TIME/TIMESTAMP: dates and times
- BOOLEAN: True/False
Data Atomicity: every field should only have a single value. Create a set of numbers for a list of numbers.
Null isn’t a value and does not belong to a type. It is a flag or marker saying the field is empty.
A primary key is a subset of a table heading (often a single attribute), with uniqueness.
Integrity Rule 2: Every table must have a primary key, which must be unique and must not be null.
Candidate key: a potential primary key.
Composite key: primary key made up of multiple attributes. Think of unique combinations of other tables, such as reviewer and restaurant in the review table
Natural key: use actual data as unique identifier, such as a unique name
Surrogate key: add an id number as a key, used a lot.
An attribute (or attributes) in one table that corresponds to the primary key of another table.
Think lookup such as an order table looking up customer information in customer table.
Foreign keys may point to other rows in the same table
Integrity Rule 3: We cannot have any unmatched foreign keys. The key points to nowhere problem.
Referential Integrity options:
- Cascade update/cascade delete: find all other related fields and either update or delete them
- Restrict: doesn’t allow deletions with dependencies involved
- Set Default: Dependencies are set to a default value
- Set Null: set all affected dependencies to null
The process of organizing your tables and their attributes to minimize redundancy and duplication in the database.
First Normal Form (1NF): a relation is in 1NF if the domain of each attribute contains only atomic values, and each value of each attribute contains only a single value from that domain. No multiple value cells. Make other tables out of them.
Second Normal Form (2NF): a relation is in 2NF only if it is in 1NF and no non-prime attribute is dependent on any proper subset of any candidate key of the table. Specific to composite primary keys. No attribute should be dependent of only part of a composite primary key. Think of splitting up tables.
Third Normal Form (3NF): A relation is in 3NF only if it is in 2NF and every non-prime attribute of the table is directly dependent on every superkey of the table. Every field is dependent on only the primary key and no other non-prime field. Think of splitting up tables all over the place.
- Create: Insert
- Read: Select
- Update: Update
- Delete: Delete
Views are virtual tables created by queries that can be treated as tables for more queries.
Add WITH CHECK OPTION to validate additional updates.
Client server architecture, with user permissions
Designing a Data Model
- Determine the purpose of the db.
- List all the potential fields
- Divide the fields into tables
- Give the fields names and data types
- Chose the Primary Key: Natural or Surrogate? Composite of Single?
- Determine the table relationships: 1 to 1, 1 to many, many to many?
- Refine the design: draw up a few on paper
- Apply the rules of normalization: get to 3NF
- Use entity-relationship diagram.
- Naming conventions: Tables get plural names, Attributes get single names