Normal Forms and All That Jazz :
A Database Professional's Guide to the Theory of Database Design
a Chris Date Master Class
ABOUT THIS SEMINAR
How many of these questions can you answer?
1. What's the difference between 3NF and BCNF?As you can see, these questions all have to do with normalization and normal forms. Normal forms are important, of course, but there's much more to database design theory than just normal forms as such. Here are some more questions:
2. Is it true that if a table has just two columns, then it's in 4NF?
3. Is it true that if a table has just one key and just one other column, then it's in 5NF?
4. Is it true that if a table is in BCNF but not 5NF, then it must be all key?
5. Is it true that 5NF tables are redundancy free?
6. What precisely is denormalization?
7. What's Heath's Theorem, and why is it important?All of these questions have to do with design theory. Design theory is the scientific foundation for database design, just as the relational model is the scientific foundation for database technology in general. And just as anyone professionally involved in database technology in general needs to be familiar with the relational model, so anyone involved in database design in particular needs to be familiar with design theory. But design theory has its problems ... and one of those problems, from the practitioner's point of view at any rate, is that it's riddled with terms and concepts that are difficult to understand and don't seem to have much to do with design as actually done in practice. Now, nobody could claim designing databases is easy; but a sound knowledge of the theory can only help. In fact, if you want to do design properly- if you want to build databases that are robust and flexible and accurate- then you really have to come to grips with that theory. There's just no alternative: at least, not if you want to claim to be a design professional. Proper design is so important! (After all, the database lies at the heart of much of what we do in the computing world; so if the database is badly designed, the negative impacts can be extraordinarily widespread.) Attend this seminar, then, and learn the answers to questions like those above, as well as much, much more. To be specific, the seminar will:
8. What's The Principle of Orthogonal Design?
9. What makes some JDs reducible and others irreducible?
10. What's dependency preservation, and why is it important?
11. Should data redundancy always be avoided? Can it be?
12. What's the chase?
* Review, but from an unfamiliar perspective, aspects of design you should already be familiar withOverall, the intent is to serve as a painless introduction to design theory for database professionals.
* Explore in depth aspects you're probably not already familiar with
* Provide clear and accurate explanations and definitions of all pertinent concepts
* Not spend a lot of time on well-known material such as 2NF and 3NF
TOPICS OUTLINE
1. PreliminariesDURATION2. Normalization
- Terminology
- The running example
- The place of design theory
- Relations
- Relvars
- Predicates
- Propositions
3. FDs and BCNF
- Objectives
- Update anomalies
- The normal form hierarchy
- Constraints
4. Preserving FDs
- First normal form
- Functional dependencies
- Second and third normal form
- Boyce/Codd normal form
- Heath’s Theorem
- Armstrong’s axioms
- Additional rules
5. Denormalization
- An unfortunate conflict
- A 3NF algorithm
- A BCNF algorithm
- Boyce/Codd normal form
- Independent projections
6. JDs and 5NF
- Arguments for denormalization
- What denormalization isn’t
- Denormalization considered harmful
7. Implicit dependencies
- Join dependencies
- Fifth normal form
- Cyclic rules
- A useful theorem
- FDs aren’t JDs
- Update anomalies revisited
8. MVDs and 4NF
- Irrelevant components
- Combining components
- Irreducible JDs
- The chase algorithm
9. Further normal forms
- Multivalued dependencies
- Fourth normal form
- Axiomatization
- Embedded dependencies
10. Orthogonality
- Equality dependencies
- Sixth normal form
- Superkey normal form
- Redundancy free normal form
- Essential tuple normal form
- Domain-key normal form
11. Redundancy
- Two cheers for normalization
- A motivating example
- Tuples vs. propositions
- What orthogonality isn’t
- The Principle of Orthogonal Design
12. Do we need primary keys?
- A little history
- Database design is predicate design
- A series of examples
- Managing redundancy
- A definition
- Arguments in favor
- Relvars with more than one key
- The invoices and shipments example
- The applicants and employees example
- Recommendations
Two days (12 classroom hours).WHO SHOULD ATTEND
This is an advanced class; attendees will be expected to be familiar with the relational model and to have a professional interest in database design. Prior attendance at the Chris Date Master Class 'SQL and Relational Theory: How to Write Accurate SQL Code' is recommended, although by no means essential before attending this Design class.OBJECTIVES
On completion of this seminar, attendees will:DOCUMENTATION
- Understand, and be able to apply, the scientific principles of normalization and orthogonality that underlie design practice
- Know which normal forms are important, how they differ from one another, and how to achieve them
- Understand dependencies and the concepts of dependency inference and dependency preservation
- Generally, understand the contributions (and the limitations) of design theory
Attendees will receive a workbook containing copies of the speaker's slides.
(Download this seminar description in PDF format for printing from here)