Craig S. Mullins

Return to Home Page

November 2003

 

 

 

                                           



The DBA Corner
by Craig S. Mullins  

 

Nuggets of Data Rambling Through My Mind

I constantly have random database-related thoughts that build up over time. I thought it would be useful and interesting to group some of these thoughts together into a column, sort of like Andy Rooney meets IT.

Do you ever wonder why e-mail systems don't use database management technology? Do you store some of your e-mails for long periods of time? Do you group them into folders? Are they hard to find later? And the substance of the message isn't usually obvious from the folder name you gave it (which made sense at the time) or the subject of the e-mail (which might not have anything to do with the actual content). I'd love to be able to use SQL against my e-mail system, writing something like:

SELECT TEXT
FROM ALL OF MY E-MAIL FOLDERS
WHERE TEXT CONTAINS "NEW DB2 PRODUCT ROLLOUT"

I bet a DBMS that interfaces to or integrates with e-mail would be a rousing success.

Have you noticed that relational technology is taking a beating these days? With new "data models" purported to be better than relational, it seems that the industry is all too ready to stick a fork in relational technology.

There are no true relational database management systems. DB2 and Oracle and SQL Server, though based on the relational model, do not support all of the features that would make them truly relational. They are better termed "SQL DBMS products." Before sounding the death knell for relational, shouldn't we at least try a fully relational DBMS first? Keep in mind that the three largest software companies in the world today are the three biggest providers of relational--well, SQL--database technology: IBM, Microsoft, and Oracle. These companies’ database products are successful because they are useful, and, in fact, help companies manage and access their data.

Why is it important for my DBMS to be relational? My DBMS works just fine as it is, how could relational help? With SQL, you have a language that is non-orthogonal. You can write the same data request in many ways. This can cause performance problems and confuse programmers. Another problem--current SQL database systems permit the creation of tables that can contain duplicate rows. The relational model is based on set theory, and a set cannot contain duplicate members. And there are significant features that current SQL database systems omit, such as domains. A domain is basically the set of all valid values for a particular column (or attribute). It is flexible, so you can define the valid values more precisely than a mere data type. But a domain actually defines the valid operations and comparisons that can be performed on columns assigned to that domain. Because SQL DBMS products lack domains, it is completely legal to code the following predicates:

WHERE AVG(SALARY) > PHONE_NUMBER
WHERE SHOE_SIZE = IQ
WHERE AGE = FIRST_NAME

These are not useful predicates. But if each of these columns were assigned to a different domain, comparisons between them would be disallowed, because comparisons between incompatible domains would be disallowed. And values outside the scope of the domain could not be assigned to the column. And these would be good things because it would boost data integrity and data quality.

From Database Trends and Applications, November 2003.

© 2003 Craig S. Mullins,  All rights reserved.

Home.