To truly understand how a database works, you need to move from abstract theoretical concepts to practical real-world examples. This section does just that, by using a simple example database to explain some of the basic concepts you must know before proceeding further.
Databases, Tables, and Records
Every database is composed of one or more tables. These tables, which structure data into rows and columns, are what lend organization to the data.
Primary and Foreign Keys
Records within a table are not arranged in any particular order—they can be sorted alphabetically, by ID, by member name, or by any other criteria you choose to specify. Therefore, it becomes necessary to have some method of identifying a specific record in a table.
Think of a primary key as a label on each file that tells you what it contains. In the absence of this label, the files would all look the same and it would be difficult for you to identify the one(s) you need.
With a relational database system like MySQL, it’s also possible to link information in one table to information in another. When you begin to do this, the true power of an RDBMS becomes evident.
Referential integrity is a basic concept with an RDBMS, and one that becomes important when designing a database with more than one table. When foreign keys are used to link one table to another, referential integrity, by its nature, imposes constraints on inserting new records and updating existing records. For example, if a table only accepts certain types of values for a particular field, and other tables use that field as their foreign key, this automatically imposes certain constraints on the dependent tables. Similarly, referential integrity demands that a change in the field used as a foreign key—a deletion or new insertion—must immediately be reflected in all dependent tables.
CREATE DATABASE name; USE name; SELECT id FROM name WHERE Age > 3; DELETE FROM name WHERE Status = "Revoked";
As these examples illustrate, SQL syntax is close to spoken English, and this makes it quite easy for novice programmers to learn and use. SQL statements can be divided into three broad categories, each concerned with a different aspect of database management.
- Statements used to define the structure of a database These statements define the relationships among different pieces of data; definitions for database, table, and column types; and database indexes. In the SQL specification, this component is referred to as Data Definition Language (DDL).
- Statements used to manipulate data These statements control adding and removing records, querying and joining tables, and verifying data integrity. In the SQL specification, this component is referred to as Data Manipulation Language (DML).
- Statements used to control the permissions and access level to different pieces of data These statements define the access levels and security privileges for databases, tables, and fields, which may be specified on a per-user and/or per-host basis. In the SQL specification, this component is referred to as Data Control Language (DCL).
Typically, every SQL statement ends in a semicolon, and white space, tabs, and carriage returns are ignored by the SQL processor. The following two statements are equivalent, even though the first is on a single line and the second is split over multiple lines.
DELETE FROM name WHERE Status = "Revoked";
An important part of designing a database is a process known as normalization. Normalization refers to the activity of streamlining a database design by eliminating redundancies and repeated values. Most often, redundancies are eliminated by placing repeating groups of values into separate tables and linking them through foreign keys.
This not only makes the database more compact and reduces the disk space it occupies, but it also simplifies the task of making changes. In non-normalized databases, because values are usually repeated in different tables, altering them is a manual (and errorprone) find-and-replace process. In a normalized database, because values appear only once, making changes is a simple one-step UPDATE.
The normalization process also includes validating the database relationships to ensure that there aren’t any crossed wires and to eliminate incorrect dependencies. This is a worthy goal, because when you create convoluted table relationships, you add greater complexity to your database design … and greater complexity translates into slower query time as the optimizer tries to figure out how best to handle your table joins.
A number of so-called normal forms are defined to help you correctly normalize a database. A normal form is simply a set of rules that a database must conform to. Five such normal forms exist, ranging from the completely non-normalized database to the fully normalized one.
Using the MySQL Command-Line Client
The MySQL RDBMS consists of two primary components: the MySQL database server itself and a suite of client-side programs, including an interactive client and utilities to manage MySQL user permissions, view and copy databases, and import and export data.
To begin, ensure that your MySQL server is running and then connect to it by entering the command mysql at your command prompt to invoke the command-line client. Remember to send a valid password with your username, or else MySQL will reject your connection attempt. (Throughout this section and the ones that follow, boldface type is used to indicate commands that you should enter at the prompt).
[user@host]# mysql -u root -p Password: ******
The mysql> you see is an interactive prompt, where you enter SQL statements. You can close the connection to the server and exit the client at any time by typing quit at the mysql> prompt.
mysql> quit Bye
Because all tables are stored in a database, the first statement you need to know is the CREATE DATABASE statement, which initializes an empty database. Try it out by creating a database called db_test:
mysql> CREATE DATABASE db_test; Query OK, 1 row affected (0.05 sec)
To select a particular database as the default for all subsequent statements, use the USE statement. Here’s an example:
mysql> USE db_test; Database changed
Once you’ve got a database, the next step is to add some tables to it. To create a table, use the CREATE TABLE statement, as in the following listing:
mysql> CREATE TABLE airport ( AirportID smallint(5) unsigned NOT NULL, AirportCode char(3) NOT NULL, AirportName varchar(255) NOT NULL, CityName varchar(255) NOT NULL, CountryCode char(2) NOT NULL, Runways INT(11) unsigned NOT NULL, NumTerminals tinyint(1) unsigned NOT NULL, PRIMARY KEY (AirportID) ) ENGINE=InnoDB; Query OK, 0 rows affected (0.38 sec)
The CREATE TABLE statement begins with the table name, followed by a set of parentheses. These parentheses enclose one or more field definitions, separated by commas. Each field definition contains the field name, its data type, and any special modifiers or constraints that apply. Following the closing parenthesis is an optional table type specifier, which tells MySQL which storage engine to use for this table. Table and field names must conform to the same rules that apply to database names. MySQL tables are stored as files within the database directory and, as such, are subject to the host operating system’s rules on filenames.
Specifying Field Data Types
When creating a MySQL table, specifying a data type for every field is necessary. This data type plays an important role in enforcing the integrity of the data in a MySQL database and in making this data easier to use and manipulate. MySQL offers a number of different data types.
- TINYINT, SMALLINT, MEDIUMINT, INT, BIGINT
- TINYBLOB, BLOB, MEDIUMBLOB, LONGBLOB
- TINYTEXT, TEXT, MEDIUMTEXT, LONGTEXT
[/column][column span=6 last=yes]
- Integer values
- Single-precision floating-point values
- Double-precision floating-point values
- Decimal values BIT — Bit-field values
- Fixed-length strings up to 255 characters
- Variable-length strings up to 255 characters
- Binary data
- Text blocks
- Date values
- Time values or durations
- Year values
- Combined date and time values
- Predefined sets of values
Adding Field Modifiers and Keys
A number of additional constraints, or modifiers, can be applied to a field to increase the consistency of the data that will be entered into it and to mark it as “special” in some way. These modifiers can either appear as part of the field definition, if they apply only to that specific field (for example, a default value for a field), or after all the field definitions, if they relate to multiple fields (for example, a multicolumn primary key).
- To specify whether the field is allowed to be empty or if it must necessarily be filled with data, place the NULL and NOT NULL modifiers after each field definition.
- To specify a default value for a field, use the DEFAULT modifier. This default value is used if no value is specified for that field when inserting a record. In the absence of a DEFAULT modifier for NOT NULL fields, MySQL automatically inserts a nonthreatening default value into the field.
- To have MySQL automatically generate a number for a field (by incrementing the previous value by 1), use the AUTO_INCREMENT modifier. This is particularly useful to generate row numbers for each record in the table. However, the AUTO_INCREMENT modifier can only be applied to numeric fields that are both NOT NULL and belong to the PRIMARY KEY. A table may only contain one AUTO_INCREMENT field.
- To specify the character set for fields containing string values, use the CHARACTER SET modifier.
- To index a field, use the INDEX modifier. When a field is indexed in this manner, MySQL no longer needs to scan each row of the table for a match when performing queries; instead, it can simply look up the index. Indexing is recommended for fields that frequently appear in the WHERE, ORDER BY, and GROUP BY clauses of SELECT queries and for fields used to join tables together.
- A variant of the INDEX modifier is the UNIQUE modifier, which is a special type of index used to ensure that values entered into a field must be either unique or NULL.
- To specify a primary key for the table, use the PRIMARY KEY modifier. The PRIMARY KEY constraint can best be thought of as a combination of the NOT NULL and UNIQUE constraints because it requires values in the specified field to be neither NULL nor repeated in any other row. It thus serves as a unique identifier for each record in the table, and it should be selected only after careful thought has been given to the inter-relationships between tables.
- To specify a foreign key for a table, use the FOREIGN KEY modifier. The FOREIGN KEY modifier links a field in one table to a field (usually a primary key) in another table, setting up a base for relationships. However, foreign keys are only supported in MySQL’s InnoDB storage engine; the FOREIGN KEY modifier is simply ignored in all other engines.
Using Other Table Modifiers
The TYPE attribute isn’t the only option available to control the behavior of the table being created. A number of other MySQL- specific attributes are also available. Here’s a list of the more interesting ones.
- The AUTO_INCREMENT modifier specifies the starting value to use for AUTO_ INCREMENT fields in the table.
- The CHARACTER SET and COLLATE modifiers specify the table character set and collation.
- The CHECKSUM modifier controls whether table checksums should be calculated and stored.
- The COMMENT modifier saves a descriptive label for the table.
- The MAX_ROWS and MIN_ROWS modifiers specify the maximum and minimum number of rows the table is likely to have.
- The PACK_KEYS modifier controls whether table indexes are compressed. Compressing indexes reduces the table’s size on disk, but can affect performance (as indexes need to be uncompressed every time they are updated).
- The DELAY_KEY_WRITE modifier controls whether table indexes are updated only after all writes to the table are complete. This can improve performance for tables that see a high frequency of writes.
- The UNION modifier specifies a list of tables to be merged (only useful with the MERGE storage engine).
- The DATA DIRECTORY and INDEX DIRECTORY modifiers specify custom paths for the table data and index files.
Table definitions created with the CREATE TABLE statement are not set in stone—it’s easy to alter them at a later date as well. The SQL statement to do this is the ALTER TABLE statement. It is used to add or delete fields; alter field types; add, remove, or modify keys; alter the table type; and change the table name (among other things). The following sections discuss these capabilities in greater detail.
Altering Table Names
To alter a table name, use an ALTER TABLE statement with a supplementary RENAME clause. The following example demonstrates by renaming table bills to invoices:
mysql> ALTER TABLE airport RENAME TO cities; Query OK, 0 rows affected (0.28 sec)
An alternative is to use the RENAME TABLE statement, which does the same thing. Here’s an example, which reverses the previous operation:
mysql> RENAME TABLE cities TO airport; Query OK, 0 rows affected (0.06 sec)
Removing Tables and Databases
To remove a database, use the DROP DATABASE statement, which deletes the named database and all its tables permanently. Similarly, to delete a table, use the DROP TABLE statement. Try this out by creating and dropping a database and a table:
mysql> CREATE DATABASE music; Query OK, 1 row affected (0.05 sec) mysql> CREATE TABLE member ( MemberID INT NOT NULL ); Query OK, 0 rows affected (0.00 sec) mysql> DROP TABLE member; Query OK, 0 rows affected (0.00 sec) mysql> DROP DATABASE music; Query OK, 0 rows affected (0.49 sec)