DB-Lec 2: Rational Model (1)

Welcome file

DB-Lecture 2: Relational Model(1)

Definition:

  • The relational model is very simple and elegant.

  • A relational database is a collection of one or more relations, which are based on the relational model.

  • A relation is a table with rows and columns.

  • The major advantages of the relational model are its straightforward data representation and the ease with which even complex queries can be expressed.

  • Owing to the great language SQL, the most widely used language for creating, manipulating, and querying relational database.

I.Structure of Relational Databases

1.1.Basic Structure

Formally, given sets Di(i=1,...,n)(Di=aijj=1,...k)D_i(i=1,...,n) (D_i=a_{ij}|_{j=1,...k}), a relation rr is a subset of D1×D2×...×DnD_1\times D_2\times ...\times D_n — a Cartesian product of a list of domain DiD_i.

Thus, a relation is a set of n-tuples (a1j,a2j,...,anj)(a_{1j},a_{2j},...,a_{nj}), where each aijDi(i[1,n])a_{ij}\in D_i(i\in [1, n])

1.2.Attribute Types

  • Each attribute of a relation has a name

  • The set of allowed valued for each attribute is called the domain of the attribute.

  • Attribute values are (normally) required to be atomic, i.e., indivisible:

  • Multivalued attribute values are not atomic.

  • Composite attribute values are not atomic.

  • The special value null is a member of every domain.

  • The null value causes complications in the definition of many operations.

1.3.Concepts about Relation

A relation is concerned with two concepts: relation schema and relation instance.

The relation schema describes the structure of the relation.

The relation instance corresponds to the snapshot of the data in the relation at a given instant in time.

Relation Schema

Assume A1,A2,...,AnA_1,A_2,...,A_n are attributes. Formally expressed:

R=(A1,A2,...,An)R=(A_1,A_2,...,A_n) is a relation schema.

r(R)r(R) is a relation on the relation schema RR

Relation Instance

The current values (i.e., relation instance) of a relation are specified by a table.

An element t of r is a tuple, represented by a row in a table.

Let a tuple variable t be a tuple, then t[name] denotes the value of t on the name attribute.

The Properties of Relation

  • The order of tuples is irrelevant (i.e., tuples may be stored in an arbitrary).

  • No duplicated tuples in a relation.

  • Attribute values are atomic.

Database

A database consists of multiple relations

Information about an enterprise (e.g., university) is broken up into parts: Instructor, Student, Advisor.

Bad design: univ (instructor -ID, name, dept_name, salary, student_Id, …)

results in :

  • Repetition of information (e.g., two students have the same instructor)

  • The need for null values (e.g., represent an student with no advisor)

Normalization theory (Chapter 7) deals with how to design “good” relational schemas

Key

Let LRL\subset R

KK is a superkey of RR is values for KK are sufficient to identify a unique tuple of each possible relation r(R)r(R):

  • E.g., both {ID} and {ID, name} are superkeys of the relation instructor.

KK is a candidate ket is KK is minimal superkey.

  • E.g., {ID} is a candidate key for the relation instructor, since it is a superkey and no any subset.

KK is a primary key, if KK is a candidate key and is defined by user explicity.

  • Primary key is usually marked by underline.

Primary key and candidate key can not be Null.

Foreign Key

Assume there exists relations rr and ss: r(A,B,C),s(B,D)r(\underline{A}, B, C), s(\underline{B}, D), we can say that attribute B in relation r is foreign key referencing s, and r is a referenced relation, and s is a referenced relation.

参照关系中的外码的值必须在被参照关系中实际存在,或为null.

Primary key and foreign key are integrated constraints.

  • Primary key constraints

  • Foreign key constraint

  • Referential integrity constraint

Query Languages

Language in which user requests information from the database.

Pure query languages: refer to query languages that adhere to the principles of purity, meaning they are free from side effects and rely solely on input data to produce output.

  • Relational Algebra — the basis of SQL

  • Tuple Relational Calculus

  • Domain Relational Calculus —QBE

Pure query languages form underlying basis of query languages that people use.

II.Relational-Algebra

2.1.Fundamental Relational-Algebra Operations

  • Six basic operators:

  • Select σ\sigma

The select operation selects tuples that satisfy a given predicate.

Notation:σp(r)σ_p(r)

p is called the selection predicate

σp(r)\sigma_p(r) can be defined as:σp(r)={ttrandp(t)}\sigma_p(r)=\{t|t\in r \text{and} p(t)\}

In the selection predicate p, we allow comparisons using =,,>,,<,=,\neq, >, \geq, <, \leq

The select predicate may include comparisons between two attributes.

General form of a single p

We can combine several predicates into a larger predicate by using the connectives

Note that, the selection conditions need to aim at the attribute values of the same tuple, when we conduct section operation.

  • Project \prod

Project operation is a unary operation that returns its argument relation, with certain attributes left out.

Notation: A1,A2,...,Ak(r)\prod_{A_1,A_2,...,A_k}(r)

where A1,...,AkA_1, ..., A_k are attribute names and rr is a relation name.

The result is defined as the relation of k columns obtained by erasing the columns that are not listed.

Duplicate rows removed from result, since relations are sets.

  • Union \cup

The union operation allows us to combine two relations.

Notation: rsr\cup s

Defined as: rs={ttrorts}r\cup s = \{t| t\in r \text{or} t\in s\}

For rsr\cup s to be valid:

  • r and s must have the same arity (i.e., the same number of attributes)

  • The attribute domains must be compatible

  • set difference -

The set-difference operation allows us to find tuples that are in one relation but are not in another.

Notation: rsr - s

Defined as: rs={ttrandts}r - s = \{t|t\in r \text{and} t\notin s\}

Set differences must be taken between compatible relations:

  • r and s must have the same arity.

  • Attribute domains of r and s must be compatible.

  • Cartesian product ×\times

The Cartesian-product operation (denoted by ×\times) allows us to combine information from any two relations.

Notation: r×sr\times s

Defined as:r×s={{tq}trandqs}r\times s = \{\{t q\}|t\in r \text{and} q\in s\}

Assume that attributes of r® and s(S) are disjoint (i.e., RS=ΦR\cap S=\Phi).

If attributes of r® and s(S) are not disjoint, then renaming for attributes must be used.

  • attaching to the attribute the name of the relation from which the attribute originally came.

Composition of Operations:

  • The result of a relational-algebra operation is relation

  • Relational-algebra operations can be composed together into a relational-algebra expression

  • Consider the query – Find the names of all instructors in the Physics department.

name(σdepth_name="Physics"(instructor))\prod_{\text{name}}(\sigma_{\text{depth\_name="Physics"}}(\text{instructor}))

  • Instead of giving the name of a relation as the argument of the projection operation, we give an expression that evaluates to a relation.

  • Rename ρ\rho

Allows us to name, and therefore to refer to, the results of relational-algebra expressions. (procedural)

Allows us to refer to a relation by more than one name.

The expression:ρX(E)\rho_X(E) returns the expression E under the name X

If a relational-algebra expression E has arity n, then

ρX(A1,A2,...,An)(E)\rho_{X(A_1,A_2,...,A_n)}(E)

returns the result of expression E

  • The operators take one or two relations as inputs, and return a new relation as a result.

  • one relation: unary operation

  • two relations: binary operations

Example Queries:

  • Example 1: Find all loans of over $1200 σamount>1200(loan)σ_{\text{amount}>1200}(\text{loan})

  • Example 2: Find the loan number for each loan of an amount greater than $1200. loan-number(σamount>1200(loan))\prod_{\text{loan-number}}(σ_{\text{amount}>1200}(\text{loan}))

  • Example 3: Find the names of all customers who have a loan, or an account, or both, from the bank. customer-name(borrower)customer-name(depositor)\prod_{\text{customer-name}}(\text{borrower})\cup\prod_{\text{customer-name}}(\text{depositor})

  • Example 4: Find the names of all customers who at least have a loan and an account at bank. customer-name(borrower)customer-name(depositor)\prod_{\text{customer-name}}(\text{borrower})\cap\prod_{\text{customer-name}}(\text{depositor})

此博客中的热门博文

Numberical Analysis --- Interpolation & Polynomial Approximation

Compuer Animation(Postgraduate Course): Lecture 4:Keyframe interpolation and velocity control

Compuer Animation(Postgraduate Course): Lecture 3:Representation of transformation and rotation