Chapter 3 Introduction to SQL

The SQL data-definition language (DDL) allows the specification of information about relations, including:

  • The schema for each relation.
  • The domain of values associated with each attribute.
  • Integrity constraints.

As we will see later, the DDL can also specify other information, such as:

  • The set of indices to be maintained for each relation.
  • Security and authorization information for each relation.
  • The physical storage structure of each relation on disk.

3.1 Domain Types in SQL

(1) char(n). Fixed length character string, with user-specified length n.

(2) varchar(n). Variable length character strings, with user-specified maximum length n.

(3) int. Integer (a finite subset of the integers that is machine-dependent).

(4) smallint. Small integer (a machine-dependent subset of the integer domain type).

(5) numeric(p,d). Fixed point number, with user-specified precision of p digits, with d digits to the right of decimal point.

Note

numeric(3,1) allows 44.5 to be stored exactly, but neither 444.5 nor 0.32 can be stored exactly in a field of this type.

(6) real, double precision. Floating point and double-precision floating point numbers, with machine-dependent precision.

(7) float(n). Floating point number, with user-specified precision of at least n digits.

3.2 Built-in Data Types in SQL

date: Dates, containing a (4-digit) year, month, and day.

  • Example: date '2005-7-27'

time: Time of day, in hours, minutes, and seconds.

  • Example: time '09:00:30', time '09:00:30.75'

timestamp: Date plus time of day.

  • Example: timestamp '2005-7-27 09:00:30.75'

interval: period of time

  • Subtracting a date/time/timestamp value from another gives an interval value
  • Interval values can be added to date/time/timestamp values

3.3 Table Constructs

3.3.1 Create Table Construct

An SQL relation is defined using the create table command:

create table r (A1 D1, A2 D2, ..., An Dn,
                (integrity-constraint1),
                ...,
                (integrity-constraintk))
  • \(r\) is the name of the relation
  • each \(A_i\) is an attribute name in the schema of relation \(r\)
  • \(D_i\) is the data type of values in the domain of attribute \(A_i\)

For example, to create a table of students:

create table student (
    ID          varchar(5),
    name        varchar(20) not null,
    dept_name   varchar(20),
    tot_cred    numeric(3,0) default 0,
    primary key (ID),
    foreign key (dept_name) references department );

To create a table recording which courses each student takes:

create table takes (
    ID           varchar(5),
    course_id    varchar(8),
    sec_id       varchar(8),
    semester     varchar(6),
    year         numeric(4,0),
    grade        varchar(2),
    primary key (ID, course_id, sec_id, semester, year),
    foreign key (ID) references  student,
    foreign key (course_id, sec_id, semester, year) references section );

Note

sec_id can be dropped from the primary key above to ensure that a student cannot be registered for two sections of the same course in the same semester.

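In database design, a foreign key constraint maintains the relationship between two tables. When a row in the referenced (parent) table is deleted or updated, the operation may affect rows in the referencing (child) table. Referential actions specify how such cases are handled:

  1. ON DELETE actions
  • CASCADE: When a row in the parent table is deleted, all related rows in the child table are deleted automatically.
  • SET NULL: When a row in the parent table is deleted, the foreign key column of the matching child rows is set to NULL (provided the column allows NULL).
  • RESTRICT: If any row in the child table depends on the parent row, the deletion is rejected and an error is returned.
  • SET DEFAULT: When a row in the parent table is deleted, the foreign key column of the matching child rows is set to its default value (provided the column has a default).
  2. ON UPDATE actions
  • CASCADE: When the referenced key of a parent row is updated, all related rows in the child table are updated automatically to stay consistent.
  • SET NULL: When the referenced key of a parent row is updated, the foreign key column of the matching child rows is set to NULL (provided the column allows NULL).
  • RESTRICT: If any row in the child table depends on the parent row, the update is rejected and an error is returned.
  • SET DEFAULT: When the referenced key of a parent row is updated, the foreign key column of the matching child rows is set to its default value (provided the column has a default).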

Example

Suppose we have two tables, department and employee, where employee has a foreign key dept_name that references the primary key dept_name of department.

CREATE TABLE department (
    dept_name VARCHAR(50) PRIMARY KEY,
    dept_location VARCHAR(100)
);

CREATE TABLE employee (
    emp_id INT PRIMARY KEY,
    emp_name VARCHAR(50),
    dept_name VARCHAR(50),
    FOREIGN KEY (dept_name) REFERENCES department(dept_name)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);

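In this example:

  • ON DELETE CASCADE: If a department is deleted, all employee rows belonging to that department are deleted automatically.
  • ON UPDATE CASCADE: If a department's name is changed, the dept_name field of every employee row in that department is automatically updated to the new name.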
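The cascade behaviour can be checked with a small script. This is only a sketch: it assumes Python's built-in sqlite3 module as the database (SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`), and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: SQLite is the engine; it ignores foreign keys unless this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE department (
    dept_name VARCHAR(50) PRIMARY KEY,
    dept_location VARCHAR(100)
);
CREATE TABLE employee (
    emp_id INT PRIMARY KEY,
    emp_name VARCHAR(50),
    dept_name VARCHAR(50),
    FOREIGN KEY (dept_name) REFERENCES department(dept_name)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);
-- Hypothetical sample data
INSERT INTO department VALUES ('Sales', 'Building A'), ('Research', 'Building B');
INSERT INTO employee VALUES (1, 'Ann', 'Sales'), (2, 'Bob', 'Sales'), (3, 'Cid', 'Research');
""")

# Renaming a department propagates to its employees (ON UPDATE CASCADE).
conn.execute("UPDATE department SET dept_name = 'Marketing' WHERE dept_name = 'Sales'")
after_update = conn.execute(
    "SELECT emp_name, dept_name FROM employee ORDER BY emp_id").fetchall()

# Deleting a department removes its employees (ON DELETE CASCADE).
conn.execute("DELETE FROM department WHERE dept_name = 'Research'")
after_delete = conn.execute(
    "SELECT emp_name, dept_name FROM employee ORDER BY emp_id").fetchall()

print(after_update)
print(after_delete)
```

With RESTRICT in place of CASCADE, both the UPDATE and the DELETE would be expected to fail with an integrity error instead of modifying employee.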

3.3.2 Drop and Alter Table Constructs

(1) drop table student. Deletes the table and its contents.

(2) delete from student. Deletes all contents of the table, but retains the table itself.

(3) alter table

  • alter table r add A D, where A is the name of the attribute to be added to relation r and D is the domain of A. All tuples in the relation are assigned null as the value for the new attribute.
  • alter table r drop A, where A is the name of an attribute of relation r. Dropping attributes is not supported by many databases; it is costly and discouraged.

3.4 Basic Query Structure

The SQL data-manipulation language (DML) provides the ability to query information and to insert, delete, and update tuples.

A typical SQL query has the form:

select A1, A2, ..., An
from r1, r2, ..., rm
where P
  • \(A_i\) represents an attribute
  • \(r_i\) represents a relation
  • \(P\) is a predicate

The result of an SQL query is a relation.

3.4.1 The select Clause

The select clause lists the attributes desired in the result of a query. It corresponds to the projection operation of relational algebra.

Note

SQL names are case insensitive; that is, you may use upper- or lower-case letters.

SQL allows duplicates in relations as well as in query results.

(1) To force the elimination of duplicates, insert the keyword distinct after select.

Find the names of all departments that have instructors, removing duplicates:

select distinct dept_name
from instructor

(2) The keyword all specifies that duplicates are not to be removed.

select all dept_name
from instructor

(3) An asterisk in the select clause denotes “all attributes”

select *
from instructor

(4) The select clause can contain arithmetic expressions involving the operators +, -, *, and /, operating on constants or attributes of tuples.

select ID, name, salary/12
from instructor

3.4.2 The where Clause

The where clause specifies conditions that the result must satisfy. It corresponds to the selection predicate of relational algebra.

Example

To find all instructors in the Comp. Sci. department with salary > 80000:

select name
from instructor
where dept_name = 'Comp. Sci.' and salary > 80000

Comparison results can be combined using the logical connectives and, or, and not.

Comparisons can be applied to the results of arithmetic expressions.

SQL also includes a between comparison operator for use in predicates.

Example

Find the names of all instructors with salary between $90,000 and $100,000 (that is, \(\ge\) $90,000 and \(\le\) $100,000)

select name
from instructor
where salary between 90000 and 100000

SQL also supports tuple comparison:

select name, course_id
from instructor, teaches
where (instructor.ID, dept_name) = (teaches.ID, 'Biology');

3.4.3 The from Clause

The from clause lists the relations involved in the query. It corresponds to the Cartesian product operation of relational algebra.

The Cartesian product is not very useful directly, but it is useful when combined with a where-clause condition (the selection operation in relational algebra).

Example

Find the course ID, semester, year, and title of each course offered by the Comp. Sci. department:

select section.course_id, semester, year, title
from section, course
where section.course_id = course.course_id and
      dept_name = 'Comp. Sci.'

We can also specify a natural join. A natural join matches tuples with the same values for all common attributes and retains only one copy of each common column.

select name, course_id
from instructor natural join teaches;

Warning

Beware of unrelated attributes with the same name, which a natural join will equate incorrectly.

For example, list the names of instructors along with the titles of the courses that they teach:

course(course_id, title, dept_name, credits)
teaches(ID, course_id, sec_id, semester, year)
instructor(ID, name, dept_name, salary)

Incorrect version (it also forces course.dept_name = instructor.dept_name):

select name, title
from instructor natural join teaches natural join course;

Correct versions:

select name, title
from instructor natural join teaches, course
where teaches.course_id = course.course_id;

select name, title
from (instructor natural join teaches) join course using (course_id);

select name, title
from instructor, teaches, course
where instructor.ID = teaches.ID and teaches.course_id = course.course_id;
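The difference between the incorrect and correct versions shows up as soon as an instructor teaches a course outside their own department. The sketch below assumes Python's built-in sqlite3 module and a hypothetical instructor in Physics who also teaches a Comp. Sci. course; the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (ID varchar(5), name varchar(20), dept_name varchar(20), salary numeric(8,2));
create table teaches (ID varchar(5), course_id varchar(8), sec_id varchar(8), semester varchar(6), year numeric(4,0));
create table course (course_id varchar(8), title varchar(50), dept_name varchar(20), credits numeric(2,0));
-- Hypothetical data: a Physics instructor teaching one Physics and one Comp. Sci. course
insert into instructor values ('1', 'Kim', 'Physics', 90000);
insert into teaches values ('1', 'PHY-101', '1', 'Fall', 2009);
insert into teaches values ('1', 'CS-101', '1', 'Spring', 2010);
insert into course values ('PHY-101', 'Mechanics', 'Physics', 4);
insert into course values ('CS-101', 'Intro to Computing', 'Comp. Sci.', 4);
""")

# Natural join on everything: dept_name is silently added to the join condition.
wrong = conn.execute("""
    select name, title
    from instructor natural join teaches natural join course
    order by title
""").fetchall()

# Explicit join conditions: only ID and course_id are equated.
right = conn.execute("""
    select name, title
    from instructor, teaches, course
    where instructor.ID = teaches.ID and teaches.course_id = course.course_id
    order by title
""").fetchall()

print(wrong)
print(right)
```

The natural-join version loses the Comp. Sci. course because the instructor's dept_name (Physics) does not equal the course's dept_name.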

3.5 Additional Basic Operations

3.5.1 The Rename Operation

SQL allows renaming relations and attributes using the as clause:

old-name as new-name

For example, to label one-twelfth of the annual salary as the monthly salary:

select ID, name, salary/12 as monthly_salary
from instructor

Relations can be renamed as well:

select distinct T.name
from instructor as T, instructor as S
where T.salary > S.salary and S.dept_name = 'Comp. Sci.'

The keyword as is optional and may be omitted:

instructor as T ≡ instructor T

3.5.2 String Operation

SQL includes a string-matching operator for comparisons on character strings. The operator “like” uses patterns that are described using two special characters:

  • percent (%). The % character matches any substring.
  • underscore (_). The _ character matches any single character.

Patterns are case sensitive.

Pattern matching examples:

  • 'Intro%' matches any string beginning with "Intro".
  • '%Comp%' matches any string containing "Comp" as a substring.
  • '___' matches any string of exactly three characters.
  • '___%' matches any string of at least three characters.

SQL supports a variety of string operations such as

  • concatenation (using “||”)
  • converting from upper to lower case (and vice versa)
  • finding string length, extracting substrings, etc.

Example

Find the names of all instructors whose name includes the substring “dar”.

select name
from instructor
where name like '%dar%' 

Example

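Each of the following patterns matches the string "100 %" (the second form assumes that backslash is the default escape character, as it is in MySQL and PostgreSQL):

like '100 \%' escape '\'
like '100 \%'
like '100 #%' escape '#'

A small problem when matching Chinese characters

In some encodings (GBK, for example) a Chinese character is stored in two bytes. If the system matches patterns one byte at a time, a match can be reported for a string that does not actually contain the Chinese character being searched for.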

3.5.3 Ordering the Display of Tuples

The following clause sorts the result by name:

order by name

We may specify desc for descending order or asc for ascending order, for each attribute; ascending order is the default.

order by name desc

We can also sort on multiple attributes:

order by dept_name, name

3.5.4 The limit Clause

The limit clause can be used to constrain the number of rows returned by the select statement.

The limit clause takes one or two numeric arguments, which must both be nonnegative integer constants:

limit offset, row_count
limit row_count     (equivalent to limit 0, row_count)

Example

List the names of instructors whose salary is among the top 3:

select name
from instructor
order by salary desc
limit 3;   -- equivalent to limit 0, 3

3.6 Set Operations

Set operations can be applied to the results of queries.

The set operations union, intersect, and except each automatically eliminate duplicates.

To retain all duplicates, use the corresponding multiset versions union all, intersect all, and except all.

Example

Suppose a tuple occurs m times in r and n times in s. Then it occurs:

  • \(m + n\) times in r union all s
  • \(\min(m,n)\) times in r intersect all s
  • \(\max(0, m - n)\) times in r except all s
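The three multiplicity rules can be modelled outside a database. As a rough illustration, Python's collections.Counter behaves like a multiset whose +, & and - operators follow the same arithmetic; the relations r and s below are made-up examples, with each key standing for a tuple.

```python
from collections import Counter

# Hypothetical multisets: each key is a tuple, each count its number of occurrences.
r = Counter({"a": 3, "b": 1})   # tuple "a" occurs m = 3 times in r
s = Counter({"a": 2, "c": 4})   # tuple "a" occurs n = 2 times in s

union_all = r + s        # multiplicities add: m + n
intersect_all = r & s    # multiplicities take the minimum: min(m, n)
except_all = r - s       # multiplicities subtract, floored at zero: max(0, m - n)

print(union_all["a"], intersect_all["a"], except_all["a"])
print((s - r)["a"])      # max(0, n - m)
```

Here "a" ends up with 5, 2, and 1 occurrences respectively, and s except all r contains no copy of "a" at all.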

3.7 Null Values

null signifies an unknown value or that a value does not exist.

(1) The result of any arithmetic expression involving null is null.

(2) The predicate is null can be used to check for null values.

select name
from instructor
where salary is null

(3) Comparisons with null values return the special truth value unknown.

(4) Three-valued logic using the truth value unknown:

  • OR:
    (unknown or true) = true
    (unknown or false) = unknown
    (unknown or unknown) = unknown
  • AND:
    (true and unknown) = unknown
    (false and unknown) = false
    (unknown and unknown) = unknown
  • NOT:
    (not unknown) = unknown
  • In SQL, "P is unknown" evaluates to true if predicate P evaluates to unknown.

(5) The result of a where-clause predicate is treated as false if it evaluates to unknown.
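These rules can be observed directly in a database. The sketch below assumes Python's built-in sqlite3 module, where true and false come back as 1 and 0 and unknown comes back as None (SQL null); the two-row instructor table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# null plays the role of unknown; SQLite reports true as 1 and false as 0.
row = conn.execute("""
    select null or 1, null or 0, null and 1, null and 0, not (null), 1 = null
""").fetchone()
print(row)

# Hypothetical data: Bob's salary is unknown.
conn.executescript("""
create table instructor (name varchar(20), salary numeric(8,2));
insert into instructor values ('Ann', 90000), ('Bob', null);
""")

# A row is returned only when the predicate is true; unknown is treated as false.
high = [n for (n,) in conn.execute(
    "select name from instructor where salary > 50000")]
not_high = [n for (n,) in conn.execute(
    "select name from instructor where not (salary > 50000)")]
print(high, not_high)
```

Bob appears in neither result: both salary > 50000 and its negation evaluate to unknown for his row.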

3.8 Aggregate Functions

These functions operate on the multiset of values of a column of a relation, and return a value:

  • avg: average value
  • min: minimum value
  • max: maximum value
  • sum: sum of values
  • count: number of values

Example

Find the average salary of instructors in the Computer Science department:

select avg (salary)
from instructor
where dept_name = 'Comp. Sci.';

Find the total number of instructors who teach a course in the Spring 2010 semester:

select count (distinct ID)
from teaches
where semester = 'Spring' and year = 2010;

Find the number of tuples in the course relation:

select count (*)
from course;

3.8.1 Group By

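Aggregate functions can also be applied to groups. For example, to compute the average instructor salary in each department:

select dept_name, avg (salary)
from instructor
group by dept_name;

Attributes in the select clause outside of aggregate functions must appear in the group by list.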

Warning

/* erroneous query */
select dept_name, ID, avg (salary)
from instructor
group by dept_name;

Here ID is neither an argument of an aggregate function nor part of the group by list, so the query is erroneous.

3.8.2 Having Clause

The having clause filters groups based on the results of aggregation. For example:

Example

Find the names and average salaries of all departments whose average salary is greater than 42000

select dept_name, avg(salary)
from instructor
group by dept_name
having avg (salary) > 42000;

Predicates in the having clause are applied after the formation of groups, whereas predicates in the where clause are applied before forming groups.

Note

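Basic order in which an SQL query is processed:

  1. FROM: Determine the data sources.
  2. WHERE: Apply filter conditions to select the qualifying rows.
  3. GROUP BY: Group the result by the specified columns.
  4. HAVING: Apply filter conditions to the groups.
  5. SELECT: Choose the columns or expressions to output.
  6. ORDER BY: Sort the result by the specified columns.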
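The effect of this order is easiest to see when where and having appear in the same query. A minimal sketch, assuming Python's built-in sqlite3 module and four invented instructor rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (ID varchar(5), name varchar(20), dept_name varchar(20), salary numeric(8,2));
-- Hypothetical sample data
insert into instructor values
    ('1', 'Ann', 'Physics', 30000),
    ('2', 'Bob', 'Physics', 90000),
    ('3', 'Cid', 'History', 40000),
    ('4', 'Dee', 'History', 41000);
""")

rows = conn.execute("""
    select dept_name, avg(salary) as avg_salary
    from instructor
    where salary > 35000          -- filters individual rows, before grouping
    group by dept_name
    having avg(salary) > 42000    -- filters whole groups, after grouping
    order by dept_name
""").fetchall()
print(rows)
```

Because where runs first, Ann's row is removed before the groups are formed, so the Physics average is computed over Bob alone (90000, not 60000). History survives the where clause but its group average of 40500 fails the having test.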

3.8.3 Null Values and Aggregates

All aggregate operations except count(*) ignore tuples with null values on the aggregated attributes.

What if the collection has only null values?

  • count returns 0
  • all other aggregates return null
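Both cases can be checked with a short script. This sketch assumes Python's built-in sqlite3 module, which reports SQL null as None; the three sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (name varchar(20), salary numeric(8,2));
-- Hypothetical data: one salary is null
insert into instructor values ('Ann', 60000), ('Bob', null), ('Cid', 80000);
""")

# Mixed input: count(*) counts rows, the other aggregates skip the null salary.
mixed = conn.execute("""
    select count(*), count(salary), sum(salary), avg(salary)
    from instructor
""").fetchone()

# Input consisting only of nulls: count(salary) is 0, the rest are null.
only_null = conn.execute("""
    select count(*), count(salary), sum(salary), avg(salary), max(salary)
    from instructor
    where salary is null
""").fetchone()

print(mixed)
print(only_null)
```

Note that the average is 70000 (140000 / 2), not 140000 / 3: the null salary is excluded from both the sum and the count.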

3.9 Nested Sub-queries

SQL provides a mechanism for the nesting of subqueries.

A sub-query is a select-from-where expression that is nested within another query.

A common use of subqueries is to perform tests for:

  • set membership
  • set comparisons
  • set cardinality

3.9.1 Set Membership

Purpose: check whether a value belongs to the set returned by another query.

Find courses offered in Fall 2009 and in Spring 2010:

select distinct course_id
from section
where semester = 'Fall' and year = 2009
      and course_id in (select course_id
                        from section
                        where semester = 'Spring' and year = 2010);

3.9.2 Set Comparisons

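Purpose: compare a value with the members of a set returned by another query, for example, to test whether it is greater than at least one member or greater than every member.

Find names of instructors with salary greater than that of some (at least one) instructor in the Biology department.

One way is to use a self-join:

select distinct T.name
from instructor as T, instructor as S
where T.salary > S.salary and S.dept_name = 'Biology';

Alternatively, use a nested query with the > some clause:

select name
from instructor
where salary > some (select salary
                     from instructor
                     where dept_name = 'Biology');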

Definition of Some Clause

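\(F \;\langle\text{comp}\rangle\; \mathsf{some}\; r \Leftrightarrow \exists\, t \in r \text{ such that } (F \;\langle\text{comp}\rangle\; t)\)

where \(\langle\text{comp}\rangle\) can be <, ≤, >, ≥, =, or ≠.

Correspondingly, there is also an all clause.

Definition of All Clause

\(F \;\langle\text{comp}\rangle\; \mathsf{all}\; r \Leftrightarrow \forall\, t \in r \; (F \;\langle\text{comp}\rangle\; t)\)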

3.9.3 Scalar Sub-query

A scalar subquery is one that is used where a single value is expected.

select name
from instructor
where salary * 10 >
      (select budget
       from department
       where department.dept_name = instructor.dept_name)

A runtime error occurs if the subquery returns more than one result tuple.

3.9.4 Test for Empty Relations

The exists construct returns the value true if the argument subquery is nonempty.

\(\text{exists } r \Leftrightarrow r \ne \emptyset\)

\(\text{not exists } r \Leftrightarrow r = \emptyset\)

Example

The earlier query (courses offered in both Fall 2009 and Spring 2010), rewritten with exists:

select course_id
from section as S
where semester = 'Fall' and year = 2009 and
      exists (select *
              from section as T
              where semester = 'Spring' and year = 2010 and S.course_id = T.course_id);

Example

Find all students who have taken all courses offered in the Biology department:

select distinct S.ID, S.name
from student as S
where not exists ( (select course_id
                    from course
                    where dept_name = 'Biology')
              except (select T.course_id
                      from takes as T
                      where S.ID = T.ID));
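The query reads as "there is no Biology course that the student has not taken". To see it at work, the sketch below runs it with Python's built-in sqlite3 module on invented data. One adaptation is an assumption about SQLite rather than part of the query above: the parentheses around the two operands of except are dropped, since SQLite does not appear to accept parenthesized operands in a compound select.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID varchar(5), name varchar(20));
create table course (course_id varchar(8), dept_name varchar(20));
create table takes (ID varchar(5), course_id varchar(8));
-- Hypothetical data: Ann takes both Biology courses, Bob only one of them.
insert into student values ('1', 'Ann'), ('2', 'Bob');
insert into course values ('BIO-101', 'Biology'), ('BIO-301', 'Biology'), ('CS-101', 'Comp. Sci.');
insert into takes values ('1', 'BIO-101'), ('1', 'BIO-301'), ('2', 'BIO-101'), ('2', 'CS-101');
""")

rows = conn.execute("""
    select distinct S.ID, S.name
    from student as S
    where not exists (select course_id          -- all Biology courses ...
                      from course
                      where dept_name = 'Biology'
                      except                    -- ... minus those S has taken
                      select T.course_id
                      from takes as T
                      where S.ID = T.ID)
""").fetchall()
print(rows)
```

For Bob the difference still contains BIO-301, so not exists is false and he is excluded.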

3.9.5 Test for Absence of Duplicate Tuples

The unique construct tests whether a sub-query has any duplicate tuples in its result. (Evaluates to “true” on an empty set)

Example

Find all courses that were offered at most once in 2009:

select T.course_id
from course as T
where unique (select R.course_id
              from section as R
              where T.course_id = R.course_id and R.year = 2009);

3.9.6 *Sub-queries in the From Clause

SQL allows a subquery expression to be used in the from clause.

Example

Find the average instructors’ salaries of those departments where the average salary is greater than $42,000.

select dept_name, avg_salary
from (select dept_name, avg (salary) as avg_salary
        from instructor
        group by dept_name)
where avg_salary > 42000;

The lateral clause permits a later part of the from clause (after the lateral keyword) to access correlation variables from an earlier part.

select name, salary, avg_salary
from instructor I1,
        lateral (select avg(salary) as avg_salary
                from instructor I2 
                where I2.dept_name= I1.dept_name);

3.9.7 *With Clause

The with clause provides a way of defining a temporary view whose definition is available only to the query in which the with clause occurs.

Example

Find all departments with the maximum budget

with max_budget (value) as
        (select max(budget)
         from department)
select dept_name
from department, max_budget
where department.budget = max_budget.value;

An equivalent formulation:

select dept_name
from department
where budget = (select max(budget) from department);

The with clause is very useful for writing complex queries.

Complex Queries using With Clause

Find all departments where the total salary is greater than the average of the total salary at all departments

with dept_total (dept_name, value) as
        (select dept_name, sum(salary)
         from instructor
         group by dept_name),
     dept_total_avg(value) as
        (select avg(value)
         from dept_total)
select dept_name
from dept_total, dept_total_avg
where dept_total.value >= dept_total_avg.value;

3.10 Modification of the Database

3.10.1 Deletion

Delete all instructors:

delete from instructor;

Delete all instructors from the Finance department:

delete from instructor
where dept_name = 'Finance';

Delete all tuples in the instructor relation for those instructors associated with a department located in the Watson building:

delete from instructor
where dept_name in (select dept_name
                    from department
                    where building = 'Watson');

Delete all instructors whose salary is less than the average salary of instructors:

delete from instructor
where salary < (select avg (salary) from instructor);

Question

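Problem: as we delete tuples from instructor, the average salary changes.

Solution used in SQL:

1. First, compute the average salary and find all tuples to delete.

2. Next, delete all tuples found above (without recomputing the average or retesting the tuples).

In other words, SQL ignores the changing average: it computes the original average once and then removes every tuple whose salary is below it.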
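A small experiment makes the two-step behaviour visible. The sketch assumes Python's built-in sqlite3 module and four invented salaries, chosen so that recomputing the average after each deletion would give a different answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (name varchar(20), salary numeric(8,2));
-- Hypothetical data: the original average salary is 55000.
insert into instructor values
    ('Ann', 10000), ('Bob', 50000), ('Cid', 60000), ('Dee', 100000);
""")

conn.execute("""
    delete from instructor
    where salary < (select avg(salary) from instructor)
""")

remaining = [n for (n,) in conn.execute(
    "select name from instructor order by salary")]
print(remaining)
```

Only Ann and Bob fall below the original average of 55000. Had the average been recomputed after Ann was removed (70000 over the remaining three), Cid would have been deleted as well.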

3.10.2 Insertion

Add a new tuple to course:

insert into course
   values ('CS-437', 'Database Systems', 'Comp. Sci.', 4);

or equivalently

insert into course (course_id, title, dept_name, credits)
   values ('CS-437', 'Database Systems', 'Comp. Sci.', 4);

Add all instructors to the student relation with tot_cred set to 0:

insert into student
select ID, name, dept_name, 0
from instructor

Warning

The select-from-where statement is evaluated fully before any of its results are inserted into the relation.

insert into table1
select * from table1

If the select were not evaluated fully first, a statement like the one above could keep inserting duplicate rows, possibly without end.

3.10.3 Update

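Increase the salaries of instructors whose salary is over $100,000 by 3%, and give all others a 5% raise. The order of the two statements is important: if the 5% raise ran first, an instructor earning just under $100,000 would receive both raises.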

update instructor
    set salary = salary * 1.03
    where salary > 100000;

update instructor
    set salary = salary * 1.05
    where salary <= 100000;

Updates with Scalar Sub-queries

Example

Recompute and update the tot_cred value for all students:

update student S
    set tot_cred = (select sum(credits)
                    from takes natural join course
                    where S.ID = takes.ID and
                          takes.grade <> 'F' and
                          takes.grade is not null);
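One consequence of the aggregate rules is worth checking: for a student with no qualifying courses the subquery sums an empty collection, so tot_cred becomes null rather than 0. The sketch below assumes Python's built-in sqlite3 module and made-up rows, refers to student directly instead of through the alias S, and wraps the subquery in coalesce as one possible way to store 0 instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table student (ID varchar(5), name varchar(20), tot_cred numeric(3,0) default 0);
create table course (course_id varchar(8), credits numeric(2,0));
create table takes (ID varchar(5), course_id varchar(8), grade varchar(2));
-- Hypothetical data: Ann passed one course and failed one; Bob took nothing.
insert into student (ID, name) values ('1', 'Ann'), ('2', 'Bob');
insert into course values ('CS-101', 4), ('BIO-101', 3);
insert into takes values ('1', 'CS-101', 'A'), ('1', 'BIO-101', 'F');
""")

# The scalar subquery from the update above, correlated on student.ID.
credits_query = """
    select sum(credits)
    from takes natural join course
    where student.ID = takes.ID and
          takes.grade <> 'F' and
          takes.grade is not null
"""

conn.execute(f"update student set tot_cred = ({credits_query})")
plain = conn.execute("select ID, tot_cred from student order by ID").fetchall()

# coalesce replaces a null sum with 0 for students without passed courses.
conn.execute(f"update student set tot_cred = coalesce(({credits_query}), 0)")
with_coalesce = conn.execute("select ID, tot_cred from student order by ID").fetchall()

print(plain)
print(with_coalesce)
```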
