遇见时光

Keep Looking , Don't Settle
Stay Hungry , Stay Foolish

DatabaseNotes-Ch3,4

Ch3: Basic SQL Query Language

一. Setting up the Database

1. SQL Datatype

CHARACTER(n), CHAR(n): fixed-length character strings

CHARACTER VARYING(n), CHAR VARYING(n):variable-length character strings

NUMERIC(p, s), DECIMAL(p, s), DEC(p, s):

Precision: total number of digits

Scale:number of digits to the right of the decimal point

INTEGER, INT, SMALLINT

FLOAT(p)

REAL

DOUBLE PRECISION


2.Oracle Datatype

Character Datatype

CHAR(n):fixed-length character strings(1\leq n $\leq$2000)

VARCHAR(n):variable-length character strings(1\leq n \leq 4000)

LONG:variable-length character data(text data) (maximum size 2GB)


NUMBER Datatype

NUMBER:fixed and floating-point numbers

NUMBER(precision, scale)

NUMBER(*, scale)

Precision:total number of digits

Scale:number of digits to the right of the decimal point(default value is zero)

Negative Scale:rounds the actual data to the specified number of places to the left of the decimal point


3.SQL Statement

1) SQL statements for table creation for CAP database
1
2
3
4
5
6
CREATE TABLE  orders (
cid CHAR(4) NOT NULL,
cname VARCHAR(13),
city VARCHAR(20),
discnt REAL,
PRIMARY KEY(cid) );

1
2
3
4
5
6
7
8
9
CREATE TABLE  orders (
ordno INTEGER NOT NULL,
month CHAR(3),
cid CHAR(4),
aid CHAR(3),
pid CHAR(3),
qty INTEGER,
dollars DOUBLE PRECISION,
PRIMARY KEY(ordno) );

1
2
3
4
5
6
7
-- MYSQL
create table Student(
Sno int not null auto_increment primary key,
Sname varchar(10) not null,
Sex char(1) not null,
Sage tinyint(100) not null,
Sdept char(4) not null)comment = '学生表';

2) Select Statement
1
2
3
4
5
6
7
SELECT  * | colname { , colname ... }
FROM tablename { , tablename ... }
[ WHERE search_condition ]
[ GROUP BY colname { , colname ... }
[ HAVING search_condition ] ]
[ ORDER BY colname [ ASC | DESC ]
{ , colname [ ASC | DESC ] ... } ];


二. Simple Select Statements

1. Relational Algebra与SQL语法对比

1. Query in relational algebra :

(R where Condition) [A1_1, A2_2, …, Am_m]


Query in SQL:

1
Select A1, A2,...,Am from R where Condition;

2. Query in relational algebra :

((R1_1 ×\times R2_2×\times×\timesRn_n) where Condition) [A1_1, A2_2, …, Am_m]


Query in SQL:

1
Select A1, A2,...,Am from R1, R2,...,Rn where Condition;

3. Query in relational algebra :

Head® = {A1_1,…,An_n,B1_1,…,Bk_k}, Head(S) = {B1_1,…,Bk_k, C1_1,…,Cm_m}

((R\inftyS) where Condition) [A1_1, A2_2, …, Am_m]


Query in SQL:

1
Select A1, A2,...,Am from R, S where Condition and R.B1 = S.B1 and R2.B2 = S.B2 and...and R.Bk = S.Bk;

4. Query in relational algebra :

(R \inftyCondition_{Condition}S) [A1_1, A2_2, …, Am_m]


Query in SQL:

1
Select A1, A2,...,Am from R,S where Condition;

5. Query in relational algebra :

(AGENTS where city = ‘New York’) [aid, aname]


Query in SQL:

1
Select aid, aname from agents where city = 'New York';

6. Query in relational algebra :

(CUSTOMERS) [cid, cname, city, discnt]


Query in SQL:

1
Select * from Customers;
1
Select cid, cname, city, discnt from Customers;

7. Query in relational algebra :

((C[cid, cname] \inftyO)\inftyA) [cname, aname]; 

((C \infty\inftyA) where C.cid = O.cid and O.aid = A.aid) [cname, aname]


Query in SQL:

1
Select distinct cname, aname from customers, orders, agents where customers.cid = orders.cid and orders.aid = agents.aid;

2. Some Examples

1. Display all values of customers in table CUSTOMERS.

1
SELECT  cid, cname, city, discnt  FROM  customers ;
1
SELECT  *  FROM  customers ;

2. Retrieve all (cname, aname) pairs where the customer places an order through the agent.

1
2
3
SELECT  distinct  cname, aname
FROM customers, orders, agents
WHERE customers.cid=orders.cid and orders.aid=agents.aid ;
1
2
3
SELECT  distinct  cname, aname
FROM customers c, orders o, agents a -- (has alias)
WHERE c.cid=o.cid and o.aid=a.aid ;

3. Get the names of customers who order at least one product priced at $0.50.

1
2
3
4
SELECT  cname
FROM products p, orders o, customers c
WHERE price=0.50 and p.pid=o.pid and
o.cid=c.cid ;

4. Retrieve customers who place orders only through agent a03.

O [cid] - (O where aid <> ‘a03’) [cid]

1
2
3
4
5
6
SELECT  o1.cid
FROM orders o1
WHERE o1.cid NOT IN (
SELECT o2.cid
FROM orders o2
WHERE o2.aid <> ‘a03’ ) ;

5. List all pairs of customer cids based in the same city.

1
Select c1.cid, c2.cid from customers c1, customers c2 where c1.city = c2.city and c1.cid < c2.cid;

3. Table and column alias

Table alias in FROM clause

方法1:table_name as alias_name

方法2:table_name alias_name

column alias in SELECT clause

column_name as alias_name


4. ALL | DISTINCT

SELECT aid, pid FROM orders ;

SELECT distinct aid, pid FROM orders ;


Find pids of products ordered by at least two customers.

1
2
3
SELECT  distinct  pid
FROM orders x1, orders x2
WHERE x1.pid = x2.pid and x1.cid < x2.cid ;
1
2
3
SELECT  distinct  pid
FROM orders x1, orders x2
WHERE x1.pid = x2.pid and x1.cid <> x2.cid ;


三. Subqueries

1. The IN Predicate

1. Retrieve cids of customers who place orders with agents in Duluth or Dallas.

1
2
3
4
select  distinct  cid
from orders o, agents a
where a.aid=o.aid and
(a.city=‘Duluth’ or a.city=‘Dallas’); -- Slow
1
2
3
4
5
6
select distinct  cid
from orders
where aid IN (
select aid
from agents
where city= ‘Duluth’ or city=‘Dallas’ ); -- Fast

2. Get all information concerning agents based in Duluth or Dallas.

1
2
3
SELECT  *
FROM agents
WHERE city IN { ‘Duluth’, ‘Dallas’ } ;

3. Get the names and discounts of all customers who place orders through agents in Duluth or Dallas.

1
2
3
4
5
6
7
8
9
SELECT  cname,  discnt
FROM customers
WHERE cid IN (
SELECT o.cid
FROM orders o
WHERE o.aid IN (
SELECT a.aid
FROM agents a
WHERE a.city IN (‘Duluth’, ‘Dallas’))) ;

4. Find the names of customers who order product p05.

1
2
3
Select distinct cname 
from customers c, orders o
where c.cid = o.cid and o.pid = 'p05';
1
2
3
4
5
Select distinct cname 
from customers c
where 'p05' IN (
Select pid from orders o
where o.cid = c.id);

5. Get cids of customers who place an order through at least one agent who places an order for product p03

T1 := (O where pid = ‘p03’) [aid]

T2 := (T1 $\infty $ O) [cid]

1
2
select o2.cid from orders o1, orders o2
where o1.pid = 'p03' and o1.aid = o2.pid;
1
2
select o2.cid from orders o2 where o2.aid IN
(select o1.aid from orders o1 where o1.pid = 'p03');

6. Get cids of all customers who have the same discount as any customer in Dallas or Boston.

T1 := (C where city = ‘Dallas’ or city = ‘Boston’) [discnt]

T2 := (T1  \infty  C) [cid]

1
2
select c2.cid from customers c1, customers c2
where (c1.city = 'Dallas' or c1.city = 'Boston') and c1.discnt = c2.discnt;
1
2
select c2.cid from customers c2 where c2.discnt IN
(select c1.discnt from customers c1 where c1.city = 'Dallas' or c1.city = 'Boston');

7. Get pids of products not ordered by any customer living in a city whose name begin with the letter D.

1
2
3
select p.pid from products p where p.pid NOT IN
(select o.pid from orders o, customers c where
o.cid = c.cid and c.city LIKE 'D%');

注意:IN 操作符允许您在 WHERE 子句中规定多个值。


4. Find ordno values for all orders placed by customers in Duluth through agents in New York.

1
2
3
4
5
6
SELECT  ordno
FROM orders
WHERE (cid, aid) IN -- IN的多组值的合并判断
(select cid, aid
from customers c, agents a
where c.city=‘Duluth’ and a.city=New York’);

2. The Quantified Comparison Predicate

expr Θ\Theta { SOME | ANY | ALL } ( subquery )

IN is =SOME,=ANY

NOT IN is <>ALL

注意:主要用于某一类属性值的比较;或用于替代IN;或求某一类属性最大/最小时的其他属性的值


1. Find aid values of agents with a minimum percent commission.

1
2
3
4
SELECT  aid
FROM agents
WHERE percent <= ALL (
select percent from agents ) ;
1
2
3
4
Select aid
From agents
Where percent IN (
Select min(percent) from agents );

2. Find all customers who have the same discount as that of any of the customers in Dallas or Boston.

1
2
3
4
5
6
SELECT  cid,  cname
FROM customers
WHERE discnt = SOME (
select discnt
from customers
where city=‘Dallas’ or city=‘Boston’ ) ;

3. Get cid values of customers with discnt smaller than those of any customers who live in Duluth.

1
2
3
4
5
6
SELECT  cid,  cname
FROM customers
WHERE discnt < ALL (
select discnt
from customers
where city=‘Duluth’ ) ;

4. Find cid values of customers with discnt smaller than a customer who lives in Duluth.

1
2
3
4
5
6
SELECT  cid
FROM customers
WHERE discnt < SOME (
SELECT discnt
FROM customers
WHERE city = ‘Duluth’ ) ;

注意:

1) > ALL 父查询中的结果集大于子查询中每一个结果集中的值,则为真

2) > ANY, SOME 父查询中的结果集大于子查询中任意一个结果集中的值,则为真

3) = ANY 与子查询 IN相同

4) <> ANY 与 NOT IN 父查询中的结果集不等于子查询中的a或者b或者c,则为真 


3. The EXISTS Predicate

实例:

exists,not exists的使用方法示例,需要的朋友可以参考下。

学生表:

1
2
3
4
5
6
create table t_student
(
id number(8) primary key,
name varchar2(10),
deptment number(8)
);

选课表:

1
2
3
4
5
6
create table select_course
(
ID NUMBER(8) primary key,
STUDENT_ID NUMBER(8) foreign key (COURSE_ID) references course(ID),
COURSE_ID NUMBER(8) foreign key (STUDENT_ID) references student(ID)
);

课程表:

1
2
3
4
5
6
create table COURSE
(
ID NUMBER(8) not null,
C_NAME VARCHAR2(20),
C_NO VARCHAR2(10)
);

1.查询选修了所有课程的学生id、name:(即不存在这样一个学生,他有一门课程没有选。)

分析:

如果有一门课没有选,则此时

(1)select * from select_course sc,t_student ts,course c where sc.student_id=ts.id and sc.course_id=c.id存在null,
这说明(2)select * from course c 的查询结果中确实有记录不存在(1查询中),查询结果返回没有选的课程,
此时select * from t_student ts 后的not exists 判断结果为false,不执行查询。

1
2
3
select id, name from t_student ts where not exists
(select * from course c where not exists
(select * from select_course sc where sc.student_id=ts.id and sc.course_id=c.id));
1
2
3
4
select id, name from t_student ts, 
(select studentid from select_course sc where NOT EXISTS
(select * from course c where c.id = sc.id)) T1
where ts.id = T1.student_id;

2.查询没有选择所有课程的学生,即没有全选的学生。(存在这样的一个学生,他至少有一门课没有选)

分析:

只要有一个门没有选,即

select * from select_course sc,t_student ts,courses c where student_id=t_student.id and courses_id=courses.id 有一条为空,

即not exists null 为true,此时select * from courses有查询结果(id为子查询中的courses.id ),
因此select id,name from t_student 将执行查询(id为子查询中t_student.id )。

1
2
3
select id,name from t_student where exists
(select * from course where not exists
(select * from select_course sc where student_id=t_student.id and course_id=course.id));
1
2
3
4
5
select id name from t_student where (id ,name) NOT IN 
(select id, name from t_student ts,
(select studentid from select_course sc where NOT EXISTS
(select * from course c where c.id = sc.id)) T1
where ts.id = T1.student_id) T2;

3.查询一门课也没有选的学生。(不存这样的一个学生,他至少选修一门课程)

分析:

如果他选修了一门select * from course结果集不为空,not exists 判断结果为false;
select id,name from t_student 不执行查询。

1
2
3
select id,name from t_student where not exists
(select * from course where exists
(select * from select_course sc where student_id=t_student.id and course_id=course.id));
1
2
select id, name from t_student where id NOT IN
(select student_id from select_course );

4.查询至少选修了一门课程的学生。(存在这样的一个学生,他至少选修一门课程)

1
2
3
select id,name from t_student where exists
(select * from course where exists
(select * from select_course sc where student_id=t_student.id and course_id=course.id));
1
2
3
select id, name from t_student where (id, name) NOT IN
(select id, name from t_student where id NOT IN
(select studentid from select_course));

注意:在插入记录前,需要检查这条记录是否已经存在,只有当记录不存在时才执行插入操作,可以通过使用 EXISTS 条件句防止插入重复记录。

1
2
3
INSERT INTO TableIn (ANAME,ASEX) 
SELECT top 1 '张三', '男' FROM TableIn
WHERE not exists (select * from TableIn where TableIn.AID = 7);

Relational AlgebraSQL Predicate
natural joinIN
=SOME
EXISTS
differenceNOT IN
<>ALL
NOT EXISTS

1. Retrieve all customer names where the customer places an order through agent a05.

1
2
3
4
5
6
Select  distinct  c.cname
From customers c
Where EXISTS (
Select
From orders x
Where c.cid=x.cid and x.aid=‘a05’ );
1
2
select distinct c.cname 
from customers c, order o where o.cid = c.cid and o.aid = 'a05';

2. Find all customer names where the customer does not place an order through agent a05.

1
2
3
4
5
6
SELECT  cname
FROM customers c
WHERE cid NOT IN (
SELECT o.cid
FROM orders o
WHERE o.aid = ‘a05’ ) ;
1
2
3
4
5
6
SELECT  cname
FROM customers c
WHERE NOT EXISTS (
SELECT *
FROM orders o
WHERE o.cid = c.cid and o.aid = ‘a05’ ) ;

3. Find cids of all customers who don’t place any order through agent a03.

1
2
3
4
5
6
SELECT  cid
FROM customers
WHERE cid NOT IN (
SELECT o.cid
FROM orders o
WHERE o.aid = ‘a03’ ) ;
1
2
3
4
5
6
SELECT  cid
FROM customers c
WHERE NOT EXISTS (
SELECT *
FROM orders o
WHERE o.cid = c.cid and o.aid = ‘a03’ );

注意:

EXISTS用于检查子查询是否至少会返回一行数据,该子查询实际上并不返回任何数据,而是返回值True或False


not in和not exists的区别:

1. not in 只有当子查询中,select 关键字后的字段有not null约束或者有这种暗示时用not in,另外如果主查询中表大,子查询中的表小但是记录多,则应当使用not in,

例如:查询那些班级中没有学生的:

1
2
select * from class where cid not in 
(select distinct cid from stu);

2. 当表中cid存在null值,not in 不对空值进行处理

解决:

1
2
select * from class where cid not in
(select distinct cid from stu where cid is not null);

not in的执行顺序是:是在表中一条记录一条记录的查询(查询每条记录)符合要求的就返回结果集,不符合的就继续查询下一条记录,直到把表中的记录查询完。也就是说为了证明找不到,所以只能查询全部记录才能证明,并没有用到索引。
not exists:如果主查询表中记录少,子查询表中记录多,并有索引。

例如:查询那些班级中没有学生的

1
2
3
select * from class2
where not exists
(select * from stu1 where stu1.cid =class2.cid)

not exists的执行顺序是:在表中查询,是根据索引查询的,如果存在就返回true,如果不存在就返回false,不会每条记录都去查询。
之所以要多用not exists,而不用not in,也就是not exists查询的效率远远高与not in查询的效率。


4. The LIKE Predicate

zxk


1. Get pids of products not ordered by any customer living in a city whose name begin with the letter D.

1
2
3
4
5
6
SELECT  p.pid
FROM products p
WHERE p.pid NOT IN (
SELECT o.pid
FROM orders o, customers c
WHERE o.cid = c.cid and c.city LIKE ‘D%’ ) ;

2. Retrieve cid values of customers whose cname does not have a third letter equal to ‘%’.

1
2
3
4
Select  cid
From customers
Where cname NOT LIKE ‘_ _ \ % %
ESCAPE ‘ \ ’ ;

注意:% 原为通配符,指定用 ’ \ '符号来说明跟在其后面的通配符字符为普通字符。


3. Retrieve cid values of customers whose cname begins “Tip_” and has an arbitrary number of characters following.

1
2
3
4
Select  cid
From customers
Where cname LIKE ‘Tip\_%
ESCAPE '\' ;

4. Retrieve cid values of customers whose cname starts with the sequence “ab\”.

1
2
3
Select  *
From customers
Where cname LIKE ‘ab\ \%ESCAPE ‘\’ ;

( 连续的两个转义指示字符表示 ‘转义指示符 ’自己)


注意:

‘A_Z’: 所有以 ‘A’ 起头,另一个任何值的字原,且以 ‘Z’ 为结尾的字串。

‘ABC%’: 所有以 ‘ABC’ 起头的字串。

‘%XYZ’: 所有以 ‘XYZ’ 结尾的字串。

‘%AN%’: 所有含有 ‘AN’ 这个套式的字串。

使用 ESCAPE 关键字定义转义符。

WHERE ColumnA LIKE ‘%5/%%’ ESCAPE ‘/’ 即 ‘/’为转义字符,第二个‘%’为普通字符,第一、第三个为通配符。


5. The NULL Predicate

NULL表示不可知不确定,NULL不与任何值相等(包括其本身)

IS NULL 判断某个字符是否为空,并不代表空字符或者是0

=NULL 是判断某个值是否等于NULL

总之,要判断一个数是否等于NULL只能用 IS NULL 或者 IS NOT NULL 来判断

1
2
3
4
5
-- 选择非空列
select * from tableName where colName IS NOT NULL;

-- 删除列为NULL的行
delete from tableName where colName IS NULL;

6.Example of Subqueries

1. Retrieve all customer names where the customer places an order through agent a05.

1
2
3
SELECT  distinct  cname
FROM customers c, orders o
WHERE c.cid = o.cid and o.aid = ‘a05’ ;
1
2
3
4
SELECT  distinct  cname
FROM customers
WHERE cid IN
(SELECT cid FROM orders WHERE aid=‘a05’) ;
1
2
3
4
SELECT  distinct  cname
FROM customers c
WHERE EXISTS
(SELECT * FROM orders o WHERE o.cid=c.cid and o.aid=‘a05’) ;

2. Get cids of customers who order both products p01 and p07.

1
2
3
SELECT  o1.cid  FROM  orders  o1, orders  o2
WHERE o1.cid=o2.cid and
o1.pid=‘p01’ and o2.pid=‘p07’;
1
2
3
4
SELECT  o1.cid  FROM orders  o1
WHERE o1.pid=‘p01’ and o1.cid IN (
SELECT o2.cid FROM orders o2
WHERE o2.pid=‘p07’ );
1
2
3
Select cid from orders where pid = 'p01'
UNION
Select cid from orders where pid = 'p07';

3. Get cids of customers who place an order through at least one agent who places an order for product p03.

T1 := (O where pid = ‘p03’)[aid]

T2 := (T1  \infty O) [cid]

1
2
3
SELECT  o2.cid
FROM orders o1, orders o2
WHERE o1.pid=‘p03’ and o1.aid=o2.aid;

4. Get cids of all customers who have the same discount as any customer in Dallas or Boston.

1
2
3
4
SELECT  c1.cid
FROM customers c1, customers c2
WHERE ( c1.city = ‘Dallas’ or c1.city = ‘Boston’ )
and c1.discnt = c2.discnt and c1.cid < c2.cid;

5. Find all customer names where the customer does not place an order through agent a05.

((C[cid] – (O where aid = ‘a05’) [cid])  \infty C) [cname]

1
2
3
4
5
6
SELECT  cname
FROM customers
WHERE cid NOT IN (
SELECT o.cid
FROM orders o
WHERE o.aid = ‘a05’ ) ;
1
2
3
4
5
6
SELECT  cname
FROM customers c
WHERE NOT EXISTS (
SELECT *
FROM orders o
WHERE o.cid = c.cid and o.aid = ‘a05’ ) ;

注意:此处可以看见NOT EXISTS需要将索引联结起来;而NOT IN则不需要


6. Find all cid, aid pairs where the customer does not place an order through the agent.

1
2
3
4
5
6
SELECT  cid,  aid
FROM customers c, agents a
WHERE NOT EXISTS (
SELECT *
FROM orders o
WHERE o.cid = c.cid and o.aid = a.aid );
1
2
3
4
SELECT  cid,  aid
FROM customers c, agents a
WHERE (cid, aid) NOT IN (
SELECT o.cid, o.aid FROM orders o );

注意:此处也可以看见NOT EXISTS需要将索引联结起来;而NOT IN则不需要


7. Get the names of customers who order at least one product priced at $0.50.

1
2
3
4
SELECT  cname
FROM products p, orders o, customers c
WHERE price=0.50 and p.pid=o.pid and
o.cid=c.cid ;

8. Retrieve customers who place orders only through agent a03

O [cid] - (O where aid <> ‘a03’) [cid]

1
2
3
4
5
6
SELECT  o1.cid
FROM orders o1
WHERE o1.cid NOT IN (
SELECT o2.cid
FROM orders o2
WHERE o2.aid <> ‘a03’) ;


四.UNION Operators and FOR ALL Conditions

1. The UNION Operator(并集/交集拓展)

1. Retrieve all cities where either a customer or an agent, or both, is based.

1
2
3
select  city  from  customers 
UNION
select city from agents
1
2
3
select  city  from  customers 
UNION
select city from agents
1
2
3
select  city  from  customers
UNION ALL
select city from agents

注意:

如果我们需要将两个或多个select语句的结果作为一个整体显示出来,这是我们就需要用到union或者union all关键字。

union和union all的区别是,union会自动压缩多个结果集合中的重复结果,而union all则将所有的结果全部显示出来,不管是不是重复。


拓展:求交集

1. Get cid values of customers who order both products p01 and p07.

(Orders where pid = ‘p01’) [cid] \cap (Orders where pid = ‘p07’) [cid]

1
2
3
4
select o1.cid from Orders o1
where pid = 'p01' and pid IN
(select o2.cid from Orders o2
where pid = 'p02' and o1.cid = o2.cid);
1
2
3
select cid from (select cid from Orders where pid = 'p01') t1,
(select cid from Orders where pid = 'p07') t2
where t1.cid = t2.cid;

2. The Division Operation in Relational Algebra

注意:

1. 可将关系代数中的除法运算转换为SQL语句的NOT EXISTS二重条件判断

2. 写出关系代数式(包含除法运算)根据表关系构造NOT EXISTS二重条件判断

3. NOT EXISTS判断为Select *,且需要建立索引


1. Find cids of customers who place orders with ALL agents based in New York.

o[cid, aid] ÷\div (a where city=‘New York’)[aid]

1
2
3
4
5
6
7
8
9
SELECT  c.cid
FROM customers c
WHERE NOT EXISTS (
SELECT *
FROM agents a
WHERE a.city =New York’ and NOT EXISTS (
SELECT *
FROM orders o
WHERE o.cid = c.cid and o.aid = a.aid ) );

zxk


2. Get the aid values of agents in New York or Duluth who place orders for all products costing more than a dollar.

1
2
3
4
5
6
7
8
9
10
SELECT  aid
FROM agents a
WHERE (city=New York’ or city=‘Duluth’) and
NOT EXISTS (
SELECT *
FROM products p
WHERE p.price > 1 and NOT EXISTS (
SELECT *
FROM orders o
WHERE o.aid=a.aid and o.pid=p.pid ) );

3. Find aid values of agents who place orders for product p01 as well as for all products costing more than a dollar.

1
2
3
4
5
6
7
8
9
SELECT  aid    FROM  agents  a
WHERE aid IN (
select aid from orders
where pid=‘p01’ )
and NOT EXISTS (
SELECT * FROM products p
WHERE p.price > 1 and NOT EXISTS (
SELECT * FROM orders o
WHERE o.aid=a.aid and o.pid=p.pid));

4. Get cids of customers who order all products that anybody orders.

O[cid, pid] ÷ O[pid]

1
2
3
4
5
6
SELECT  c.cid  FROM  customers  c
WHERE NOT EXISTS (
SELECT * FROM orders o1
WHERE NOT EXISTS (
SELECT * FROM orders o2
WHERE o2.cid=c.cid and o2.pid=o1.pid ));

5. Get aids of agents who take orders on at least that set of products ordered by c004.

O[aid, pid] ÷ (O where cid = ‘c004’)[pid]

1
2
3
4
5
6
7
8
9
SELECT  a.aid
FROM agents a
WHERE NOT EXISTS (
SELECT *
FROM orders o1
WHERE o1.cid=‘c004’ and NOT EXISTS (
SELECT *
FROM orders o2
WHERE o2.aid=a.aid and o2.pid=o1.pid ));

6. Get cids for customers with the following property:

If customer c006 orders a product x through agent y, so the customer orders the product x through the agent y.

1
2
3
4
5
6
7
8
9
10
SELECT  cid
FROM customers c
WHERE NOT EXISTS (
SELECT *
FROM orders o1
WHERE o1.cid=‘c006’ and NOT EXISTS (
SELECT *
FROM orders o2
WHERE o2.cid = c.cid and o2.pid = o1.pid
and o2.aid = o1.aid ) );

7. Get aids of agents who place orders for all customers who place orders for all products costing more than a dollar through the agent.

1
2
3
4
5
6
7
8
9
10
SELECT  aid
FROM agents a
WHERE not exists (
SELECT *
FROM customers c, products p
WHERE p.price > 1 and not exists (
SELECT *
FROM orders x
WHERE x.cid = c.cid and x.pid = p.pid
and x.aid = a.aid ) );

注意:

Relational Algebra:

R(x, y) ÷\div S(y)

Relational Calculus:

\forallz ( \exists y ( P(z, y) )) \longleftrightarrow \nexistsz ( \nexists y ( P(z, y) ))


3. Set Functions in SQL(聚合函数)

zxk


1. Get the total dollar amount of all orders.

1
2
select  sum ( dollars )  as  totaldollars
from orders;

2. Get the total quantity of product p03 that has been ordered.

1
2
3
select  sum ( qty )  as  TOTAL
from orders
where pid = ‘p03’;

3. List the cid values of all customers who have a discount less than the maximum discount.

1
2
3
4
5
select  cid
from customers c1
where discnt < all (
select max(c2.discnt)
from customers c2 );

4. Find products ordered by at least two customers.

1
2
3
4
5
6
select   p.pid
from products p
where 2 <= ALL (
select count ( distinct cid )
from orders o
where o.pid = p.pid );

注意:

1. 此处表示至少有两个顾客购买的产品;

2. 具体语句运行过程:

a. 先将两表连接;

b. 然后返回相等pid下cid不重复的记录数;

c. 若相同pid的该记录数大于2,则返回其pid;


1
2
3
4
select p.pid from (
select pid, count(distinct cid) dcd
from orders as o group by pid) t, products p
where t.pid = t.pid and t.dcd >= 2;

注意:

1. 先将orders表pid,cid不重复记录分组构造新表t;

2. 再将produsts表与t表连接,最后返回cid不重复记录数大于2的pid;


1
2
3
4
SELECT  pid
FROM orders
GROUP BY pid
HAVING count ( distinct cid ) >= 2 ;

注意:使用Group By对pid进行分组,Having对cid进行过滤


注意:

1. count(*):获取数量

2. sum():求和(这里要注意求和是忽略null值的,null与其他数值相加结果为null,所以可以通过ifnull(xxx,0)将null的值赋为0

3. avg():求平均数

4. max():求最大值

5. min():求最小值


4. Groups of Rows in SQL

GROUP BY clause & HAVING clause

SQL语句的执行顺序:

from–where–group by–having–select–order by

from:需要从哪个数据表检索数据
where:过滤表中数据的条件
group by:如何将上面过滤出的数据分组
having:对上面已经分组的数据进行过滤的条件
select:查看结果集中的哪个列,或列的计算结果
order by:按照什么样的顺序来查看返回的数据


1) GROUP BY

  • GROUP BY语句用来与聚合函数(aggregate functions such as COUNT, SUM, AVG, MIN, or MAX.)联合使用来得到一个或多个列的结果集。

  • 因为聚合函数通过作用于一组数据而只返回一个单个值,因此,在SELECT语句中出现的元素要么为一个聚合函数的输入值,要么为GROUP BY语句的参数,否则会出错。


例如,对于上面提到的表格,我们做一个这样的查询:

1
2
3
SELECT id, COUNT(course) as numcourse, score
FROM student
GROUP BY id;

此时查询便会出错,错误提示如下:

Column ‘student.score’ is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

出现以上错误的原因是因为一个学生id对应多个分数,如果我们简单的在

SELECT语句中写上score,则无法判断应该输出哪一个分数。如果想用score作为

select语句的参数可以将它用作一个聚合函数的输入值,如下例,我们可以得到每个

学生所选的课程门数以及每个学生的平均分数:

1
2
3
SELECT id, COUNT(course) as numcourse, AVG(score) as avgscore
FROM student
GROUP BY id;

1. Find the agent name and aid, and the product name and pid, together with the total quantity each agent supplies of that product to customers c002 and c003.

1
2
3
4
5
SELECT  a.aid, a.aname, p.pid, p.pname, sum(qty)
FROM agents a, products p, orders o
WHERE a.aid = o.aid and p.pid = o.pid and
o.cid IN ('c002','c003')
GROUP BY a.aid, a.aname, p.pid, p.pname

注意:多表连接也可以使用Group By进行分组


2.对某个字段分组且进行数据统计,可使用count(*)技巧

例如说明现在有一个这样的表:每个部门有多少人 就要用到分组的技术

1
2
select DepartmentID as '部门名称',
COUNT(*) as '个数' from BasicDepartment group by DepartmentID

这个就是使用了group by +字段进行了分组,其中我们就可以理解为我们按照了部门的名称ID,DepartmentID将数据集进行了分组;然后再进行各个组的统计数据分别有多少;
通俗一点说:group by 字段1,字段2。。。(整个表中不止这两个字段)表示数据集中字段1相等,字段2也相等的数据归为一组,只显示一条数据。那么你可以对字段3进行统计(求和,求平均值等);


3.MySQL中group by的时候字段不加聚合函数的情况

首先对于不加聚合函数的sql语句来说,它的功能结合了limit来得出结果,仔细想想的时候有点Oracle分析函数的味道,limit的时候得到的并不是最大最小的值(元素按照原来在表中的默认顺序结合分组),而是某一下group by结果集中第一行,也就是刚才说的相当于先group by, 然后在每个group by下面进行limit 1


其次,刚才还说了常规的group by结合聚合函数的时候,选择的字段除了聚合函数外,必须在group by中存在,但是在mysql中不是这样了,它具有隐含字段的功能,例如:

1
select a,b,c,count(*) from test1 group by a,b;
abccount(*)
1113
1253

注意:对于没有选择的字段,上面是c,c的值还是和上面说到的一样,是根据得到的结果集然后根据每个group by 进行limit 1得到的结果。


2) HAVING

  • HAVING语句通常与GROUP BY语句联合使用,用来过滤由GROUP BY语句返回的记录集。

  • HAVING语句的存在弥补了WHERE关键字不能与聚合函数联合使用的不足(只能接聚合函数进行筛选)。


1. Find all product and agent IDs and the total quantity ordered of the product by the agent, when this quantity exceeds 1000.

1
2
3
4
SELECT  pid,  aid,  sum(qty)  as  total
FROM orders
GROUP BY pid, aid
HAVING sum(qty) > 1000 ;

2. Provide pid values of all products purchased by at least two customers.

1
2
3
4
SELECT  pid
FROM orders
GROUP BY pid
HAVING count ( distinct cid ) >= 2 ;

3. Find the average, over all agents, of the maximum dollar sales made by each agent.

Right:

1
2
3
4
SELECT  avg ( t.x )
FROM ( select aid, max(dollars) as x
from orders
group by aid ) t ;

Wrong:

1
2
3
SELECT  avg ( select max ( dollars ) from orders )
FROM orders
GROUP BY aid ;

5. Insert, Update, and Delete Statements

1) The Insert Statement

1
2
INSERT INTO tablename [ ( colname, ...... ) ]
VALUES ( expr|NULL, ...... ) | subquery

  • 带有子查询
1
2
INSERT INTO T_ReaderFavorite2 (FCategoryId, FReaderId)
SELECT FCategoryId, FReaderId FROM T_ReaderFavorite Where ....

这里使用SELECT FCategoryId,FReaderId FROM T_ReaderFavorite将T_ReaderFavorite表中的数据读出,然后使用INSERT INTO T_ReaderFavorite2(FCategoryId,FReaderId)将检索结果插入到T_ReaderFavorite2 表中,注意上下的列顺序必须是一一对应的。

1
2
INSERT INTO T_ReaderFavorite(FCategoryId,FReaderId)
SELECT 1,FId FROM T_Reader Where ....

SELECT语句从T_Reader表中检索所有的读者信息,并且将第一列设定为固定值1,而将第二列设定为读者的主键,执行完毕查看T_ReaderFavorite表中的内容。


2) The Update Statements

1
2
3
UPDATE  tablename
SET colname = expr | NULL | subquery, ......
[ WHERE search-condition ];

1. Give all customers who have total orders of more than $1000 a 10% increase in the discnt they receive.

1
2
3
4
5
6
7
UPDATE  customers
SET discnt = 1.1 * discnt
WHERE cid in ( select cid
from orders
group by cid
having sum(dollars) > 1000 );

注意:可利用Select不包含聚合函数,Having包含聚合函数进行单一筛选!


1
2
3
4
UPDATE customers c
SET discnt = 1.1 * discnt
WHERE 1000 < ( Select sum(dollars)
from orders o where o.cid = c.cid);

  • 多张表UPDATE用法
  • 当用一个表中的数据来更新另一个表中的数据,T-SQL提供多种写法(下面列出了二种),但建议用第一种写法,虽然传统,但结构清晰。
    并且要注意,当用一个表中的数据来更新另一个表中的数据时,二个表一定要有关联!
1
2
3
update t1 set t1.c2 = t2.c2
from t2
where t1.c1 = t2.c1;
1
2
Update t1 set t1.c2 = t2.c2
from t1 inner join t2 on t1.c1 = t2.c1;
1
2
Update t1 set t1.c2 = t2.c2
from t1 inner join t2 using(c1);

注意:用using关键字进行简化

1.查询必须是等值连接。

2.等值连接中的列必须具有相同的名称和数据类型。


表联结(join)拓展:

相同多列联结:

1
2
3
select... from table1 inner join table2

using(column1,column2)

相当于

1
2
3
4
5
select... from table1 inner join table2

on table1.column1=table2.column1

and table1.column2=table2.column2;
1
2
3
4
5
select... from table1, table2

where table1.column1=table2.column1

and table1.column2=table2.column2;

多表联结:

1
2
3
4
5
select... from table1

inner join table2 using(column1)

inner join table3 using(column2);

相当于

1
2
3
4
5
select... from table1,table2,table3

where table1.column1=table2.column1

and table1.column2=table3.column2;
1
2
3
4
5
select... from table1 

inner join table2 on table1.column1=table2.column1

inner join table3 on table1.column2=table3.column2;

  • FROM 子句中指定的表的别名不能作为 SET column_name 子句中被修改字段的限定符使用。
1
2
3
4
5
UPDATE titles
SET t.ytd_sales = t.ytd_sales + s.qty --此处的t.ytd_sales使用不当
FROM titles t, sales s
WHERE t.title_id = s.title_id
AND s.ord_date = (SELECT MAX(sales.ord_date) FROM sales);
  • 若要使上例合法,请从列名中删除别名 t 或使用本身的表名。
1
2
3
4
5
UPDATE titles
SET ytd_sales = t.ytd_sales + s.qty --将别名t改为表名
FROM titles t, sales s
WHERE t.title_id = s.title_id
AND s.ord_date = (SELECT MAX(sales.ord_date) FROM sales);

  • 更新多列
1
2
3
4
5
6
7
update mytab a 
set (name,address)=(select b.name,b.address
from goal b
where b.id=a.id)
where exists (select 1
from goal b
where b.id=a.id );

1. 多表关联update的时候,记得要加exists()条件,否则不满足条件的记录被update称NULL:
比如:stu表存在,但stu1表不存在的数据,对应的字段会被update成NULL;

2. oracle在进行更新时无法join,  所以我们用exists来进行连表条件查询,相当于mytab表和goal表关联,当id相等时才执行更新!


  • Update与判断操作的结合

1. if(true,a,b), if(false,a,b) 这个就是第一个如果是true,就等于a,false就等于b,有点像三元表达式

2. case when then else end

(Eg.)性别互换:

1
2
3
update salary set sex = if(sex = 'm','f','m');
select (case sex when '男' then '女' when '女' then '男' end ) from salary;
update salary set sex = case sex when 'f' then 'm' else 'f' end ;

3) The Delete Statement

1
2
3
4
5
DELETE
FROM tablename
[ WHERE search-condition ]
[ORDER BY c1, c2, ...]
[LIMIT row_count];

  • 多表删除

  • 多表删除是指从一个或多个表中删除满足条件的数据,其中的 table_references代表了多个表的join操作,例如以下两个例子代表

  • 从t 1和t 2表中删除满足条件的数据

1
2
DELETE FROM somelog WHERE user = 'jcole' ORDER BY timestamp_column LIMIT 1;
DELETE t1, t2 FROM t1 INNER JOIN t2 on t1.id=t2.id INNER JOIN t3 on t1.id=t3.id;

  • 对多表删除语句而言,如果想对表使用别名,则只能在 table_references子句中使用,否则会报错。
1
2
DELETE a1, a2 FROM t1 AS a1 INNER JOIN t2 AS a2 ON a1.id=a2.id;      -- 正确
DELETE t1 AS a1, t2 AS a2 FROM t1 INNER JOIN t2 WHERE a1.id=a2.id; -- 错误

1. Give all agents in New York a 10% raise in the percent commission they earn on an order.

1
2
3
UPDATE  agents
SET percent = 1.1 * percent
WHERE city =New York’;

2. Give all customers who have total orders of more than $1000 a 10% increase in the discnt they receive.

1
2
3
4
5
6
UPDATE  customers
SET discnt = 1.1 * discnt
WHERE cid in ( select cid
from orders
group by cid
having sum(dollars) > 1000 );

3. Delete all agents in New York.

1
2
DELETE  FROM  agents
WHERE city =New York’;

4. Delete all agents who have total orders of less than $600.

1
2
3
4
5
6
DELETE  FROM  agents
WHERE aid IN (
select aid
from orders
group by aid
having sum(dollars) < 600 );


Ch4.Object-Relational SQL

一. Object Types

1.定义新的数据类型(对象类型)

  • 对象类型的创建与删除
1
2
CREATE  TYPE  typename  AS  OBJECT
(attrname datatype, ......);
1
DROP  TYPE  typename;

  • 创建一个 ’姓名’(name_t) 类型
1
2
3
4
5
CREATE TYPE name_t AS OBJECT (
lname varchar(30),
fname varchar(30),
mi char(1)
);

2.对象类型的使用方法

  • 使用已有的对象类型来创建新的类型或表
1
2
3
4
CREATE  TYPE  person_t  AS  OBJECT (
ssno int,
pname name_t, -- (类型的嵌套定义关系)
age int );

3.使用对象数据类型来直接创建一张表

  • CREATE TABLE tablename OF typename
    { ( constraint-define ) } ;
1
CREATE  TABLE  people  OF  person_t (PRIMARY  KEY(ssno)) ;

4.对象值的创建、查询与更新

  • 对象值的创建
1
2
Insert  into  teachers  values
(1234, name_t(‘Einstein’, ‘Albert’, ‘E’), 120); -- 对象构造函数
  • 对象值的查询
1
2
3
Select   t.tid, t.name_t.fname, t.name_t.lname
From teachers t
Where t.room = 123
  • 对象值的更新

1) 修改元组中的对象属性值

1
2
3
update  people  p
set p.pname = name_t('Gould', 'Ben', null)
where ssno = 321341223;

2) 修改某个对象属性值的成员属性

1
2
3
update  people  p
set p.pname.mi = ‘C’
where ssno = 321341223;

3) 修改整个元组

1
2
3
4
5
update  people  p
set p = person_t (
332341223,
name_t('Gould', 'Glen', 'A'), 55 )
where ssno = 321341223;

5.对象的引用类型

  • 对象的引用类型:REF  < object type >
  1. 是指向某个元组对象的指针类型

  2. 可用于实现对象类型之间的嵌套引用


  • 在使用含有 REF类型的对象类型(Object Type)来创建关系表时,必须使用Scope for子句来限制REF属性的取值范围

1) 定义类型之间的引用关系

定义基本的对象类型:
customer_t, agents_t, products_t

1
2
3
4
5
create  type customer_t as object (
cid char(4),
cname varchar(13),
city varchar(20),
discnt real);
1
2
3
4
5
create  type  agent_t as object (  
aid char(3),
aname varchar(13),
city varchar(20),
percent smallint);
1
2
3
4
5
6
create  type  product_t as object (
pid char(3),
pname varchar(13),
city varchar(20),
quantity integer,
price double precision);

定义类型之间的引用关系 :

1
2
3
4
5
6
7
8
9
10
11
create type order_t as object  (
ordno int,
month char(3),
cid char(4),
aid char(3),
pid char(3),
qty int,
dollars double precision,
ordcust ref customer_t,
ordagent ref agent_t,
ordprod ref product_t);

(新增三个REF属性,分别引用(指向)三个不同的元组对象(Row Object))


2) 创建含有引用类型的关系表

使用前面定义的对象类型创建基本关系表 :

1
2
create table customers of customer_t
(primary key (cid));
1
2
create table products of product_t
(primary key (pid));
1
2
create table agents of agent_t
(primary key (aid));

创建含有引用类型的关系表

1
2
3
4
5
create table orders of order_t  (
primary key (ordno),
scope for (ordcust) is customers,
scope for (ordagent) is agents,
scope for (ordprod) is products );

3) 引用关系查询

根据元组之间的REF引用关系进行查询

1
2
3
select   o.ordno, o.ordcust.cname
from orders o
where o.dollars > 200.00

1. Find pid values of products that have been ordered by at least two customers.

1
2
3
select  distinct  x1.pid 
from orders x1, orders x2
where x1.pid = x2.pid and x1.ordcust < x2.ordcust;

ordcust 承担起了 cid 所担负的责任!

1
2
3
select  distinct  x1.pid 
from orders x1, orders x2
where x1.pid = x2.pid and x1.cid < x2.cid;

4) 函数与谓词

  • 函数与谓词

1.获取对象(元组)的引用指针: REF(……)

2.返回引用指针所指向对象的值: DEREF(……)


  • 两个谓词

    • IS DANGLING

      1. 用于判断所引用的元组对象是否存在

      2. 如果所引用的元组对象不存在,那么该谓词返回逻辑真(TRUE),否则返回逻辑假(FALSE)。

      3. 该谓词主要用于检查那些错误的对象引用指针。

    • IS NULL


1. Retrieve all customer names where the customer does not place an order through agent a05.

1
2
3
4
5
select  c.cname
from customers c
where not exists (
select * from orders x
where x.ordcust = ref(c) and x.aid = 'a05‘ );

1
2
select  o.cid    from  orders o
where o.ordcust IS DANGLING;

等价于:

1
2
3
4
select o.cid   from orders o
where o.ordcust <>
(select ref(c) from customers c
where c.cid = o.cid)

5) 类型的循环嵌套定义

  • 对象类型(object type)不能嵌套定义,但 REF 关系可以实现嵌套引用。
1
2
3
4
create  type  police_officer_t as object
( pol_person person_t,
badge_number integer,
partner ref police_officer_t );

  • 可以使用对象类型 police_officer_t 来创建表
1
2
3
create table police_officers of police_officer_t
( primary key (badge_number),
scope for (partner) is police_officers );

  • 在表 police_officers 上的对象查询的例子

1. Retrieve the last names of all police officers who have partners over sixty years of age.

1
2
3
select  p.pol_person.pname.lname
from police_officers p
where p.partner.pol_person.age > 60;

6) 其它约束

  • 有关 REF定义的其它约束 (REF Dependencies)

1) 两张表之间的相互 REF关系的定义

  1. 首先,定义两个具有相互REF关系的对象类型 (create type …… )

  2. 再用创建好的对象类型创建关系表


2) 两个具有相互REF关系的表/类型的删除

  1. 在删除类型(drop type)之前需要先删除表(drop table)

  2. 在删除类型(drop type)时需要采用强制删除的方式

1
DROP TYPE typename FORCE;

先不管REF属性的赋值(先置为NULL),然后再使用UPDATE操作修改REF属性上的取值

1
2
3
4
5
6
7
update  orders  o
set ordcust = (select ref(c) from customers c
where c.cid = o.cid),
ordagent = (select ref(a) from agents a
where a.aid = o.aid),
ordprod = (select ref(p) from products p
where p.pid = o.pid);


二. Collection Types(集合类型)

1.Table Types ( Nested Tables )

1) 创建一个新的表类型 (table type)

1
2
CREATE  TYPE  dependents_t
AS TABLE OF person_t;

2) 使用类型dependents_t来定义表employees中的属性并形成一个嵌套表(nested table)定义。

1
2
3
4
5
6
create table employees (
eid int,
eperson person_t,
dependents dependents_t,
primary key (eid)
) nested table dependents store as dependents_tab;

3) Nested table的访问

  • table ( … ) 的使用

    Oracle 不支持直接对嵌套表属性的统计查询功能,即下述的统计查询操作是错误的:

    1
    2
    3
    select  count(e.dependents)
    from employees e
    where e.eid = 101;

    应为:

    1
    2
    3
    4
    Select  count(*)
    From table (
    select e.dependents
    from employees e where e.eid = 101);

  • Oracle 数据库没有提供 nested table 的相等比较运算。

    可以使用 IN 操作符来实现某些需要通过 nested table 进行的查询功能


  • Oracle提供了单个对象的相等比较功能

    1. Retrieve eids with dependents that have name given by
    name_t(‘Lukas’, ‘David’, ‘E’)

    1
    2
    3
    4
    select  eid  from  employees e
    where name_t(‘Lukas’, ‘David’, ‘E’) in
    ( select d.pname
    from table(e.dependents) d);

  • Nested Cursors (嵌套游标)

    (i) 普通的查询操作

    1
    2
    3
    select e.eid, d.ssno as dep_sso
    from employees e, table(e.dependents) d
    where d.age < 16;

    (ii) 使用 nested cursor 的查询操作

    1
    2
    3
    4
    select e.eid, cursor(select d.ssno as dep_ssno 
    from table(e.dependents) d
    where d.age < 16) dep_tab
    from employees e;

2.Array Types

1) Array Types for VARRAYs

1. 自定义一个TYPE使用 VARRAY来得到一个数组但只能对基本类型定义。

2. VARRAY数据结构可存储相同类型元素的一个固定大小的连续集合。
3. VARRAY用于存储数据的有序集合,但它往往是更加有用认为数组作为相同类型的变量的集合。


  • 一个VARRAY类型是用CREATE TYPE语句创建。必须指定最大长度,并存储在 VARRAY元素的类型。
    PL/SQL 块创建 VRRAY类型的基本语法:
1
TYPE  varray_type_name  IS  VARRAY(n) of <element_type>

create type extensions_t as varray(4) of int;

使用 Array Types 定义表中的属性

1
2
3
create  table  phonebook  (
phperson person_t,
extensions extensions_t );
大爷,赏点?