时间:2021-07-01 10:21:17 帮助过:9人阅读
CREATE TABLE celebs ( id INTEGER, name TEXT, age INTEGER ); # 第一列id,数据类型整数;第二列name,数据类型文本;第三列age,数据类型整数
1.5 Insert 向表中插入行
INSERT INTO celebs ( id, name, age) VALUES ( 1, ‘Alan Mathison Turing‘, 42); # 在celebs表最下方插入数据:id列为1,name列为Alan Mathion Turing,age列为42
1.6 Select 选取数据
SELECT * FROM celebs; # 显示celebs表所有数据
1.7 Update 修改数据
UPDATE celebs SET age = 22 WHERE id = 1; # 将celebs表中id=1的行的age改为22
1.8 Alert 更改表结构或数据类型
ALERT TABLE celebs ADD COLUMN twitter_handle TEXT; # 在celebs表增加twitter_handle列
ALERT TABLE ‘test‘.‘data‘ CHANGE COLUMN ‘Mobile‘ ‘Mobile‘ BLOB NULL DEFAULT NULL; # 将表test.data的Mobile列的数据类型改为BLOB,该列数据默认为NULL
1.9 DELETE 删除行
DELETE FROM celebs WHERE twitter_handle IS NULL; # 删除表celebs中twitter_handle为NULL的行
2. Queries - Retrieve data 2.3 Select Distinct 返回唯一不同的值
SELECT DISTINCT genre FROM movies; # 查询movies表中genre列的所有不重复值
2.4 Where 规定选择的标准
SELECT * FROM movies WHERE imdb_rating > 8; # 查询movies表中imdb_rating大于8的行
= equals != not equals > greater than < less than >= greater than or equal to <= less than or equal to 2.5 Like I 在 WHERE 子句中搜索列中的指定模式
SELECT * FROM movies WHERE name LIKE ‘ Se_en‘;
2.6 Like II
SELECT * FROM movies WHERE name LIKE ‘a%‘;
SELECT * FROM movies WHERE name LIKE ‘%man%‘;
NB 通配符 ‘_‘ substitutes any individual character ‘%‘ matches zero or more missing characters ‘[charlist]%‘ any individual character in string: WHERE city LIKE ‘[ALN]%‘ 以“A"或”L“或”N“开头的城市 ‘[!charlist]%‘ any individual character not in string: WHERE city LIKE ‘[!ALN]%‘ 不以“A"或”L“或”N“开头的城市 2.7 Between 在 WHERE 子句中使用,选取介于两个值之间的数据范围 The BETWEEN operator is used to filter the result set within a certain range. The values can be numbers, text or dates.
SELECT * FROM movies WHERE name BETWEEN ‘A‘ AND ‘J‘; # 查询movies中name以A至J开头的所有行
NB: names that begin with letter "A" up to but not including "J". 不同的数据库对 BETWEEN...AND 操作符的处理方式是有差异的,有开区间、闭区间,也有半开半闭区间。
SELECT * FROM movies WHERE year BETWEEN 1990 AND 2000; # 查询movies中year在1990至2000年间的行
NB: years between 1990 up to and including 2000 2.8 And 且运算符 AND is an operator that combines two conditions. Both conditions must be true for the row to be included in the result set.
SELECT * FROM movies WHERE year BETWEEN 1990 AND 2000 AND genre = ‘comedy‘; # 查询movies中year在1990至2000间,且genre为comedy的行
2.9 Or 或运算符 OR is used to combine more than one condition in WHERE clause. It evaluates each condition separately and if any of the conditions are true than the row is added to the result set. OR is an operator that filters the result set to only include rows where either condition is true.
SELECT * FROM movies WHERE genre = ‘comedy‘ OR year < 1980; # 查询movies中genre为comedy,或year小于1980的行
2.10 Order By 对结果集进行排序
SELECT * FROM movies ORDER BY imdb_rating DESC; # 查询movies中的行,结果以imdb_rating降序排列
DESC sorts the result by a particular column in descending order (high to low or Z - A). ASC ascending order (low to high or A - Z). 2.11 Limit 规定返回的记录的数目 LIMIT is a clause that lets you specify the maximum number of rows the result set will have.
SELECT * FROM movies ORDER BY imdb_rating ASC LIMIT 3; # 查询movies中的行,结果以imdb_rating升序排列,仅返回前3行
MS SQL Server中使用SELECT TOP 3,Oracle中使用WHERE ROWNUM <= 5(?) 3. Aggregate Function 3.2 Count 返回匹配指定条件的行数 COUNT( ) is a function that takes the name of a column as an argument and counts the number of rows where the column is not NULL.
SELECT COUNT(*) FROM fake_apps WHERE price = 0; # 返回fake_apps中price=0的行数
3.3 Group By 合计函数
SELECT price, COUNT(*) FROM fake_apps WHERE downloads > 2000 GROUP BY price; # 查询fake_apps表中downloads大于2000的行,将结果集根据price分组,返回price和行数
Here, our aggregate function is COUNT( ) and we are passing price as an argument(参数) to GROUP BY. SQL will count the total number of apps for each price in the table. It is usually helpful to SELECT the column you pass as an argument to GROUP BY. Here we SELECT price and COUNT(*). 3.4 Sum 返回数值列的总数(总额) SUM is a function that takes the name of a column as an argument and returns the sum of all the values in that column.
SELECT category, SUM(downloads) FROM fake_apps GROUP BY category;3.5 Max 返回一列中的最大值(NULL 值不包括在计算中) MAX( ) is a function that takes the name of a column as an argument and returns the largest value in that column.
SELECT name, category, MAX(downloads) FROM fake_apps GROUP BY category;3.6 Min 返回一列中的最小值(NULL 值不包括在计算中) MIN( ) is a function that takes the name of a column as an argument and returns the smallest value in that column.
SELECT name, category, MIN(downloads) FROM fake_apps GROUP BY category;3.7 Average 返回数值列的平均值(NULL 值不包括在计算中)
SELECT price, AVG(downloads) FROM fake_apps GROUP BY price;
3.8 Round 把数值字段舍入为指定的小数位数 ROUND( ) is a function that takes a column name and an integer as an argument. It rounds the values in the column to the number of decimal places specified by the integer.
SELECT price, ROUND(AVG(downloads), 2) FROM fake_apps GROUP BY price;4. Multiple Tables 4.2 Primary Key 主键 A primary key serves as a unique identifier for each row or record in a given table. The primary key is literally an "id" value for a record. We could use this value to connect the table to other tables.
CREATE TABLE artists ( id INTEGER PRIMARY KET, name TEXT );NB By specifying that the "id" column is the "PRIMARY KEY", SQL make sure that: 1. None of the values in this column are "NULL"; 2. Each value in this column is unique. A table can not have more than one "PRIMARY KEY" column. 4.3 Foreign Key 外键
SELECT * FROM albums WHERE artist_id = 3;
A foreign key is a column that contains the primary key of another table in the database. We use foreign keys and primary keys to connect rows in two different tables. One table‘s foreign key holds the value of another table‘s primary key. Unlike primary keys, foreign keys do not need to be unique and can be NULL. Here, artist_id is a foreign key in the "albums" table. The relationship between the "artists" table and the "albums" table is the "id" value of the artists. 4.4 Cross Join 用于生成两张表的笛卡尔集
SELECT albums.name, albums.year, artists.name FROM albums, artists;
One way to query multiple tables is to write a SELECT statement with multiple table names seperated by a comma. This is also known as a "cross join". When querying more than one table, column names need to be specified by table_name.column_name. Unfortunately, the result of this cross join is not very useful. It combines every row of the "artists" table with every row of the "albums" table. It would be more useful to only combine the rows where the album was created by the artist. 4.5 Inner Join 内连接:在表中存在至少一个匹配时,INNER JOIN 关键字返回行
SELECT * FROM albums JOIN artists ON albums.artist_id = artists.id; # INNER JOIN等价于JOIN,写JOIN默认为INNER JOIN
In SQL, joins are used to combine rows from two or more tables. The most common type of join in SQL is an inner join. An inner join will combine rows from different tables if the join condition is true. 1. SELECT *: specifies the columns our result set will have. Here * refers to every column in both tables; 2. FROM albums: specifies first table we are querying; 3. JOIN artists ON: specifies the type of join as well as the second table; 4. albums.artist_id = artists.id: is the join condition that describes how the two tables are related to each other. Here, SQL uses the foreign key column "artist_id" in the "albums" table to match it with exactly one row in the "artists" table with the same value in the "id" column. It will only match one row in the "artists" table because "id" is the PRIMARY KEY of "artists". 4.6 Left Outer Join 左外连接:即使右表中没有匹配,也从左表返回所有的行
SELECT * FROM albums LEFT JOIN artists ON albums.artist_id = artists.id;
Outer joins also combine rows from two or more tables, but unlike inner joins, they do not require the join condition to be met. Instead, every row in the left table is returned in the result set, and if the join condition is not met, the NULL values are used to fill in the columns from the right table. RIGHT JOIN 右外链接:即使左表中没有匹配,也从右表返回所有的行 FULL JOIN 全链接:只要其中一个表中存在匹配,就返回行 4.7 Aliases 为列名称和表名称指定别名 AS is a keyword in SQL that allows you to rename a column or table using an alias. The new name can be anything you want as long as you put it inside of single quotes.
SELECT albums.name AS ‘Album‘, albums.year, artists.name AS ‘Artist‘ FROM albums JOIN artists ON albums.artist_id = artists.id WHERE albums.year > 1980;NB The columns have not been renamed in either table. The aliases only appear in the result set. II. SQL: Table Transformation 1. Subqueries 子查询 1.2 Non-Correlated Subqueries I 不相关子查询
SELECT * FROM flights WHERE origin IN (SELECT code FROM airports WHERE elevation > 2000);
1.4 Non-Correlated Subqueries III
SELECT a.dep_month, a.dep_day_of_week, AVG(a.flight_count) AS average_flights FROM (SELECT dep_month, dep_day_of_week, dep_date, COUNT(*) AS flight_count FROM flights GROUP BY 1 , 2 , 3) a WHERE a.dep_day_of_week = ‘Friday‘ GROUP BY 1 , 2 ORDER BY 1 , 2; # 返回每个月中,每个星期五的平均航班数量
结构 [outer query] FROM [inner query] a WHERE GROUP BY ORDER BY NB "a": With the inner query, we create a virtual table. In the outer query, we can refer to the inner query as "a". "1,2,3" in inner query: refer to the first, second and third columns selected for display DBMS SELECT dep_month, (1) dep_day_of_week, (2) dep_date, (3) COUNT(*) AS flight_count (4) FROM flights
SELECT a.dep_month, a.dep_day_of_week, AVG(a.flight_distance) AS average_distance FROM (SELECT dep_month, dep_day_of_week, dep_date, SUM(distance) AS flight_distance FROM flights GROUP BY 1 , 2 , 3) a GROUP BY 1 , 2 ORDER BY 1 , 2; # 返回每个月中,每个周一、周二……至周日的平均飞行距离
1.5 Correlated Subqueries I 相关子查询 NB In a correlated subquery, the subquery can not be run independently of the outer query. The order of operations is important in a correlated subquery: 1. A row is processed in the outer query; 2. Then, for that particular row in the outer query, the subquery is executed. This means that for each row processed by the outer query, the subquery will also be processed for that row.
SELECT id FROM flights AS f WHERE distance > (SELECT AVG(distance) FROM flights WHERE carrier = f.carrier); # the list of all flights whose distance is above average for their carrier
1.6 Correlated Subqueries II In the above query, the inner query has to be reexecuted for each flight. Correlated subqueries may appear elsewhere besides the WHERE clause, they can also appear in the SELECT.
SELECT carrier, id, (SELECT COUNT(*) FROM flights f WHERE f.id < flights.id AND f.carrier = flights.carrier) + 1 AS flight_sequence_number FROM flights; # 结果集为航空公司,航班id以及序号。相同航空公司的航班,id越大则序号越大
相关子查询中,对于外查询执行的每一行,子查询都会为这一行执行一次。在这段代码中,每当外查询提取一行数据中的carrier和id,子查询就会COUNT表中有多少行的carrier与外查询中的行的carrier相同,且id小于外查询中的行,并在COUNT结果上+1,这一结果列别名为flight_sequence_number。于是,id越大的航班,序号就越大。 如果将"<"改为">",则id越大的航班,序号越小。 2. Set Operation 2.2 Union 并集 (only distinct values) Sometimes, we need to merge two tables together and then query the merged result. There are two ways of doing this: 1) Merge the rows, called a join. 2) Merge the columns, called a union.
SELECT item_name FROM legacy_products UNION SELECT item_name FROM new_products;
Each SELECT statement within the UNION must have the same number of columns with similar data types. The columns in each SELECT statement must be in the same order. By default, the UNION operator selects only distinct values. 2.3 Union All 并集 (allows duplicate values)
SELECT AVG(sale_price) FROM (SELECT id, sale_price FROM order_items UNION ALL SELECT id, sale_price FROM order_items_historic) AS a;
2.4 Intersect 交集 Microsoft SQL Server‘s INTERSECT returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operand.
SELECT category FROM new_products INTERSECT SELECT category FROM legacy_products;
NB MySQL不滋瓷INTERSECT,但可以用INNER JOIN+DISTINCT或WHERE...IN+DISTINCT或WHERE EXISTS实现:
SELECT DISTINCT category FROM new_products INNER JOIN legacy_products USING (category);
或
SELECT DISTINCT category FROM new_products WHERE category IN (SELECT category FROM legacy_products);
http://stackoverflow.com/questions/2621382/alternative-to-intersect-in-mysql 网上很多通过UNION ALL 实现的办法(如下)是错误的,可能会返回仅在一个表中出现且COUNT(*) > 1的值:
SELECT category, COUNT(*) FROM (SELECT category FROM new_products UNION ALL SELECT category FROM legacy_products) a GROUP BY category HAVING COUNT(*) > 1;
2.5 Except (MS SQL Server) / Minus (Oracle) 差集
SELECT category FROM legacy_products EXCEPT # 在Oracle中为MINUS SELECT category FROM new_products;
NB MySQL不滋瓷差集,但可以用WHERE...IS NULL+DISTINCT或WHERE...NOT IN+DISTINCT或WHERE EXISTS实现:
SELECT DISTINCT category FROM legacy_products LEFT JOIN new_products USING (category) WHERE new_products.category IS NULL;
或
SELECT DISTINCT category FROM legacy_products WHERE category NOT IN (SELECT category FROM new_products);
3. Conditional Aggregates 3.2 NULL use IS NULL or IS NOT NULL in the WHERE clause to test whether a value is or is not null.
SELECT COUNT(*) FROM flights WHERE arr_time IS NOT NULL AND destination = ‘ATL‘;
3.3 CASE WHEN "if, then, else"
SELECT CASE WHEN elevation < 250 THEN ‘Low‘ WHEN elevation BETWEEN 250 AND 1749 THEN ‘Medium‘ WHEN elevation >= 1750 THEN ‘High‘ ELSE ‘Unknown‘ END AS elevation_tier , COUNT(*) FROM airports GROUP BY 1;
END is required to terminate the statement, but ELSE is optional. If ELSE is not included, the result will be NULL. 3.4 COUNT(CASE WHEN) count the number of low elevation airports by state where low elevation is defined as less than 1000 ft.
SELECT state, COUNT(CASE WHEN elevation < 1000 THEN 1 ELSE NULL END) AS count_low_elevaton_airports FROM airports GROUP BY state;
3.5 SUM(CASE WHEN) sum the total flight distance and compare that to the sum of flight distance from a particular airline (in this case, Delta) by origin airport.
SELECT origin, SUM(distance) AS total_flight_distance, SUM(CASE WHEN carrier = ‘DL‘ THEN distance ELSE 0 END) AS total_delta_flight_distance FROM flights GROUP BY origin;
3.6 Combining aggregates find out the percentage of flight distance that is from Delta by origin airport.
SELECT origin, 100.0 * (SUM(CASE WHEN carrier = ‘DL‘ THEN distance ELSE 0 END) / SUM(distance)) AS percentage_flight_distance_from_delta FROM flights GROUP BY origin;
3.7 Combining aggregates II Find the percentage of high elevation airports (elevation >= 2000) by state from the airports table.
SELECT state, 100.0 * COUNT(CASE WHEN elevation >= 2000 THEN 1 ELSE NULL END) / COUNT(elevation) AS percentage_high_elevation_airports FROM airports GROUP BY 1;
或
SELECT state, 100.0 * SUM(CASE WHEN elevation >= 2000 THEN 1 ELSE 0 END) / COUNT(elevation) AS percentage_high_elevation_airports FROM airports GROUP BY 1;
4. Date, Number and String Functions MySQL Date 函数: https://dev.mysql.com/doc/refman/5.7/en/date-and-time-functions.html NOW() 返回当前的日期和时间 CURDATE() 返回当前的日期 CURTIME() 返回当前的时间 DATE() 提取日期或日期/时间表达式的日期部分