1 SQL and sqlite3#
1.1 Database#
Database Management System (DBMS): the software system that manages and interacts with databases, providing functionalities such as data storage, retrieval, manipulation, and security (The library).
Database: the actual collection of data organized in a structured format (The bookshelves).
Relational databases (or SQL databases) and nonrelational databases (or NoSQL databases) are two major types of database management systems (DBMS) that differ in their structure, storage models, and suitability for different types of applications.
Feature |
Relational Databases |
Non-Relational Databases |
---|---|---|
Schema |
Fixed schema with predefined tables and columns |
Flexible schema without fixed structure |
Query Language |
Structured Query Language (SQL) |
Various (e.g., MongoDB Query Language) |
Scalability |
Vertical scaling (adding more power to a single server) |
Horizontal scaling (adding more servers) |
Data Model |
Tabular, structured data |
Various (document, key-value, graph) |
Data Relationships |
Enforced through foreign keys and joins |
Managed within the application or inherent to the database type (e.g., graph) |
Use Cases |
Financial systems; Enterprise management applications; E-commerce |
Large scale; Big Data and Real-Time applications; Content management |
Popular database |
MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server, SQLite |
*Graph databases are not based on tables with rows and columns, so they are not SQL databases or traditional relational databases.
1.2 sqlite3#
SQLite is an embedded relational database management system (RDBMS). It’s a software library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine.
sqlite3 refers to both the C library used to interface with SQLite databases and the command-line tool provided with SQLite for interacting with SQLite databases. In addition, sqlite3 is a Python library that provides an interface for interacting with SQLite databases.
import sqlite3
import warnings
warnings.filterwarnings('ignore')
1.2.1 Connect / create db#
# Connect/create to the database.
conn = sqlite3.connect('MCU.db')
1.2.2 Create a cursor#
A cursor is a database object used in various database management systems (DBMS) to retrieve, manipulate, and navigate through the rows of a result set obtained from a query.
# Creates a cursor object usedto execute SQL queries and interact with the database.
cur = conn.cursor()
# Print the memory location (hexadecimal format) of the object.
print(cur)
<sqlite3.Cursor object at 0x105b835c0>
1.2.3 Execute a query#
# Create table with multi-lines (use ''' ''').
cur.execute(
'''
CREATE TABLE MCU_Movies (
movie_id INTEGER PRIMARY KEY,
title TEXT NOT NULL,
year INTEGER,
date DATE,
time TIME,
score REAL)
'''
)
<sqlite3.Cursor at 0x105b835c0>
# Create data with a Python list.
movie_data = [
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
(1005, 'Captain America: The First Avenger', 2011, '2011-03-15', '06:33:00', 6.9),
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
(1009, 'Captain America: The Winter Soldier', 2014, '2014-11-12', '13:56:00', 7.7),
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)
]
# Insert data
# INSERT INTO: This is the standard SQL command to add new records (rows) into a specified table.
# OR IGNORE: This clause modifies the behaviour of the INSERT command.
# If there is a conflict (such as a duplicate primary key or a unique constraint violation),
# the OR IGNORE clause tells SQLite to ignore the insertion of that particular row instead of throwing an error.
# This is useful for avoiding errors when you are not sure if the data might already exist in the table.
# The VALUES clause specifies the values to be inserted into the columns of the table.
# Each ? corresponds to a value for each column in the MCU_Movies table.
# For example, if the table has four columns, the placeholders will be replaced with actual values when the query is executed.
cur.executemany("INSERT OR IGNORE INTO MCU_Movies VALUES(?,?,?,?,?,?)", movie_data)
<sqlite3.Cursor at 0x105b835c0>
1.2.4. Result fetchall and fetchone#
.fetchall() is used to retrieve each row sequentially from the result set.
.fetchone() is a method used in database programming to retrieve the next row of a query result set.
# Execute an SQL query to fetch all records from the MCU_Movies table (The table info has been saved in database MCU.db).
res = cur.execute('SELECT * FROM MCU_Movies')
# Retrieve and return all the rows from the executed query above.
res.fetchall()
[(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
(1005,
'Captain America: The First Avenger',
2011,
'2011-03-15',
'06:33:00',
6.9),
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
(1009,
'Captain America: The Winter Soldier',
2014,
'2014-11-12',
'13:56:00',
7.7),
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
# Executes an SQL query to retrieve the 'title' column from the 'MCU_Movies' table.
res.execute('SELECT title FROM MCU_Movies')
# Fetches the first row of the result set returned by the query.
res.fetchone()
('Iron Man',)
# Executes an SQL query to retrieve the 'title' column from the 'MCU_Movies' table.
res.execute('SELECT title FROM MCU_Movies')
# Fetches all rows returned by the query.
res.fetchall()
[('Iron Man',),
('The Incredible Hulk',),
('Iron Man 2',),
('Thor',),
('Captain America: The First Avenger',),
('The Avengers',),
('Iron Man 3',),
('Thor: The Dark World',),
('Captain America: The Winter Soldier',),
('Guardians of the Galaxy',),
('Avengers: Age of Ultron',),
('Ant-Man',),
('Captain America: Civil War',),
('Doctor Strange',),
('Guardians of the Galaxy Vol. 2',),
('Spider-Man: Homecoming',),
('Thor: Ragnarok',),
('Black Panther',),
('Avengers: Infinity War',),
('Ant-Man and The Wasp',),
('Captain Marvel',),
('Avengers: Endgame',)]
1.2.5. Commit#
con.commit() is used in the context of database operations, specifically when working with SQL databases through a connection object (named con). This function is crucial for managing transactions.
# Commit the connection object to the database (transaction from table MCU_movies to MCU.db now)
conn.commit()
1.2.6. Close the cursor#
cur.close() effectively releases the resources that the cursor was using, like memory or database connections.
# Close the cursor.
cur.close()
1.2.7. Close the connection#
conn.close() is used to close a database connection in Python.
# Close the connection.
conn.close()
1.3 SQL syntax (PART 1)#
In SQL, a clause is a component of a query that specifies a particular operation or condition. Each clause performs a specific function in the query and contributes to the overall task of retrieving, modifying, or managing data within a database.
In this section, we will practice primary clauses in SQL and compare them to Pandas in python.
SQL Order of Execution:
FROM – Determines the source tables.
JOIN – Combines data from multiple tables.
WHERE – Filters rows before grouping.
GROUP BY – Groups the data for aggregation.
HAVING – Filters grouped records.
SELECT – Chooses which columns to return.
DISTINCT – Removes duplicate rows.
ORDER BY – Sorts the final result.
LIMIT/OFFSET – Restricts the number of rows returned.
SQL code example:
SELECT company, COUNT(*) AS num_employees -- 5
FROM employees -- 1
WHERE salary > 5000 -- 2
GROUP BY company -- 3
HAVING COUNT(*) > 5 -- 4
ORDER BY num_employees DESC -- 6
LIMIT 10; -- 7
1.3.1 SELECT FROM#
Identifies the columns and tables to be retrieved from the database.
# As the connection has been closed we need to reconnect to the database created in the last section.
conn = sqlite3.connect('MCU.db')
# Create cur.
cur = conn.cursor()
cur
<sqlite3.Cursor at 0x105ea88c0>
Select all column values from the table#
# Select all columns (with values) from the table.
query = "SELECT * FROM MCU_Movies"
# Executation and fetch all info.
res = cur.execute(query)
res.fetchall()
[(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
(1005,
'Captain America: The First Avenger',
2011,
'2011-03-15',
'06:33:00',
6.9),
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
(1009,
'Captain America: The Winter Soldier',
2014,
'2014-11-12',
'13:56:00',
7.7),
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
# Use pandas to open the table: retrieving the data from the query (`"SELECT * FROM MCU_Movies"`) and then save it to a DataFrame (`df`).
import pandas as pd
df = pd.read_sql_query(query, conn)
# To be notified, the DataFrame is saved in a different memory location with a different index compared to the cursor.
print(hex(id(df)))
0x127b4ded0
# Df info.
df
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 |
1 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 |
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 |
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 |
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
6 | 1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 |
7 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 |
8 | 1009 | Captain America: The Winter Soldier | 2014 | 2014-11-12 | 13:56:00 | 7.7 |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 |
10 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 |
11 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 |
12 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 |
13 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 |
14 | 1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 |
15 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 |
16 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 |
17 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
19 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 |
20 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
Select one column values from the table#
res = cur.execute("SELECT title FROM MCU_Movies")
res.fetchall()
[('Iron Man',),
('The Incredible Hulk',),
('Iron Man 2',),
('Thor',),
('Captain America: The First Avenger',),
('The Avengers',),
('Iron Man 3',),
('Thor: The Dark World',),
('Captain America: The Winter Soldier',),
('Guardians of the Galaxy',),
('Avengers: Age of Ultron',),
('Ant-Man',),
('Captain America: Civil War',),
('Doctor Strange',),
('Guardians of the Galaxy Vol. 2',),
('Spider-Man: Homecoming',),
('Thor: Ragnarok',),
('Black Panther',),
('Avengers: Infinity War',),
('Ant-Man and The Wasp',),
('Captain Marvel',),
('Avengers: Endgame',)]
df.title
0 Iron Man
1 The Incredible Hulk
2 Iron Man 2
3 Thor
4 Captain America: The First Avenger
5 The Avengers
6 Iron Man 3
7 Thor: The Dark World
8 Captain America: The Winter Soldier
9 Guardians of the Galaxy
10 Avengers: Age of Ultron
11 Ant-Man
12 Captain America: Civil War
13 Doctor Strange
14 Guardians of the Galaxy Vol. 2
15 Spider-Man: Homecoming
16 Thor: Ragnarok
17 Black Panther
18 Avengers: Infinity War
19 Ant-Man and The Wasp
20 Captain Marvel
21 Avengers: Endgame
Name: title, dtype: object
df['title']
0 Iron Man
1 The Incredible Hulk
2 Iron Man 2
3 Thor
4 Captain America: The First Avenger
5 The Avengers
6 Iron Man 3
7 Thor: The Dark World
8 Captain America: The Winter Soldier
9 Guardians of the Galaxy
10 Avengers: Age of Ultron
11 Ant-Man
12 Captain America: Civil War
13 Doctor Strange
14 Guardians of the Galaxy Vol. 2
15 Spider-Man: Homecoming
16 Thor: Ragnarok
17 Black Panther
18 Avengers: Infinity War
19 Ant-Man and The Wasp
20 Captain Marvel
21 Avengers: Endgame
Name: title, dtype: object
1.3.2 WHERE#
Filters the records based on specified conditions.
This section also introduces df.query(), which is a powerful data manipulation library in python. This method provides a more readable and concise way to filter rows based on column values, compared to using traditional indexing and boolean indexing.
score >= 8#
res = cur.execute("SELECT * FROM MCU_Movies WHERE score >= 8")
res.fetchall()
[(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
df[df.score >= 8]
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
df.query('score >= 8')
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
year >= 2017#
res = cur.execute("SELECT * FROM MCU_Movies WHERE year >= 2017")
res.fetchall()
[(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
df.year.dtype
dtype('int64')
df[df.year >= 2017] # 2017 is integer
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
14 | 1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 |
15 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 |
16 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 |
17 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
19 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 |
20 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
date < 2018-01-01#
res = cur.execute("SELECT * FROM MCU_Movies WHERE date < '2018-01-01'")
res.fetchall()
[(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
(1005,
'Captain America: The First Avenger',
2011,
'2011-03-15',
'06:33:00',
6.9),
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
(1009,
'Captain America: The Winter Soldier',
2014,
'2014-11-12',
'13:56:00',
7.7),
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9)]
Note:
When you read date or datetime columns from a SQL table into a pandas DataFrame and they are of object type (strings), you can still use them for filtering conditions directly, but it requires understanding how string comparisons work.
In pandas, if a date column is stored as an object type (string), you can still perform filtering operations because string comparisons in Python follow lexicographical order, which can be useful for date strings in the format ‘YYYY-MM-DD’ or ‘YYYY-MM-DD HH:MM:SS’.
This means you cannot use some datetime attribute like ‘date.dayofweek’ or ‘date.dayofyear’ unless the object type changed to pandas datetime type.
print(df.date.dtype) # Here, date and time column have not been changed to datetime.
object
type(df.date[1])
str
df[df.date < '2018-01-01']
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 |
1 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 |
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 |
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 |
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
6 | 1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 |
7 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 |
8 | 1009 | Captain America: The Winter Soldier | 2014 | 2014-11-12 | 13:56:00 | 7.7 |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 |
10 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 |
11 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 |
12 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 |
13 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 |
14 | 1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 |
15 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 |
16 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 |
df.query("date < '2018-01-01'")
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 |
1 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 |
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 |
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 |
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
6 | 1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 |
7 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 |
8 | 1009 | Captain America: The Winter Soldier | 2014 | 2014-11-12 | 13:56:00 | 7.7 |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 |
10 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 |
11 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 |
12 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 |
13 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 |
14 | 1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 |
15 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 |
16 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 |
time < 12:20:01#
res = cur.execute("SELECT * FROM MCU_Movies WHERE date < time < '12:20:01'")
res.fetchall()
[(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
(1005,
'Captain America: The First Avenger',
2011,
'2011-03-15',
'06:33:00',
6.9),
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
(1009,
'Captain America: The Winter Soldier',
2014,
'2014-11-12',
'13:56:00',
7.7),
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
df[df.time < '12:20:01']
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 |
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 |
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
12 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 |
15 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 |
16 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 |
17 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
df.query("time < '12:20:01'")
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 |
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 |
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
12 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 |
15 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 |
16 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 |
17 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
title = Thor#
res = cur.execute("SELECT * FROM MCU_Movies WHERE title = 'Thor'")
res.fetchall()
[(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)]
df[df.title == 'Thor']
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
df.query("title == 'Thor'")
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
1.3.3 AND, OR, NOT used in WHERE#
Filters the records based on multi-conditions.
AND#
# Selects the 'title' of MCU movies released after 2018 with a score greater than 7.
res = cur.execute("SELECT title FROM MCU_Movies WHERE year > 2018 AND score > 7")
res.fetchall()
[('Avengers: Endgame',)]
df[(df.year > 2018) & (df.score > 7)].title
21 Avengers: Endgame
Name: title, dtype: object
df.query('year > 2018 & score > 7') .title
21 Avengers: Endgame
Name: title, dtype: object
OR#
# Selects movie titles from the MCU_Movies table where the release date is before January 1, 2012,
# or the movie has a score greater than 8.
res = cur.execute("SELECT title FROM MCU_Movies WHERE date < '2012-01-01' OR score > 8")
res.fetchall()
[('Iron Man',),
('The Incredible Hulk',),
('Iron Man 2',),
('Thor',),
('Captain America: The First Avenger',),
('Avengers: Infinity War',),
('Avengers: Endgame',)]
df[(df.date < '2012-01-01') | (df.score > 8)].title
0 Iron Man
1 The Incredible Hulk
2 Iron Man 2
3 Thor
4 Captain America: The First Avenger
18 Avengers: Infinity War
21 Avengers: Endgame
Name: title, dtype: object
df.query('date < "2012-01-01" | score > 8').title # you can use '' or "" to quote the syntax.
0 Iron Man
1 The Incredible Hulk
2 Iron Man 2
3 Thor
4 Captain America: The First Avenger
18 Avengers: Infinity War
21 Avengers: Endgame
Name: title, dtype: object
NOT#
# Execute query to fetch movie titles from MCU_Movies table where the release date is on or after '2016-01-01'
# and the score is not equal to 6
res = cur.execute("SELECT title FROM MCU_Movies WHERE NOT date < '2016-01-01' AND score != 6")
res.fetchall()
[('Captain America: Civil War',),
('Doctor Strange',),
('Guardians of the Galaxy Vol. 2',),
('Spider-Man: Homecoming',),
('Thor: Ragnarok',),
('Black Panther',),
('Avengers: Infinity War',),
('Ant-Man and The Wasp',),
('Captain Marvel',),
('Avengers: Endgame',)]
df[~(df.date < '2016-01-01') & (df.score != 6)].title
12 Captain America: Civil War
13 Doctor Strange
14 Guardians of the Galaxy Vol. 2
15 Spider-Man: Homecoming
16 Thor: Ragnarok
17 Black Panther
18 Avengers: Infinity War
19 Ant-Man and The Wasp
20 Captain Marvel
21 Avengers: Endgame
Name: title, dtype: object
df.query('~ (date < "2016-01-01" & score != 6)').title
12 Captain America: Civil War
13 Doctor Strange
14 Guardians of the Galaxy Vol. 2
15 Spider-Man: Homecoming
16 Thor: Ragnarok
17 Black Panther
18 Avengers: Infinity War
19 Ant-Man and The Wasp
20 Captain Marvel
21 Avengers: Endgame
Name: title, dtype: object
1.3.4 COUNT and DISTINCT#
The COUNT function in SQL returns the number of rows.
COUNT(DISTINCT column) returns the number of unique, non-null values in a specified column.
# Execute the SQL query to count the number of MCU movies with a score greater than 8
res = cur.execute("SELECT COUNT(*) FROM MCU_Movies WHERE score > 8")
res.fetchall()
[(2,)]
len(df[df.score > 8])
2
# SQL query to fetch unique years from the MCU_Movies table
res = cur.execute("SELECT DISTINCT year FROM MCU_Movies")
res.fetchall()
[(2008,),
(2010,),
(2011,),
(2012,),
(2013,),
(2014,),
(2015,),
(2016,),
(2017,),
(2018,),
(2019,)]
df.year.unique()
array([2008, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019])
# Executes a query to count the number of distinct/unique years in the MCU_Movies table
res = cur.execute("SELECT COUNT(DISTINCT year) FROM MCU_Movies")
res.fetchall()
[(11,)]
df.year.nunique()
11
len(df.year.unique())
11
1.3.5 MAX, MIN, AVG#
# Fetches the lowest score from the MCU_Movies table.
res = cur.execute("SELECT MIN(score) FROM MCU_Movies")
res.fetchall()
[(6.7,)]
df.score.min()
6.7
# Fetches the highest score from the MCU_Movies table.
res = cur.execute("SELECT MAX(score) FROM MCU_Movies")
res.fetchall()
[(8.4,)]
df.score.max()
8.4
# Executes a query to get the average score from the MCU_Movies table.
res = cur.execute("SELECT AVG(score) FROM MCU_Movies")
res.fetchall()
[(7.463636363636363,)]
df.score.mean()
7.463636363636365
1.3.6 IN NOT IN#
IN operator checks if a value matches any value in a list or subquery.
NOT IN checks if a value does not match any value in the list or subquery.
IN#
# Fetch all movies from the 'MCU_Movies' table where the release year is either 2011, 2015, or 2016.
res = cur.execute("SELECT * FROM MCU_Movies WHERE year IN (2011, 2015, 2016)")
res.fetchall()
[(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
(1005,
'Captain America: The First Avenger',
2011,
'2011-03-15',
'06:33:00',
6.9),
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5)]
select_years = [2011, 2015, 2016]
df[df.year.isin(select_years)]
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 |
10 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 |
11 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 |
12 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 |
13 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 |
df.query('year in [2011, 2015, 2016]')
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 |
10 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 |
11 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 |
12 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 |
13 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 |
df.query('year in @select_years') # the @ symbol is used to denote a variable name when constructing a query string with parameters.
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 |
10 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 |
11 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 |
12 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 |
13 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 |
NOT IN#
res = cur.execute("SELECT * FROM MCU_Movies WHERE year NOT IN (2008, 2011, 2012, 2013, 2015, 2016, 2017)")
res.fetchall()
[(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
(1009,
'Captain America: The Winter Soldier',
2014,
'2014-11-12',
'13:56:00',
7.7),
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
select_years = [2008, 2011, 2012, 2013, 2015, 2016, 2017]
df[~df.year.isin(select_years)]
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 |
8 | 1009 | Captain America: The Winter Soldier | 2014 | 2014-11-12 | 13:56:00 | 7.7 |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 |
17 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
19 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 |
20 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
df.query('year not in @select_years')
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 |
8 | 1009 | Captain America: The Winter Soldier | 2014 | 2014-11-12 | 13:56:00 | 7.7 |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 |
17 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
19 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 |
20 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
1.3.7 BETWEEN#
BETWEEN operator is used to filter the result set within a specified range, inclusive of the boundary values.
# Fetch all MCU movies with a score between 8 and 9.
res = cur.execute("SELECT * FROM MCU_Movies WHERE score BETWEEN 8 AND 9")
res.fetchall()
[(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
df[df.score.between(8, 9)]
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
df[(df.score >= 8) & (df.score <= 9)]
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
df.query('score >= 8 and score <= 9')
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
1.3.8 LIKE#
LIKE operator is used to search for a specified string pattern within a column.
# Selects all columns from the MCU_Movies table where the title starts with 'a'.
res = cur.execute("SELECT * FROM MCU_Movies WHERE title LIKE 'a%'") # upper and lower case is not sensitive in SQL
res.fetchall()
[(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3),
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)]
df[df.title.str.startswith('A')] # upper and lower case is sensitive in python
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
10 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 |
11 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
19 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
df.query("title.str.startswith('A')")
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
10 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 |
11 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
19 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
# Selects all columns from the MCU_Movies table where the title ends with 'k'.
res = cur.execute("SELECT * FROM MCU_Movies WHERE title LIKE '%k'")
res.fetchall()
[(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9)]
df[df.title.str.endswith('k')]
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
1 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 |
16 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 |
# Selects all columns from the MCU_Movies table where the title starts with 'T' and ends with 'r'.
res = cur.execute("SELECT * FROM MCU_Movies WHERE title LIKE 'T__r'")
res.fetchall()
[(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)]
Python symbols
^: Start with
T: Followed by a literal ‘T’
..: Followed by any two characters
r: Followed by a literal ‘r’
$: End with
df[df.title.str.contains(r'^T..r$')]
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
1.3.9 ORDER BY#
ORDER BY in SQL is used to sort the result set of a query based on specified columns in ascending or descending order.
# Retrieves all columns from the 'MCU_Movies' table, sorting the results in descending order by the 'title' column.
res = cur.execute("SELECT * FROM MCU_Movies ORDER BY title DESC")
res.fetchall()
[(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9),
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9),
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0),
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4),
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0),
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6),
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0),
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5),
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9),
(1009,
'Captain America: The Winter Soldier',
2014,
'2014-11-12',
'13:56:00',
7.7),
(1005,
'Captain America: The First Avenger',
2011,
'2011-03-15',
'06:33:00',
6.9),
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8),
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3),
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4),
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4),
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3),
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1),
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3)]
df.sort_values(by='title', ascending=False)
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
7 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 |
16 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 |
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
1 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 |
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
15 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 |
6 | 1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 |
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 |
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 |
14 | 1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 |
13 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 |
20 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 |
8 | 1009 | Captain America: The Winter Soldier | 2014 | 2014-11-12 | 13:56:00 | 7.7 |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 |
12 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 |
17 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 |
10 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 |
19 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 |
11 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 |
1.3.10 LIMIT / OFFSET#
LIMIT / OFFSET controls the number of rows returned by a query and to specify where to start returning rows from, respectively.
# Selects all columns from the first 3 rows of the MCU_Movies table
res = cur.execute("SELECT * FROM MCU_Movies LIMIT 3")
res.fetchall()
[(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9),
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7),
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0)]
df.head(3)
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 |
1 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 |
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 |
df.iloc[0:3]
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 |
1 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 |
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 |
# Fetch 3 rows starting from the 6th row (offset 5) in the MCU_Movies table
res = cur.execute("SELECT * FROM MCU_Movies LIMIT 3 OFFSET 5")
res.fetchall()
[(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0),
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2),
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)]
df.iloc[5:8]
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 |
6 | 1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 |
7 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 |
1.4 SQL syntax (PART 2)#
1.4.1 PRAGMA#
The PRAGMA statement in SQL is used to query or modify various internal parameters of the SQLite database engine.
# Connect to the created database.
conn = sqlite3.connect('MCU.db')
# Creates a cursor object usedto execute SQL queries and interact with the database as cursor colsed in the last section.
cur = conn.cursor()
# Execute the PRAGMA table_info command to retrieve metadata in the 'MCU_movies' table
res = cur.execute("PRAGMA table_info(MCU_movies)")
res.fetchall()
[(0, 'movie_id', 'INTEGER', 0, None, 1),
(1, 'title', 'TEXT', 1, None, 0),
(2, 'year', 'INTEGER', 0, None, 0),
(3, 'date', 'DATE', 0, None, 0),
(4, 'time', 'TIME', 0, None, 0),
(5, 'score', 'REAL', 0, None, 0)]
1.4.2 Primary key and Foregin key#
Students |
---|
StudentID (SI) |
StudentName |
HomePhoneNumber |
HomeAddress |
FeeStatus |
Note: PK: SI |
Modules |
---|
ModuleID (MI) |
ModuleName |
Lecturers |
NumberofCredits |
HELevel |
Note: PK: MI |
Enrolments |
---|
EnrolmentID (EI) |
StudentID (SI) |
ModuleID (MI) |
EnrollmentDate |
AdditionalInformation |
Note: PK: EI, FK: SI, MI |
A PRIMARY KEY (PK) constraint uniquely identifies each record in a table (It ensures that no two rows have the same value).
Primary keys must contain UNIQUE values and cannot contain NULL values;
A table can have only one primary key (ususally one column);
CREATE TABLE Students (
StudentID INT PRIMARY KEY, -- Student ID is the primary key
StudentName VARCHAR(50)
);
ALTER TABLE Students
ADD PRIMARY KEY (StudentID);
And a primary key can also consist of multiple columns (called a composite key) when a single column isn’t enough to uniquely identify a row.
CREATE TABLE Enrollments (
StudentID INT,
ModuleID INT,
EnrollmentDate DATE,
PRIMARY KEY (StudentID, ModuleID) -- Composite primary key
);
A Foreign Key (FK) is a column (or a set of columns) in one table that establishes a relationship with the with the Primary Key of another table. Can be NULL, Multiple foreign keys allowed.
CREATE TABLE Enrollments (
EnrollmentID INT PRIMARY KEY,
StudentID INT,
CourseID INT,
FOREIGN KEY (StudentID) REFERENCES Students(StudentID), -- StudentID references the primary key in Students table
FOREIGN KEY (ModuleID) REFERENCES Modules(ModuleID) -- Foreign key for ModuleID
);
ALTER TABLE Enrollments
ADD FOREIGN KEY (StudentID) REFERENCES Students(StudentID);
ALTER TABLE Enrollments
ADD CONSTRAINT fk_student
FOREIGN KEY (StudentID) REFERENCES Students(StudentID);
ALTER TABLE Enrollments
DROP CONSTRAINT fk_student;
Check Primary key in a table#
# No primary key found in MCU_movies
res = cur.execute("PRAGMA primary_key_list(MCU_movies)")
res.fetchall()
[]
Check Foreign key in a table#
# No foreign key found in MCU_movies
res = cur.execute("PRAGMA foreign_key_list(MCU_movies)")
res.fetchall()
[]
Linking tables using keys#
# Create table Actors
res = cur.execute('''
CREATE TABLE Actors (
actor_id INTEGER PRIMARY KEY,
actor_name TEXT,
movie_id INTEGER,
gender TEXT,
nationality TEXT,
date_birth DATE,
FOREIGN KEY (movie_id) REFERENCES MCU_Movies(movie_id)
)''')
res.fetchall()
[]
Checking tables in a database#
The sqlite_master table is a special system table that stores information about all the schema objects in the database, such as tables, indexes, triggers, and views.
# Here we have table Actors without any values
res = cur.execute("SELECT name FROM sqlite_master WHERE type='table'")
res.fetchall()
[('MCU_Movies',), ('Actors',)]
## actor table information
actor_data = [
(1, 'Robert Downey Jr.', 1001, 'Male', 'American', '1965-04-04'),
(2, 'Chris Evans', 1005, 'Male', 'American', '1981-06-13'),
(3, 'Mark Ruffalo', 1006, 'Male', 'American', '1967-11-22'),
(4, 'Chris Hemsworth', 1004, 'Male', 'Australian', '1983-08-11'),
(5, 'Scarlett Johansson', 1006, 'Female', 'American', '1984-11-22'),
(6, 'Jeremy Renner', 1006, 'Male', 'American', '1971-01-07'),
(7, 'Tom Hiddleston', 1004, 'Male', 'British', '1981-02-09'),
(8, 'Clark Gregg', 1001, 'Male', 'American', '1962-04-02'),
(9, 'Paul Rudd', 1012, 'Male', 'American', '1969-04-06'),
(10, 'Benedict Cumberbatch', 1014, 'Male', 'British', '1976-07-19'),
(11, 'Tom Holland', 1008, 'Male', 'British', '1996-06-01'),
(12, 'Elizabeth Olsen', 1022, 'Female', 'American', '1989-02-16'),
(13, 'Anthony Mackie', 1017, 'Male', 'American', '1978-09-23'),
(14, 'Chadwick Boseman', 1008, 'Male', 'American', '1976-11-29'),
(15, 'Chris Pratt', 1017, 'Male', 'American', '1979-06-21'),
(16, 'Zoe Saldana', 1004, 'Female', 'American', '1978-06-19'),
(17, 'Karen Gillan', 1008, 'Female', 'British', '1987-11-28'),
(18, 'Dave Bautista', 1013, 'Male', 'American', '1969-01-18'),
(19, 'Vin Diesel', 1013, 'Male', 'American', '1967-07-18'),
(20, 'Bradley Cooper', 1006, 'Male', 'American', '1975-01-05'),
(21, 'Brie Larson', 1008, 'Female', 'American', '1989-10-01'),
(22, 'Michael B. Jordan', 1022, 'Male', 'American', '1987-02-09'),
(23, 'Tessa Thompson', 1002, 'Female', 'American', '1983-10-03'),
(24, 'Samuel L. Jackson', 1012, 'Male', 'American', '1948-12-21'),
(25, 'Gwyneth Paltrow', 1013, 'Female', 'American', '1972-09-27'),
(26, 'Don Cheadle', 1014, 'Male', 'American', '1964-11-29'),
(27, 'Paul Bettany', 1021, 'Male', 'British', '1971-05-27'),
(28, 'Sebastian Stan', 1007, 'Male', 'Romanian', '1982-08-13'),
(29, 'Josh Brolin', 1019, 'Male', 'American', '1968-02-12'),
(30, 'Karen Gillan', 1019, 'Female', 'British', '1987-11-28'),
(31, 'Pom Klementieff', 1002, 'Female', 'French', '1986-05-03'),
(32, 'Evangeline Lilly', 1020, 'Female', 'Canadian', '1979-08-03'),
(33, 'Tilda Swinton', 1019, 'Female', 'British', '1960-11-05'),
(34, 'Jeff Goldblum', 1021, 'Male', 'American', '1952-10-22'),
(35, 'Jon Favreau', 1018, 'Male', 'American', '1966-10-19'),
(36, 'Ben Kingsley', 1013, 'Male', 'British', '1943-12-31'),
(37, 'Cobie Smulders', 1022, 'Female', 'Canadian', '1982-04-03'),
(38, 'Hayley Atwell', 1003, 'Female', 'British', '1982-04-05'),
(39, 'Natalie Portman', 1008, 'Female', 'Israeli', '1981-06-09'),
(40, 'Rachel McAdams', 1004, 'Female', 'Canadian', '1978-11-17'),
(41, 'Benedict Wong', 1011, 'Male', 'British', '1971-07-03'),
(42, 'Letitia Wright', 1004, 'Female', 'British', '1993-10-31'),
(43, 'Winston Duke', 1018, 'Male', 'Tobagonian', '1986-11-15'),
(44, 'Danai Gurira', 1016, 'Female', 'American', '1978-02-14'),
(45, 'Lee Pace', 1002, 'Male', 'American', '1979-03-25'),
(46, 'Djimon Hounsou', 1014, 'Male', 'Beninese', '1964-04-24'),
(47, 'Clark Gregg', 1006, 'Male', 'American', '1962-04-02'),
(48, 'Kat Dennings', 1018, 'Female', 'American', '1986-06-13'),
(49, 'Stellan Skarsgård', 1014, 'Male', 'Swedish', '1951-06-13'),
(50, 'Idris Elba', 1015, 'Male', 'British', '1972-09-06')]
cur.executemany("INSERT OR IGNORE INTO Actors VALUES(?,?,?,?,?,?)", actor_data) # Define the SQL statement with placeholders for multiple rows
conn.commit() # Commit the changes to the database
# Actors table values
res = cur.execute("SELECT * FROM Actors")
res.fetchall()
[(1, 'Robert Downey Jr.', 1001, 'Male', 'American', '1965-04-04'),
(2, 'Chris Evans', 1005, 'Male', 'American', '1981-06-13'),
(3, 'Mark Ruffalo', 1006, 'Male', 'American', '1967-11-22'),
(4, 'Chris Hemsworth', 1004, 'Male', 'Australian', '1983-08-11'),
(5, 'Scarlett Johansson', 1006, 'Female', 'American', '1984-11-22'),
(6, 'Jeremy Renner', 1006, 'Male', 'American', '1971-01-07'),
(7, 'Tom Hiddleston', 1004, 'Male', 'British', '1981-02-09'),
(8, 'Clark Gregg', 1001, 'Male', 'American', '1962-04-02'),
(9, 'Paul Rudd', 1012, 'Male', 'American', '1969-04-06'),
(10, 'Benedict Cumberbatch', 1014, 'Male', 'British', '1976-07-19'),
(11, 'Tom Holland', 1008, 'Male', 'British', '1996-06-01'),
(12, 'Elizabeth Olsen', 1022, 'Female', 'American', '1989-02-16'),
(13, 'Anthony Mackie', 1017, 'Male', 'American', '1978-09-23'),
(14, 'Chadwick Boseman', 1008, 'Male', 'American', '1976-11-29'),
(15, 'Chris Pratt', 1017, 'Male', 'American', '1979-06-21'),
(16, 'Zoe Saldana', 1004, 'Female', 'American', '1978-06-19'),
(17, 'Karen Gillan', 1008, 'Female', 'British', '1987-11-28'),
(18, 'Dave Bautista', 1013, 'Male', 'American', '1969-01-18'),
(19, 'Vin Diesel', 1013, 'Male', 'American', '1967-07-18'),
(20, 'Bradley Cooper', 1006, 'Male', 'American', '1975-01-05'),
(21, 'Brie Larson', 1008, 'Female', 'American', '1989-10-01'),
(22, 'Michael B. Jordan', 1022, 'Male', 'American', '1987-02-09'),
(23, 'Tessa Thompson', 1002, 'Female', 'American', '1983-10-03'),
(24, 'Samuel L. Jackson', 1012, 'Male', 'American', '1948-12-21'),
(25, 'Gwyneth Paltrow', 1013, 'Female', 'American', '1972-09-27'),
(26, 'Don Cheadle', 1014, 'Male', 'American', '1964-11-29'),
(27, 'Paul Bettany', 1021, 'Male', 'British', '1971-05-27'),
(28, 'Sebastian Stan', 1007, 'Male', 'Romanian', '1982-08-13'),
(29, 'Josh Brolin', 1019, 'Male', 'American', '1968-02-12'),
(30, 'Karen Gillan', 1019, 'Female', 'British', '1987-11-28'),
(31, 'Pom Klementieff', 1002, 'Female', 'French', '1986-05-03'),
(32, 'Evangeline Lilly', 1020, 'Female', 'Canadian', '1979-08-03'),
(33, 'Tilda Swinton', 1019, 'Female', 'British', '1960-11-05'),
(34, 'Jeff Goldblum', 1021, 'Male', 'American', '1952-10-22'),
(35, 'Jon Favreau', 1018, 'Male', 'American', '1966-10-19'),
(36, 'Ben Kingsley', 1013, 'Male', 'British', '1943-12-31'),
(37, 'Cobie Smulders', 1022, 'Female', 'Canadian', '1982-04-03'),
(38, 'Hayley Atwell', 1003, 'Female', 'British', '1982-04-05'),
(39, 'Natalie Portman', 1008, 'Female', 'Israeli', '1981-06-09'),
(40, 'Rachel McAdams', 1004, 'Female', 'Canadian', '1978-11-17'),
(41, 'Benedict Wong', 1011, 'Male', 'British', '1971-07-03'),
(42, 'Letitia Wright', 1004, 'Female', 'British', '1993-10-31'),
(43, 'Winston Duke', 1018, 'Male', 'Tobagonian', '1986-11-15'),
(44, 'Danai Gurira', 1016, 'Female', 'American', '1978-02-14'),
(45, 'Lee Pace', 1002, 'Male', 'American', '1979-03-25'),
(46, 'Djimon Hounsou', 1014, 'Male', 'Beninese', '1964-04-24'),
(47, 'Clark Gregg', 1006, 'Male', 'American', '1962-04-02'),
(48, 'Kat Dennings', 1018, 'Female', 'American', '1986-06-13'),
(49, 'Stellan Skarsgård', 1014, 'Male', 'Swedish', '1951-06-13'),
(50, 'Idris Elba', 1015, 'Male', 'British', '1972-09-06')]
# Check the foreign keys in Actors
res = cur.execute("PRAGMA foreign_key_list(Actors)")
foreign_key_info = res.fetchall()
for fk in foreign_key_info:
print(f"Foreign Key ID: {fk[0]}")
print(f"Foreign Key Sequence: {fk[1]}")
print(f"Referenced Table: {fk[2]}")
print(f"Foreign Key Column (in Actors): {fk[3]}")
print(f"Referenced Column (in MCU_Movies): {fk[4]}")
print(f"ON DELETE Action: {fk[5]} (NO ACTION means no change to Actors when MCU_Movies entry is deleted)")
print(f"ON UPDATE Action: {fk[6]} (NO ACTION means no change to Actors when MCU_Movies entry is updated)")
print(f"Constraint Type: {fk[7]} (NONE means no enforcement on the foreign key constraint)")
Foreign Key ID: 0
Foreign Key Sequence: 0
Referenced Table: MCU_Movies
Foreign Key Column (in Actors): movie_id
Referenced Column (in MCU_Movies): movie_id
ON DELETE Action: NO ACTION (NO ACTION means no change to Actors when MCU_Movies entry is deleted)
ON UPDATE Action: NO ACTION (NO ACTION means no change to Actors when MCU_Movies entry is updated)
Constraint Type: NONE (NONE means no enforcement on the foreign key constraint)
cur.description is a useful attribute for accessing metadata about query results, helping you to dynamically process and handle database query results based on column information.
# id: The ID of the foreign key constraint.
# seq: The sequence number of the column within the foreign key constraint.
# table: The name of the table that the foreign key references.
# from: The name of the column in the current table (in this case, Actors) that is part of the foreign key.
# to: The name of the column in the referenced table that the foreign key points to.
# on_update: The action that occurs if the referenced column is updated.
# on_delete: The action that occurs if the referenced column is deleted.
# match: The matching algorithm used for enforcing the foreign key constraint.
cur.description
(('id', None, None, None, None, None, None),
('seq', None, None, None, None, None, None),
('table', None, None, None, None, None, None),
('from', None, None, None, None, None, None),
('to', None, None, None, None, None, None),
('on_update', None, None, None, None, None, None),
('on_delete', None, None, None, None, None, None),
('match', None, None, None, None, None, None))
cur.close()
conn.close()
1.4.3 JOIN#
The JOIN in SQL is used to combine rows from two or more tables based on a related column between them.
# create connection with db
conn = sqlite3.connect('MCU.db')
# create cur
cur = conn.cursor()
# load movie table as df_m
import pandas as pd
df_m = pd.read_sql_query('SELECT * FROM MCU_movies', conn)
df_m.head()
movie_id | title | year | date | time | score | |
---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 |
1 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 |
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 |
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 |
# load actor table as df_a
df_a = pd.read_sql_query('SELECT * FROM Actors', conn)
df_a.head()
actor_id | actor_name | movie_id | gender | nationality | date_birth | |
---|---|---|---|---|---|---|
0 | 1 | Robert Downey Jr. | 1001 | Male | American | 1965-04-04 |
1 | 2 | Chris Evans | 1005 | Male | American | 1981-06-13 |
2 | 3 | Mark Ruffalo | 1006 | Male | American | 1967-11-22 |
3 | 4 | Chris Hemsworth | 1004 | Male | Australian | 1983-08-11 |
4 | 5 | Scarlett Johansson | 1006 | Female | American | 1984-11-22 |
INNER JOIN#
res = cur.execute('''
SELECT *
FROM MCU_movies
INNER JOIN Actors
ON MCU_movies.movie_id = Actors.movie_id
ORDER BY MCU_movies.movie_id
''')
for i in res.fetchall():
print(i)
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9, 1, 'Robert Downey Jr.', 1001, 'Male', 'American', '1965-04-04')
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9, 8, 'Clark Gregg', 1001, 'Male', 'American', '1962-04-02')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 23, 'Tessa Thompson', 1002, 'Female', 'American', '1983-10-03')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 31, 'Pom Klementieff', 1002, 'Female', 'French', '1986-05-03')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 45, 'Lee Pace', 1002, 'Male', 'American', '1979-03-25')
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0, 38, 'Hayley Atwell', 1003, 'Female', 'British', '1982-04-05')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 4, 'Chris Hemsworth', 1004, 'Male', 'Australian', '1983-08-11')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 7, 'Tom Hiddleston', 1004, 'Male', 'British', '1981-02-09')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 16, 'Zoe Saldana', 1004, 'Female', 'American', '1978-06-19')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 40, 'Rachel McAdams', 1004, 'Female', 'Canadian', '1978-11-17')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 42, 'Letitia Wright', 1004, 'Female', 'British', '1993-10-31')
(1005, 'Captain America: The First Avenger', 2011, '2011-03-15', '06:33:00', 6.9, 2, 'Chris Evans', 1005, 'Male', 'American', '1981-06-13')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 3, 'Mark Ruffalo', 1006, 'Male', 'American', '1967-11-22')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 5, 'Scarlett Johansson', 1006, 'Female', 'American', '1984-11-22')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 6, 'Jeremy Renner', 1006, 'Male', 'American', '1971-01-07')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 20, 'Bradley Cooper', 1006, 'Male', 'American', '1975-01-05')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 47, 'Clark Gregg', 1006, 'Male', 'American', '1962-04-02')
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2, 28, 'Sebastian Stan', 1007, 'Male', 'Romanian', '1982-08-13')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 11, 'Tom Holland', 1008, 'Male', 'British', '1996-06-01')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 14, 'Chadwick Boseman', 1008, 'Male', 'American', '1976-11-29')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 17, 'Karen Gillan', 1008, 'Female', 'British', '1987-11-28')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 21, 'Brie Larson', 1008, 'Female', 'American', '1989-10-01')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 39, 'Natalie Portman', 1008, 'Female', 'Israeli', '1981-06-09')
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3, 41, 'Benedict Wong', 1011, 'Male', 'British', '1971-07-03')
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3, 9, 'Paul Rudd', 1012, 'Male', 'American', '1969-04-06')
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3, 24, 'Samuel L. Jackson', 1012, 'Male', 'American', '1948-12-21')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 18, 'Dave Bautista', 1013, 'Male', 'American', '1969-01-18')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 19, 'Vin Diesel', 1013, 'Male', 'American', '1967-07-18')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 25, 'Gwyneth Paltrow', 1013, 'Female', 'American', '1972-09-27')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 36, 'Ben Kingsley', 1013, 'Male', 'British', '1943-12-31')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 10, 'Benedict Cumberbatch', 1014, 'Male', 'British', '1976-07-19')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 26, 'Don Cheadle', 1014, 'Male', 'American', '1964-11-29')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 46, 'Djimon Hounsou', 1014, 'Male', 'Beninese', '1964-04-24')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 49, 'Stellan Skarsgård', 1014, 'Male', 'Swedish', '1951-06-13')
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6, 50, 'Idris Elba', 1015, 'Male', 'British', '1972-09-06')
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4, 44, 'Danai Gurira', 1016, 'Female', 'American', '1978-02-14')
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9, 13, 'Anthony Mackie', 1017, 'Male', 'American', '1978-09-23')
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9, 15, 'Chris Pratt', 1017, 'Male', 'American', '1979-06-21')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 35, 'Jon Favreau', 1018, 'Male', 'American', '1966-10-19')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 43, 'Winston Duke', 1018, 'Male', 'Tobagonian', '1986-11-15')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 48, 'Kat Dennings', 1018, 'Female', 'American', '1986-06-13')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 29, 'Josh Brolin', 1019, 'Male', 'American', '1968-02-12')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 30, 'Karen Gillan', 1019, 'Female', 'British', '1987-11-28')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 33, 'Tilda Swinton', 1019, 'Female', 'British', '1960-11-05')
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1, 32, 'Evangeline Lilly', 1020, 'Female', 'Canadian', '1979-08-03')
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9, 27, 'Paul Bettany', 1021, 'Male', 'British', '1971-05-27')
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9, 34, 'Jeff Goldblum', 1021, 'Male', 'American', '1952-10-22')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 12, 'Elizabeth Olsen', 1022, 'Female', 'American', '1989-02-16')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 22, 'Michael B. Jordan', 1022, 'Male', 'American', '1987-02-09')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 37, 'Cobie Smulders', 1022, 'Female', 'Canadian', '1982-04-03')
.join() function
df_m.set_index('movie_id').join(df_a.set_index('movie_id'), on='movie_id', how='inner', validate='m:m')
title | year | date | time | score | actor_id | actor_name | gender | nationality | date_birth | |
---|---|---|---|---|---|---|---|---|---|---|
movie_id | ||||||||||
1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 1 | Robert Downey Jr. | Male | American | 1965-04-04 |
1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 8 | Clark Gregg | Male | American | 1962-04-02 |
1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 23 | Tessa Thompson | Female | American | 1983-10-03 |
1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 31 | Pom Klementieff | Female | French | 1986-05-03 |
1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 45 | Lee Pace | Male | American | 1979-03-25 |
1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 38 | Hayley Atwell | Female | British | 1982-04-05 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 4 | Chris Hemsworth | Male | Australian | 1983-08-11 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 7 | Tom Hiddleston | Male | British | 1981-02-09 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 16 | Zoe Saldana | Female | American | 1978-06-19 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 40 | Rachel McAdams | Female | Canadian | 1978-11-17 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 42 | Letitia Wright | Female | British | 1993-10-31 |
1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 | 2 | Chris Evans | Male | American | 1981-06-13 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 3 | Mark Ruffalo | Male | American | 1967-11-22 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 5 | Scarlett Johansson | Female | American | 1984-11-22 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 6 | Jeremy Renner | Male | American | 1971-01-07 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 20 | Bradley Cooper | Male | American | 1975-01-05 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 47 | Clark Gregg | Male | American | 1962-04-02 |
1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 | 28 | Sebastian Stan | Male | Romanian | 1982-08-13 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 11 | Tom Holland | Male | British | 1996-06-01 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 14 | Chadwick Boseman | Male | American | 1976-11-29 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 17 | Karen Gillan | Female | British | 1987-11-28 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 21 | Brie Larson | Female | American | 1989-10-01 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 39 | Natalie Portman | Female | Israeli | 1981-06-09 |
1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 | 41 | Benedict Wong | Male | British | 1971-07-03 |
1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 9 | Paul Rudd | Male | American | 1969-04-06 |
1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 24 | Samuel L. Jackson | Male | American | 1948-12-21 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 18 | Dave Bautista | Male | American | 1969-01-18 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 19 | Vin Diesel | Male | American | 1967-07-18 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 25 | Gwyneth Paltrow | Female | American | 1972-09-27 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 36 | Ben Kingsley | Male | British | 1943-12-31 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 10 | Benedict Cumberbatch | Male | British | 1976-07-19 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 26 | Don Cheadle | Male | American | 1964-11-29 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 46 | Djimon Hounsou | Male | Beninese | 1964-04-24 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 49 | Stellan Skarsgård | Male | Swedish | 1951-06-13 |
1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 | 50 | Idris Elba | Male | British | 1972-09-06 |
1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 | 44 | Danai Gurira | Female | American | 1978-02-14 |
1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 13 | Anthony Mackie | Male | American | 1978-09-23 |
1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 15 | Chris Pratt | Male | American | 1979-06-21 |
1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 35 | Jon Favreau | Male | American | 1966-10-19 |
1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 43 | Winston Duke | Male | Tobagonian | 1986-11-15 |
1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 48 | Kat Dennings | Female | American | 1986-06-13 |
1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 29 | Josh Brolin | Male | American | 1968-02-12 |
1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 30 | Karen Gillan | Female | British | 1987-11-28 |
1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 33 | Tilda Swinton | Female | British | 1960-11-05 |
1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 | 32 | Evangeline Lilly | Female | Canadian | 1979-08-03 |
1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 27 | Paul Bettany | Male | British | 1971-05-27 |
1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 34 | Jeff Goldblum | Male | American | 1952-10-22 |
1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 12 | Elizabeth Olsen | Female | American | 1989-02-16 |
1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 22 | Michael B. Jordan | Male | American | 1987-02-09 |
1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 37 | Cobie Smulders | Female | Canadian | 1982-04-03 |
pd.merge() function
pd.merge(df_m, df_a, left_on='movie_id', right_on='movie_id', how='inner')
movie_id | title | year | date | time | score | actor_id | actor_name | gender | nationality | date_birth | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 1 | Robert Downey Jr. | Male | American | 1965-04-04 |
1 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 8 | Clark Gregg | Male | American | 1962-04-02 |
2 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 23 | Tessa Thompson | Female | American | 1983-10-03 |
3 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 31 | Pom Klementieff | Female | French | 1986-05-03 |
4 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 45 | Lee Pace | Male | American | 1979-03-25 |
5 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 38 | Hayley Atwell | Female | British | 1982-04-05 |
6 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 4 | Chris Hemsworth | Male | Australian | 1983-08-11 |
7 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 7 | Tom Hiddleston | Male | British | 1981-02-09 |
8 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 16 | Zoe Saldana | Female | American | 1978-06-19 |
9 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 40 | Rachel McAdams | Female | Canadian | 1978-11-17 |
10 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 42 | Letitia Wright | Female | British | 1993-10-31 |
11 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 | 2 | Chris Evans | Male | American | 1981-06-13 |
12 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 3 | Mark Ruffalo | Male | American | 1967-11-22 |
13 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 5 | Scarlett Johansson | Female | American | 1984-11-22 |
14 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 6 | Jeremy Renner | Male | American | 1971-01-07 |
15 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 20 | Bradley Cooper | Male | American | 1975-01-05 |
16 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 47 | Clark Gregg | Male | American | 1962-04-02 |
17 | 1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 | 28 | Sebastian Stan | Male | Romanian | 1982-08-13 |
18 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 11 | Tom Holland | Male | British | 1996-06-01 |
19 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 14 | Chadwick Boseman | Male | American | 1976-11-29 |
20 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 17 | Karen Gillan | Female | British | 1987-11-28 |
21 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 21 | Brie Larson | Female | American | 1989-10-01 |
22 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 39 | Natalie Portman | Female | Israeli | 1981-06-09 |
23 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 | 41 | Benedict Wong | Male | British | 1971-07-03 |
24 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 9 | Paul Rudd | Male | American | 1969-04-06 |
25 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 24 | Samuel L. Jackson | Male | American | 1948-12-21 |
26 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 18 | Dave Bautista | Male | American | 1969-01-18 |
27 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 19 | Vin Diesel | Male | American | 1967-07-18 |
28 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 25 | Gwyneth Paltrow | Female | American | 1972-09-27 |
29 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 36 | Ben Kingsley | Male | British | 1943-12-31 |
30 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 10 | Benedict Cumberbatch | Male | British | 1976-07-19 |
31 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 26 | Don Cheadle | Male | American | 1964-11-29 |
32 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 46 | Djimon Hounsou | Male | Beninese | 1964-04-24 |
33 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 49 | Stellan Skarsgård | Male | Swedish | 1951-06-13 |
34 | 1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 | 50 | Idris Elba | Male | British | 1972-09-06 |
35 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 | 44 | Danai Gurira | Female | American | 1978-02-14 |
36 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 13 | Anthony Mackie | Male | American | 1978-09-23 |
37 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 15 | Chris Pratt | Male | American | 1979-06-21 |
38 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 35 | Jon Favreau | Male | American | 1966-10-19 |
39 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 43 | Winston Duke | Male | Tobagonian | 1986-11-15 |
40 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 48 | Kat Dennings | Female | American | 1986-06-13 |
41 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 29 | Josh Brolin | Male | American | 1968-02-12 |
42 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 30 | Karen Gillan | Female | British | 1987-11-28 |
43 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 33 | Tilda Swinton | Female | British | 1960-11-05 |
44 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 | 32 | Evangeline Lilly | Female | Canadian | 1979-08-03 |
45 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 27 | Paul Bettany | Male | British | 1971-05-27 |
46 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 34 | Jeff Goldblum | Male | American | 1952-10-22 |
47 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 12 | Elizabeth Olsen | Female | American | 1989-02-16 |
48 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 22 | Michael B. Jordan | Male | American | 1987-02-09 |
49 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 37 | Cobie Smulders | Female | Canadian | 1982-04-03 |
LEFT JOIN#
res = cur.execute('''
SELECT *
FROM MCU_movies
LEFT JOIN Actors
ON MCU_movies.movie_id = Actors.movie_id
ORDER BY MCU_movies.movie_id
''')
for i in res.fetchall():
print(i)
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9, 8, 'Clark Gregg', 1001, 'Male', 'American', '1962-04-02')
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9, 1, 'Robert Downey Jr.', 1001, 'Male', 'American', '1965-04-04')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 45, 'Lee Pace', 1002, 'Male', 'American', '1979-03-25')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 31, 'Pom Klementieff', 1002, 'Female', 'French', '1986-05-03')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 23, 'Tessa Thompson', 1002, 'Female', 'American', '1983-10-03')
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0, 38, 'Hayley Atwell', 1003, 'Female', 'British', '1982-04-05')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 4, 'Chris Hemsworth', 1004, 'Male', 'Australian', '1983-08-11')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 42, 'Letitia Wright', 1004, 'Female', 'British', '1993-10-31')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 40, 'Rachel McAdams', 1004, 'Female', 'Canadian', '1978-11-17')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 7, 'Tom Hiddleston', 1004, 'Male', 'British', '1981-02-09')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 16, 'Zoe Saldana', 1004, 'Female', 'American', '1978-06-19')
(1005, 'Captain America: The First Avenger', 2011, '2011-03-15', '06:33:00', 6.9, 2, 'Chris Evans', 1005, 'Male', 'American', '1981-06-13')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 20, 'Bradley Cooper', 1006, 'Male', 'American', '1975-01-05')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 47, 'Clark Gregg', 1006, 'Male', 'American', '1962-04-02')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 6, 'Jeremy Renner', 1006, 'Male', 'American', '1971-01-07')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 3, 'Mark Ruffalo', 1006, 'Male', 'American', '1967-11-22')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 5, 'Scarlett Johansson', 1006, 'Female', 'American', '1984-11-22')
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2, 28, 'Sebastian Stan', 1007, 'Male', 'Romanian', '1982-08-13')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 21, 'Brie Larson', 1008, 'Female', 'American', '1989-10-01')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 14, 'Chadwick Boseman', 1008, 'Male', 'American', '1976-11-29')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 17, 'Karen Gillan', 1008, 'Female', 'British', '1987-11-28')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 39, 'Natalie Portman', 1008, 'Female', 'Israeli', '1981-06-09')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 11, 'Tom Holland', 1008, 'Male', 'British', '1996-06-01')
(1009, 'Captain America: The Winter Soldier', 2014, '2014-11-12', '13:56:00', 7.7, None, None, None, None, None, None)
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0, None, None, None, None, None, None)
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3, 41, 'Benedict Wong', 1011, 'Male', 'British', '1971-07-03')
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3, 9, 'Paul Rudd', 1012, 'Male', 'American', '1969-04-06')
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3, 24, 'Samuel L. Jackson', 1012, 'Male', 'American', '1948-12-21')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 36, 'Ben Kingsley', 1013, 'Male', 'British', '1943-12-31')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 18, 'Dave Bautista', 1013, 'Male', 'American', '1969-01-18')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 25, 'Gwyneth Paltrow', 1013, 'Female', 'American', '1972-09-27')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 19, 'Vin Diesel', 1013, 'Male', 'American', '1967-07-18')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 10, 'Benedict Cumberbatch', 1014, 'Male', 'British', '1976-07-19')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 46, 'Djimon Hounsou', 1014, 'Male', 'Beninese', '1964-04-24')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 26, 'Don Cheadle', 1014, 'Male', 'American', '1964-11-29')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 49, 'Stellan Skarsgård', 1014, 'Male', 'Swedish', '1951-06-13')
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6, 50, 'Idris Elba', 1015, 'Male', 'British', '1972-09-06')
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4, 44, 'Danai Gurira', 1016, 'Female', 'American', '1978-02-14')
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9, 13, 'Anthony Mackie', 1017, 'Male', 'American', '1978-09-23')
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9, 15, 'Chris Pratt', 1017, 'Male', 'American', '1979-06-21')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 35, 'Jon Favreau', 1018, 'Male', 'American', '1966-10-19')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 48, 'Kat Dennings', 1018, 'Female', 'American', '1986-06-13')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 43, 'Winston Duke', 1018, 'Male', 'Tobagonian', '1986-11-15')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 29, 'Josh Brolin', 1019, 'Male', 'American', '1968-02-12')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 30, 'Karen Gillan', 1019, 'Female', 'British', '1987-11-28')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 33, 'Tilda Swinton', 1019, 'Female', 'British', '1960-11-05')
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1, 32, 'Evangeline Lilly', 1020, 'Female', 'Canadian', '1979-08-03')
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9, 34, 'Jeff Goldblum', 1021, 'Male', 'American', '1952-10-22')
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9, 27, 'Paul Bettany', 1021, 'Male', 'British', '1971-05-27')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 37, 'Cobie Smulders', 1022, 'Female', 'Canadian', '1982-04-03')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 12, 'Elizabeth Olsen', 1022, 'Female', 'American', '1989-02-16')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 22, 'Michael B. Jordan', 1022, 'Male', 'American', '1987-02-09')
df_m.set_index('movie_id').join(df_a.set_index('movie_id'), on='movie_id', how='left', validate='m:m')
title | year | date | time | score | actor_id | actor_name | gender | nationality | date_birth | |
---|---|---|---|---|---|---|---|---|---|---|
movie_id | ||||||||||
1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 1.0 | Robert Downey Jr. | Male | American | 1965-04-04 |
1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 8.0 | Clark Gregg | Male | American | 1962-04-02 |
1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 23.0 | Tessa Thompson | Female | American | 1983-10-03 |
1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 31.0 | Pom Klementieff | Female | French | 1986-05-03 |
1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 45.0 | Lee Pace | Male | American | 1979-03-25 |
1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 38.0 | Hayley Atwell | Female | British | 1982-04-05 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 4.0 | Chris Hemsworth | Male | Australian | 1983-08-11 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 7.0 | Tom Hiddleston | Male | British | 1981-02-09 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 16.0 | Zoe Saldana | Female | American | 1978-06-19 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 40.0 | Rachel McAdams | Female | Canadian | 1978-11-17 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 42.0 | Letitia Wright | Female | British | 1993-10-31 |
1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 | 2.0 | Chris Evans | Male | American | 1981-06-13 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 3.0 | Mark Ruffalo | Male | American | 1967-11-22 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 5.0 | Scarlett Johansson | Female | American | 1984-11-22 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 6.0 | Jeremy Renner | Male | American | 1971-01-07 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 20.0 | Bradley Cooper | Male | American | 1975-01-05 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 47.0 | Clark Gregg | Male | American | 1962-04-02 |
1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 | 28.0 | Sebastian Stan | Male | Romanian | 1982-08-13 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 11.0 | Tom Holland | Male | British | 1996-06-01 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 14.0 | Chadwick Boseman | Male | American | 1976-11-29 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 17.0 | Karen Gillan | Female | British | 1987-11-28 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 21.0 | Brie Larson | Female | American | 1989-10-01 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 39.0 | Natalie Portman | Female | Israeli | 1981-06-09 |
1009 | Captain America: The Winter Soldier | 2014 | 2014-11-12 | 13:56:00 | 7.7 | NaN | NaN | NaN | NaN | NaN |
1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 | NaN | NaN | NaN | NaN | NaN |
1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 | 41.0 | Benedict Wong | Male | British | 1971-07-03 |
1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 9.0 | Paul Rudd | Male | American | 1969-04-06 |
1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 24.0 | Samuel L. Jackson | Male | American | 1948-12-21 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 18.0 | Dave Bautista | Male | American | 1969-01-18 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 19.0 | Vin Diesel | Male | American | 1967-07-18 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 25.0 | Gwyneth Paltrow | Female | American | 1972-09-27 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 36.0 | Ben Kingsley | Male | British | 1943-12-31 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 10.0 | Benedict Cumberbatch | Male | British | 1976-07-19 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 26.0 | Don Cheadle | Male | American | 1964-11-29 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 46.0 | Djimon Hounsou | Male | Beninese | 1964-04-24 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 49.0 | Stellan Skarsgård | Male | Swedish | 1951-06-13 |
1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 | 50.0 | Idris Elba | Male | British | 1972-09-06 |
1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 | 44.0 | Danai Gurira | Female | American | 1978-02-14 |
1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 13.0 | Anthony Mackie | Male | American | 1978-09-23 |
1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 15.0 | Chris Pratt | Male | American | 1979-06-21 |
1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 35.0 | Jon Favreau | Male | American | 1966-10-19 |
1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 43.0 | Winston Duke | Male | Tobagonian | 1986-11-15 |
1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 48.0 | Kat Dennings | Female | American | 1986-06-13 |
1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 29.0 | Josh Brolin | Male | American | 1968-02-12 |
1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 30.0 | Karen Gillan | Female | British | 1987-11-28 |
1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 33.0 | Tilda Swinton | Female | British | 1960-11-05 |
1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 | 32.0 | Evangeline Lilly | Female | Canadian | 1979-08-03 |
1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 27.0 | Paul Bettany | Male | British | 1971-05-27 |
1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 34.0 | Jeff Goldblum | Male | American | 1952-10-22 |
1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 12.0 | Elizabeth Olsen | Female | American | 1989-02-16 |
1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 22.0 | Michael B. Jordan | Male | American | 1987-02-09 |
1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 37.0 | Cobie Smulders | Female | Canadian | 1982-04-03 |
pd.merge(df_m, df_a, left_on='movie_id', right_on='movie_id', how='left')
movie_id | title | year | date | time | score | actor_id | actor_name | gender | nationality | date_birth | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 1.0 | Robert Downey Jr. | Male | American | 1965-04-04 |
1 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 8.0 | Clark Gregg | Male | American | 1962-04-02 |
2 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 23.0 | Tessa Thompson | Female | American | 1983-10-03 |
3 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 31.0 | Pom Klementieff | Female | French | 1986-05-03 |
4 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 45.0 | Lee Pace | Male | American | 1979-03-25 |
5 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 38.0 | Hayley Atwell | Female | British | 1982-04-05 |
6 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 4.0 | Chris Hemsworth | Male | Australian | 1983-08-11 |
7 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 7.0 | Tom Hiddleston | Male | British | 1981-02-09 |
8 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 16.0 | Zoe Saldana | Female | American | 1978-06-19 |
9 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 40.0 | Rachel McAdams | Female | Canadian | 1978-11-17 |
10 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 42.0 | Letitia Wright | Female | British | 1993-10-31 |
11 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 | 2.0 | Chris Evans | Male | American | 1981-06-13 |
12 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 3.0 | Mark Ruffalo | Male | American | 1967-11-22 |
13 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 5.0 | Scarlett Johansson | Female | American | 1984-11-22 |
14 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 6.0 | Jeremy Renner | Male | American | 1971-01-07 |
15 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 20.0 | Bradley Cooper | Male | American | 1975-01-05 |
16 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 47.0 | Clark Gregg | Male | American | 1962-04-02 |
17 | 1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 | 28.0 | Sebastian Stan | Male | Romanian | 1982-08-13 |
18 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 11.0 | Tom Holland | Male | British | 1996-06-01 |
19 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 14.0 | Chadwick Boseman | Male | American | 1976-11-29 |
20 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 17.0 | Karen Gillan | Female | British | 1987-11-28 |
21 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 21.0 | Brie Larson | Female | American | 1989-10-01 |
22 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 39.0 | Natalie Portman | Female | Israeli | 1981-06-09 |
23 | 1009 | Captain America: The Winter Soldier | 2014 | 2014-11-12 | 13:56:00 | 7.7 | NaN | NaN | NaN | NaN | NaN |
24 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 | NaN | NaN | NaN | NaN | NaN |
25 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 | 41.0 | Benedict Wong | Male | British | 1971-07-03 |
26 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 9.0 | Paul Rudd | Male | American | 1969-04-06 |
27 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 24.0 | Samuel L. Jackson | Male | American | 1948-12-21 |
28 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 18.0 | Dave Bautista | Male | American | 1969-01-18 |
29 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 19.0 | Vin Diesel | Male | American | 1967-07-18 |
30 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 25.0 | Gwyneth Paltrow | Female | American | 1972-09-27 |
31 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 36.0 | Ben Kingsley | Male | British | 1943-12-31 |
32 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 10.0 | Benedict Cumberbatch | Male | British | 1976-07-19 |
33 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 26.0 | Don Cheadle | Male | American | 1964-11-29 |
34 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 46.0 | Djimon Hounsou | Male | Beninese | 1964-04-24 |
35 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 49.0 | Stellan Skarsgård | Male | Swedish | 1951-06-13 |
36 | 1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 | 50.0 | Idris Elba | Male | British | 1972-09-06 |
37 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 | 44.0 | Danai Gurira | Female | American | 1978-02-14 |
38 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 13.0 | Anthony Mackie | Male | American | 1978-09-23 |
39 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 15.0 | Chris Pratt | Male | American | 1979-06-21 |
40 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 35.0 | Jon Favreau | Male | American | 1966-10-19 |
41 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 43.0 | Winston Duke | Male | Tobagonian | 1986-11-15 |
42 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 48.0 | Kat Dennings | Female | American | 1986-06-13 |
43 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 29.0 | Josh Brolin | Male | American | 1968-02-12 |
44 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 30.0 | Karen Gillan | Female | British | 1987-11-28 |
45 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 33.0 | Tilda Swinton | Female | British | 1960-11-05 |
46 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 | 32.0 | Evangeline Lilly | Female | Canadian | 1979-08-03 |
47 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 27.0 | Paul Bettany | Male | British | 1971-05-27 |
48 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 34.0 | Jeff Goldblum | Male | American | 1952-10-22 |
49 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 12.0 | Elizabeth Olsen | Female | American | 1989-02-16 |
50 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 22.0 | Michael B. Jordan | Male | American | 1987-02-09 |
51 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 37.0 | Cobie Smulders | Female | Canadian | 1982-04-03 |
RIGHT JOIN#
You may get an error message in sqlite3: RIGHT and FULL OUTER JOINs are not currently supported.
By swapping the tables in the LEFT JOIN, you get the same result as a RIGHT JOIN.
# OperationalError: RIGHT and FULL OUTER JOINs are not currently supported
# res = cur.execute('''
# SELECT *
# FROM MCU_movies
# RIGHT JOIN Actors
# ON MCU_movies.movie_id = Actors.movie_id
# ORDER BY MCU_movies.movie_id
# ''')
# for i in res.fetchall():
# print(i)
# swapping left and ringht table
res = cur.execute('''
SELECT *
FROM Actors
LEFT JOIN MCU_movies
ON MCU_movies.movie_id = Actors.movie_id
ORDER BY Actors.actor_id
''')
for i in res.fetchall():
print(i)
(1, 'Robert Downey Jr.', 1001, 'Male', 'American', '1965-04-04', 1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9)
(2, 'Chris Evans', 1005, 'Male', 'American', '1981-06-13', 1005, 'Captain America: The First Avenger', 2011, '2011-03-15', '06:33:00', 6.9)
(3, 'Mark Ruffalo', 1006, 'Male', 'American', '1967-11-22', 1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0)
(4, 'Chris Hemsworth', 1004, 'Male', 'Australian', '1983-08-11', 1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)
(5, 'Scarlett Johansson', 1006, 'Female', 'American', '1984-11-22', 1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0)
(6, 'Jeremy Renner', 1006, 'Male', 'American', '1971-01-07', 1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0)
(7, 'Tom Hiddleston', 1004, 'Male', 'British', '1981-02-09', 1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)
(8, 'Clark Gregg', 1001, 'Male', 'American', '1962-04-02', 1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9)
(9, 'Paul Rudd', 1012, 'Male', 'American', '1969-04-06', 1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3)
(10, 'Benedict Cumberbatch', 1014, 'Male', 'British', '1976-07-19', 1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5)
(11, 'Tom Holland', 1008, 'Male', 'British', '1996-06-01', 1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)
(12, 'Elizabeth Olsen', 1022, 'Female', 'American', '1989-02-16', 1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)
(13, 'Anthony Mackie', 1017, 'Male', 'American', '1978-09-23', 1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9)
(14, 'Chadwick Boseman', 1008, 'Male', 'American', '1976-11-29', 1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)
(15, 'Chris Pratt', 1017, 'Male', 'American', '1979-06-21', 1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9)
(16, 'Zoe Saldana', 1004, 'Female', 'American', '1978-06-19', 1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)
(17, 'Karen Gillan', 1008, 'Female', 'British', '1987-11-28', 1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)
(18, 'Dave Bautista', 1013, 'Male', 'American', '1969-01-18', 1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8)
(19, 'Vin Diesel', 1013, 'Male', 'American', '1967-07-18', 1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8)
(20, 'Bradley Cooper', 1006, 'Male', 'American', '1975-01-05', 1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0)
(21, 'Brie Larson', 1008, 'Female', 'American', '1989-10-01', 1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)
(22, 'Michael B. Jordan', 1022, 'Male', 'American', '1987-02-09', 1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)
(23, 'Tessa Thompson', 1002, 'Female', 'American', '1983-10-03', 1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7)
(24, 'Samuel L. Jackson', 1012, 'Male', 'American', '1948-12-21', 1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3)
(25, 'Gwyneth Paltrow', 1013, 'Female', 'American', '1972-09-27', 1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8)
(26, 'Don Cheadle', 1014, 'Male', 'American', '1964-11-29', 1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5)
(27, 'Paul Bettany', 1021, 'Male', 'British', '1971-05-27', 1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9)
(28, 'Sebastian Stan', 1007, 'Male', 'Romanian', '1982-08-13', 1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2)
(29, 'Josh Brolin', 1019, 'Male', 'American', '1968-02-12', 1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4)
(30, 'Karen Gillan', 1019, 'Female', 'British', '1987-11-28', 1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4)
(31, 'Pom Klementieff', 1002, 'Female', 'French', '1986-05-03', 1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7)
(32, 'Evangeline Lilly', 1020, 'Female', 'Canadian', '1979-08-03', 1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1)
(33, 'Tilda Swinton', 1019, 'Female', 'British', '1960-11-05', 1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4)
(34, 'Jeff Goldblum', 1021, 'Male', 'American', '1952-10-22', 1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9)
(35, 'Jon Favreau', 1018, 'Male', 'American', '1966-10-19', 1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3)
(36, 'Ben Kingsley', 1013, 'Male', 'British', '1943-12-31', 1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8)
(37, 'Cobie Smulders', 1022, 'Female', 'Canadian', '1982-04-03', 1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4)
(38, 'Hayley Atwell', 1003, 'Female', 'British', '1982-04-05', 1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0)
(39, 'Natalie Portman', 1008, 'Female', 'Israeli', '1981-06-09', 1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9)
(40, 'Rachel McAdams', 1004, 'Female', 'Canadian', '1978-11-17', 1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)
(41, 'Benedict Wong', 1011, 'Male', 'British', '1971-07-03', 1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3)
(42, 'Letitia Wright', 1004, 'Female', 'British', '1993-10-31', 1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0)
(43, 'Winston Duke', 1018, 'Male', 'Tobagonian', '1986-11-15', 1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3)
(44, 'Danai Gurira', 1016, 'Female', 'American', '1978-02-14', 1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4)
(45, 'Lee Pace', 1002, 'Male', 'American', '1979-03-25', 1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7)
(46, 'Djimon Hounsou', 1014, 'Male', 'Beninese', '1964-04-24', 1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5)
(47, 'Clark Gregg', 1006, 'Male', 'American', '1962-04-02', 1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0)
(48, 'Kat Dennings', 1018, 'Female', 'American', '1986-06-13', 1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3)
(49, 'Stellan Skarsgård', 1014, 'Male', 'Swedish', '1951-06-13', 1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5)
(50, 'Idris Elba', 1015, 'Male', 'British', '1972-09-06', 1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6)
df_m.set_index('movie_id').join(df_a.set_index('movie_id'), on='movie_id', how='right', validate='m:m')
title | year | date | time | score | actor_id | actor_name | gender | nationality | date_birth | |
---|---|---|---|---|---|---|---|---|---|---|
movie_id | ||||||||||
1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 1 | Robert Downey Jr. | Male | American | 1965-04-04 |
1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 | 2 | Chris Evans | Male | American | 1981-06-13 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 3 | Mark Ruffalo | Male | American | 1967-11-22 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 4 | Chris Hemsworth | Male | Australian | 1983-08-11 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 5 | Scarlett Johansson | Female | American | 1984-11-22 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 6 | Jeremy Renner | Male | American | 1971-01-07 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 7 | Tom Hiddleston | Male | British | 1981-02-09 |
1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 8 | Clark Gregg | Male | American | 1962-04-02 |
1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 9 | Paul Rudd | Male | American | 1969-04-06 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 10 | Benedict Cumberbatch | Male | British | 1976-07-19 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 11 | Tom Holland | Male | British | 1996-06-01 |
1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 12 | Elizabeth Olsen | Female | American | 1989-02-16 |
1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 13 | Anthony Mackie | Male | American | 1978-09-23 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 14 | Chadwick Boseman | Male | American | 1976-11-29 |
1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 15 | Chris Pratt | Male | American | 1979-06-21 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 16 | Zoe Saldana | Female | American | 1978-06-19 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 17 | Karen Gillan | Female | British | 1987-11-28 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 18 | Dave Bautista | Male | American | 1969-01-18 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 19 | Vin Diesel | Male | American | 1967-07-18 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 20 | Bradley Cooper | Male | American | 1975-01-05 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 21 | Brie Larson | Female | American | 1989-10-01 |
1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 22 | Michael B. Jordan | Male | American | 1987-02-09 |
1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 23 | Tessa Thompson | Female | American | 1983-10-03 |
1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 24 | Samuel L. Jackson | Male | American | 1948-12-21 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 25 | Gwyneth Paltrow | Female | American | 1972-09-27 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 26 | Don Cheadle | Male | American | 1964-11-29 |
1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 27 | Paul Bettany | Male | British | 1971-05-27 |
1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 | 28 | Sebastian Stan | Male | Romanian | 1982-08-13 |
1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 29 | Josh Brolin | Male | American | 1968-02-12 |
1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 30 | Karen Gillan | Female | British | 1987-11-28 |
1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 31 | Pom Klementieff | Female | French | 1986-05-03 |
1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 | 32 | Evangeline Lilly | Female | Canadian | 1979-08-03 |
1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 33 | Tilda Swinton | Female | British | 1960-11-05 |
1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 34 | Jeff Goldblum | Male | American | 1952-10-22 |
1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 35 | Jon Favreau | Male | American | 1966-10-19 |
1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 36 | Ben Kingsley | Male | British | 1943-12-31 |
1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 37 | Cobie Smulders | Female | Canadian | 1982-04-03 |
1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 38 | Hayley Atwell | Female | British | 1982-04-05 |
1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 39 | Natalie Portman | Female | Israeli | 1981-06-09 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 40 | Rachel McAdams | Female | Canadian | 1978-11-17 |
1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 | 41 | Benedict Wong | Male | British | 1971-07-03 |
1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 42 | Letitia Wright | Female | British | 1993-10-31 |
1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 43 | Winston Duke | Male | Tobagonian | 1986-11-15 |
1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 | 44 | Danai Gurira | Female | American | 1978-02-14 |
1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 45 | Lee Pace | Male | American | 1979-03-25 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 46 | Djimon Hounsou | Male | Beninese | 1964-04-24 |
1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 47 | Clark Gregg | Male | American | 1962-04-02 |
1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 48 | Kat Dennings | Female | American | 1986-06-13 |
1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 49 | Stellan Skarsgård | Male | Swedish | 1951-06-13 |
1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 | 50 | Idris Elba | Male | British | 1972-09-06 |
pd.merge(df_m, df_a, left_on='movie_id', right_on='movie_id', how='right')
movie_id | title | year | date | time | score | actor_id | actor_name | gender | nationality | date_birth | |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 1 | Robert Downey Jr. | Male | American | 1965-04-04 |
1 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 | 2 | Chris Evans | Male | American | 1981-06-13 |
2 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 3 | Mark Ruffalo | Male | American | 1967-11-22 |
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 4 | Chris Hemsworth | Male | Australian | 1983-08-11 |
4 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 5 | Scarlett Johansson | Female | American | 1984-11-22 |
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 6 | Jeremy Renner | Male | American | 1971-01-07 |
6 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 7 | Tom Hiddleston | Male | British | 1981-02-09 |
7 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 8 | Clark Gregg | Male | American | 1962-04-02 |
8 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 9 | Paul Rudd | Male | American | 1969-04-06 |
9 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 10 | Benedict Cumberbatch | Male | British | 1976-07-19 |
10 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 11 | Tom Holland | Male | British | 1996-06-01 |
11 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 12 | Elizabeth Olsen | Female | American | 1989-02-16 |
12 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 13 | Anthony Mackie | Male | American | 1978-09-23 |
13 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 14 | Chadwick Boseman | Male | American | 1976-11-29 |
14 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | 15 | Chris Pratt | Male | American | 1979-06-21 |
15 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 16 | Zoe Saldana | Female | American | 1978-06-19 |
16 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 17 | Karen Gillan | Female | British | 1987-11-28 |
17 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 18 | Dave Bautista | Male | American | 1969-01-18 |
18 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 19 | Vin Diesel | Male | American | 1967-07-18 |
19 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 20 | Bradley Cooper | Male | American | 1975-01-05 |
20 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 21 | Brie Larson | Female | American | 1989-10-01 |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 22 | Michael B. Jordan | Male | American | 1987-02-09 |
22 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 23 | Tessa Thompson | Female | American | 1983-10-03 |
23 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | 24 | Samuel L. Jackson | Male | American | 1948-12-21 |
24 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 25 | Gwyneth Paltrow | Female | American | 1972-09-27 |
25 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 26 | Don Cheadle | Male | American | 1964-11-29 |
26 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 27 | Paul Bettany | Male | British | 1971-05-27 |
27 | 1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 | 28 | Sebastian Stan | Male | Romanian | 1982-08-13 |
28 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 29 | Josh Brolin | Male | American | 1968-02-12 |
29 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 30 | Karen Gillan | Female | British | 1987-11-28 |
30 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 31 | Pom Klementieff | Female | French | 1986-05-03 |
31 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 | 32 | Evangeline Lilly | Female | Canadian | 1979-08-03 |
32 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | 33 | Tilda Swinton | Female | British | 1960-11-05 |
33 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | 34 | Jeff Goldblum | Male | American | 1952-10-22 |
34 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 35 | Jon Favreau | Male | American | 1966-10-19 |
35 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | 36 | Ben Kingsley | Male | British | 1943-12-31 |
36 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | 37 | Cobie Smulders | Female | Canadian | 1982-04-03 |
37 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 38 | Hayley Atwell | Female | British | 1982-04-05 |
38 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | 39 | Natalie Portman | Female | Israeli | 1981-06-09 |
39 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 40 | Rachel McAdams | Female | Canadian | 1978-11-17 |
40 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 | 41 | Benedict Wong | Male | British | 1971-07-03 |
41 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | 42 | Letitia Wright | Female | British | 1993-10-31 |
42 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 43 | Winston Duke | Male | Tobagonian | 1986-11-15 |
43 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 | 44 | Danai Gurira | Female | American | 1978-02-14 |
44 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 45 | Lee Pace | Male | American | 1979-03-25 |
45 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 46 | Djimon Hounsou | Male | Beninese | 1964-04-24 |
46 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | 47 | Clark Gregg | Male | American | 1962-04-02 |
47 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | 48 | Kat Dennings | Female | American | 1986-06-13 |
48 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | 49 | Stellan Skarsgård | Male | Swedish | 1951-06-13 |
49 | 1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 | 50 | Idris Elba | Male | British | 1972-09-06 |
OUTER JOIN#
we use LEFT JOIN and UNION to create the OUTER JOIN in sqlite3
# OperationalError: RIGHT and FULL OUTER JOINs are not currently supported
res = cur.execute('''
SELECT M.title, A.actor_name
FROM (SELECT * FROM MCU_movies LIMIT 5) AS M
LEFT JOIN (SELECT * FROM Actors LIMIT 5) AS A
ON M.movie_id = A.movie_id
UNION
SELECT M.title, A.actor_name
FROM (SELECT * FROM Actors LIMIT 5) AS A
LEFT JOIN (SELECT * FROM MCU_movies LIMIT 5) AS M
ON M.movie_id = A.movie_id
ORDER BY A.actor_name ASC
''')
for i in res.fetchall():
print(i)
('Iron Man 2', None)
('The Incredible Hulk', None)
('Captain America: The First Avenger', 'Chris Evans')
('Thor', 'Chris Hemsworth')
(None, 'Mark Ruffalo')
('Iron Man', 'Robert Downey Jr.')
(None, 'Scarlett Johansson')
df_m.set_index('movie_id').iloc[0:5].join(df_a.set_index('movie_id').iloc[0:5], how='outer', validate='m:m').sort_values(['actor_name'], ascending=True)[['title','actor_name']]
title | actor_name | |
---|---|---|
movie_id | ||
1005 | Captain America: The First Avenger | Chris Evans |
1004 | Thor | Chris Hemsworth |
1006 | NaN | Mark Ruffalo |
1001 | Iron Man | Robert Downey Jr. |
1006 | NaN | Scarlett Johansson |
1002 | The Incredible Hulk | NaN |
1003 | Iron Man 2 | NaN |
pd.merge(df_m.iloc[0:5], df_a.iloc[0:5], left_on='movie_id', right_on='movie_id', how='outer').sort_values(['actor_name'], ascending=True)[['title','actor_name']]
title | actor_name | |
---|---|---|
4 | Captain America: The First Avenger | Chris Evans |
3 | Thor | Chris Hemsworth |
5 | NaN | Mark Ruffalo |
0 | Iron Man | Robert Downey Jr. |
6 | NaN | Scarlett Johansson |
1 | The Incredible Hulk | NaN |
2 | Iron Man 2 | NaN |
CROSS#
res = cur.execute('''
SELECT M.title, A.actor_name
FROM (SELECT * FROM MCU_movies LIMIT 3) AS M
CROSS JOIN (SELECT * FROM Actors LIMIT 3) AS A
''')
for i in res.fetchall():
print(i)
('Iron Man', 'Robert Downey Jr.')
('Iron Man', 'Chris Evans')
('Iron Man', 'Mark Ruffalo')
('The Incredible Hulk', 'Robert Downey Jr.')
('The Incredible Hulk', 'Chris Evans')
('The Incredible Hulk', 'Mark Ruffalo')
('Iron Man 2', 'Robert Downey Jr.')
('Iron Man 2', 'Chris Evans')
('Iron Man 2', 'Mark Ruffalo')
df_m.iloc[0:3].join(df_a.iloc[0:3], validate='m:m', how='cross', lsuffix='_l', rsuffix='_r')
movie_id_l | title | year | date | time | score | actor_id | actor_name | movie_id_r | gender | nationality | date_birth | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 1 | Robert Downey Jr. | 1001 | Male | American | 1965-04-04 |
1 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 2 | Chris Evans | 1005 | Male | American | 1981-06-13 |
2 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 3 | Mark Ruffalo | 1006 | Male | American | 1967-11-22 |
3 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 1 | Robert Downey Jr. | 1001 | Male | American | 1965-04-04 |
4 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 2 | Chris Evans | 1005 | Male | American | 1981-06-13 |
5 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 3 | Mark Ruffalo | 1006 | Male | American | 1967-11-22 |
6 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 1 | Robert Downey Jr. | 1001 | Male | American | 1965-04-04 |
7 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 2 | Chris Evans | 1005 | Male | American | 1981-06-13 |
8 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 3 | Mark Ruffalo | 1006 | Male | American | 1967-11-22 |
pd.merge(df_m[0:3], df_a[0:3], how='cross')
movie_id_x | title | year | date | time | score | actor_id | actor_name | movie_id_y | gender | nationality | date_birth | |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 1 | Robert Downey Jr. | 1001 | Male | American | 1965-04-04 |
1 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 2 | Chris Evans | 1005 | Male | American | 1981-06-13 |
2 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | 3 | Mark Ruffalo | 1006 | Male | American | 1967-11-22 |
3 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 1 | Robert Downey Jr. | 1001 | Male | American | 1965-04-04 |
4 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 2 | Chris Evans | 1005 | Male | American | 1981-06-13 |
5 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | 3 | Mark Ruffalo | 1006 | Male | American | 1967-11-22 |
6 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 1 | Robert Downey Jr. | 1001 | Male | American | 1965-04-04 |
7 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 2 | Chris Evans | 1005 | Male | American | 1981-06-13 |
8 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | 3 | Mark Ruffalo | 1006 | Male | American | 1967-11-22 |
1.4.4 UNION#
The UNION in SQL combines the results of two or more SELECT statements into a single result set.
res = cur.execute('''
SELECT movie_id
FROM (SELECT movie_id FROM MCU_movies LIMIT 5)
UNION
SELECT movie_id
FROM (SELECT movie_id FROM Actors LIMIT 5);
''')
for i in res.fetchall():
print(i)
(1001,)
(1002,)
(1003,)
(1004,)
(1005,)
(1006,)
pd.concat([df_m.iloc[0:5][['movie_id']], df_a.iloc[0:5][['movie_id']]]).movie_id.unique()
array([1001, 1002, 1003, 1004, 1005, 1006])
1.4.5 CASE#
The CASE statement is used to perform conditional logic within queries, allowing for conditional evaluation and result selection based on specified conditions.
res = cur.execute('''
SELECT *,
CASE
WHEN score >= 8 THEN 'High'
WHEN score >= 7 THEN 'Medium'
ELSE 'Low'
END AS quality
FROM MCU_movies
''')
for i in res.fetchall():
print(i)
(1001, 'Iron Man', 2008, '2008-01-01', '15:22:00', 7.9, 'Medium')
(1002, 'The Incredible Hulk', 2008, '2008-03-04', '21:16:00', 6.7, 'Low')
(1003, 'Iron Man 2', 2010, '2010-01-01', '05:38:00', 7.0, 'Medium')
(1004, 'Thor', 2011, '2011-01-01', '07:47:00', 7.0, 'Medium')
(1005, 'Captain America: The First Avenger', 2011, '2011-03-15', '06:33:00', 6.9, 'Low')
(1006, 'The Avengers', 2012, '2012-05-10', '08:30:00', 8.0, 'High')
(1007, 'Iron Man 3', 2013, '2013-07-20', '19:12:00', 7.2, 'Medium')
(1008, 'Thor: The Dark World', 2013, '2013-09-25', '18:27:00', 6.9, 'Low')
(1009, 'Captain America: The Winter Soldier', 2014, '2014-11-12', '13:56:00', 7.7, 'Medium')
(1010, 'Guardians of the Galaxy', 2014, '2014-12-03', '22:25:00', 8.0, 'High')
(1011, 'Avengers: Age of Ultron', 2015, '2015-02-21', '20:50:00', 7.3, 'Medium')
(1012, 'Ant-Man', 2015, '2015-04-30', '17:01:00', 7.3, 'Medium')
(1013, 'Captain America: Civil War', 2016, '2016-06-06', '10:04:00', 7.8, 'Medium')
(1014, 'Doctor Strange', 2016, '2016-08-16', '23:11:00', 7.5, 'Medium')
(1015, 'Guardians of the Galaxy Vol. 2', 2017, '2017-10-20', '12:41:00', 7.6, 'Medium')
(1016, 'Spider-Man: Homecoming', 2017, '2017-12-14', '02:39:00', 7.4, 'Medium')
(1017, 'Thor: Ragnarok', 2017, '2017-11-03', '11:54:00', 7.9, 'Medium')
(1018, 'Black Panther', 2018, '2018-04-23', '03:05:00', 7.3, 'Medium')
(1019, 'Avengers: Infinity War', 2018, '2018-04-27', '14:09:00', 8.4, 'High')
(1020, 'Ant-Man and The Wasp', 2018, '2018-07-06', '16:29:00', 7.1, 'Medium')
(1021, 'Captain Marvel', 2019, '2019-03-08', '14:19:00', 6.9, 'Low')
(1022, 'Avengers: Endgame', 2019, '2019-01-01', '01:30:00', 8.4, 'High')
# use lambda in pandas
df_m['quality'] = df_m.score.apply(lambda x: 'High' if x >= 8 else ('Medium' if x >= 7 else 'Low'))
df_m
movie_id | title | year | date | time | score | quality | |
---|---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | Medium |
1 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | Low |
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | Medium |
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | Medium |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 | Low |
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | High |
6 | 1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 | Medium |
7 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | Low |
8 | 1009 | Captain America: The Winter Soldier | 2014 | 2014-11-12 | 13:56:00 | 7.7 | Medium |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 | High |
10 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 | Medium |
11 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | Medium |
12 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | Medium |
13 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | Medium |
14 | 1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 | Medium |
15 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 | Medium |
16 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | Medium |
17 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | Medium |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | High |
19 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 | Medium |
20 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | Low |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | High |
# use a definition in python
def quality_level(x):
if x >= 8:
y ='High'
elif x >= 7:
y = 'Medium'
else:
y = 'Low'
return y
df_m['quality_1'] = df_m.score.apply(lambda x: quality_level(x))
df_m
movie_id | title | year | date | time | score | quality | quality_1 | |
---|---|---|---|---|---|---|---|---|
0 | 1001 | Iron Man | 2008 | 2008-01-01 | 15:22:00 | 7.9 | Medium | Medium |
1 | 1002 | The Incredible Hulk | 2008 | 2008-03-04 | 21:16:00 | 6.7 | Low | Low |
2 | 1003 | Iron Man 2 | 2010 | 2010-01-01 | 05:38:00 | 7.0 | Medium | Medium |
3 | 1004 | Thor | 2011 | 2011-01-01 | 07:47:00 | 7.0 | Medium | Medium |
4 | 1005 | Captain America: The First Avenger | 2011 | 2011-03-15 | 06:33:00 | 6.9 | Low | Low |
5 | 1006 | The Avengers | 2012 | 2012-05-10 | 08:30:00 | 8.0 | High | High |
6 | 1007 | Iron Man 3 | 2013 | 2013-07-20 | 19:12:00 | 7.2 | Medium | Medium |
7 | 1008 | Thor: The Dark World | 2013 | 2013-09-25 | 18:27:00 | 6.9 | Low | Low |
8 | 1009 | Captain America: The Winter Soldier | 2014 | 2014-11-12 | 13:56:00 | 7.7 | Medium | Medium |
9 | 1010 | Guardians of the Galaxy | 2014 | 2014-12-03 | 22:25:00 | 8.0 | High | High |
10 | 1011 | Avengers: Age of Ultron | 2015 | 2015-02-21 | 20:50:00 | 7.3 | Medium | Medium |
11 | 1012 | Ant-Man | 2015 | 2015-04-30 | 17:01:00 | 7.3 | Medium | Medium |
12 | 1013 | Captain America: Civil War | 2016 | 2016-06-06 | 10:04:00 | 7.8 | Medium | Medium |
13 | 1014 | Doctor Strange | 2016 | 2016-08-16 | 23:11:00 | 7.5 | Medium | Medium |
14 | 1015 | Guardians of the Galaxy Vol. 2 | 2017 | 2017-10-20 | 12:41:00 | 7.6 | Medium | Medium |
15 | 1016 | Spider-Man: Homecoming | 2017 | 2017-12-14 | 02:39:00 | 7.4 | Medium | Medium |
16 | 1017 | Thor: Ragnarok | 2017 | 2017-11-03 | 11:54:00 | 7.9 | Medium | Medium |
17 | 1018 | Black Panther | 2018 | 2018-04-23 | 03:05:00 | 7.3 | Medium | Medium |
18 | 1019 | Avengers: Infinity War | 2018 | 2018-04-27 | 14:09:00 | 8.4 | High | High |
19 | 1020 | Ant-Man and The Wasp | 2018 | 2018-07-06 | 16:29:00 | 7.1 | Medium | Medium |
20 | 1021 | Captain Marvel | 2019 | 2019-03-08 | 14:19:00 | 6.9 | Low | Low |
21 | 1022 | Avengers: Endgame | 2019 | 2019-01-01 | 01:30:00 | 8.4 | High | High |
1.4.6 GROUP BY and HAVING#
The GROUP BY statement groups rows that have the same values into summary rows, which is often used with aggregate functions (COUNT(), MAX(), MIN(), SUM(), AVG()) to group the result-set by one or more columns.
The HAVING clause is used in SQL because the WHERE keyword cannot be used with aggregate functions.
# Execute an SQL query to retrieve the year and score of MCU movies and filtering only the years where the average movie score is greater than 7.5.
res = cur.execute('''
SELECT year, AVG(score) AS avg_score
FROM MCU_movies
GROUP BY year
HAVING AVG(score) > 7.5
''')
res.fetchall()
[(2012, 8.0),
(2014, 7.85),
(2016, 7.65),
(2017, 7.633333333333333),
(2018, 7.6000000000000005),
(2019, 7.65)]
df_m.groupby('year')[['score']].mean().query('score > 7.5')
score | |
---|---|
year | |
2012 | 8.000000 |
2014 | 7.850000 |
2016 | 7.650000 |
2017 | 7.633333 |
2018 | 7.600000 |
2019 | 7.650000 |