Building a Database with SQL & Python (SQLite)
In this SQLite tutorial with Python we will show the basics of SQL operations from within Python console.
Then we will move on to a more advanced real-world example to create an interesting SQLite database using Python.
Introduction
You can execute all the SQL commands from calling them inside Python code using SQLite. In this SQLite tutorial we will demonstrate how to manage and access MySQL like databases using Python and SQLite library.
You will learn how to do all of these things using Python:
- Connect to Databases
- Execute SQL codes using SQLite library in Python
- Create and alter MySQL tables
- Insert values into MySQL tables
- Build databases from more complex data structures in Python using loops
- Query existing databases with Python and fetch data
You can use these database techniques for commercial products and services as well as non-commercial products and services such as education or open source contributions.
You can use database management, querying, altering and manipulation techniques for Data Science and Machine Learning tasks as well as more traditional applications such as scripting, app development, game development and web services.
Or maybe you can even come up with an innovative approach to make use of databases. MySQL is really powerful and it’s a great skill to match with Python.
1- SQLite3 Library Installation
we will use sqlite3 library. sqlite3 library is included in standard Python libraries so, normally you shouldn’t have to install it separately.
if you get an error while importing sqlite3 because you don’t have it installed just use this command:
pip install sqlite3
if you need further assistance you can refer to this page: installing libraries using pip and Conda.
‘https://holypython.com/installing-pip-packages-with-anaconda/
‘https://www.sqlite.org/download.html
2- SQL Connection and Cursor in Python
Connection First
Like most things, SQL operations in Python are straight-forward. Before being able to do anything related to SQL you will need to create a connection to the database. This can be done using sqlite3.connection()
If database you are pointing in your connection doesn’t exist, it will be created in the location you provided. Check out this Python example:
- useful for repositioning the turtle
- making breaks in the drawin
import sqlite3
conn = sqlite3.connect('Desktop/Family.sqlite3')
curr = conn.cursor()
Cursor is another crucial part of initiating the SQL operations in Python. Using the connection to the database we just created we can create a cursor which will be used to execute SQL commands.
Below you can see the part that would come after connection and cursor. We are using executescript method to execute an SQL script which creates a table named Fam with two columns: Name and Age.
3- Executing SQL Scripts in Python
You can simply use .executescript method with the cursor and enter any sensible SQL command inside. Python’s triple quotes are very useful here as they allow us to break code into multiple lines.
After executing the SQL script we have to commit to the connection otherwise changes done via SQL code won’t take effect. To do this we simply use connection.commit().
curr.executescript('''
CREATE TABLE Fam (
Name TEXT,
Age INT);
''')
conn.commit()
4- Closing the Connection
Finally you also have to close the connection otherwise database will remain in use by Python. This can be done by simply calling .close() method on the connection.
Here is a more complete code:
import sqlite3
conn = sqlite3.connect('Desktop/Family.sqlite3')
curr = conn.cursor()
cur.executescript('''
CREATE TABLE Fam (
Name TEXT,
Age INT);
''')
conn.commit()
conn.close()
5- SQL Benefits
Benefits of learning Python programming language is an extensive list and most people are already aware of all its items. But what about SQL? Is SQL really so useful? What are the benefits of coupling SQL skills with Python?
We follow data science reports closely and most data science surveys report SQL in top 3 in-demand data science skills such as this Glassdoor report. SQL had a tendency to follow Python and R despite not being a full programming language (SQL is considered a domain-specific language that’s used to manage and manipulate data in relational databases).
In recent couple of years we can say that SQL is even surpassing in most statistics and coming right after Python regarding top in-demand skills for data science jobs.
But we can’t just validate the benefits of learning SQL programming with external sources and demand in HR departments of companies though it’s a good indicator. Companies usually have lots of resources to discover and demand most useful and edge-cutting technologies. To really understand the benefits of SQL you should get your feet wet and explore how it may solve data problems of your projects.
SQL skills go beyond data science and can be utilized anywhere with Python related applications from scripting to web development, gui applications to app or game development, web crawling / scraping, financial services, patient data in life sciences, city / government level management, defense systems and analytics and more.
6- Database Example: Internet Usage in the World
Let’s make another database example with SQLite and Python:
Internet usage is an important data indicating tech penetration and probably development levels of countries. It’s also sad to see the inequalities as individuals without internet connection would be at such a disadvantage regarding all sorts of information, business activity and skill acquisition. But good news is these numbers are going up and internet usage is expected to level out further across the world in the upcoming years.
I was wondering how I can save only the table and query it just for fun and it was a very insightful and mind opening practice.
In the remaining part of this tutorial we will simply build our own database with this table.
Can you build the same database from the same table? If you are new to SQLite, SQL or Python it would be a great practice to skills you are learning in this article.
Note: You can find this data on Wikipedia. Simply go to Wikipedia page below and then copy the table with its headers all the way down. Then you can go to a spreadsheet editor like Excel and paste by “matching destination formatting”, so, don’t keep source formatting. Then you can save this file as csv, comma separated values format.
Here we are importing all the libraries needed:
sqlite3: will be used for database connections and executing SQL scripts
pandas: will be used to read csv file that consists of Wikipedia table.
import pandas as pd
import sqlite3
import datetime
from datetime import timedelta
#Reading Wikipedia Table
file = "Desktop/intuse.csv"
data = pd.read_csv(file, index_col=(0), header=0,
encoding="cp1252", thousands=",")
print(data.tail())
print(data.columns)
#Connecting to Database
conn = sqlite3.connect('Desktop/wiki_collection.sqlite')
cur = conn.cursor()
.tail method will show the tail of the data frame (last 5 rows). I also used .columns method to see which columns we have and get an idea of names for the database table’s columns.
Great now we seem to have all the information we need to build a solid sqlite database.
I’m going to call the table Intuse but you probably can come up with a better name.
Note: SQL needs data to be in SQL syntax and standards. This means there are no thousand separators for big numbers. This can simply be managed by “thousands” parameter in read_csv() with pandas. If you skip this little step data won’t be recognized properly in the database and you can have difficulties while querying later.
cur.executescript('''
DROP TABLE IF EXISTS Intuse;
CREATE TABLE Intuse (
country TEXT,
users REAL,
population INT,
rank1 INT,
percentage INT,
rank2 INT
);
''')
After having the proper setup for the database table we can fill it with values. We already have an active cursor, all we need is a simple loop iteration on our pandas data frame. Below, we are iterating each row using .iterrows() method and executing an INSERT INTO SQL command at each iteration.
for i in data.iterrows():
print(i)
cur.execute('''INSERT OR IGNORE INTO Intuse (country, users, population, rank1, percentage, rank2)
VALUES ( ?, ?, ?, ?, ?, ?)
''', (i[1][0], i[1][1], i[1][2], i[1][3], i[1][4], i[1][5]) )
That’s all there is! Now the table should be filled with meaningful values and ready to be queried. Let’s take a manual sneak peak with DB Browser for SQLite and we will make different queries on this database using Python in this tutorial.
You can also see Visualization ideas on it right from the database management software.
Summary & Benefits
To summarize this mini SQL operation done in Python, we:
- Created a connection to database and a cursor with that connection
- Executed SQL code using the cursor we created
- Committed to the database connection and closed the connection.
Retrieving data from the database is a slightly different operation. Only in the second step, we will use fetching methods with the cursor. You can see this page: