Querying a Database with Python (SQLite)

Contents

Introduction
Summary

Introduction

In this SQL tutorial we will explore convenient querying possibilities with Python.

Once you’ve created an SQLite (or MySQL) Database you will likely want to query it as well. Or you might also want to access an existing database and retrieve data from it for your application, script, science project or data science project.

If you’re wondering how you can create a database for practicing, you can check out these tutorials:

DB Browser for SQLite for (manual method)

Building a database (Python method)

Connecting to Database and Creating Cursor

First we need to connect to the database and create a cursor with that connection object. We will use this cursor to “fetch” data from database.

import sqlite3

conn = sqlite3.connect('Desktop/Family.sqlite3')
curr = conn.cursor()

Fetching Database Rows in Python (Fetchone)

Fetching simply means retrieving data in this context. We can use fetchall or fetchone methods on our cursor. fetchall fetches all rows returned by the SQL query while fetchone returns only one row.

data = cur.execute('SELECT * FROM Fam')

print(data.fetchone())

('Pam', 50, 'Unknown', None)

Fetching Database Using Fetchall

And fetchall will fetch the whole result of the query. See the example:

data = cur.execute('SELECT * FROM Fam')

print(data.fetchall())

[('Pam', 50, 'Unknown', None), ('Miranda', 32, 'Unknown', None), 
('Pascal', 45, 'Unknown', None), ('Dave', 12, 'Unknown', None), 
('Emmy', 23, 'Unknown', None), ('Michael', 36, 'Unknown', None)]

When you are done querying don’t forget to close the connection so database doesn’t remain in use by Python.

conn.close()

It may seem too simple or trivial but once you master up to here, it doesn’t matter if database has 1 million rows or 1 billion rows you can tackle it with the same skills.

Database Querying Example with Python

import sqlite3

conn = sqlite3.connect('Desktop/GA3.sqlite')
cur = conn.cursor()

data = cur.execute('SELECT * FROM Intuse WHERE Population > 150000000', )

output = data.fetchall()
print(*output, sep="\n")

Here we are getting all the rows where countries have population higher than 150 million.

('China', 904080566.0, 1427647786, 1, '63.33%', 98)
('India', 755820000.0, 1366417754, 2, '55.40%', 117)
('United States', 312320000.0, 324459463, 3, '96.26%', 14)
('Indonesia', 196714070.0, 266911900, 4, '73.70%', 73)
('Brazil', 150410801.0, 209288278, 6, '71.86%', 75)
('Nigeria', 136203231.0, 205886311, 7, '66.44%', 89)
('Bangladesh', 111875000.0, 164945471, 8, '67.79%', 99)
('Pakistan', 93000000.0, 220892341, 5, '43.50%', 135)

To see the columns of this table we can use this query.


data = cur.execute('PRAGMA table_info(Intuse)')

output = data.fetchall()
print(*output, sep="\n")

(0, 'country', 'TEXT', 0, None, 0)
(1, 'users', 'REAL', 0, None, 0)
(2, 'population', 'INT', 0, None, 0)
(3, 'rank1', 'INT', 0, None, 0)
(4, 'percentage', 'INT', 0, None, 0)
(5, 'rank2', 'INT', 0, None, 0)

Now data makes more sense, we can see the internet usage percentage and population of these countries. Based on this table the US seems to have a pretty high internet usage ratio with 96% and Pakistan’s ratio is pretty low with 43%. Let’s dig deeper.


data = cur.execute('SELECT * FROM Intuse order by percentage
                desc limit 5')

output = data.fetchall()
print(*output, sep="\n")


('Falkland Islands', 2881.0, 2910, 213, '99.02%', 1)
('Andorra', 76095.0, 76965, 194, '98.87%', 2)
('Bermuda', 60349.0, 61349, 197, '98.37%', 3)
('Iceland', 329196.0, 335025, 175, '98.26%', 4)
('Liechtenstein', 37201.0, 37922, 203, '98.10%', 5)

This is really cool to have such power and tweaking sensitivity on queries. Now, we’re seeing top 5 countries (achieved by LIMIT 5) in descending order based on internet usage percentage. Iceland, Bermuda, Liechtenstein, Andorra and Falkland Islands all seem to be doing great.

Interestingly some of these are island countries and internet seems to be providing a great opportunity to offset some of the effects of isolation from the rest of the world. Others are tiny European countries.

Small countries and economies can have unique dynamics for their infrastructure. Let’s try to discover something similar but for countries above certain size limit.


data = cur.execute('SELECT * FROM Intuse order by percentage
                desc limit 5')

output = data.fetchall()
print(*output, sep="\n")

Countries above 1 million population ordered by internet usage:

('Kuwait', 4053797.0, 4136528, 129, '98.00%', 6)
('Norway', 5120225.0, 5305383, 117, '96.51%', 12)
('Sweden', 9554907.0, 9910701, 88, '96.41%', 13)
('United States', 312320000.0, 324459463, 3, '96.26%', 14)
('Qatar', 2532059.0, 2639211, 141, '95.94%', 14)

Gulf countries seem to be doing so well. Obviously it seems internet penetration on that top percentile probably requires lots of investment power. And we also see Nordic countries thriving in this list. Let’s check the same stats for countries above 10 million population, we’re entering the big league.


data = cur.execute('SELECT * FROM Intuse order by percentage
                desc limit 5')

output = data.fetchall()
print(*output, sep="\n")

Another interesting result from our query. This is really addictive to be able to change one character and have query results exactly as you want from the luxury of your Python console. SQLite and databases offer so much potential for any type of coder really.

('United States', 312320000.0, 324459463, 3, '96.26%', 14)
('South Korea', 49421084.0, 50982212, 27, '95.10%', 16)
('United Kingdom', 65001016.0, 66181585, 21, '94.62%', 19)
('Iran', 78086663.0, 83020323, 17, '94.05%', 104)
('Netherlands', 15877494.0, 17035938, 66, '93.20%', 22)

You can also visualize these query results on the go using DB Browser for SQLite.

Summary

In this SQL tutorial with Python we have seen different SQL queries and how to execute them from inside a Python code using sqlite3 library.

We’ve seen Python examples for getting database column names and features, creating database connection and cursor, fetching data from SQLite database and more SQL queries on the database with Python.