Using Database with Python Week2

Relational Databases and SQLite

Posted by freeCookie🍪 on December 20, 2016

Relational Databases and SQLite

Definition:

Database - contains many tables

• Relation (or table) - contains tuples and attributes

• Tuple (or row) - a set of fields that generally represents an “object” like a person or a music track

• Attribute (also column or field) - one of possibly many elements of data corresponding to the object represented by the row

Relation(Table):

  Attribute  
Tuple(Row) Tuple/Attribute Tuple
  Attribute(column)  

SQL

Structured Query Language is the language we use to issue commands to the database

Creat a table/ Retrive data/ Insert / Delete

Using SQLite Browser

Some Examples:

Creat/ Insert/ Delete/ Retrive/ Sort/ Count selected numbers

CREATE TABLE Users(
	name VARCHAR(128),
  	email VARCHAR(128)
)
INSERT INTO Usesrs(name, email)VALUES('name','email@sth.com')
DELETE FROM Users WHERE email = 'email@sth.com'
UPDATE USers SET name = 'newname' WHERE email = 'email@sth.com'
SELECT*FROM Users WHERE email = 'email@sth.com'
SELECT*FROM Users ORDER BY name(email)
SELECT COUNT(*) FROM Users

小assessment: 找到一堆文件里的email域名并统计。域名是什么玩意?死蠢的试验了一下:

org = re.search("@[\w.]+", email).group().strip().split('@')[-1]

代码主体部分还是提供好了的:

import sqlite3
import re

conn = sqlite3.connect('emaildb.sqlite')
cur = conn.cursor()

cur.execute('''
DROP TABLE IF EXISTS Counts''')

cur.execute('''
CREATE TABLE Counts (org TEXT, count INTEGER)''')

fname = raw_input('Enter file name: ')
if ( len(fname) < 1 ) : fname = 'mbox.txt'
fh = open(fname)
for line in fh:
    if not line.startswith('From: ') : continue
    pieces = line.split()
    email = pieces[1]
    org = re.search("@[\w.]+", email).group().strip().split('@')[-1]
    print org
    cur.execute('SELECT count FROM Counts WHERE org = ? ', (org, ))
    row = cur.fetchone()
    if row is None:
        cur.execute('''INSERT INTO Counts (org, count)
                VALUES ( ?, 1 )''', ( org, ) )
    else :
        cur.execute('UPDATE Counts SET count=count+1 WHERE org = ?',
            (org, ))
    # This statement commits outstanding changes to disk each
    # time through the loop - the program can be made faster
    # by moving the commit so it runs only after the loop completes
    conn.commit()

# https://www.sqlite.org/lang_select.html
sqlstr = 'SELECT org, count FROM Counts ORDER BY count DESC LIMIT 10'

print
print "Counts:"
for row in cur.execute(sqlstr) :
    print str(row[0]), row[1]

cur.close()

洗洗睡