Post on 26-Jan-2015
description
transcript
Maxym KharchenkoAmazon.com
Manage ORACLE databaseswith Python
Started as a database kernel developer● Network database: db_VISTA
ORACLE DBA for ~ 12 years● Starting with ORACLE 8
Last 3 years: Senior Persistence Engineer @amazon.com
OCM, ORACLE Ace Associate
Blog: http://intermediatesql.com Twitter: @maxymkh
Whoami
Talk about why Python is awesome
Design db “connection status” monitor in Python
(hopefully) Demo it
Agenda
The job of an engineer is to make complex things simple
Why script outside the database ?
SELECT …
FROM v$backup_datafile
Do I have a good datafile backup ?
v$backup_datafile
Backup files actually exist and can be validated
Backup size is plausible On the end storage Network transfer as well
No critical errors in logs
Etc
Some things are just difficult to do in a database
You will, generally, get better Zen
i := i + 1 i += 1
You will, generally, get better Zen
You will, generally, get better Zen
Database resources are precious
Database
Scaling application load: pretty easy
Database
Scaling database load: fairly difficult
Database
Database
Database
So, why Python ?
Python enforces good coding practices
Python enforces good coding practices
my $p1 = GetParams(db1);my $p2 = GetParams(db2);foreach my $p (keys %$p1) {if(exists($p1->{$p}) and exists($p2->{$p})) {if(uc($p1->{$p}) eq uc($p2->{$p})) {$diff->{SAME}->{$p} = 1;} else {$diff->{DIFF}->{$p} = 1;}}} elsif(! exists($p2->{$p})) {$diff->{ONLY_IN_1}->{$p} = 1;}}
Python enforces good coding practices
if 1==1: print "Equal"else: print "Not equal"
Equal
Python enforces good coding practices
if 1==1: print "Equal"else:print "Not equal“File "./a.py", line 6 print "Not equal" ^IndentationError: expected an indented block
Python enforces good coding practices
import requests
url = "http://api.openweathermap.org/data/2.5/weather?q=Seattle,WA"
r = requests.get(url)
print r.json['rain']
Python interfaces to everything
import cx_Oracle
sql = "SELECT user FROM dual"
conn = cx_Oracle.connect('scott/tiger@orcl')
cursor = conn.cursor()cursor.execute(sql)
Python interfaces to everything
import botofrom boto.s3.key import Key
s3 = boto.connect_s3()bucket = s3.get_bucket('my_cats')
k.key = 'lazy_cat.jpg'k.set_contents_from_filename( '/tmp/lazy_cat.jpg')
Python interfaces to everything
Python is pretty popular
Object oriented ? Functional ? Procedural ?
Scripting / interpreted OS independent
What kind of a language is Python ?
If you know any scripting languageyou (almost) know Python
Python is similar to other languages
def is_accessible(db_name): """ Check if dbs are accessible """ ret = False
db_status = ping_db(db_name) if "ACTIVE" == db_status: ret = True
return ret
def print_databases(): with open("/etc/oratab") as f: for line in f: print line.strip().split(':')[0]
But : “Spaces” are first class citizens
Compare to Perl
sub print_databases() { open(my $f, "<", "/etc/oratab") or die ("Unable to open: $!");
while(my $line = <$f>) { my @aItems = split(':', $line); print $aItems[0] . "\n"; }
close($f);}
Compare to Perl
sub print_databases() {open(my $f, "<", "/etc/oratab“) or die ("Unable to open: $!");while(my $line = <$f>) {my @aItems = split(':', $line);print $aItems[0] . "\n"; }close($f);}
def print_databases(): with open("/etc/oratab") as f: for line in f: print line.strip().split(':')[0]File "./a.py", line 6 print line.strip().split(':')[0] ^IndentationError:expected an indented block
Again: Watch out for indents!
[ x for x in all_dbs if x['state'] == 'UP' ]
# Better than:for db in all_dbs: if db['state'] == 'UP': active_dbs.append(db)
Learn to think Pythonian
with open("/etc/oratab") as f: for line in f: …# Better thanf = open("/etc/oratab")while True: line = f.readline() …f.close()
Learn to think Pythonian
Python and ORACLE
import cx_Oracle
conn_str = "user/password@db_name" sql = "SELECT username FROM dba_users"
try: conn = cx_Oracle.connect(conn_str) cursor = conn.cursor() cursor.execute(sql) results = cursor.fetchall()except Exception, e: <Do exception handling>else: <process results>
Monitor must detect if ORACLE database is accessible● If it is NOT accessible, monitor must send
ALERT email
Monitor needs to be able to process multiple databases● Preferably in parallel
Bonus: we should be able to track monitor execution timing
Database “Connection status” monitor
Demo
import cx_Oracle
def is_db_alive(db_name): is_alive = False
try: conn = cx_Oracle.connect("user/password@%s" % db_name) cursor = conn.cursor() cursor.execute("SELECT user FROM dual") except: is_alive = False else: is_alive = True return is_alive
cx_Oracle: Running SQL
db_list = ['db1', 'db2', 'db3']for d in db_list: print("%s=%s" % (d, is_db_alive(d)))
Execute on multiple databases
> csm_monitor.py
db1=Truedb2=Falsedb3=True
import smtplib
def send_email(msg): sender = "oracle@myhost.com" receivers = ["maxym@amazon.com"] try: smtpObj = smtplib.SMTP('localhost') smtpObj.sendmail(sender, receivers, msg) except smtplib.SMTPException: print "Error: unable to send email“
def is_db_alive(db_name): … if not is_alive: send_email('Unable to ping: %s' % db_name)
Send Email
db_list = ['db1', 'db2', 'db3']for d in db_list: print("%s=%s" % (d, is_db_alive(d)))
Send Email
> csm_monitor.py
db1=True Email: 'Unable to ping: db2' sentdb2=Falsedb3=True
“Decorator” functions
def is_db_alive(): … do some stuff return is_alive
def timeit: start = time()
elapsed = time()-start
record(elapsed)
return ret
ret = is_db_alive()
@timeitdef is_db_alive(): … do some stuff return is_alive
from time import time
def timeit(func): def time_exec(*args, **kwargs): start_time = time() ret = func(*args, **kwargs) ela = time() - start_time
print("%s [%s]: %.3f seconds" % (strftime('%X'), args[0], ela)) return ret
return time_exec
@timeitdef is_db_alive(db_name): …
Decorator: Execution Timing
db_list = ['db1', 'db2', 'db3']for d in db_list: print("%s=%s" % (d, is_db_alive(d)))
Execution Timing:
> time csm_monitor.py
22:48:20 [db1]: 6.430 secondsdb1=True Email: 'Unable to ping: db2' sent 22:48:32 [db2]: 11.841 secondsdb2=False 22:48:38 [db3]: 5.946 secondsdb3=True
real 0m24.566s
import multiprocessing
def print_db_alive(db_name): print("%s=%s" % (db_name, is_db_alive(db_name)))
jobs = []for d in db_list: p = multiprocessing.Process( target=print_db_alive, args=(d,) ) jobs.append(p) p.start()
for p in jobs: p.join()
Parallelism
Parallelism
> time csm_monitor.py 22:51:08 [db1]: 6.439 secondsdb1=True 22:51:08 [db3]: 6.469 secondsdb3=True Email: 'Unable to ping: db2' sent 22:51:14 [db2]: 12.491 secondsdb2=False
real 0m12.908s
Lots of free resources on the web● Tutorials● Documentation● Stackoverflow.com● Even books
Python self documentation:● dir()● help()● type()
How to start with Python
Thank you!