Avoiding common database pitfalls

Post on 09-Jan-2017

Avoiding Common Database Pitfalls

Derek Binkley
@DerekB_WI

About Me
★ Lead Developer at National Conference of Bar Examiners
★ PHP and Java Developer
★ MySQL DBA
★ Father of Three
★ First ever talk

About NCBE
Examining Bars?

About NCBE
No, developing the bar exam for future lawyers.
Plus, supporting state admission authorities.

What is this talk about anyway?

Relational databases:
Why to use them
Avoiding common mistakes
Getting the most out of a database
Good data design

Common ways to store data
★ Relational database (MySQL, SQL Server, Oracle, Postgres)
★ NoSQL/document database/Key-value store (CouchDB, MongoDB, Redis)
★ File system
★ Custom Built Solutions

Database Normalization

1st Normal Form
• Primary Key enforcing uniqueness
• Consistent Field Types
• Consistent Column Count

2nd Normal Form
• Parent/Child Relations
• Foreign Keys

3rd Normal Form
• No transitive dependencies

Pitfall #1

Not understanding how to structure data

No clearly defined columns
Columns were left unlabeled so that customers could define their own data.

Very difficult for future developers

SQL with no defined columns

How maintainable is this?
Do you know what this does?
How would you map this to an object in PHP?

SQL with well-defined columns

This statement makes clear what you are selecting and what you will get back.
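The SQL shown on these two slides was not captured in this transcript. As a rough illustration only, the sketch below uses Python's built-in sqlite3 module (standing in for PHP and MySQL) to contrast the two styles. Every table and column name in it (customer_generic, field1, customer, last_name, state_code, signup_date) is hypothetical and not taken from the talk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical generic schema: the column names say nothing about the data.
conn.execute(
    "CREATE TABLE customer_generic "
    "(id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT, field3 TEXT)"
)
conn.execute("INSERT INTO customer_generic VALUES (1, 'Smith', 'WI', '2017-01-09')")

# Hypothetical well-defined schema: each column states what it holds.
conn.execute(
    "CREATE TABLE customer "
    "(id INTEGER PRIMARY KEY, last_name TEXT, state_code TEXT, signup_date TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Smith', 'WI', '2017-01-09')")

# Which column is the state? Only outside documentation can tell you.
unclear = conn.execute(
    "SELECT field1, field2 FROM customer_generic WHERE field2 = 'WI'"
).fetchall()

# The same query, readable without any outside knowledge.
clear = conn.execute(
    "SELECT last_name, state_code FROM customer WHERE state_code = 'WI'"
).fetchall()
```

Both queries return the same rows; the difference is only in how much a future developer has to know to read them.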

A more flexible solution?

If the developer needs:
• Flexible data storage
• Customer ability to define the meaning of data

NoSQL – Great use case.
• Allows data design to change on the fly
• Document defines meaning of underlying data.

Not using foreign keys

Not taking advantage of the database’s ability to store and index a number as a foreign key.

Violates 2nd Normal Form.
Cannot ensure data consistency.
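A minimal sketch of what a declared foreign key buys you, using Python's built-in sqlite3 module as a stand-in for the database. The customer and customer_order tables are hypothetical. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on; other databases differ in how enforcement is enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite detail: foreign keys are only enforced when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE customer_order (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(id),
        total REAL NOT NULL
    )
""")

conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Smith')")
# Customer 1 exists, so this child row is accepted.
conn.execute("INSERT INTO customer_order (customer_id, total) VALUES (1, 25.0)")

try:
    # Customer 999 does not exist, so the database rejects the row.
    conn.execute("INSERT INTO customer_order (customer_id, total) VALUES (999, 10.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
```

The consistency check lives in the schema, so no application code path can forget it.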

Keys with meaning

In this example any ID with a value < 100 gets administrative rights.

Problems?
The meaning of the data is not clear without the PHP code above.
Administrative access could accidentally be given.
Artificially limited to 99 administrators over the life of the system.
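One possible alternative, sketched with Python's built-in sqlite3 module: store the right explicitly in its own column so the ID carries no hidden meaning. The app_user table, the is_admin column, and the is_admin helper are all hypothetical names chosen for this illustration, not the talk's code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE app_user (
        id INTEGER PRIMARY KEY,          -- surrogate key with no hidden meaning
        username TEXT NOT NULL UNIQUE,
        is_admin INTEGER NOT NULL DEFAULT 0 CHECK (is_admin IN (0, 1))
    )
""")
conn.execute("INSERT INTO app_user (id, username, is_admin) VALUES (7, 'clerk', 0)")
conn.execute("INSERT INTO app_user (id, username, is_admin) VALUES (1500, 'dba', 1)")


def is_admin(conn, user_id):
    """Return True only if the row explicitly grants administrative rights."""
    row = conn.execute(
        "SELECT is_admin FROM app_user WHERE id = ?", (user_id,)
    ).fetchone()
    return bool(row and row[0])
```

Here user 7 has a low ID but no administrative rights, and user 1500 is an administrator, so the number of administrators is no longer capped by the ID range.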

Natural Key v. Surrogate Key
• A surrogate key has no meaning, e.g. GUID, incremental key
• A natural key is a unique data attribute already in the table, e.g. two-letter state code.

Reduce Joins with Natural Key
Surrogate keys require a join to get the address of a customer.

A natural key lets you understand the state field without a join.
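A small sketch of the difference, again with Python's built-in sqlite3 module. The four tables (state_s, address_s, state_n, address_n) are hypothetical and exist only to put the surrogate and natural designs side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Surrogate key design: the address stores a number that means nothing by itself.
conn.execute("CREATE TABLE state_s (id INTEGER PRIMARY KEY, code TEXT NOT NULL UNIQUE)")
conn.execute(
    "CREATE TABLE address_s "
    "(id INTEGER PRIMARY KEY, city TEXT, state_id INTEGER REFERENCES state_s(id))"
)
conn.execute("INSERT INTO state_s VALUES (50, 'WI')")
conn.execute("INSERT INTO address_s VALUES (1, 'Madison', 50)")

# Natural key design: the state code itself is the key.
conn.execute("CREATE TABLE state_n (code TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE address_n "
    "(id INTEGER PRIMARY KEY, city TEXT, state_code TEXT REFERENCES state_n(code))"
)
conn.execute("INSERT INTO state_n VALUES ('WI', 'Wisconsin')")
conn.execute("INSERT INTO address_n VALUES (1, 'Madison', 'WI')")

# The surrogate design needs a join to show a readable state.
with_join = conn.execute(
    "SELECT a.city, s.code FROM address_s a JOIN state_s s ON s.id = a.state_id"
).fetchall()

# The natural design reads the state straight from the address row.
without_join = conn.execute("SELECT city, state_code FROM address_n").fetchall()
```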

Natural Key May Change
A primary key must be stable over time.
In the United States, two-letter state codes are historically stable.
Data thought to be stable may not be. Examples:
• Country code – Yugoslavia? Soviet Union?
• SSN – Privacy concerns require SSN to be encrypted?
• Naturalized surrogate key – Once a surrogate key is shown to the user, it starts to have meaning.

Pitfall #2

Not using the database for what it is good at

Not defining keys/indexes
1. Ensure Data Integrity
2. Boost Performance
The database system optimizes queries for you.
An explain plan can provide useful information about the efficiency of keys.
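As an illustration of reading an explain plan, the sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's built-in sqlite3 module. MySQL's EXPLAIN output looks different, and the customer table, the idx_customer_email index, and the plan helper are hypothetical names for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customer (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

query = "SELECT name FROM customer WHERE email = ?"


def plan(conn, sql, params):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


# Without an index the database has to scan the whole table.
plan_before = plan(conn, query, ("user42@example.com",))

conn.execute("CREATE INDEX idx_customer_email ON customer (email)")

# With the index in place the plan names the index it will use.
plan_after = plan(conn, query, ("user42@example.com",))
```

Printing plan_before and plan_after shows the plan change from a table scan to a search that names idx_customer_email; the exact wording varies between SQLite versions.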

Custom Transaction Handling
• Databases are built to handle transactions.
• App created its own locking table.
• Developers spent time recreating a feature they had already paid for.
• Deadlock condition required contacting tech support.
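Letting the database manage the transaction, rather than a custom locking table, might look like the sketch below. It uses Python's built-in sqlite3 module, whose connection object commits or rolls back when used as a context manager. The account table and the transfer function are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        id INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
with conn:  # commits the setup rows
    conn.execute("INSERT INTO account VALUES (1, 100)")
    conn.execute("INSERT INTO account VALUES (2, 50)")


def transfer(conn, source_id, target_id, amount):
    """Move money between accounts; both updates succeed or neither does."""
    with conn:  # commit on success, roll back if any statement raises
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (amount, target_id),
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?",
            (amount, source_id),
        )


transfer(conn, 1, 2, 30)  # succeeds: balances become 70 and 80

try:
    transfer(conn, 1, 2, 500)  # would overdraw account 1, so the CHECK fails
except sqlite3.IntegrityError:
    pass  # the credit to account 2 was rolled back along with the failed debit

balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

After the failed transfer, the balances are exactly what the successful transfer left behind; no application-level cleanup is needed.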

Not using processing power of DB
Database is not just a storage engine.
Powerful platform for sorting, filtering, grouping and summarizing data.
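To make the contrast concrete, the sketch below computes the same per-state totals twice with Python's built-in sqlite3 module: once by pulling every row into application memory, and once by letting the database group and sum. The sale table and its columns are hypothetical.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (id INTEGER PRIMARY KEY, state_code TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sale (state_code, amount) VALUES (?, ?)",
    [("WI", 10), ("WI", 15), ("MN", 7), ("IL", 20), ("MN", 3)],
)

# In-memory approach: fetch every row, then group and sum in application code.
totals = defaultdict(int)
for state_code, amount in conn.execute("SELECT state_code, amount FROM sale"):
    totals[state_code] += amount
in_memory = sorted(totals.items())

# Database approach: one statement groups, sums, and sorts.
in_database = conn.execute(
    "SELECT state_code, SUM(amount) FROM sale GROUP BY state_code ORDER BY state_code"
).fetchall()
```

Both approaches give the same totals, but the second transfers three summary rows instead of the whole table.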

Where do you put your domain logic?

Domain Logic In Memory
Logic is entirely in your PHP code; the database is merely used for storage.
Easy to refactor
Easy to unit test
Part of your version control system
Easy to deploy
Inefficient use of SQL
High overhead for database connections and queries

Domain Logic In Database
Some logic is in SQL (procedures, views or direct SQL)
More efficient: less memory in PHP, less connection time
SQL statement will be much smaller and possibly easier to understand.

Object Relational Mapping - ORM
• Simplifies data access by mapping tables to objects.
• Can lead to more complex in-memory manipulation of data.
• Can tie in to query mechanism to offload work and logic to database.

Mishandling of nulls
• Null should not be meaningful.
• Null = “I don’t know”
• Null column not equal to another null column
• Check for “is null”
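The null rules above can be checked directly. This sketch uses Python's built-in sqlite3 module with a hypothetical person table in which two of three rows have no middle name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, middle_name TEXT)")
conn.executemany(
    "INSERT INTO person (name, middle_name) VALUES (?, ?)",
    [("Ann", "Lee"), ("Bob", None), ("Cy", None)],
)

# "= NULL" never matches: comparing with NULL yields NULL, not true.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM person WHERE middle_name = NULL"
).fetchone()[0]

# "IS NULL" is the correct test for a missing value.
is_null = conn.execute(
    "SELECT COUNT(*) FROM person WHERE middle_name IS NULL"
).fetchone()[0]

# Two NULL columns are not equal to each other either, so this join finds no pairs.
null_pairs = conn.execute(
    "SELECT COUNT(*) FROM person a JOIN person b "
    "ON a.id < b.id AND a.middle_name = b.middle_name"
).fetchone()[0]
```

Only the IS NULL query finds the two rows without a middle name; the equality comparisons find nothing.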

Resources
• A Simple Guide to Five Normal Forms in Relational Database Theory - http://www.bkent.net/Doc/simple5.htm
• Normalization: Friend or Foe? - http://www.treelinedesign.com/slides/normalization/confoo14.pdf
• Domain Logic and SQL - http://martinfowler.com/articles/dblogic.html
• Choosing a Primary Key: Natural or Surrogate? - http://www.agiledata.org/essays/keys.html
• Surrogate Key vs. Natural Key - http://sqlmag.com/business-intelligence/surrogate-key-vs-natural-key

Slides are available at ????

Please give feedback at https://joind.in/16419