Post on 15-Aug-2020
BadgerDB::Btree
• Goal: Build key components of a RDBMS
  – First-hand experience building the internals of a simple database system
  – And have some fun doing so!
• Two parts
  – Buffer manager [✔]
  – B+ tree (Due Date: Mar 27 by 2 PM)
• First class day after the Spring break
All projects are individual assignments
2/23/17 CS 564: Database Management Systems, Udip Pant and Jignesh Patel 1
Structure of Database
Query optimizer and execution
Relational operators
File and access methods
Buffer manager
I/O manager
Plan for today
• Review of C++ templates and helpful functions
  – memset, memcpy and reinterpret_cast
• B+ tree: insertion
• <break>
• BadgerDB::Btree
  – Project specifications
  – Code
• Q&A
C++ templates
Human misery of duplicate code
• Suppose you write a function printData:
void printData(int value) {
    std::cout << "The value is " << value;
}
• Later you want to print double and std::string:
void printData(double value) {
    std::cout << "The value is " << value;
}
void printData(std::string value) {
    std::cout << "The value is " << value;
}
And Stroustrup said: let there be templates
template<typename T>
void printData(T value) {
    std::cout << "The value is " << value;
}
Template semantics
• The syntax is simple: template< typename name > is equivalent to template< class name >
• Function templates
• Class templates
Function templates
// Explicit template arguments
template<typename T>
void func() {}

int main() {
    func<int>();
    func<double>();
}

// Deduced template arguments (a separate snippet from the one above)
template<typename T>
void func(T value) {}

template<typename T, typename U>
T func2(U value) {
    return T(value);
}

int main() {
    func(3);                          // T=int
    func(3.5);                        // T=double
    func2<int>(3.5);                  // T=int must be given (a return type cannot be deduced), U=double
    func2<std::vector<int>>(5);       // T=std::vector<int>, U=int
    func2<std::vector<int>, int>(5);  // specify both T and U
}
Class templates
• Also works on structs
template<typename T, int i>
struct FixedArray {
    T data[i];
};
FixedArray<int, 3> a; // array of 3 integers

template<typename T>
class MyClass {};

// alternative declaration with a default argument for T2
template<typename T1, typename T2 = int>
class MyClass {};

// specify all parameters
MyClass<double, std::string> mc1;
// default value for T2
MyClass<int> mc4;
Template requirements
• Templates implicitly impose requirements on their parameters
• Type T has to be:
  – Copy-Constructible if the template uses: T a(b);
  – Assignable, i.e. defines operator=(), if the template uses: a = b;
  – etc.
• For this project: operations such as a < b could mean different things for int and std::string
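One way to handle the "a < b means different things" point is a single comparison helper with an overload for string keys. The sketch below is only an illustration: compareKeys and kStringKeySize are hypothetical names, and the 10-byte width is taken from the project's "first 10 characters" simplification.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical key width for string keys (first 10 characters).
constexpr std::size_t kStringKeySize = 10;

// Generic comparison: relies only on operator<, so it works for int and double.
// Returns -1, 0, or 1.
template<typename T>
int compareKeys(const T& a, const T& b) {
    if (a < b) return -1;
    if (b < a) return 1;
    return 0;
}

// Overload for fixed-width character keys: compares contents, not addresses.
// A non-template overload is preferred over the template for char[10] arguments.
inline int compareKeys(const char (&a)[kStringKeySize],
                       const char (&b)[kStringKeySize]) {
    const int r = std::strncmp(a, b, kStringKeySize);
    return (r > 0) - (r < 0);
}
```

With this shape, the rest of the tree code can call compareKeys(k1, k2) without caring which key type it was instantiated for.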
Pointers and arrays
• Arrays work very much like pointers to their first elements
int myarray[20];
int* mypointer;
Can you do? mypointer = myarray;  Answer: Yes
Can you do? myarray = mypointer;  Answer: No (an array cannot be assigned to)
Example
// more pointers
#include <iostream>
using namespace std;

int main() {
    int numbers[5];
    int* p;
    p = numbers;      *p = 10;
    p++;              *p = 20;
    p = &numbers[2];  *p = 30;
    p = numbers + 3;  *p = 40;
    p = numbers;      *(p + 4) = 50;
    for (int n = 0; n < 5; n++)
        cout << numbers[n] << ", ";
    return 0;
}
Prints: 10, 20, 30, 40, 50,
Pointers and string literal
• const char* foo = "hello";
• foo contains the value 1702 (the address of the first character in this example), and not 'h', nor "hello"
• What is the output of?
  *(foo+4)
  foo[4]
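The two expressions in the question are the same operation written two ways. A tiny sketch, with charAt as a hypothetical helper name:

```cpp
#include <cassert>

// Returns the character `index` positions past the start of `s`.
// *(s + index) and s[index] are defined to mean the same thing.
inline char charAt(const char* s, int index) {
    assert(s != nullptr);
    return *(s + index);
}
```

For foo = "hello", both *(foo+4) and foo[4] evaluate to 'o'.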
C/C++ helpful functions
memset
• void* memset(void* ptr, int value, size_t num);
• Fill block of memory
  – Sets the first num bytes of the block of memory pointed to by ptr to the specified value (interpreted as an unsigned char)
#include <cstdio>
#include <cstring>

int main() {
    char str[] = "almost every programmer should know memset!";
    memset(str, '-', 6);  // overwrite the 6 bytes of "almost"
    puts(str);            // prints "------ every programmer should know memset!"
    return 0;
}
memcpy
• void* memcpy(void* dest, const void* source, size_t num);
• Copy block of memory
  – Copies the values of num bytes from the location pointed to by source directly to the memory block pointed to by dest.
• std::memcpy is meant to be the fastest library routine for memory-to-memory copy (usually more efficient than std::strcpy)
#include <cstring>

struct {
    char name[40];
    int age;
} person, person_copy;

int main() {
    char myname[] = "Bucky Badger";
    /* using memcpy to copy string (the +1 copies the terminating '\0'): */
    memcpy(person.name, myname, strlen(myname) + 1);
    /* using memcpy to copy structure: */
    memcpy(&person_copy, &person, sizeof(person));
    return 0;
}
reinterpret_cast<new_type>(expr)
• reinterpret_cast converts any pointer type to any other pointer type, even of unrelated classes.
• All pointer conversions are allowed: neither the content pointed to nor the pointer type itself is checked.
#include <cassert>
#include <iostream>

int main() {
    int i = 7;
    int* p1 = reinterpret_cast<int*>(&i);
    assert(p1 == &i);
    // type aliasing through pointer
    char* p2 = reinterpret_cast<char*>(&i);
    // type aliasing through reference
    reinterpret_cast<unsigned int&>(i) = 42;
    std::cout << i << '\n';
}
B+ tree
• Occupancy (d)
  – Minimum 50% occupancy (except for root)
  – Each node contains d <= m <= 2d entries.
1/29/17 CS 564: Database Management Systems, Jignesh M. Patel 5
(Ubiquitous) B+ Tree
• Height-balanced (dynamic) tree structure
• Insert/delete at log_F N cost (F = fanout, N = # leaf pages)
• Minimum 50% occupancy (except for root). Each node contains d <= m <= 2d entries. The parameter d is called the order of the tree.
• Supports equality and range-searches efficiently.
[Figure: index entries (direct search) on top of data entries]
• Data entries: entries in the leaf pages: (search key value, record id)
• Index entries: entries in the index (i.e. non-leaf) pages: (search key value, page id)
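To get a feel for the log_F N cost, the sketch below counts how many index levels sit above the leaves for a given fanout. The function name is hypothetical, and it assumes every index node is completely full, which gives the best case rather than the height of a real tree.

```cpp
#include <cassert>
#include <cstdint>

// Smallest number of index levels needed above the leaves so that one root
// can reach `leafPages` leaf pages when each index node has `fanout` children.
// This is ceil(log_F N), computed with integers to avoid rounding surprises.
// Very large inputs could overflow `reachable`; this is only a sketch.
inline int levelsAboveLeaves(std::uint64_t leafPages, std::uint64_t fanout) {
    assert(fanout >= 2 && leafPages >= 1);
    int levels = 0;
    std::uint64_t reachable = 1;  // leaf pages reachable with `levels` index levels
    while (reachable < leafPages) {
        reachable *= fanout;
        ++levels;
    }
    return levels;
}
```

For example, with a fanout of 100, one million leaf pages need only 3 index levels, which is why B+ tree lookups touch so few pages.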
B+-Tree: Inserting a Data Entry
• Find correct leaf L.
• Put data entry onto L.
  – If L has enough space, done!
  – Else, must split L (into L and a new node L2)
    • Redistribute entries evenly, copy up middle key.
    • Insert index entry pointing to L2 into parent of L.
• This can happen recursively
  – To split non-leaf node, redistribute entries evenly, but pushing up the middle key. (Contrast with leaf splits.)
• Splits "grow" tree; root split increases height.
  – Tree growth: gets wider or one level taller at top.
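The difference between "copy up" and "push up" can be sketched on plain key vectors. This is an illustration only: SplitResult, splitLeaf, and splitNonLeaf are hypothetical names, and the sketch ignores record ids, child pointers, and pages.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Keys kept in the original node, keys moved to the new node, and the
// separator key that is handed to the parent.
struct SplitResult {
    std::vector<int> left;
    std::vector<int> right;
    int separator;
};

// Leaf split: the middle key is COPIED up, so it also stays in the right leaf.
inline SplitResult splitLeaf(const std::vector<int>& keys) {
    assert(keys.size() >= 2);
    const std::size_t mid = keys.size() / 2;
    return {std::vector<int>(keys.begin(), keys.begin() + mid),
            std::vector<int>(keys.begin() + mid, keys.end()),
            keys[mid]};
}

// Non-leaf split: the middle key is PUSHED up, so it appears in neither half.
inline SplitResult splitNonLeaf(const std::vector<int>& keys) {
    assert(keys.size() >= 3);
    const std::size_t mid = keys.size() / 2;
    return {std::vector<int>(keys.begin(), keys.begin() + mid),
            std::vector<int>(keys.begin() + mid + 1, keys.end()),
            keys[mid]};
}
```

Splitting the overfull leaf {2, 3, 5, 7, 8} yields {2, 3} and {5, 7, 8} with separator 5, while splitting the overfull index node {5, 13, 17, 24, 30} yields {5, 13} and {24, 30} with 17 moved to the parent.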
Inserting 8* into B+ Tree
Root: [13 17 24 30]
Leaves: [2* 3* 5* 7*] [14* 16*] [19* 20* 22*] [24* 27* 29*] [33* 34* 38* 39*]
Leaf split: [2* 3* 5* 7* 8*] becomes [2* 3*] and [5* 7* 8*]
Entry to be inserted in parent node: 5. Copied up (and continues to appear in the leaf).
Inserting 8* into B+ Tree
Index node split: [5 13 17 24 30] becomes [5 13] and [24 30]
Insert in parent node: 17. Pushed up (and only appears once in the index).
Inserting 8* into B+ Tree
Root: [17]
Index nodes: [5 13] [24 30]
Leaves: [2* 3*] [5* 7* 8*] [14* 16*] [19* 20* 22*] [24* 27* 29*] [33* 34* 38* 39*]
• Root was split: height increases by 1
• Could avoid split by re-distributing entries with a sibling
  – Sibling: immediately to left or right, and same parent
5 mins break
https://www.youtube.com/watch?v=AxSdWhkMB_A
BadgerDB: Btree
After you untar the project:
• btree.h: Add your own methods and structures as you see fit but don't modify the public methods that we have specified.
• btree.cpp: Implement the methods we specified and any others you choose to add.
• file.h(cpp): Implements the PageFile and BlobFile classes.
• main.cpp: Use to test your implementation. Add your own tests here or in a separate file. This file has code to show how to use the FileScan and BTreeIndex classes.
• page.h(cpp): Implements the Page class.
• buffer.h(cpp), bufHashTbl.h(cpp): Implementation of the buffer manager.
• Exceptions/*: Implementation of exception classes that you might need.
• Makefile: makefile for this project.
B+-tree Page Format
Leaf page: P0 | R1 K1 | R2 K2 | ... | Rn Kn | Pn+1
  – data entries: Ri refers to record i
  – P0: prev page pointer
  – Pn+1: next page pointer
Non-leaf page: P1 K1 P2 K2 P3 ... Pm Km Pm+1
  – index entries
  – P1: pointer to a page with values < K1
  – P2: pointer to a page with values s.t. K1 ≤ values < K2
  – P3: pointer to a page with values s.t. K2 ≤ values < K3
  – Pm+1: pointer to a page with values ≥ Km
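The page format above can be sketched as fixed-size structs whose capacity is derived from the page size. Everything named here (kPageSize, PageNo, Rid, LeafNode, NonLeafNode) is an assumption made for illustration and is not the project's actual definition; the sketch uses integer keys and keeps only a right-sibling pointer in the leaf.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative assumptions, not the project's definitions.
constexpr std::size_t kPageSize = 8192;  // 8 KB pages
using PageNo = std::uint32_t;            // hypothetical page identifier

struct Rid {                             // hypothetical record id
    PageNo page;
    std::uint16_t slot;
};

// Number of (key, rid) pairs that fit in a leaf next to one sibling pointer.
constexpr std::size_t kLeafCapacity =
    (kPageSize - sizeof(PageNo)) / (sizeof(int) + sizeof(Rid));

// Number of keys that fit in a non-leaf holding one more child than keys.
constexpr std::size_t kNonLeafCapacity =
    (kPageSize - sizeof(PageNo)) / (sizeof(int) + sizeof(PageNo));

struct LeafNode {
    int keys[kLeafCapacity];
    Rid rids[kLeafCapacity];
    PageNo rightSibling;                 // "next" leaf only
};

struct NonLeafNode {
    int keys[kNonLeafCapacity];
    PageNo children[kNonLeafCapacity + 1];
};

// Compile-time guard: each node must fit inside a single page.
static_assert(sizeof(LeafNode) <= kPageSize, "leaf must fit in one page");
static_assert(sizeof(NonLeafNode) <= kPageSize, "non-leaf must fit in one page");
```

Deriving the capacities from the page size, and guarding them with static_assert, means a change to the key type or record id layout fails at compile time instead of silently overflowing a page.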
Index
• the index will store data entries in the form of <key, rid> pairs
• stored in a file that is separate from the data file
• i.e. the index file "points to" the data file where the actual records are stored
File
PageFile
• stores all the relations (actual data) as we did in the buffer manager assignment
• You don't actually use this one for this project
BlobFile
• pages in the file are not linked by prevPage/nextPage links
• treats the pages as blobs of 8 KB size, i.e. does not require these pages to be valid objects of the Page class
• use the BlobFile to store the B+ index file
• every page in the file is a node from the B+ tree
• we can modify these pages to suit the particular needs of the B+ tree index
FileScan class
• The FileScan class is used to scan records in a file.
• FileScan(const std::string& relationName, BufMgr* bufMgr)
  – The constructor takes the relationName and buffer manager instance
• ~FileScan()
  – Shuts down the scan and unpins any pinned pages.
• void scanNext(RecordId& outRid)
  – Returns (via the outRid parameter) the RecordId of the next record from the relation being scanned. It throws EndOfFileException() when the end of relation is reached.
• std::string getRecord()
  – Returns the "current" record. The record is the one returned by the preceding scanNext() call.
• void markDirty()
  – You don't need this for this assignment
BadgerDB: B+ Tree Index
Simplifications:
• assume that all records in a file have the same length (so for a given attribute its offset in the record is always the same).
• only needs to support single-attribute indexing
• the indexed attribute may be one of three data types: integer, double, or string
• in the case of a string, you can use the first 10 characters as the key in the B+-tree
• we will never insert two data entries into the index with the same key value
B+ Tree Index: Constructor
If the index file already exists, open the file. Else, create a new index file.
Parameters:
• const string& relationName: The name of the relation on which to build the index. The constructor should scan this relation (using FileScan) and insert entries for all the tuples in this relation into the index.
• string& outIndexName: The name of the index file; determine this name in the constructor as shown above, and return the name.
• BufMgr* bufMgrIn: The instance of the global buffer manager.
• const int attrByteOffset: The byte offset of the attribute in the tuple on which to build the index. For instance, if we are storing a structure RECORD (shown on the slide) as a record in the original relation, and we are building the index over the double d, then the attrByteOffset value is 0 + offsetof(RECORD, d), where offsetof is the offset macro provided by the standard C++ library.
• const Datatype attrType: The data type of the attribute we are indexing. Note that the Datatype enumeration {INTEGER, DOUBLE, STRING} is defined in btree.h
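To make attrByteOffset concrete, the sketch below pulls a double key out of a raw record. The RECORD layout here is an assumption, since the slide's structure is not reproduced in this transcript, and extractDoubleKey is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstring>   // std::memcpy

// Assumed fixed-length record layout, for illustration only.
struct RECORD {
    int i;
    double d;
    char s[64];
};

// Reads the double stored `attrByteOffset` bytes into a raw record.
// memcpy is used instead of a pointer cast so the read is safe even when
// the bytes are not suitably aligned for a double.
inline double extractDoubleKey(const char* record, std::size_t attrByteOffset) {
    assert(record != nullptr);
    double key;
    std::memcpy(&key, record + attrByteOffset, sizeof(key));
    return key;
}
```

A caller building an index over d would pass offsetof(RECORD, d) as the offset, and the same helper shape works for an int key at offsetof(RECORD, i).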
B+ Tree Index: insertEntry
insertEntry inserts a new entry into the index using the pair <key, rid>.
Input to this function:
• const void* key: A pointer to the value (integer/double/string) we want to insert.
• const RecordId& rid: The corresponding record id of the tuple in the base relation.
You will be spending the bulk of your time/code in this method.
B+ Tree Index: startScan
startScan: This method is used to begin a "filtered scan" of the index. For example, if the method is called using arguments ("a", GT, "d", LTE), the scan should seek all entries greater than "a" and less than or equal to "d".
Input to this function:
• const void* lowValue: The low value to be tested.
• const Operator lowOp: The operation to be used in testing the low range. You should only support GT and GTE here; anything else should throw BadOpcodesException.
• const void* highValue: The high value to be tested.
• const Operator highOp: The operation to be used in testing the high range. You should only support LT and LTE here; anything else should throw BadOpcodesException.
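The operator rules and the range filter can be sketched for integer keys as follows. The Operator enum and BadOpcodes exception below are stand-ins defined just for this example; the project supplies its own Operator type and BadOpcodesException.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-ins for the project's Operator enum and BadOpcodesException.
enum Operator { LT, LTE, GTE, GT };

struct BadOpcodes : std::invalid_argument {
    BadOpcodes() : std::invalid_argument("bad scan opcodes") {}
};

// The low end accepts only GT/GTE and the high end only LT/LTE.
inline void validateOps(Operator lowOp, Operator highOp) {
    if ((lowOp != GT && lowOp != GTE) || (highOp != LT && highOp != LTE)) {
        throw BadOpcodes();
    }
}

// True when `key` satisfies both ends of the scan range.
inline bool inRange(int key, int low, Operator lowOp, int high, Operator highOp) {
    validateOps(lowOp, highOp);
    const bool aboveLow = (lowOp == GT) ? (key > low) : (key >= low);
    const bool belowHigh = (highOp == LT) ? (key < high) : (key <= high);
    return aboveLow && belowHigh;
}
```

In a real scan the validation would happen once in startScan, and only the upper-bound test would be repeated for each entry as the scan walks right along the leaves.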
B+ Tree Index: scanNext
• scanNext fetches the record id of the next tuple that matches the scan criteria. If the scan has reached the end, then it should throw the exception IndexScanCompletedException.
• RecordId& outRid: output value; this is the record id of the next entry that matches the scan filter set in startScan.
B+ Tree Index: endScan
• endScan terminates the current scan and unpins all the pages that have been pinned for the purpose of the scan
• throws ScanNotInitializedException if called before a successful startScan call.
Implementation notes
• call the buffer manager to read/write pages
• don't keep the pages pinned in the buffer pool unless you need to
• For the scan methods, you will need to remember the "state" of the scan specified during the startScan
• insert does not need to redistribute entries
• At the leaf level, you do not need to store pointers to both siblings. The leaf nodes only point to the "next" (the right) sibling
FAQs
• How do I get started?
if the index file does not exist:
    create new BlobFile
    allocate new meta page
    allocate new root page
    populate 'IndexMetaInfo' with the root page num
    scan records and insert into the BTree
else:
    read the first page from the file, which is the meta node
    get the root page num from the meta node
    read the root page (bufManager->readPage(file, rootPageNum, out_root_page))
    once you have the root node, you can traverse down the tree
FAQs
• How to check whether an index file exists?
  – See file.h: static bool exists(const std::string& filename)
FAQs
• How do I write a node to disk as a Page? e.g. how to write the IndexMetaInfo node to the file?
⇒ You first need to allocate a Page using the bufferManager.
Page* metaPage;
bufManager->allocatePage(..., metaPage);
Then you can either cast it to IndexMetaInfo* and update its fields, or first create and populate an IndexMetaInfo node, allocate a new Page* using the bufferManager as above, and use 'memcpy' to copy the node to the new Page:
memcpy(metaPage, &metaInfo, sizeof(IndexMetaInfo));
But you do not need to write it back as a page to disk explicitly. The Buffer Manager does it for you. Remember project 2?
FAQs
• How to convert a Page that you read from the file to a Node?
  – you can cast, e.g. using reinterpret_cast
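Both FAQ answers (copy a struct into a page with memcpy, or view the page as the struct with reinterpret_cast) can be tried without the buffer manager. In the sketch below, RawPage and MetaInfo are hypothetical stand-ins for the project's Page and IndexMetaInfo, and the approach assumes the struct is trivially copyable.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for an 8 KB page of raw bytes.
struct RawPage {
    alignas(8) unsigned char bytes[8192];
};

// Hypothetical stand-in for the index meta information.
struct MetaInfo {
    char relationName[20];
    std::int32_t attrByteOffset;
    std::uint32_t rootPageNo;
};

static_assert(std::is_trivially_copyable<MetaInfo>::value,
              "byte-wise copies are only valid for trivially copyable types");
static_assert(sizeof(MetaInfo) <= sizeof(RawPage), "meta info must fit in a page");

// Option 1: populate the struct separately, then copy its bytes into the page.
inline void writeMeta(RawPage* page, const MetaInfo& meta) {
    std::memcpy(page, &meta, sizeof(MetaInfo));
}

// Option 2: view the page's bytes as the struct and update fields in place.
inline MetaInfo* asMeta(RawPage* page) {
    return reinterpret_cast<MetaInfo*>(page);
}
```

Either way the struct must never be larger than a page, which is what the second static_assert guards against.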
Suggestions
• Start early
  – 1000+ lines of code
• Try to finish before the spring break
  – No TA hours during the break
• Make incremental progress
  – Test aggressively