You should first implement the fromBytes in LeafNode. This method reads a LeafNode from a page. For information on how a leaf node is serialized, see LeafNode::toBytes. For an example on how to read a node from disk, see InnerNode::fromBytes. Your code should be similar to the inner node version but should account for the differences between how inner nodes and leaf nodes are serialized. You may find the documentation in ByteBuffer.java helpful.
Once you have implemented fromBytes you should be passing TestLeafNode::testToAndFromBytes.
public static LeafNode fromBytes(BPlusTreeMetadata metadata, BufferManager bufferManager,
LockContext treeContext, long pageNum) {
// TODO(proj2): implement
// Note: LeafNode has two constructors. To implement fromBytes be sure to
// use the constructor that reuses an existing page instead of fetching a
// brand new one.
Page page = bufferManager.fetchPage(treeContext, pageNum);
Buffer buf = page.getBuffer();
byte nodeType = buf.get();
assert(nodeType == (byte) 1);
long rightSibling = buf.getLong();
Optional<Long> rightSiblingOpt = (rightSibling == -1L) ? Optional.empty() : Optional.of(rightSibling);
List<DataBox> keys = new ArrayList<>();
List<RecordId> rids = new ArrayList<>();
int n = buf.getInt();
for (int i = 0; i < n; ++i) {
keys.add(DataBox.fromBytes(buf, metadata.getKeySchema()));
rids.add(RecordId.fromBytes(buf));
}
return new LeafNode(metadata, bufferManager, page, keys, rids, rightSiblingOpt, treeContext);
}
运行测试,通过:
Task 2: get, getLeftmostLeaf, put, remove
After implementing fromBytes, you will need to implement the following methods in LeafNode, InnerNode, and BPlusTree:
get
getLeftmostLeaf (LeafNode and InnerNode only)
put
remove
For more information on what these methods should do refer to the comments in BPlusTree and BPlusNode.
Each of these methods, although split into three different classes, can be viewed as one recursive action each - the BPlusTree method starts the call, the InnerNode method is the recursive case, and the LeafNode method is the base case. It's suggested that you work on one method at a time (over all three classes).
We've provided a sync() method in LeafNode and InnerNode. The purpose of sync() is to ensure that the representation of a node in our buffers is up-to-date with the representation of the node in program memory.
Do not forget to call sync() when implementing the two mutating methods (put and remove); it's easy to forget.
public void put(DataBox key, RecordId rid) {
typecheck(key);
// TODO(proj4_integration): Update the following line
LockUtil.ensureSufficientLockHeld(lockContext, LockType.NL);
// TODO(proj2): implement
// Note: You should NOT update the root variable directly.
// Use the provided updateRoot() helper method to change
// the tree's root if the old root splits.
Optional<Pair<DataBox, Long>> result = root.put(key, rid);
if (result.isPresent()) {
List<DataBox> keys = new ArrayList<>();
keys.add(result.get().getFirst());
List<Long> children = new ArrayList<>();
children.add(root.getPage().getPageNum());
children.add(result.get().getSecond());
InnerNode newRoot = new InnerNode(metadata, bufferManager, keys, children, lockContext);
updateRoot(newRoot);
}
}
但由于scanAll函数尚未完成,无法通过testRandomPuts()测试。
remove函数
依然是先写LeafNode部分,比较简单:
public void remove(DataBox key) {
// TODO(proj2): implement
int index = keys.indexOf(key);
if (index == -1) {
throw new BPlusTreeException("Leaf does not have the key");
}
keys.remove(index);
rids.remove(index);
}
public void remove(DataBox key) {
int index = Collections.binarySearch(keys, key);
if (index == -1) {
throw new BPlusTreeException("Leaf does not have the key");
}
keys.remove(index);
rids.remove(index);
sync();
}
再写InnerNode部分代码:
public void remove(DataBox key) {
// TODO(proj2): implement
int index = InnerNode.numLessThanEqual(key, keys);
BPlusNode child = getChild(index);
child.remove(key);
sync();
}
测试通过
再完成BPlusTree的remove部分:
public void remove(DataBox key) {
typecheck(key);
// TODO(proj4_integration): Update the following line
LockUtil.ensureSufficientLockHeld(lockContext, LockType.NL);
// TODO(proj2): implement
root.remove(key);
}
那么Task2就完成了。
Task 3: Scans
You will need to implement the following methods in BPlusTree:
scanAll
scanGreaterEqual
In order to implement these, you will have to complete the BPlusTreeIterator inner class in BPlusTree.javato complete these two methods.
After completing this task, you should be passing TestBPlusTree::testRandomPuts
Your implementation does not have to account for the tree being modified during a scan. For the time being you can think of this as there being a lock that prevents scanning and mutation from overlapping, and that the behavior of iterators created before a modification is undefined (you can handle any problems with these iterators however you like, or not at all).
首先看scanAll和scanGreaterEqual的注释提示,大致的调用关系如下:
直接写迭代器:
private class BPlusTreeIterator implements Iterator<RecordId> {
// TODO(proj2): Add whatever fields and constructors you want here.
private LeafNode currentLeafNode;
private int currentIndex;
public BPlusTreeIterator(LeafNode startLeafNode, int startIndex) {
this.currentLeafNode = startLeafNode;
this.currentIndex = startIndex;
}
@Override
public boolean hasNext() {
// TODO(proj2): implement
if (currentLeafNode == null) {
return false;
}
// 处理叶节点遍历完需要跨叶
if (currentIndex >= currentLeafNode.getKeys().size()) {
currentLeafNode = currentLeafNode.getRightSibling().orElse(null);
currentIndex = 0;
}
return currentLeafNode != null && currentIndex < currentLeafNode.getKeys().size();
}
@Override
public RecordId next() {
// TODO(proj2): implement
if (!hasNext()) {
throw new NoSuchElementException();
}
RecordId rid = currentLeafNode.getRids().get(currentIndex);
currentIndex++;
return rid;
}
}
回去写一下那两个函数,通过传入构造函数的参数不同来实现:
首先是scanAll函数:
public Iterator<RecordId> scanAll() {
// TODO(proj4_integration): Update the following line
LockUtil.ensureSufficientLockHeld(lockContext, LockType.NL);
// TODO(proj2): Return a BPlusTreeIterator.
return new BPlusTreeIterator(root.getLeftmostLeaf(), 0);
}
然后是scanGreaterEqual函数:
public Iterator<RecordId> scanGreaterEqual(DataBox key) {
typecheck(key);
// TODO(proj4_integration): Update the following line
LockUtil.ensureSufficientLockHeld(lockContext, LockType.NL);
// TODO(proj2): Return a BPlusTreeIterator.
LeafNode startLeaf = root.get(key);
if (startLeaf == null) {
return Collections.emptyIterator();
}
int index = startLeaf.getKeys().indexOf(key);
return new BPlusTreeIterator(startLeaf, index);
}
运行之前没通过的测试:
成功,Task3结束🤩!
Task 4: Bulk Load
Much like the methods from the Task 2 you'll need to implement bulkLoad within all three of LeafNode, InnerNode, and BPlusTree. Since bulk loading is a mutating operation you will need to call sync(). Be sure to read the instructions in BPluNode::bulkLoad carefully to ensure you split your nodes properly. We've provided a visualization of bulk loading for an order 2 tree with fill factor 0.75 (powerpoint slides here).
After this, you should pass all the Project 2 tests we have provided to you (and any you add yourselves)! These are all the provided tests in database.index.*.
public void bulkLoad(Iterator<Pair<DataBox, RecordId>> data, float fillFactor) {
// TODO(proj4_integration): Update the following line
LockUtil.ensureSufficientLockHeld(lockContext, LockType.NL);
// TODO(proj2): implement
// Note: You should NOT update the root variable directly.
// Use the provided updateRoot() helper method to change
// the tree's root if the old root splits.
if (scanAll().hasNext()) {
throw new RuntimeException("Tree is not empty");
}
while (data.hasNext()) {
Optional<Pair<DataBox, Long>> result = root.bulkLoad(data, fillFactor);
if (result.isPresent()) {
Pair<DataBox, Long> pair = result.get();
List<DataBox> keys = new ArrayList<>();
List<Long> children = new ArrayList<>();
keys.add(pair.getFirst());
children.add(root.getPage().getPageNum());
children.add(pair.getSecond());
InnerNode newRoot = new InnerNode(metadata, bufferManager, keys, children, lockContext);
updateRoot(newRoot);
}
}
return;
}
"C:\Program Files\Java\jdk-22\bin\java.exe" "-javaagent:C:\Users\ASUS\AppData\Local\Programs\IntelliJ IDEA Ultimate 2024.3\lib\idea_rt.jar=61448" -Dfile.encoding=UTF-8 -Dsun.stdout.encoding=UTF-8 -Dsun.stderr.encoding=UTF-8 -classpath F:\cs186\cs186-sp25-rookiedb\target\classes edu.berkeley.cs186.database.cli.CommandLineInterface
\|/ ___------___
\__|--o______o--|
| berklee |
---______---
Welcome to RookieDB (v1.8.6-fa24)
=> SELECT * FROM Students AS s WHERE s.sid = 1;
sid | name | major | gpa
-----+-------------------+-----------+-----------
1 | Augustina Mazzoni | Chemistry | 1.0054202
(1 row)
=> CREATE INDEX on Students(sid);
CREATE INDEX ON Students (sid)
=> exit
exit
Bye!
进程已结束,退出代码为 0