
Decoupling local cache function and cache algorithm #38048

Merged
fix
KinderRiven committed Aug 10, 2022
commit d2b5581632e8025a605d7832212d689a4fd85266
2 changes: 1 addition & 1 deletion src/Common/FileCache.cpp
@@ -773,7 +773,7 @@ bool FileCache::tryReserveForMainList(

for (auto it = main_priority->getLowestPriorityReadIterator(cache_lock); it->valid(); it->next())
{
-        auto entry_key = it->key();
+        const auto & entry_key = it->key();
auto entry_offset = it->offset();

if (!is_overflow())
1 change: 0 additions & 1 deletion src/Common/IFileCachePriority.h
@@ -24,7 +24,6 @@ class IFileCachePriority
friend class FileCache;

using Key = FileCacheKey;

using ReadIterator = std::shared_ptr<const IIterator>;
using WriteIterator = std::shared_ptr<IIterator>;
@kssenii (Member) commented on Aug 10, 2022:
Maybe unique_ptr? Or is it shared somewhere?

@KinderRiven (Contributor, Author) replied:
ReadIterator can be a unique_ptr. But for WriteIterator I am not sure it should be a unique_ptr, because it may be copied together with the FileSegment. Do we need to invalidate the iterator before copying?

@kssenii (Member) replied:
> because it may be copied with the FileSegment, do we need to invalidate the iterator before copying?

But a FileSegment cannot be copied.

@KinderRiven (Contributor, Author) replied:
It looks that way. I tried to modify it, but sometimes we need a temporary pointer to keep the code readable, as in the case below, so I think it may be better to keep it as it is. What do you think?

auto priority_iter = record->second;
priority_iter->use(cache_lock);
result_state = priority_iter->hits() >= enable_cache_hits_threshold
    ? FileSegment::State::EMPTY
    : FileSegment::State::SKIP_CACHE;

@kssenii (Member) commented on Aug 10, 2022:
But it looks like auto & is possible here, and then unique_ptr will be OK.

@KinderRiven (Contributor, Author) replied:

I tried to fix it, but I always get a compile error involving unordered_map. Maybe there is something wrong with my understanding of unique_ptr :(

@kssenii (Member) replied:
Maybe not use a pointer at all, just an object? (And in FileSegmentCell, hold the iterator in a std::optional.)

@KinderRiven (Contributor, Author) replied:

Sounds feasible. Do you mean making all ReadIterators and WriteIterators plain objects? That would require more extensive revisions, and I may not have much time for them soon (there is a lot of work to deal with in reality), so I will revisit the implementation when I have time :).

@kssenii (Member) replied:

> do you mean all ReadIterator and WriteIterator as objects

Yes.

> but I may not have much time to make revisions recently (there is a lot of work to deal with in reality), and I will reconsider the relevant implementation when I have time

OK, let's leave it as it is for now.


62 changes: 62 additions & 0 deletions src/Common/LRUFileCachePriority.cpp
@@ -0,0 +1,62 @@
#include <Common/LRUFileCachePriority.h>

namespace DB
{

namespace ErrorCodes
{
extern const int LOGICAL_ERROR;
}

IFileCachePriority::WriteIterator
LRUFileCachePriority::add(const Key & key, size_t offset, size_t size, std::lock_guard<std::mutex> &)
{
#ifndef NDEBUG
for (const auto & entry : queue)
{
if (entry.key == key && entry.offset == offset)
throw Exception(
ErrorCodes::LOGICAL_ERROR,
"Attempt to add duplicate queue entry to queue. (Key: {}, offset: {}, size: {})",
entry.key.toString(),
entry.offset,
entry.size);
}
#endif
auto iter = queue.insert(queue.end(), FileCacheRecord(key, offset, size));
cache_size += size;
return std::make_shared<LRUFileCacheIterator>(this, iter);
}

bool LRUFileCachePriority::contains(const Key & key, size_t offset, std::lock_guard<std::mutex> &)
{
for (const auto & record : queue)
{
if (key == record.key && offset == record.offset)
return true;
}
return false;
}

void LRUFileCachePriority::removeAll(std::lock_guard<std::mutex> &)
{
queue.clear();
cache_size = 0;
}

IFileCachePriority::ReadIterator LRUFileCachePriority::getLowestPriorityReadIterator(std::lock_guard<std::mutex> &)
{
return std::make_shared<const LRUFileCacheIterator>(this, queue.begin());
}

IFileCachePriority::WriteIterator LRUFileCachePriority::getLowestPriorityWriteIterator(std::lock_guard<std::mutex> &)
{
return std::make_shared<LRUFileCacheIterator>(this, queue.begin());
}

size_t LRUFileCachePriority::getElementsNum(std::lock_guard<std::mutex> &) const
{
return queue.size();
}

};
118 changes: 38 additions & 80 deletions src/Common/LRUFileCachePriority.h
@@ -5,119 +5,77 @@
Resulting LRUFileCachePriority.h after this commit (the scraped side-by-side diff is reconstructed here as the new version of the file):

namespace DB
{

/// Based on the LRU algorithm implementation, the record with the lowest priority is stored at
/// the head of the queue, and the record with the highest priority is stored at the tail.
class LRUFileCachePriority : public IFileCachePriority
{
private:
    using LRUQueue = std::list<FileCacheRecord>;
    using LRUQueueIterator = typename LRUQueue::iterator;
    class LRUFileCacheIterator;

public:
    LRUFileCachePriority() = default;

    WriteIterator add(const Key & key, size_t offset, size_t size, std::lock_guard<std::mutex> &) override;

    bool contains(const Key & key, size_t offset, std::lock_guard<std::mutex> &) override;

    void removeAll(std::lock_guard<std::mutex> &) override;

    ReadIterator getLowestPriorityReadIterator(std::lock_guard<std::mutex> &) override;

    WriteIterator getLowestPriorityWriteIterator(std::lock_guard<std::mutex> &) override;

    size_t getElementsNum(std::lock_guard<std::mutex> &) const override;

private:
    LRUQueue queue;
};

class LRUFileCachePriority::LRUFileCacheIterator : public IFileCachePriority::IIterator
{
public:
    LRUFileCacheIterator(LRUFileCachePriority * file_cache_, LRUFileCachePriority::LRUQueueIterator queue_iter_)
        : file_cache(file_cache_), queue_iter(queue_iter_)
    {
    }

    void next() const override { queue_iter++; }

    bool valid() const override { return queue_iter != file_cache->queue.end(); }

    const Key & key() const override { return queue_iter->key; }

    size_t offset() const override { return queue_iter->offset; }

    size_t size() const override { return queue_iter->size; }

    size_t hits() const override { return queue_iter->hits; }

    void remove(std::lock_guard<std::mutex> &) override
    {
        auto remove_iter = queue_iter;
        queue_iter++;
        file_cache->cache_size -= remove_iter->size;
        file_cache->queue.erase(remove_iter);
    }

    void incrementSize(size_t size_increment, std::lock_guard<std::mutex> &) override
    {
        file_cache->cache_size += size_increment;
        queue_iter->size += size_increment;
    }

    void use(std::lock_guard<std::mutex> &) override
    {
        queue_iter->hits++;
        file_cache->queue.splice(file_cache->queue.end(), file_cache->queue, queue_iter);
    }

private:
    mutable LRUFileCachePriority * file_cache;
    mutable LRUFileCachePriority::LRUQueueIterator queue_iter;
};

}

KinderRiven marked this conversation as resolved.