stats: add --cache-threshold autoindex creation/deletion logic #1809

Merged
merged 4 commits into from
May 10, 2024
stats: added additional mode to --cache-threshold
Added additional logic to remove the auto-created index and the stats cache file if the `--cache-threshold` arg is negative and ends with a 5.
jqnatividad committed May 10, 2024
commit 63fdc55828ec55bf7545c37bd56a4d537aa0cf71
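To make the new mode concrete, here is a minimal sketch of how a `--cache-threshold` value could be classified under the rules in this commit. `CacheMode` and `classify` are hypothetical names that do not appear in `stats.rs`, the `isize` type is an assumption based on the `unsigned_abs() as u64` cast in the diff, and positive values other than 1 are left opaque because this commit does not describe them.

```rust
/// Hypothetical summary of what a `--cache-threshold` value asks for.
/// This enum is illustrative only; it is not part of `stats.rs`.
#[derive(Debug, PartialEq)]
enum CacheMode {
    /// 0: suppress caching.
    SuppressCache,
    /// 1: force caching.
    ForceCache,
    /// Any other positive value; its meaning is not covered by this commit.
    Positive(isize),
    /// Negative: auto-create an index when the input exceeds `min_file_size` bytes.
    /// `delete_after_run` is true when the number ends with 5.
    AutoIndex { min_file_size: u64, delete_after_run: bool },
}

/// Classify a threshold value (assumed to be an `isize`).
fn classify(threshold: isize) -> CacheMode {
    match threshold {
        0 => CacheMode::SuppressCache,
        1 => CacheMode::ForceCache,
        t if t.is_negative() => CacheMode::AutoIndex {
            // unsigned_abs() avoids overflow on isize::MIN
            min_file_size: t.unsigned_abs() as u64,
            // Rust's % keeps the sign of the dividend, so -5005 % 10 == -5
            delete_after_run: t % 10 == -5,
        },
        t => CacheMode::Positive(t),
    }
}

fn main() {
    for t in [0, 1, 5000, -5000, -5005] {
        println!("{t:>6} -> {:?}", classify(t));
    }
}
```

Because Rust's `%` operator keeps the sign of the dividend, the check in the diff compares against `-5` rather than `5`.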
27 changes: 23 additions & 4 deletions src/cmd/stats.rs
@@ -145,8 +145,10 @@ stats options:
 Set to 0 to suppress caching.
 Set to 1 to force caching.
 Set to a negative number to automatically create an index
-when the input file size is greater than abs(arg) in bytes
-AND to force caching.
+when the input file size is greater than abs(arg) in bytes.
+If the negative number ends with 5, it will delete the index
+file and the stats cache file after the stats run. Otherwise,
+the index file and the cache files are kept.
 [default: 5000]

 Common options:
@@ -379,6 +381,7 @@ pub fn run(argv: &[&str]) -> CliResult<()> {

 let mut compute_stats = true;
 let mut create_cache = args.flag_cache_threshold > 0 || args.flag_stats_binout;
+let mut autoindex_set = false;

 let write_stats_binout = args.flag_stats_binout;

@@ -484,8 +487,9 @@ pub fn run(argv: &[&str]) -> CliResult<()> {

 // check if flag_cache_threshold is a negative number,
 // if so, set the autoindex_size to absolute of the number
-if args.flag_cache_threshold < 0 {
+if args.flag_cache_threshold.is_negative() {
     fconfig.autoindex_size = args.flag_cache_threshold.unsigned_abs() as u64;
+    autoindex_set = true;
 }

 // we need to count the number of records in the file to calculate sparsity
@@ -544,7 +548,8 @@ pub fn run(argv: &[&str]) -> CliResult<()> {
 }

 // ensure create_cache is also true if the user specified --cache-threshold 1
-create_cache = create_cache || args.flag_cache_threshold == 1 || args.flag_cache_threshold < 0;
+create_cache =
+    create_cache || args.flag_cache_threshold == 1 || args.flag_cache_threshold.is_negative();

 wtr.flush()?;

@@ -584,6 +589,20 @@ pub fn run(argv: &[&str]) -> CliResult<()> {
     fs::copy(currstats_filename.clone(), stats_pathbuf.clone())?;
 }

+if args.flag_cache_threshold.is_negative() && args.flag_cache_threshold % 10 == -5 {
+    // if the cache threshold is a negative number ending in 5,
+    // delete both the index file and the stats cache file
+    if autoindex_set {
+        let index_file = path.with_extension("csv.idx");
+        log::debug!("deleting index file: {}", index_file.display());
+        if std::fs::remove_file(index_file.clone()).is_err() {
+            // fails silently if it can't remove the index file
+            log::warn!("Could not remove index file: {}", index_file.display());
+        }
+    }
+    create_cache = false;
+}
+
 if !create_cache {
     // remove the stats cache file
     if fs::remove_file(stats_pathbuf.clone()).is_err() {
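The deletion branch derives the index path with `path.with_extension("csv.idx")`. As a rough sketch of what that standard-library call does: `Path::with_extension` replaces the final extension rather than appending to it, so it yields `data.csv.idx` for `data.csv` but also for `data.tsv`. The helper names below are hypothetical, and the appending variant is shown only for comparison; how the index file is actually named for non-`.csv` inputs is not shown in this diff.

```rust
use std::ffi::OsString;
use std::path::{Path, PathBuf};

/// Mirrors the call used in the diff: replaces the last extension with "csv.idx".
/// (Hypothetical helper name, for illustration only.)
fn idx_by_with_extension(input: &Path) -> PathBuf {
    input.with_extension("csv.idx")
}

/// Alternative for comparison: appends ".idx" to the full file name.
/// Whether this matches the real index naming scheme is an assumption.
fn idx_by_append(input: &Path) -> PathBuf {
    let mut name: OsString = input.as_os_str().to_os_string();
    name.push(".idx");
    PathBuf::from(name)
}

fn main() {
    for name in ["data.csv", "data.tsv"] {
        let p = Path::new(name);
        println!(
            "{name}: with_extension -> {}, append -> {}",
            idx_by_with_extension(p).display(),
            idx_by_append(p).display()
        );
    }
}
```

For a `.csv` input the two approaches agree; they differ only when the input has another extension or none at all.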