
SpaDES event queue: leverage for more flexible use cases (heterogeneous projects, diverse module types) #202

eliotmcintire commented Jun 3, 2022

Writing here because this would be a clear SpaDES.core feature that would help manage complex projects.

The event queue, in my opinion, is very underutilized. We have a structure that tells R "what is next". Right now, SpaDES.core offers two options (sketched below):

  • Run the event
  • Run the event via local Cache (so run the event or pull from Cache)
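
In code terms, the current behaviour amounts to something like this (a minimal sketch, not the actual SpaDES.core internals; runEvent and eventFun are illustrative names):

runEvent <- function(sim, eventFun, useCache = FALSE) {
  if (isTRUE(useCache)) {
    reproducible::Cache(eventFun, sim)  # run, or retrieve an identical prior run from the local cache
  } else {
    eventFun(sim)  # just run the event
  }
}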

I believe that there are probably two (or more) other cases we could add that would massively help manage complex cases (see the sketch after this list):

  • Use cloudCache (i.e., 1) run the event, or 2) pull from a cloud-saved copy, or 3) run and push to the cloud)
  • Wait or stop if some condition is not met. Conditions could be:
    • Not enough resources (RAM or CPU) (e.g., a high-resource-demand event like fireSense_spreadFit)
    • A Cached object is not available (e.g., a single DataPrep module needs to be run somewhere first)
  • Spawn "output only" events in a future
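
Sketched as an extension of the dispatch above (illustrative only: enoughResources() is a stand-in for a resource check and the mode names are not existing API, though useCloud and cloudFolderID do exist in reproducible::Cache):

runEvent2 <- function(sim, eventFun, mode = "run", cloudFolderID = NULL, pollSeconds = 5) {
  switch(mode,
    run    = eventFun(sim),
    cache  = reproducible::Cache(eventFun, sim),
    cloud  = reproducible::Cache(eventFun, sim, useCloud = TRUE,
                                 cloudFolderID = cloudFolderID),  # run, pull from cloud copy, or run and push
    wait   = {
      while (!enoughResources(sim)) Sys.sleep(pollSeconds)  # wait until the condition is met
      eventFun(sim)
    },
    future = future::future(eventFun(sim))  # spawn an "output only" event in a future
  )
}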

So, take the WBI case: we have modules that:

  1. run only once, even in the case of replicated simulations (e.g., studyArea preparation modules)
  2. have a huge RAM or CPU footprint (e.g., fireSense_spreadFit)
  3. are output-only modules (e.g., caribouRSF)
  4. have outputs that are not needed by "the next" module, but are needed by some future module (e.g., the fireSense fitting series of 3 modules: fireSense_ignitionFit, fireSense_escapeFit, fireSense_spreadFit)
  5. take multiple module outputs, e.g., from replication (post-processing summary modules)

The way we have worked with these is to isolate each of these (alone or in small bundles of modules) into its own simInit and spades calls. This has the effect of forcing the user to really understand the use cases for each module before they can use it (e.g., right now nobody knows how much RAM and how many CPUs fireSense_spreadFit needs without talking to the developers). Similarly, a user doesn't know that some dataPrep modules are "once only per study area" and others are "once per stochastic replicate" without asking the developer.
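
For concreteness, that workaround looks roughly like this today (a sketch; "studyAreaPrep" and myPaths are placeholders, and the fireSense module is from the list above):

library(SpaDES.core)

# Bundle 1: study-area preparation, run once regardless of replicates
prepSim <- simInit(modules = list("studyAreaPrep"), paths = myPaths)
prepOut <- spades(prepSim)

# Bundle 2: the heavy fitting module, run on a machine with enough RAM/CPUs
fitSim <- simInit(modules = list("fireSense_spreadFit"),
                  objects = list(studyArea = prepOut$studyArea),
                  paths   = myPaths)
fitOut <- spades(fitSim)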

Possible solution:

  1. Module metadata gains new parameters about how "heavy" the module is (CPU and RAM), e.g.,

defineParameter(
    .minRAM  = 5 * object.size(sim$studyArea),  # or NULL
    .minCPUs = 100,  # exact semantics not settled yet; or NULL
    .cores   = c("localhost")  # character vector of IP addresses (may include "localhost") to pass to parallel::makePSOCKcluster(); or NULL
)
  2. SpaDES.core assesses the situation to determine whether .minRAM, .minCPUs, and .cores can all be satisfied; if yes, go ahead; if not, assess useCache to see what to do.
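
That assessment might look roughly like this (a minimal sketch; the ps package is one way to query available memory, and canRunHere() is illustrative, not existing SpaDES.core API):

canRunHere <- function(minRAM = NULL, minCPUs = NULL) {
  okRAM  <- is.null(minRAM) ||
    ps::ps_system_memory()$avail >= as.numeric(minRAM)  # both in bytes
  okCPUs <- is.null(minCPUs) || parallel::detectCores() >= minCPUs
  okRAM && okCPUs
}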

  3. The useCache parameter becomes richer in the options it can take, so that all the above situations can be handled, e.g., a named list:

defineParameter(
    .useCache = list(
        useCache = c(".inputObjects", "init"),  # as per usual
        cloudCache = list(init = GoogleID),  # a folder -- the actual qs or rds file with the cacheId will be put in or retrieved from this folder
        mode = "rw",  # read/write, read only, or write only -- same as `mode` in `?file`; defaults to "r" to protect the cloudCache
        waitIfNotInCache = 5  # if numeric, check the Cache every X seconds, waiting until the object is available before continuing (could be the cloudCache); if NULL, do as normal; if "fail" or "stop", then stop; etc.
    )
)
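
The waitIfNotInCache behaviour could reduce to a simple polling loop, e.g. (a sketch only; it assumes a recent reproducible where showCache() returns a data.table with a cacheId column, and a known cacheId to wait for):

waitForCache <- function(cacheId, cachePath, every = 5, timeout = 3600) {
  started <- Sys.time()
  repeat {
    # present in the cache (possibly written there by another machine)?
    if (cacheId %in% reproducible::showCache(cachePath)$cacheId)
      return(invisible(TRUE))
    if (difftime(Sys.time(), started, units = "secs") > timeout)
      stop("Timed out waiting for ", cacheId, " to appear in the Cache")
    Sys.sleep(every)  # poll every `every` seconds, as per waitIfNotInCache
  }
}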

With all this, I believe that one could simply run one simInitAndSpades call with "all XX" modules on any machine, and they would all just run whatever they can. If we tie this into experiment, then each unique spades run would do these assessments as well, and it would be fine also.
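
That is, something like this (module list purely illustrative; myPaths is a placeholder):

out <- SpaDES.core::simInitAndSpades(
  modules = list("studyAreaPrep", "fireSense_ignitionFit",
                 "fireSense_escapeFit", "fireSense_spreadFit", "caribouRSF"),
  paths = myPaths  # each event then runs, caches, waits, or defers per its own metadata
)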

I believe that most of this involves relatively light changes to SpaDES.core. All the infrastructure is basically in place.
