-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
event cleanup fix + update dependencies and install script #3
Conversation
Fix test, fix travis yml, update dependencies, Fix SQL default issue, merge from pferrel v0.9.6
val (appId, channelId) = Common.appNameToId(appName, None) | ||
PEventStore.wipe(result, appId, channelId)(sc) | ||
wipe(result.collect.toSet, originalEvents.collect.toSet, appId, channelId) | ||
//PEventStore.wipe(result, appId, channelId)(sc) | ||
result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need cleanedPEvents if the DB has been modified? Just a convenience? If we don't actually need the returned value should we rename to "clean"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, just a convenience . agree there is a command/query separation problem going on here. there is already another method with that name though. Was going for minimal changes to Max's work here, can refactor separately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should probably change the name now to something like a command because we'll have to live with it for awhile and for non-UR templates even
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k, will do
Missed how this reads params from engine.json that control
|
Do we include $set compaction? Can you point to that? |
@@ -131,8 +177,9 @@ trait CleanedDataSource { | |||
case Some(ew) => | |||
var updated = | |||
if (ew.compressProperties) compressPProperties(sc, rdd) else rdd | |||
//if (ew.removeDuplicates) removePDuplicates(updated) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, this is the $set compression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, methods: compressPProperties and compressLProperties
@@ -29,7 +29,8 @@ class DataSource(val dsp: DataSourceParams) | |||
|
|||
override | |||
def readTraining(sc: SparkContext): TrainingData = { | |||
val eventsDb = cleanedPEvents(sc) | |||
cleanAndPersistPEvents(sc) | |||
val eventsDb = Storage.getPEvents() | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is all I do in the UR datasource?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's more efficient can't we still return the events from cleanAndPersistPEvents, instead of dropping them then reading them again with Storage.getPEvents()?
event cleanup fix + update dependencies and install script
merging to test with the UR, only way I can get at your repo @EmergentOrder |
wip-start-stop-all-pgsql. added check if pgsql is started.
No description provided.