event cleanup fix + update dependencies and install script #3

Merged
merged 8 commits into from
Mar 18, 2016

EmergentOrder

No description provided.

  val (appId, channelId) = Common.appNameToId(appName, None)
- PEventStore.wipe(result, appId, channelId)(sc)
+ wipe(result.collect.toSet, originalEvents.collect.toSet, appId, channelId)
+ //PEventStore.wipe(result, appId, channelId)(sc)
  result

Why do we need cleanedPEvents if the DB has been modified? Just a convenience? If we don't actually need the returned value should we rename to "clean"?

Author

Yes, just a convenience. Agree there is a command/query separation problem going on here. There is already another method with that name, though. I was going for minimal changes to Max's work here; we can refactor separately.

We should probably change the name now to something that reads like a command, because we'll have to live with it for a while, even in non-UR templates.

Author

k, will do
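
To make the naming concern in this thread concrete, here is a minimal sketch of command/query separation. It uses a hypothetical in-memory store instead of the real event store, and every name in it (`InMemoryEventStore`, `cleanedEvents`, `cleanPersistedEvents`, the `cutoff` parameter) is an illustrative assumption, not the code in this PR.

```scala
// Hypothetical in-memory stand-in for the event store; not the PredictionIO API.
final case class Event(entityId: String, name: String, eventTime: Long)

class InMemoryEventStore(initial: Seq[Event]) {
  private var events: Vector[Event] = initial.toVector

  // Query: reads state and never modifies it.
  def find(): Seq[Event] = events

  // Command: replaces the stored events and returns nothing.
  def replaceAll(cleaned: Seq[Event]): Unit = { events = cleaned.toVector }
}

object CleanupSketch {
  // Query: computes the cleaned view without touching the store.
  def cleanedEvents(store: InMemoryEventStore, cutoff: Long): Seq[Event] =
    store.find().filter(_.eventTime >= cutoff)

  // Command: persists the cleaned view; the Unit return type signals a side effect.
  def cleanPersistedEvents(store: InMemoryEventStore, cutoff: Long): Unit =
    store.replaceAll(cleanedEvents(store, cutoff))
}
```

With this split, a noun-like name returns data and a verb-like name changes the database, so a caller can tell from the name and return type alone whether the stored events are modified.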

pferrel commented Mar 17, 2016

I missed how this reads the params from engine.json that control:

  1. time-window
  2. deduplication
  3. $set compaction
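
As a rough illustration of how such switches could drive the cleanup, the sketch below assumes the settings arrive as an optional `EventWindow` value (the diff in this PR matches on `Some(ew)` and reads `ew.removeDuplicates` and `ew.compressProperties`). The `durationMillis` field, the simplified `Ev` type, and the duplicate rule are hypothetical, and property compaction is left out of this sketch.

```scala
// Assumed shape of the cleanup settings. Only removeDuplicates and
// compressProperties appear in this PR's diff; durationMillis is hypothetical.
final case class EventWindow(
  durationMillis: Option[Long] = None,
  removeDuplicates: Boolean = false,
  compressProperties: Boolean = false // not exercised in this sketch
)

// Simplified event; the real event type carries more fields.
final case class Ev(entityId: String, name: String, time: Long)

object WindowSketch {
  // Keep only events newer than (now - duration); no duration means keep all.
  def inWindow(events: Seq[Ev], ew: EventWindow, now: Long): Seq[Ev] =
    ew.durationMillis.fold(events)(d => events.filter(_.time >= now - d))

  // Assumed rule: same entity and event name means duplicate; keep the first seen.
  def dedup(events: Seq[Ev]): Seq[Ev] = {
    val (kept, _) =
      events.foldLeft((Vector.empty[Ev], Set.empty[(String, String)])) {
        case ((acc, seen), e) =>
          val key = (e.entityId, e.name)
          if (seen(key)) (acc, seen) else (acc :+ e, seen + key)
      }
    kept
  }

  // No EventWindow configured means no cleanup at all.
  def clean(events: Seq[Ev], window: Option[EventWindow], now: Long): Seq[Ev] =
    window match {
      case None => events
      case Some(ew) =>
        val recent = inWindow(events, ew, now)
        if (ew.removeDuplicates) dedup(recent) else recent
    }
}
```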

pferrel commented Mar 17, 2016

Do we include $set compaction? Can you point to that?

@@ -131,8 +177,9 @@ trait CleanedDataSource {
case Some(ew) =>
var updated =
if (ew.compressProperties) compressPProperties(sc, rdd) else rdd
//if (ew.removeDuplicates) removePDuplicates(updated)

oh, this is the $set compression?

Author

yeah, methods: compressPProperties and compressLProperties
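
For readers unfamiliar with the idea, here is a minimal sketch of what `$set` compaction could look like on plain collections: all `$set` events for one entity are folded into a single event whose property map keeps the latest value per key. This is an assumption about the general technique, not a copy of `compressPProperties` or `compressLProperties`; the `SetEvent` type and `compress` function are hypothetical.

```scala
// Hypothetical, simplified $set event: one entity, a property map, a timestamp.
final case class SetEvent(entityId: String, properties: Map[String, String], eventTime: Long)

object CompactionSketch {
  // Merge all $set events per entity into one; later events override earlier keys.
  def compress(events: Seq[SetEvent]): Seq[SetEvent] =
    events
      .groupBy(_.entityId)
      .toSeq
      .map { case (id, evs) =>
        val ordered = evs.sortBy(_.eventTime)
        val merged = ordered.foldLeft(Map.empty[String, String])((acc, e) => acc ++ e.properties)
        // The compacted event carries the time of the most recent $set.
        SetEvent(id, merged, ordered.last.eventTime)
      }
      .sortBy(_.entityId) // deterministic output order for the sketch
}
```

Two `$set` events for the same item, one setting `color` and `size` and a later one changing only `color`, collapse into one event with the new `color` and the original `size`.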

@@ -29,7 +29,8 @@ class DataSource(val dsp: DataSourceParams)

  override
  def readTraining(sc: SparkContext): TrainingData = {
-   val eventsDb = cleanedPEvents(sc)
+   cleanAndPersistPEvents(sc)
+   val eventsDb = Storage.getPEvents()

So this is all I do in the UR datasource?

If it's more efficient, can't we still return the events from cleanAndPersistPEvents instead of dropping them and then reading them again with Storage.getPEvents()?
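
The trade-off raised here can be sketched with a hypothetical store that counts reads. Nothing below is the real storage API; `CountingStore`, `cleanThenReread`, and `cleanAndReturn` are made-up names, and whether the second variant is actually cheaper on Spark depends on caching, which this thread does not settle.

```scala
// Hypothetical record and store; the store counts how often it is read.
final case class Rec(id: String, time: Long)

class CountingStore(initial: Seq[Rec]) {
  private var data: Vector[Rec] = initial.toVector
  var reads: Int = 0
  def read(): Seq[Rec] = { reads += 1; data }
  def write(rs: Seq[Rec]): Unit = { data = rs.toVector }
}

object ReadTrainingSketch {
  // Shape of the diff: persist the cleaned events, then read them back.
  def cleanThenReread(store: CountingStore, cutoff: Long): Seq[Rec] = {
    store.write(store.read().filter(_.time >= cutoff))
    store.read()
  }

  // Shape of the suggestion: persist and hand back what was just written.
  def cleanAndReturn(store: CountingStore, cutoff: Long): Seq[Rec] = {
    val cleaned = store.read().filter(_.time >= cutoff)
    store.write(cleaned)
    cleaned
  }
}
```

Both variants return the same events; the second one saves a read of the store at the cost of a method that both writes and returns data.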

pferrel added a commit that referenced this pull request Mar 18, 2016
event cleanup fix + update dependencies and install script
@pferrel pferrel merged commit b2c8816 into actionml:v0.9.6 Mar 18, 2016

pferrel commented Mar 18, 2016

Merging to test with the UR; it's the only way I can get at your repo, @EmergentOrder.

EmergentOrder added a commit that referenced this pull request Mar 28, 2016
wip-start-stop-all-pgsql. added check if pgsql is started.