
Github Wiki migration #79

Merged
merged 24 commits on Aug 22, 2018
Changes from 1 commit
updates
tovbinm committed Aug 22, 2018
commit 220a6cf1c283e1a2413528a066e3481ffd95d2d3
2 changes: 1 addition & 1 deletion docs/developer-guide/index.md
@@ -801,7 +801,7 @@ We provide utility functions to simplify working with Metadata in [com.salesforc

DataReaders define how data should be loaded into the workflow. They load and process raw data to produce the DataFrame used by the workflow. DataReaders are tied to a specific data source and to the type of the raw data they load (for example, an Avro schema or a case class describing the columns of a CSV).

There are three types of DataReaders. [Simple DataReaders](/Developer-Guide/#datareaders) just load the data and return a DataFrame with one row for each row of data read. [Aggregate DataReaders](/Developer-Guide#aggregate-data-readers) will group the data by the entity (the thing you are scoring) key and combine values (with or without time filters) based on the aggregation function associated with each feature definition. For example aggregate readers can be used to compute features like total spend from a list of transactions. [Conditional DataReaders](Developer-Guide/#conditional-data-readers) are like aggregate readers but they allow an daynamic time cuttoff for each row that depends on fullfilment of a user defined condition. For example conditional readers can be used to compute features like total spend before a user becomes a member. These readers can be combined to [join](/examples/Time-Series-Aggregates-and-Joins.html) multiple datasources.
There are three types of DataReaders. [Simple DataReaders](/Developer-Guide/#datareaders) just load the data and return a DataFrame with one row for each row of data read. [Aggregate DataReaders](/Developer-Guide#aggregate-data-readers) group the data by the entity key (the thing you are scoring) and combine values (with or without time filters) based on the aggregation function associated with each feature definition. For example, aggregate readers can be used to compute features like total spend from a list of transactions. [Conditional DataReaders](/Developer-Guide/#conditional-data-readers) are like aggregate readers, but they allow a dynamic time cutoff for each row that depends on fulfillment of a user-defined condition. For example, conditional readers can be used to compute features like total spend before a user becomes a member. These readers can be combined to [join](/examples/Time-Series-Aggregates-and-Joins.html) multiple data sources.

A constructor object provides shortcuts for defining the most commonly used data readers. Defining a data reader requires specifying the type of the data being read and the key for the data (the entity being scored).
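The difference between the three reader styles can be sketched with plain Scala collections. This is a toy illustration only, not the TransmogrifAI `DataReaders` API: the `Txn` record, its field names, and the helper functions below are all made up for the example.

```scala
// Toy sketch of the three reader styles (hypothetical types, not the real API).
// Raw data: one transaction per row, keyed by the entity being scored (userId).
case class Txn(userId: String, amount: Double, time: Long)

object ReaderSketch {
  // Simple: one output row per raw row read.
  def simple(raw: Seq[Txn]): Seq[Txn] = raw

  // Aggregate: group by the entity key and combine values,
  // e.g. total spend per user across all transactions.
  def aggregateSpend(raw: Seq[Txn]): Map[String, Double] =
    raw.groupBy(_.userId).map { case (k, txns) =>
      k -> txns.map(_.amount).sum
    }

  // Conditional: aggregate only rows before a per-entity cutoff time,
  // e.g. total spend before each user became a member.
  def conditionalSpend(raw: Seq[Txn], memberSince: Map[String, Long]): Map[String, Double] =
    raw.groupBy(_.userId).map { case (k, txns) =>
      val cutoff = memberSince.getOrElse(k, Long.MaxValue) // no membership: keep all rows
      k -> txns.filter(_.time < cutoff).map(_.amount).sum
    }
}
```

In the real library these behaviors are driven by the entity key, the time filters, and the aggregation functions attached to each feature definition, rather than hand-written grouping code.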
