diff --git a/docs/Motivation.md b/docs/Motivation.md deleted file mode 100644 index d46e64e592..0000000000 --- a/docs/Motivation.md +++ /dev/null @@ -1,3 +0,0 @@ -_Building real life machine learning applications needs a fair amount of tribal knowledge and intuition. Coupled with the explosion of ML use cases in the world that need to be addressed, there is a need for tools that enable rapid prototyping and development of machine learning pipelines. We believe that automation is the key to making machine learning development truly scalable and accessible._ - -Read out blog post link: diff --git a/docs/_Footer.md b/docs/_Footer.md deleted file mode 100644 index a1761c11e1..0000000000 --- a/docs/_Footer.md +++ /dev/null @@ -1 +0,0 @@ -[BSD 3-Clause](License) © Salesforce.com, Inc. diff --git a/docs/_Sidebar.md b/docs/_Sidebar.md deleted file mode 100644 index 0f347e14a6..0000000000 --- a/docs/_Sidebar.md +++ /dev/null @@ -1,70 +0,0 @@ -* [About](Home) -* [Installation](Installation) -* QuickStart Examples - * [Titanic Binary Classification](Example%3A-Titanic) - * [Iris MultiClass Classification](Iris-MultiClass-Classification) - * [Boston Regression](Boston-Regression) - * [Aggregations and Joins](Example%3A-Time-Series-Aggregates-and-Joins) - * [Conditional Aggregation](Example%3A-Conditional-Aggregation) -* Alternative Ways of Running - * [Running from spark-shell](Running-TransmogrifAI-from-spark-shell) - * [Optional: Bootstrap Your First Project](Bootstrap-Your-First-Project) -* [Abstractions](Abstractions) - * [Features](Abstractions#features) - * [Stages](Abstractions#stages) - * [Transformers](Abstractions#transformers) - * [Estimators](Abstractions#estimators) - * [Workflows and Readers](Abstractions#workflows-and-readers) -* [AutoML Capabilities](AutoML-Capabilities) - * [Transmogrifier](AutoML-Capabilities#vectorizers-and-transmogrification) - * [Feature Validation](AutoML-Capabilities#feature-validation) - * [ModelSelectors](AutoML-Capabilities#modelselectors) -* [FAQ](FAQ) -* [Talks](Talks) -* [Contributing](Contributing) -* [Developer Guide](Developer-Guide) - * [Features](Developer-Guide#features) - * [Type Hierarchy and Automatic Feature Engineering](Developer-Guide#type-hierarchy-and-automatic-feature-engineering) - * [Feature Creation](Developer-Guide#feature-creation) - * [FeatureBuilders](Developer-Guide#featurebuilders) - * [Stages](Developer-Guide#stages) - * [Transformers](Developer-Guide#transformers) - * [TransmogrifAI Transformers](Developer-Guide#transmogrifai-transformers) - * [Writing your own transformer](Developer-Guide#writing-your-own-transformer) - * [Wrapping a SparkML transformer](Developer-Guide#wrapping-a-sparkml-transformer) - * [Wrapping a non serializable external library](Developer-Guide#wrapping-a-non-serializable-external-library) - * [Estimators](Developer-Guide#estimators) - * [TransmogrifAI Estimators](Developer-Guide#transmogrifai-estimators) - * [Writing your own estimator](Developer-Guide#writing-your-own-estimator) - * [Wrapping a SparkML estimator](Developer-Guide#wrapping-a-sparkml-estimator) - * [Creating Shortcuts for Transformers and Estimators](Developer-Guide#creating-shortcuts-for-transformers-and-estimators) - * [Shortcuts Naming Convention](Developer-Guide#shortcuts-naming-convention) - * [Customizing AutoML Stages](Developer-Guide#customizing-automl-stages) - * [Transmogrification](Developer-Guide#transmogrification) - * [SanityChecker](Developer-Guide#sanitychecker) - * [RawFeatureFilter](Developer-Guide#rawfeaturefilter) - * [Model Selector](Developer-Guide#modelselector) - * [Interoperability with SparkML](Developer-Guide#interoperability-with-sparkml) - * [Workflows](Developer-Guide#workflows) - * [Creating A Workflow](Developer-Guide#creating-a-workflow) - * [Fitting a Workflow](Developer-Guide#fitting-a-workflow) - * [Fitted Workflows](Developer-Guide#fitted-workflows) - * [Saving Workflows](Developer-Guide#saving-workflows) - * [Loading saved Workflows](Developer-Guide#loading-saved-workflows) - * [Removing problematic features](Developer-Guide#removing-problematic-features) - * [Extracting ModelInsights from a Fitted Workflow](Developer-Guide#extracting-modelinsights-from-a-fitted-workflow) - * [Extracting a Particular Stage from a Fitted Workflow](Developer-Guide#extracting-a-particular-stage-from-a-fitted-workflow) - * [Adding new features to a fitted workflow](Developer-Guide#adding-new-features-to-a-fitted-workflow) - * [Metadata](Developer-Guide#metadata) - * [DataReaders](Developer-Guide#datareaders) - * [Aggregate Data Readers](Developer-Guide#aggregate-data-readers) - * [Conditional Data Readers](Developer-Guide#conditional-data-readers) - * [Joined Data Readers](Developer-Guide#joined-data-readers) - * [Evaluators](Developer-Guide#evaluators) - * [Evaluators Factory](Developer-Guide#evaluators-factory) - * [Single Evaluation](Developer-Guide#single-evaluation) - * [Multiple Evaluation](Developer-Guide#multiple-evaluation) - * [Creating a custom evaluator](Developer-Guide#creating-a-custom-evaluator) - * [TransmogrifAI App and Runner](Developer-Guide#transmogrifai-app-and-runner) - * [Parameter Injection Into Workflows and Workflow Runners](Developer-Guide#parameter-injection-into-workflows-and-workflow-runners) - diff --git a/docs/Abstractions.md b/docs/abstractions/index.md similarity index 99% rename from docs/Abstractions.md rename to docs/abstractions/index.md index db3c311b7b..781e9fcfde 100644 --- a/docs/Abstractions.md +++ b/docs/abstractions/index.md @@ -1,3 +1,5 @@ +# Abstractions + TransmogrifAI is designed to simplify the creation of machine learning workflows. To this end we have created an abstraction for creating and running machine learning workflows. The abstraction is made up of Features, Stages, Workflows and Readers which interact as shown in the diagram below. ![TransmogrifAI Abstractions](https://github.com/salesforce/TransmogrifAI/raw/master/resources/AbstractionDiagram-cropped.png) diff --git a/docs/AutoML-Capabilities.md b/docs/automl-capabilities/index.md similarity index 99% rename from docs/AutoML-Capabilities.md rename to docs/automl-capabilities/index.md index ed76a41cbd..24e990b521 100644 --- a/docs/AutoML-Capabilities.md +++ b/docs/automl-capabilities/index.md @@ -1,3 +1,4 @@ +# AutoML Capabilities ## Vectorizers and Transmogrification diff --git a/docs/conf.py b/docs/conf.py index 57aa456cbe..d5f67524ab 100644 --- a/docs/conf.py +++ b/docs/conf.py @@ -169,4 +169,4 @@ (master_doc, 'TransmogrifAI', 'TransmogrifAI Documentation', author, 'TransmogrifAI', 'One line description of project.', 'Miscellaneous'), -] \ No newline at end of file +] diff --git a/docs/Contributing.md b/docs/contributing/index.md similarity index 95% rename from docs/Contributing.md rename to docs/contributing/index.md index 37bccd760e..94f8964d69 100644 --- a/docs/Contributing.md +++ b/docs/contributing/index.md @@ -1,10 +1,12 @@ +# Contributing + This page lists recommendations and requirements for how to best contribute to TransmogrifAI. We strive to obey these as best as possible. As always, thanks for contributing – we hope these guidelines make it easier and shed some light on our approach and processes. -# Issues, requests & ideas +## Issues, requests & ideas Use GitHub [Issues](https://github.com/salesforce/TransmogrifAI/issues) page to submit issues, enhancement requests and discuss ideas. -# Contributing +## Contributing 1. **Ensure the bug/feature was not already reported** by searching on GitHub under [Issues](https://github.com/salesforce/TransmogrifAI/issues). If none exists, create a new issue so that other contributors can keep track of what you are trying to add/fix and offer suggestions (or let you know if there is already an effort in progress). 3. **Clone** the forked repo to your machine. @@ -14,7 +16,7 @@ Use GitHub [Issues](https://github.com/salesforce/TransmogrifAI/issues) page to > **NOTE**: Be sure to [sync your fork](https://help.github.com/articles/syncing-a-fork/) before making a pull request. -# Contribution Checklist +## Contribution Checklist - [x] Clean, simple, well styled code - [x] Comments @@ -28,8 +30,8 @@ Use GitHub [Issues](https://github.com/salesforce/TransmogrifAI/issues) page to - Minimize number of dependencies. - Prefer BSD, Apache 2.0, MIT, ISC and MPL licenses. -# Code of Conduct +## Code of Conduct Follow the [Apache Code of Conduct](https://www.apache.org/foundation/policies/conduct.html). -# License +## License By contributing your code, you agree to license your contribution under the terms of the [BSD 3-Clause](License). diff --git a/docs/Developer-Guide.md b/docs/developer-guide/index.md similarity index 95% rename from docs/Developer-Guide.md rename to docs/developer-guide/index.md index e7959e59a2..f98dc010a8 100644 --- a/docs/Developer-Guide.md +++ b/docs/developer-guide/index.md @@ -1,49 +1,4 @@ -## Table Of Contents -* [Features](Developer-Guide#features) - * [Type Hierarchy and Automatic Feature Engineering](Developer-Guide#type-hierarchy-and-automatic-feature-engineering) - * [Feature Creation](Developer-Guide#feature-creation) - * [FeatureBuilders](Developer-Guide#featurebuilders) -* [Stages](Developer-Guide#stages) -* [Transformers](Developer-Guide#transformers) - * [TransmogrifAI Transformers](Developer-Guide#transmogrifai-transformers) - * [Writing your own transformer](Developer-Guide#writing-your-own-transformer) - * [Wrapping a SparkML transformer](Developer-Guide#wrapping-a-sparkml-transformer) - * [Wrapping a non serializable external library](Developer-Guide#wrapping-a-non-serializable-external-library) -* [Estimators](Developer-Guide#estimators) - * [TransmogrifAI Estimators](Developer-Guide#transmogrifai-estimators) - * [Writing your own estimator](Developer-Guide#writing-your-own-estimator) - * [Wrapping a SparkML estimator](Developer-Guide#wrapping-a-sparkml-estimator) -* [Creating Shortcuts for Transformers and Estimators](Developer-Guide#creating-shortcuts-for-transformers-and-estimators) - * [Shortcuts Naming Convention](Developer-Guide#shortcuts-naming-convention) -* [Customizing AutoML Stages](Developer-Guide#customizing-automl-stages) - * [Transmogrification](Developer-Guide#transmogrification) - * [SanityChecker](Developer-Guide#sanitychecker) - * [RawFeatureFilter](Developer-Guide#rawfeaturefilter) - * [Model Selector](Developer-Guide#modelselector) -* [Interoperability with SparkML](Developer-Guide#interoperability-with-sparkml) -* [Workflows](Developer-Guide#workflows) - * [Creating A Workflow](Developer-Guide#creating-a-workflow) - * [Fitting a Workflow](Developer-Guide#fitting-a-workflow) - * [Fitted Workflows](Developer-Guide#fitted-workflows) - * [Saving Workflows](Developer-Guide#saving-workflows) - * [Loading saved Workflows](Developer-Guide#loading-saved-workflows) - * [Removing problematic features](Developer-Guide#removing-problematic-features) - * [Extracting ModelInsights from a Fitted Workflow](Developer-Guide#extracting-modelinsights-from-a-fitted-workflow) - * [Extracting a Particular Stage from a Fitted Workflow](Developer-Guide#extracting-a-particular-stage-from-a-fitted-workflow) - * [Adding new features to a fitted workflow](Developer-Guide#adding-new-features-to-a-fitted-workflow) -* [Metadata](Developer-Guide#metadata) -* [DataReaders](Developer-Guide#datareaders) - * [Aggregate Data Readers](Developer-Guide#aggregate-data-readers) - * [Conditional Data Readers](Developer-Guide#conditional-data-readers) - * [Joined Data Readers](Developer-Guide#joined-data-readers) -* [Evaluators](Developer-Guide#evaluators) - * [Evaluators Factory](Developer-Guide#evaluators-factory) - * [Single Evaluation](Developer-Guide#single-evaluation) - * [Multiple Evaluation](Developer-Guide#multiple-evaluation) - * [Creating a custom evaluator](Developer-Guide#creating-a-custom-evaluator) -* [TransmogrifAI App and Runner](Developer-Guide#transmogrifai-app-and-runner) -* [Parameter Injection Into Workflows and Workflow Runners](Developer-Guide#parameter-injection-into-workflows-and-workflow-runners) - +# Developer Guide ## Features @@ -1083,3 +1038,7 @@ Here we are resetting the “TopK” parameter of a stage with class name “MyT *** + + +.. toctree:: + :maxdepth: 2 \ No newline at end of file diff --git a/docs/Bootstrap-Your-First-Project.md b/docs/examples/Bootstrap-Your-First-Project.md similarity index 99% rename from docs/Bootstrap-Your-First-Project.md rename to docs/examples/Bootstrap-Your-First-Project.md index 40809b4ec3..7d4c86a854 100644 --- a/docs/Bootstrap-Your-First-Project.md +++ b/docs/examples/Bootstrap-Your-First-Project.md @@ -1,3 +1,5 @@ +# Boostrap Your First Project + We provide a convenient way to bootstrap you first project with TransmogrifAI using the TransmogrifAI CLI. As an illustration, let's generate a binary classification model with the Titanic passenger data. diff --git a/docs/Boston-Regression.md b/docs/examples/Boston-Regression.md similarity index 99% rename from docs/Boston-Regression.md rename to docs/examples/Boston-Regression.md index 3d33e8586b..902290e732 100644 --- a/docs/Boston-Regression.md +++ b/docs/examples/Boston-Regression.md @@ -1,3 +1,5 @@ +# Boston Regression + The following code illustrates how TransmogrifAI can be used to do linear regression. We use Boston dataset to predict housing prices. The code for this example can be found [here](https://github.com/salesforce/TransmogrifAI/tree/master/helloworld/src/main/scala/com/salesforce/hw/boston), and the data over [here](https://github.com/salesforce/op/tree/master/helloworld/src/main/resources/BostonDataset). diff --git a/docs/Example:-Conditional-Aggregation.md b/docs/examples/Conditional-Aggregation.md similarity index 99% rename from docs/Example:-Conditional-Aggregation.md rename to docs/examples/Conditional-Aggregation.md index 71d3112467..44a2595007 100644 --- a/docs/Example:-Conditional-Aggregation.md +++ b/docs/examples/Conditional-Aggregation.md @@ -1,3 +1,5 @@ +# Conditional Aggregation + In this example, we demonstrate use of TransmogrifAI's conditional readers to, once again, simplify complex data preparation. Code for this example can be found [here](https://github.com/salesforce/TransmogrifAI/tree/master/helloworld/src/main/scala/com/salesforce/hw/dataprep/ConditionalAggregation.scala), and the data can be found [here](https://github.com/salesforce/op/tree/master/helloworld/src/main/resources/WebVisitsDataset/WebVisits.csv). In the previous [example](Example%3A-Time-Series-Aggregates-and-Joins), we showed how TransmogrifAI FeatureBuilders and Aggregate Readers could be used to aggregate predictors and response variables with respect to a reference point in time. However, sometimes, aggregations need to be computed with respect to the time of occurrence of a particular event, and this time may vary from key to key. In particular, let's consider a situation where we are analyzing website visit data, and would like to build a model that predicts the number of purchases a user makes on the website within a day of visiting a particular landing page. In this scenario, we need to construct a training dataset that for each user, identifies the time when he visited the landing page, and then creates a response which is the number of times the user made a purchase within a day of that time. The predictors for the user would be aggregated from the web visit behavior of the user up unto that point in time. diff --git a/docs/Iris-MultiClass-Classification.md b/docs/examples/Iris-MultiClass-Classification.md similarity index 98% rename from docs/Iris-MultiClass-Classification.md rename to docs/examples/Iris-MultiClass-Classification.md index d7035a9a82..538931c9de 100644 --- a/docs/Iris-MultiClass-Classification.md +++ b/docs/examples/Iris-MultiClass-Classification.md @@ -1,3 +1,5 @@ +# Iris MultiClass Classification + The following code illustrates how TransmogrifAI can be used to do classify multiple classes over the Iris dataset. The code for this example can be found [here](https://github.com/salesforce/TransmogrifAI/tree/master/helloworld/src/main/scala/com/salesforce/hw/iris), and the data over [here](https://github.com/salesforce/op/tree/master/helloworld/src/main/resources/IrisDataset). diff --git a/docs/Running-TransmogrifAI-from-spark-shell.md b/docs/examples/Running-from-Spark-Shell.md similarity index 97% rename from docs/Running-TransmogrifAI-from-spark-shell.md rename to docs/examples/Running-from-Spark-Shell.md index 33357a4263..b3dea721b2 100644 --- a/docs/Running-TransmogrifAI-from-spark-shell.md +++ b/docs/examples/Running-from-Spark-Shell.md @@ -1,3 +1,5 @@ +# Running from Spark Shell + Start up your spark shell: ```bash diff --git a/docs/Example:-Time-Series-Aggregates-and-Joins.md b/docs/examples/Time-Series-Aggregates-and-Joins.md similarity index 99% rename from docs/Example:-Time-Series-Aggregates-and-Joins.md rename to docs/examples/Time-Series-Aggregates-and-Joins.md index 63ff2afb6b..13cf1f073f 100644 --- a/docs/Example:-Time-Series-Aggregates-and-Joins.md +++ b/docs/examples/Time-Series-Aggregates-and-Joins.md @@ -1,3 +1,5 @@ +# Time Series Aggregates and Joins + In this example, we will walk you through some of the powerful tools TransmogrifAI has for data preparation, in particular for time series aggregates and joins. The code for this example can be found [here](https://github.com/salesforce/TransmogrifAI/tree/master/helloworld/src/main/scala/com/salesforce/hw/dataprep/JoinsAndAggregates.scala), and the data over [here](https://github.com/salesforce/op/tree/master/helloworld/src/main/resources/EmailDataset). In this example, we would like to build a training data set from two different tables -- a table of Email Sends, and a table of Email Clicks. The following case classes describe the schemas of the two tables: diff --git a/docs/Example:-Titanic.md b/docs/examples/Titanic-Binary-Classification.md similarity index 99% rename from docs/Example:-Titanic.md rename to docs/examples/Titanic-Binary-Classification.md index 09772b2b14..06f6e95a7a 100644 --- a/docs/Example:-Titanic.md +++ b/docs/examples/Titanic-Binary-Classification.md @@ -1,3 +1,5 @@ +# Titanic Binary Classification + Here we describe a very simple TransmogrifAI workflow for predicting survivors in the often-cited Titanic dataset. The code for building and applying the Titanic model can be found here: [Titanic Code](https://github.com/salesforce/TransmogrifAI/blob/master/helloworld/src/main/scala/com/salesforce/hw/OpTitanicSimple.scala), and the data can be found here: [Titanic Data](https://github.com/salesforce/op/blob/master/helloworld/src/main/resources/TitanicDataset/TitanicPassengersTrainData.csv). You can run this code as follows: diff --git a/docs/examples/index.rst b/docs/examples/index.rst new file mode 100644 index 0000000000..ecc01170f7 --- /dev/null +++ b/docs/examples/index.rst @@ -0,0 +1,15 @@ +.. _examples: + +Examples +============ + +.. toctree:: + :maxdepth: 1 + + Titanic-Binary-Classification + Iris-MultiClass-Classification + Boston-Regression + Time-Series-Aggregates-and-Joins + Conditional-Aggregation + Running-from-Spark-Shell + Bootstrap-Your-First-Project diff --git a/docs/FAQ.md b/docs/faq/index.md similarity index 87% rename from docs/FAQ.md rename to docs/faq/index.md index 4c2eb2c354..7c4dbb8f41 100644 --- a/docs/FAQ.md +++ b/docs/faq/index.md @@ -1,4 +1,6 @@ -## 1) What is TransmogrifAI? +# FAQ + +## What is TransmogrifAI? TransmogrifAI is an AutoML library written in Scala that runs on top of Spark. It was developed with a focus on enhancing machine learning developer productivity through machine learning automation, and an API that enforces compile-time type-safety, modularity and reuse. @@ -7,13 +9,13 @@ Use TransmogrifAI if you need a machine learning library to: * Rapidly train good quality machine learnt models with minimal hand tuning * Build modular, reusable, strongly typed machine learning workflows -## 2) I am used to working in Python why should I care about type safety? +## I am used to working in Python why should I care about type safety? The flexibility of Salesforce Objects allows customers to modify even standard objects schemas. This means that when writing models for a multi-tenant environment the only information about what is in a column that we can really count on is the Salesforce type (i.e. Phone, Email, Mulipicklist, Percent, etc.). Working in a strictly typed environment allows us to leverage this information to perform sensible automatic feature engineering. In addition type safety assures that you get fewer unexpected data issues in production. -## 3) What does automatic feature engineering based on types look like? +## What does automatic feature engineering based on types look like? In order to take advantage of automatic type based feature engineering in TransmogrifAI one simply defines the features that will be used in the model and relies on TransmogrifAI to do the feature engineering. The code for this would look like: @@ -25,11 +27,11 @@ The transmogrify shortcut will sort the features by type and apply appropriate t Of course if you want to manually perform these or other transformations you can simply specify the steps for each feature and use the VectorsCombiner Transformer to manually combine your final features. However, this gives developers the option of using default type specific feature engineering. -## 4) What other AutoML functionality does TransmogrifAI provide? +## What other AutoML functionality does TransmogrifAI provide? Look at the [AutoML Capabilities](AutoML-Capabilities) section for a complete list of the powerful AutoML estimators that TransmogrifAI provides. In a nutshell, they are Transmogrifier for automatic feature engineering, SanityChecker and RawFeatureFilter for data cleaning and automatic feature selection, and ModelSelectors for different classes of problems for automatic model selection. -## 5) What imports do I need for TransmogrifAI to work? +## What imports do I need for TransmogrifAI to work? ```scala // TransmogrifAI functionality: feature types, feature builders, feature dsl, readers, aggregators etc. @@ -47,9 +49,9 @@ import com.salesforce.op.utils.spark.RichMetadata._ import com.salesforce.op.utils.spark.RichStructType._ ``` -## 6) I don't need joins or aggregations in my data preparation why can't I just use Spark to load my data and pass it into a Workflow? +## I don't need joins or aggregations in my data preparation why can't I just use Spark to load my data and pass it into a Workflow? You can! Simply use the `.setInputRDD(myRDD)` or `.setInputDataSet(myDataSet)` methods on Workflow to pass in your data. -## 7) How do I examine intermediate data when trying to debug my ML workflow? +## How do I examine intermediate data when trying to debug my ML workflow? You can generate data up to any particular point in the Workflow using the method `.computeDataUpTo(myFeature)`. Calling this method on your Workflow or WorkflowModel will compute a DataFrame which contains all of the rows for features created up to that point in your flow. diff --git a/docs/index.rst b/docs/index.rst index 03732da103..d31ab7c6eb 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -6,10 +6,6 @@ TransmogrifAI ========================================= -.. toctree:: - :maxdepth: 2 - :caption: Contents: - TransmogrifAI (pronounced trăns-mŏgˈrə-fī) is an **AutoML** library written in Scala that runs on top of Spark. It was developed with a focus on enhancing machine learning **developer productivity** through **machine learning automation**, and an API that enforces **compile-time type-safety**, **modularity** and **reuse**. Use TransmogrifAI if you need a machine learning library to: @@ -41,3 +37,21 @@ Motivation *Building real life machine learning applications needs a fair amount of tribal knowledge and intuition. Coupled with the explosion of ML use cases in the world that need to be addressed, there is a need for tools that enable rapid prototyping and development of machine learning pipelines. We believe that automation is the key to making machine learning development truly scalable and accessible.* For more information, read our `blogpost `_! + +Documentation +######## +.. toctree:: + :maxdepth: 4 + + installation/index + examples/index + abstractions/index + automl-capabilities/index + faq/index + talks/index + contributing/index + developer-guide/index + license/index + + + diff --git a/docs/Installation.md b/docs/installation/index.md similarity index 95% rename from docs/Installation.md rename to docs/installation/index.md index 667ee9ec8e..72d4f493c6 100644 --- a/docs/Installation.md +++ b/docs/installation/index.md @@ -1,3 +1,5 @@ +# Installation + * Install Java 1.8 * Get Spark 2.2.x: [Download](https://spark.apache.org/downloads.html), unzip it and then set an environment variable: `export SPARK_HOME=` * Clone the TransmogrifAI repo: `git clone https://github.com/salesforce/TransmogrifAI.git` diff --git a/docs/License.md b/docs/license/index.md similarity index 99% rename from docs/License.md rename to docs/license/index.md index a21678c941..15373c96e1 100644 --- a/docs/License.md +++ b/docs/license/index.md @@ -1,3 +1,5 @@ +# License + Copyright (c) 2017, Salesforce.com, Inc. All rights reserved. diff --git a/docs/Talks.md b/docs/talks/index.md similarity index 97% rename from docs/Talks.md rename to docs/talks/index.md index 93fda5bc00..6ca26e0497 100644 --- a/docs/Talks.md +++ b/docs/talks/index.md @@ -1,13 +1,17 @@ -## 2018 +# Talks + +**2018** * [AutoML: The Assembly Line of Machine Learning](http://www.dataengconf.com/automl-the-assembly-line-of-machine-learning), Mayukh Bhaowal, DataEngConf * [The Black Swan of Perfectly Interpretable Models](https://www.infoq.com/presentations/salesforce-einstein-ml), Leah McGuire and Mayukh Bhaowal, QCon.ai * [Implementing AutoML Techniques at Salesforce Scale](https://vimeo.com/274420096), Matthew Tovbin, Spark+AI Summit, [Slides](https://www.slideshare.net/MatthewTovbin/implementing-automl-techniques-at-salesforce-scale) -## 2017 + +**2017** * [Embracing a Taxonomy of Types to Simplify Machine Learning](https://databricks.com/session/embracing-a-taxonomy-of-types-to-simplify-machine-learning), Leah McGuire, Spark Summit, [Slides](https://www.slideshare.net/databricks/embracing-a-taxonomy-of-types-to-simplify-machine-learning-with-leah-mcguire) * [When all the world’s data scientists are just not enough](https://atscaleconference.com/videos/when-all-the-worlds-data-scientists-are-just-not-enough/), Shubha Nabar, The @Scale Conference * [Low Touch Machine Learning](https://www.youtube.com/watch?v=PKTvo9X9Sjg), Leah McGuire, Spark Summit * [Fantastic ML apps and how to build them](https://www.youtube.com/watch?v=J5YNiaZbUJI), Matthew Tovbin, Scale By The Bay, [Slides](https://www.slideshare.net/MatthewTovbin/fantastic-ml-apps-and-how-to-build-them) -## 2016 + +**2016** * [Metadata Science: When the world's data scientists are not enough](https://www.youtube.com/watch?v=zd9DKjvcRzc), Shubha Nabar, Scala By The Bay * [Doubt Truth to be a Liar: Non Triviality of Type Safety for Machine Learning](https://www.youtube.com/watch?v=FfpSyXTx0uo), Matthew Tovbin, Scala By The Bay / Scala Days, [Slides](https://www.slideshare.net/MatthewTovbin/doubt-truth-to-be-a-liar-non-triviality-of-type-safety-for-machine-learning)