Commit

introduction rewrite
emmyscode committed Oct 27, 2022
1 parent d13580b commit 8fdc2d6
Showing 1 changed file with 6 additions and 7 deletions.
13 changes: 6 additions & 7 deletions Introductory_modules/Introduction_to_Ray.ipynb
@@ -10,31 +10,30 @@
"\n",
"Welcome, we're glad to have you along! This module serves as an interactive introduction to Ray, a flexible distributed computing framework built for Python with data science and machine learning practitioners in mind. Before we jump into the structure of this tutorial, let us first unpack the context of where we are coming from, along with the motivation for learning Ray.\n",
"\n",
-"![Main Map](../_static/assets/Introduction_to_Ray/Main_Map.png)\n",
+"![Main Map](../_static/assets/Introduction_to_Ray/map.png)\n",
"\n",
"*Figure 1*\n",
"\n",
"**Context**\n",
"\n",
-"When we think about our daily interactions with artificial intelligence (AI), products such as recommendation systems, photo editing software, and auto-captioning on videos come to mind. In addition to these user-facing experiences, enterprise-level use cases like reducing downtime in manufacturing, order-fulfillment optimization, and maximizing power generation in wind farms contribute to the ever-growing integration of AI with the way we live. Today's AI applications require enormous amounts of data to be trained on and machine learning models tend to grow over time. They have become so complex and infrastructure intensive that developers have no option but to distribute execution across multiple machines. However, distributed computing is hard. It requires specialized knowledge about orchestrating clusters of computers together to efficiently schedule tasks and must provide features like fault tolerance when a component fails, high availability to minimize service interruption, and autoscaling to reduce waste.\n",
+"Today's artificial intelligence (AI) applications require enormous amounts of data to be trained on and machine learning (ML) models tend to grow over time. From consumer-facing products like recommendation systems and photo editing software to enterprise-level use cases like reducing downtime in manufacturing and order-fulfillment optimization, ML systems have become so complex and infrastructure intensive that developers have no option but to distribute execution across multiple machines. However, distributed computing is hard. It requires specialized knowledge about orchestrating clusters of computers together to efficiently schedule tasks and must provide features like fault tolerance when a component fails, high availability to minimize service interruption, and autoscaling to reduce waste.\n",
"\n",
"As a data scientist, machine learning practitioner, developer, or engineer, your contribution may center on building data processing pipelines, training complicated models, running efficient hyperparameter experiments, creating simulations of agents, and/or serving your application to users. In each case, you need to choose a distributed system to support each task, but you don't want to learn a different programming language or toss out your existing toolbox. This is where Ray comes in.\n",
"\n",
"**What is Ray?**\n",
"\n",
-"Ray is an open source, distributed execution framework that allows you to scale AI and machine learning workloads. Our goal is to keep things simple (which is enabled by a concise core API) so that you can parallelize Python programs on your laptop, cluster, cloud, or even on-premise with minimal code changes. Ray automatically handles all aspects of distributed execution including orchestration, scheduling, fault tolerance, and auto scaling so that you can scale your apps without becoming a distributed systems expert. With a rich ecosystem of libraries and integrations with many important data science tools, Ray lowers the effort needed to scale compute intensive workloads.\n",
+"Ray is an open source, distributed execution framework that allows you to scale AI and machine learning workloads. Our goal is to keep things simple (enabled by a concise core API) so that you can parallelize Python programs on your laptop, cluster, cloud, or even on-premise with minimal code changes. Ray automatically handles all aspects of distributed execution including orchestration, scheduling, fault tolerance, and auto scaling so that you can scale your apps without becoming a distributed systems expert. With a rich ecosystem of libraries and integrations with many important data science tools, Ray lowers the effort needed to scale compute intensive workloads.\n",
"\n",
"**Notebook Outline**\n",
"\n",
-"This first notebook is part of a series where we will discuss the four major **layers** that comprise Ray, namely its core engine, high-level libraries, ecosystem of integrations, and cluster deployment support (see Figure 1). To foster active learning, you will encounter short coding exercises and discussion questions throughout the notebook to reinforce knowledge through practice. In this first notebook, we will cover:\n",
+"This first notebook is part of a series where we will discuss the three major **layers** that comprise Ray, namely its core engine, high-level libraries, and ecosystem of integrations. In this first notebook, we will cover:\n",
"\n",
"- Introduction to Ray\n",
-"- Layer One: Ray Core\n",
+"- Part One: Ray Core\n",
" - Ray Core Key Concepts\n",
" - Example: Cross-Validation on Housing Data\n",
" - Sequential Implementation\n",
" - Distributed Implementation with Ray\n",
"    - Optional Exercise: Quick Sort\n",
" - Summary\n",
"- Homework\n",
"- Next Steps\n",
@@ -445,7 +444,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
-"version": "3.8.13"
+"version": "3.10.6"
},
"vscode": {
"interpreter": {
