
Ray – Fast and Simple Distributed Computing

Build machine learning applications at any scale.

Open Source

We are building production-quality open source software and investing in the community around it.

Scale Anywhere

Run the same code on your laptop, on a powerful multi-core machine, on any cloud provider, or on a Kubernetes cluster.
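The same application code can move from a laptop to a cluster without modification; only how Ray is started changes. As a minimal sketch (assuming the `ray` CLI that ships with the library, with `<head-node-ip>` as a placeholder):

```shell
# On the head node: start Ray and the cluster services.
ray start --head --port=6379

# On each worker node: join the cluster at the head's address.
ray start --address=<head-node-ip>:6379

# Application code is unchanged; it connects to the running cluster with:
#   ray.init(address="auto")
```

On a laptop, calling `ray.init()` with no arguments starts a self-contained local instance, so the same script runs everywhere.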

Machine Learning

Use scalable machine learning libraries out of the box for hyperparameter search, reinforcement learning, training, serving, and more.

Ray is a distributed execution framework that makes it easy to scale your applications and to leverage state-of-the-art machine learning libraries.

Parallelize Python functions as tasks:

import ray
import time

ray.init()

@ray.remote
def f(i):
    time.sleep(1)
    return i

futures = [f.remote(i) for i in range(4)]
print(ray.get(futures))

Parallelize Python classes as actors:

import ray
ray.init()

@ray.remote
class Counter(object):
    def __init__(self):
        self.n = 0

    def increment(self):
        self.n += 1

    def read(self):
        return self.n

counters = [Counter.remote() for _ in range(4)]
for c in counters:
    c.increment.remote()
futures = [c.read.remote() for c in counters]
print(ray.get(futures))

Tune hyperparameters with Ray Tune:

import torch.optim as optim
from ray import tune
from ray.tune.examples.mnist_pytorch import (
    get_data_loaders, ConvNet, train, test)


def train_mnist(config):
    train_loader, test_loader = get_data_loaders()
    model = ConvNet()
    optimizer = optim.SGD(model.parameters(), lr=config["lr"])
    for i in range(10):
        train(model, optimizer, train_loader)
        acc = test(model, test_loader)
        tune.track.log(mean_accuracy=acc)


analysis = tune.run(
    train_mnist, config={"lr": tune.grid_search([0.001, 0.01, 0.1])})

print("Best config: ", analysis.get_best_config(metric="mean_accuracy"))

# Get a dataframe for analyzing trial results.
df = analysis.dataframe()

Scale reinforcement learning with RLlib:

import gym
from gym.spaces import Discrete, Box
from ray import tune

class SimpleCorridor(gym.Env):
    def __init__(self, config):
        self.end_pos = config["corridor_length"]
        self.cur_pos = 0
        self.action_space = Discrete(2)
        self.observation_space = Box(0.0, self.end_pos, shape=(1, ))

    def reset(self):
        self.cur_pos = 0
        return [self.cur_pos]

    def step(self, action):
        if action == 0 and self.cur_pos > 0:
            self.cur_pos -= 1
        elif action == 1:
            self.cur_pos += 1
        done = self.cur_pos >= self.end_pos
        return [self.cur_pos], 1 if done else 0, done, {}

tune.run(
    "PPO",
    config={
        "env": SimpleCorridor,
        "num_workers": 4,
        "env_config": {"corridor_length": 5}})

Powered by Ray

  • “Ant Financial has built a multi-paradigm fusion engine on top of Ray that combines streaming, graph processing, and machine learning in a single system to perform real-time fraud detection and online promotion. Ray’s flexibility, scalability and efficiency allowed us to process billions of dollars worth of transactions during Double 11, the largest shopping day in the world.”

  • “At ASAPP, we experiment with machine learning models every day through our open source framework Flambé, and we eventually deploy many of those models to production where they serve millions of live customer interactions. Until we found Ray, we tried using more generic task distribution frameworks but they didn’t fit our needs. Using Ray has allowed us to quickly and reliably implement new ML tooling at scale, and run over large clusters of machines effortlessly, enabling Flambé to grow and support our model training for both research and production.”

  • “Ericsson uses Ray to build distributed reinforcement learning systems that interact with network nodes and simulators using RLlib, and to tune machine learning model hyperparameters with Ray Tune.”

  • “Creating personalized unit (chip) testing to reduce test cost, improve quality, and increase capacity for Intel's manufacturing and testing process. Advanced Analytics uses Ray to speed up and scale their hyperparameter and model selection techniques.”

  • “At JP Morgan, we use Ray to power the training of our deep reinforcement learning based electronic trading models such as LOXM and DeepX. Ray components such as Tune and RLlib provide easy-to-use building blocks and baseline implementations to accelerate our research on algorithmic trading strategies.”

  • “Real world applications of reinforcement learning require distributing both training and simulation workloads – often across hundreds of machines. Our autonomous systems platform leverages Ray to accelerate our customers’ creation of intelligent systems across a diverse set of industries including manufacturing, energy, smart buildings and homes, and process control and automation.”

  • “At Primer AI, we use Ray to parallelize our data processing workflows and analytics pipelines for natural language processing. The highly efficient serialization using a shared-memory object store is a perfect fit for handling our data-intensive jobs. The easy-to-use API allows our data scientists to quickly write production-quality parallelized workflows that power our core products.”

  • “We chose Ray because we needed to train many reinforcement learning agents simultaneously. It was important to us to deliver results quickly to people using Pathmind, our product applying reinforcement learning to business simulations. Ray and RLlib made it easy to do that using distributed compute in the public cloud.”

  • “Futurewei uses Ray in its cloud services to make it easy for AI developers to build distributed machine learning models. We use Ray Tune to scale up hyperparameter search jobs for automatic machine learning and use RLlib to enable distributed reinforcement learning training.”