RocketML Distributed Machine Learning

vinay rao

Beaverton, Oregon

RocketML is a fast scale-out system for both training machine learning models and running pre-processing steps. As data grows, machine learning steps get slower, which makes a data scientist's job tedious. A distributed system like RocketML shortens model training and processing tasks from days to minutes. RocketML is built to stitch together a large number of powerful Xeon processors to scale efficiently. Every component of the software is tuned so that the system is pushed to the limits of Amdahl's law. In essence, everybody who uses RocketML gets a supercomputer at their disposal!

RocketML supports tabular, text, video and image data in their raw formats, eliminating cost and unnecessary steps. Its data pre-processing functionality runs on thousands of cores. A few examples:

1) Object detection on images and videos using pre-built deep learning models
2) Audio-to-text using CMU Sphinx
3) Google APIs, text processing for multiple languages

RocketML provides distributed machine learning algorithms that scale efficiently across nodes. Our current suite of algorithms includes:

1) Linear Regression, Ridge Regression, Lasso, Logistic Regression
2) Support Vector Machine
3) Singular Value Decomposition
4) Principal Component Analysis
5) Non-negative Matrix Factorization
6) K-means clustering
7) Decision Trees, Gradient Boosted Trees
8) Neural Nets

RocketML also allows customers to bring their own models and favorite frameworks such as MXNet, TensorFlow, scikit-learn and PyTorch.
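To make the Amdahl's law limit mentioned above concrete, here is a minimal Python sketch of the standard formula: speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of workers. The function name `amdahl_speedup` and the example values of p are illustrative assumptions, not RocketML measurements.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup predicted by Amdahl's law.

    parallel_fraction: share of the runtime that can be parallelized (0..1).
    workers: number of cores or nodes working on the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical parallel fractions, chosen only to show the trend.
    for p in (0.90, 0.99, 0.999):
        for n in (16, 256, 4096):
            print(f"p={p:<6} workers={n:<5} speedup={amdahl_speedup(p, n):.1f}")
```

The serial fraction caps the achievable speedup at 1 / (1 - p) no matter how many cores are added, which is why shrinking the serial part of every component matters when scaling to thousands of cores.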

Project status: Published/In Market

Robotics, HPC, Artificial Intelligence

Intel Technologies
OpenVINO, AI DevCloud / Xeon, MKL, Intel Opt ML/DL Framework, Intel Python, Intel CPU

Overview / Usage

RocketML is built to stitch together a large number of powerful Xeon processors to scale efficiently for machine learning.

Methodology / Approach

RocketML is essentially a C++ library for distributed machine learning. It covers the following:

Supervised learning algorithms: Generalized Linear Models, Logistic Regression, Linear and Nonlinear SVM, Random Forests, and Gradient Boosted Trees.
Unsupervised learning algorithms: Singular Value Decomposition, Principal Component Analysis (PCA), Sparse PCA, and Clustering.

We have built a wide collection of sequential and distributed-memory operations, including support for

dense and sparse-direct linear algebra,
unconstrained, bound-constrained, and least-squares optimization problems,
scalable linear equation solvers and eigenvalue solvers.
It also includes optimization methods:
gradient descent method,
stochastic gradient descent methods,
coordinate-descent method,
alternating direction method of multipliers (ADMM),
quasi-Newton methods like L-BFGS,
trust region methods,
line search methods.

These methods are widely used to fit generalized linear models, logistic regression, and SVMs.
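As an illustration of how the simplest method in the list above, gradient descent, can be spread across workers, the following Python sketch simulates data-parallel training of a linear model in a single process. Each shard stands in for the data held by one worker, and `all_reduce_sum` stands in for the collective sum a real distributed-memory system would perform over the network. All names here are hypothetical; this is not RocketML's API or implementation, only a sketch of the general pattern.

```python
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], float]  # (feature vector, target value)


def shard_gradient(shard: Sequence[Row], w: Sequence[float]) -> List[float]:
    """Summed squared-error gradient over the rows held by one worker."""
    grad = [0.0] * len(w)
    for x, y in shard:
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj
    return grad


def all_reduce_sum(partials: Sequence[Sequence[float]]) -> List[float]:
    """Stand-in for a collective sum across workers (simulated locally)."""
    return [sum(column) for column in zip(*partials)]


def fit_linear(shards: Sequence[Sequence[Row]], n_features: int,
               lr: float = 0.5, steps: int = 2000) -> List[float]:
    """Full-batch gradient descent where each step sums per-shard gradients."""
    n_rows = sum(len(s) for s in shards)
    w = [0.0] * n_features
    for _ in range(steps):
        # In a real system each worker computes its partial gradient in parallel.
        partials = [shard_gradient(s, w) for s in shards]
        total = all_reduce_sum(partials)
        w = [wi - lr * g / n_rows for wi, g in zip(w, total)]
    return w


if __name__ == "__main__":
    # Synthetic data for y = 1 + 2x, with a constant feature for the intercept.
    rows = [((1.0, k / 4.0), 1.0 + 2.0 * (k / 4.0)) for k in range(8)]
    shards = [rows[:4], rows[4:]]  # two simulated workers
    print(fit_linear(shards, n_features=2))
```

Because the summed gradient is the same regardless of how the rows are partitioned, the sharded run follows the same trajectory as a single-worker run up to floating-point rounding; only the per-step communication of one gradient vector is added.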

Technologies Used

Intel Xeon Processors
Intel MKL
Intel AI DevCloud
OpenVINO (new)
Intel Python Distribution

Repository

https://aws.amazon.com/marketplace/pp/B07C49CVC1?
