
How to choose a cloud machine learning platform

August 24, 2020

Credit: Dreamstime

In order to create effective machine learning and deep learning models, you need copious amounts of data, a way to clean the data and perform feature engineering on it, and a way to train models on your data in a reasonable amount of time.

Then you need a way to deploy your models, monitor them for drift over time, and retrain them as needed.

You can do all of that on-premises if you have invested in compute resources and accelerators such as GPUs, but you may find that if your resources are adequate, they are also idle much of the time.

On the other hand, it can sometimes be more cost-effective to run the entire pipeline in the cloud, using large amounts of compute resources and accelerators as needed, and then releasing them.

The major cloud providers — and a number of minor clouds too — have put significant effort into building out their machine learning platforms to support the complete machine learning lifecycle, from planning a project to maintaining a model in production. How do you determine which of these clouds will meet your needs? Here are 12 capabilities every end-to-end machine learning platform should provide.

Be close to your data

If you have the large amounts of data needed to build precise models, you don’t want to ship it halfway around the world. The issue here isn’t distance as such, however; it’s time. Data transmission speed is ultimately limited by the speed of light, even on a perfect network with infinite bandwidth. Long distances mean latency.
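To make the speed-of-light limit concrete, here is a minimal Python sketch of the lowest possible round-trip time over a given path length. The function name, the 12,000 km path, and the assumption that light in optical fibre travels at roughly two-thirds of its vacuum speed are illustrative choices, not figures from any particular network.

```python
# Speed of light in vacuum, in km/s (exact by definition of the metre).
SPEED_OF_LIGHT_KM_PER_S = 299_792.458

# Assumption: signals in optical fibre travel at roughly 2/3 of c.
FIBRE_VELOCITY_FACTOR = 2 / 3


def min_round_trip_ms(distance_km, velocity_factor=FIBRE_VELOCITY_FACTOR):
    """Lower bound on round-trip time, ignoring routing, queuing and processing."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_PER_S * velocity_factor)
    return 2 * one_way_s * 1000


# Hypothetical 12,000 km path: the bound is roughly 120 ms per round trip.
print(f"{min_round_trip_ms(12_000):.1f} ms")
```

Real paths are longer than the straight-line distance and add switching delays, so measured latency will be higher than this bound, never lower.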

The ideal case for very large data sets is to build the model where the data already resides, so that no mass data transmission is needed. Several databases support that to a limited extent.

The next best case is for the data to be on the same high-speed network as the model-building software, which typically means within the same data centre. Even moving the data from one data centre to another within a cloud availability zone can introduce a significant delay if you have terabytes (TB) or more. You can mitigate this by doing incremental updates.
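As a rough illustration of why terabytes are slow to move and why incremental updates help, the sketch below estimates bulk transfer time from data size and sustained bandwidth. The 10 TB data set, the 1 Gbit/s link, and the 1% daily change rate are all hypothetical numbers chosen for the example.

```python
def transfer_hours(size_tb, bandwidth_gbps, efficiency=1.0):
    """Estimate hours needed to move size_tb terabytes over a link.

    efficiency is the fraction of nominal bandwidth actually sustained
    (an assumption you would measure on your own network).
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (bandwidth_gbps * 1e9 * efficiency)
    return seconds / 3600


full_copy = transfer_hours(10, 1)          # hypothetical 10 TB over 1 Gbit/s
incremental = transfer_hours(10 * 0.01, 1)  # only the 1% that changed

print(f"full copy:   {full_copy:.1f} h")
print(f"incremental: {incremental * 60:.1f} min")
```

Under these assumptions the full copy takes most of a day, while the incremental update finishes in minutes.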

The worst case would be if you have to move big data long distances over paths with constrained bandwidth and high latency. The trans-Pacific cables going to Australia are particularly egregious in this respect.

Support an ETL or ELT pipeline

ETL (extract, transform, and load) and ELT (extract, load, and transform) are two data pipeline configurations that are common in the database world. Machine learning and deep learning amplify the need for these, especially the transform portion. ELT gives you more flexibility when your transformations need to change, because you can rerun them on data that is already loaded rather than repeating the load phase, which is usually the most time-consuming for big data.
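The ELT ordering can be sketched with Python's built-in sqlite3 module standing in for a cloud data warehouse. The table name, view name, and sample rows are hypothetical; the point is only that the raw data is loaded once and the transformation lives in a view that can be redefined without reloading.

```python
import sqlite3

# Hypothetical raw records already extracted from a source system.
raw_rows = [
    ("2020-08-01", " 19.5 "),
    ("2020-08-02", "n/a"),
    ("2020-08-03", "21.0"),
]

conn = sqlite3.connect(":memory:")

# Load: store the raw values untouched, as text.
conn.execute("CREATE TABLE raw_readings (day TEXT, value TEXT)")
conn.executemany("INSERT INTO raw_readings VALUES (?, ?)", raw_rows)

# Transform: a view over the raw table. Changing the cleaning rule
# means redefining the view, not repeating the load.
conn.execute(
    """
    CREATE VIEW clean_readings AS
    SELECT day, CAST(TRIM(value) AS REAL) AS value
    FROM raw_readings
    WHERE TRIM(value) GLOB '[0-9]*'
    """
)

clean = conn.execute(
    "SELECT day, value FROM clean_readings ORDER BY day"
).fetchall()
print(clean)
```

In an ETL pipeline the same cleaning rule would run before the insert, so a change to the rule would require extracting and loading the data again.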

In general, data in the wild is noisy, and that noise needs to be filtered out. Data in the wild also has widely varying ranges: One variable might have a maximum in the millions, while another might range from -0.1 to -0.001. For machine learning, variables must be transformed to standardised ranges to keep the ones with large ranges from dominating the model. Exactly which standardised range to use depends on the algorithm used for the model.
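Two common ways to standardise ranges are min-max scaling (to the interval 0 to 1) and z-score scaling (to zero mean and unit standard deviation). The sketch below uses only the Python standard library; the sample values are made up to mirror the ranges described above, and which scaling suits a given algorithm is a choice the article leaves open.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant column")
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("cannot scale a constant column")
    return [(v - mu) / sigma for v in values]


# Hypothetical columns: one in the millions, one between -0.1 and -0.001.
big = [1_000_000.0, 2_500_000.0, 4_000_000.0]
small = [-0.1, -0.05, -0.001]

print(min_max_scale(big))
print(min_max_scale(small))
print(z_score(big))
```

After scaling, both columns occupy comparable ranges, so neither dominates a distance-based or gradient-based model purely because of its units.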

Support an online environment for model building

The conventional wisdom used to be that you should import your data to your desktop for model building. The sheer quantity of data needed to build good machine learning and deep learning models changes the picture: You can download a small sample of data to your desktop for exploratory data analysis and model building, but for production models you need to have access to the full data.

Web-based development environments such as Jupyter Notebooks, JupyterLab, and Apache Zeppelin are well suited for model building. If your data is in the same cloud as the notebook environment, you can bring the analysis to the data, minimising the time-consuming movement of data.

Support scale-up and scale-out training

The compute and memory requirements of notebooks are generally minimal, except for training models. It helps a lot if a notebook can spawn training jobs that run on multiple large virtual machines or containers. It also helps a lot if the training can access accelerators such as GPUs, TPUs, and FPGAs; these can turn days of training into hours.
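The fan-out pattern can be sketched locally, with a thread pool standing in for the remote virtual machines or containers a cloud platform would provide. Everything here is hypothetical: `train_job` is a toy gradient descent rather than real model training, and a real platform would expose its own job-submission API instead of `ThreadPoolExecutor`.

```python
from concurrent.futures import ThreadPoolExecutor


def train_job(learning_rate):
    """Toy stand-in for a training job: minimise (w - 3)^2 by gradient descent."""
    w = 0.0
    for _ in range(50):
        w -= learning_rate * 2 * (w - 3)
    return learning_rate, (w - 3) ** 2


learning_rates = [0.01, 0.1, 0.5]

# Each submitted job would run on its own machine in a real scale-out setup;
# here the workers are local threads.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(train_job, learning_rates))

best_lr, best_loss = min(results, key=lambda r: r[1])
print(f"best learning rate: {best_lr}, final loss: {best_loss}")
```

The notebook itself stays lightweight: it only submits jobs and collects results, which is why its own compute and memory needs remain small.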

Support AutoML and automatic feature engineering

Not everyone is good at picking machine learning models, selecting features (the variables that are used by the model), and engineering new features from the raw observations. Even if you’re good at those tasks, they are time-consuming and can be automated to a large extent.

AutoML systems often try many models to see which yield the best objective function values, for example the lowest mean squared error for regression problems. The best AutoML systems can also perform feature engineering, and use their resources effectively to pursue the best possible models with the best possible sets of features.
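The core loop of trying several models and keeping the one with the best objective value can be sketched in a few lines. This is a toy illustration, not how any particular AutoML product works: the three candidate models, the data (generated from y = 2x + 1), and the train/validation split are all assumptions made for the example.

```python
def mse(predict, xs, ys):
    """Mean squared error of a prediction function on a data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_constant(xs, ys):
    c = sum(ys) / len(ys)
    return lambda x: c


def fit_proportional(xs, ys):
    # Least-squares line through the origin.
    k = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return lambda x: k * x


def fit_linear(xs, ys):
    # Ordinary least-squares line with intercept.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


# Hypothetical data drawn from y = 2x + 1, split into train and validation.
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
valid_x, valid_y = [4.0, 5.0], [9.0, 11.0]

candidates = {
    "constant": fit_constant,
    "proportional": fit_proportional,
    "linear": fit_linear,
}

# Fit every candidate on the training data, score it on held-out data.
scores = {
    name: mse(fit(train_x, train_y), valid_x, valid_y)
    for name, fit in candidates.items()
}
best = min(scores, key=scores.get)
print(best, scores)
```

Scoring on held-out data rather than the training data is what keeps the search from simply rewarding the most flexible model.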

Support the best machine learning and deep learning frameworks

Most data scientists have favourite frameworks and programming languages for machine learning and deep learning. For those who prefer Python, Scikit-learn is often a favourite for machine learning, while TensorFlow, PyTorch, Keras, and MXNet are often top picks for deep learning.

In Scala, Spark MLlib tends to be preferred for machine learning. In R, there are many native machine learning packages, and a good interface to Python. In Java, H2O.ai rates highly, as do Java-ML and Deep Java Library.

The cloud machine learning and deep learning platforms tend to have their own collection of algorithms, and they often support external frameworks in at least one language or as containers with specific entry points. In some cases you can integrate your own algorithms and statistical methods with the platform’s AutoML facilities, which is quite convenient.

Some cloud platforms also offer their own tuned versions of major deep learning frameworks. For example, AWS has an optimised version of TensorFlow that it claims can achieve nearly linear scalability for deep neural network training.

Read more on the next page…
