AI chips in the real world: Interoperability, constraints, cost, energy efficiency, and models

February 4, 2021
in Big Data

How do you make the most of the proliferating array of emerging custom silicon hardware without spreading yourself too thin trying to keep up with each and every one of them?

If we were to put a price tag on that question, it would be in multi-billion dollar territory: that is the combined estimated value of the different markets it touches upon. As AI applications explode, so does the specialized hardware that supports them.

For us, interest in so-called AI chips came as an offshoot of our interest in AI, and we've tried to keep up with developments in the field. For Evan Sparks, Determined AI CEO and founder, it goes deeper. We caught up with Sparks to discuss the interplay between hardware and models in AI.

An interoperability layer for disparate hardware stacks

Before founding Determined AI, Sparks was a researcher at the AMPLab at UC Berkeley. He focused on distributed systems for large-scale machine learning, which is where he had the opportunity to work with people like Dave Patterson, a pioneer in computer science and currently vice-chair of the board of directors of the RISC-V Foundation.

Patterson was, as Sparks put it, banging the drum early on about Moore's Law being dead and custom silicon being the only hope for continued growth in the space. Sparks was influenced, and what he wants to do with Determined AI is build software for data scientists and machine learning engineers.

The goal is to help them accelerate workloads and workflows and build AI applications faster. To do that, Determined AI provides a software infrastructure layer that sits underneath frameworks like TensorFlow or PyTorch and above various chips and accelerators.

Being in the position he is, Sparks’s interest lay not so much in dissecting vendor strategies, but rather in walking in the shoes of people developing and deploying machine learning models. As such, a natural place to start was ONNX.


ONNX is an interoperability layer that enables machine learning models trained using different frameworks to be deployed across a range of AI chips that support ONNX. We’ve seen how vendors like GreenWaves or Blaize support ONNX.
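As a minimal sketch of what that looks like in practice, the snippet below exports a small PyTorch model to ONNX and runs it with ONNX Runtime. The toy model and file name are illustrative; any ONNX-capable runtime or accelerator could consume the exported graph.

```python
# Minimal sketch: export a PyTorch model to ONNX, then run it with ONNX Runtime.
# The tiny model and the file name "model.onnx" are illustrative placeholders.
import numpy as np
import torch
import onnxruntime as ort

model = torch.nn.Sequential(
    torch.nn.Linear(8, 16), torch.nn.ReLU(), torch.nn.Linear(16, 2)
)
model.eval()

dummy_input = torch.randn(1, 8)
torch.onnx.export(
    model, dummy_input, "model.onnx",
    input_names=["input"], output_names=["logits"],
    dynamic_axes={"input": {0: "batch"}},  # allow variable batch size
)

# Any ONNX-capable runtime can now consume the exported graph.
session = ort.InferenceSession("model.onnx")
outputs = session.run(None, {"input": np.random.randn(4, 8).astype(np.float32)})
print(outputs[0].shape)  # (4, 2)
```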

ONNX came out of Facebook originally, and Sparks noted the reason it was developed was that Facebook had a very disparate training and inference stack for its machine learning applications.

Facebook developed using PyTorch internally, while the bulk of deep learning models running in production were computer vision models backed by Caffe. Facebook's mandate was that research could be done in whatever language you wanted, but production deployment had to be in Caffe.

That led to the need for an intermediate layer that would translate between model architectures output by PyTorch and input into Caffe. Soon enough, people realized this was a good idea with broader applicability, and not too different, in fact, from things we've seen previously in programming language compilers.

ONNX and TVM: Two ways to solve similar problems

The idea is to use an intermediate representation so that multiple high-level languages can plug in at the source and multiple frameworks at the destination. It does sound a lot like compilers, and it is a good idea. But ONNX is not the be-all and end-all of AI chip interoperability.

TVM is the new kid on the block. It started as a research project out of the University of Washington, recently became a top-level Apache open source project, and also has a commercial effort behind it in OctoML.

TVM’s goals are similar to ONNX’s: Making it possible to compile deep learning models into what they call minimum deployable modules, and automatically optimize these models for different pieces of target hardware.
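A rough sketch of that flow, assuming an ONNX model as input and TVM's Python API (details vary across TVM versions): import the model into TVM's Relay representation, compile it for a target, and run the resulting module.

```python
# Sketch of the TVM flow: import a model, compile it for a target, run the module.
# "model.onnx" and the input shape are placeholders; API details vary by TVM version.
import numpy as np
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor

onnx_model = onnx.load("model.onnx")
shape_dict = {"input": (1, 8)}
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)

target = "llvm"  # e.g. "cuda", or a cross-compile triple for embedded hardware
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)

device = tvm.device(target, 0)
module = graph_executor.GraphModule(lib["default"](device))
module.set_input("input", np.random.randn(1, 8).astype(np.float32))
module.run()
print(module.get_output(0).numpy().shape)
```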

Sparks noted TVM is a relatively new project, but it has a pretty strong open source community behind it. He went on to add that many people would like to see TVM become a standard: "Hardware vendors not named Nvidia are likely to want more openness and a way to enter the market. And they're looking for a kind of narrow interface to implement."

There is nuance in pinpointing the differences between ONNX and TVM, and we defer to the conversation with Sparks on that. In a nutshell, TVM is a bit lower level than ONNX, Sparks said, and there are some trade-offs associated with that. He opined TVM has the potential to be perhaps a little bit more general.

Sparks noted, however, that both ONNX and TVM are early in their lifetime, and they will learn from each other over time. For Sparks, they are not immediate competitors, just two ways to solve similar problems.

AI constraints, cost, and energy efficiency

Whether it’s ONNX or TVM, however, dealing with this interoperability layer should not be something data scientists and machine learning engineers have to do. Sparks advocates for a separation of concerns between the various stages of model development — very much in line with the MLOps theme:

"There are many systems out there for preparing your data for training, turning it into high-performance, compact data structures and so on. That is a different stage in the process, a different workflow from the experimentation that goes into model training and model development.

As long as you get your data in the right format while you're in model development, it should not matter what upstream data system you're using. Similarly, as long as you develop in these high-level languages, what training hardware you're running on, whether it's GPUs or CPUs or exotic accelerators, should not matter."
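As a rough illustration of that separation in plain PyTorch (not Determined AI's own API), the only hardware-specific part of a training step is which device the model and tensors are moved to; the rest of the code is identical across backends.

```python
# Rough illustration of hardware-agnostic model code in plain PyTorch (not
# Determined AI's API): the same training step runs on CPU, GPU, or any other
# backend exposed through the framework's device abstraction.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(8, 2).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.CrossEntropyLoss()

def train_step(batch_x, batch_y):
    # Only the .to(device) calls change when the underlying hardware changes.
    batch_x, batch_y = batch_x.to(device), batch_y.to(device)
    optimizer.zero_grad()
    loss = loss_fn(model(batch_x), batch_y)
    loss.backward()
    optimizer.step()
    return loss.item()
```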

Image: Determined AI's stack aims to abstract different underlying hardware architectures

What does matter is how that hardware can satisfy application constraints, as per Sparks. Imagine a medical devices company that has legacy hardware out in the field. They’re not going to upgrade just to run slightly more accurate models.

Instead, the problem is almost the inverse: How to get the most accurate model that can run on this particular hardware. So they might start with a huge model and employ techniques like quantization and distillation to fit that hardware.
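One concrete form of the quantization Sparks mentions is post-training dynamic quantization, shown below as a hedged sketch using PyTorch's built-in API; the model itself is a placeholder, not anything from the article.

```python
# Post-training dynamic quantization in PyTorch: weights of Linear layers are
# stored as int8 so a large model can fit constrained inference hardware.
# The model here is a placeholder.
import torch

float_model = torch.nn.Sequential(
    torch.nn.Linear(512, 256), torch.nn.ReLU(), torch.nn.Linear(256, 10)
)

quantized_model = torch.quantization.quantize_dynamic(
    float_model,        # model to quantize
    {torch.nn.Linear},  # layer types whose weights get quantized
    dtype=torch.qint8,  # int8 weights: roughly 4x smaller than float32
)

# The quantized model is a drop-in replacement at inference time.
x = torch.randn(1, 512)
print(quantized_model(x).shape)  # torch.Size([1, 10])
```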

That scenario concerns deployment and inference, but the same logic applies to training as well. The cost of training AI models, both financial and environmental, is hard to ignore. Sparks referred to work from OpenAI, according to which the cost of training went up three hundred thousand times in the last few years.

That was two years ago. As more recent work from the former co-lead of Google's ethical AI team shows, the trend has anything but slowed down. The cost of training OpenAI's latest language model, GPT-3, has been estimated at between $7 million and $12 million.

Sparks pointed out the obvious: This is an insane amount of computation, of energy, of money, which most mortals don’t have. So we need tools that help reason about this cost and assign quotas; Sparks is busy building those.
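To make the budgeting idea concrete, here is a rough, generic sketch of budget-constrained search with early stopping, in the spirit of successive halving. It is not Determined AI's actual API; the toy train_for function and the epoch budget are stand-ins for real training.

```python
# Generic illustration of budget-constrained model search with early stopping
# (successive-halving style), not Determined AI's API: train many configurations
# briefly, keep only the most promising, and spend the remaining budget on those.
import random

def train_for(config, epochs):
    """Stand-in for real training; returns a validation score."""
    return random.random() * config["lr"]

def budgeted_search(configs, total_epochs_budget):
    epochs_per_round, survivors, spent = 1, list(configs), 0
    while len(survivors) > 1 and spent + epochs_per_round * len(survivors) <= total_epochs_budget:
        scores = [(train_for(c, epochs_per_round), c) for c in survivors]
        spent += epochs_per_round * len(survivors)
        scores.sort(key=lambda s: s[0], reverse=True)
        survivors = [c for _, c in scores[: max(1, len(scores) // 2)]]  # halve the field
        epochs_per_round *= 2  # give survivors a longer run next round
    return survivors[0]

best = budgeted_search([{"lr": 10 ** -i} for i in range(1, 6)], total_epochs_budget=30)
print(best)
```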

Infusing knowledge in models

Determined AI's technology provides a way of specifying a budget, the number of models to be trained to convergence, and the space of models to explore. Training of less promising candidates ceases before convergence, so users can explore the space without breaking the bank. This approach is based on active learning, but there are other approaches too, like distillation, fine-tuning, or transfer learning:

“You let the big guys, the Facebooks and the Googles of the world do the big training on huge quantities of data with billions of parameters, spending hundreds of GPU years on a problem. Then instead of starting from scratch, you take those models and maybe use them to form [embeddings that you’re going to use for downstream tasks].”

Sparks mentioned NLP and image recognition, with BERT and ResNet-50, as good examples of this approach. He also offered a word of warning, however: This won't always work. Where it gets tricky is when the modality of the data people are training on is totally different from what's available.
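As a minimal sketch of the reuse pattern he describes, the snippet below takes a pretrained ResNet-50 from torchvision, freezes the backbone, and trains only a new head for a downstream task. The 10-class head is a placeholder.

```python
# Sketch of the reuse pattern: take a ResNet-50 pretrained by someone else,
# freeze the backbone, and train only a small head for your own task.
# The number of target classes (10) is a placeholder.
import torch
from torchvision import models

backbone = models.resnet50(pretrained=True)  # newer torchvision prefers the weights= argument
for param in backbone.parameters():
    param.requires_grad = False              # keep the expensively trained features

backbone.fc = torch.nn.Linear(backbone.fc.in_features, 10)  # new, trainable head

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
# ...train only the head on the downstream dataset, at a fraction of the cost
# of training the whole network from scratch.
```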

Image: A hybrid approach to AI, infusing knowledge in machine learning models, may be the best way to minimize training costs

But there may be another way. Whether we call it robust AI, hybrid AI, neuro-symbolic AI, or by any other name, would infusing knowledge in machine learning models help? Sparks’s answer was a definite yes:

“In ‘commodity’ use cases like NLP or vision, there are benchmarks that people agree on and standard data sets. Everyone knows what the problem is, image classification or object detection or language translation. But when you start to specialize more, some of the greatest lift we have seen is where you get a domain expert to infuse their knowledge.”

Sparks used physical phenomena as an example. Let’s say you set up a feed-forward neural network with 100 parameters and ask it to predict where a flying object is going to be in a second. Given enough examples, the system will probably converge to a reasonably good approximation of the function of interest and will predict with a high degree of accuracy:

“But if you infuse the application with a little bit more knowledge of the physical world, the amount of data is going to go way down, the accuracy is going to go way up, and we’re going to see some gravitational constant start to emerge as maybe one feature of the network or some combination of features.

Neural networks are great. They're super powerful function approximators. But if I tell the computer a little bit more about what that function is, hopefully I can save everybody a few million bucks in compute and get models that more accurately represent the world. To abandon that thinking would be irresponsible."
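As a toy illustration of the kind of knowledge infusion Sparks describes (not his code), the sketch below hard-codes the ballistic form of projectile motion and lets the model learn only the gravitational constant from synthetic data, rather than asking a generic network to rediscover the whole function.

```python
# Toy illustration of infusing physical knowledge: instead of a generic network
# learning projectile motion from scratch, hard-code y = y0 + v*t - 0.5*g*t^2
# and learn only g. The data is synthetic, with true g = 9.81.
import torch

class BallisticModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.g = torch.nn.Parameter(torch.tensor(5.0))  # start far from 9.81

    def forward(self, y0, v, t):
        return y0 + v * t - 0.5 * self.g * t ** 2

t = torch.rand(256) * 2.0
y0, v = torch.full_like(t, 10.0), torch.full_like(t, 3.0)
y_true = y0 + v * t - 0.5 * 9.81 * t ** 2 + 0.01 * torch.randn_like(t)

model = BallisticModel()
opt = torch.optim.Adam(model.parameters(), lr=0.1)
for _ in range(500):
    opt.zero_grad()
    loss = torch.mean((model(y0, v, t) - y_true) ** 2)
    loss.backward()
    opt.step()

print(float(model.g))  # converges near 9.81, with far less data than a generic net
```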

Credit: ZDNet
