Saturday, February 27, 2021
  • Setup menu at Appearance » Menus and assign menu to Top Bar Navigation
Advertisement
  • AI Development
    • Artificial Intelligence
    • Machine Learning
    • Neural Networks
    • Learn to Code
  • Data
    • Blockchain
    • Big Data
    • Data Science
  • IT Security
    • Internet Privacy
    • Internet Security
  • Marketing
    • Digital Marketing
    • Marketing Technology
  • Technology Companies
  • Crypto News
No Result
View All Result
NikolaNews
  • AI Development
    • Artificial Intelligence
    • Machine Learning
    • Neural Networks
    • Learn to Code
  • Data
    • Blockchain
    • Big Data
    • Data Science
  • IT Security
    • Internet Privacy
    • Internet Security
  • Marketing
    • Digital Marketing
    • Marketing Technology
  • Technology Companies
  • Crypto News
No Result
View All Result
NikolaNews
No Result
View All Result
Home Big Data

Tecton brings DevOps to machine learning

June 22, 2020
in Big Data
Tecton brings DevOps to machine learning
587
SHARES
3.3k
VIEWS
Share on FacebookShare on Twitter

Models require algorithms and data, and when it comes to data, machine learning models have especially ravenous appetites. Yet, when it comes to transforming data into features consumed by the model, data scientists often have to reinvent the wheel as feature generation lacks the configuration management and automation processes associated with model workflows.

Tecton, a startup that recently emerged from stealth, is introducing a cloud-based service intended to address this bottleneck. They are striving, for feature generation, to introduce the same type of maintainability and reusability that Informatica did when it introduced automated data transformations to the data warehousing world back at its founding.

You might also like

DataStax Astra goes serverless | ZDNet

Off-chain reporting: Toward a new general purpose secure compute framework by Chainlink

Cutting-edge Katana Graph scores $28.5 million Series A Led by Intel Capital

The cofounders of the company came from Uber, where they were responsible for Project Michelangelo, the company’s machine learning platform.

At Uber, Michelangelo took care of the full lifecycle of ML from managing data to training, evaluating, and deploying models, followed by making and monitoring the predictions. Among the lessons learned from developing the Michelangelo platform was that feature generation lacked the degree of rigorous lifecycle automation that model development. All too often, data scientists had to keep reinventing the wheel when generating features, and the beginnings of that automation were built into the platform.

Those are the lessons that company founders took to creating Tecton. It addresses part of the problem impacting data scientists: they are often spending too much of their time performing data engineering. A couple years ago, in a survey jointly sponsored by Ovum (now Omdia) and Dataiku, we found that even in the best of circumstances, data scientists will still likely spend up to half their time wrestling with data.

In a company blog, Tecton lays out the problem in exhaustive detail. Many cloud AutoML services can automate aspects of algorithm development and feature selection, along with the workflow of moving models from training to deployment and production, but they lack any such automation with the feature engineering that feeds the model with data.

Tecton applies the same DevOps automation to feature development that is the norm for coding the logic of models. All of the relevant artifacts to building a feature (e.g., data sources, transformations, and a recipe for processing) are stored and versioned in a Git repository, just as modeling logic is.

The result is that the legwork involved with creating and deploying the data pipelines used for generating features is automated. This doesn’t mean that data scientists no longer need data engineers, but they can become less reliant on them. Data engineers are still needed at the outset of building a feature for connecting to data sources. But it then allows data scientists to work in their Python notebooks to design the feature transformation logic. The next step is specifying configuration information (e.g., specify data sources, frequency of transformations, and whether to serve the feature online or offline in batch). All these artifacts are stored in a Git repository where they can be versioned, shared, and reused.

The operable notion is that data scientists should not get mired in the business of designing, deploying, and running data pipelines, and they shouldn’t need to bother data engineers each time they have to design new features or run another job. Once the feature is developed, Tecton automates the operation of the pipeline, triggering the appropriate execution engine such as Spark for batch, or Flink or Kafka Streams for streaming. This addresses one part of the model deployment bottleneck as data scientists don’t have to worry about wiring up the right execution engines.

To track lifecycle and support reusability, Tecton retains data lineage data and makes the metadata searchable in a “feature store’ catalog. The same pipelines can be used, both for training and live production data, and by versioning these pipelines, they can readily be iterated by data scientists on a self-service basis when modifying or generating new features.

Having emerged from stealth within the past couple months, Tecton’s offering is currently running in private preview on AWS. As noted above, it manages and automates feature pipelines and includes a feature store for storing historical feature and label data; includes an SDK for retrieving training data; plus a web-based UI for tracking features and monitoring data quality and data drift issues. Once the AWS offering hits general release, Tecton will follow with other clouds later. To us, this would make a useful complement to cloud AutoML services, helping them automate the missing mile of data. 

Credit: Zdnet

Previous Post

FIFA 21 Steam Debut Deserves Gamers Troll-Bomb ‘Gambling’ Soccer

Next Post

A Brief Summary of the Science of Data Analytics

Related Posts

DataStax Astra goes serverless | ZDNet
Big Data

DataStax Astra goes serverless | ZDNet

February 25, 2021
Off-chain reporting: Toward a new general purpose secure compute framework by Chainlink
Big Data

Off-chain reporting: Toward a new general purpose secure compute framework by Chainlink

February 25, 2021
Cutting-edge Katana Graph scores $28.5 million Series A Led by Intel Capital
Big Data

Cutting-edge Katana Graph scores $28.5 million Series A Led by Intel Capital

February 24, 2021
Hasura connects GraphQL to the REST of the world
Big Data

Hasura connects GraphQL to the REST of the world

February 23, 2021
As Power BI aces Gartner’s new Magic Quadrant, what’s the story behind Microsoft’s success?
Big Data

As Power BI aces Gartner’s new Magic Quadrant, what’s the story behind Microsoft’s success?

February 19, 2021
Next Post
What are Data Pipelines ?

A Brief Summary of the Science of Data Analytics

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Recommended

Plasticity in Deep Learning: Dynamic Adaptations for AI Self-Driving Cars

Plasticity in Deep Learning: Dynamic Adaptations for AI Self-Driving Cars

January 6, 2019
Microsoft, Google Use Artificial Intelligence to Fight Hackers

Microsoft, Google Use Artificial Intelligence to Fight Hackers

January 6, 2019

Categories

  • Artificial Intelligence
  • Big Data
  • Blockchain
  • Crypto News
  • Data Science
  • Digital Marketing
  • Internet Privacy
  • Internet Security
  • Learn to Code
  • Machine Learning
  • Marketing Technology
  • Neural Networks
  • Technology Companies

Don't miss it

Berlin resident jailed for threatening to bomb NHS hospital unless Bitcoin ransom was paid
Internet Security

Berlin resident jailed for threatening to bomb NHS hospital unless Bitcoin ransom was paid

February 27, 2021
The Ethereum Virtual Machine (EVM)
Data Science

The Ethereum Virtual Machine (EVM)

February 27, 2021
Healthcare leaders debunk 3 myths about machine learning
Machine Learning

Providence exec explains the differences, their healthcare applications

February 27, 2021
Future Tech: Artificial Intelligence and the Singularity | by Jason Sherman | Feb, 2021
Neural Networks

Future Tech: Artificial Intelligence and the Singularity | by Jason Sherman | Feb, 2021

February 27, 2021
Chrome will soon try HTTPS first when you type an incomplete URL
Internet Security

Chrome will soon try HTTPS first when you type an incomplete URL

February 27, 2021
Cisco Releases Security Patches for Critical Flaws Affecting its Products
Internet Privacy

Cisco Releases Security Patches for Critical Flaws Affecting its Products

February 27, 2021
NikolaNews

NikolaNews.com is an online News Portal which aims to share news about blockchain, AI, Big Data, and Data Privacy and more!

What’s New Here?

  • Berlin resident jailed for threatening to bomb NHS hospital unless Bitcoin ransom was paid February 27, 2021
  • The Ethereum Virtual Machine (EVM) February 27, 2021
  • Providence exec explains the differences, their healthcare applications February 27, 2021
  • Future Tech: Artificial Intelligence and the Singularity | by Jason Sherman | Feb, 2021 February 27, 2021

Subscribe to get more!

© 2019 NikolaNews.com - Global Tech Updates

No Result
View All Result
  • AI Development
    • Artificial Intelligence
    • Machine Learning
    • Neural Networks
    • Learn to Code
  • Data
    • Blockchain
    • Big Data
    • Data Science
  • IT Security
    • Internet Privacy
    • Internet Security
  • Marketing
    • Digital Marketing
    • Marketing Technology
  • Technology Companies
  • Crypto News

© 2019 NikolaNews.com - Global Tech Updates