How Do You Win the Data Science Wars?  You Cheat By Doing The Necessary Pre-work!

January 20, 2019
in Data Science

Credit: Data Science Central

“If” by Rudyard Kipling

If you can keep your head when all about you
     Are losing theirs and blaming it on you,  

If you can trust yourself when all men doubt you,
    But make allowance for their doubting too;  

If you can wait and not be tired by waiting,
    Or being lied about, don’t deal in lies,

Or being hated, don’t give way to hating,
    And yet don’t look too good, nor talk too wise:

I’m sure that most data scientists have experienced that moment when they realize that the folks around them have no idea what they do.  That moment when someone walks up to them and says “I’ve got some data.  Can you do some data science on it?”

Many organizations started their data science journey by hiring a “data scientist” and asking him or her to perform magic on the data.  And while there are countless problems with that approach, companies quickly learned that 1) not everyone who calls themselves a data scientist is a data scientist (I can call myself young and dashing, but that don’t make it so) and 2) there is no magic when it comes to data science. Sorry, but as Chris Rock famously said:

There is no sex in the champagne room…

Data Science is very hard work, requiring experience and expertise in gathering (scraping in some cases) data from a wide variety of poorly documented and hard-to-access data sources; dealing with the incompleteness, inaccuracies, vagueness and poor documentation about the data; massaging, twisting and torturing that data into some useful form; and trying a seemingly endless number of analytic transformations, enrichments and algorithms in an attempt to find those combinations of variables and metrics that might yield a better predictor of performance (see Figure 1).

Figure 1:  Data Science Analytics Development Process

The key to the successful execution of the data science analytics development process – where success is defined as delivering predictive and prescriptive results that materially impact the operations and success of the business – is the pre-work. This means not an hour or so of “showing up and throwing up”, but an orchestrated engagement process with business stakeholders and subject matter experts. That process ensures that the data science team thoroughly and intimately understands the business opportunity under consideration; understands the metrics and KPIs against which progress and success will be measured; has identified, validated, valued and prioritized the decisions that need to be optimized in support of the business opportunity; and understands the costs of the analytics being wrong (the cost of False Positives and False Negatives).  So how do we ensure data science success?  We cheat.

We cheat: we do tons of pre-work before we ever “put science to the data”

As I discussed in the blog “Why Is Data Science Different than Software Development?”, the methodologies and processes that support successful software development do not work for data science projects because of one simple observation: software development knows, with 100% assurance, the expected outcomes, while data science – through data exploration and hypothesis testing, failing and learning – discovers those outcomes (see Figure 2).

Figure 2:  Key Differences Between Software Development and Data Science

Software development defines the criteria for success; Data Science discovers them

Consequently, the development folks and management sometimes do not understand and appreciate the significant amount of work that needs to be done “before ever putting science to the data” to give the data science development process the highest probability of success.  And for data scientists, that data science development process is about getting all the key business stakeholders, business executives and subject matter experts to “think like a data scientist”.

Figure 3:  Thinking Like a Data Scientist

It is critical to the success of the data science initiative not only to involve subject matter experts at the beginning of the engagement (because years of hands-on experience give them valuable insights into variables and metrics that might be better predictors of performance), but also to understand their work environment and decision-making processes to help drive the subsequent adoption of the analytics.

Check out the following sources for more details on the “Thinking Like a Data Scientist” process:

There are several data science pre-engagement prerequisites that we require before we ever “put science to the data” (see Figure 4). They include:

  • Creating a persona for each stakeholder or constituent that captures roles, responsibilities, pain points and key operational decisions.
  • Documenting the use cases that comprise the targeted business initiative or opportunity; document financial, operational and customer benefits and potential implementation risks for each use case.
  • Identifying, brainstorming and ranking internal and external data sources against the top priority use cases; assess data implementation risks associated with accessibility, completeness, granularity, accuracy, latency, documentation, etc.
  • Leveraging the Prioritization Matrix to conduct an envisioning exercise with key stakeholders and constituents to prioritize use cases (business value vs. implementation feasibility) and create a Use Case Roadmap that identifies use case interdependencies and prerequisites.
  • Developing a Hypothesis Development Canvas to ensure cross-organizational alignment by fleshing out priority use case business and data science requirements, including KPIs against which to measure success and progress; financial, customer and operational benefits; and costs associated with False Positives and False Negatives.

Figure 4:  Guide for Ensuring Data Science Success
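
To make the Prioritization Matrix step a bit more concrete, here is a minimal Python sketch of scoring use cases on business value vs. implementation feasibility. Everything in it is an assumption for illustration only: the 1-10 scales, the threshold of 5, the quadrant labels, the "value times feasibility" ranking rule and the example use cases are all hypothetical, not part of the methodology described above. In practice the scores come out of the stakeholder envisioning exercise, not out of code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class UseCase:
    name: str
    business_value: float  # hypothetical 1-10 score agreed on by stakeholders
    feasibility: float     # hypothetical 1-10 score for implementation feasibility


def quadrant(uc: UseCase, threshold: float = 5.0) -> str:
    """Place a use case in one of four quadrants (labels are illustrative)."""
    high_value = uc.business_value >= threshold
    high_feasibility = uc.feasibility >= threshold
    if high_value and high_feasibility:
        return "do first"
    if high_value:
        return "high value, hard to implement"
    if high_feasibility:
        return "easy, low value"
    return "deprioritize"


def rank(use_cases: list[UseCase]) -> list[UseCase]:
    """Order use cases by value x feasibility, highest first (an assumed rule)."""
    return sorted(
        use_cases,
        key=lambda uc: uc.business_value * uc.feasibility,
        reverse=True,
    )


# Hypothetical use cases and scores, for illustration only.
use_cases = [
    UseCase("Reduce customer churn", business_value=9, feasibility=7),
    UseCase("Predictive maintenance", business_value=8, feasibility=3),
    UseCase("Automate weekly report", business_value=3, feasibility=9),
]

for uc in rank(use_cases):
    print(f"{uc.name}: {quadrant(uc)}")
```

A simple product of the two scores is only one way to rank; a real roadmap would also have to account for the interdependencies and prerequisites between use cases.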

Now, I’m not saying that this is the only approach for doing the data science development pre-work, but this is what we’ve found works for us (and is the heart of the Big Data MBA methodology that I teach at the University of San Francisco School of Management).  This process leverages several design thinking concepts and techniques to first identify the prerequisites for success (we call it an Envisioning Workshop, and it typically takes 2 to 3 days to complete).  We then follow up the Envisioning Workshop with a Proof of Value (not a Proof of Concept) to prove out the business and operational value for the top use cases coming out of the Envisioning Workshop (see Figure 5).

Figure 5:  Leveraging Design Thinking and Envisioning to Drive Data Science Success
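
The costs of False Positives and False Negatives captured during the pre-work can feed directly into a Proof of Value. The sketch below shows one simple way to compare two candidate models by the total cost of their errors. It is only an illustration under stated assumptions: the dollar costs, the error counts and the model names are hypothetical, and a real engagement would take them from the business stakeholders and from a held-out test set.

```python
def total_error_cost(false_positives: int, false_negatives: int,
                     cost_fp: float, cost_fn: float) -> float:
    """Total cost of a model's mistakes, given a per-error cost for each type."""
    return false_positives * cost_fp + false_negatives * cost_fn


# Hypothetical per-error costs, e.g. for a churn use case.
COST_FP = 50.0    # an unnecessary retention offer
COST_FN = 400.0   # a churning customer the model missed

# Hypothetical error counts for two candidate models on the same test set.
models = {
    "model_a": {"fp": 120, "fn": 10},  # flags more customers, misses few
    "model_b": {"fp": 30, "fn": 40},   # flags fewer customers, misses more
}

costs = {
    name: total_error_cost(m["fp"], m["fn"], COST_FP, COST_FN)
    for name, m in models.items()
}
best = min(costs, key=costs.get)

for name, cost in costs.items():
    print(f"{name}: ${cost:,.0f}")
print(f"Lowest error cost: {best}")
```

With these made-up numbers, model_b makes fewer errors overall (70 vs. 130) but costs more, because the missed churners are the expensive mistake. That is the reason to agree on the two costs before any modeling starts.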

So there, I’ve given away all of our secrets.  No reason why anyone should ever just grab some data and expect the data science team to do magic.  Because remember, there is no sex in the champagne room.

Credit: Data Science Central By: Bill Schmarzo
