On April 2, 2019, at the Galvanize Campus in San Francisco, California, Data.World will host an Afternoon of Data to raise questions and brainstorm insights about data literacy, an increasingly prominent concept in today’s society.
With the event two weeks out, its speakers, all prominent figures in the data space, weighed in on some widespread issues that surface as data literacy grows across industries.
What are the best hacks for streamlining datasets?
The term “dirty data” may sound catchy, but disorganized datasets can cause a lot of problems within a company. We asked our speakers to weigh in on how we can consistently streamline datasets for efficient use.
Ben Jones, Founder and CEO of Data Literacy, believes that obsessing over perfect datasets can lead to operational inefficiency. Fortunately, there are ways to reduce disorganization while also knowing where to stop. “What companies can’t afford to do is stop using data until it’s 100% clean, organized, and in one place,” said Jones. “I think it helps to start by taking stock of the existing data landscape and identifying a roadmap for building a better one. The order of the steps on said roadmap depends on three factors: cost and effort to get it done, business value resulting from the change, and any technical dependencies.”
Pallav Agrawal, Director of Data Science at Levi Strauss & Co., also views data governance as a constant weighing of priorities. “A good place to start is by speaking with the hands-on employees who are using data assets to learn how critical each asset is for them to perform their daily functions, and then tally the results to obtain a priority ranking of all assets,” he said.
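The tally Agrawal describes could look something like the minimal Python sketch below. The employee names, asset names, and the 1-to-5 criticality scale are all hypothetical, and summing scores is only one simple way to combine the answers, not a method the speakers prescribed.

```python
from collections import defaultdict

# Hypothetical survey responses: (employee, asset, criticality from 1 to 5).
responses = [
    ("ana", "sales_orders", 5),
    ("ana", "web_clickstream", 2),
    ("ben", "sales_orders", 4),
    ("ben", "inventory_snapshots", 5),
    ("chris", "inventory_snapshots", 3),
    ("chris", "web_clickstream", 1),
]


def rank_assets(responses):
    """Sum criticality scores per asset and sort from most to least critical."""
    totals = defaultdict(int)
    for _employee, asset, score in responses:
        totals[asset] += score
    # Sort by total score (descending), then by name so ties are deterministic.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_assets(responses)
for position, (asset, total) in enumerate(ranking, start=1):
    print(f"{position}. {asset} (total score {total})")
```

A team that wants to weigh heavy users more than occasional ones could swap the plain sum for a weighted one, but the ranking step stays the same.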
Finally, Lisa Green, Executive Director of Solve For Good, left us with her best set of steps to make sure that streamlining happens in line with company goals:
- Define what data assets you have.
- Survey data practitioners and data consumers in all departments on how they use the data.
- Design a test catalog with a small subset of your data that represents all the types of data you have in accurate proportion.
- Evaluate and iterate on the test catalog.
- Share the latest version of the test catalog with various stakeholders within the company to get buy-in or feedback on the need for further iteration.
- Scale up the final version of the test catalog to include all data assets.
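One way to read the "accurate proportion" step in the list above is as a stratified sample: draw the same fraction of assets from each data type. The Python sketch below illustrates that idea under stated assumptions. The inventory, the type labels, and the 10% fraction are hypothetical, and keeping at least one asset per type is a design choice of this sketch rather than part of Green's advice.

```python
import random
from collections import Counter, defaultdict


def proportional_sample(assets, fraction, seed=0):
    """Pick roughly `fraction` of the assets from each data type.

    Every type keeps at least one asset so that no type disappears
    from the test catalog.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_type = defaultdict(list)
    for name, data_type in assets:
        by_type[data_type].append(name)

    sample = []
    for data_type, names in sorted(by_type.items()):
        k = max(1, round(len(names) * fraction))
        for name in rng.sample(names, k):
            sample.append((name, data_type))
    return sample


# Hypothetical inventory: 60 tables, 30 log files, 10 spreadsheets.
assets = (
    [(f"table_{i}", "table") for i in range(60)]
    + [(f"log_{i}", "log") for i in range(30)]
    + [(f"sheet_{i}", "spreadsheet") for i in range(10)]
)

test_catalog = proportional_sample(assets, fraction=0.1)
print(Counter(data_type for _name, data_type in test_catalog))
```

Because the sample keeps the 6:3:1 mix of the full inventory, feedback gathered on the test catalog is more likely to carry over when it is scaled up to all data assets.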
How can companies procure the right data people?
Data strategies are only as strong as the people who apply them, and the unique combination of math skills, programming knowledge and business acumen required can be tough to find in potential job applicants. “Where many hiring managers fall down is in assessing candidates’ business skills and whether they can use them to tie all three skill sets together,” explained Green. Her solution involves crafting a hands-on interview process: “Choose an interview method that evaluates how well candidates understand a specific business problem faced by your company, how they would translate it to a data problem, and how they would communicate the results of their data solution to their non-technical colleagues within the company.”
Jones believes that an employee’s success has less to do with skills and more to do with culture fit. “Often hiring managers make the mistake of focusing narrowly on the knowledge or skills of a candidate. While it’s necessary to define these traits, it isn’t sufficient,” he said. “It’s also important to define the attitudes and behaviors that the person needs to have in order to thrive in the current environment and the one they’re hoping to build.”
Agrawal suggested that social media can be a powerful tool in cultivating a competitive talent pool. “Look through at least a few dozen LinkedIn profiles of data people and determine what types of projects and experience in a person’s profile excite you, and highlight it,” he advised. “Once you have a significant number of highlighted fields, find common patterns and use those along with the simple English descriptions in your job requirements.”
Noren sees the optimal career trajectories for budding data scientists as changing over time, with a background in engineering no longer being the ideal route into data science. “Meaningful change towards becoming a data-driven organization has to come from key leadership,” she said. “Avoid hiring expensive data scientists, machine learning engineers, and data infrastructure engineers if they won’t be working in a leadership structure that truly understands their value and how to leverage their assets.”
How can those currently outside the data science industry get the experience they need to be marketable?
Expectations for what data can deliver formed so quickly that companies have no choice but to meet demand despite a short supply of data scientists. A viable solution is to train existing employees as data scientists, since data is a language that more and more people need to speak to function in the modern workforce.
Fortunately, Noren believes that there are many ways to go about acquiring data knowledge. “For young data scientists, attending hackathons and competing in Kaggle competitions as a member of a cross-functional team is one way to go. More companies are offering internships. And for the incredibly self-motivated, preparing a project on a question relevant to the field one intends to enter and then using data science to address it and data visualization to present it would also be enough to get attention from some employers,” she said.
Agrawal offered up a dynamic strategy for breaking into the data space while still advancing one’s current career: “If an individual wants to be a data scientist, but is having difficulty finding a data science job due to lack of experience, then they should consider becoming part of a team that solves problems through the use of data such as product manager, data engineer, project manager, DevOps engineer or data analyst,” he suggested. “As they work with data scientists, they will learn how to think like one, while working in their personal time to build a portfolio of projects that demonstrate data-driven insight generation and problem-solving skills.”
Jones’ closing thought on acquiring data experience expanded beyond the scope of formal career paths. “There are so many great ways to build skills and gain experience in data these days. I highly recommend joining active data communities on Twitter, Slack, LinkedIn and other places,” he said. “There are always interesting projects and challenges going on, like the Makeover Monday project that asks participants to remake a chart each week, or Viz for Social Good that matches data workers with non-profits that are in need of talent. I think it’s very narrow-minded to think we can only build or apply our data skills in the context of our career. Why stop there?”
Feeling inspired by all things data?
The discussion continues with these featured speakers and more at Afternoon of Data in San Francisco on April 2nd.