From data quality issues to architecting and optimizing models and data pipelines, there are many considerations to keep in mind with regard to machine learning.
At Data Summit Connect, a free 3-day series of data-focused webinars, a session, titled “Unlocking the Power of Machine Learning,” provided a close look at the challenges involved in using machine learning, as well as the enabling technologies, techniques, and applications required to achieve your goals.
As part of the session, Rashmi Gupta, director of data architecture at KPMG LLC, explained how to use orchestration and version control tools to streamline datasets in a presentation titled “Operationalizing of Machine Learning Data.” She also discussed how to secure data while ensuring that production access control is streamlined for testing. A key challenge of machine learning, she noted, is operationalizing data volume, performance, and maintenance.
Challenges today in realizing the potential benefits of machine learning in the enterprise include data access issues (agility and security), data quality issues (disaggregated data with errors), lack of governance for validating and certifying model accuracy, and lack of collaboration between business and IT. If the underlying data is not accurate, then the organization will not be able to reach its goals with machine learning, said Gupta. What is needed is a centralized framework with governance that operates and integrates various capabilities to support multiple domain solutions. Gupta highlighted market leaders for machine learning platforms as well as the advantages of various tool choices.
Outlining the best practices for machine learning success, Gupta said, organizations should:
- Start with a fully scalable and fault-tolerant platform that is extensible and integrates with other machine learning platforms and open source technologies to avoid vendor lock-in and ensure flexibility.
- Develop a framework that integrates with other key capabilities to promote distributed DevOps, MLOps, and DataOps.
- Create shared repositories and a storefront to enable secure collaboration portals for business and tech teams and ensure that business domain knowledge is included.
- Finally, Gupta said, establish governance and monitoring processes that create controls and policies for user security, data and model security and accuracy, and implement continuous monitoring.
Adding to the discussion, Andy Thurai, thought leader, blogger, and chief strategist at the Field CTO (thefieldcto.com), shared how infusing AI into operations can lead to improvements in his presentation, “AIOps the Savior for Digital Business Unplanned Outages.”
Citing MarketsandMarkets research that the AIOps market is set to be worth $11 billion by 2023, Thurai said that after starting with the automation of IT operations tasks, AIOps has now moved beyond rudimentary RPA, event consolidation, and noise reduction use cases into mainstream use cases such as root cause analysis, service ticket analytics, anomaly detection, demand forecasting, and capacity planning.
According to Thurai, a 2019 ITIC survey of 1,000 business executives found that 86% of respondents estimated the cost of an outage at $300,000 per hour, while 33% put it as high as $1 million an hour. The research also found that the average unplanned service outage lasts 4 hours and that outages occur an average of twice per year.
Thurai noted that AIOps, a term coined by Gartner, refers to the use of big data, modern machine learning, and other advanced analytics technologies to directly and indirectly enhance IT operations functions (including monitoring, automation, and service desk processes) with proactive, personal, and dynamic insight. AIOps, he noted, allows concurrent use of data sources, data collection, analytics technologies, and presentation technologies.
Thurai offered three common use cases where AIOps can offer benefits: event consolidation, which reduces “noise” and alleviates alert fatigue; anomaly detection; and root cause analysis, since a large percentage of outages have been found to stem from problematic changes, and outages can be shortened if those changes are identified earlier. Additional advanced use cases include service ticketing and help desk scheduling, demand forecasting, capacity planning, botnet detection and traffic isolation, ticket enhancements, and proactive support.
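To make the anomaly detection use case concrete, here is a minimal sketch (not from the session, and far simpler than a production AIOps platform) of one common approach: flagging metric readings that deviate sharply from a rolling baseline, so that only genuinely unusual events generate alerts. The function name, window size, and threshold are illustrative assumptions.

```python
from collections import deque
from statistics import mean, stdev

def detect_anomalies(values, window=20, threshold=3.0):
    """Flag indices whose value deviates more than `threshold`
    standard deviations from the mean of the previous `window`
    readings (a simple rolling z-score detector)."""
    history = deque(maxlen=window)  # sliding baseline of recent readings
    anomalies = []
    for i, v in enumerate(values):
        if len(history) == window:  # only score once the baseline is full
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(v - mu) / sigma > threshold:
                anomalies.append(i)
        history.append(v)
    return anomalies

# Steady latency readings with one spike at index 25
readings = [100.0 + (i % 5) for i in range(25)] + [500.0] \
    + [100.0 + (i % 5) for i in range(10)]
print(detect_anomalies(readings))  # → [25]
```

Real AIOps tools layer far more on top of this (seasonality, multivariate correlation, learned baselines), but the principle is the same: alert on deviation from learned normal behavior rather than on fixed thresholds, which is what cuts the alert noise Thurai described.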
Webcast replays of Data Summit Connect, a free 3-day webinar series held Tuesday, June 9 through Thursday, June 11, will be made available on the DBTA website.