According to a report released by O’Reilly Media today, Python was the top search term on the company’s learning platform for the full 2019 calendar year, and it was the #1 programming language, based on content usage, for the same period. In its report, O’Reilly describes Python as “preeminent” and, while data on its platform may not be perfectly reflective of the wider world of developers and data professionals, the company is well respected in those arenas, with an array of popular books and tier-1 events, including Strata Data & AI Conference and OSCON.
The more you know
Anyone doing much work with data these days knows Python is hot, and that it’s become the programming language of choice for data science. So while the news of Python’s popularity may not be earth-shattering, there’s lots of interesting data that goes beyond that headline. For example, it’s not just that Python is popular; it’s that it’s still growing, in comparison to a similar analysis last year of search and usage data from 2018. And in a one-two punch, interest in Python’s “rivals” is decreasing.
Other categories on the upswing include data engineering, data science, and AI + ML, generally, and Kafka, specifically. Even interest in SQL showed growth. Meanwhile, interest in data management and Spark is down, and interest in Hadoop has dropped even more precipitously. According to O’Reilly’s report on the data, “Hadoop and its ecosystem of related projects (such as Hive) are in the midst of a protracted, years-long decline.” In fact, year-over-year, Hadoop and Hive are each down 34%, and Spark is down by 21%.
Cloud’s growing in general. Among the major public cloud providers, AWS, Azure and Google Cloud Platform (GCP) come in first, second and third in platform content usage, respectively, and all three are growing. But the growth in their numbers ranks in the opposite order, with usage of GCP-related content growing at almost 40%, Azure at 30% and AWS at 15%. And in the different, but related, world of container technology, content usage on Docker is down slightly while that of Kubernetes is up almost 40%.
Speaking more broadly, 2019 seems to have been a year for maturity and rigor. Usage of O’Reilly’s security-related content was up 26%. Enterprise architecture, though small in absolute terms, was up roughly 50%. Architectural patterns and serverless architecture were up significantly as well. Interestingly, DevOps, while still the top infrastructure + ops topic, was down 5% year-over-year.
Extensibility and popularity
Meanwhile, back to Python. The core language is a versatile one, but the real fuel for its momentum may be its extensibility and huge ecosystem support, resulting in a dizzying array of packages that extend the language. That’s certainly true in the machine learning arena, where Python packages like NumPy, Pandas, Scikit-learn and PyTorch seem to make up much of the de facto ML stack.
As a result, Python sample code is everywhere. And while I’m not much of a Python developer, I find Python code easy to read, and even to modify. That’s both good and bad: it means even novices may find the language approachable, but it may also mean that a chunk of the Python code out there isn’t the best, in terms of quality, performance or efficient use of the language and its packages. All the more reason, then, for the industry to be focusing on architecture, design and good development practices. Let’s all hope interest in those topics of technological robustness stays strong and grows in 2020.