Founded in 1998, VMware went public on August 14, 2007. Founded a year after VMware in 1999, Salesforce.com held its initial public offering three years earlier on June 23, 2004. From a temporal standpoint, then, the two companies are peers. Which is one reason that it’s interesting to examine how they have performed to date, and how they might perform moving forward.
Another is the fact that they represent, at least for now, radically different approaches to the market. VMware, of course, is the dominant player in virtualization and markets adjacent to that space. Where Microsoft built its enormous value in part by selling licenses to the operating system upon which workloads of all shapes and sizes were run, VMware’s revenue has been driven primarily by the sales of software that makes operating systems like Windows virtual.
Salesforce.com, on the other hand, has been since its inception the canonical example for what eventually came to be known as Software-as-a-Service, the sale of software centrally hosted and consumed via a browser. Not only is Salesforce.com’s software – or service, as they might prefer it be referred to – consumed very differently than software from VMware, the licensing and revenue model of services-based offerings is quite distinct from the traditional perpetual license business.
It has been argued in this space many times previously (see here, for example) that the perpetual license model that has dominated the industry for decades is increasingly under pressure from a number of challengers ranging from open source competition to software sold, as in the case of Salesforce.com, as a service. In spite of the economic challenge inherent to running a business where your revenue is realized over a long period of time that competes with those that get all of their money up front, the bet here is that the latter model will increasingly give way to the former.
Which is why it’s interesting, and perhaps informative, to examine the relative performance, and market valuation, of two peers that have taken very different approaches. What does the market believe about the two models, and what does their performance tell us if anything?
It is necessary to acknowledge before continuing that in addition to the difference in terms of model, Salesforce and VMware are not peers from a categorical standpoint. The former is primary an applications vendor, its investments in platforms such as Force.com or Heroku notwithstanding, while the latter is an infrastructure player, in spite of its dalliances with application plays such as Zimbra. While the comparison is hardly apples to apples, therefore, it is nevertheless instructive in that both vendors compete for a percentage of organizational IT spend. But keep that distinction in mind regardless.
As for the most basic metric, market capitalization, VMware is currently worth more than Salesforce: $42 billion to $35 billion as I write this. Even the most cursory examination of some basic financial metrics will explain why. Since it began reporting in 2003, Salesforce.com has on a net basis lost just over $350 million. VMware, for its part, has generated a net profit of almost $4 billion.
If we examine the quarterly gross profit margins of the two entities over time, these numbers make sense.
Apart from some initially comparable margins for Salesforce, VMware has consistently commanded a higher margin than its SaaS-based counterpart. While this chart likely won’t surprise anyone who’s tracked the markets broadly or the companies specifically, however, it’s interesting to note the fork in the 2010-2011 timeframe. Through 2010, the margins appeared to be on a path to converge; after, they diverged aggressively. Nor is this the only metric in which this pattern is visible.
We see the same trajectories in a plot of the quarterly net income, for example, but this is to be expected since this is in part a function of the profit generated.
The question is what changed in the 2010 timeframe, and one answer appears to be investments in scale. The following chart depicts the gross property, plant and equipment charges for both firms over the same timeframe.
Notice in particular the sharp spike beginning in 2010-2011 for Salesforce. After six years as a public entity, the company began investing in earnest, which undoubtedly had consequences partially reflected in the charts above. VMware’s expenditures here are interesting for their part, because the conventional wisdom is that it is the services firms like Salesforce, Amazon, Google or increasingly Microsoft whose PP&E will reflect their necessary investments in the infrastructure to deliver services at scale. Granted, VMware’s vCloud Hybrid Service is intended to serve as a public infrastructure offering, but this was intended to be an “an asset-light model” which presumably would not have commanded the same infrastructure investments. Nevertheless, VMware outpaced Salesforce until 2014.
The question, whether it’s from the perspective of an analyst or an investor, is what the returns have been on this dramatic increase in spending. Obviously in Salesforce’s case, its capital investments have dragged down its income and margins, while VMware’s dominant market position has allowed it to not only sustain its pricing but grow its profitability. But what about revenue growth?
One of the strongest arguments in favor of SaaS products is convenience; it’s far less complicated to sign up to a service hosted externally than it is to build and host your own. If convenience is indeed a driver for adoption, greater revenue growth is one potential outcome: if it’s easier to buy, it will be bought more. This is, in fact, a necessity for Salesforce: if you’re going to trade losses now for growth, you need the growth. To some extent, this is what we see when we compare the two companies.
The initial decline from 70+ percent growth for both companies is likely the inevitable product of simple math: the more you make, the harder it is to grow. While we can discount the first half of this chart, the second half is intriguing in that it is a reversal of the pattern we have seen above. While VMware solidly outperformed Salesforce at a growing rate in profit and income, Salesforce, beginning about a year after its PP&E investments picked up, has grown its revenue at a higher rate than has VMware. Early in this period you could argue the rate differential was a function of revenue disparities, but the delta between the revenue numbers last quarter was less than 10%.
In general, none of these results should be surprising. VMware has successfully capitalized on a dominant position in a valuable market and the financial results demonstrate that. Salesforce, as investors appear to have recognized, is clearly trading short term losses against the longer term return. While there is some evidence to suggest that Salesforce’s strategy is beginning to see results, and VMware is probably paying closer attention to its overall ability to grow revenue, it’s still very early days. It’s equally possible that one or both are poor representatives for their respective approach. It will be interesting to monitor these numbers over time, however, to try and test how the two models continue to perform versus one another.
As we settle into a roughly bi-annual schedule for our programming language rankings, it is now time for the second drop of the year. This being the second run since GitHub retired its own rankings forcing us to replicate them by querying the GitHub archive, we are continuing to monitor the rankings for material differences between current and past rankings. While we’ve had slightly more movement than is typical, however, by and large the results have remained fairly consistent.
One important trend worth tracking, however, is the correlation between the GitHub and Stack Overflow rankings. This is the second consecutive period in which the relationship between how popular a language is on GitHub versus Stack Overflow has weakened; this run’s .74 is in fact the lowest observed correlation to date. Historically, the number has been closer to .80. With only two datapoints indicating a weakening – and given the fact that at nearly .75, the correlation remains strong – it is premature to speculate as to cause. But it will be interesting to monitor this relationship over time; should GitHub and Stack Overflow continue to drift apart in terms of programming language traction, it would be news.
For the time being, however, the focus will remain on the current rankings. Before we continue, please keep in mind the usual caveats.
- To be included in this analysis, a language must be observable within both GitHub and Stack Overflow.
- No claims are made here that these rankings are representative of general usage more broadly. They are nothing more or less than an examination of the correlation between two populations we believe to be predictive of future use, hence their value.
- There are many potential communities that could be surveyed for this analysis. GitHub and Stack Overflow are used here first because of their size and second because of their public exposure of the data necessary for the analysis. We encourage, however, interested parties to perform their own analyses using other sources.
- All numerical rankings should be taken with a grain of salt. We rank by numbers here strictly for the sake of interest. In general, the numerical ranking is substantially less relevant than the language’s tier or grouping. In many cases, one spot on the list is not distinguishable from the next. The separation between language tiers on the plot, however, is generally representative of substantial differences in relative popularity.
- In addition, the further down the rankings one goes, the less data available to rank languages by. Beyond the top tiers of languages, depending on the snapshot, the amount of data to assess is minute, and the actual placement of languages becomes less reliable the further down the list one proceeds.
(click to embiggen the chart)
Besides the above plot, which can be difficult to parse even at full size, we offer the following numerical rankings. As will be observed, this run produced several ties which are reflected below.
6 C++ / Ruby
17 Visual Basic
19 Clojure / Groovy
- R: Advocates of R will be pleased by the language’s fourth consecutive gain in the rankings. From 18 in January of 2013 to 13 in this run, the R language continues to rise. Astute observers might note by comparing plots that this is in part due to growth on GitHub; while R has always performed well on Stack Overflow due to the volume of questions and answers, it has tended to be under-represented on GitHub. This appears to be slowly changing, however, in spite of competition from Python, issues with the runtime itself and so on.
- Go: Like R, Go is sustaining its upward trajectory in the rankings. It didn’t match its six place jump from our last run, but the language moved up another spot and sits just outside the Top 20 at 21. While we caution against reading much into the actual placement on these rankings, where differences between spots can over-represent only marginal differences in performance, we do track changes in trajectory closely. While its 21st spot, therefore, may not distinguish it materially from the languages directly above or behind it, its trendline within these rankings does. Given the movement to date, as well as the qualitative evidence we see in terms of projects moving to Go from other alternatives, it is not unreasonable to expect Go to be a Top 20 language within the next six to twelve months.
- Perl: Perl, on the other hand, is trending in the opposite direction. Its decline has been slow, to be fair, dropping from 10 only down to 12 in our latest rankings, but it’s one of the few Tier 1 languages that has experienced a decline with no offsetting growth since we have been conducting these rankings. While Perl was the glue that pulled together the early web, many believe the Perl 5 versus Perl 6 divide has fractured that userbase, and at the very least has throttled adoption. While the causative factors are debatable, however, the evidence – both quantitative and qualitative – points to a runtime that is less competitive and significant than it once was.
- Julia/Rust: Two of the first quarter’s three languages to watch – Elixir didn’t demonstrate the same improvement – continued to their rise. Each jumped 5 spots from 62/63 to 57/58. This leaves them still well outside the second tier of languages, but they continue to climb in our rankings. For differing reasons, these two languages are proving to be popular sources of investigation and experimentation, and it’s certainly possible that one or both could follow in Go’s footsteps and make their way up the rankings into the second tier of languages at a minimum.
- Swift: Making its debut on our rankings in the wake of its announcement at WWDC is Swift, which checks in at 68 on our board. Depending on your perspective, this is either low for a language this significant or impressive for a language that is a few weeks old. Either way, it seems clear that – whatever its technical issues and limitations – Swift is a language that is going to be a lot more popular, and very soon. It might be cheating, but Swift is our language to watch this quarter.
Big picture, the takeaway from the rankings is that the language diversity explored most recently by my colleague remains the norm. While the Top 20 continues to be relatively static, we do see longer term trends adding new players (e.g. Go) to this mix. Whatever the resulting mix, however, it will ultimately be a reflection of developers’ desires to use the best tool for the job.
Following the departure of Steve Ballmer, one of the outgoing executive’s defenders pointed to Microsoft’s profit over his tenure relative to a number of other competitors. One of those was Salesforce.com, which compared negatively for the simple reason that it has not generated a net profit. This swipe was in keeping with the common industry criticism of other services based firms from Amazon to Workday. As far as balance sheets are concerned, services plays – be they infrastructure, platform, software or a combination of all three – are poor investments. Which in turn explains why the upward trajectory common to the share prices of firms of this type has generated talk of a bubble.
Recently, however, Andreessen Horowitz’ Preethi Kasireddy and Scott Kupor questioned in print and podcast form the mechanics of how SaaS firms in particular are being evaluated. The source will be an issue for some, undoubtedly, as venture capitalists have a long history of creatively interpreting financial metrics and macro-industry trends for their own benefit. Kasireddy and Kupor’s explanations, however, are simple, digestible and rooted in actual metrics as opposed to the “eyeballs” that fueled the industry’s last recreation of tulipmania.
The most obvious issue they highlight with services entities versus perpetual software models is revenue recognition. Traditional licenses are paid up front, which means that vendors can apply the entire sale to the quarter it was received which a) provides a revenue jolt and b) helps offset the incurred expenses. Services firms, however, have typically incurred all of the costs of development and customer acquisition up front but are only able to recognize the revenue as it is delivered. As they put it,
The customer often only pays for the service one month or year at a time — but the software business has to pay its full expenses immediately…
The key takeaway here is that in a young SaaS business, growth exacerbates cash flow — the faster it grows, the more up-front sales expense it incurs without the corresponding incoming cash from customer subscriptions fees.
The logical question, then, is this: if services are such a poor business model, why would anyone invest in companies built on them? According to Kasireddy and Kupor, the answer is essentially the ratio of customer lifetime value (LTV) to customer acquisition costs (CAC). Their argument ultimately can be reduced to LTV, which they calculate by some basic math involving the annual recurring revenue, gross margin, churn rate and discount rate, and then measure against CAC to produce a picture of business’s productivity. The higher the multiple of a given customers’s lifetime value relative to the costs of acquiring same, obviously, the better the business.
Clearly businesses betting on services are doing so – at least in part, competitive pressures are another key driver – because they believe that while the benefits of upfront perpetual licensing are tempting, the more sustainable, higher margin approach over the longer term is subscriptions. Businesses like Adobe who have made this transition would not willingly leave the capital windfall on the table otherwise. Which means that while we need more market data over time to properly evaluate Kasireddy and Kupor’s valuation model over time, in particular their contention that services plays will be winner-take-all by default, it is difficult to argue the point that amortizing license fees over time allows vendors to extract a premium that is difficult to replicate with up front, windfall-style licensing. Even if this alternative model cannot justify current valuations, ultimately, it remains a compelling argument for why services based companies should not be evaluated in the same manner as their on premise counterparts.
But there is one other advantage to services based businesses that Kasireddy and Kupor did not cover, or even mention. If you listen to the podcast or read the linked piece, the word “data” is mentioned zero times. Which is an interesting omission, because from this vantage point data is one of the most crucial structural advantages to services based businesses. There are others, certainly: the two mention the R&D savings, for example, that are realized by supporting a single version of an application versus multiple versions over multiple platforms. But potentially far more important from my perspective is the inherent advantage IaaS, PaaS and SaaS vendors have in visibility. Services platforms operating at scale have the opportunity to monitor, to whatever level of detail they prefer, customer behaviors ranging from transactional velocity, collaboration rates, technology preferences, deployment patterns, seasonal consumption trends – literally anything. They can tell an organization how this data is trending over time, they can compare a customer against baselines of all other customers, customers in their industry, or direct competitors.
Traditional vendors of on premise technologies sold under perpetual licenses need to ask permission to audit the infrastructure in any form, and to date vendors have been reluctant to grant this permission widely, in part due to horrifically negative experiences with vendor licensing audit teams. By hosting the infrastructure and customers centrally, however, services based firms are granted virtually by default information that sellers of on premise solutions would only be able to extract from a subset of customers. There is a level of permission inherent in working off of remotely hosted applications. It is necessary to accept the reality that literally every action, every behavior, every choice can (and should) be tracked for later usage.
How this data is ultimately used depends, of course, on the business. Companies like Google might leverage the data stored and tracked to serve you more heavily optimized ads or to decide whether or not to roll out more widely a selectively deployed new feature. Amazon might help guide technology choices by providing some transparency into which of a given operating system, database and so on was more widely used, and in what context. The Salesforce and Workday’s of the world, meanwhile, can observe in detail business practices, compare them with other customers and then present those findings back to its customers. For a fee, of course.
Which is ultimately why data is an odd asset to ignore when discussing the valuation of services firms. Should they execute properly, vendors whose products are consumed as a service are strongly differentiated from software-only players. They effectively begin selling not just software, but an amalgam of software and the data-based insights gleaned from every other user of the product – a combination that would be difficult, if not impossible, for on premise software vendors to replicate. Given this ability to differentiate, it seems likely that services firms over time will command a premium, or at the very least introduce premium services, on top of the base software experience they’re delivering. This will become increasingly important over time. And over time, the data becomes a more and more significant barrier to entry, as Apple learned quite painfully.
This idea of leveraging software to generate data isn’t new, of course. We at RedMonk have been writing about it since at least 2007. For reasons that are not apparent, however, very few seem to be factoring it into their public valuations of services based businesses. Pieces like the above notwithstanding, we do not expect this to continue.
“Even though the UNIX system introduces a number of innovative programs and techniques, no single program or idea makes it work well. Instead, what makes it effective is the approach to programming, a philosophy of using the computer. Although that philosophy can’t be written down in a single sentence, at its heart is the idea that the power of a system comes more from the relationships among programs than from the programs themselves. Many UNIX programs do quite trivial things in isolation, but, combined with other programs, become general and useful tools.”
- The UNIX Programming Environment, Brian Kernighan and Rob Pike
“Is the ‘all in one’ story compelling or should we separate out a [redacted] LOB?” is a question we fielded from a RedMonk client this week, and it’s an increasingly common inquiry topic. For years, functional accretion – in which feature after feature is layered upon a basic application foundation – has been the norm. The advantage to this “all in one” approach is that buyers need only to make one decision, and refer to one seller for support. This simple choice, of course, has a cost: the jack of all trades is the master of none.
At RedMonk, we have been arguing for many years that the developer has evolved from virtual serf to de facto kingmaker. Accepting that, at least for the sake of argument, it is worth asking whether one of the unintended consequences of this transition may be a return to Unix-style philosophies.
The most obvious example of this in the enterprise market today is so-called microservices. Much like Unix programs, many services by themselves are trivial in isolation, but leveraged in concert can be tremendously powerful tools. This demands, of course, an Amazon-level commitment to services, such that every facet of a given infrastructure may be consumed independently and on demand – and this level of commitment is rare. But with even large incumbents increasingly focused on making their existing software portfolios available as services, the trend towards services broadly is real and clearly sustainable.
The trend towards microservices, which are much more granular in nature, is more recent and thus more difficult to project longer term (particularly given some of the costs), but certainly exploding in popularity. Sessions like Adrian Cockroft’s “Migrating to Microservices” are regularly standing room only with lines wrapping down two halls. The parallels between the Unix philosophy and microservices are obvious, in that both essentially are devoted to the idea of composable applications built from programs that do one thing well.
These types of services are difficult to sell to traditional IT buyers, who might not understand them well enough, would prefer to make a single decision or both. But developers understand the idea perfectly, and would prefer to choose a service that does what they need it to over one that may or may not do what they need but does ten things they don’t. It’s easy, then, to see microservices as the latest manifestation of the developer kingmaker.
It’s not as easy, however, to understand a similar trend in the consumer application space. In recent months, rather than continue trying to build a single application that serviced both file sharing and photo sharing needs, Dropbox split its application into the traditional Dropbox (Files) and the newly launched Carousel (Photos). Foursquare today released an application called Swarm, which essentially forks its business into two divisions: Foursquare, a Yelp competitor, and Swarm, a geo-based social network. Facebook, meanwhile, ripped out the messaging component of its core application in April because, as Mark Zuckerberg described it:
The reason why we’re doing that is we found that having it as a second-class thing inside the Facebook app makes it so there’s more friction to replying to messages, so we would rather have people be using a more focused experience for that.
Like enterprise tech, consumer technology has been trending towards all-in-one for sometime as pieces like the “Surprisingly Long List of Everything Smartphones Replaced” highlight. But if Facebook, Foursquare and Twitter are any indication, it may be that a Unix philosophy renaissance is underway in the consumer space as well, even if the causative factors aren’t as obvious.
All of which means that our answer to the opening question should come as no surprise: we advised our client to separate out a new line of business. Particularly when developers are involved, it remains more effective to offer products that do one thing well, however trivial. As long as the Unix-ization of tech continues, you might consider doing the same.
Among the predictions in this space for the year 2014 was the idea that disruption was coming to storage. Having looked at the numbers, this prediction may have been off: disruption had apparently already arrived. By my math, these are EMC’s revenue growth rates for the last four years for its Information Infrastructure business: 18.43% (2010), 17.92% (2011), 2.05% (2012), 3.48% (2013). While the Information Infrastructure includes a few different businesses, Information Storage – what EMC is best known for – is responsible for around 91% of the revenue for the Information Infrastructure reporting category. And Information Infrastructure, in turn, generates 77% of EMC’s total consolidated revenue – the rest is mostly VMware (22%).
All of this tells us two things. One, that EMC has seen a multi-year downward trajectory in its ability to grow its storage business, and two, that storage is responsible for the majority of the company’s revenue. Put one and two together and it’s clear that the company has a problem.
How the company has reacted to these developments, meanwhile, can help observers gain a better understanding of what EMC believes are the causes to this under-performance. Based on the announcements at EMC World, it’s easy to sum up the company’s strategic response in one word: software. From ScaleIO to ViPR to the acquisition of DSSD and its team of ex-Solaris engineers, a lot of the really interesting news at EMC World was about software, which is an interesting shift for a hardware company. EMC is committed enough to its software strategy, in fact, that it’s willing to directly compete with its subsidiaries.
If it’s true that EMC is betting heavily on software to restore its hardware growth, the next logical question is whether this is the appropriate response. Based on what happened to the major commodity compute players – Dell has gone private, HP is charging for firmware and IBM left the market entirely – it’s difficult to argue for a different course of action. It seems unlikely that the optimal approach moving forward for EMC – or any other storage provider, for that matter – is going to be heavy hardware engineering. There are customers, particularly loyal EMC customers, that are hungry for hardware innovation and will continue to pay outsized margins for that gear moving forward. There are many more customers, however, willing to explore software abstractions layered on top of commodity hardware, otherwise known as software-defined storage. There’s a reason that EMC’s primary points of comparison were vendors like Amazon and Google rather than its traditional competitors.
Like its counterparts in the networking space who are coping with the implications of software-defined offerings in their space, EMC essentially had two choices: bury its head in the sand and pretend that the business is fine, or begin to tactically incorporate disruptive elements as part of a longer term strategy for adapting its business. Which is another way of saying that the company only really had one realistic choice, which to its credit was made: EMC is clearly adapting. Software-defined storage was a common topic of discussion at the company’s event this week, and while there are still areas where the embrace is awkward, the company clearly understands the challenge ahead and is taking steps to adjust its product lines and the models behind them. The transition to what it calls the “third platform” – EMC’s terminology for the cloud – will pose monumental challenges to the business longer term, but by betting on software EMC is investing in the area most likely to deliver differentiated value over time.
The biggest problem with the transition to the “third platform,” however, isn’t going to be their engineering response. As the company likes to point out, it is investing heavily in both M&A and traditional R&D, and with names like Bechtolsheim, Bonwick, Shapiro et al coming on board it’ll have the requisite brainpower available. But the problem with its current strategy is that it does little to prioritize convenience. As we’ve seen in the cloud compute segment, customers are increasingly willing to trade performance and features for speed and ready availability. And like most systems vendors, EMC is not currently built to service this type of demand directly; they will instead have to settle for an arms supplier-type role. Even in software, which is intrinsically simpler to make available than the hardware EMC has traditionally sold, the company keeps assets like ViPR locked up behind registration walls. In a market in which technology decisions are being made based more on what’s available than what’s good, that’s an issue.
The gist of all this, then, in the wake of EMC World is that the company is inarguably adapting to a market that’s rapidly changing around it, but has tough problems to solve in availability and convenience. The loyalty of EMC accounts is absolutely an asset, one that the company will need to rely on as customers make the “third platform” transition moving forward. But the company also needs to remember that technology decision making at those loyal EMC accounts has changed materially, and is increasingly advantaging players like Amazon at the expense of incumbents.
The focus on software engineering, therefore, is appropriate and welcome, but insufficient by itself to address the coming transition. Only a focus on reducing the friction of adoption, and improving developer engagement, can fix that.
Disclosure: EMC is a client, as are Amazon and VMware.
While the term SOA was lost to marketers years ago, the underlying concept may be in the process of making a comeback. Though the term itself has become a bad word outside of the most conservative enterprises and suppliers today, constructing applications from services has clear and obvious benefits. In his instant classic post about his time at Amazon, Google’s Steve Yegge described Amazon’s journey towards an architecture composed of services this way:
So one day Jeff Bezos issued a mandate…His Big Mandate went something along these lines:
1) All teams will henceforth expose their data and functionality through service interfaces.
2) Teams must communicate with each other through these interfaces.
3) There will be no other form of interprocess communication allowed: no direct linking, no direct reads of another team’s data store, no shared-memory model, no back-doors whatsoever. The only communication allowed is via service interface calls over the network.
4) It doesn’t matter what technology they use. HTTP, Corba, Pubsub, custom protocols — doesn’t matter. Bezos doesn’t care.
5) All service interfaces, without exception, must be designed from the ground up to be externalizable. That is to say, the team must plan and design to be able to expose the interface to developers in the outside world. No exceptions.
6) Anyone who doesn’t do this will be fired.
Like Cortez’s soldiers, the Amazon employees got to work if for no other reason than they had no choice. The result, in part, is the Amazon you see today, the same one that effectively owns the market for public cloud services at present. Much as enterprises have historically writen off Adrian Cockcroft’s Netflix lessons with statements like “it only works for ‘Unicorns’ like Netflix,” most have convinced themselves that the level of service-orientation that Amazon achieved is effectively impossible for them to replicate. Which is, to be fair, likely true absent the Damoclean incentive Bezos put in place at Amazon. What’s interesting, however, is that many of those same enterprises are likely headed towards increased levels of abstraction and service-orientation, whether they realize it or not.
The most obvious example of this trend at work is the unfortunately named (Mobile) Back-end-as-a-Service category of providers. From Built.io to Firebase to Kinvey to the dozen other providers in the space, one the core value propositions is shortening the application development lifecycle by composing applications from a collection of services. Rather than building identity, location, and similar common services into the application from scratch, BaaS providers supply the necessary libraries to access externally hosted services. Which means that the application output of these providers is intrinsically service-oriented by design.
Elsewhere, in the adjacent Platform-as-a-Service space, providers are essentially advancing the same concept. In building an application on Engine Yard or Heroku, for example, developers are not required to implement their own datastores or caching infrastructure, but rather may leverage them as services – whether that’s Hadoop, MongoDB, memcached, MySQL, PostgreSQL, Redis, or Riak. Even IBM is planning to make the bulk of its software catalog consumable as a service by the end of the year. Which is logical, because the differentiation for PaaS providers is likely to be above the platform itself, as it is in the open source operating system market.
Consider on top of all of the above the existing traction for traditional SaaS offerings, and the reality is that it’s getting harder to build applications that are not dependent in some way upon services. And for those applications that are not yet, vendors are likely to make it increasingly difficult to maintain that independence as they move into services as a hedge against macro-issues with the sales of stand alone software.
There’s a reason, in other words, that micro-services are all the rage at the moment: services are how applications are being built today.
— Pivotal (@gopivotal) March 11, 2014
As announced yesterday at their inaugural analyst conference, Cloudera – the first commercial backer of the Hadoop project – has secured $160 million in new financing, bringing their total raised capital to $300 million. Because it remains, at least for now, a private company, precise details on their finances remain unavailable. What has been disclosed, however, is that approximately seventy percent of their revenue derives from what the company refers to as “software.” The question that no one seems to be asking today is what the word “software” actually means in this context.
Twenty years ago, the logistics of software businesses were straightforward. Vendors employed developers, though they were systemically undervalued at the time, to collectively author a product that was explicitly designed to be sold to a specific buyer. By the late 1980′s in the case of the enterprise, this was typically the Chief Information Officer. While software was virtual by nature it was typically distributed in physical form, whether that was a floppy disk, a CD-ROM or, later, a DVD. Software businesses, in other words, looked a great deal like traditional manufacturing businesses: they constructed a product, shipped it to buyers who put it to work in their own environments. Often with assistance from the vendor or certified third parties, true, but the moving pieces of the industry owed a great deal to traditional manufacturing.
Today, things have changed. Many enterprise software vendors are still operating as they have for the past few decades, but pressure from competing models is mounting.
The distribution of software, of course, has changed dramatically. Physical media has for years been an anachronism, as it became much more efficient for buyer and seller to leverage digital distribution. But things hardly stopped there: from Salesforce.com’s 2004 IPO forward, mainstream customers began to assess more critically the costs and benefits of installing software on premises versus merely consuming it as a service. The company – whose product resembled traditional packaged application (CRM) with the exception of its delivery model – drew a bright line between its business and that of traditional vendors. Salesforce famously campaigned around a message of “No Software.” Its toll-free number, in fact, remains 1-800-NO-SOFTWARE. For some, this idea is comically inaccurate: without software, there is no Salesforce.com. It is still software, it’s just delivered via a different medium and managed by the vendor rather than the customer. For Salesforce, and presumably some subset of its customer base, however, the belief is that the distinction between SaaS and the traditional packaged software shipped to, installed on and managed from a customer’s premises is significant enough to render it a completely new product. A product, therefore, that should not be referred to as “software.”
Whatever one makes of the semantics of this argument, changes in the nature of software availability, delivery and procurement are clearly impacting markets and the incumbents who previously dominated them.
While Oracle may attribute its adjusted earnings miss this quarter to currency fluctuations, or its shortfall two quarters ago to a “lack of urgency” in its saleforce, the reality is that the trend line for its sales of new licenses has been problematic for well over a decade. Notably, this declining ability to sell new licenses of its software overlaps with rising adoption of open source software and SaaS packages, among other competitive models. This is no coincidence. While the underlying business and revenue models for open source and SaaS differ, they share one common advantage over traditional software distribution models such as Oracle: they are far easier to acquire.
Microsoft, another firm built largely on traditional software distribution and acquisition models, has similarly struggled to compete with more available alternatives. The company has telegraphed its level of concern with the viability of its traditional models moving forward with its massive investments in more available infrastructure (Azure), and this is appropriate. On the consumer front, Microsoft has seen Apple’s operating system distribution and pricing model shift from physical media priced at $200 to a free download. Within the server market, meanwhile, Microsoft’s biggest challenge of late has been competing in the rapidly growing (and inherently convenient) public cloud market where operating system licensing fees are near zero in most cases.
Microsoft and Oracle collectively generated billions of dollars of wealth according to the simple model described above: they manufactured software, shipped it to customers who were ultimately responsible for its installation, implementation, maintenance and usage. As barriers to hardware and software both have broken down, due to technical approaches such as open source, the public cloud or SaaS, the traditional model became increasingly subject to disruption. Adaptations from both have included the incorporation of the very models they were disrupted by: open source, public cloud and SaaS.
Absent substantial pressure, it’s unlikely that either of these businesses would have strayed from the courses that made them and their shareholders very wealthy. But selling software in the traditional manner became, and is becoming, more difficult to do.
Which brings us to the definitional question of what it means to be a “software” business. If companies as successful and well capitalized as Microsoft and Oracle are struggling with disruptions to traditional software distribution models, it would seem important for every software vendor to consider carefully what it means when it defines itself as a “software” vendor.
Given the market’s clear and accelerating preference for software models built on convenience – be that cloud, open source, SaaS or otherwise – it is useful for vendors and buyers alike to consider where a given product falls on a spectrum of customer effort.
At one end we have traditional software players, for whom procurement begins with on-site sales visits, nothing is free and access to the software is jealously guarded. From a customer perspective, this is a high effort model: the procurement process is time intensive, the costs of the software are high, and the customer typically bears the risk of implementation because they are paying for the software on an up front basis.
At the other end of the spectrum lies SaaS, which is accessible to anyone with a browser and payable via a credit card. The customer effort metric for SaaS is low: the software is comparatively easily acquired, so procurement is less involved, and implementation risk and effort is largely shifted to the vendor – for a longer term premium, of course.
Somewhere in between these models, but closer to the SaaS end of the spectrum, lie both open source software and public cloud services. Both are substantially advantaged versus traditional software and infrastructure in acquisition. Procurement is so low effort for customers, in fact, that both open source and public cloud services are frequently leveraged within organizations without technical leadership being aware of that fact. Because they do not represented a finished product, however, more implementation effort is required versus SaaS. Customer evaluation for public cloud and open source relative to SaaS – or PaaS, for that matter – is typically an evaluation of the tradeoffs between convenience and control.
Directionally, while there are obvious exceptions, from a macro perspective the market is actively shifting away from the traditional model towards the latter two examples. Which implies that vendor strategies must adapt to this changing reality; vendors whose model allows only for traditional models of software distribution and consumption will be at a significant disadvantage moving forward. Many, in fact, are already alarmingly behind. If a given software vendor isn’t at least considering strategic shifts in consumption and delivery of software in its market, it has a serious problem.
In a world in which the only option for customers is to purchase software up front and assemble and leverage it at their own risk, traditional software sales would remain robust, because software is a basic necessity. In today’s market context, however, where customers have a wealth of options, from purchasing, installing and running their own software stack to fully outsourcing same to constructing hybrid applications composed of micro-services (i.e. managed software exposed as an API), software vendors must cater to customer’s desires for convenience and effort minimization.
Even smaller software vendors must actively plan for a future in which they are not merely handing off a software product to customers and hoping for the best, but actively delivering it over a network, managing and monitoring it on behalf of customers in a public cloud, integrating data into the product and so on. This poses immense operational challenges, of course, as most pure software vendors are not appropriately resourced to deliver their software in a network context, for example. But if they don’t, they can be sure competitors will.
Vendors, whether that’s Cloudera or Pivotal as featured in the quote above, will continue to point to “software” as their primary revenue source. But the reality is that when successful companies say “software” they will actually mean software plus some combination of public cloud infrastructure, hardware/appliance, automated management/monitoring capabilities, hosted micro-services, and data enabled analytics. The majority of which is software, of course. Just not strictly software as we have been conditioned to think of it.
Which is why in a growing number of cases, the term “software company” may become as obsolete as the media they once distributed the product on.
Disclosure: Cloudera, Pivotal and Salesforce are current RedMonk customers. Microsoft has been a customer but is not currently, and neither Apple nor Oracle is a RedMonk customer.
Initial reactions to Facebook’s acquisition of WhatsApp primarily centered on price. Which is understandable, given the valuation. Here is an incomplete list of 25 technology companies the market values less than Facebook values WhatsApp.
- Analog Devices
- Computer Associates
- Dassault Systemes
- Activision Blizzard
- Red Hat
- Electronic Arts
- Level 3
- F5 Networks
On the one hand, the service boasts over 450 million users – 200 million or so more than Twitter. And the deal math essentially values each user at approximately $42 which is, after adjusting for inflation, less than what AOL paid per ICQ user ($49) in 1998. It’s also considerably less than Twitter’s per user valuation, which is around $127, according to Friday’s market prices.
On the other hand, unlike Twitter, WhatsApp has been distinctly reluctant to monetize users via advertising. Messaging is also a highly competitive market, one in which users have a wide variety of credible alternatives – which we’ll come back to. And while WhatsApp is technically a paid application, it’s free for the first year and the maximum revenue per user under the current model is $0.99 annually. As a side note on that subject, it’s worth noting that based on growth charts it would appear that 200+ million of WhatsApp’s users have signed up in the last year, meaning that they have not yet been forced to make a choice whether to purchase the app or turn to free alternatives. All of which suggests that the company was not acquired because of its revenue generation potential.
Instead, as most of the subsequent analyses have acknowledged, WhatsApp was presumably acquired for strategic reasons. One rationale sees WhatsApp as a hedge against perceived defections from and stagnation within Facebook’s core platform – primarily amongst younger users. Another argues WhatsApp represents a strategic bid to assist Facebook in the zero sum battle for a user’s overall attention. A third is to tap one of the largest sources of messaging traffic in the world for data mining and analysis in order to deliver more effective advertising results for one or both platforms. There are 19 billion reasons to believe, however, that the incentive here was a combination of all of the above, as well as a wide number of other factors, meaning that the impetus for the deal is not likely to be found in a spreadsheet. Which in turn is why the deal seems perfectly defensible to some and absolutely baffling to others.
If we set aside the numbers, however entertaining debating them might be, it’s reasonable to acknowledge that messaging is, by itself, a fundamentally important channel. It has become, almost by accident, the default communication mechanism for individuals all over the world. Regional dynamics – specifically lower cost SMS services and fewer network boundaries in the US – have led to asymmetrical adoption of SMS-competitors such as WhatsApp from country to country, but the overall numbers are inarguable: messaging is an enormous, and therefore strategic, market. Which helps explain why the likes of Facebook and Google have been courting players like Snapchat and WhatsApp.
Strategic though the market may be, however, the dynamics of messaging make it an odd market to parse. Consider the following characteristics:
Messaging networks are, by and large, private. Snapchat’s visibility, in fact, is in part based on the self-destructing nature of its messages. Recent trends, in fact, suggest growth not only in private communications but anonymous ones such as Secret or Whisper. By contrast, Facebook and Twitter, for all that they have been brought in to discussions of messaging in the wake of the WhatsApp acquisition, are fundamentally different channels. They are typically public by default; Facebook has periodically irked users by exposing publicly content they wished to keep private. This distinction is particularly important for advertising business models. Layering advertisements into conversations one has with the public are disruptive but generally accepted. Injecting them into private conversations, many of which are one to one, is highly problematic.
While Facebook, Twitter and other social media sites are not exactly the Library of Congress in terms of their longevity, relative to WhatsApp and other messaging devices they may as well be cuneiform tablets. Twitter has recently made available a user’s entire history via an archive, and Facebook’s Timeline feature is an attempt to give the service relevance over longer periods of time. Messaging services, on the other hand, are typically for ephemeral, and thus largely disposable, content. No one would argue that these temporal limitations make messaging irrelevant; but it is difficult to compare throwaway messages to friends and family with to the more persistent timelines of other social media services. Distinguishing between degrees of transience may seem trivial, but it is in all likelihood the reason there are multiple competitive messaging services against comparatively fewer social networking alternatives. It’s simple for users to manage multiple messaging services for different groups of friends, where similarly fragmented usage would destroy the utility of a social network.
Questions of network effect are likewise important. Much has been made of the importance of the “address book,” the idea being that unlike desktop or web based applications, smartphone apps are much more likely to have access to up-to-date contact information, and therefore the most important network to a given individual. But while this advantages messaging apps over desktop and web rivals, it also means that smartphone-based alternatives have a near equal playing field. None of WhatsApp, LINE, Snapchat and so on have preferential access to my contacts, meaning that the switching costs between one service and another – while not non-existent – are theoretically marginal. With even a moderate group consensus, it’s possible to immediately switch from one messaging app to another – as indeed is occurring already in some quarters. When WhatsApp had three hours of downtime this weekend, the free alternative Telegram saw 5 million new registrants.
The choice of WhatsApp for Facebook is a bid for user volume. But what about the technology? Much has been made of WhatsApp’s impressively low employee count (55) given both the valuation and message volume handled, and understandably so. It’s interesting to note, however, that WhatsApp lacks some of the platform features common to competitors like LINE and WeChat, which has impacted WhatsApp’s popularity in multiple Asian geographies. Even compared to lesser known alternatives such as Telegram, it lacks basic features. Telegram, for example, is accessible from multiple devices including a desktop; WhatsApp, by comparison, is tied to a single device – and architected for same. The coming addition of voice calls, meanwhile, is a potentially interesting differentiator but the appeal to its messaging-centric audience is uncertain. There’s a reason, after all, that voice usage has consistently declined globally in favor of messaging systems, and it’s not strictly price.
The most curious thing about analyses of the WhatsApp deal, however, is that they almost universally fail to mention iMessage as a competitor. If you search this piece by Ben Evans or this piece by Ben Thompson for example – both of which are recommended – neither includes iMessage as potential competition for WhatsApp. Which is technically understandable: iMessage is (at present, at least) limited to iOS, and is only really optimal in group messaging if the entire group is using iOS devices. But in terms of service potential, iMessage has a few key advantages over competition such as WhatsApp.
- Rather than make a clean break from SMS systems, it embraces (and extinguishes?) them.
- It is embedded in the operating system: no downloads are necessary.
- It is accessible using desktop clients.
- Every registered Apple user is a de facto user.
The network effects – and potential lock-in – to iMessage is something many non-iOS users have probably experienced. Either through limitations of the service (for group messaging, most obviously), or for deliberate purpose (limited SMS packages, for example), people carrying competitive devices may actively be excluded from conversations amongst iOS users.
In sum, then, we have a market that is widely accepted as strategically important, but with a higher degree of efficiency than other markets such as social media. Users can and will engage and leverage multiple messaging services, with few able to manage the kind of lock-in characteristic of sites like Facebook or Twitter. For this reason, large investments in the space, while of theoretically high upside because of their effortless acquisition of new users, are likely to be comparatively high risk.
The fact that Facebook has publicly committed to running WhatsApp as a separate business may be a function of this risk. One advantage older, incumbent technology providers have had over younger rivals in recent years was their understanding that even mergers that make sense on paper are difficult to execute properly, and more likely than not to fail. This, more than the costs involved, is why large scale acquisitions in the technology industry are comparatively rare. It is intrinsically difficult to merge businesses. One wonders if WhatsApp being maintained as a stand alone business, or Google’s recent decision to maintain Nest as a separate entity following the $3.2B acquisition, is a sign that the technology upstarts are learning from the likes of EMC and VMware.
Speaking of Google, meanwhile, what will be most interesting to observe in the wake of WhatsApp’s acquisition is how the search giant responds. While not commonly credited with such, Apple already posseses a competitor in iMessage. Google, curiously enough, does not: while its Android Hangouts app handles both its instant messaging and traditional SMS, the two channels are kept entirely separate.
This is problematic for Google moving forward. For Apple, success with iMessage (Apple does not disclose subscriber numbers) is welcome, but non-essential. At its core, Apple’s value proposition to users is the devices it sells (a lesson Blackberry failed to learn, BBM or no). Google, on the other hand, depends as much on a given user’s attention as does Facebook. The more time users spend with the latter – or wholly owned subsidiaries like WhatsApp – the less time available for interaction with Google properties. Which explains Google’s persistent interest in messaging services. But not why they feel compelled to pay the kinds of outsized premiums for inorganic growth. One imagines that if Google, who is held up as Apple’s superior when it comes to services, could deliver something similar to iMessage in Android, it would be in a position to deliver at least comparable numbers to WhatsApp competitors based on the number of Android devices activated per day. And if it made this service available cross-platform on iOS in particular, as it has with its other Google services (which admittedly may not be technically feasible), it would represent a very formidable competitor indeed. Google’s lack of direct attention to the messaging opportunity has been, frankly, perplexing.
$19 billion dollars is a hell of a shot across the bow, though.
Because the RedMonk Programming Language Rankings have tended to be fairly stable over time, one of the more common questions we get following a release concerns community volatility. More specifically, many are curious if the individual data sources themselves – GitHub and Stack Overflow – tend to be less constant over time than the correlation of same. To explore this question, we examined the GitHub Archive using a simple query fetching the number of repositories created per programming language per quarter beginning in 2011. Per the GitHub Archive, their data only goes back as far as February 12, 2011, so Q1 of that year here is short data for a little over a month’s worth of activity. As with the Programming Language Rankings, this excludes forks. And in an effort to make the data more accessible, this analysis focuses on a subset of the list, the Top 10 programming languages by our rankings. The findings are interesting, and seem to raise as many questions as they answer.
First, consider the following chart of repositories created on GitHub per quarter for each of our top ten programming languages.
(click to embiggen the chart)
The dramatic growth in late 2011 and early 2012 is not particularly surprising. Less predictable, however, was the three consecutive quarters of decline (Q312 to Q113) for the surveyed languages. To be clear, this is a decline in newly created repositories per quarter within our ten language sample only: the chart does not suggest a decline in overall activity on the platform. Still, given the significance of these ten languages to GitHub and the seemingly corrective nature of the dip, the company’s $100M round early in the third quarter of 2012 appears well timed.
The major question, however, remains the aforementioned surge. Further mining of the dataset is needed to try and ascertain a cause, but the good news for GitHub is that even absent a surge growth as measured by repository creation appears to be healthy. Unless valuations, then, were built on an assumption of Q312-Q113 growth rates, the impact of this anomalous spike should be minimal. It will also be interesting to assess growth rates outside of the subset here; was the decline in this sample offset by volume growth in other, less popular languages on the platform?
To provide a clearer picture of how these languages have performed in repository creation relative to one another, the following motion chart is made available. The data pictured is the ranking of each language in terms of repository creation, measured quarterly from 2011 through 2013. The motion charts provide three different lenses by which this data can be viewed, not to mention subsetted, over time. Click the play button in the bottom left hand corner to advance the dataset over time, and the majority of the visualization is interactive and clickable.
In the near future we’ll explore the history of the other axis of our rankings, Stack Overflow, for this same sample set of languages to assess the relative differences in trajectories between the two communities.
Even by the standards here, these 2014 predictions are late. But feeling that late is better than never, let the following serve as my official forecast for the year – or what’s left of it – ahead. As always, these predictions are based off of the information at hand, which in some cases might be quantitative forecasts and in others little more than rumor and speculation. The common thread, for me, is the belief that each will occur. Historically at RedMonk, we are better at predicting developments than their timing, and these predictions are no exception. Nevertheless, the following predictions are made for the year 2014.
One new wrinkle this year, as suggested by Bryan Cantrill, is the introduction of categories. In years past, all predictions have been issued as if they were of equal probability. The fact is, however, that not all predictions are created equal: some are more probable than others. To that end I’ve grouped this year’s predictions into five sections, ranging from very likely to theoretically possible. This will help when evaluting the predictions, because it will reveal the relative substance behind them.
With that said, on to the predictions.
38% or Less of Oracle’s Software Revenue Will Come from New Licenses
As discussed in November of 2013 and July of 2012, while Oracle has consistently demonstrated an ability to grow its software related revenues the percentage of same derived from the sale of new licenses has been in decline for over a decade.
Down to 38% in 2013 from 71% in 2000, there are no obvious indications that 2014 will buck this trend.
The Biggest Problem w/ IoT in 2014 Won’t Be Security But Compatibility
All signs indicate that 2014 is likely to be a big year for the Internet of Things. From large scale M&A activity to conference traction, interest in network-attached devices is spiking. There’s seemingly a new Kickstarter every day for some new IoT project, and even traditional companies are moving into the space more aggressively. There are two big problems, however, looming. The first is security. Bruce Schneier, as always, deconstructs the issues here clearly in this piece from Wired. He is entirely correct: there is a reckoning on the horizon with respect to IoT devices, and just as was the case with PC security it’s going to get worse before it gets better. And the damage from compromised hardware is likely to be far greater; what if attackers were able to turn off tens of thousands of thermostats during freezing temperatures?
My bet is that the second problem, however, is the more acute in the year ahead. Specifically, compatibility. Part of the promise of IoT devices is that they can talk to each other, and operate more efficiently and intelligently by collaborating. And there are instances already where this is the case: the Nest Protect smoke alarm, for example, can shut off a furnace in case of fire through the Nest thermostat. But the salient detail in that example is the fact that both devices come from the same manufacturer. Thus far, most of the IoT devices being shipped are designed as individual silos of information. So much so, in fact, that an entirely new class of hardware – hubs – has been created to try and centrally manage and control the various devices, which have not been designed to work together. But while hubs can smooth out the rough edges of IoT adoption, they are more band-aid than solution.
And because this may benefit market leaders like Nest – customers have a choice between buying other home automation devices that can’t talk to their Nest infrastructure or waiting for Nest to produce ones that do – the market will be subject to inertial effects. Efforts like the AllSeen Alliance are a step in the right direction, but in 2014 would-be IoT customers will be substantially challenged and held back by device to device incompatibility.
Windows 7 Will Be Microsoft’s Most Formidable Competitor
The good news for Microsoft is that Windows 7 adoption is strong, with more than twice the share of Windows XP, the next most popular operating system according to Statcounter. The bad news for Microsoft is that Windows 7 adoption is strong.
With even Microsoft advocates characterizing Windows 8 as a “mess,” Microsoft has some difficult choices to make moving forward. Even setting aside the fact that mobile platforms are actively eroding the PC’s relevance, what can or should Microsoft tell its developers? Embrace the design guidelines of Windows 8, which the market has actively selected against? Or stick with Windows 7, which is widely adopted but not representative of the direction that Microsoft wants to head? In short, then, the biggest problem Microsoft will face in evangelizing Windows 8 is Windows 7.
The Low End Premium Server Business is Toast
Simply consider what’s happened over the last 12 months. IBM spun off its x86 server business to Lenovo, at a substantial discount from the original asking price if reports are correct. Dell was forced to go private. And HP, according to reports, is about to begin charging customers for firmware updates. Whether the wasteland that is the commodity server business is more the result of defections to the public cloud or big growth from ODMs is ultimately irrelevant: the fact is that the general purpose low end server market is doomed. This prediction would seem to logically dictate decommitments to low end server lines from other businesses besides IBM, but the bet here is that emotions win out and neither Dell nor HP is willing to cut that particular cord – and Lenovo is obviously committed.
2014 Will See One or More OpenStack Entities Acquired
Belatedly recognizing that the cloud represents a clear and present danger to their businesses, incumbent systems providers will increasingly double down on OpenStack as their response. Most already have some commitment to the platform, but increasing pressure from public cloud providers (primarily Amazon) as well as proprietary alternatives (primarily VMware) will force more substantial responses, the most logical manifestation of which is M&A activity. Vendors with specialized OpenStack expertise will be in demand as providers attempt to “out-cloud” one another on the basis of claimed expertise.
The Line Between Venture Capitalist and Consultant Will Continue to Blur
We’ve already seen this to some extent, with Hilary Mason’s departure to Accel and Adrian Cockcroft’s move to Battery Ventures. This will continue in large part because it can represent a win for both parties. VC shops, increasingly in search of a means of differentiation, will seek to provide it with high visibility talent on staff and available in a quasi-consultative capacity. And for the talent, it’s an opportunity to play the field to a certain extent, applying their abilities to a wider range of businesses rather than strictly focusing on one. Like EIR roles, they may not be long term, permanent positions: the most likely outcome, in fact, is for talent to eventually find a home at a portfolio company, much as Marten Mickos once did at Eucalyptus from Benchmark. But in the short term, these marriages are potentially a boon to both parties and we’ll see VCs emerge as a first tier destination for high quality talent.
Netflix’s Cloud Assets Will Be Packaged and Create an Ecosystem Like Hadoop Before Them
My colleague has been arguing for the packaging of Netflix’s cloud assets since November of 2012, and to some extent this is already occurring – we spoke to a French ISV in the wake of Amazon reInvent that is doing just this. But the packaging effort will accelerate in 2014, as would-be cloud consumers increasingly realize that there is more to operating in the cloud than basic compute/network/storage functionality. From Asgard to Chaos Monkey, vendors are increasingly going to package, resell and support the Netflix stack much as communities have sprung up around Cassandra, Hadoop and other projects developed by companies not in the business of selling software. To give myself a small out here, however, I don’t expect much from the ecosystem space in 2014 – that will only come over time.
Disruption Finally Comes to Storage and Networking in 2014
While it’s infrequently discussed, networking and storage have proven to be largely immune from the aggressive commoditization that has consumed first major software businesses and then low end server hardware. They have not been totally immune, of course, but by and large both networking and storage have been relatively insulated against the corrosive impact of open source software – in spite of the best efforts of some upstart competitors.
This will begin to change in 2014. In November, for example, Facebook’s VP of hardware design disclosed that they were very close to developing open source top-of-rack switches. That open source would eventually come for both the largely proprietary networking and storage providers was always inevitable; the question was timing. We are beginning to finally seen signs that one or both will be disrupted in the current year, whether its through collective efforts like the Open Compute Project or simply clever repackaging of existing technologies – an outcome that seems more likely in storage than networking.
To get an idea of what the impact of these disruptions might be, it’s worth considering Cisco’s own (reported) evaluation of what would happen if the networking giant aggressively embraced SDN: “They concluded it would turn Cisco’s ‘$43 billion business into a $22 billion business.’” The stakes are high, in other words.
The Most Exciting Infrastructure Technology of 2014 Will Not Be Produced by a Company That Sells Technology
More and more today the most interesting new technologies are being developed not by companies that make money from software – one reason that traditional definitions of “technology company” are unhelpful – but from those that make money with software. Think Facebook, Google, Netflix or Twitter. It’s not that technology vendors are incapable of innovating: there are any number of materially interesting products that have been developed for purposes of sale.
But if necessity is the mother of invention, few to none have the necessity that the web scale companies do.
Google Will Buy NestGoogle Will Move Towards Being a Hardware Company
In the wake of Google’s acquisition of Nest, which I cannot claim with a straight face that I would have predicted, this prediction probably would have been better positioned in the Safe or Likely categories, as it seemed to indicate a clear validation of this assertion. But then they went and sold Motorola to Lenovo, effectively de-committing from the handset business.
While Google’s true motivations in that sale are a matter of speculation, however, the fact is that Google is still on its way to becoming a hardware company. Not necessarily in revenue, at least in the short term, although revenue is part of the argument. Consider the following chart.
This is why people think of Google as an advertising company: because it is – at least from a revenue standpoint. The overwhelming majority of its revenues has historically, and does at present, come from ads. Slightly less each year, it’s true, but even today 95% of Google’s revenue is ad revenue. Not products, not services, ads.
But within the ad revenue numbers, there’s an interesting trend that ought to concern Google.
In 2005, almost $0.50 of every advertising dollar Google earned came from non-Google network sites. By 2012, that number was down to $0.29. By itself, this isn’t a problem. Because Google’s network has grown substantially over that seven year span, it’s understandable that a higher percentage of advertising revenue would come from it. This is, essentially, the model that Android, Google Fiber and so on are built on. But Google has always had to hedge against an increasingly fragmented online world. Android is one such, but Google’s M&A behavior indicates that hardware will play a bigger role than many expect. Nest is what everyone remembers when they think of Google and hardware, both because of the valuation and the high visibility of its flagship product. But Google’s got more hardware plays, Motorola or no Motorola: Glass, self-driving cars, Boston Dynamics’ robots and so on. And speaking of Motorola, not many people noticed, but Google kept their Advanced Technology and Products unit during the Lenovo transaction.
What exactly Google plans to do with all of this hardware talent is as yet unclear, but it wouldn’t be a surprise to see Google begin making more substantial forays into the home, as the apparently mothballed Android@Home initiative was intended to do. Having first colonized the browser experience and then the handset, the home – and the car – are logical next steps. So while I don’t expect hardware to show up in the balance sheet in a meaningful way in 2014, it seems probable that by the end of the year we’ll be more inclined to think of Google as a hardware company than we do today.
Possibly related: what is Google doing with Hangar One?
Google Will Acquire IFTTT
Acquisitions are always difficult to predict, because of the number of variables involved. But let’s say, for the sake of argument, that you a) buy the prediction that a major problem with the IoT is compatibility and b) that you believe Google’s becoming more of a hardware company broadly and IoT company over time: what’s the logical next step if you’re Google? Maybe you contemplate the acquisition of a Belkin or similar, but more likely you (correctly) decide the company has quite enough to digest at the moment in the way of hardware acquisitions. But what about IFTTT?
By more closely marrying the service to their collaboration tools, Google could a) differentiate same, b) begin acclimating consumers to IoT-style interconnectivity, and c) begin generating even more data about consumer habits to feed their existing (and primary) revenue stream, advertising.
Consider the mobile implications more specifically. Let’s assume, as Ben Thompson does – I believe correctly – that the viable business models in mobile are a) devices and b) services. Google has made clear which of those it will bet on, whether it was the sale of Motorola, the increasing distance between AOSP and the Google services behind it, or comments like these. Google’s existing services are solid, and market leading in some categories. But IFTTT offers the ability to extend these services and combine them with others in interesting, creative ways – particularly when Nest is factored in as well.
How the market would react is another question – at least anecdotally, the blowback from the Nest pickup appeared to be severe – but on paper at least IFTTT would be a very interesting complement to Google’s consumer services play, and just as importantly, one capable of feeding its still-advertising based bottom line.
As long as we have been doing our programming language rankings here at RedMonk, dating back to the original publication by Drew Conway and John Myles White, we have been trying to find the correct timing. Should it be monthly? Quarterly? Annually? While the appetite for up to date numbers is strong, the truth is that historically changes from snapshot to snapshot have been minimal. This is in part the justification for the shift from quarterly to bi-annual rankings. Although we snapshot the data approximately monthly, there is little perceived benefit to cranking out essentially the same numbers month after month. There are more volatile ranking systems that reflect more ephemeral, day-to-day metrics, but how much more or less popular can a programming language realistically become in a month, or even two? The aspect of these rankings that most interests us is the trajectories they may record: which languages are trending up? Which are in decline? Given that and the adoption curve for languages in general, the most reliable approach would seem to be one that measures performance over multi-month periods at a minimum.
Previously, GitHub’s Explore page ranked their top programming languages – theoretically by repository – and we simply leveraged those rankings in our plot. For reasons that are not clear, this provided ranking has been retired by GitHub and is thus no longer available for our rankings. Instead, this plot attempts to duplicate those rankings by querying the GitHub Archive on Google’s BigQuery. We select and count repository languages, excluding forks, for the Top 100 languages on GitHub. Without knowing precisely how GitHub produced their own rankings, however, we can’t be sure we’re duplicating their methods exactly. And there is some evidence to suggest that the new method is an imperfect replica. Previous iterations have produced correlations between GitHub’s rankings and Stack Overflow’s as high as .82 but never one lower than .78. This quarter’s iteration is the lowest yet at .75. It’s possible, of course, that this is reflective of nothing more than a natural divergence between the two communities. But it’s equally possible that our new method is slightly different, and therefore producing slightly distinct results, than in previous iterations. Until and unless GitHub decides to resume publishing of their own rankings, however, this is the best method available to us. This must be kept in mind when comparing these results against previous iterations.
Besides that notable caveat, there are a few others to reiterate here before we get to the plot and rankings.
- To be included in this analysis, a language must be observable within both GitHub and Stack Overflow.
- No claims are made here that these rankings are representative of general usage more broadly. They are nothing more or less than an examination of the correlation between two populations we believe to be predictive of future use, hence their value.
- There are many potential communities that could be surveyed for this analysis. GitHub and Stack Overflow are used here first because of their size and second because of their public exposure of the data necessary for the analysis. We encourage, however, interested parties to perform their own analyses using other sources.
- All numerical rankings should be taken with a grain of salt. We rank by numbers here strictly for the sake of interest. In general, the numerical ranking is substantially less relevant than the language’s tier or grouping. In many cases, one spot on the list is not distinuishable from the next. The separation between language tiers on the plot, however, is generally representative of substantial differences in relative popularity.
- In addition, the further down the rankings one goes, the less data available to rank languages by. Beyond the top 20 to 30 languages, depending on the snapshot, the amount of data to assess is minute, and the actual placement of languages becomes less reliable the further down the list one proceeds.
With that, here is the first quarter plot for 2014.
(embiggen the chart by clicking on it)
Because the plot doesn’t lend itself well to understanding precisely how languages are performing relative to one another, we also produce the following list of the Top 20 languages by combined ranking. The change in rank from our last snapshot is in parentheses.
- Java (-1)
- C# (+2)
- Python (-1)
- C++ (+1)
- Ruby (-2)
- CSS (new)
- Shell (-2)
- Scala (-1)
- R (1)
- Matlab (+3)
- Clojure (+5)
- CoffeeScript (-1)
- Visual Basic (+1)
- Groovy (-2)
A few observations of larger trends:
- Gains for C++/C# / Losses for Python/Ruby: It’s tough to say which was more odd from the result set: the slight gains from the compiled languages or the slight declines from the interpreted alternatives. To be clear, it’s dangerous to read much into the wider popularity of any of these runtimes based on these results. Ohloh, for one, does not concur with the trajectories implied.
But they do represent a change at least within this result set – which has been relatively static. There are some who are – anecdotally, at least – arguing that a C++ renaissance is underway. Until we see more hard data, it’s probably safest to chalk the small change in fortunes here up to statistical noise, but we’ll be watching compiled language trends closely and looking to test the hypothesis wherever possible.
- Clojure Makes the Top 20: For the first time since we began surveying, Clojure joins its JVM-based counterpart Scala as a Top 20 language. It is the continuing success not only of Java the language but JVM-based alternatives that makes the regular “Java is dead” arguments so baffling.
- Statistical Language Popularity: Both R and Matlab experienced gains this quarter, and this was the third consecutive quarter of growth for R in particular. While, as the plot indicates, these languages tend to outperform on Stack Overflow relative to GitHub, they are indicative of a continued rise in popularity for statistical analysis languages more broadly.
- The Rise of Go: Go, which we termed a notable performer in last year’s Q1 ranking, continued its rise. It checked in just outside the Top 20 at 22 this quarter, a gain of six spots from last quarter.
- Languages to Watch: In the intial run of the data for this quarter, Julia, Rust and Elixir finished back to back to back. After making a correction to the GitHub Archive query and re-running the data, they finished Julia, Rust and then Elixir one spot removed from Rust. Regardless, while these are not going to challenge for Top 20 rankings within the near future (Julia performs best at 62), they are each languages to watch, with notable followers and contributors. We’ll keep an eye on each as we move along.
Big picture, the takeaway from the rankings is that language diversity is the new norm. The Top 20 continues to evidence strong diversity in domain, and even non-general purpose languages like Matlab and R are borderline mainstream from a visibility perspective. Expect this to continue, with specialized tools being heavily leveraged alongside of general purpose alternatives, rather than being eliminated by same.
Given that it’s January, it’s time for predictions. Which means that it is first time to revisit last year’s predictions; for those interested here are the recaps for 2010, 2011 (Parts 1 and 2) and 2012. A few words before we continue, however.
Historically, the predictions here have been heavily based on logical extrapolations based on extant market factors. The advantage of this approach is that it tends to be fairly accurate; the worst I have performed in the three prior years of scoring was 67% correct. The disadvantage, however, is that this approach yields predictions that are grounded in fact but somewhat less exciting, as predictions go. These predictions have been accused, at times, of being “boring.”
In response to
public shaming a suggestion from Bryan Cantrill last year, however, I broke with this tradition and was more aggressive with my predictions for 2013. On the plus side, this yielded predictions that were at least theoretically more compelling. On the downside, however…well, you’ll see.
On to the predictions.
Apple Will Be Rumored to Be Looking at Yahoo
“Hence speculation about potential acquisitions like Twitter that might infuse the company with the DNA necessary to compete effectively with Google, Microsoft and others in the services arena.
While many names will continue to circulate, the prediction here is that one name that will emerge in 2013 will be Yahoo. Long regarded within the industry as a company adrift, one that excelled at alienating and hemorrhaging its best technical talent, the 2012 hire of Marissa Mayer seems to have something of a stabilizing effect on the company. Any potential turn around is likely years away, but Mayer seems to have at least stopped the bleeding. And with some recent successes like the Flickr iPhone app, they may be righting the ship.
But that won’t stop the rumors from circulating at some point, particularly if Yahoo continues to rebound under Mayer’s tenure. And to the Stephen O’Grady of January 2014, no, this piece does not count as a rumor.”
Being unable to count last year’s piece, the question is whether there were rumors of a Yahoo acquisition of Apple. The answer to that depends on how one defines rumors. Certainly Apple and Yahoo grew closer (“Yahoo, Apple Discuss Deeper iPhone Partnership“) but that does not constitute rumors of a merger. Many argued that the two companies were a logical match, see: “Should Apple Just Buy Yahoo!?”, The Motley Fool or “Why Apple Should Buy Yahoo!”, The Street. Those are rumors, right?
Unfortunately, no. Arguments from analysts – financial or industry – do not by themselves constitute a rumor. While I still believe in the logic behind this prediction, therefore, without even a single story citing unnamed “sources” concerning an acquisition, this prediction must be counted a miss.
The Biggest Innovation in Smartphones Will Be Pricing
“On Apple.com right now, the cheapest unsubsidized iPhone is $649.00. Even if one concedes that Apple’s design and polish is worth paying a premium for, the question is whether it’s worth twice as much as an Android device with comparable specifications. Currently, Apple is relying upon carriers to make them price competitive by presenting customers with consumer afforable pricepoints from $199 to $399.
But if Google and its partners can continue to make high end devices available at a price point half that of what Apple is charging, something has to give. And that may well be Apple’s margins. Rumors of Apple’s low-cost iPhone indicate that they are not only aware of this pricing umbrella, but poised to eliminate it.”
If I’m correct, the cost of smartphones will come down substantially in the next twelve months.”
On the one hand, there’s the $99 iPhone 5C and the $349 Nexus 5. On the other, there’s the $649 iPhone 5S, the $649 Samsung Galaxy S4 and the $599 HTC One. Clearly the market for higher end, higher margin handsets is being sustained. Just as clearly, however, signs of erosion – like the 5C itself – are evident.
This isn’t entirely a hit, but it’s not entirely a miss. I’ll call it a push.
Collaboration Innovation and M&A Will Spike
“Rapportive was not the first collaborative software add-on startup to be acquired (LinkedIn, 02/12), but it will certainly not be the last. With more collaboration softare users operating in SaaS environments, the opportunities expand horizontally for developers with bright ideas to target the space. Whether startups like 410 Labs (Mailstrom), Baydin (Boomerang) and so on get acquired or merely have their features replicated by the larger platforms they plug in to is unclear, but like Rapportive before them the featuresets they’ve produced independently are too valuable not to be incorporated back into the products they complement.
We will see that happen, one way or another, in 2013.”
In a word, no. We didn’t see that happen at all. Inexplicably, neither 410 Labs nor Baydin were acquired, nor were their features replicated. Collaboration more broadly saw minimal M&A activity, and while vendors like Box or Dropbox did expand sideways via features like Box Edit or Albums, overall the pace of innovation in the space appeared to slow.
Making this prediction a miss.
Data Moats Will Become a Stated Goal
“As mentioned in the review of last year’s predictions, many businesses are beginning to comprehend the opportunity – and more importantly, threat – of data aggregation and collection. What we’ll see in 2013 is an increased understanding of the data moat, and a more widespread utilization of them as points of differentiation.”
While the market as a whole still has yet to fully digest the reality that data is an asset, and that smart vendors aren’t selling products any longer but products enabled by data, there were those who perceived the opportunity. Spiceworks, of course, has seen the opportunity all along, but newcomers to the market like CloudPhysics with its concept of “collective intelligence” are helping push the rock up the hill. And the interest in – and active product plans around – data from within our customer base spiked dramatically.
Finally then, I’ve got a hit.
Google’s Compute Engine Will Emerge as the Most Important Amazon Challenger
“While there are many would-be cloud providers in market, Google is different. The company has the advantage of having run infrastructure at a massive scale for over a decade: the search vendor is Intel’s fifth largest customer. It also has deep expertise in relevant software arenas: it has run MySQL for years, the company was built upon customized versions of Linux and it is indirectly responsible for the invention of Hadoop (via the MapReduce and GFS papers).
In 2013, then, Google will emerge as Amazon’s most formidable competitor.”
Arguments could be made here for several providers, most notably Microsoft Azure, but with GCE finally going GA, interest in the platform is outpacing competitive platforms substantially. In spite of interesting announcements from the likes of CenturyLink (Tier 3), CSC (Infochimps/ServiceMesh), Joyent (Manta), Softlayer (IBM), and so on, then, GCE is top of mind for a great many developers at the moment. And while he stopped short of annointing them the second place candidate, Ex-Joyent CTO Jason Hoffman is at least bullish about Google’s technical capabilities if not their commitment level in this interview.
I’ll call this a hit.
The Focus of Online Education Innovation Will Be Less on Learning Than Certification
“First, and most obviously, there are questions regarding the rigor of the educational experience. Without classroom supervision, and at massive scale, how can educators ensure that students are actually paying attention to the coursework? Traditionally, this is accomplished via testing, but distance education poses challenges here as well. How can employers be sure that those claiming to have completed online courses were actually the ones taking tests?
In 2013, then, we’ll online educators using innovation to tackle the first problem.”
Contrary to this prediction, it’s possible that 2013 may represent the beginning of a decline in the importance of certificates and titles. As the Boston Globe’s Scott Kirsner wrote:
While MOOCs seem like they can only enhance a job candidate’s appeal, many people I talked to noted an important shift in the world of hiring. Credentials, whether a MOOC certificate or an MBA degree, are declining in importance, while portfolios are on the rise.
What’s a portfolio? Some sort of evidence of your expertise and abilities online, like design work showcased on Dribbble.com, software code on GitHub, a mobile app you’ve built, or a sales presentation you developed and posted to SlideShare.
And indeed, the lack of emphasis on a formal education is becoming increasingly common in the technology sector, at least outside of specific employers who put an emphasis on educational pedigree.
All of that said, tactically this was a year in which MOOCs began to examine and address the problem of certification in earnest. In September, for example, edX – the institution jointly founded by Harvard and MIT – announced a pilot program to certify that the students completing the coursework were the actual students in question. Coursera’s version of this is called Signature Track.
Whether or not employers will broadly choose to elevate the portfolio over certificates and degrees at some point in the future is as yet undetermined. But at least in the present, MOOCs are aggressively moving to reassure the downstream customers of their services – employers – that their offerings have real value.
I’ll call this a hit.
Every Business Will Throw Money at Data Teams
“In other contexts, enterprises will eschew efficiency and consume infrastructure resources (primarily via public cloud offerings like Redshift) at alarming rates, believing that the solution for poor algorithms is more, bigger data.
The outcomes from these collective efforts will be mixed, but spending will continue unabated. With the world having been revealed as the province of the Big Data winners, businesses will ratchet up already heightened spending on data fields to unprecedented levels.
It’ll be a good year to be on a data team, in other words.”
On the spectrum of prediction risk that Bryan Cantrill has proposed – Safe, Likely, Possible, Exciting, Spectacular – the above prediction is only fairly classified as “Safe.” Even if the demand for some new, unpredicted emerging technology outstripped the importance of data science resources, it doesn’t make them unimportant. Indeed, data scientists are a currency in demand from a wide variety of resources today, from financial services to traditional ad/media spend to social network/search to retail to, in the person of Hilary Mason, VC.
So while hard figures are harder to obtain here – more reliable than Google Trends, at least – based on the observed demand and direction of data science types in our space, it seems safe to call this a hit.
Explicit Services Will Be Advantaged Over Implicit Services
“[Google Now] is pretty easily Android’s best new feature, and is clearly the shape of things to come. In 2013, then, we’ll see Google continue to make Google Now more useful, but also see competitors like Apple or Microsoft attempt to surface some of the latent features of their platform in a more explicit fashion. Voice search, Siri and competitors may be enormously capable, but as long as vendors are depending on users to discover those hidden features, they will go generally unused.”
Google Now has certainly done its part to make this prediction come true, both expanding the topical coverage of Now and porting it to iOS. Apple, meanwhile, has invested inorganically in acquisitions such as Cue, in all probability as a competitive response to Now. As has Yahoo, interestingly, which picked up Aviate for roughly 2X what Apple paid for Cue. Aviate, for those who have not used it, is an alternative interface for Android that adjusts itself based on factors such as location.
Given the above, and the lack of any observable breakout success for voice services such as Siri, I feel safe in calling this a win.
Telemetry Based Models Will Be Democratized
“The bet here is that the 37signals experience will be predictive, and that many shops will begin collecting and mining their generated telemetry in search of performance gains, feature improvements or even additional value added services for customers. There’s not much choice in the matter: if competitors will be using their telemetry, businesses will be forced to use theirs to keep up. And the democratization via open source nature of these tools will enable this.”
While it is true that the tools 37signals employed – Impala, among others – saw upticks in adoption, the business of collecting and operating on telemetry was anything but democratized in 2013. It’s not primarily a software problem; the technology available, even at no cost, for collecting, storing, and querying datasets even at scale is quite good. But the challenge for many of the businesses we work with is focus. Even as many businesses as written above acknowledge their interest in leveraging telemetry for mutual opportunity between customer and vendor, and are actively exploring same, the lack of out-of-the-box solutions depresses interest. What we hear from them is something like: “I’m in the business of building $MYPRODUCT: leveraging the analytics it generates and selling that is a fundamentally different business.” The would-be solution to this demand, which is part packaging and part analytical expertise, is a substantial opportunity for someone.
But as it’s an opportunity largely unfulfilled at the present time, this is a miss.
The Most Important Cloud Question Will Be Not Whether to Use it But How Much Am I Already Using?
“While the cloud model generally and price competition specifically have acted to make cloud resources more and more affordable, at scale even small costs add up. As executives everywhere are discovering.
It should be no surprise, then, that a cottage industry of startups (Cloudability, Newvem, PlanForCloud (Rightscale), etc) has emerged to provide not only visibility into total costs, but proactive advice on utilization rates and optimization functionality.
While much speculation will still center on whether enterprises are or should be using public cloud resources, then, intelligent organizations will acknowledge that, like it or not, they will be using the public cloud in some form and seek the ability to measure that usage carefully.”
If Cloudability’s numbers are at all representative, then the enterprises that aren’t intensely focused on measurement of public cloud spend have a serious problem on their hands. In 2012, Cloudability customers spent an average of $21,500 an hour. By 2013, that number was $65,099 – an increase of 203%. And they’re predicting 3X to 4X growth by 2014. While those numbers are interesting, market behaviors alone would confirm the explosive growth of public cloud. From the acquisitions of the likes of Softlayer to Dell’s decision to go private to Oracle’s belated sponsorship of OpenStack, the pressure on market incumbents has become obvious to even casual observers. And as spend goes up, the importance of monitoring same – particularly for businesses unaccustomed to ongoing, annuity-style expenses – goes up in response. Which means that Newvem will not be the last company to be acquired in this space.
I’ll call this a hit.
The Final Tally
Out of ten total predictions, then, we have four predicted incorrectly, five correctly and one push. Which yields a success rate, less the push, of 55% – easily my worst ratio to date. That said, as with previous years I stand behind the substance of even the misses on this list, believing that many yet remain likely to occur – just not as quickly as I’d expect. Apple may not have literally have been rumored to be acquiring Yahoo, as one example, but the thought process behind it remains logical and defensible. While it’s a miss according to the binary scoring system here, then, it still has some value.
Which may help explain why, in spite of the bludgeoning I absorbed this year, I will be retaining the more aggressive approach for 2014 and years moving forward. Look for next year’s predictions shortly.
For all that it feels as if Git has exploded on to the scene, the advancement of distributed version control systems (DVCS) has in fact been laborious and slow. As far back as 2007, we were advocating on behalf of systems such as Bazaar, Mercurial and Git. While it was not yet possible to foresee which of the many DVCS options would emerge as the de facto standard, the evidence suggested that for sophisticated, progressive developers, distributed development was a fundamental game changer. If DVCS was the future, however, it was certainly unevenly distributed.
Three years later in 2010, when we first examined Ohloh’s data on repository distribution across 238,000 projects, Git was just emerging as the DVCS option of choice for developer populations. It was in use at just under 11% of the surveyed projects, easily bettering Mercurial’s 1.25% and Bazaar’s 0.59%. But while Git was outperforming its distributed counterparts, it was distinctly less competitive with centralized alternatives. The venerable CVS was in use at twice as many projects – 26% – and Subversion checked in with a dominating 60% share of all projects. And this was, remember, Ohloh’s survey of open source projects: one that theoretically should have favored DVCS more than a survey of enterprise VCS repositories might have.
Two years later, however, the tide had begun to turn. Git (28%) more than doubled its traction, while CVS usage (12%) was cut in half. Even with effectively no growth from either Bazaar nor Mercurial (~2% combined), then, the standard bearer of distributed version control was carrying the approach forward at the expense of older centralized systems.
Three years from our original checkpoint, and the picture is even clearer. Having examined Ohloh’s repository data and compared it to prior years, a few conclusions can be made. Before we get to that, the source and issues.
The data in this chart was taken from snapshots of the Ohloh data exposed here.
Objections & Responses
- “Ohloh data cannot be considered representative of the wider distribution of version control systems“: This is true, and no claims are made here otherwise. While it necessarily omits enterprise adoption, however, it is believed here that Ohloh’s dataset is more likely to be predictive moving forward than a wider sample.
- “Many of the projects Ohloh surveys are dormant“: This is probably true. But even granting a sizable number of dormant projects, it’s expected that these will be offset by a sizable influx of new projects.
- “Ohloh’s sampling has evolved over the years, and now includes repositories and forges it did not previously“: Also true. It also, by definition, includes new projects over time. When we first examined the data, Ohloh surveyed less than 300,000 projects. Today it’s over 600,000. This is a natural evolution of the survey population, one that’s inclusive of evolving developer behaviors.
With those caveats in mind, here is a chart depicting the total share of repositories attributable to centralized (CVS/Subversion) and distributed (Bazaar/Git/Mercurial).
The trendline here is clear: after years of languishing as a fringe technology, distributed version control systems are on a clear path towards a majority share. From less than 14% of the surveyed repositories just three years ago to almost half today, the growth of DVCS is clear and unimpeded. The qualitative evidence supports this conclusion as well: few vendors today with VCS integration points fail to include, if not standardize on, distributed tools broadly and Git more specifically.
On the latter point, while the growth of DVCS in general is clear, many are curious as to Git’s part in that. Is it the clear default it was a year ago, or have alternative projects benefited from the rising tide of distributed version control?
The short answer is no, they have not.
The focus of this chart tends to be either the growth of Git or the declines in both CVS and Subversion, but the continuing lack of traction for both Bazaar and Mercurial is notable. Git’s dominance, whether it’s because of its power and speed or in spite of its idiosyncratic syntax, is effectively locked in at this point. There are no factors in either in this data or anecdotal evidence to suggest a simmering interest in Git alternatives that could fuel a comeback. While it’s possible, then, that Bazaar and Mercurial could become greater factors in the version control space moving forward, it’s not likely.
Consider the following chart that depicts the overall gains or losses in share from 2010 to 2013.
In a mere three years, Git is up almost 30% (27.09%) while Bazaar and Mercurial are up 1.41% and 0.75%, respectively. But at least they’re positive over that timeframe, slight though their gains may have been. CVS usage is down almost 16% since 2010. Subversion’s case is even more interesting. From 2010 to 2012, it only declined 4.3%. In the last year, however, it began to slip much more dramatically, and since 2010 it is off 13%.
The data seems clear, but a few questions remain.
First, and maybe the most interesting, not why has this happened, but why has it taken so long? From open source databases to runtime fragmentation to cloud infrastructures, the world has been completely turned over in a far shorter span of time than it’s taken DVCS to gain even near equivalent acceptance amongst developer populations. To answer this, many point to the friction of migrating a given codebase from one version control tool to another, or the increasingly complicated integrations of existing VCS tools into build chains. Both are likely contributors, but my suspicion is more simple: developers didn’t drive the adoption of these tools as quickly as a few others because they were harder to master. MySQL, remember, is in most respects a simpler product to adopt for developers accustomed to the traditional, full function commercial databases. DVCS, on the other hand, not only requires that developers learn an entirely new tool and syntax, but that they change the way they think. This philisophical difference in approach has caused even high profile developers to struggle with its implications, and thus be more slow to adopt and propagate the technologies. This, more than any other factor including enterprise conservativsm, is likely why it’s taken the better part of a decade for DVCS to really hit its stride.
The second, and more important question is, what does the above mean to me? That depends, clearly, on what your position is.
- If you’re using Git already and/or advocating its usage: The above should either validate your usage or arguments in favor of usage.
- If you’re using Bazaar, Mercurial or another DVCS: Nothing in the above precludes you from continued usage. There may be advantages to migrating to Git, including relative differences in developer familiarity with the toolsets, but the fundamental importance of Git is its enablement of distributed development, which these tools allow. You might consider switching then, but it’s certainly not an immediate necessity.
- If you’re using CVS, Subversion or a centralized alternative: Much depends on your usage, of course, but the benefits of DVCS over a centralized alternative are substantial. The speed of distributed development, which can occur in parallel, is likely to greatly exceed that permitted by centralized alternatives, which operate on a serial model. With Git becoming a de facto developer standard, as well, reliance on non-distributed tools could impact both hiring and retention. There are exceptions, of course, but the recommendation here is to evaluate Git as a future path for all of the above reasons.
Overall, these charts will surprise only the most conservative of enterprises. The rise of Git has been well chronicled, and the number of large projects such as Eclipse that have transitioned from centralized to distributed models has elevated awareness of the trend. That DVCS is increasingly the preference and Git the tool of choice is, in many quarters, accepted as a given. The data above merely tests this assumption quantitatively and concludes that it is, according to this sample at least, justified.
Author’s note: this was originally written in September, but was lost amidst a sea of Sublime Text tabs until I rediscovered it earlier this week and never published. As late seems better than never, I offer it up months after the fact. You heard it here last, as always. – sog
Since Microsoft announced Steve Ballmer’s pending retirement, there has been no shortage of commentary and retrospective on his tenure. While opinions on the subject vary widely, it’s probably safe to characterize the conventional wisdom as more critical than not. The contrarian, therefore, is one that defends the outgoing Chief Executive. The former viewpoint is generally inclined to focus on a share price that has stagnated since the millennium, the latter Microsoft’s historically impressive ability to generate and sustain revenue.
As with most large entities, however, it’s probably a mistake to assign too much responsibility for either to a single person. In the case of the man most Microsoft employees refer to as Steve B, it’s true that his ascension to the role of Chief Executive came at a natural apex of the firm. Given how meteoric the rise of the company was, in other words, it would be difficult to expect anyone to sustain the growth that inflated the share price in the first place. On the other hand, it’s worth asking whether Ballmer created the twin revenue engines that have sustained Microsoft for over a decade and subsidized its efforts to replicate the success of Windows and Office, or whether he inherited them.
Which means that it’s probably best to deprecate evaluations of his tenure that focus purely on financial metrics and focus instead on broader assessments of the company’s ability to react to the market around it.
This is, of course, ground that has been covered nearly as completely as the financial metrics. Nearly every piece reacting to the Ballmer’s retirement and the forthcoming transition has referenced, appropriately enough, the disruption theory dissected and described by Harvard Business School Professor Clayton Christensen. By now, most are familiar with both the broader argument and how they pertain to Microsoft specifically. Businesses become successful over a period of time, and their success in one or more product areas blinds them from, and in fact actively disincents a response to, the threats posed by those that will succeed it. Companies built to dominate one market, in other words, are rarely able to replicate that success in the model that succeeds it. Applied to Microsoft, even laymen cannot fail to understand how a variety of market forces from open source to cloud to mobile collectively and individually first undermined and then actively disrupted Microsoft’s once unassailable dominant market position.
There might be some dispute over the specific implications, then, but there is near universal consensus that Microsoft has been disrupted just as numerous technology giants have before it. What’s up for debate is how Steve Ballmer responded – or failed to respond – to these disruptive forces. And, assuming that one acknowledges that the firm has been disrupted, whether anyone might have responded more constructively than Microsoft’s outgoing CEO.
For critics, the answer is simple: Steve Ballmer uniquely failed to identify the potential and threat of markets such as cloud or mobile, and is as a result directly responsible for the disruption. The majority of these arguments imply, further, that others might have done better in Ballmer’s stead.
Asymco’s Horace Dediu describes this line of thinking most clearly in a piece entitled “Steve Ballmer and the Innovator’s Curse“:
The most common, almost universally accepted reason for company failure is “the stupid manager theory”. It’s the corollary to “the smart manager theory” which is used to describe almost all company successes. The only problem with this theory is that it is usually the same managers who run the company while it’s successful as when it’s not. Therefore for the theory to be valid then the smart manager must have turned stupid at a specific moment in time, and as most companies in an industry fail in unison, then the stupidity bit must have been flipped in more than one individual at the same time in some massive conspiracy to fail simultaneously.
So the failures of Microsoft to move beyond the rapidly evaporating Windows business model are attributed to the personal failings of its CEO.
He goes on, however, to call these theories nonsense. He is correct, at least in the implication that Steve Ballmer did not suddenly become stupid. Whatever else that may be said about the man, even his enemies acknowledge his intelligence. And his track record supports this; as Dediu wryly notes, Ballmer’s “only failing was delivering sustaining growth (from $20 to over $70 billion in sales.)”
This defense, however, is built on a core assumption I do not happen to share. Specifically, the following:
The Innovator’s Dilemma is very clear on the causes of failure: To succeed with a new business model, Microsoft would have had to destroy (by competition) its core business. Doing that would, of course, have gotten Ballmer fired even faster.
Having studied under Christensen, Dediu doubtless understands the disruption theory articulated by the Innovator’s Dilemma as well or better than anyone save Christensen himself. But it seems worth examining the assumption that Microsoft was intrinsically and unavoidably vulnerable to the disruptions it is currently coping with.
Consider mobile. It’s easy to forget now, but Microsoft actually didn’t miss this market: it was simply out-executed. Along with the rest of the market, to be fair. Microsoft had a presence, and a sizable one, in mobile prior to the arrival of the iPhone, which fundamentally altered the landscape in one stroke. But its mobile offering was heavily and unfortunately influenced by its desktop roots. The important questions are first, having seen what mobile has done to the PC market, whether Microsoft should have been investing in mobile in the first place, and second, if they had invested, whether a desktop computer company could effectively adapt to a mobile market.
The answer to the first question is simpler than it might appear. Many analyses point to the massive disruption in the PC market as evidence that Microsoft could not have, and in fact should not have, invested in mobile due to the possibility (now a certainty) that their existing PC business would be cannibalized. But this response tends to omit the financial opportunity mobile came to represent. If Microsoft had decimated its own PC business but ended up owning the profits of Apple’s mobile businesses, one suspects the market would have few complaints.
As to whether or not Microsoft could have been successfully innovative in mobile the way that Apple was, what prevented it? Adherents to the theory of disruption might argue that it was impossible; that Microsoft was so fixated on its success on the desktop that success in a fundamentally different model – mobile – was virtually impossible. On a technical level, of course, but in broader terms as well. How could a company built on selling licenses of software, one utterly convinced of the superiority of software over hardware and validated by years of market confirmation, adapt to the radically different model of selling an integrated package of hardware and software?
Besides the fact that Microsoft today is in fact selling integrated hardware and software produts, challenging this theory of inevitable disruption are both Apple and Google. Apple was a computer company that created entirely new markets out of MP3 players, smartphones and tablets. Even granting that Apple enjoyed advantages in its singular focus and expertise in user experience, enabled in part by its ownership of the entire hardware and software package, it seems difficult to make the case that what Apple was able to accomplish was fundamentally impossible for Microsoft.
Google, meanwhile, was an online advertising company that created the operating system that’s the closest facsimile to the Windows model in mobile we have seen to date. They did this, in fact, understanding that it was likely to damage a relationship with Apple at the time that was remarkably close in retrospect. While their motives in doing so are a matter for speculation, it is probable that Android was created and pushed – much like Chrome – to avoid having their core business – which is, again, advertising – disrupted through third party control points, i.e. mobile operating systems.
If a computer market and an advertising company can both create new markets and stave off disruption, it is not reasonable to conclude that Microsoft – once the biggest, most powerful technology company on the planet, like Apple today – would be fundamentally unable to do the same.
Particularly because there is little intrinsic to mobile that is fundamentally incompatible with their primary software-based revenue model. Certainly Android, as mentioned, resembles it reasonably closely today. True, Google effectively gives Android away, because its development costs are subsidized by its advertising business. But Microsoft has managed, through its aggressive utilization of its intellectual property, to recreate a licensing business all the same. What if, for example, Microsoft had approached all of the current Android partners in the wake of the iPhone’s launch with a version of Windows Phone that was similar to what it is today? The bet here is that, just as happened with Android, they’d rush towards anyone offering them a weapon with which to do battle with Apple.
Which means, in turn, that Microsoft wasn’t necessarily doomed to disruption, but merely executed poorly.
Likewise cloud. If you are convinced that the fundamental value of the cloud lies in price, you must concede that Microsoft was doomed to an uphill battle in cloud without destroying its existing businesses. Microsoft’s primary competition then and now, obviously, was an operating system that could be obtained and run at no cost. And that is fundamentally disruptive to Microsoft’s business. But if you recall that Amazon has always commanded substantial margins above competitors, the opportunity for a premium for software seems somewhat plausible. And if instead of price one considers convenience as the primary driver of cloud consumption rather than price, Microsoft’s opportunity becomes clearer. Again, what if Microsoft had been able to offer Windows instances it hosted within a reasonable timeframe of the launch of EC2? As many cloud providers have discovered, it’s much easier to build in the premiums you need when you’re charging by the hour, as it mitigates the sticker shock by masking the premiums.
Both cloud and mobile, in that analysis, represent opportunity as much as threat.
It’s easier to understand, on the other hand, how Microsoft was willing to wage total war against open source for so many years before pivoting towards a more comprehensive strategy. Open source is, after all, a fundamental repudiation of the model Microsoft built itself on. To Microsoft, certainly then and to a lesser extent now, software is an asset of intrinsic and inextricable value. The tens of thousands of people Microsoft employs to write software, in fact, are dependent on this assumption. Open source, however, doesn’t imply that software has no value – but it does require a dramatically different understanding of its commercial value. How does one charge money for an asset that is available for free is a question that every open source business contends with daily. Each evolves different mechanisms to adapt, but none have nor are likely to replicate the growth that Microsoft, Oracle and other primarily software entities have achieved.
So how could Ballmer – the head of a company built upon an assumption fundamentally undermined by open source – possibly have avoided being disrupted by the explosive growth of open source? Maybe by watching the company Microsoft once itself disrupted and were specifically built to compete with: IBM. No less singularly focused on profit than the Redmond software giant, IBM nevertheless found ways to leverage free as in money software to its strategic gain. Rather than fight the tide, IBM found ways to leverage assets like the Apache web server, Eclipse or most recently Hadoop, to its gain. It recognized that for many of its customers, software sales were essentially a packaging exercise. If bottom-line and obsessively profit-focused IBM could perceive opportunities in free-as-in-beer software, it is difficult to make the case that it would be impossible for Microsoft to do the same. Microsoft could have embraced a wide variety of open source strategies while still protecting its crown jewels of Windows and Office, but for the first thirty-three years of its existence open source was a religious rather than business issue within the company.
It’s fine to defend Ballmer by saying that few, if any, perceived the opportunities that Steve Jobs and Jeff Bezos did. But it’s harder to make the case that, having seen that they did, that it would be impossible for Ballmer to do the same. Giving him a pass, effectively, on not cannibalizing his Windows or Office franchises would be akin to giving Steve Jobs a pass for not creating the iPhone to protect the iPod, or the iPad to shield the iPhone. Much like the iPhone and iOS, the cloud and mobile both could have been – and arguably are becoming so today – adjacent, complementary markets to Microsoft’s core Office/Windows franchises. It’s one thing to give Ballmer a pass for missing a fundamentally different opportunity like search; it’s quite another to forgive his slow reaction to two markets – cloud and mobile – that both are dependent to varying degrees on operating system technology.
Disruption is more than likely to overcome every company eventually, but the evidence suggests that Microsoft could at a minimum have acted more proactively to the threats and opportunities they presented. And that, as much as the overwhelming revenue growth, is Steve Ballmer’s responsibility.
Thanks to a combination of market factors, but principally increased competition, there is no technology market where prices are moving more quickly than cloud infrastructure. After completing an initial survey and deconstruction of IaaS pricing trends in August of last year, in fact, follow ups proved difficult because prices dropped so frequently that an analysis was frequently obsolete before it was even published. As was the case last week, when Google at once announced the General Availability of its Cloud Compute Engine (GCE) as well as significant price drops – the day after the original analysis had been re-run.
A week later with no corresponding price drops from competitive providers, the decision was made to publish this before the inevitable arrives. As a reminder, this analysis is intended not as a literal expression of cost per service; this is not, in other words, an attempt to estimate the actual component costs for compute, disk, and memory per provider. Such numbers would be speculative and unreliable, relying as they would on non-public information, but also of limited utility for users. Instead, this analysis compares base hourly instance costs against the individual service offerings. What this attempts to highlight is how providers may be attempting to differentiate by prioritizing memory over compute capacity, as one example. In other words, it’s an attempt to answer the question: for a given hourly cost, who’s offering the most compute, disk or memory?
As with the previous iteration, a link to the aggregated dataset is provided below, both for fact checking and to enable others to perform their own analyses, expand the scope of surveyed providers or both.
Before we continue, a few notes.
- No special pricing programs (beta, etc)
- Linux operating system, no OS premium
- Charts are based on price per hour costs (i.e. no reserved instances)
- Standard packages only considered (i.e. no high memory, etc)
- Where not otherwise specified, the number of virtual cores is assumed to equal available compute units
Objections & Responses
- “This isn’t an apples to apples comparison“: This is true. The providers do not make that possible.
- “These are list prices – many customers don’t pay list prices“: This is also true. Many customers do, however. But in general, take this for what it’s worth as an evaluation of posted list prices.
- “This does not take bandwidth and other costs into account“: Correct, this analysis is server only – no bandwidth or storage costs are included. Those will be examined in a future update.
- “This survey doesn’t include [provider X]“: The link to the dataset is below. You are encouraged to fork it.
- HP’s 4XL (60 cores) and 8XL (103 cores) instances were omitted from this survey intentionally for being twice as large and better than three times as large, respectively, as the next largest instances. While we can’t compare apples to apples, those instances were considered outliers in this sample. Feel free to add them back and re-run using the dataset below.
- Microsoft Azure’s “Extra Small” instance, which lists its core count as “Shared,” has been represented as .5 of a core for this analysis. If a better estimation is available, we’ll include it.
- All of Google and Microsoft’s instances and two of Amazon’s are omitted from the disk cost comparison because they do not include a fixed disk amount per instance.
- Versus the original analysis, IBM’s offerings have been replaced by Softlayer’s, by reason of that acquisition.
- While we’ve had numerous requests to add providers, and will undoubtedly add some in future, the original dataset – with the above exception – has been maintained for the sake of comparison.
How to Read the Charts
- There was some confusion last time concerning the charts and how they should be read. The simplest explanation is that the steeper the slope, the better the pricing from a user perspective. The more quickly cores, disk and memory are added relative to cost, the less a user has to pay for a given asset.
With that, here is the chart depicting the cost of disk space relative to the price per hour.
As mentioned, two of Amazon’s instances (which are EBS only) and all of Google and Microsoft’s are omitted from this chart because they do not include fixed disk price. That said, of the remaining providers, it’s interesting to see that Amazon remains the most aggressive in terms of the disk space made available. Even granting their economy-of-scale cost advantages versus some of the competitors here, with the cost of disk falling it’s interesting that no other players besides Joyent have been more willing to challenge Amazon on this front. Whether this signals a lack of ambition on the part of AWS competitors or a lack of interest from the market in disk versus memory and compute capacity is unclear, but it seems probable that it’s a combination of both.
If these charts are intended to expose prioritization from a feature perspective, meanwhile, the plot of memory capacity versus hourly cost is potentially the most revealing.
Several things become obvious from this chart.
- The high correlation in available memory/hourly costs indicates a shared understanding of the importance of memory pricing.
- Google is the most aggressive in terms of the memory per hourly unit of cost.
- HP is apparently pegging itself to AWS.
- Softlayer and, to a lesser extent, Rackspace, are likely to be less competitive for memory focused buyers.
Lastly, we have a chart of the available compute units relative to the hourly cost.
Clearly signalling its intent to compete with AWS is HP, which matched its pricing on relative available memory costs and betters it in available compute units. Rackspace is more competitive within compute than memory, and AWS – though the dominant market player – is still among the most aggressive in terms of pricing per compute unit. Softlayer and Microsoft’s Azure form the middle of the pack, with Google conspicuously less competitive in terms of available compute units as a function of hourly cost. This is a marked shift from our prior analysis, which had Google among the leaders in terms of compute per hourly cost.
A few overall takeaways, with the reminder that this is merely a survey of standard instances:
- Amazon remains the standard against which other programs are judged and/or judging themselves. In virtually every category, Amazon is amongst the leaders, with competitive providers looking to advantage themselves by either matching or exceeding the price/component offered by Amazon.
- Intentionally or not, providers are signaling their prioritizations from an infrastructure perspective. HP, for example, is clearly attempting to separate itself from the pack on the basis of compute value for the dollar (not to mention instance sizes), while Google is doing the same for memory. Disk, meanwhile, seems to be something of an afterthought, with multiple providers not including it as a standard portion of the offering and no one attempting to outcompete AWS as they are in compute and memory, save perhaps Joyent.
- Given the aforementioned prioritization, it will be interesting to observe its impacts moving forward. Amazon is essentially pursuing a leader’s course: highly competitive by each factor, but not necessarily obsessed with being the absolute leader in each. Two obvious competitive strategies emerge from the above deconstruction. The first, exemplied best by Microsoft here, is a middle of the road value proposition, never the most expensive but never the least. The second appears to be a weighted bet, sacrificing performance in one category (e.g. Google with compute) to achieve leadership in another (Google in memory).
- Besides the implications for users, it will be interesting to monitor how a given vendor’s price prioritization may vary over time, based both on customer demands as well as internal resource availability and costs. Google and HP’s strategies, as but two examples, certainly appear to have evolved since the last snapshot.
In the next iteration of our cloud pricing research, we’ll explore how pricing has changed over time across the surveyed vendors collectively. In the meantime, here is a link to the dataset used in the above analysis.
Disclosure: Amazon Web Services, IBM (Softlayer), and Rackspace are RedMonk customers. Google, HP, Joyent and Microsoft are not current customers.
In March of last year, spurred by in part by a high volume of requests, we examined a few of the community metrics around the configuration management tools Chef and Puppet. Not intended as a technical comparison, it was rather an attempt to assess their traction and performance relative to one another across a number of distinct communities. At the time, there was no clear winner or loser from the comparison.
An interesting thing has happened since we ran those numbers, however. In the interim, two new projects have emerged as alternatives that we’re encountering more and more frequently in our conversations with and surveys of various developer populations.
While it is true that there are a number of open source configuration management tools besides Chef and Puppet, those have commanded the majority of the attention in the category. But increasingly, and in spite of the relative maturity and volume usage of both Chef and Puppet, Ansible and Salt are beginning to attract a surprising amount of developer attention. Where it once was reasonable to conclude that the configuration management space would evolve in similar fashion to the open source relational database market – i.e. with two dominant projects – that future is now in question. Certainly that remains one possible path, but with the sustained in interest in alternatives it’s now worth questioning whether configuration management will more greatly resemble the NoSQL market – which is characterized by its diversity – than its relational alternative.
Because there has been no clear winner in the Chef/Puppet battle and because there are two new market entrants, then, it is not surprising that we’ve been fielding a similar volume of requests to compare the projects across some of the same community metrics as we did a year and a half ago. Here then are how Ansible, Chef, Puppet and Salt compare with one another within various developer related communities, open job postings and more.
Before we get to dissecting the charts, a word on Debian usage. Per a conversation with Jesse Robbins last year following the original Chef vs Puppet analysis, it should be noted that installing via the Debian package management system (apt) – what’s reflected in this chart – is not the preferrred installation method for Chef (gems is). This means that Chef will be under-represented in these charts. Salt, meanwhile, provides installation instructions for Debian that leverage apt and Ansible’s documentation explicitly recommends installation via operating system provided package management systems. One other caveat: while there are in some cases multiple packages for the individual projects, this analysis only includes the most popular for each.
While this is useful for communicating the dominance of Puppet in terms of installations via Debian packages, the chart obscures any other useful information on trajectory.
If we grant that Puppet leads in this context however and subtract it, it’s easier to perceive the growth of each platform. Chef is outpacing the other two projects, while Salt enjoys a moderate lead on Ansible. It’s possible that Ansible’s performance here is related to its close ties to the Red Hat ecosystem; it ships by default in Fedora and is available on RHEL via EPEL. Surveying the distribution on that ecosystem would be interesting, were data available.
GitHub offers a variety of metrics about the projects it hosts. For our purposes here, we’ve chosen the number of times a project has been forked, the number of pull requests accepted over the last 30 days and the number of times a repository has been starred. This is intended to assess, among other things, project activity and developer interest.
Superficial though the signal of a GitHub star may be, it is interesting nevertheless to see Ansible outperforming Salt and Ansible and Salt both outperforming the better known Chef and Puppet.
The leadership of Ansible and Salt within the pull requests, meanwhile, was predictable. As more older and more mature projects, it’s natural that Chef and Puppet would see a lower rate of pull requests. It’s interesting to note, however, that Ohloh shows a much higher number of all time contributors for Ansible (559) and Salt (661) than Chef (331) and Puppet (332). GitHub doesn’t concur with those numbers precisely, but does show a similar disparity in terms of contributor volume.
In terms of the number of times each project has been forked on GitHub, the numbers are closer, but still advantage Ansible. Chef is forked slightly more often than Salt, which in turn is more widely forked than Puppet.
As far as we can tell from these rough GitHub metrics, then, developer activity within the new market entrants signals them as projects to be watched closely.
Within the Hacker News community, the metric is merely mentions of the individual technologies plotted over time. Unfortunately, plotting with ‘Salt’ points to an issue with the metric.
Not only does its performance on this chart wildly outperform expectations, the mentions predate the actual existence of the project by some four years. Clearly we’re dealing with artifacts then, recording mentions on “salting” password databases and the like rather than strictly mentions of the project. If we instead query using SaltStack, the results look slightly more reasonable.
It’s necessary to note that this disadvantages Salt in that the ‘SaltStack’ query will omit some legitimate mentions of ‘Salt,’ but that can’t be helped without a Google Trends-style topical understanding of the subject matter. In the meantime, the results are more or less in line with reasonable expectations. Chef and Puppet outperform their younger counterparts, particularly when the latter hadn’t yet been created, and appear to maintain a substantial edge in overall mentions – although Ansible has been spiking this year and may be currently competitive in terms of discussion volume.
In terms of job queries, we run into similar issues as above. Chef and Salt massively outperform the other two projects, in part because they reflect jobs other than those working on these tools. If we attempt to subset the data we’re looking for, adding ‘technology’ to the query to restrict our search to technology jobs only, we still have issues with Salt and thus are forced to omit them, but have somewhat more reasonable looking data for the other three.
In terms of the absolute number of jobs, both Chef and Puppet are massively overrepresented relative to Ansible as would be expected. Chef’s lead over Puppet is clearly somewhat artificial, as its traction dates back to 2006 while the initial drop of the project was in 2009. But in general, it seems reasonable to conclude that Chef and Puppet offer a higher volume of jobs at the present time than either Ansible or Salt.
In terms of their relative performance, rather than the absolute number of jobs, the most notable feature of the chart is Puppet’s rapid and sustained growth. Ansible looks to be growing, but not nearly at the rate that Chef and Puppet are.
In another counting statistic, meaning that time is a factor, the relative membership rates of LinkedIn user groups were no surprise.
Ansible and Salt were substantially outperformed by both Chef and Puppet. Interestingly, however, Puppet dominated not only the two newer projects but Chef as well. It’s difficult to say, however, whether this genuinely represents an advantage in traction for Puppet’s community, or whether it’s another artifact: this time of the low discoverability of Chef’s user group. Simply entering Chef turns up pages of cooking related user groups; would be members have to begin their LinkedIn query with Opscode to turn up the user group they’re looking for.
To examine the Stack Overflow dataset, this script by Bryce Boe was used to examine the performance of two of the selected projects by Stack Overflow tags by week over a multi-year period. Ansible and Salt did not generate high enough returns to be plotted here.
While Chef comes out slightly ahead, the correlation between questions tagged Chef or Puppet is strong, with neither taking a commanding lead. Importantly, however, the trajectories for both is upwards, if uneven. To get a sense of how all four projects compare in a snapshot, the following chart depicts the tag volume for each project.
To no one’s surprise, Chef and Puppet have generated substantially more questions over time than either Ansible or Salt – if only because they are older projects. Notable in addition to this, however, is Chef’s lead over Puppet. This is interesting because Puppet’s initial release was in 2005, four years before Chef became available. To be fair, however, Stack Overflow itself was only launched in 2008, so it’s not as if Puppet could capitalize on its first to market status with traction on a site that didn’t yet exist. Apart from Chef and Puppet, Ansible (84) demonstrates marginally more traction than Salt (37), but the total volumes mean the importance of that difference is negligible.
What do we take then from all of these charts, with all the mentioned caveats? Most obviously, the data suggests no clear winner of this market at the present time. It indicates greater existing traction and usage for Chef and Puppet, of course, but this is to be expected given their longer track record. Even narrowing the field to those two projects, neither holds a position of dominance as judged by these metrics.
The most interesting conclusion to be taken from this brief look at a variety of community data sources, however, may well be the relevance of both Ansible and Salt. That these projects appear to have viable prospects in front of them speaks to the demand for solutions in the area, as well as the strong influence of personal preferences – e.g. the affinity for Salt amongst Python developers. Neither of the newer market entrants is remotely competitive with the incumbents in terms of counting stats, but they are more than holding their own in metrics reflective of simple interest.
How this market evolves in the future is still unclear, as few projected it to be more than a two horse race as recently as a few years ago. But while Chef and Puppet continue to sustain growth, it is likely that they’ll be facing more competition over time from the likes of Ansible and Salt.
Disclosure: Ansibleworks is a RedMonk customer. Opscode and Puppet Labs have been RedMonk customers, but are not currently. Saltstack is not a RedMonk customer.
In an article entitled “Python Displacing R As The Programming Language For Data Science,” MongoDB’s Matt Asay made an argument that has been circulating for some time now. As Python has steadily improved its data science credentials, from Numpy to Pandas, with even R’s dominant ggplot2 charting library having been ported, its viability as a real data science platform improves daily. More than any other language in fact, save perhaps Java, Python is rapidly becoming a lingua franca, with footholds in every technology arena from the desktop to the server.
The question, per yesterday’s piece, is what this means for R specifically. Not surprisingly, as a debate between programming languages, the question is not without controversy. Advocates of one or the other platforms have taken to Twitter to argue for or against the hypothesis, sometimes heatedly.
Python advocates point to the flaws in R’s runtime, primarily performance, and its idosyncratic syntax. Which are valid complaints, speaking as a regular R user. They are less than persuasive, given that clear, clean syntax and a fast runtime correlate only weakly with actual language usage, but they certainly represent legitimate arguments. More broadly, and more convincingly, others assert that over a long enough horizon, general purpose tools typically see wider adoption than specialized alternatives. Which is again, a substantive point.
R advocates, meanwhile, point to R’s anecdotal but widely accepted traction within academic communities. As an open source, data-science focused runtime with a huge number of libraries behind it, R has been replacing tools like MATLAB, SAS, and SPSS within academic settings, both in statistics departments and outside of them. R’s packaging system (CRAN), in fact, is so extensive that it contains not only libraries for operating on data, but datasets themselves. Not only does it contain datasets for individual textbooks taught by academia, it will store different datasets by the edition of those textbooks. An entire generations of researchers is being trained to use R for their analysis.
Typically this is the type of subjective debate which can be examined via objective data sources, but comparing the trajectories is problematic and potentially not possible without further comparative research. RStudio’s Hadley Wickham, creator of many of the most important R libraries, examined GitHub and StackOverflow data in an attempt to apply metrics to the debate, but all the data really tells us is that a) both languages are growing and that b) Python is more popular – which we knew already. Searches of package popularity likewise are unrevealing; besides the difficulty of comparing runtimes due to the package-per-version protocol, there is the contextual difficulty of comparing Python to R. Python represents a superset of R use cases. We know Python is more versatile and applicable in a much wider range of applications. We also know that in spite of Python’s recent gains, R has a wider library of data science libraries available to it.
My colleague Donnie Berkholz points to this survey, which at least is context-specific in its focus on languages employed for analytics, data mining, data science. It indicates that R remains the most popular language for data science, at 60.9% to Python’s 38.8%. And for those who would argue that current status is less important than trajectory, it further suggests that R actually grew at a higher rate this year than Python – 15.1% to 14.2%. But without knowing more about the composition and sampling of the survey audience, it’s difficult to attribute too much importance to this survey. Granted, it’s context specific, but we have no way of knowing whether the audience surveyed is representative or skewed in one direction or another.
Ultimately, it’s not clear that the question is answerable with data at the present time. Still, a few things seem clear. Both languages are growing, and both can be used for data science. Python is more versatile and widely used, R more specialized and capable. And while the gap has been narrowing as Python has become more data science capable, there’s a long way to go before it matches the library strength of R – which continues to progress in the meantime.
How you assess the future path depends on how you answer a few questions. At RedMonk, we typically bet on the bigger community, but that’s not as easy here. Python’s total community is obviously much larger, but it seems probable that R’s community, which is more or less strictly focused on data science, is substantially larger than the subset of the Python community specifically focused on data. Which community do you bet on then? The easy answer is general purpose, but that undervalues the specialization of the R community on a discipline that is difficult to master.
While the original argument is certainly defensible, then, I find it ultimately unpersuasive. The evidence isn’t there, yet at least, to convince me that R is being replaced by Python on a volume basis. With key packages like ggplot2 being ported, however, it will be interesting to watch for any future shift.
In the meantime, the good news is that users do not need to concern themselves with this question. Both runtimes are viable as data science platforms for the foreseeable future, both are under active development and both bring unique strengths to the table. More to the point, language usage here does not need to be a zero sum game. Users that wish to leverage both, in fact, may do so via the numerous R<==>Python bridges available. Wherever you come down on this issue, then, rest assured that you’re not going to make a bad choice.
Disclosure: I use R daily, I use Python approximately monthly.
The problem for Microsoft isn't that the PC ceased being the primary computing device. It's that you can't charge for software anymore.
— Horace Dediu (@asymco) August 27, 2013
On the surface, this statement by Asymco analyst Horace Dediu is clearly and obviously false. For 2013, Microsoft’s Windows and Business (read: Office) divisions alone generated, collectively, $44B in revenue. This number was up around 4% from the year before, after being up 3% in 2012 versus the year prior. This comment, in other words, is easily dismissed as hyperbole.
But given that the overwhelming amount of evidence contradicting the above statement, and his familiarity with capital markets, it’s highly unlikely that Dediu would be unaware of this. Which makes it reasonable, therefore, to conclude that he did not intend for the statement to interpreted literally. Which in turn implies that Dediu’s making a directional statement rather than a literal description of the market reality.
Even if one gives, for the sake of argument, Dediu the benefit of the doubt and assumes subtlety, the next logical counterargument is that he’s unduly influenced by his focus on consumer markets. The trend there, after all, is clear: the majority of available consumer software is subsidized by either advertising (e.g. Facebook, Google, Twitter) or hardware (e.g. Apple). More to the point, both of these models are attempting to exert pressure on the paid software model, as in the case of the Apple iWork and Google Docs competing for mindshare with the non-free Microsoft Office or the now free OS X (non-server) positioned against the non-free Microsoft Windows. Even in hot application spaces like mobile, it’s getting increasingly difficult to commercialize the output.
If this is your analytical context, then – and certainly Dediu’s primary focus (Asymcar notwithstanding) is on Apple and markets adjacent to Apple – the logical conclusion is indeed that software prices are heading towards zero in most categories, and that software producers need to adjust their revenue models accordingly.
No surprise then that it is by labeling the decline of realizable revenues as a consumer software-only phenomenon that enterprise providers are able to reassure both themselves and the market that they are uniquely immune, insulated from an erosion in the valuation of software as an asset by factors ranging from the price insensitivity and inertia of enterprise buyers to technical and/or practical lock-in. And to be fair, enterprise software markets are eminently more margin-oriented than consumer alternatives, not least because businesses are used to regarding technology as a cost of doing business. For consumers, it has historically been more of a luxury.
But the fact is that the assertion that it’s getting more difficult to charge for software is correct, as we have been arguing since 2010/2011.
The surface evidence, once again, contradicts this claim. Consider the chart of Oracle’s software revenue below.
This, for Oracle, is the good news. With few exceptions, notably a market correction following the internet bubble, Oracle has sustainably grown its software revenue every year since 2000. The Redwood Shores software giant, in fact, claimed in October that it was now the second largest software company in the world by revenue behind Microsoft, passing IBM. If a company that large can continue to generate growth, year after year, it’s easy to vociferously argue that the threat of broader declines in the viability of commercial software-only models is overblown. But this behavior, common to software vendors today, increasingly has a whistling-past-the-graveyard ring to it.
Whatever your broader thoughts on the mechanics of Dediu-mentor and Harvard Business School professor Clayton Christensen’s theory of disruption, history adequately demonstrates that even highly profitable, revenue generating companies are vulnerable. Oracle, for example, is as a software-sales business challenged by a variety of actors from open source projects to IaaS or SaaS service-based alternatives. To its credit, the company has hedges against both in BerkeleyDB/MySQL/etc and its various cloud businesses. It’s not clear, however, that even collectively they could offset any substantial impact to its core software sales business – while not broken out, MySQL presumably generates far less revenue than the flagship Oracle database. Software was 67% of Oracle’s revenue in 2011, a year after they acquired Sun Microsystems and its hardware businesses. In 2013, software comprised 74% of Oracle’s revenue.
The question for Oracle and other companies that derive the majority of their income from software, rather than with software, is whether there are signs underneath the surface revenue growth that might reveal challenges to the sustainability of those businesses moving forward. Consider Oracle’s 10-K filings, for example. Unusually, as discussed previously, Oracle breaks out the percentage of its software that derives from new licenses. This makes it easier to document Oracle’s progress at attracting new customers, and thereby the sustainability of its growth. The chart below depicts the percentage of software revenue Oracle generated from new licenses from 2000-2013.
There are a few caveats to be aware of. First, there are contradictions in the 2002 and 2003 10-K’s; second, where the 2012 10-K reported “New software licenses,” the 2013 10-K is now terming this “New software licenses and cloud software subscriptions.” With those in mind, the trendline here remains clear: Oracle’s ability to generate new licenses is in decline, and has been for over a decade. At 38% in 2013, the percentage of revenue Oracle derives from new licensees is a little less than half of what it was in 2000 (71%). Some might attribute this to the difficulty for large incumbents to organically generate new business, but in the year 2000 Oracle was already 23 years old.
What this chart indicates, instead, is that Oracle’s software revenue growth is increasingly coming not from new customers but from existing customers. Which is to the credit of Oracle’s salesforce, in spite what of the company characterized as their “lack of urgency.”
It may not be literally true, as Dediu argued above, that you can’t charge for software anymore. But it’s certainly getting harder for Oracle. And if it’s getting harder for Oracle, which has a technically excellent flagship product, it’s very likely getting harder for all of the other enterprise vendors out there that don’t break out their new license revenues as Oracle does. This is not, in other words, an Oracle problem. It’s an industry problem.
Consumer software, enterprise software: it doesn’t much matter. It’s all worth less than it was. If you’re not adapting your models to that new reality, you should be.
Disclosure: Oracle is not a RedMonk client. Microsoft has been a RedMonk client but is not currently.
In the beginning – October, 2003 to be precise – there was the Google File System. And it was good. MapReduce, which followed in December 2004, was even better. Together, they served as a framework for Doug Cutting’s original work at Yahoo, work that resulted in the project now known as Hadoop in 2005.
After being pressed into service by Yahoo and other large web properties, Hadoop’s inevitable standalone commercialization arrived in the form of Cloudera in 2009. Founded by Amr Awadallah (Yahoo), Christophe Bisciglia (Google), Jeff Hammerbacher (Facebook) and Mike Olson (Oracle/Sleepycat) – Cutting was to join later – Cloudera oddly had the Hadoop market more or less to itself for a few years.
Eventually the likes of MapR, Hortonworks, IBM and others arrived. And today, any vendor with data processing ambitions is either in the Hadoop space directly or partnering with an entity that is – because there is no other option. Even vendors with no major data processing businesses, for that matter, are jumping in to drive other areas of their businss – Intel being perhaps the most obvious example.
The question is not today, as it was in those early days, what Hadoop is for. In the early days of the project, many conversations with users about the power of Hadoop would stall when they heard words like “batch” or compared MapReduce to SQL (see Slide 22). Even already on-board employers like Facebook, meanwhile, faced with a market shortage of MapReduce-trained candidates were forced to write alternative query mechanisms like Hive themself. All of which meant that conversations about Hadoop were, without exception, conversations about what Hadoop was good for.
Today, the revese is true: it’s more difficult to pinpoint what Hadoop isn’t being used for than what it is. There are multiple SQL-like access mechanisms, some like Impala driving towards lower and lower latency queries, and Pivotal has even gone so far as to graft a fully SQL-compliant relational database engine on to the platform. Elsewhere, projects like HBase have layered federated database-like capabilities onto the core HDFS Hadoop foundation. The net of which is that Hadoop is gradually transitioning away from being a strictly batch-oriented system aimed at specialized large dataset workloads and into a more mainstream, general purpose data platform.
The large opportunity that lies in a more versatile, less specialized Hadoop helps explain the behavior of participating vendors. It’s easier to understand, for example, why EMC is aggressively integrating relational database technology into the platform if you understand where Hadoop is going versus where it has been. Likewise, Cloudera’s “Enterprise Data Hub” messaging is clearly intended to achieve separation from the perception that Hadoop is “for batch jobs.” And the size of the opportunity is the context behind IBM’s comments that it “doesn’t need Cloudera.” If the opportunity, and attendant risk, was smaller, IBM would likely be content to partner. But it is not.
Nor is innovation in the space limited to those would sell software directly; quite the contrary, in fact. Facebook’s Presto is a distributed SQL engine built directly on top of HDFS, and Google Spanner et al clones are as inevitable as Hadoop was once upon a time. Amazon’s RedShift, for its part, is gathering momentum amongst customers who don’t wish to build and own their own data infrastructure.
Of course, Hadoop could very well be years behind Google from a technology perspective. But even if the Hadoop ecosystem is the past to Google, it’s the present for the market. And questions about that market abound. How does the market landscape shake out? Are smaller players shortly to be acquired by larger vendors desperate not be locked out of a growth market? Will the value be in the distributions, or higher level abstractions? How do broadening platform strategies and ambitions affect relationships with would-be partners like a MongoDB? How do the players continue to balance the increasing trend towards open source against the need to optimize revenue in an aggressively competitive market? Will open source continue to be the default, baseline expectation, or will we see a tilt back towards closed source? Will other platforms emerge to sap some of Hadoop’s momentum? Will anyone seriously close the gap between MapReduce/SQL analyst and Excel user from an accessibility standpoint?
And so on. These are the questions we’re spending a great deal of time exploring in the wake of the first Strata/HadoopWorld in which Hadoop deliberately and repeatedly asserted itself as a general purpose technology. From here on out, the stakes are higher by the day, and margin for error low. To she who gets more of the answers to the above questions correct go the spoils.
Not surprisingly for an organization that has updated its product line 200 times this year as of the first of the month, Amazon had a few tricks up its sleeve for its annual re:Invent conference. For the company that effectively created the cloud market, the show was an important one for showcasing the sheer scope of Amazon’s targets.
Amazon is correctly regarded as one of the fastest innovating vendors in the world, with the release pace up over 500% from 2008 through last year. And if Amazon keeps up its pace for releases through the end of the year, it will have released 36% more features this year than last.
But as impressive as the pace is, the more impressive – and potentially more important – aspect to their release schedule is its breadth. Consider what Amazon announced at re:Invent:
- AppStream (Mobile/Gaming)
- CloudTrail (Compliance and Governance)
- Kinesis (Streaming)
- New Instance Types in C3/I2 (Performance compute)
- RDS Postgres (Database as a Service)
- Workspaces (VDI)
The majority of cloud vendors today are focused on executing with core cloud workloads, or basic compute and storage. There are certainly players focused on adding value through differentiated, specialized technologies such as Joyent with its distributed-Unix data-oriented Manta offering or ProfitBricks with its scale up approach, but these are the exception rather than the rule. Whether it’s public cloud providers or enterprises attempting to build out private cloud abilities, most of the focus is on simply keeping the lights on.
At re:Invent, Amazon did upgrade its traditional compute offerings via C3/I2, but also signaled its intent to embrace and extend entirely new markets. Most obviously, Amazon has with Workspace turned its eye towards VDI, for years a market long on promise but short on traction. The theoretical benefits of VDI, from manageability to security, have to date rarely outweighed the limitations and costs of delivery, making it the Linux desktop of IT – with success always just over the horizon. Amazon’s bet here is that by removing the complexity of execution it can engage with customers in a manner that its core cloud businesses cannot, and thereby grow its addressable market in the process.
Similarly, Kinesis is an entry into a specialized market that has typically been the province either of vendor packages – e.g. IBM InfoSphere Streams – or more recent open source project combinations such as Storm/Kafka. Of specific interest with Kinesis is the degree to which Amazon is leading the market here rather than responding to it. When questioned on the topic, Amazon said that Kinesis was unlike other Amazon offerings such as Workspaces that were a response to widespread customer demand. Instead, Amazon is anticipating future market needs with Kinesis, and attempting to deliver ahead of same.
AppStream, for its part, is effectively a Mobile/Gaming-backend-as-a-service, putting providers in that space on notice. The addition of Postgres as an RDS option, meanwhile, came to wide developer acclaim, but means that Amazon will increasingly be competing with AWS customers like Heroku. And CloudTrail, particularly with its partner list, means that AWS is taking the enterprise market seriously, which is both opportunity and threat for its enterprise ecosystem partners.
Big picture, re:Invent was an expansion of ambition from Amazon. Its sights are even broader than was realized heading into the show, which should give the industry pause. It has been difficult enough to compete with AWS on a rate of innovation basis in core cloud markets; with its widening portfolio of services, the task ahead of would-be competitors large and small just got more difficult.
That being said, however, it is worth questioning the sustainability of Amazon’s approach over the longer term. Microsoft similarly had ambitions not just to participate in but fundamentally dominate and own peripheral or adjacent markets, and arguably that near infinite scope impacted their focus in their core competencies. The broader and more diverse the business, the more difficult it becomes to manage effectively – not least because you end up making more enemies along the way. It remains to be seen whether or not Amazon’s increasing appetite to cloudify all the things has a similar effect on its ability to execute moving forward, but in the interim customers have a brand new stable of toys to play with.
Disclosure: Amazon, Heroku, and IBM are RedMonk customers, Joyent, Microsoft and ProfitBricks are not.