In an article entitled “Python Displacing R As The Programming Language For Data Science,” MongoDB’s Matt Asay made an argument that has been circulating for some time now. As Python has steadily improved its data science credentials, from NumPy to pandas, with even R’s dominant ggplot2 charting library now ported, its viability as a real data science platform improves daily. More than any other language in fact, save perhaps Java, Python is rapidly becoming a lingua franca, with footholds in every technology arena from the desktop to the server.
The question, per yesterday’s piece, is what this means for R specifically. Not surprisingly for a debate between programming languages, the question is not without controversy. Advocates of one platform or the other have taken to Twitter to argue for or against the hypothesis, sometimes heatedly.
Python advocates point to the flaws in R’s runtime, primarily performance, and its idiosyncratic syntax. Speaking as a regular R user, these are valid complaints. They are less than persuasive, given that clear, clean syntax and a fast runtime correlate only weakly with actual language usage, but they certainly represent legitimate arguments. More broadly, and more convincingly, others assert that over a long enough horizon, general purpose tools typically see wider adoption than specialized alternatives. Which is, again, a substantive point.
R advocates, meanwhile, point to R’s anecdotal but widely accepted traction within academic communities. As an open source, data science-focused runtime with a huge number of libraries behind it, R has been replacing tools like MATLAB, SAS, and SPSS within academic settings, both in statistics departments and outside of them. R’s packaging system (CRAN), in fact, is so extensive that it contains not only libraries for operating on data, but datasets themselves. Not only does it contain datasets for individual textbooks used in academia, it stores separate datasets for different editions of those textbooks. An entire generation of researchers is being trained to use R for their analysis.
Typically this is the type of subjective debate that can be examined via objective data sources, but here comparing the trajectories is problematic, and potentially not possible without further comparative research. RStudio’s Hadley Wickham, creator of many of the most important R libraries, examined GitHub and StackOverflow data in an attempt to apply metrics to the debate, but all the data really tells us is that a) both languages are growing and b) Python is more popular – which we knew already. Searches of package popularity likewise are unrevealing; besides the difficulty of comparing runtimes due to the package-per-version protocol, there is the contextual difficulty of comparing Python to R. Python represents a superset of R use cases. We know Python is more versatile and applicable in a much wider range of applications. We also know that in spite of Python’s recent gains, R has a deeper catalog of data science libraries available to it.
My colleague Donnie Berkholz points to this survey, which at least is context-specific in its focus on languages employed for analytics, data mining and data science. It indicates that R remains the most popular language for data science, at 60.9% to Python’s 38.8%. And for those who would argue that current status is less important than trajectory, it further suggests that R actually grew at a higher rate this year than Python – 15.1% to 14.2%. But without knowing more about the composition and sampling of the survey audience, it’s difficult to attach too much importance to these results. Granted, they’re context-specific, but we have no way of knowing whether the audience surveyed is representative or skewed in one direction or another.
Ultimately, it’s not clear that the question is answerable with data at the present time. Still, a few things seem clear. Both languages are growing, and both can be used for data science. Python is more versatile and widely used, R more specialized and capable. And while the gap has been narrowing as Python has become more data science capable, there’s a long way to go before it matches the library strength of R – which continues to progress in the meantime.
How you assess the future path depends on how you answer a few questions. At RedMonk, we typically bet on the bigger community, but that’s not as easy here. Python’s total community is obviously much larger, but it seems probable that R’s community, which is more or less strictly focused on data science, is substantially larger than the subset of the Python community specifically focused on data. Which community do you bet on then? The easy answer is general purpose, but that undervalues the specialization of the R community on a discipline that is difficult to master.
While the original argument is certainly defensible, then, I find it ultimately unpersuasive. The evidence isn’t there, at least not yet, to convince me that R is being replaced by Python on a volume basis. With key packages like ggplot2 being ported, however, it will be interesting to watch for any future shift.
In the meantime, the good news is that users do not need to concern themselves with this question. Both runtimes are viable as data science platforms for the foreseeable future, both are under active development and both bring unique strengths to the table. More to the point, language usage here does not need to be a zero sum game. Users that wish to leverage both, in fact, may do so via the numerous R<==>Python bridges available. Wherever you come down on this issue, then, rest assured that you’re not going to make a bad choice.
Disclosure: I use R daily, I use Python approximately monthly.
The problem for Microsoft isn't that the PC ceased being the primary computing device. It's that you can't charge for software anymore.
— Horace Dediu (@asymco) August 27, 2013
On the surface, this statement by Asymco analyst Horace Dediu is clearly and obviously false. For 2013, Microsoft’s Windows and Business (read: Office) divisions alone generated, collectively, $44B in revenue. This number was up around 4% from the year before, after being up 3% in 2012 versus the year prior. This comment, in other words, is easily dismissed as hyperbole.
But given the overwhelming amount of evidence contradicting the above statement, and his familiarity with capital markets, it’s highly unlikely that Dediu would be unaware of this. It is reasonable, therefore, to conclude that he did not intend for the statement to be interpreted literally. Which in turn implies that Dediu is making a directional statement rather than a literal description of the market reality.
Even if one gives Dediu the benefit of the doubt, for the sake of argument, and assumes subtlety, the next logical counterargument is that he’s unduly influenced by his focus on consumer markets. The trend there, after all, is clear: the majority of available consumer software is subsidized by either advertising (e.g. Facebook, Google, Twitter) or hardware (e.g. Apple). More to the point, both of these models are exerting pressure on the paid software model, as in the case of Apple’s iWork and Google Docs competing for mindshare with the non-free Microsoft Office, or the now free OS X (non-server) positioned against the non-free Microsoft Windows. Even in hot application spaces like mobile, it’s getting increasingly difficult to commercialize the output.
If this is your analytical context, then – and certainly Dediu’s primary focus (Asymcar notwithstanding) is on Apple and markets adjacent to Apple – the logical conclusion is indeed that software prices are heading towards zero in most categories, and that software producers need to adjust their revenue models accordingly.
No surprise, then, that enterprise providers label the decline in realizable revenues a consumer-software-only phenomenon. By doing so, they reassure both themselves and the market that they are uniquely immune, insulated from an erosion in the valuation of software as an asset by factors ranging from the price insensitivity and inertia of enterprise buyers to technical and/or practical lock-in. And to be fair, enterprise software markets are eminently more margin-oriented than consumer alternatives, not least because businesses are used to regarding technology as a cost of doing business. For consumers, it has historically been more of a luxury.
But the fact is that the assertion that it’s getting more difficult to charge for software is correct, as we have been arguing since 2010/2011.
The surface evidence, once again, contradicts this claim. Consider the chart of Oracle’s software revenue below.
This, for Oracle, is the good news. With few exceptions, notably a market correction following the internet bubble, Oracle has sustainably grown its software revenue every year since 2000. The Redwood Shores software giant, in fact, claimed in October that it was now the second largest software company in the world by revenue behind Microsoft, passing IBM. If a company that large can continue to generate growth, year after year, it’s easy to vociferously argue that the threat of broader declines in the viability of commercial software-only models is overblown. But this behavior, common to software vendors today, increasingly has a whistling-past-the-graveyard ring to it.
Whatever your broader thoughts on the mechanics of the theory of disruption advanced by Dediu’s mentor, Harvard Business School professor Clayton Christensen, history adequately demonstrates that even highly profitable, revenue generating companies are vulnerable. Oracle’s software-sales business, for example, is challenged by a variety of actors, from open source projects to IaaS and SaaS service-based alternatives. To its credit, the company has hedges against both in BerkeleyDB/MySQL/etc. and its various cloud businesses. It’s not clear, however, that even collectively these could offset any substantial impact to its core software sales business – while not broken out, MySQL presumably generates far less revenue than the flagship Oracle database. Software was 67% of Oracle’s revenue in 2011, a year after it acquired Sun Microsystems and its hardware businesses. In 2013, software comprised 74% of Oracle’s revenue.
The question for Oracle and other companies that derive the majority of their income from software, rather than with software, is whether there are signs underneath the surface revenue growth that might reveal challenges to the sustainability of those businesses moving forward. Consider Oracle’s 10-K filings, for example. Unusually, as discussed previously, Oracle breaks out the percentage of its software that derives from new licenses. This makes it easier to document Oracle’s progress at attracting new customers, and thereby the sustainability of its growth. The chart below depicts the percentage of software revenue Oracle generated from new licenses from 2000-2013.
There are a few caveats to be aware of. First, there are contradictions in the 2002 and 2003 10-Ks; second, where the 2012 10-K reported “New software licenses,” the 2013 10-K terms this “New software licenses and cloud software subscriptions.” With those in mind, the trendline here remains clear: Oracle’s ability to generate new licenses is in decline, and has been for over a decade. At 38% in 2013, the percentage of revenue Oracle derives from new licenses is only slightly more than half of what it was in 2000 (71%). Some might attribute this to the difficulty large incumbents have in organically generating new business, but in the year 2000 Oracle was already 23 years old.
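For those inclined to check the arithmetic, here is a trivial sketch of the endpoint comparison. It uses only the two figures quoted above, not the full 10-K series:

```python
# Share of Oracle software revenue derived from new licenses,
# per the 10-K endpoints quoted above (interim years omitted).
new_license_share = {2000: 0.71, 2013: 0.38}

# How does the 2013 share compare to the 2000 share?
ratio = new_license_share[2013] / new_license_share[2000]
print(f"2013 share is {ratio:.0%} of the 2000 level")
# → 2013 share is 54% of the 2000 level
```

In other words, the new-license share has been cut roughly in half over the period, whatever one makes of the minor reporting inconsistencies along the way.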
What this chart indicates, instead, is that Oracle’s software revenue growth is increasingly coming not from new customers but from existing customers. Which is to the credit of Oracle’s salesforce, in spite of what the company characterized as their “lack of urgency.”
It may not be literally true, as Dediu argued above, that you can’t charge for software anymore. But it’s certainly getting harder for Oracle. And if it’s getting harder for Oracle, which has a technically excellent flagship product, it’s very likely getting harder for all of the other enterprise vendors out there that don’t break out their new license revenues as Oracle does. This is not, in other words, an Oracle problem. It’s an industry problem.
Consumer software, enterprise software: it doesn’t much matter. It’s all worth less than it was. If you’re not adapting your models to that new reality, you should be.
Disclosure: Oracle is not a RedMonk client. Microsoft has been a RedMonk client but is not currently.
In the beginning – October, 2003 to be precise – there was the Google File System. And it was good. MapReduce, which followed in December 2004, was even better. Together, they served as a framework for Doug Cutting’s original work at Yahoo, work that resulted in the project now known as Hadoop in 2005.
After being pressed into service by Yahoo and other large web properties, Hadoop’s inevitable standalone commercialization arrived in the form of Cloudera in 2009. Founded by Amr Awadallah (Yahoo), Christophe Bisciglia (Google), Jeff Hammerbacher (Facebook) and Mike Olson (Oracle/Sleepycat) – Cutting was to join later – Cloudera oddly had the Hadoop market more or less to itself for a few years.
Eventually the likes of MapR, Hortonworks, IBM and others arrived. And today, any vendor with data processing ambitions is either in the Hadoop space directly or partnering with an entity that is – because there is no other option. Even vendors with no major data processing businesses, for that matter, are jumping in to drive other areas of their business – Intel being perhaps the most obvious example.
The question today is not, as it was in those early days, what Hadoop is for. In the early days of the project, many conversations with users about the power of Hadoop would stall when they heard words like “batch” or compared MapReduce to SQL (see Slide 22). Even employers already on board, like Facebook, faced with a market shortage of MapReduce-trained candidates, were forced to write alternative query mechanisms like Hive themselves. All of which meant that conversations about Hadoop were, without exception, conversations about what Hadoop was good for.
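The objection is easier to appreciate with a concrete example. The canonical MapReduce demonstration is a word count, which Hive can express as a single SELECT ... GROUP BY statement but which the raw programming model forces into explicit map and reduce phases. A minimal pure-Python sketch of that model (illustrative only, not Hadoop itself):

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def reduce_phase(pairs):
    """Reduce: sum the counts emitted for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

docs = ["big data big deal", "big data"]
print(reduce_phase(map_phase(docs)))  # → {'big': 3, 'data': 2, 'deal': 1}
```

Asking analysts accustomed to declarative SQL to restructure every query this way was a hard sell, which is precisely the gap Hive was built to close.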
Today, the reverse is true: it’s more difficult to pinpoint what Hadoop isn’t being used for than what it is. There are multiple SQL-like access mechanisms, some like Impala driving towards lower and lower latency queries, and Pivotal has even gone so far as to graft a fully SQL-compliant relational database engine on to the platform. Elsewhere, projects like HBase have layered federated database-like capabilities onto the core HDFS Hadoop foundation. The net of which is that Hadoop is gradually transitioning from a strictly batch-oriented system aimed at specialized large dataset workloads into a more mainstream, general purpose data platform.
The large opportunity that lies in a more versatile, less specialized Hadoop helps explain the behavior of participating vendors. It’s easier to understand, for example, why EMC is aggressively integrating relational database technology into the platform if you understand where Hadoop is going versus where it has been. Likewise, Cloudera’s “Enterprise Data Hub” messaging is clearly intended to achieve separation from the perception that Hadoop is “for batch jobs.” And the size of the opportunity is the context behind IBM’s comments that it “doesn’t need Cloudera.” If the opportunity, and attendant risk, were smaller, IBM would likely be content to partner. But it is not.
Nor is innovation in the space limited to those who would sell software directly; quite the contrary, in fact. Facebook’s Presto is a distributed SQL engine built directly on top of HDFS, and clones of Google’s Spanner and its kin are as inevitable as Hadoop once was. Amazon’s Redshift, for its part, is gathering momentum amongst customers who don’t wish to build and own their own data infrastructure.
Of course, Hadoop could very well be years behind Google from a technology perspective. But even if the Hadoop ecosystem is the past to Google, it’s the present for the market. And questions about that market abound. How does the market landscape shake out? Are smaller players shortly to be acquired by larger vendors desperate not to be locked out of a growth market? Will the value be in the distributions, or higher level abstractions? How do broadening platform strategies and ambitions affect relationships with would-be partners like a MongoDB? How do the players continue to balance the increasing trend towards open source against the need to optimize revenue in an aggressively competitive market? Will open source continue to be the default, baseline expectation, or will we see a tilt back towards closed source? Will other platforms emerge to sap some of Hadoop’s momentum? Will anyone seriously close the gap between MapReduce/SQL analyst and Excel user from an accessibility standpoint?
And so on. These are the questions we’re spending a great deal of time exploring in the wake of the first Strata/HadoopWorld in which Hadoop deliberately and repeatedly asserted itself as a general purpose technology. From here on out, the stakes are higher by the day, and the margin for error lower. To she who gets more of the answers to the above questions correct go the spoils.
Not surprisingly for an organization that has updated its product line 200 times this year as of the first of the month, Amazon had a few tricks up its sleeve for its annual re:Invent conference. For the company that effectively created the cloud market, the show was an important one for showcasing the sheer scope of Amazon’s targets.
Amazon is correctly regarded as one of the fastest innovating vendors in the world, with the release pace up over 500% from 2008 through last year. And if Amazon keeps up its pace for releases through the end of the year, it will have released 36% more features this year than last.
But as impressive as the pace is, the more impressive – and potentially more important – aspect to their release schedule is its breadth. Consider what Amazon announced at re:Invent:
- AppStream (Mobile/Gaming)
- CloudTrail (Compliance and Governance)
- Kinesis (Streaming)
- New Instance Types in C3/I2 (Performance compute)
- RDS Postgres (Database as a Service)
- Workspaces (VDI)
The majority of cloud vendors today are focused on executing with core cloud workloads, meaning basic compute and storage. There are certainly players focused on adding value through differentiated, specialized technologies, such as Joyent with its distributed-Unix, data-oriented Manta offering or ProfitBricks with its scale-up approach, but these are the exception rather than the rule. Whether it’s public cloud providers or enterprises attempting to build out private cloud capabilities, most of the focus is on simply keeping the lights on.
At re:Invent, Amazon did upgrade its traditional compute offerings via C3/I2, but it also signaled its intent to embrace and extend entirely new markets. Most obviously, Amazon has, with Workspaces, turned its eye towards VDI, for years a market long on promise but short on traction. The theoretical benefits of VDI, from manageability to security, have to date rarely outweighed the limitations and costs of delivery, making it the Linux desktop of IT – with success always just over the horizon. Amazon’s bet here is that by removing the complexity of execution it can engage with customers in a manner that its core cloud businesses cannot, and thereby grow its addressable market in the process.
Similarly, Kinesis is an entry into a specialized market that has typically been the province either of vendor packages – e.g. IBM InfoSphere Streams – or of more recent open source combinations such as Storm/Kafka. Of specific interest with Kinesis is the degree to which Amazon is leading the market rather than responding to it. When questioned on the topic, Amazon said that Kinesis, unlike offerings such as Workspaces, was not a response to widespread customer demand. Instead, Amazon is anticipating future market needs with Kinesis, and attempting to deliver ahead of them.
AppStream, for its part, is effectively a Mobile/Gaming-backend-as-a-service, putting providers in that space on notice. The addition of Postgres as an RDS option, meanwhile, came to wide developer acclaim, but means that Amazon will increasingly be competing with AWS customers like Heroku. And CloudTrail, particularly with its partner list, means that AWS is taking the enterprise market seriously, which is both opportunity and threat for its enterprise ecosystem partners.
Big picture, re:Invent was an expansion of ambition from Amazon. Its sights are even broader than was realized heading into the show, which should give the industry pause. It has been difficult enough to compete with AWS on a rate of innovation basis in core cloud markets; with its widening portfolio of services, the task ahead of would-be competitors large and small just got more difficult.
That being said, however, it is worth questioning the sustainability of Amazon’s approach over the longer term. Microsoft similarly had ambitions not just to participate in but to fundamentally dominate and own peripheral or adjacent markets, and arguably that near infinite scope impacted its focus on its core competencies. The broader and more diverse the business, the more difficult it becomes to manage effectively – not least because you end up making more enemies along the way. It remains to be seen whether or not Amazon’s increasing appetite to cloudify all the things has a similar effect on its ability to execute moving forward, but in the interim customers have a brand new stable of toys to play with.
Disclosure: Amazon, Heroku, and IBM are RedMonk customers, Joyent, Microsoft and ProfitBricks are not.
A year ago, a CTO who had landed a large public round and secured a quarter as much in a less public investment candidly described the process, saying, “this used to be called going public.” MongoDB, the recent beneficiary of a $150M round led by Intel, Salesforce.com and Sequoia, would likely agree. As might Uber, which received $250M in financing from Google Ventures. Going public is clearly no longer the sole route to market for outsized capital requirements.
Which isn’t to imply that venture deal sizes are, on average, increasing. Thanks to a combination of factors from the rise of early stage investment vehicles like Y Combinator to open source software and the public cloud, data gathered by Chris Tacy (below) indicates that if we conflate angel and traditional venture investments, deal volume is up but the size of individual deals is actually in decline.
But at the opposite end of the spectrum, anecdotal evidence suggests that private funding is increasingly competing with public markets in ways not seen previously. The question is whether the data validates the assumption that private companies are being funded on a scale historically competitive with public market returns, and what this means for the wider market moving forward.
To explore the first question, it’s useful to examine data (PDF) on US Initial Public Offerings from 1980-2012 collected by Professor Jay R. Ritter of the University of Florida. In his own words, the sample includes “IPOs with an offer price of at least $5.00, excluding ADRs, unit offers, closed-end funds, REITs, partnerships, small best efforts offers, banks and S&Ls, and stocks not listed on CRSP (CRSP includes Amex, NYSE, and NASDAQ stocks).” For example, here is the total number of IPOs per year beginning in 1980.
It should be no surprise to most that public offerings spiked in the late 1990s. The Tulipmania-like hysteria that absorbed the technology industry – and eventually, the world – during the bubble has been well documented. What’s interesting about this chart, however, is that it indicates that the market has yet to recover from the tech-driven crash in public offering volumes. The median number of IPOs per year from 1980 to 2012 is 174. We have not seen that many in a given year since 2004. The recent recession, of course, undoubtedly depressed the appetite for entities to take themselves public. But even in years of relative prosperity, domestically, IPOs seem to have lost some of their luster.
One potential explanation would be the returns. Below is a chart of the aggregate proceeds from all IPOs in a given year as calculated by Ritter. To normalize them for context, however, all numbers have been adjusted for inflation. Dollar amounts depicted, therefore, represent an approximated value in 2013 US dollars.
While the trendlines don’t match precisely, it’s interesting and perhaps not surprising to note the strong correlation between the returns from public offerings and their frequency. It is also worth noting that while proceeds have recovered more strongly than volume, the aggregate returns from public offerings remain depressed. From 1980 to 2012, the median return in 2013 dollars for the aggregate of a year’s worth of public offerings is $28.5B – a figure that hasn’t been reached in four of the last six years. An analysis of the average individual returns, however, challenges the hypothesis that the lack of an expected return is preventing would-be IPOs from transacting.
The above chart depicts the aggregate returns for a given year divided by the number of IPOs – providing us with, essentially, an average IPO return. Even after normalizing against a 2013 dollar scale, it’s apparent that the realizable returns per transaction are still growing (if you’re curious about the 2008 outlier, that’s the year VISA went public and raised ~$17B). Which in turn should mean that the incentive to go public remains, and certainly entities from Google (2004) to Facebook (2012) to the aforementioned Twitter have chosen that path in spite of the availability of capital in private markets.
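The calculation behind that chart is straightforward division. The sketch below illustrates it with hypothetical placeholder figures (not Ritter's actual data) to show how a single outsized deal can skew a thin year:

```python
# Average IPO return per year: aggregate proceeds divided by deal count.
# All figures below are hypothetical placeholders for illustration,
# not values drawn from Ritter's dataset.
aggregate_proceeds = {2004: 30.0e9, 2008: 25.0e9}  # in 2013 dollars
ipo_count = {2004: 200, 2008: 21}

avg_return = {year: aggregate_proceeds[year] / ipo_count[year]
              for year in aggregate_proceeds}

# In the thin year, one very large offering (as with VISA's ~$17B in 2008)
# dominates the numerator and produces an outlier average.
for year, avg in sorted(avg_return.items()):
    print(f"{year}: ${avg / 1e6:,.0f}M average per IPO")
```

The point is simply that an average over a small denominator is fragile, which is why a single mega-offering registers so visibly in the series.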
Still, it is interesting to observe that deals like MongoDB’s $150M round dwarf the expected returns from historical IPOs, even after adjusting for inflation. For example, from 1980 to 1997 the average adjusted return from a public offering never eclipsed $100M. Since then it has expanded dramatically, with the median adjusted return since 1997 weighing in at approximately $253M, roughly $100M more than MongoDB raised in its last round.
If more companies, then, are either delaying going public or avoiding the public markets entirely, one would expect to see a rise in venture-backed companies eventually going public. While the costs of starting and running businesses have in many respects come down due to dramatic drops in the costs of technical infrastructure among other categories, these have in many respects been offset by spikes in other areas, notably healthcare. Which means that whether public or private, growing companies are likely to still require financing to fuel growth. And indeed, we find exactly this sort of trajectory in venture-backed companies.
The above chart depicts the percentage of IPOs from technology entities that were backed by venture capital. While the overall percentage has always been high, the trendline is clearly towards greater VC participation. Which makes sense in the wake of a decade of decreased reliance on public market alternatives.
As for what all of this means moving forward, the answers are unclear. In the aggregate, the private market is obviously not lacking for available capital. Just as clearly, decline in volume or no, the returns remain there for public market entrants – or at least some of them. But as the number of large venture deals that approximate the anticipated returns from a public offering appears to be on the rise, it’s worth monitoring the dynamic between public and private funding sources. In the meantime, we’re likely to continue seeing the kinds of deals that “used to mean going public.”
(All photos courtesy Maney Digital)
In a 2001 piece for the New York Times, the now sadly departed Elmore Leonard summed up his tenth and final rule on how to write simply: “Try to leave out the part that readers tend to skip.” Without claiming any particular success, this is essentially the philosophy behind the Monktoberfest. In effect, it’s an attempt to answer the question: what would happen if we threw a conference without the parts that people skip?
Consider sponsored talks, for example. While it is not technically impossible to deliver a sponsored talk that engages an audience, the overlap between great talks and paid talks is tiny. Most end up as little more than infomercials. So we lose them. Then there’s timing. For a conference aimed at and built for developers, who tend not to be the early-rising type, why would we start the conference at the more typical 8 AM? 10 AM is much more civilized. And what do people most frequently want to skip at a conference? Meals delivered by a staff whose focus is scaling the food, not crafting the food. Many fewer people, on the other hand, skip a sushi lunch or a dinner that includes lobsters caught by the caterer’s husband the afternoon before.
While this is a bit of a different approach for conferences, the logic behind it seems straightforward. In my experience, the quality of any given conference will ultimately be determined not by the food, drink or even the speakers – as important as they are. The value of a conference is determined instead by its people. Why, then, would we optimize for anything but the people?
Whether we succeeded will be determined in the weeks and months ahead, as the impact of the individual talks ripples outwards, we see the manifestations on social media and elsewhere of new connections made at the show and so on. But the early returns are gratifying.
#monktoberfest a blast. Met lots of smart people, learned much about tech… and beer.
— Brian Proffitt (@TheTechScribe) October 5, 2013
Awesome 3 days at #monktoberfest. Very fortunate to get a chance to attend and speak.
— Michael Ducy (@mfdii) October 4, 2013
Home again, home again, jiggity jog. Thanks to all for a wonderful #monktoberfest!
— Kelly Smith (@kellypuffs) October 4, 2013
#monktoberfest was awesome – great to meet all the new people
— Aneel (@aneel) October 4, 2013
If I was going to be a tech conference today, it would've been #monktoberfest.
— Defrag/Glue (@defrag) October 3, 2013
I love how much of my Twitter feed is dedicated to the awesome stuff going on at #monktoberfest.
— Alex King (@alexkingorg) October 3, 2013
#monktoberfest always makes me leave more hopeful for the industry and society. People really care.
— Christopher Petrilli (@petrillic) October 4, 2013
— Mike Shaver (@shaver) October 3, 2013
The last quote from Mike is perhaps the most important to me personally. People who have never attended the Monktoberfest will ask me what it’s all about, and my answer is that it’s about the intersection of social and technology. It’s about how technology changes the way that we socialize, and how the way that we socialize changes the way that we build technology. But within that broad framework, speakers have a great deal of latitude to interpret the constraints in interesting ways. In doing so, as Mike says, they make me think about why I think what I think. They make me think about what I’m doing, why I’m doing it, and how I can help. They inspire me, and I seriously doubt that I’m the only one. They are, in short, the kinds of talks that don’t necessarily have a home at other shows.
As with most large productions, the Monktoberfest is a group effort, and as such, there are many people to thank.
- Our Sponsors: Without them, there is no Monktoberfest
- IBM MobileFirst: In an industry littered with the carcasses of businesses that couldn’t adapt to change, IBM is one of the few major technology companies in existence that has survived not one but multiple waves of disruption. The driving force behind most disruption today is the developer – nowhere is this more apparent than in mobile – and we appreciate IBM’s strong support as our lead sponsor in helping to bring them the conference they deserve.
- Red Hat: As the world’s largest pure play open source company, there are few who appreciate the power of the developer better than Red Hat. Their support as an Abbot Sponsor – the third year in a row they’ve sponsored the conference, if I’m not mistaken – helps us make the show possible.
- ServiceRocket: When we post the session videos online in a few weeks, it is ServiceRocket that you will have to thank.
- EMC: Enjoyed your surf & turf dinner? Take a minute to thank the good folks from EMC.
- Rackspace/Splunk: It’s much easier to splurge on fresh sushi when you have partners like Rackspace and Splunk helping to make it possible.
- Basho: When you came in a little under the weather on Thursday and treated yourself to a breakfast sandwich, that was Basho’s doing.
- Atlassian/AWS/Brick Alloy/Citrix/CloudSpokes/Docker/Moovweb/Opscode/Rackspace: Remember the rare beers served at the event – one of which included the only barrel available in the US? These are the people that brought it to you. And be sure to thank Atlassian especially, as they brought you four separate rounds.
- Brick Alloy/Crowd Favorite: While we continue to search for a reasonable solution to the difficult challenges posed by a hundred plus bandwidth-hungry geeks carrying three or more devices per person, Brick Alloy and Crowd Favorite at least deferred the load onto local repeaters.
- Rackspace: The glasses this year came courtesy of Rackspace, as our attendees will be reminded every time they drink a craft beverage from one.
- Moovweb: Moovweb, meanwhile, addressed the afternoon munchies.
- O’Reilly: Lastly, we’d like to thank the good folks from O’Reilly for being our media partner yet again.
- Our Speakers: Every year I have run the Monktoberfest I have been blown away by the quality of our speakers, a reflection of their abilities and the effort they put into crafting their talks. At some point you’d think I’d learn to expect it, but in the meantime I cannot thank them enough. Next to the people, the talks are the single most defining characteristic of the conference, and the quality of the people who are willing to travel to this show and speak for us is humbling.
- Ryan and Leigh: Those of you who have been to the Monktoberfest previously have likely come to know Ryan and Leigh, but for everyone else they are one of the best craft beer teams not just in this country, but the world. And they’re even better people, having spent the better part of the last few months sourcing exceptionally hard to find beers for us. It is an honor to have them at the event, and we appreciate that they take time off from running the fantastic Of Love & Regret on behalf of Stillwater Ales down in Baltimore, MD to be with us.
- Lurie Palino: Lurie and her catering crew have done an amazing job for us every year, but this year was the most challenging yet due to some unfortunate and unnecessary licensing demands presented days before the event. As she does every year, however, she rolled with the punches and delivered an amazing event yet again, with no small assist from her husband, who caught the lobsters, and her incredibly hard working crew at Seacoast Catering.
- Kate (AKA My Wife): Besides spending virtually all of her non-existent free time over the past few months coordinating caterers, venues and overseeing all of the conference logistics, Kate was responsible for all of the good ideas you’ve enjoyed, whether it was the masseuses last year or the cruise this year. She also puts up with the toll the conference takes on me and my free time. I cannot thank her enough.
- The Staff: From Juliane and James securing and managing all of our sponsors to Marcia handling all of the back end logistics to Kim, Ryan and the rest of the team handling the chaos that is the event itself, we’ve got an incredible team that worked exceptionally hard.
- Our Brewers: I’d like to thank Jim Conroy of The Alchemist, Josh Wolf of Allagash, Greg Norton of Bier Cellar, Mike Fava and Tim Adams of Oxbow, and Brian Strumke of Stillwater for taking time out of their busy schedules to be with us. The Alchemist and Allagash, in addition, were kind enough to provide giveaways to our attendees and speakers, respectively.
- Mike Maney: If he’s not the most enthusiastic Monktoberfest attendee, I’m not sure who would be. Last year he embarked on an epic 7 state road trip to the conference, and this year he sourced three bottles of Dogfish hand signed by none other than the founder of the brewery, Sam Calagione. These we were able to give away to attendees thanks to Mike’s efforts.
- Caroline McCarthy & Mike McClean of Abbey Cat Brewing: At the conclusion of our brewers’ panel featuring the Alchemist, Allagash, Bier Cellar, Oxbow and Stillwater, our panelists were each issued a customized Monktoberfest mash paddle. This came courtesy of a connection from Monktoberfest speaker Caroline McCarthy, who introduced me to Mike McClean, who graciously furnished us with the paddles gratis. Abbey Cat Brewing, in Mike’s words, makes “mash paddles, with the help of a sweatshop staffed entirely by foster kittens.” What he failed to add is that they are gorgeous creations. And before you ask, yes, we have pictures of the paddles with kittens.
With that, we close this year’s Monktoberfest. For everyone who was a part of it, I owe you my sincere thanks. You make all the blood, sweat, and tears worth it. Stay tuned for details about next year, and in the meantime, you might be interested in Thingmonk or the Monki Gras, RedMonk’s other two conferences.
The following was meant to be ready in time for the Platform conference last week, but travel intervened. Belated though it is, it may be of interest to those tracking the PaaS market. At RedMonk, the volume of inquiries related directly and indirectly to PaaS has been growing rapidly, and these are a few of the more common questions that we’re fielding.
Q: Is PaaS growing?
A: The short answer is, by most measurements – search traffic included – yes.
The longer answer is that while interest in PaaS is growing, its lack of visibility on a production basis is adding fuel to those who remain skeptical of the potential for the market. Because PaaS was over-run in the early days by IaaS, there are many in the industry who continue to argue that PaaS is at best a niche market, and at worst a dead end.
To make this argument, however, one must address two important objections. First, the fact that the early failures in the PaaS space were of execution, not model. Single, proprietary runtime platforms are less likely to be adopted than open, multi-runtime alternatives for reasons that should be obvious. But perhaps more importantly, those arguing that the lack of production visibility for PaaS today means that it lacks a future must explain why this is true, given that history does not support this point. Quite the contrary, in fact: dozens of technologies once dismissed as “non-production” or “not for serious workloads” are today in production, running serious workloads. The most important factor for most technologies isn’t where they are today, but rather what their trajectory is.
Q: How convenient is PaaS, really?
A: That depends on one’s definition of convenience. It is absolutely true that PaaS simplifies or eliminates entirely many of the traditional challenges in deploying, managing and scaling applications. And given that developers are typically more interested in the creation of applications than the challenges of managing them day to day, these abilities should not be undersold.
That said, PaaS advocates frequently overlook the friction PaaS imposes relative to traditional IaaS alternatives. Terminology, for example, is a frequent source of confusion: the language of infrastructure-as-a-service, which is essentially a virtual representation of physical alternatives, is simple. Servers are instantiated, run applications and databases, have access to a storage substrate and so on. Would-be adopters of PaaS platforms, however, must reorient themselves to a world of dynos, cartridges and gears. Even the metrics are different; rather than being billed by instance, they may be billed by memory or transactions – some of which can be difficult to predict reliably.
Is PaaS more convenient, then? Over the longer term, yes: it will abstract a great deal of complexity away from the application development process. In the short term, however, there are trade-offs. It’s akin to talking with someone who speaks your language, but with a heavy accent or in a different dialect. It’s possible to discern meaning, but it can require effort.
Q: What’s the biggest issue for PaaS platforms at present?
A: While the containerization of an application is far from a solved problem – some applications will run with no issues, while others will break instantly – it is relatively mature next to the state of database integrations. Most PaaS providers at present have distanced themselves from the database, for reasons that are easy to understand: database issues associated with multi-tenant, containerized and highly scalable applications are many. But it does present problems for users. PaaS platform database pricing has typically reflected this complexity, with application charges forming a fraction of the loaded application cost next to data persistence. And many platforms, in fact, have openly advocated that the data tier be hosted on entirely separate, external platforms, which spells high latency as applications are forced to call to remote datacenters even for simple tasks like rendering a page. Expect enhanced database functionality and integration to be a focus and differentiation point for PaaS platforms in the future. This is why several vendors in the space have invested heavily in relationships with communities like PostgreSQL and MongoDB.
Q: Where does PaaS end and where do the layers above and below it begin?
A: This is one of the most interesting, and perhaps controversial, questions facing the market today. In many respects, PaaS is well defined and quite distinct from other market categories; consider the previously mentioned lack of database integration, for example. But in others, the boundaries between PaaS and complementary technologies are substantially less clear. Given the PaaS space’s ambition to abstract away the basic mechanics of application deployment, for example, it seems logical to question the intersection and potential overlap of PaaS and configuration management/orchestration/provisioning software such as Ansible, Chef, Puppet, or Saltstack. PaaS users, after all, are inherently bought into abstraction and automation; will they be content to manage the underlying physical and logical infrastructure using a separate layer? Or would they prefer that be a feature of the platform they choose to encapsulate their applications with?
If we assume for the sake of argument that, at least on some level, traditional configuration management/provisioning will become a feature of PaaS platforms, the next logical question is: what does this mean for both PaaS platform providers and configuration management/orchestration/provisioning players? Should the latter aggressively pursue partnership strategies? Should the former rely upon one or more of these projects, or attempt to replicate the functionality themselves?
From the conversations we’re having, these are the important strategic questions providers are asking themselves right now.
Q: What’s the market potential?
A: We do not do market sizing at RedMonk, believing that it is by and large a guess built on a foundation of other guesses. That said, it’s interesting that so many are relegating PaaS to niche-market status. Forget the fact that even those companies serving conservative buyers such as IBM have chosen to be involved. Consider instead the role that PaaS was built to play. Much as the J2EE application servers abstracted Java applications from the operating systems and hardware layers underneath them, so too does PaaS. It is the new middleware.
Given the size of the Java middleware market at its peak, this is a promising comparison for PaaS. Because while it is true that the commercial value of software broadly has declined since traditional middleware’s apex, PaaS offers something that the application servers never did: multi-runtime support. Where middleware players then were typically restricted to just those workloads running in Java – admittedly a high percentage at the time – there are few if any workloads that multi-runtime PaaS platforms will be unable to target. Which makes the addressable market very large indeed.
Disclosure: IBM and Pivotal (Cloud Foundry) are clients, as are Red Hat (OpenShift), MongoDB and Salesforce/Heroku. In addition, Ansible, Opscode and Puppet Labs are or have been clients.
If you haven’t been following the saga of the Moto X, the short version is that it’s one of the first post-Google acquisition products from the company that gave us the StarTAC and the RAZR. Besides carrying the expectations of a market that needs to see something compelling from Motorola because it’s been a while, the X is also the focal point for analysts seeking an answer to one simple question: why did Google pay over twelve billion dollars for the company? Given the input the folks from Mountain View have had into the product, it’s been assumed that the Moto X would, if not answer that question outright, at least provide a hint.
If that’s the case, however, the answer for many seems to be: because they made a mistake. While the Moto X has seen its share of excellent reviews – see Gizmodo‘s “Moto X Hands On: Forget Specs, This Thing Is Awesome” or the Verge which gave it an 8 out of 10 – negative reactions have been common. It may not be a surprise to see John Gruber dismiss the product, but pieces like BGR’s “Motorola in Dreamland” or TechCrunch’s “Hell no Moto X” are representative of the industry’s disappointment.
While I have yet to get my hands on one of the devices, my bet is some of the gadget reviewers are simply missing the bigger picture. Which is, at least in part, that this phone isn’t built for them.
Consider the various complaints about the device. The disappointing processor? Yes, the benchmarks confirm the Moto X is based on a chip that’s slower than the equivalent in the HTC One or Samsung Galaxy S4. So? How many consumers, realistically, are aware of the chipset in their phone? The only time they’ll notice the processor is if the phone feels slow; none of the hands-on reviews I’ve seen yet make this claim.
How about the “gross” AMOLED Screen? Well, if a reviewer like The Verge’s Joshua Topolsky has to study the Moto X and a higher resolution screen such as the HTC One “side by side to make out the difference,” it seems unlikely the average consumer will have a problem with the display. Particularly given that the pixel density is just shy of iPhone-Retina.
Most of the negative reactions are focusing, in other words, on the phone’s underwhelming technical benchmarks. Which is interesting, because the history of this market, brief though it may be, does not suggest that the market rewards the most sophisticated handset. The iPhone, you might recall, did not add the 3G connectivity common to competitive handsets until Apple could ensure acceptable battery life. The HTC Evo, by contrast, was a marvel of engineering with an enormous screen and every connectivity option known to man – from HDMI to WiMAX. It also had a battery life of about an hour.
What Apple understands, and what the Moto X may reflect, is that technology is less important than experience. All things being equal, faster processors and brighter, higher resolution screens are preferable. But until we see significant advances in battery technology – the kinds of advances that are perpetually two to three years away – all things are not going to be equal.
So while those critiquing the Moto X for its pedestrian processor and so on focus on the components that won’t be found in the coming iFixit teardown, my guess is that the average user will be more impressed by a full day’s worth of usage – which is very different from a full day’s worth of talk time (who uses their phone as a phone these days, anyway?) – than by a faster phone with a brighter screen. Just as they once picked EDGE-capable iPhones over 3G competition.
Maybe the Made-in-the-USA factor will emerge as a selling point as well, and the ability to customize the appearance probably will, but at the end of the day the performance of the Moto X will depend on how well Google and Motorola have learned from Apple. Apple has never been about the underlying technology, and much to the consternation of tech reviewers everywhere, the Moto X doesn’t appear to be either.
Whether that pays off will be interesting to see.
Disclosure: There’s nothing to disclose. Google is not a client, and I do not have a Moto X, review unit or otherwise.
One of the things we track internally, for the sake of contextual curiosity more than anything, is the market performance of firms that can be at least loosely described as technology oriented. While it’s foolish to assign any serious import to rankings based on the vagaries of market performance, it is nevertheless interesting to understand how the market values – or does not value – various entities, particularly in relation to one another. Besides simple metrics such as market capitalization, it’s also useful to be aware of the wider context: when was a firm founded? How does it generate revenue? From these patterns, and particularly from watching them over time, it is possible to get a sense for how the technology landscape is evolving, and from there understand what adaptations may be necessary moving forward.
The list of the 55 largest public technology entities that we’re tracking at present, ordered by current market cap, is available here. Please note, however, that no claims are made that this list is definitive. The most notable omission is carriers, and their omission looks increasingly problematic as they push further into cloud and network related services. It’s likely they’ll be added in future iterations.
If there are other public entities you believe to be missing, by all means let us know in the comments and we’ll review them and amend the list as necessary.
With the aforementioned caveats that the list is not definitive and that market perceptions do not necessarily match company merit, a few notable takeaways from a quick examination of the list.
- Age: The median age of the Top 55 tech companies is 28 years, meaning that a representative entity would have been founded in 1985. This is less than surprising in one sense, given that larger companies have long leveraged startups as a means of outsourced innovation: rather than enter higher risk emerging markets themselves – markets they are not built to attack in any event – they can sit back and attempt to acquire the successful innovators, considering the M&A premium their cost of innovation. Still, it’s interesting that the shape of the technology market, often considered one of the fastest moving industries in the world, is in part defined by the decisions of companies that might have been founded the year that New Coke debuted and Back to the Future was released.
- Revenue Source: Of the 55 companies tracked, 21 derive their revenue primarily from sales of software while 34 do not. In other words, the average Top 55 technology firm is roughly one and a half times more likely to generate the bulk of its revenue from something other than software. Interestingly, the Top 25 members of the list are even less likely to rely on software as their primary revenue source: 7 are primarily software oriented while 18 are not. Make no mistake: software is absolutely eating the world, as Marc Andreessen has said. Every company on this list relies on software for its business. But the data indicates that more companies are making money with software than from software, which is something of a departure from a decade ago, when Microsoft’s dominance encouraged many to replicate the software revenue model.
- Annual Performance: In terms of market performance over time, here is the list of companies in descending order of market cap generated per year of existence: 1) Google ($20B/year), 2) Apple ($11B), 3) Facebook ($10B), 4) Amazon ($8B), 5) Microsoft ($7B), 6) Cisco ($5B), 7) Oracle ($4B), 8) Qualcomm ($4B), 9) Baidu ($4B), 10) Taiwan Semiconductor ($3B). Given that this measurement advantages younger firms to some degree, the presence of entities like Apple, Microsoft and Oracle is impressive.
- The Arena: A few relative valuations that may be of interest. Google is currently worth almost 10 Yahoos. Dell ($22.4B) is currently worth less than LinkedIn ($23.1B). ARM Holdings is worth less than you might expect, given its importance; Nokia is currently worth ~$2B more. Red Hat, the world’s largest pure play open source company, is meanwhile worth almost as much as Teradata ($10.2B to $9.9B) and more than Electronic Arts, F5 and Rackspace. And while Qualcomm tends to be something of a behind the scenes player, its Top 10 performance has it more valuable than VMware, Yahoo and Salesforce.com combined.
- Industry Size: The combined worth of these entities is $2.8T.
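For the curious, the “market cap generated per year of existence” metric above is simply a company’s current market capitalization divided by its age. The sketch below illustrates the calculation; the capitalization figures and founding years are rough 2013 approximations chosen to be consistent with the rankings above, not precise data.

```python
# Illustrative sketch of the "market cap per year of existence" metric.
# Figures are approximate 2013 values, for demonstration only.
caps = {  # name: (market cap in $B, founding year)
    "Google":    (290, 1998),
    "Apple":     (430, 1976),
    "Microsoft": (270, 1975),
}

def cap_per_year(cap_b, founded, as_of=2013):
    """Market capitalization generated per year of existence, in $B."""
    return cap_b / (as_of - founded)

# Rank companies in descending order of the metric.
ranked = sorted(caps, key=lambda c: cap_per_year(*caps[c]), reverse=True)
for name in ranked:
    print(name, round(cap_per_year(*caps[name]), 1))
```

Even with Apple’s far larger absolute valuation, Google’s youth puts it on top – which is why the metric, as noted above, advantages younger firms.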
Again, it’s important not to read too much into the above, particularly with respect to market valuations which are volatile by nature. As a snapshot of a given point in time, however, it’s useful to understand how companies and their strategies are perceived more broadly, and what this means for them moving forward.
When the OpenStack project was launched in 2010, IBM was one of many vendors in the industry offered the opportunity to participate. And though OpenStack launched with a nearly unprecedented list of supporters, IBM was not among them. In spite of their lack of a public commitment to an existing open source cloud platform – they had their own service offering in SmartCloud – they declined to join the project.
Until they did two years later.
In 2012, IBM joined along with Red Hat, another industry player that had passed on the initial opportunity to get on the OpenStack train. The original decision and the subsequent about-face may seem contradictory, but together they are nothing more or less than the inevitable consequence of how IBM approaches emerging markets.
For many customers, particularly risk averse large enterprises and governments, one of IBM’s primary assets is trust. IBM is in many respects the logical reflection of its customers, who are disinclined – for better and for worse – to reinvent themselves technically as each new wave of technology breaks, as each new “game changing” technology arrives. Instead, IBM adopts a wait and see approach. It was nine years after the Linux kernel was released that IBM determined that the project’s momentum, not to mention the potential strategic impact, made it a worthwhile bet. At which point it promised to inject $1 billion into the ecosystem, a figure that represented a little over 1% of its revenue and fully a fifth of its R&D expenditures that year.
Which is not to compare IBM’s commitment last week to Cloud Foundry to its investment in Linux, in either dollars or significance. As much as Paul Maritz – one-time head of VMware, now head of Pivotal – is seeking to make Cloud Foundry “the 21st-century equivalent of Linux,” even the project’s advocates would likely admit there’s a long way to go before such comparisons can be made.
The point is rather that when evaluating the significance of IBM’s decision to publicly back Cloud Foundry, it’s helpful to put their decision making in context. Decisions of this magnitude cannot be made lightly, because IBM cannot return to enterprise customers who have built on top of Cloud Foundry at their recommendation in two years with a mea culpa and a new platform recommendation.
IBM’s support for the Cloud Foundry project signals their belief that the PaaS market will be strategic. Given the aforementioned context, it also means that after an extended period of evaluation, IBM has decided that Cloud Foundry represents the best bet in terms of technology, license and community moving forward. These are the facts, as they say, and they are not in dispute. The primary question to be asked around this announcement, in fact, is less about Cloud Foundry and IBM – we now know how they feel about one another – and more to do with what it portends for the PaaS market more broadly.
A great many in the industry, remember, have written off Platform-as-a-Service for one reason or another. For some VCs it’s the lack of return from various PaaS-related investments; for the odd reporter here or there, it’s the lack of traction for early PaaS players like Force.com or Google App Engine relative to IaaS generally and Amazon specifically. And for developers, it’s frequently the question of whether yet another layer of abstraction needs to be added to the virtual machine, IaaS fabric, operating system, runtime/server, programming language framework and so on. The developer’s primary complaint used to be the constraints – runtime choice, database options and so on – but these have largely subsided in the wake of what we term third generation PaaS platforms: platforms that offer multiple runtimes and other choices, in other words. Platforms like Cloud Foundry, OpenShift and so on.
But while it’s difficult to predict the future of PaaS, particularly the rate of uptake – certainly it hasn’t gone mainstream as quickly as anticipated here – the history of the industry may offer some guidance. For as long as we’ve had compute resources, additional layers of abstraction have been added to them. Generally speaking this has been for reasons of accessibility and convenience; it’s easier to code in Ruby, as but one example, than Assembler. But some abstractions, middleware in particular, have long served business needs by offering greater portability between application environments. True, the compatibility was never perfect, and write-once-run-anywhere claims tested the patience of anyone who actually tried it.
Will PaaS benefit from the long term industry trend towards greater levels of abstraction? Having corrected many of the early mistakes that led to premature dismissals of PaaS, it’s certainly possible. Oddly, however, many of the would-be players in the space remain reluctant to make the obvious comparison, that PaaS is the new middleware. Rather than attempt to boil the ocean by educating and evangelizing the entire set of capabilities PaaS can offer, it would seem that the simplest route to market for vendors would be to articulate PaaS as an application container, one that can be passed from environment to environment with minimal friction. It’s not a dissimilar message from the idea of “virtual appliances” that VMware championed as early as 2006, but it has the virtue of being simpler than packaging up entire specialized operating systems, and is thus more likely to work.
If we assume for the sake of argument, however, that PaaS will continue to make gains with developers and the wider market, the question is what the landscape looks like in the wake of the Cloud Foundry-IBM announcement. It’s obviously early days for the market; IBM-approved or no, Cloud Foundry isn’t yet listed as a LinkedIn skill, and the biggest LinkedIn user group we track had a mere 195 members as of July 15th. But in an early market, the IBM commitment is unquestionably a boost to the project. Open source competitors such as Red Hat’s OpenShift project, closed source vendors like Apprenda, hosted providers like Engine Yard, Force.com/Heroku or GAE will all now be answering questions about Cloud Foundry and IBM, at least in their larger negotiated deals.
As it always does, however, much will come down to execution. Specifically, execution around building what developers want and making it easy for them to get it. All the engineering and partnerships in the world can’t save a project that makes developers lives harder, as we’ve already seen with the first wave of PaaS vendors that failed to take over the world as expected. Whether or not Cloud Foundry can do that with the help of IBM and others will depend on who wins the battle for developers, and that’s one that’s far from over.
Disclosure: IBM is a RedMonk customer, as are Apprenda, Red Hat and Salesforce.com/Heroku. Pivotal is not a RedMonk customer, nor are Google or Engine Yard.
A week away from August, below are our programming language ranking numbers from June, which represent our Q3 snapshot. The attentive may have noticed that we never ran numbers for Q2; this is because little changed. Which is not to imply that a great deal changed between Q1 and Q3, please note, but rather than turn this into an annual exercise, snapshots every six months should provide adequate insight into the relevant language developments occurring over a given time period.
For those that are new to this analysis, it is simply a repetition of the technique originally described by Drew Conway and John Myles White in December of 2010. It seeks to correlate the rankings of two distinct developer communities, GitHub and Stack Overflow, with one another. Since that analysis, they have published a more real time version of their data for those who want day-to-day insights. In all of the times this analysis has been performed, the correlation has never been less than .78; this quarter’s correlation is .79.
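The mechanics of the correlation are simple enough to sketch in a few lines of Python. The rankings below are invented for illustration – they are not actual GitHub or Stack Overflow data – and the hand-rolled Pearson function stands in for whatever statistical tooling one prefers:

```python
# Sketch of the Conway/White-style analysis: rank languages by activity in
# two communities (e.g. GitHub repositories and Stack Overflow tags), then
# measure how strongly the two rankings agree. Data here is made up.
import math

github_rank = {"Java": 1, "PHP": 2, "Python": 3, "Ruby": 4, "C#": 5}
stack_rank  = {"Java": 1, "PHP": 3, "Python": 2, "Ruby": 4, "C#": 5}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

langs = sorted(github_rank)
r = pearson([github_rank[l] for l in langs],
            [stack_rank[l] for l in langs])
print(round(r, 2))  # prints 0.9 for this toy data
```

A correlation near 1 means the two communities largely agree on which languages matter; the .78–.79 figures cited above indicate strong, though imperfect, agreement between the real datasets.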
As always, there are caveats to be aware of.
- No claims are made here that these rankings are representative of general usage more broadly. They are nothing more or less than an examination of the correlation between two populations we believe to be predictive of future use, hence their value.
- There are many potential communities that could be surveyed for this analysis. GitHub and Stack Overflow are used here first because of their size and second because of their public exposure of the data necessary for the analysis. We encourage, however, interested parties to perform their own analyses using other sources.
- All numerical rankings should be taken with a grain of salt. We rank by numbers here strictly for the sake of interest. In general, the numerical ranking is substantially less relevant than the language’s tier or grouping. In many cases, one spot on the list is not distinguishable from the next. The separation between language tiers, however, is representative of substantial differences in relative popularity.
- In addition, the further down the rankings one goes, the less data available to rank languages by. Beyond the top 20 to 30 languages, depending on the snapshot, the amount of data to assess is minute, and the actual placement of languages becomes less reliable the further down the list one proceeds.
With that, here is the third quarter plot for 2013.
(embiggen the chart by clicking on it)
Because of the number of languages now included in the survey and because of the nature of the plot, the above can be difficult to process even when rendered full size. Here then is a simple list of the Top 20 Programming Languages as determined by the above analysis.
- Java *
- PHP *
- Python *
- Ruby *
- C# *
- C++ *
- C *
- Objective-C *
- Shell *
- Perl *
- Visual Basic
(* denotes a Tier 1 language)
Elsewhere, other findings of note.
- Outside of Java, nothing in the Top 10 has changed since the Q1 snapshot.
- For the second time in a row, ASP lost ground, declining one spot.
- For the first time in three periods, R gained a spot.
- Visual Basic dropped two spots after rising one.
- Assembly language, interestingly, jumped two spots.
- After breaking into the Top 20 in our last analysis, Groovy jumped up to #18.
- After placing 16th the last two periods, ActionScript dropped out of the Top 20 entirely.
Outside of the Top 20, Clojure held steady at 22 and Go at 28, while D dropped 5 spots and Arduino jumped 4.
In general, then, the takeaways from this look at programming language traction and popularity are consistent with earlier findings. Language fragmentation, as evidenced by the sheer number of languages populating the first two tiers, is fully underway. The inevitable result is greater language diversity within businesses and other institutions, and the need for vendors to adopt multiple-runtime solutions. More specifically, this analysis indicates a best-tool-for-the-job strategy; rather than apply a single language to a wide range of problems, multiple languages are leveraged in an effort to take advantage of specialized capabilities.
For many years after the de facto industry standardization on the MP3 format, the primary problem remained music acquisition. There were exceptions, of course: serious Napster addicts, participants in private online file trading, or even underemployed office workers who used their company LAN to pool their collective music assets. All of these likely had more music than they knew what to do with. But for the most part, the average listener maintained a modestly sized music catalog; modest enough that millions of buyers could fit the entirety of their music on the entry-level first generation iPod, which came with a capacity of 5 GB. Even at more aggressive compression levels – which weren't worth the loss in quality – that's just over a thousand songs.
These days, however, more and more consumers are opting into platforms with theoretically unlimited libraries behind them. From iTunes Radio to Pandora to Play's All Access to Rdio to Spotify, listeners have gone from being limited by the constraints of their individual music collection to having virtually no limits at all. Gone are the days when one needed to purchase a newly released album or, worse, drive to a store to buy it. Instead, more often than not, it's playable right now – legally, even.
The interesting thing about music lovers getting what they always wanted – frictionless online access to music – was that it created an entirely new set of problems. Analysis paralysis, the paradox of choice, call it what you will: it’s become exponentially harder to choose what to listen to.
Which is why those who would continue to sell music are turning to data to do so. Consider iTunes Genius, for example, introduced in 2008. It essentially compares the composition of your music library, and any ratings you might have applied, against the libraries and ratings of every other Genius user. From the dataset created by those combined libraries, it automatically generates a suggested playlist based on a seed track. It can seem like magic, because curating playlists manually can be tedious, but it's really nothing more than an algorithmic scoring problem on the backend. Pandora takes an even more direct route, because it has real-time visibility into both what you're listening to and metadata about that experience: did you rate it thumbs up or down, did you finish listening to it, did you even listen to it at all, are there other similar bands you wish played in the channel? All of this is then fed right back into the algorithms, which do the best they can to pick out music that you, and thousands of other users similar to you, might like.
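To see why this is "nothing more than an algorithmic scoring problem," consider a toy version. The sketch below uses a hypothetical dataset of thumbs-up (1) and thumbs-down (-1) ratings and plain cosine similarity between listeners; the real services' algorithms are proprietary and vastly more sophisticated, so treat this strictly as an illustration of the mechanic:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts (track -> rating)."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical listeners and their thumbs-up/down history.
ratings = {
    "alice": {"track_a": 1, "track_b": 1, "track_c": -1},
    "bob":   {"track_a": 1, "track_b": 1, "track_d": 1},
    "carol": {"track_c": 1, "track_d": -1},
}

def recommend(user):
    """Score tracks the user hasn't heard, weighted by how similar
    each other listener is to the user."""
    scores = {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], theirs)
        for track, rating in theirs.items():
            if track not in ratings[user]:
                scores[track] = scores.get(track, 0.0) + sim * rating
    return max(scores, key=scores.get) if scores else None

print(recommend("alice"))  # → track_d
```

Alice's tastes align with Bob's (who liked track_d) and run opposite to Carol's (who disliked it), so both signals point the scorer at the same recommendation.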
While the approaches of these and other services may differ, what they have in common is simple: a critical mass of listeners who are all voluntarily – whether they know it or not – building an ever larger, and ideally ever smarter, dataset of musical preferences on behalf of the vendor they’re buying from.
This is one of the examples that software companies should be learning from – or more precisely, "non-music" software companies, since just about every important new music company, including the examples above, is a software company first and a music company second. Like the music companies, software companies should increasingly be focused not merely on the asset they wish to sell – software, in most cases – but on the data they might be in a position to collect that can be used to sell that software. Or as a saleable asset in and of itself.
Like music services, most technology platforms – particularly those that are run in a service context – are generating valuable data that can be used to inform customer choices. To date, however, very few platform providers are even thinking about this data in a systematized fashion let alone exposing it back to their customers in meaningful ways. We know this because we ask about it in every briefing.
Those customers that embrace a software plus data approach, therefore, are likely to have a competitive advantage over their peers. And importantly, it’s the rare competitive advantage that becomes a larger barrier to entry – a data moat, if you will – over time.
Two years ago Mikeal Rogers wrote a controversial piece called “Apache considered harmful” that touched a nerve for advocates of open source software foundations. Specifically, the piece argued that the ASF had outlived its usefulness, but in reality the post-GitHub nature of the criticism applied to a wide range of open source foundations.
For many years, open source foundations such as Apache counted project hosting as one of their core reasons for being. But in the majority of cases, the infrastructure supporting this functionality was antiquated, as few of the foundations had embraced modern Distributed Version Control Systems such as Git. The Eclipse Foundation, for example, had a number of projects controlled by CVS, an application whose first release was in 1990. The ASF, meanwhile, was fully committed to its own Subversion project, a centralized VCS that was over a decade old at the time of Rogers' post.
Outside the foundations, meanwhile, the traction of GitHub's implementation of Git had exploded. It had become, almost overnight, the default for new project hosting. And because GitHub was in the business of hosting a version control system, and got paid for it, it was no surprise that the quality of its hosting implementation was substantially better than what open source foundations like Apache or Eclipse could offer.
This preference for GitHub’s implementation led some developers, like Rogers, to question the need for foundations like Apache or Eclipse. In a world where GitHub was where the code lived and the largest population of developers was present, of what use were foundations?
One answer, in my view, was brand. Others included IP management, project governance, legal counsel, event planning, predictable release schedules and so on. But even assuming those services represent genuine value to developers, it would be difficult to adequately offset GitHub’s substantial advantages in interface and critical mass. GitHub makes a developer’s life easier now; intellectual property policies might or might not make their life easier at some point in the future.
As of this morning, however, developers at one foundation no longer need to choose. As the Eclipse Foundation's FAQ covers, the Eclipse Foundation will now permit projects – just new ones, for the time being – to host their primary repository external to the foundation's servers, at GitHub.
The move is not without precedent; the OuterCurve (née CodePlex) Foundation has permitted external hosting for several years. But Eclipse is one of the first large, mature foundations to explicitly fold external properties such as GitHub into its workflow.
This change should benefit everyone involved. Properties like GitHub gain code and developers, foundations can focus on areas where they're likely to add more value than project hosting, and developers get the benefits of a software foundation without having to sacrifice the tooling and community they prefer. For this reason, it seems probable that over time this will become standard practice, particularly as foundations look to stem criticism that they're part of the problem rather than part of the solution. In the short term, however, there are likely to be some bumps in the road as new school populations within the foundations push their old school counterparts for change. Eclipse will in that respect be an interesting case study to watch.
Either way, while Eclipse may be the first large foundation to adapt itself to the post-GitHub environment, it's unlikely to be the last.
Disclosure: The Eclipse and OuterCurve Foundations are RedMonk clients.
While the bulk of the attention at Google I/O last week, at least in terms of keynote airtime, was devoted to improvements to user-facing projects like Android and Chrome, the Cloud team had announcements of their own. Most obviously, the fact that the Google Compute Engine (GCE) had graduated to general availability. Both because it’s Google and because the stakes in the market for cloud services are high, there are many questions being asked concerning Google’s official entrance to the market. To address these, let’s turn to the Q&A.
Q: The first and perhaps most obvious question is: why now? Or more critically, why did it take so long for GCE to reach general availability?
A: The flip answer is to point to how long Gmail was in beta. Google, historically, has had no reluctance to preview their services to a limited audience, a necessary precaution in many cases given their scale. The way one story goes, Google was forced to scramble for mere bandwidth following the release of Google Maps, having substantially underestimated the overwhelming demand for what was, at the time, a revolutionary mapping product. At scale, even simple things become hard. And delivering IaaS services, while a solvable problem, is not simple.
All of that said, Google’s late entrance to this market is also likely to be the product of a strategic misstep. Consider that Google App Engine – the company’s PaaS platform, and one of the first to market – has been available since 2008. It has been abundantly clear in the years since that, while PaaS may yet become a mainstream application deployment model, IaaS is more popular by an order of magnitude or more. Whether it was Google’s belief that PaaS would eventually become the preferred choice over IaaS, or whether Google had questions about their interest or ability to execute effectively in that type of a business, the fact is that they’re seven years late to market.
Q: So is it too late?
A: Surprisingly, the answer is probably not. Google’s delay has certainly created an enormous hill to climb; Amazon has spent the past seven years not only inhaling the market, they’ve actually been able to sustain a remarkable pace of innovation while doing so. Rather than being content with a few core services, Amazon has continued to roll out new capabilities at an accelerating rate. And in a departure from traditional IT supplier practices, they have lowered their prices rather than raised them. Repeatedly.
All of that said, two factors work in favor of Google, as well as other would-be Amazon competitors. First, far more workloads are running outside public clouds today than within them. This means that as impressive as the growth in the cloud sector has been, a great deal of oxygen remains. Second, cloud infrastructure is by design more ephemeral than the physical alternatives that preceded it. It's far more difficult to decommit from thousands of physical machines than from cloud instances. While migrations between public clouds, then, are not without complication or risk, they are more plausible than customers swapping out their on-premises infrastructure wholesale for a competitor's.
So while Google’s delay was costly, it is unlikely to be fatal.
Q: Is Google serious? Or are these cloud services just more Google experiments that will be shut down?
A: It may be natural to ask this question in the wake of the house cleaning Google’s done over the past few years, shuttering a variety of non-core projects. There is no real evidence that this concern is legitimate regarding the Google cloud offerings, however. In App Engine, Google has technically been in market for years, and in that time, they have ramped their involvement up, not down. GAE has expanded its capabilities, multiple datastore options have been launched, GCE has been previewed and then released as a production product.
Google also probably cannot afford to sit this one out. A world in which an increasing number of compute workloads run on infrastructure maintained by competitors like Amazon or Microsoft is a multi-dimensional threat to Google’s business. Besides infusing those businesses with capital that can be used to subsidize efforts to attack Google in areas like mobile, owning customer relationships via cloud sales may allow competitors to cross-sell other services, such as collaboration or even advertising.
For those still not reassured, it’s worth noting that – like Amazon – Google is compelled to maintain large scale infrastructure as part of its core business. While its primary revenue source is obviously advertising, Google is at its core an infrastructure company. Which means that reselling infrastructure is not exactly a major departure from its business model.
Q: So Google’s serious about the cloud market – are they equally serious about the enterprise market?
A: The answer to this depends in part on how you believe cloud is currently being adopted by enterprises. If you're of the belief that enterprise cloud adoption will resemble that of traditional infrastructure, Google does not currently appear to be, quote unquote, serious about the enterprise market. Certainly they are not at present offering the kind of certification program, for example, that Amazon is using to court enterprise buyers. Google's recent standardization on Debian, in fact, could be construed as an active rejection of enterprise requirements; CentOS, at least, would represent an opportunity to market to current Red Hat customers.
What if, however, you believed that cloud adoption was proceeding not from the top down but rather from the bottom up? What if you believed that developers were leading the adoption of cloud services within enterprises? How might you optimize your offering for developer adoption? Well, you might begin by standardizing on developers' preferred distribution. Which would be, according to the research of my colleague Donnie Berkholz, none other than Debian-based distros. You might price competitively with the current developers' choice, Amazon, and go one step further to offer sub-hourly billing. And you'd obviously expose the whole thing via a single JSON API, accessible via a command line tool.
The punchline, of course, is that Google has done all of the above. In a perfect world, you would build cases for both developer and enterprise, as Amazon has done. But playing from behind, Google appears to be betting on the developer rather than pursuing the features that would appeal to traditional enterprise buyers.
If you think developers are playing a deciding role in adoption within the enterprise, then, you can argue that Google is serious about that market. If you believe that CIOs remain firmly in control, then no, Google is not serious about the enterprise.
Q: What was the most significant cloud-related announcement from I/O?
A: The answer depends on timeframe. In the short term, the addition of PHP support on App Engine dramatically expands that platform’s addressable market. Likewise, the more granular pricing will potentially lower costs while allowing developers the ability to experiment.
Over the longer term, the introduction of the non-relational Google Datastore gives GCE an alternative to Amazon’s Dynamo or SimpleDB, as well as the countless other NoSQL databases saturating the market, and a complement to their existing BigQuery and Cloud SQL (MySQL-as-a-Service). Given the massive popularity of non-relational stores, this announcement may be the most significant over the longer term.
Q: How serious a threat is Google to Amazon's cloud? Or Microsoft's, or Rackspace's for that matter?
A: I argued in my 2013 predictions piece that Google would be the most formidable competitor Amazon has yet faced, and nothing that’s occurred since has caused me to rethink that position.
In the short term, neither Google nor anyone else will challenge Amazon, whose dominance of the cloud is substantially understated, in my opinion, by this 451 Group survey indicating a 19% market share. The Register, meanwhile, points to the disparity in available services. Amazon is to the cloud what Windows was to operating systems and what VMware is to virtualization, and it would be difficult to build the case otherwise.
Over the medium to longer term, however, Google has economies of scale, expertise in both software and infrastructure, and existing relationships with large numbers of developers. More specifically:
[Google] has the advantage of having run infrastructure at a massive scale for over a decade: the search vendor is Intel’s fifth largest customer. It also has deep expertise in relevant software arenas: it has run MySQL for years, the company was built upon customized versions of Linux and it is indirectly responsible for the invention of Hadoop (via the MapReduce and GFS papers).
Google’s a fundamentally different type of competitor to AWS, and there are signs that Amazon recognizes this.
Which is what will make the months ahead interesting to watch.
Disclosure: Amazon, Microsoft and VMware are RedMonk clients, Google is not.
“Google announced so many things yesterday that it makes my head spin.” – Fred Wilson
The challenge with a conference like Google I/O, where the announcements arrive one after another, is to see both forest and trees. Analysis of individual announcements – such as Google’s new Pandora/Rdio/Spotify competitor All Access, or the granular pricing for its compute infrastructure – is relatively straightforward. What’s more important, however, is perceiving the larger pattern.
The most obvious feature of Google I/O is the emphasis on the developer. As they have in years past, Google demonstrated their commitment to developers financially, handing out over a thousand dollars of free hardware in the Chromebook Pixel. But the content itself reflected this prioritization. Rather than easing into the keynote with something accessible to non-programmers such as All Access, Google devoted the first forty minutes entirely to API announcements. And then followed that up with the release of a new Android development tool that is already eliciting favorable comparisons to Apple's Xcode. And so on. I/O is, to its credit, remaining true to its roots – it is a developer show first, and everything else second.
So Google gets the importance of developers: this does not exactly qualify as news.
Perhaps less obvious, however, was the strategy implied by the announcements. Many were surprised – in spite of the hints ahead of the show – that Google did not unveil a new piece of hardware, or even an updated version of their Android operating system. There was even disappointment in some quarters; a few of the developers seated behind me were grumbling that there wasn’t even an update to the Nexus 7, as had been speculated ahead of the show. Nor should the disappointment have been a surprise: Apple has created the expectation that developer events must also serve as launch platforms for hardware and software. Developers have been conditioned to expect new hardware, new operating systems and more.
By not even announcing either a new device – unless you count the already available Samsung Galaxy device running stock Android that will be sold in late June – or a new version of the operating system, Google is telegraphing their belief that the basis of competition lies elsewhere.
Some might argue that this is less of a strategic statement than a matter of timing; that Google simply didn’t have either a new operating system or device to present. And there is truth in that. With its Nexus 4 device less than six months old, an X phone announcement was always unlikely. The past four releases of Android, meanwhile, have arrived in either July or November/December. It is hard to make the argument, however, that Google could not have at least previewed upcoming technologies. Technology companies reveal products far ahead of their production readiness all the time.
The statement made by Google yesterday, instead, is that the war for mobile will not be won with devices or operating systems. It will be won instead with services.
Last November, Patrick Gibson argued that Google was getting better at design faster than Apple was getting better at services. While Google's design credibility can be debated, Apple's history in services cannot. While its systemic issues have been damaging enough to require more than one apology from the company, Apple has – in spite of its resources – seemingly made little progress in the services area. Those who have worked with the company point to cultural issues as one factor – the company's secrecy can make it difficult for infrastructure teams to work effectively together – but whatever the reason, Apple has been less successful in services than in virtually any other area of its business, and this at a time when services are becoming ever more important to users.
Now consider what Google announced at I/O yesterday:
- Commerce services (instant buy, wallet objects, send money in Gmail)
- Education services (apps sorted by class / grade level, automated rollout to classes)
- Collaboration services (cross-platform persistent conversations, video chat)
- Game services (save to cloud, multiplayer, etc)
- Map services (activity recognition, geofencing, low power location)
- Market improvements (in-market translation, automated application recommendations)
- Music services (All Access, curated playlists)
- Now services (media recommendations, public transit commute times, reminders)
- Photo services (automated photo triage, generated motion images, automatic improvements)
- Search services (hotword triggering)
And that’s without getting into any of the Google Cloud Platform announcements. Apple has competitive offerings – superior offerings, in some cases – to some of the above. But it’s missing many, and in others, Maps most notably, Apple lags considerably behind. Hence Google’s approach, in which it attempts to apply its strengths in delivering services at scale to Apple’s perceived weakness – delivering services at scale.
Whether Google’s strategy here is successful depends in part on timing and the commoditization of the user interface. Essentially Google needs Android to be, at a minimum, a “good enough” user interface to be considered a reasonable alternative for a large enough subset of the addressable market to make Google’s advantages in services relevant. Two or three years ago, this was not the case. Today, it might be. Apple, meanwhile, needs to increase the distance between Android and iOS enough to give itself time to either build or acquire a competency in services.
Either way, two things are clear: the developers, as ever, are firmly in control, and WWDC in June should be very interesting.
Most of the charts and analysis you see in this space are produced, as a few of you know, via R and, more specifically, RStudio. RStudio is an excellent tool that streamlines the process of working with R, and while it's certainly not necessary for working with the language, I recommend it to those looking for a more comprehensive interface. As much as I appreciate the tool, however, it has never obviated my need for a scratchpad. Like a lot of developers, I often work outside of my chosen development environment, maintaining a separate Sublime Text window to capture snippets of code, notes on what they do and how they do it, and more. And like a lot of developers, I've never really thought about this scratchpad or the process behind it.
I do eventually migrate a subset of these snippets from Sublime – how to get ggplot to generate a stacked chart incorporating negative values, for example – into a Google Doc for more permanent storage than just another open tab. But the process is manual, imperfect and not collaborative at all. All of which may help explain why I think Alex King's new project Capsule is both important and relevant to anyone interested in the craft of software development. As he put it, it's potentially a solution to a problem that I didn't know I had.
Described as the “Developer’s Code Journal,” Capsule is a WordPress based replacement for the scratch document that you maintain to capture everything that doesn’t fit well within your development tool (be that a text editor or IDE) of choice: extended comments, outlines, code snippets and so on. Basically, its function is akin to a diary or journal, but one designed and built to cater specifically to the art and task of coding.
As you’d expect, it incorporates a good code editor (the same as GitHub) with language autocomplete and so on. Even better, you can organize these by project (@ notation) or tag (# notation) for easy retrieval later. And as a web based application rather than a local text doc, it’s possible to build a collective scratchpad equivalent, collaboratively, containing the shared thoughts of development teams.
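The idea behind that @ and # notation is easy to illustrate. The sketch below is my own hypothetical rendering in Python of how such markers might be pulled out of an entry for later retrieval; it is not Capsule's actual implementation, which is WordPress-based:

```python
import re

def parse_note(text):
    """Extract @project and #tag markers from a scratchpad entry,
    in the spirit of Capsule's notation (illustrative only)."""
    projects = re.findall(r"@([\w-]+)", text)
    tags = re.findall(r"#([\w-]+)", text)
    return {"projects": projects, "tags": tags, "body": text}

note = parse_note(
    "Stacked ggplot chart with negative values @rankings #ggplot #r"
)
print(note["projects"], note["tags"])  # → ['rankings'] ['ggplot', 'r']
```

Once entries carry structured project and tag metadata like this, filtering a team's collective scratchpad down to, say, every #ggplot snippet becomes a trivial query rather than a manual search.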
If you’re interested in the craft of software, and I’d hope that most of the people reading this fit that description, this is a project I’d recommend looking at. And given that it’s open source – it’s up on GitHub here – potentially even contributing back to.
The only question I'm trying to decide before implementing it is whether to do so locally, backing up the database to Dropbox regularly as Alex does, or remotely, which removes a few complications but means losing access to the data when offline.
Either way, it’s a tool that I expect to be using soon.
Disclosure: Alex is a friend of mine.
In the wake of last week’s well attended OpenStack Summit, there has been much discussion of the state of the project. As is typical, this ranges from heated criticism of the project’s community, governance or technology to grandiose claims regarding its trajectory and marketplace traction. And as is typical, the truth lies somewhere in between.
Critics of the project suggesting that it has real organizational issues and engineering shortcomings to address are correct. As are proponents arguing that the project’s momentum is accelerating, both via additions to its community and by the lack thereof from competitive projects and products. The former is, in all probability, the more important of the two developments. Engineering quality is important, but as we tell all of our clients, it has become overvalued in many technology industry contexts. With the right resources, quality of implementation is – usually – a solvable problem. The lack of a community, and attendant interest, is much less tractable. More often than not, the largest community wins.
In the case of OpenStack, however, this can be considered a positive for the project only as long as there is one OpenStack community. It is unclear that this will remain the case moving forward.
Historically, some of the most important and highest profile platform technologies – Linux being the most obvious example – have been reciprocally licensed. In practical terms, this requires vendors distributing the codebase to make any modifications to it available under precisely the same terms as the original code. OpenStack, like Cloud Foundry, Hadoop and other younger projects, is permissively licensed. Unlike reciprocally licensed assets, then, distributors of OpenStack technologies are not required to make any of their bugfixes, feature upgrades or otherwise available under the same terms, or indeed available at all.
Though not required by the license, the overwhelming majority of code is contributed back to the project, because there is little commercial incentive to heavily differentiate from OpenStack. There are, however, commercial incentives to differentiate in certain areas. Which could, over the longer term, lead to fragmentation within the OpenStack community.
To combat this, the OpenStack Foundation and its Board of Directors must make two difficult decisions regarding compatibility.
First, it needs to answer a currently existential question regarding OpenStack: specifically, what is it, exactly? What constitutes an OpenStack instance? One interpretation is that an OpenStack instance is one that has implemented Nova and Swift, the compute and object storage components within OpenStack. What of vendors or customers who have found Swift wanting, and turned to Ceph or RiakCS, then, as an alternative? Are they not OpenStack? Further, how might the definition of what constitutes an OpenStack project evolve over time? Over what timeframe, for example, might customers have to implement Quantum (networking), Keystone (identity), Heat (orchestration) to be considered ‘OpenStack?’
Answering this question will involve difficult decisions for the OpenStack project, because opinions on the answer are likely to vary depending on the nature of existing implementations and the larger strategies they reflect. Because much of OpenStack’s value to customers – and the marketing that underpins it – lies in its avoidance of lock-in, however, answering this question is essential. A customer that cannot move with relative ease from one OpenStack cloud to another because the underlying storage substrates differ is, open source or no, effectively locked in.
The OpenStack Foundation could decline to take an aggressive position on this question, leaving it to the market to determine a solution. This would be a mistake, because as we've seen previously in questions of compatibility (e.g. Java), trademark is the most effective weapon for keeping vendors in line. OpenStack implementations that are denied the right to call themselves OpenStack as a result of a breach of interoperability guidelines are effectively dead products, and vendors know it. Given that the Foundation controls the trademark guidelines, then, it is the only institution with the power to address the question of what is OpenStack and what is not.
Assuming that the question of what foundational components are required versus optional in an OpenStack implementation can be answered to the market’s satisfaction, the second cause for concern lies in compatibility between the differing implementations of those foundational components. The nature of implementations, for instance, may introduce unintended, accidental incompatibilities. Consider that shipped distributions are likely to be based on older versions of the components than those hosted, which are frequently within a week or two of trunk. How then can a customer seeking to migrate workloads to and from public and private infrastructure be sure that they will run seamlessly in each environment?
This type of interoperability is by definition more complex, but it is not without historical precedent. As discussed previously in the context of Cloud Foundry, one approach the Foundation may wish to consider is Sun’s TCK (Technology Compatibility Kit) – should a given vendor’s implementation fail to pass a standard set of test harnesses, it would be denied the right to use the trademark. Indeed, this seems to be the direction that Cloud Foundry itself is following in an attempt to forestall questions of implementation compatibility.
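The mechanic of a TCK-style kit can be shown in miniature. Everything in the sketch below is hypothetical (the check names and the toy implementation are mine, and none of this reflects OpenStack's or Sun's actual APIs), but it captures the principle: pass the full standard harness or lose the right to the name:

```python
# Hypothetical TCK-style compatibility harness (illustrative only).

class ToyInstance:
    status = "ACTIVE"

class ToyCloud:
    """A minimal 'compliant' implementation used for illustration."""
    def __init__(self):
        self.store = {}

    def put(self, container, key, value):
        self.store[(container, key)] = value

    def get(self, container, key):
        return self.store.get((container, key))

    def boot(self, image):
        return ToyInstance()

def check_object_storage(impl):
    # A stored object must be retrievable, byte for byte.
    impl.put("c1", "k1", b"payload")
    return impl.get("c1", "k1") == b"payload"

def check_compute(impl):
    # A booted instance must reach the ACTIVE state.
    return impl.boot(image="standard-image").status == "ACTIVE"

HARNESS = [check_object_storage, check_compute]

def may_use_trademark(impl):
    """Every check must pass for the implementation to carry the name."""
    return all(check(impl) for check in HARNESS)

print(may_use_trademark(ToyCloud()))  # → True
```

The leverage, as with Sun's TCK, comes less from the tests themselves than from tying a pass to the trademark: a vendor can differentiate all it likes internally, but it cannot market an incompatible product under the shared name.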
Ultimately, the pride on display at the OpenStack Summit last week was well justified. The project has come a long way since its founding, when several of those who are now members declined to participate after examining the underlying technologies. But its future, as with any open source project, depends heavily on its community, which in turn is dependent on the Foundation keeping that community from fragmenting. The good news for OpenStack advocates is that there are indications the board understands the importance of these questions, and is working to address them. How effective they are at doing so is likely to be the single most important factor in determining the project’s future.
Disclosure: Multiple vendors involved in the OpenStack project, including Cisco, Cloudscaling, Dell, HP, IBM and Red Hat, are RedMonk customers. VMware, which is both a participant in the OpenStack community and a competitor to it, is a customer.
While this trend is easily observed in a variety of contexts, one question that hasn’t been asked to date is how the academic world is adapting to, coping with or driving this change. What role have colleges and universities played in the proliferation of alternative languages?
To answer this, our own Marcia Chappell researched the published computer science curriculums at the Forbes Top 10 Colleges and Universities. We picked Forbes as opposed to alternatives like US News and World Report because it did not differentiate institutions by size, theoretically providing us with a broader sample.
Unfortunately, the research proved in many cases fruitless. Of the 519 courses with descriptions published online, we were able to collect a mere 93 mentions of particular programming languages or runtimes. Many courses do not include specifics regarding languages taught, either because the choice may vary by instructor or because the language is viewed as less important than the course material. As a result, it’s impossible to draw any statistically significant conclusions from the data, because sampling it properly is impractical or impossible given the nature of the online course descriptions.
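The mechanics of this kind of tally are straightforward; a sketch follows. The language list and course descriptions below are hypothetical placeholders, not our actual survey data.

```python
import re
from collections import Counter

# Hypothetical list of languages to look for; the real survey's list differed.
LANGUAGES = ["Java", "Python", "C++", "C#", "MATLAB", "R", "Objective-C"]

def tally_mentions(descriptions):
    """Count how many course descriptions mention each language at least once."""
    counts = Counter()
    for text in descriptions:
        for lang in LANGUAGES:
            # Word-boundary-style match so "R" doesn't match inside other words,
            # while still allowing names containing '+' and '#'.
            pattern = r"(?<![\w+#])" + re.escape(lang) + r"(?![\w+#])"
            if re.search(pattern, text):
                counts[lang] += 1
    return counts

# Illustrative course descriptions, not drawn from any real curriculum.
courses = [
    "Introduction to Programming using Java",
    "Numerical methods; assignments in MATLAB",
    "Statistical computing with R and Python",
]
```

Each course contributes at most one mention per language, which matches how we counted: a description that mentions Java five times is still one Java course.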
Of those courses that did publish information about the technologies taught, however, here is the distribution.
With the above caveat that this chart cannot be considered actually representative of the curriculum of the institutions in question, nor those institutions representative of the wider academic community, it does prompt questions about the content of today’s academic computer science and related coursework.
- The relative infrequency with which R was mentioned generally – and particularly relative to MATLAB – was somewhat surprising. In my experience, more academic statisticians today seem to favor R over MATLAB, which this data contradicts.
- With the exception of Java (Android), mobile appears to be substantially under-represented within this slice of academia. C# (Xamarin) and Objective-C (iOS) reflect a minimal presence. Clearly some programs are catering to increased demand for mobile skills (see the Harvard Extension School’s CompSci E-76) but these efforts appear to be, at this point, rare.
The most interesting question raised by this examination, of course, is not necessarily what role academia plays in language adoption, but rather what role it should play. Is the purpose of college coursework to provide students with the equivalent of a classical education in Greek and Latin, a deep understanding of the foundation of the discipline? Or should universities be more responsive to industry trends and demands from the job market, particularly in a context in which their graduates are facing a highly challenging hiring climate?
Ideally, students might have a choice. Those seeking long term careers as computer scientists and software engineers would have the depth of coursework necessary to ground them moving forward, while those seeking the most marketable, in-demand skills might pick from courses satisfying that shallower need. In practical terms, however, universities – at least those surveyed here – seem to be far more focused on the former than the latter. The question then becomes whether that approach best serves their students.
In December of 2004, Adam Bosworth wrote a seminal essay entitled “Where have all the good databases gone.” Anticipating by years the rise of the MapReduce/NoSQL movements, it succinctly identified the central problem: “The products that the database vendors were building had less and less to do with what the customers wanted.”
In the years since, it has become clear that databases are not the only software area challenged in this respect. For the first time since the rise of Microsoft, Oracle and other large software players, businesses of all shapes and sizes are beginning to turn not to vendors for technical solutions, but to their own staff, or the software products of other businesses released as open source software. The software industry, in other words, is in the process of being disrupted by its would-be customers: we’re seeing the return of roll your own.
None of which is particularly surprising. As the constraints of available software and hardware are removed from developers by open source and the public cloud, respectively, their collective output goes up. With a growing subset of this output released as open source software by entities such as Facebook, LinkedIn or Twitter who see software as a means to an end rather than an end in and of itself, it begins to create a virtuous cycle: more high quality freely available software means less reinvention of the wheel, thus begetting more open source software.
What is surprising, however, is the degree to which hardware vendors are proving to be vulnerable to the same trend. This is counterintuitive given the perceived difference between authoring code and manufacturing hardware. Aside from the difference in upfront capital required, there is the fact that manufacturing at any reasonable scale has been considered a complicated exercise, one that ideally would be outsourced to specialty vendors (e.g. Dell, HP, IBM).
One of the industry’s worst kept secrets, however, has been that Google’s servers are not products of these vendors but rather machines it designed itself. While acknowledging this fact, most hardware manufacturers have dismissed its importance as either a quirk of Google’s culture or a problem unique to the only business in the world operating at that scale. In other words, Google’s roll-your-own was the exception, not the rule.
And for the most part, industry analysts tracking the hardware market implicitly validated these claims, because their numbers reflected minimal competition for traditional hardware suppliers. But as discussed in a 2011 Wired article by Bob McMillan, it was clear that the picture painted by the analyst reports was at best incomplete, because it was significantly under-reporting traction from Original Design Manufacturers (ODMs). As Andy Bechtolsheim (who co-founded Sun in 1982) put it,
“It’s hard to get those numbers because they’re not reported to IDC or any of the market firms that count servers.”
If the ODM numbers were inaccurate, then, what did the server market look like in reality? One hint arrived a year later, when Intel indirectly crowned Google as the world’s fifth largest server manufacturer. By itself, this was interesting, but could still be dismissed as an isolated trend.
Recent events, however, call that into question. Rackspace, an environment that has historically been 60% Dell and 40% HP, will be going, as of April, according to CTO John Engates:
“Basically back to our own designs because it really doesn’t make a lot of sense to put cloud customers on enterprise gear. Clouds are different animals – they are architected and built differently, customers have different expectations, and the competition is doing different things.”
While it’s true, then, that the initial customers for roll-your-own gear remain large entities like Facebook, Google or Rackspace, it is likely just a matter of time until the ODM manufacturers of the customized gear such as Quanta begin to target enterprises directly. And as for the much larger market of businesses that lack the need or wherewithal to build and design their own servers, an increasing number of them will be running on the custom ODM manufactured gear anyway in the form of public clouds.
The impacts of this shift, once so easily dismissed, are evident already. Dell is attempting to go private to retool its business away from the harsh scrutiny of public markets. VMware is attempting to enlist – among others – service providers built on Dell, HP and other major label hardware in a war on Amazon. The percentages of revenue and profit that IBM derives from hardware have both declined over the last five years. And just yesterday, Oracle announced that its hardware revenues were down by 23 percent and that they aren’t expected to grow this year.
The simple fact is that most hardware manufacturers – like Bosworth’s database vendors – have not responded to what customers have indicated they want, in most cases because it is at cross purposes with their margins. While they furiously add new features to hardware in an effort to justify their premium pricing, the market has accelerated its consumption of lower cost, more available alternatives in cloud or ODM gear in spite of their respective limitations. Hardware vendors are guilty, in other words, of overestimating the value of innovation at the expense of convenience. Worse, Jevons Paradox tells us that increased technical efficiencies – such as the public cloud’s instant provisioning – will lead to increases in consumption for the parties that provide it. Parties like Amazon.
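The logic of Jevons Paradox can be made concrete with a toy constant-elasticity demand model. The numbers and the functional form below are purely illustrative assumptions, not market data: if efficiency gains halve the effective price of compute and demand is elastic, total spend on it actually rises.

```python
def total_spend(price, elasticity, base_price=1.0, base_quantity=1.0):
    """Constant-elasticity demand: quantity scales as (price/base_price)**(-elasticity).

    Returns total spend (price * quantity), normalized so that spend is 1.0
    at the base price. Illustrative model only.
    """
    quantity = base_quantity * (price / base_price) ** (-elasticity)
    return price * quantity

# Efficiency halves the effective price of compute (price 1.0 -> 0.5).
# With elastic demand (elasticity > 1), consumption grows fast enough
# that total spend rises; with inelastic demand (< 1), it falls.
elastic = total_spend(0.5, elasticity=1.5)    # > 1.0
inelastic = total_spend(0.5, elasticity=0.5)  # < 1.0
```

The point of the model is the crossover at an elasticity of 1: the paradox bites precisely when cheaper, more convenient provisioning unlocks enough new demand, which is the bet providers like Amazon are making.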
Enterprise hardware vendors, like their software brethren, have an important question to answer: how to offer customers what they want without jeopardizing their business in the process. The market makes clear what the answer is not, at least on a volume basis: packing more features into ever higher priced hardware. The likely candidates for revenue growth moving forward instead will incorporate a combination of sources rather than relying on the hardware alone – sources such as data or network-enabled services.
Given that hardware vendors tend to be maniacally focused on the hardware, however, it remains to be seen whether they will be able to pivot culturally, embracing alternative revenue models. All the more so because the evidence is mounting that they won’t have much time to do so.
Disclosure: Amazon, Dell, and IBM are RedMonk customers. Facebook, Google, and Twitter are not.
Possibly because the bad ones can kill you, bacteria get a bad rap. Those Purell stations you see at conferences? They’re barely competent as a viricide, but excel at destroying bacteria. And while the CDC says they’re not necessary, anti-bacterial soaps remain all the rage these days.
We’ve been conditioned to consider bacteria as the enemy by way of related horror stories. The toxin produced by Clostridium botulinum, for example, the bacterium that allows celebrities to give their faces a cardboard-like appearance, is incredibly toxic. A single gram of it, in fact, is enough to kill 14,000 people. Escherichia coli, a normally helpful occupant of our digestive tract, has a variant that can cause hemorrhagic diarrhea, kidney failure or even death.
From these stories and others we’ve acquired an instinctive mistrust of bacteria, a faith that they are intrinsically harmful. The facts, however, say otherwise.
We know that bacteria are critical to digestion; they have the ability to process compounds the human body cannot. They are also able to produce vitamins for the body, including B12, folic acid, and vitamin K. Preliminary research, as Wikipedia terms it, suggests that bacteria may be of use in the treatment or alleviation of conditions like IBS, lactose intolerance and even colon cancer. In explaining why cavemen had better teeth than modern fluoridated citizens, meanwhile, researchers point to a decrease in bacterial diversity within the mouth.
However beneficial they might be, then, as a species we apparently consider it our mandate to wipe out bacteria wherever we find them – in spite of the fact that there is next to no chance we’ll ever successfully eradicate them, or that we would be better off in their absence even if we could.
All of which might sound a great deal like “Shadow IT” if you’re in the technology industry. If you’re unfamiliar with the term, you’re not likely to be unfamiliar with the concept: it refers to individuals or teams within organizations operating independently of IT in areas that have typically fallen under the latter’s jurisdiction. Within most enterprises, Shadow IT is regarded as an almost existential threat, a specter of disorder and chaos. Predictably, the organizational response to this perceived threat is to eliminate it – regardless of the cost.
Like most patterns in the technology industry, Shadow IT is not new. The first PCs made their way into the enterprise unofficially, like the minicomputers before them. Early Lotus sales people, for their part, were explicitly instructed to avoid IT staff and instead make their case directly to the line of business. Shadow IT exists because IT staffs are vulnerable to exactly the same disruptive forces as the vendors they buy from. Each successive generation of IT adoption leads to a calcification of infrastructure, skills and mindset. As a result, each new transition – from mainframe to minicomputer, minicomputer to PC, PC to mobile/cloud/etc – leads to a repeat of the cycle, in which new, potentially disruptive technologies are attacked only to become the incumbent over time.
In other words: all of this has happened before, and all of this will happen again.
The only real difference today is that disruption is occurring at an accelerated rate. Consider that open source really began to take effect in the early part of the last decade, that AWS launched in 2006 and the iPhone one year later. Things are moving quickly, and IT organizations – already ill equipped through no fault of their own to adapt to systemic change – have to adapt to more, faster. The unsurprising result is that IT organizations have been increasingly overwhelmed – and frequently outmatched. To the point that the term IT itself is, in some contexts, a pejorative.
It is not technically true to say that Shadow IT is strictly a reaction to IT dysfunction, but certainly the high latency of application development or the glacial pace of server provisioning, to pick two examples, have contributed to its growth. When lines of business request an application from IT and are told it will take months to develop, they now turn to a Shadow IT armed with tools like dynamic programming languages and PaaS to compress those delivery cycles. When CMOs need hardware to host the applications they’ve built, and are told delivery times will be measured in weeks, workloads inevitably shift to public clouds. And so on.
One perspective on these developments is that Shadow IT represents a dangerous, destabilizing force within an organization, sure to run afoul of compliance regulations or compromise internal security. Which, it should be acknowledged, are non-theoretical possibilities. But organizations should also consider the bigger picture: Shadow IT is simply trying to get things done more efficiently. They are the good guys, the beneficial bacteria.
While Shadow IT resources will doubtless bristle at the comparison with bacteria, the two have a lot in common. Both fill an essential role: they allow the larger organization to operate more efficiently, and to accomplish things it would otherwise be unable to. And in both cases, organizations are desperate to eradicate them using any and all means at their disposal – irrespective of the prospective benefits.
Before turning to the antibiotics, however, organizations would do well to examine questions of alignment. It is possible, even probable, that they’ll find that Shadow IT is not only improving the overall health of the organization, but doing so better than the factions trying to stamp it out. Either way, Shadow IT resources can take comfort in one other thing they have in common with bacteria: whatever the response, both are going to be with us indefinitely.