Diffstat (limited to 'published/amazonmx-apache.txt')
-rw-r--r-- | published/amazonmx-apache.txt | 29 |
1 files changed, 29 insertions, 0 deletions
diff --git a/published/amazonmx-apache.txt b/published/amazonmx-apache.txt
new file mode 100644
index 0000000..40615cb
--- /dev/null
+++ b/published/amazonmx-apache.txt
@@ -0,0 +1,29 @@
+The MXNet Machine Learning project was recently accepted to the Apache Software Foundation's incubator for open source projects. What's surprising about the announcement isn't that the ASF is accepting yet another machine learning tool -- it's hard to turn around in the software world these days without tripping over a couple of ML tools -- it's that MXNet's developers, most of whom come from Amazon, still think the ASF is relevant.
+
+MXNet is an open-source "deep learning" framework that allows you to define, train, and deploy so-called neural networks on a wide array of devices. It also happens to be the machine learning tool of choice at Amazon.com and is available today via ready-to-deploy EC2 instances.
+
+Deep learning is a currently popular subset of machine learning that focuses on hierarchical algorithms with non-linearities, which help find patterns and learn representations within data sets. That's a fancy way of saying it learns as it finds. Deep learning tools owe their popularity to their success in applications like speech recognition, natural language understanding, and recommendation systems (think Siri, Alexa, et al.). Every time you sit on your couch yelling at Alexa, you're using a deep learning system.
+
+What makes MXNet interesting at this stage is that Amazon claims it's the most scalable tool the company has, and Amazon is a company that knows a thing or two about what scales and what doesn't.
+
+MXNet is far from the only kid on the deep learning block. In fact, it's a bit late to the game. Other popular tools in the deep learning world include Torch -- used at Facebook, Google and NYU -- and Microsoft's "Adam", but perhaps the biggest direct competitor is Google's TensorFlow. TensorFlow is open source, and it uses the Apache License as well.
+
+If you're new to the open source world -- and machine learning tools and developers often are -- you'd be forgiven for having no real idea what the Apache Software Foundation is. Even if you're very familiar with the ASF, you might still wonder why a multibillion-dollar company like Amazon would be so excited to have its pet project adopted by an all-volunteer group that somehow manages to run the ASF on barely $500k a year.
+
+In a word, community.
+
+The purpose of the ASF incubator is to help external projects improve the quality of their code and participate in the larger community. It is, in other words, a kind of seal of approval: a sign that an open source project is truly open source and uses the ASF's voting procedures and the rest of the quasi-democratic governance system the ASF has developed, known among the anointed as "The Apache Way".
+
+Given a choice between that sort of community and the TensorFlow community, which, while open source, is very heavily managed by Google, MXNet starts to look more appealing. And the more appeal it has, the more developers get involved and the better the code gets. If you want to think of it in terms of machine learning, the ASF is a learning network for developers.
+
+It's worth noting that not every project that enters the ASF incubator manages to escape its parents. Officially, though, projects don't get to move past the incubation stage until they demonstrate independence from any one contributor or sponsoring entity.
+
+Incubation is the first step for a project that wants to become an official ASF project. It is, in short, no guarantee that a project will either succeed or end up under the auspices of the ASF. Among the ASF's successes are SpamAssassin, an incubator graduate, and of course the Apache web server, which despite being bested by half a dozen newer, lighter-weight, faster web servers, somehow still manages to power about half of the web. Then there's OpenOffice, another incubator graduate, but one that has largely been eclipsed by LibreOffice.
+
+Now Amazon is hoping that MXNet can learn a few tricks from the ASF and maybe build a community that can help it catch up to competitors.
+
+As Amazon's Dr. Matt Wood writes on the AWS blog, the reason the project wants to be part of the Apache Incubator is to "take advantage of the Apache Software Foundation’s process, stewardship, outreach, and community events". In short, it wants to use the ASF's clout to attract more developers.
+
+It's tempting to see Amazon's move as entirely self-serving, and indeed it is, but that's just the beginning of the story. The ASF may not be the household name it once was, but it still has considerable clout, and its governance and so-called "Apache Way" really do turn out some impressive, well-developed community projects. With that behind MXNet, its odds of besting TensorFlow and others do go up considerably.
+
+And of course the ASF gets what's probably its best ML project to date. MXNet is certainly one of the easiest to deploy, given that there's already an AWS Deep Learning AMI available, complete with MXNet, and plenty of example code pre-compiled and ready to use. That the server instance you just spun up happens to be closely tied into other AWS services, which you might want to invest in as well, is just coincidence, I'm sure.