Sponsored by
Connexin Software
ABSTRACT: Hadoop and Mahout
In recent years, cheap commodity storage hardware and the cloud have enabled companies to collect and accumulate vast amounts of information. The need to understand and make use of this information has given birth to the concept of Big Data and created demand for distributed processing frameworks, among which Hadoop is a current leader. While Hadoop provides a platform for Big Data processing, in and of itself it does not expose facilities for pattern and knowledge discovery in data. Adding knowledge and pattern discovery capabilities to Hadoop is the goal of the Apache Mahout project, which exposes a set of cutting-edge Machine Learning and Artificial Intelligence libraries designed for the Hadoop platform.
In this talk, I will present a use case for Machine Learning in the Big Data context and walk through an example of addressing it with Mahout and Hadoop. The walk-through will assume a basic understanding of Hadoop installation and configuration and will focus on the challenges of using Mahout, which, while providing an excellent collection of machine learning algorithms, is not trivial to set up and configure.
SPEAKER BIO: Anton Slutsky
Anton Slutsky is an experienced information technology professional with over a decade of experience in the field. He has a Master's degree in Computer Science from Villanova University and is currently working on his PhD at Drexel University, with published research in the areas of Artificial Intelligence, Machine Learning, and Data Mining. Prior to his current position as Chief Science Officer at Zelant Software, Inc., Anton led engineering efforts at Oracle and BEA Systems. Currently, Anton is involved in promoting the concept of embedded analytics, an approach meant to operationalize Big Data using cutting-edge machine learning and data mining techniques and research findings.
MEETING SLIDES
hadoop+mahout (PDF)
Hadoop + Mahout code (zip file)