by Alan F Gates
Publisher: O'Reilly Media 2011
Number of pages: 344
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.
by Matthew North - Global Text Project
In this book, Professor Matt North uses simple examples, clear explanations and free, powerful, easy-to-use software to teach you the basics of data mining: techniques that can help you answer some of your toughest business questions.
MySQL is a free, widely used SQL engine. It can be used as a fast database as well as a rock-solid DBMS using a modular engine architecture. The purpose of this wikibook is to provide practical knowledge of using the database ...
by Karl Seguin - openmymind.net
Redis represents a simplification in the way we deal with data. It peels away much of the complexity and abstraction available in other systems. The goal of this book is to build the foundation you'll need to master Redis.
by Open Knowledge Foundation - School of Data
The Data Wrangling Handbook is a companion text to the School of Data. Its function is something like a traditional textbook -- it will provide the detail and background theory to support the School of Data courses and challenges.