Data Wrangling Handbook
by Open Knowledge Foundation
Publisher: School of Data, 2012
The Data Wrangling Handbook is a companion text to the School of Data. It works much like a traditional textbook: it provides the detail and background theory to support the School of Data courses and challenges.
by Eric Redmond - GitBook
This is a free little book about Riak, an open-source, distributed key/value NoSQL datastore built for high availability and near-linear scalability. Riak has remarkably high uptime and grows with you.
by Alan F Gates - O'Reilly Media
Apache Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs. The structure of Pig programs is amenable to parallelization, which enables them to handle very large data sets.
by Karl Seguin - openmymind.net
MongoDB is a document-oriented database -- it should be viewed as an alternative to relational databases. This book covers a number of topics with a focus on the fundamentals you will need to get comfortably up and running.
MySQL is a free, widely used SQL engine. It can be used as a fast database as well as a rock-solid DBMS using a modular engine architecture. The purpose of this wikibook is to provide practical knowledge of using the database ...