Data-Intensive Text Processing with MapReduce
by Jimmy Lin, Chris Dyer
Publisher: Morgan & Claypool Publishers 2010
Number of pages: 175
This book focuses on MapReduce algorithm design, with an emphasis on text processing algorithms common in natural language processing, information retrieval, and machine learning. We introduce the notion of MapReduce design patterns, which represent general reusable solutions to commonly occurring problems across a variety of problem domains. The book aims not only to help the reader 'think in MapReduce', but also to discuss the limitations of the programming model.
by J. M. Hellerstein, M. Stonebraker - UC Berkeley
These lecture notes provide students and professionals with a grounding in database research and a technical context for understanding recent innovations in the field. The readings included treat the most important issues in the database area.
by Ronald Bourret
This paper gives a high-level overview of how to use XML with databases. It describes how the differences between data-centric and document-centric documents affect their usage with databases and how XML is commonly used with relational databases.
by Anand Rajaraman, Jeffrey D. Ullman - Stanford University
At the highest level of description, this book is about data mining. However, it focuses on data mining of very large amounts of data. Because of the emphasis on size, many of our examples are about the Web or data derived from the Web.
by Julio Ponce, Adem Karahoca - InTech
This book presents theoretical and practical advances in data mining, along with applications in several promising areas. It is intended as a comprehensive data mining reference for students, researchers, and practitioners.