Junvn / Apache-Hive-Essentials-Second-Edition

Apache Hive Essentials, Second Edition published by Packt

Apache Hive Essentials - Second Edition

This is the code repository for Apache Hive Essentials - Second Edition, published by Packt.

Essential techniques to help you process, and get unique insights from, big data

What is this book about?

In this book, we prepare you for your journey into big data by first introducing the background of the big data domain, along with the process of setting up and getting familiar with your Hive working environment.

This book covers the following exciting features:

  • Create and set up the Hive environment
  • Discover how to use Hive's definition language to describe data
  • Discover interesting data by joining and filtering datasets in Hive
  • Transform data by using Hive sorting, ordering, and functions
  • Aggregate and sample data in different ways

If you feel this book is for you, get your copy today!

https://www.packtpub.com/

Instructions and Navigations

All of the code is organized into folders. For example, Chapter02.

The code will look like the following:

export HADOOP_HOME=/opt/hadoop
export HADOOP_CONF_DIR=/opt/hadoop/conf
export HIVE_HOME=/opt/hive
export HIVE_CONF_DIR=/opt/hive/conf
export PATH=$PATH:$HIVE_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
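
After exporting these variables, it can help to confirm that each one is set and points to a directory that exists. The following is a minimal sketch, not taken from the book: the helper name `check_dir_var` is hypothetical, and the variable names are simply the ones used in the export block above. It assumes a POSIX-compatible shell.

```shell
# Hypothetical helper (not from the book): report whether the variable
# named in $1 is set and points to an existing directory.
check_dir_var() {
  var_name="$1"
  # Indirect lookup of the variable's value; empty if it is unset.
  eval "var_value=\${$var_name:-}"
  if [ -z "$var_value" ]; then
    echo "$var_name is not set"
    return 1
  elif [ ! -d "$var_value" ]; then
    echo "$var_name points to a missing directory: $var_value"
    return 1
  fi
  echo "$var_name OK: $var_value"
}

# Check the four directory variables from the export block.
problems=0
for name in HADOOP_HOME HADOOP_CONF_DIR HIVE_HOME HIVE_CONF_DIR; do
  check_dir_var "$name" || problems=$((problems + 1))
done
echo "$problems variable(s) need attention"

# command -v prints the resolved path when hive is reachable through PATH.
command -v hive || echo "hive not found on PATH"
```

Running the snippet in the same shell session where the exports were made prints one status line per variable, so a typo in a path shows up before Hive is started.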

This book is for you if you are a data analyst, developer, or simply someone who wants to get started quickly with Hive to explore and analyze big data in Hadoop. Since Hive's query language is SQL-like, some previous experience with SQL will help you get the most out of this book.

The software and hardware listed below are enough to run all of the code files in the book (Chapters 1-10).

Software and Hardware List

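| Chapter             | Software required | OS required                        |
|---------------------|-------------------|------------------------------------|
| 2, 3, 4, 5, 6, 7, 8 | NA                | Windows, Mac OS X, and Linux (Any) |
| 8                   | Eclipse           | Windows, Mac OS X, and Linux (Any) |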

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.

Get to Know the Author

Dayong Du is a big data practitioner, author, and coach with over 10 years' experience in technology consulting, designing, and implementing enterprise big data architecture and analytics in various industries, including finance, media, travel, and telecoms. He has a master's degree in computer science from Dalhousie University and is a Cloudera certified Hadoop developer. He is a cofounder of the Toronto Big Data Professional Association and the founder of the DataFiber website.

Suggestions and Feedback

Click here if you have any feedback or suggestions.

License: MIT License

Languages: HiveQL 79.9%, Java 19.6%, Python 0.5%