Saturday, November 2, 2013

Hadoop Hands-On exercise with Hortonworks: HiveQL basics

This is the second post in the Hadoop hands-on series. By the end of the last post we had:
  1. A Hadoop instance running
  2. Data uploaded to HDFS and HCatalog
In this post we query that data. HiveQL is a query language similar to SQL, and we will use it to query the data stored in the Hadoop cluster.

Go to Hue > Beeswax (the user interface for HiveQL).

You should now see the query editor. Enter one query at a time from the list of queries below and view the results. Refer to the following screenshots.

As shown above, enter the query and click on "Execute".

Select all records
Query:  select * from nyse_stocks
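
On a large table, selecting every record can take a while and return more rows than you want to scroll through. A minimal sketch of a preview query, assuming you only want the first few rows (the row count of 10 is an arbitrary choice, not something from the exercise):

```sql
-- Preview the table without pulling back every record.
-- LIMIT caps the number of rows returned; 10 is an arbitrary example value.
SELECT * FROM nyse_stocks LIMIT 10;
```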

Save the results
The "Save" option in the screenshot above saves the results of the corresponding query. There are two choices: save the results as a new table, or as a file in an HDFS directory, as shown below.

Alternatively, use the "Download as" option to get the results as a comma-separated (CSV) or Excel (XLS) file.
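
The two "Save" choices also have rough equivalents in HiveQL itself, which can be handy if you want to keep the step in a script. The sketch below is not necessarily what the Hue dialog runs behind the scenes. The table name nyse_stocks_ibm and the path /tmp/nyse_stocks_ibm are hypothetical, picked only for illustration.

```sql
-- Option 1: save the query results as a new table.
-- nyse_stocks_ibm is a hypothetical table name.
CREATE TABLE nyse_stocks_ibm AS
SELECT * FROM nyse_stocks WHERE stock_symbol = 'IBM';

-- Option 2: write the query results to files in an HDFS directory.
-- /tmp/nyse_stocks_ibm is a hypothetical path. OVERWRITE replaces
-- whatever is already in that directory, so point it at a scratch location.
INSERT OVERWRITE DIRECTORY '/tmp/nyse_stocks_ibm'
SELECT * FROM nyse_stocks WHERE stock_symbol = 'IBM';
```

Run the two statements one at a time, the same way as the other queries in this post.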

Describe the database table
Query:  describe nyse_stocks

Get number of records
Query:  select count(*) from nyse_stocks
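
A natural next step after the total count is a count per stock symbol. A small sketch using only the stock_symbol column that appears in this post (the alias record_count is just a label chosen here):

```sql
-- Count the records for each stock symbol.
-- record_count is an alias chosen for readability.
SELECT stock_symbol, COUNT(*) AS record_count
FROM nyse_stocks
GROUP BY stock_symbol;
```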

Display records with specific condition
Query:  select * from nyse_stocks where stock_symbol="IBM"
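
The same kind of condition extends to several symbols at once. A sketch, assuming the table also contains a symbol such as 'GE' (that value is an assumption; substitute any symbols present in your data):

```sql
-- Match any of several stock symbols.
-- 'GE' is an assumed example value and may not exist in your data set.
SELECT * FROM nyse_stocks
WHERE stock_symbol IN ('IBM', 'GE');
```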
