Scenario/Feature
|
HiveQL
|
SQL
|
Remarks
|
Default Join
|
"equi" join
|
Inner join
|
"equi" join - the only
entries that are returned are the ones where the condition is true and
returns no null values
|
Join syntax
|
LEFT OUTER JOIN
RIGHT OUTER JOIN
|
LEFT JOIN
RIGHT JOIN
|
|
Largest table last
|
Hive attempts to perform a
map-side join where it loads the first table into memory and reads the second
table in as normal input to the map function
|
|
When writing queries, try to
facilitate this as much as possible and order the tables used in the join so
that the largest table is last.
|
Data Type
|
No interval types
|
|
|
|
All queries must reference a table
|
'dual' or table-less queries supported
|
|
|
No session-scoped temp tables
|
|
|
|
No 'IN' predicate
|
|
|
|
No 'FIND' string search function for producing the offset to a match
|
|
|
|
No find/replace string functions for plain strings (i.e. not regex)
|
|
|
|
No regular UNION, INTERSECT, or MINUS operators
|
|
|
|
Null values are treated differently than empty string, and are
exported differently. IE, empty strings are exported as '\n' and nulls
are exported as nulls
|
|
This isn't unique to Hive but still annoying when exporting data from
Hive into another system.
|
|
No hierarchical/self-referencing querying
|
|
Most distributed computing solutions can't do this, but it can be
very handy.
|
|
No Update or Delete statements
|
|
|
|
No cost-based explain plans.
|
|
Running explain plans generally just shows the path of accessing
data. Useful to some degree but it would be great if it was more
advanced in that it could help the user understand which steps are causing
the biggest slowdowns
|
|
Hive Does not support the ability to run a query that select from
tables in more than one database
|
It is possible
|
|
|
Hive does not support sub-queries such as those connected by
IN/EXISTS in the WHERE clause
|
|
|
|
Hive does not support the truncation of data from a table
|
|
|
|
No inequality join
|
|
|
|
group_concat () is missing in Hive QL
|
|
it is available with Impala
|
Saturday, June 14, 2014
HiveQL vs SQL
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment