If you have existing HBase tables, it can be handy to wrap them in Hive external tables so that you can query them with HiveQL.
The following HiveQL creates the metastore table schema on top of the existing HBase table MyTableName (cf in the column mapping is the HBase column family):
CREATE EXTERNAL TABLE MyTableName(key string, Column1 string, Column2 string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:Column1,cf:Column2")
TBLPROPERTIES ('hbase.table.name' = 'MyTableName');
The next step is to add the HBase auxiliary jars to hive-site.xml so that queries which need them on the classpath, typically those that launch a MapReduce job (e.g. select count(1) from MyTableName), will run:
<property>
<name>hive.aux.jars.path</name>
<value>file:///opt/cloudera/parcels/CDH/lib/hive/lib/zookeeper.jar,file:///opt/cloudera/parcels/CDH/lib/hive/lib/hbase.jar,file:///opt/cloudera/parcels/CDH/lib/hive/lib/hive-hbase-handler-0.10.0-cdh4.6.0.jar,file:///opt/cloudera/parcels/CDH/lib/hive/lib/guava-11.0.2.jar</value>
</property>
To run these queries from the host command line, add this property in Cloudera Manager under hive1 service > Config > Service-Wide > Advanced > Hive Service Configuration Safety Valve for hive-site.xml.
To run them from the Hue Hive UI, add it under hue1 service > Config > Beeswax Server (Default) > Advanced > Hive Configuration Safety Valve.
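The hive.aux.jars.path value is just a comma-separated list of file:// URIs, so it can be generated rather than typed by hand. The following is a minimal Python sketch, assuming the jar names and library directory shown above; the helper name aux_jars_value is made up for illustration and is not part of Hive or Cloudera Manager.

```python
from pathlib import PurePosixPath


def aux_jars_value(lib_dir, jar_names):
    """Join jar file names under lib_dir into a comma-separated list of file:// URIs."""
    base = PurePosixPath(lib_dir)
    # An absolute POSIX path starts with "/", so "file://" + path yields "file:///..."
    return ",".join(f"file://{base / name}" for name in jar_names)


# Jar names copied from the hive-site.xml snippet above; adjust versions to your install.
jars = [
    "zookeeper.jar",
    "hbase.jar",
    "hive-hbase-handler-0.10.0-cdh4.6.0.jar",
    "guava-11.0.2.jar",
]
value = aux_jars_value("/opt/cloudera/parcels/CDH/lib/hive/lib", jars)
print(value)
```

Paste the printed string into the value element of the property. Check that each jar actually exists at that path on your hosts, since the version suffixes differ between CDH releases.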
Refs:
https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration
http://www.confusedcoders.com/bigdata/hive/hbase-hive-integration-querying-hbase-via-hive
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/cdh4ig_topic_18_10.html
https://groups.google.com/a/cloudera.org/forum/#!topic/cdh-user/E8GfiwMOIPw
Wednesday, March 12, 2014
Sunday, March 9, 2014
Free disk space email alert on Windows Server
Schedule the following PowerShell script to run periodically to receive email alerts on low disk space:
# Schedule with: %windir%\SysWOW64\WindowsPowerShell\v1.0\powershell.exe -WindowStyle "Hidden" -File C:\path_to_scripts\freedisk_space_alert.ps1
$minGbThreshold = 10;
$computers = "localhost"   # a single name, or several: "server1", "server2"
$smtpAddress = "localhost";
$toAddress = "someone@test.com";
$fromAddress = "someone@test.com";
foreach($computer in $computers)
{
    # DriveType 3 = local fixed disks
    $disks = Get-WmiObject -ComputerName $computer -Class Win32_LogicalDisk -Filter "DriveType = 3";
    $computer = $computer.ToUpper();
    foreach($disk in $disks)
    {
        # FreeSpace is reported in bytes; 1073741824 bytes = 1 GB
        $freeSpaceGB = [Math]::Round([double]$disk.FreeSpace / 1073741824, 2);
        if($freeSpaceGB -lt $minGbThreshold)
        {
            $smtp = New-Object Net.Mail.SmtpClient($smtpAddress)
            $msg = New-Object Net.Mail.MailMessage
            $msg.To.Add($toAddress)
            $msg.From = $fromAddress
            $msg.Subject = "Disk space below threshold: " + $computer + "\" + $disk.DeviceId
            $msg.Body = $computer + "\" + $disk.DeviceId + " " + $freeSpaceGB + "GB Remaining";
            $smtp.UseDefaultCredentials = $false;
            $cred = New-Object System.Net.NetworkCredential("\smtpuser", "<enter password>"); # Ensure smtpuser is authorised to send emails
            $smtp.Credentials = $cred;
            $smtp.Send($msg)
        }
    }
}
This script is slightly modified from the one found here to include credentials:
http://gavindraper.com/2012/09/22/automatic-low-hard-disk-alerts-for-windows-server/
Thanks Gavin!
You may also need to enable script execution in PowerShell by running the following:
%windir%\SysWOW64\WindowsPowerShell\v1.0\powershell.exe Set-ExecutionPolicy RemoteSigned
RemoteSigned still blocks unsigned scripts downloaded from the internet, which makes it a safer choice than setting the policy to Unrestricted.
See: http://superuser.com/questions/106360/how-to-enable-execution-of-powershell-scripts
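To sanity-check the threshold arithmetic without sending any email, the bytes-to-GB conversion and comparison can be tried in isolation. This is a rough Python sketch of the same calculation, not a replacement for the PowerShell script; the function names are made up for illustration, and shutil.disk_usage(".") simply reads the drive that holds the current directory.

```python
import shutil

BYTES_PER_GB = 1073741824  # same divisor as the PowerShell script (2**30)


def free_space_gb(free_bytes):
    """Convert a byte count to GB, rounded to two decimals like the script."""
    return round(free_bytes / BYTES_PER_GB, 2)


def is_below_threshold(free_bytes, min_gb_threshold):
    """True when the rounded free space is strictly below the threshold."""
    return free_space_gb(free_bytes) < min_gb_threshold


# Check the drive that holds the current working directory.
usage = shutil.disk_usage(".")
print(free_space_gb(usage.free), "GB free; alert:", is_below_threshold(usage.free, 10))
```

As in the script, the comparison is strict, so a disk with exactly the threshold amount free does not trigger an alert.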