HBasta
A simple wrapper around the HBase Thrift API
Introduction
------------
HBasta aims to do three things:
* Ease the pain of initially setting up everything required to use the
HBase Thrift Python API (e.g. Thrift, the Thrift Python bindings, and the generated HBase bindings).
* Provide a thin wrapper on top of the generated HBase Thrift API
that takes care of some of the boilerplate needed to set up a connection
and handles most of the Thrift types, implicitly converting
them to native Python structures (e.g. dict, list, tuple) in the most natural
way possible.
* Provide some simple command-line tools to streamline working
with HBase, in particular easy ways of creating data dumps/views
by scanning different ranges in tables.
It was designed with lazy people (like me) in mind :)
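To illustrate the kind of conversion the second point describes, here is a minimal sketch of flattening a Thrift-style row result into a plain dict. The TCell and TRowResult namedtuples are hypothetical stand-ins for the generated Thrift structs, and row_to_native is an illustrative helper, not part of HBasta's actual code.

```python
from collections import namedtuple

# Hypothetical stand-ins for the Thrift-generated structs. The real classes
# come from the generated 'hbase' module and may differ in detail.
TCell = namedtuple('TCell', ['value', 'timestamp'])
TRowResult = namedtuple('TRowResult', ['row', 'columns'])


def row_to_native(row_result):
    """Flatten one row result into (row_key, {column_name: value})."""
    cols = dict((name, cell.value) for name, cell in row_result.columns.items())
    return row_result.row, cols


result = TRowResult(
    row='row_key',
    columns={
        'colfamily:col1': TCell('a', 1),
        'colfamily:col2': TCell('b', 2),
    },
)
key, cols = row_to_native(result)
print(cols)
```

This sketch drops cell timestamps for brevity; a wrapper that needs them could return (value, timestamp) tuples instead.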
This software is distributed under the Apache 2.0 license.
Prerequisites
-------------
* setuptools
If you've done any serious Python work, you probably have setuptools installed.
An easy way to check is to see whether the 'easy_install' executable is on your path.
Otherwise, it can easily be installed with most Linux package managers. On Ubuntu:
apt-get install python-setuptools
* Thrift
Thrift is typically installed by downloading it from the project homepage (http://thrift.apache.org/download/)
and building it with the Makefile:
tar xzf thrift-0.5.0.tar.gz
cd thrift-0.5.0
./configure && make
sudo make install
* Thrift Python bindings
These should be installed automatically when using setuptools (see the installation instructions below).
* Generated HBase library
Run the included 'generate_hbase_py.sh' to generate the hbase Python module.
Afterwards, it should be installed automatically when using setuptools (see the installation instructions below).
NOTE: By default, this library reflects the latest HBase Thrift API from the HBase trunk.
If you'd like to bundle a different generated HBase API, simply generate it yourself from the appropriate .thrift
file and place all the generated files in a folder called 'hbase' under the hbasta project root before installing.
Installing
----------
Installation should be straightforward:
1. As mentioned above, use 'generate_hbase_py.sh' to fetch the latest Hbase.thrift from the HBase trunk and generate the
HBase Python bindings on the fly:
./generate_hbase_py.sh
2. Now you can install with setuptools easily:
python setup.py install
Usage
-----
The HBasta class exposes a simple programmatic interface to most of the common HBase operations:
* Create a table
>>> from hbasta import HBasta
>>> client = HBasta('localhost', 9999)
>>> client.create_table('tablename')
* Get a whole / partial row
>>> from hbasta import HBasta
>>> client = HBasta('localhost', 9999)
>>> row = client.get_row('mytable', 'row_key')
>>> print row
{'colfamily:col1': col1_val, ...}
>>> partial_row = client.get_row('mytable', 'row_key', colspec=['colfamily:col2'])
>>> print partial_row
{'colfamily:col2': col2_val}
* Add a new row
>>> client.add_row('mytable', 'row_key', cols={'myfamily:mycolumn': 'mydata'})
* Use a scanner
>>> from hbasta import HBasta
>>> client = HBasta('localhost', 9999)
>>> scanner_id = client.scanner_open('mytable', 'row_0_id', colspec=('data',))
>>> for row, cols in client.scanner_get_list(scanner_id, num_rows=100):
...
>>> client.scanner_close(scanner_id)
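The open / fetch / close scanner pattern above can be wrapped in a generator so the scanner is closed even if iteration fails. The following is a sketch only: scan_rows is a hypothetical helper, not part of HBasta, and it assumes scanner_get_list returns an empty list once the scanner is exhausted. FakeClient is an in-memory stand-in so the example runs without an HBase server.

```python
def scan_rows(client, table, start_row, colspec=None, batch_size=100):
    """Yield (row_key, columns) pairs, closing the scanner when done."""
    scanner_id = client.scanner_open(table, start_row, colspec=colspec)
    try:
        while True:
            batch = client.scanner_get_list(scanner_id, num_rows=batch_size)
            if not batch:  # assumption: an empty batch means no more rows
                break
            for row, cols in batch:
                yield row, cols
    finally:
        client.scanner_close(scanner_id)


class FakeClient(object):
    """Hypothetical in-memory stand-in for an HBasta client (demo only)."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.pos = 0
        self.closed = False

    def scanner_open(self, table, start_row, colspec=None):
        self.pos = 0
        return 1

    def scanner_get_list(self, scanner_id, num_rows=1):
        batch = self.rows[self.pos:self.pos + num_rows]
        self.pos += len(batch)
        return batch

    def scanner_close(self, scanner_id):
        self.closed = True


fake = FakeClient([
    ('row_0', {'data:a': '1'}),
    ('row_1', {'data:a': '2'}),
    ('row_2', {'data:a': '3'}),
])
rows = list(scan_rows(fake, 'mytable', 'row_0', colspec=('data',), batch_size=2))
print(len(rows))
```

With a real client, the same helper would be called as scan_rows(client, 'mytable', 'row_0_id', colspec=('data',)), provided the assumption about empty batches holds.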
For usage instructions for the command-line utilities, run them with the --help flag.
Currently supported command-line tools:
* hbase-prefix-scan - Run a prefix-based scanner on a table
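A prefix scan over lexicographically sorted row keys is commonly expressed as a range scan from the prefix itself up to the smallest key that no longer shares the prefix. The sketch below shows that general technique. It is not taken from hbase-prefix-scan, whose implementation may differ, and prefix_stop_key is a hypothetical helper name.

```python
def prefix_stop_key(prefix):
    """Return the smallest key greater than every key starting with prefix.

    Returns None when no such key exists (empty prefix, or a prefix made
    only of 0xff bytes), meaning the scan should run to the end of the table.
    """
    key = bytearray(prefix)
    # Trailing 0xff bytes cannot be incremented, so drop them first.
    while key and key[-1] == 0xFF:
        key.pop()
    if not key:
        return None
    key[-1] += 1
    return bytes(key)


# Demo on an in-memory sorted key list standing in for a table's row keys.
keys = [b'user_1', b'user_2', b'usez', b'video_1']
start = b'user_'
stop = prefix_stop_key(start)
matched = [k for k in keys if k >= start and (stop is None or k < stop)]
print(matched)
```

Row keys are treated as raw bytes here because HBase orders rows by byte value.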