This page (revision-21) was last changed on 23-Apr-2022 17:06 by Harry Metske

This page was created on 23-Apr-2022 17:05 by Harry Metske

!!! Cassandra
[{TableOfContents }]
* [Datastax Documentation|http://www.datastax.com/docs/1.2/index]
!! Install/config
! lxc
* /etc/default/lxc => change subnet from 10.0.3 to 10.0.4 (10.0.3 is already in use somewhere else)
* add static route in wireless router (10.0.4.0/24 => via 10.0.0.164)
* adjust {{ /etc/network/interfaces }} of container to:
{{{
auto eth0
#iface eth0 inet dhcp
iface eth0 inet static
address 10.0.4.10
netmask 255.255.255.0
network 10.0.4.0
broadcast 10.0.4.255
gateway 10.0.4.1
post-up route add default gw 10.0.4.1 dev eth0
# dns-* options are implemented by the resolvconf package, if installed
dns-nameservers 213.197.28.3 213.197.30.28
dns-search computerhok.nl
}}}
* lxc-host ubuntu1 => 10.0.4.11 (user=ubuntu, password=ubuntu)
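A quick sanity check of the static addressing above, as a minimal sketch using Python's standard {{ipaddress}} module. The addresses are copied from the {{ /etc/network/interfaces }} stanza; nothing here touches the container itself, it only recomputes the network, broadcast and prefix length so they can be compared with what was typed in.

```python
import ipaddress

# Values copied from the interfaces stanza above.
iface = ipaddress.ip_interface("10.0.4.10/255.255.255.0")
gateway = ipaddress.ip_address("10.0.4.1")
net = iface.network

print("network   %s" % net.network_address)    # compare with the 'network' line
print("broadcast %s" % net.broadcast_address)  # compare with the 'broadcast' line
print("prefix    /%d" % net.prefixlen)         # prefix length to use in the static route
print("gateway in subnet: %s" % (gateway in net))
```

The same check works for the other containers (10.0.4.11 and up): they should all fall inside the network printed here.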
! cassandra
* useradd cssndra && mkdir /home/cssndra && chown -R cssndra /home/cssndra , then change the login shell of cssndra in {{/etc/passwd}} from sh to bash
* wget 'http://mirrors.supportex.net/apache/cassandra/1.2.4/apache-cassandra-1.2.4-bin.tar.gz'
* apt-get install openjdk-7-jdk
* sudo mkdir /opt/apache-cassandra-1.2.4 && sudo chown cssndra /opt/apache-cassandra-1.2.4 && sudo ln -s /opt/apache-cassandra-1.2.4 /opt/cassandra
* cssndra@ubuntu1:~$ cd /opt && tar -xf ~/apache-cassandra-1.2.4-bin.tar.gz
* ls -l
{{{
cssndra@ubuntu1:/opt/cassandra$ ls -l
total 248
-rw-r--r-- 1 cssndra cssndra 152928 Apr 8 19:21 CHANGES.txt
-rw-r--r-- 1 cssndra cssndra 11609 Apr 8 19:21 LICENSE.txt
-rw-r--r-- 1 cssndra cssndra 47580 Apr 8 19:21 NEWS.txt
-rw-r--r-- 1 cssndra cssndra 1820 Apr 8 19:21 NOTICE.txt
-rw-r--r-- 1 cssndra cssndra 3569 Apr 8 19:21 README.txt
drwxr-xr-x 2 cssndra cssndra 4096 May 15 21:43 bin
drwxr-xr-x 2 cssndra cssndra 4096 May 15 21:43 conf
drwxr-xr-x 2 cssndra cssndra 4096 May 15 21:43 interface
drwxr-xr-x 4 cssndra cssndra 4096 May 15 21:43 javadoc
drwxr-xr-x 3 cssndra cssndra 4096 May 15 21:43 lib
drwxr-xr-x 3 cssndra cssndra 4096 May 15 21:43 pylib
drwxr-xr-x 4 cssndra cssndra 4096 Apr 8 19:21 tools
}}}
* sudo mkdir -p /var/lib/cassandra/data /var/lib/cassandra/commitlog && sudo chown -R cssndra /var/lib/cassandra
* sudo mkdir /var/log/cassandra/ && sudo chown -R cssndra /var/log/cassandra
* limit the heap size by editing {{./conf/cassandra-env.sh}} : set ''MAX_HEAP_SIZE="512M"'' and ''HEAP_NEWSIZE="100M"''
* edit {{./conf/cassandra.yaml}}:
** change the ''listen_address'' to 10.0.4.11 , necessary for multinode cluster communication (as OS hostname/ip is not properly configured)
** change the ''rpc_address'' to 10.0.4.11 , necessary to make it reachable from non-localhost (as OS hostname/ip is not properly configured)
** change the seeds parameter from 127.0.0.1 to 10.0.4.11 (this first node becomes the seed node for all nodes)
** configure ''endpoint_snitch: GossipingPropertyFileSnitch''
* edit {{./conf/log4j-server.properties}} : remove logging to stdout
* edit {{./conf/cassandra-rackdc.properties}} : see [Cassandra#Cluster config]
* edit {{./conf/cassandra-env.sh}} (at the bottom of the file) : uncomment and fill in the node's own address, e.g. on ubuntu2 : '' JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=10.0.4.12" '' %%small (this makes it possible to run nodetool against remote hosts) %%
* ==> at this point, before starting Cassandra, clone the container : ''lxc-clone -o ubuntu1 -n ubuntu2'' (ubuntu2 gets address 10.0.4.12)
* continue on the first node and start Cassandra there in the foreground with : ''./bin/cassandra -f ''
* add keyspace:
%%small {{{
cssndra@ubuntu1:/opt/cassandra$ ./bin/cassandra-cli
Connected to: "Test Cluster" on 127.0.0.1/9160
Welcome to Cassandra CLI version 1.2.4
Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.
[default@unknown] create keyspace DEMO with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy' and strategy_options = {replication_factor:1};
f243abc2-57dc-32b4-9390-beac4e988c5b
[default@unknown] use DEMO;
Authenticated to keyspace: DEMO
[default@DEMO] create column family Users with key_validation_class = 'UTF8Type' and comparator = 'UTF8Type' and default_validation_class = 'UTF8Type';
bb07d315-a824-3bbb-9b48-d759288e57b4
[default@DEMO] set Users[1234][name] = scott;
Value inserted.
Elapsed time: 63 msec(s).
[default@DEMO] set Users[1234][password] = tiger;
Value inserted.
Elapsed time: 3.96 msec(s).
[default@DEMO] get Users[1234];
=> (column=name, value=scott, timestamp=1368679313136000)
=> (column=password, value=tiger, timestamp=1368679322745000)
Returned 2 results.
Elapsed time: 58 msec(s).
[default@DEMO]
}}} %%
Now on the second node (ubuntu2) start Cassandra with ''/opt/cassandra/bin/cassandra'' , and start building the cluster.
First calculate the tokens for a 4-node cluster with the following Python script (calcToken.py) :
%%prettify
{{{
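# Number of nodes in the cluster
num_node = 4
# Evenly spaced initial tokens in the 0 .. 2**127 range
for n in range(num_node):
    # Integer division keeps the result exact (works on Python 2 and 3)
    print(2**127 // num_node * n)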
}}}
%%
And execute it :
{{{
cssndra@ubuntu1:~$ python calcToken.py
0
42535295865117307932921825928971026432
85070591730234615865843651857942052864
127605887595351923798765477786913079296
}}}
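The tokens above cover the 0 .. 2**127 range, which is the RandomPartitioner range. As far as I know the {{cassandra.yaml}} shipped with 1.2 selects Murmur3Partitioner, whose tokens run from -2**63 to 2**63-1, so check the ''partitioner'' setting before copying these values into ''initial_token''. A minimal sketch of both calculations (the function names are mine, not part of Cassandra):

```python
def random_partitioner_tokens(num_nodes):
    """Evenly spaced tokens in [0, 2**127), same result as calcToken.py."""
    return [2**127 // num_nodes * n for n in range(num_nodes)]


def murmur3_tokens(num_nodes):
    """Evenly spaced tokens in [-2**63, 2**63) for Murmur3Partitioner."""
    return [2**64 // num_nodes * n - 2**63 for n in range(num_nodes)]


for label, tokens in (("RandomPartitioner", random_partitioner_tokens(4)),
                      ("Murmur3Partitioner", murmur3_tokens(4))):
    print(label)
    for node, token in enumerate(tokens, start=1):
        print("  node %d: initial_token: %d" % (node, token))
```

With vnodes (''num_tokens'') none of this is needed, since the nodes pick their own tokens.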
Then start Cassandra on all 4 nodes by executing the following from the host, once per node address: ''ssh ubuntu@10.0.4.11 'sudo su - cssndra /opt/cassandra/bin/cassandra' ''
!! Cluster config
||IP||DC||RACK||seeder
|10.0.4.11|DC1|RAC1|Y
|10.0.4.12|DC1|RAC2|N
|10.0.4.13|DC2|RAC1|Y
|10.0.4.14|DC2|RAC2|N
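The table translates into one {{cassandra-rackdc.properties}} per node (GossipingPropertyFileSnitch reads ''dc'' and ''rack'' from it) plus one seed list that is identical on every node. A small sketch that derives both from the table; the {{CLUSTER}} list and the helper names are just for illustration:

```python
# (ip, dc, rack, is_seed) copied from the cluster table above
CLUSTER = [
    ("10.0.4.11", "DC1", "RAC1", True),
    ("10.0.4.12", "DC1", "RAC2", False),
    ("10.0.4.13", "DC2", "RAC1", True),
    ("10.0.4.14", "DC2", "RAC2", False),
]


def rackdc_properties(dc, rack):
    """Contents of conf/cassandra-rackdc.properties for one node."""
    return "dc=%s\nrack=%s\n" % (dc, rack)


def seed_list(cluster):
    """Comma-separated seed addresses, the same string on every node."""
    return ",".join(ip for ip, _dc, _rack, is_seed in cluster if is_seed)


for ip, dc, rack, _is_seed in CLUSTER:
    print("# %s : conf/cassandra-rackdc.properties" % ip)
    print(rackdc_properties(dc, rack))

print('# cassandra.yaml, every node : seeds: "%s"' % seed_list(CLUSTER))
```

Note that with two seed nodes the ''seeds'' value in {{cassandra.yaml}} has to list both addresses, not only 10.0.4.11.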
!! Creating keyspace, tables, inserting, updating , querying
! Create keyspace
First create a keyspace. You can do that both with ''cassandra-cli'' and ''cqlsh'', but they have different syntaxes :-) .\\
Here's a cqlsh example:
{{{
cssndra@ubuntu1:~$ cqlsh
Connected to Test Cluster at localhost:9160.
[cqlsh 2.3.0 | Cassandra 1.2.4 | CQL spec 3.0.0 | Thrift protocol 19.35.0]
Use HELP for help.
cqlsh> CREATE KEYSPACE demo_keyspace WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 2, 'DC2' : 2};
cqlsh> select * from system.schema_keyspaces;
 keyspace_name | durable_writes | strategy_class                                       | strategy_options
---------------+----------------+------------------------------------------------------+----------------------------
 system_auth   | True           | org.apache.cassandra.locator.SimpleStrategy          | {"replication_factor":"1"}
 demo_keyspace | True           | org.apache.cassandra.locator.NetworkTopologyStrategy | {"DC2":"2","DC1":"2"}
 system        | True           | org.apache.cassandra.locator.LocalStrategy           | {}
 system_traces | True           | org.apache.cassandra.locator.SimpleStrategy          | {"replication_factor":"1"}
cqlsh>
}}}
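The replication map is the only part of the statement that changes between a single-DC and a multi-DC setup, so it can be handy to generate it. A minimal sketch that only builds the CQL text (it does not talk to Cassandra; the helper names are made up for this page):

```python
def replication_map(strategy, **options):
    """Render a CQL 3 replication map, e.g. { 'class' : 'SimpleStrategy', ... }."""
    parts = ["'class' : '%s'" % strategy]
    for key, value in options.items():
        parts.append("'%s' : %d" % (key, value))
    return "{ " + ", ".join(parts) + " }"


def keyspace_cql(verb, name, strategy, **options):
    """Build a CREATE or ALTER KEYSPACE statement for cqlsh."""
    return "%s KEYSPACE %s WITH REPLICATION = %s;" % (
        verb, name, replication_map(strategy, **options))


# Two replicas in each data center, as in the cqlsh session above
print(keyspace_cql("CREATE", "demo_keyspace", "NetworkTopologyStrategy", DC1=2, DC2=2))
# Single data center alternative
print(keyspace_cql("CREATE", "demo_keyspace", "SimpleStrategy", replication_factor=1))
```

The printed statements can be pasted into cqlsh as-is. The data center names must match the ''dc'' values the snitch reports.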
! Create a column family
Create a columnfamily with the ''cassandra-cli'' utility:
{{{
cssndra@ubuntu1:~$ cassandra-cli
Connected to: "Test Cluster" on 127.0.0.1/9160
Welcome to Cassandra CLI version 1.2.4
Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.
[default@unknown] use demo_keyspace;
Authenticated to keyspace: demo_keyspace
[default@demo_keyspace] create column family users with key_validation_class = 'UTF8Type' and comparator = 'UTF8Type' and default_validation_class = 'UTF8Type';
0b6b0010-fc89-35a1-ad05-77d53e5a4443
}}}
! Insert data
Again with the ''cassandra-cli'' utility insert some data in the {{users}} columnfamily:
{{{
cssndra@ubuntu1:~$ cassandra-cli
Connected to: "Test Cluster" on 127.0.0.1/9160
Welcome to Cassandra CLI version 1.2.4
Type 'help;' or '?' for help.
Type 'quit;' or 'exit;' to quit.
[default@unknown] use demo_keyspace;
Authenticated to keyspace: demo_keyspace
[default@demo_keyspace] set users[1234][name] = scott;
Value inserted.
Elapsed time: 52 msec(s).
[default@demo_keyspace] set users[1234][password] = scott-secret;
Value inserted.
Elapsed time: 14 msec(s).
[default@demo_keyspace] set users[1234][length] = 185;
Value inserted.
Elapsed time: 8.76 msec(s).
[default@demo_keyspace] set users[1235][name] = harry;
Value inserted.
Elapsed time: 6.71 msec(s).
[default@demo_keyspace] set users[1235][length] = 181;
Value inserted.
Elapsed time: 13 msec(s).
[default@demo_keyspace] set users[1235][whatevercolumn] = skfkjdkfjdklsjfsjflkjldk181;
Value inserted.
Elapsed time: 6.2 msec(s).
[default@demo_keyspace] list users;
Using default limit of 100
Using default column limit of 100
-------------------
RowKey: 1234
=> (column=length, value=185, timestamp=1368883341707000)
=> (column=name, value=scott, timestamp=1368883316118000)
=> (column=password, value=scott-secret, timestamp=1368883330142000)
-------------------
RowKey: 1235
=> (column=length, value=181, timestamp=1368883368497000)
=> (column=name, value=harry, timestamp=1368883358461000)
=> (column=whatevercolumn, value=skfkjdkfjdklsjfsjflkjldk181, timestamp=1368883385475000)
2 Rows Returned.
Elapsed time: 42 msec(s).
}}}
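What the ''list users'' output shows is essentially a map of row keys to a sparse map of columns: row 1234 has a ''password'' column, row 1235 has ''whatevercolumn'' instead, and every column carries its own timestamp. A small sketch of that structure in Python, with the values copied from the output above. The timestamps look like microseconds since the Unix epoch; that unit is my assumption based on their size.

```python
from datetime import datetime, timezone

# row key -> {column name: (value, timestamp)}, copied from 'list users'
users = {
    "1234": {
        "length": ("185", 1368883341707000),
        "name": ("scott", 1368883316118000),
        "password": ("scott-secret", 1368883330142000),
    },
    "1235": {
        "length": ("181", 1368883368497000),
        "name": ("harry", 1368883358461000),
        "whatevercolumn": ("skfkjdkfjdklsjfsjflkjldk181", 1368883385475000),
    },
}


def written_at(timestamp_us):
    """Convert a microsecond timestamp (assumed unit) to a UTC datetime."""
    return datetime.fromtimestamp(timestamp_us / 1000000.0, tz=timezone.utc)


for key, columns in users.items():
    for name, (value, ts) in sorted(columns.items()):
        print(key, name, value, written_at(ts).isoformat())

# Columns that exist in only one of the two rows
print(sorted(set(users["1234"]) ^ set(users["1235"])))
```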
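And insert with the ''cqlsh -2'' utility (the {{-2}} flag selects CQL 2, which can address the column family created with cassandra-cli above) :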
{{{
cssndra@ubuntu1:~$ cqlsh -2
Connected to Test Cluster at localhost:9160.
[cqlsh 2.3.0 | Cassandra 0.0.0 | CQL spec 2.0.0 | Thrift protocol 19.35.0]
Use HELP for help.
cqlsh> USE demo_keyspace ;
cqlsh:demo_keyspace> INSERT INTO users ( key, name, password, whatevercolumn) VALUES ( '12345' , 'harry' , 'wachtwoordje' , 'blablabla fjkdsjf l4j 2k43u');
cqlsh:demo_keyspace>
}}}
And a stupid shell script to insert bulk data :
%%prettify
{{{
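#!/bin/bash
#
# Insert <num> generated rows into demo_keyspace.users, then list them.
# Abort with a usage message when the row count is missing.
num=${1:?usage: $0 <number-of-rows>}
n=0
# mktemp avoids clashes between concurrent runs
TMPFILE=$(mktemp)
echo "use demo_keyspace;" > "$TMPFILE"
while [ "$n" -lt "$num" ]
do
    # echo $n
    CQL="INSERT INTO users ( key, name, password, whatevercolumn) VALUES ( '99${n}' , 'harry${n}' , 'wachtwoordje${n}' , 'blablabla fjkdsjf ${n} ${n} ${n}2k43u');"
    # Quote the variable so the shell does not split or glob the statement
    echo "$CQL" >> "$TMPFILE"
    n=$((n + 1))
done
echo "echoing inserts to cqlsh..."
cqlsh -2 -f "$TMPFILE"
echo "listing users..."
cat <<EOF | cassandra-cli
use demo_keyspace;
list users;
EOF
rm "$TMPFILE"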
}}}
%%
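The same statements can also be generated from Python, which avoids the shell quoting issues. A minimal sketch that only prints the CQL, with the statement text taken from the shell script above; running it through cqlsh is left to the caller:

```python
def bulk_insert_statements(num):
    """Yield the CQL 2 statements the shell script writes to its temp file."""
    yield "use demo_keyspace;"
    for n in range(num):
        yield ("INSERT INTO users ( key, name, password, whatevercolumn) "
               "VALUES ( '99%d' , 'harry%d' , 'wachtwoordje%d' , "
               "'blablabla fjkdsjf %d %d %d2k43u');" % (n, n, n, n, n, n))


# Print a small batch; redirect the output to a file for real use
for statement in bulk_insert_statements(3):
    print(statement)
```

Save the output to a file and feed it to ''cqlsh -2 -f'' exactly like the shell script does.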
!! Cassandra notes/questions
! Questions
* Can I share a cassandra cluster between multiple applications while still having some form of (security) separation ? (like having multiple databases in MySQL, and arranging access to them with grants).
* Security in general, how is the gossip protected, how to prevent "illegal nodes" from entering the cluster ?
* Security, how is access control arranged, and on what level ?
* How to change replica settings ?
** You set the number of replicas when you create a keyspace, as part of the replica placement strategy.
** To change it afterwards, run in cqlsh: ''ALTER KEYSPACE "Excalibur" WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 3 }; ''
** Then run ''nodetool repair'' on each affected node. Wait until repair completes on a node before moving to the next node.
** also see [http://www.datastax.com/docs/1.2/cql_cli/using/keyspace]
* How to take down (phase out) a node in a controlled way ?
* What snitch to use ?
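The "repair one node at a time" step from the replica question above can be scripted. A rough sketch, assuming ''nodetool'' is on the PATH of the machine running it and that JMX is reachable on every node (see the ''java.rmi.server.hostname'' setting); the node list and the keyspace name are taken from this page, the function names are made up. It relies on ''nodetool repair'' returning only when the repair has finished, which is how it behaved for me but is worth verifying.

```python
import subprocess

NODES = ["10.0.4.11", "10.0.4.12", "10.0.4.13", "10.0.4.14"]


def repair_command(host, keyspace):
    """nodetool invocation that repairs one keyspace on one node."""
    return ["nodetool", "-h", host, "repair", keyspace]


def rolling_repair(nodes, keyspace, run=subprocess.check_call):
    """Repair the nodes strictly one after another.

    check_call blocks until the command exits and raises on a non-zero
    exit code, so a failed repair stops the loop before the next node.
    """
    for host in nodes:
        run(repair_command(host, keyspace))


# Dry run: print the commands instead of executing them
rolling_repair(NODES, "demo_keyspace", run=lambda cmd: print(" ".join(cmd)))
```

Drop the ''run'' argument to execute the commands for real.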
! Notes
* Every node should have the same list of seeds. In multiple data-center clusters, the seed list should include a node from each data center.
* Use NetworkTopologyStrategy when you have (or plan to have) your cluster deployed __across multiple data centers__
* Use vnodes, see ''num_tokens'' in cassandra.yaml and [http://wiki.apache.org/cassandra/Operations]