Cloud Used: Rackspace
Load Balancer used: HAProxy
OS: CentOS 5.5
Cassandra version used: apache-cassandra-0.6.5
Below are the steps to cluster Cassandra through HAProxy on Rackspace Cloud:
1. Install HAProxy on any node. I am currently using CentOS 5.5.
2. Install Cassandra as the seed node on another machine. By default, Cassandra uses port 7000 for cluster communication, 9160 for clients (Thrift), and 8080 for JMX.
3. Change the Cassandra clustering configuration on the seed node in the file $CASSANDRA_HOME/conf/storage-conf.xml as follows:
1) In the Seeds section, enter the IP of the HAProxy load balancer node
<Seeds>
<Seed>HAProxy_Load_Balancer_IP</Seed>
</Seeds>
2) Enter the IP of the Cassandra seed node in ListenAddress and ThriftAddress
<ListenAddress>cassandra_seed_ip</ListenAddress>
<ThriftAddress>cassandra_seed_ip</ThriftAddress>
4. Open the Cassandra ports on the seed node by running the following commands at the command prompt:
iptables -I INPUT 1 -p tcp --dport 7000 -j ACCEPT
/etc/init.d/iptables save
/etc/init.d/iptables restart
iptables -I INPUT 1 -p tcp --dport 9160 -j ACCEPT
/etc/init.d/iptables save
/etc/init.d/iptables restart
iptables -I INPUT 1 -p tcp --dport 8080 -j ACCEPT
/etc/init.d/iptables save
/etc/init.d/iptables restart
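The same three rules come up again on the other nodes, so it may be handy to generate them in a loop. Below is a minimal sketch, assuming a POSIX shell; open_ports_cmds is a hypothetical helper name of my own, not part of iptables. It only prints the commands, so you can review them before running anything. Rules added with iptables -I take effect immediately, so a single save at the end should be enough to persist them.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the firewall commands
# for every TCP port passed as an argument.
open_ports_cmds() {
    for port in "$@"; do
        # Note: --dport is written with two plain ASCII hyphens.
        echo "iptables -I INPUT 1 -p tcp --dport $port -j ACCEPT"
    done
    # Persist the rules once, after all of them have been inserted.
    echo "/etc/init.d/iptables save"
}

# Ports used on a Cassandra node in this post.
open_ports_cmds 7000 9160 8080
```

Once the printed commands look right, piping them into a root shell (open_ports_cmds 7000 9160 8080 | sh) applies them.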
5. Edit the HAProxy configuration file /etc/haproxy.cfg on the HAProxy node to add the Cassandra port configuration as follows:
listen cassandraseed
bind *:7000
mode tcp
option tcplog
log global
balance roundrobin
clitimeout 150000
srvtimeout 150000
contimeout 30000
server server1 cassandraSeedNodeIP:7000 check
listen cassandrathrift
bind *:9160
mode tcp
option tcplog
log global
balance roundrobin
clitimeout 150000
srvtimeout 150000
contimeout 30000
server server1 cassandraSeedNodeIP:9160 check
listen cassandrajmx
bind *:8000
mode tcp
option tcplog
log global
balance roundrobin
clitimeout 150000
srvtimeout 150000
contimeout 30000
server server1 cassandraSeedNodeIP:8080 check
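The three listen sections above differ only in their name, the port HAProxy binds to, and the backend port. As a rough sketch (listen_block and SEED_IP are hypothetical names I made up for illustration), a small shell function can print them from those three values, which makes a mismatch such as the 8000 to 8080 JMX mapping easier to spot:

```shell
#!/bin/sh
# SEED_IP is a placeholder; set it to the real address of the seed node.
SEED_IP="${SEED_IP:-cassandraSeedNodeIP}"

# Hypothetical helper: print one TCP "listen" section.
# Arguments: section name, port HAProxy binds to, port on the Cassandra node.
listen_block() {
    name=$1
    bind_port=$2
    backend_port=$3
    cat <<EOF
listen $name
bind *:$bind_port
mode tcp
option tcplog
log global
balance roundrobin
clitimeout 150000
srvtimeout 150000
contimeout 30000
server server1 $SEED_IP:$backend_port check
EOF
}

listen_block cassandraseed 7000 7000
listen_block cassandrathrift 9160 9160
listen_block cassandrajmx 8000 8080
```

After pasting the output into the configuration file, haproxy -c -f /etc/haproxy.cfg should report whether the file parses before you restart the service.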
6. Open the ports on the HAProxy node by running the following commands at the command prompt:
iptables -I INPUT 1 -p tcp --dport 7000 -j ACCEPT
/etc/init.d/iptables save
/etc/init.d/iptables restart
iptables -I INPUT 1 -p tcp --dport 9160 -j ACCEPT
/etc/init.d/iptables save
/etc/init.d/iptables restart
iptables -I INPUT 1 -p tcp --dport 8000 -j ACCEPT
/etc/init.d/iptables save
/etc/init.d/iptables restart
7. Install Cassandra as a non-seed node on another machine.
8. Change the Cassandra clustering configuration on the non-seed node in the file $CASSANDRA_HOME/conf/storage-conf.xml as follows:
1) In the Seeds section, enter the IP of the HAProxy load balancer
<Seeds>
<Seed>HAProxy_Load_Balancer_IP</Seed>
</Seeds>
2) Enter the IP of the Cassandra non-seed node in ListenAddress and ThriftAddress
<ListenAddress>cassandra_non-seed_ip</ListenAddress>
<ThriftAddress>cassandra_non-seed_ip</ThriftAddress>
3) Turn on AutoBootstrap on the non-seed node
<AutoBootstrap>true</AutoBootstrap>
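Before starting a node, it can be worth confirming what the file really contains, since a stray space or a leftover default is easy to miss. Here is a minimal sketch, assuming each of these tags sits on a single line as in the fragments above; xml_value is a hypothetical helper name, and the sed expression is a simple line match, not a real XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the trimmed text of <Tag>...</Tag> for every
# line in the file where the open and close tag appear together.
# Arguments: tag name, file path.
xml_value() {
    sed -n "s:.*<$1>[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*</$1>.*:\1:p" "$2"
}

# Assumes CASSANDRA_HOME is set as in the rest of this post.
CONF="${CASSANDRA_HOME:-.}/conf/storage-conf.xml"
if [ -f "$CONF" ]; then
    for tag in Seed ListenAddress ThriftAddress AutoBootstrap; do
        echo "$tag = $(xml_value "$tag" "$CONF")"
    done
fi
```

On the non-seed node, Seed should show the HAProxy IP, ListenAddress and ThriftAddress should show the node's own IP, and AutoBootstrap should show true.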
9. Open the Cassandra ports on the non-seed node by running the following commands at the command prompt:
iptables -I INPUT 1 -p tcp --dport 7000 -j ACCEPT
/etc/init.d/iptables save
/etc/init.d/iptables restart
iptables -I INPUT 1 -p tcp --dport 9160 -j ACCEPT
/etc/init.d/iptables save
/etc/init.d/iptables restart
iptables -I INPUT 1 -p tcp --dport 8080 -j ACCEPT
/etc/init.d/iptables save
/etc/init.d/iptables restart
10. Restart HAProxy on the HAProxy node by running the following command:
/etc/init.d/haproxy restart
11. Start the seed Cassandra node by running the following command:
$CASSANDRA_HOME/bin/cassandra
12. Start the non-seed Cassandra node by running the following command:
$CASSANDRA_HOME/bin/cassandra
That's it, your Cassandra machines are now clustered through HAProxy 🙂
You can verify this with the Cassandra CLI. Run the CLI on both nodes with the following commands, connecting each one to its own Thrift address:
- On the seed node, run the following commands:
a. $CASSANDRA_HOME/bin/cassandra-cli
b. cassandra> connect seed_ip/9160
c. cassandra> set Keyspace1.Standard1['IIPL-1274']['name']='Sunil Kumar'
- On the non-seed node, run the following commands:
a. $CASSANDRA_HOME/bin/cassandra-cli
b. cassandra> connect non-seed_ip/9160
c. cassandra> get Keyspace1.Standard1['IIPL-1274']
The output should be:
=> (column=6e616d65, value=Sunil Kumar, timestamp=1288196657949000)
Returned 1 results.
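If the column in that output looks unfamiliar: 6e616d65 is just the hex encoding of the ASCII string name, the column that was set on the seed node. A small sketch to decode such a value in a POSIX shell follows; hex_to_ascii is a hypothetical helper of my own, and it assumes an even number of hex digits that map to printable ASCII.

```shell
#!/bin/sh
# Hypothetical helper: decode a string of hex digit pairs into ASCII text.
hex_to_ascii() {
    hex=$1
    while [ -n "$hex" ]; do
        byte=${hex%"${hex#??}"}   # first two hex digits
        hex=${hex#??}             # remainder of the string
        # Convert the byte to octal, then let printf emit that character.
        printf "\\$(printf '%03o' "0x$byte")"
    done
    echo
}

hex_to_ascii 6e616d65
```

If the decoded column name matches what you wrote on the seed node, the value was read back from the other node, which is the point of the check.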
cheeeeeeeeeeeeeeeeeeers:)