March 5, 2014 / binidas

Securing your site

Follow the steps below to ensure your website is as secure as possible. There is no completely foolproof solution, but following some best practices can help prevent intruders from fiddling with your customers’ accounts and harming your website.


1. Use TLS/SSL on the website to provide a secure HTTPS channel for communication between the client and server.

2. Use the PBKDF2 or bcrypt hashing function to store passwords. These functions apply many iterations of hashing, which slows down an attacker. SHA and MD hashing algorithms are designed for performance and are therefore more susceptible to brute-force attacks.

3. Cross-Site Request Forgery – add a CSRF token to every web page to prevent attackers from replaying a request using a pre-authenticated auth cookie combined with a little social engineering on the site you are accessing.
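Points 2 and 3 can be sketched with the Python standard library. This is a minimal illustration only, not production code: the iteration count and token size are assumed values, not recommendations from the post.

```python
import hashlib
import hmac
import os
import secrets

# Illustrative iteration count for PBKDF2; tune for your hardware.
ITERATIONS = 200_000

def hash_password(password):
    """Hash a password with PBKDF2-HMAC-SHA256; return (salt, key) for storage."""
    salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, key

def verify_password(password, salt, stored_key):
    """Re-derive the key and compare in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(key, stored_key)

def new_csrf_token():
    """Random per-session CSRF token to embed in each form."""
    return secrets.token_urlsafe(32)

def check_csrf(session_token, submitted_token):
    """On POST, the submitted token must match the one stored in the session."""
    return hmac.compare_digest(session_token, submitted_token)
```

The constant-time comparison (`hmac.compare_digest`) matters in both cases: a naive `==` can leak information through timing differences.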

A few others, from –

Account/Password Policy

1. Account disclosure – do not reveal whether a user account exists in the system. The validation error message on the login and forgot-password pages should state only that the username and/or password doesn’t match.

2. Brute force protection – lock the account after 3 to 5 failed login attempts.

3. Enforce strong credentials – the password hint should state only a minimum number of characters, not a maximum. For example, “password should be between 6 and 10 characters” is wrong; “password should have a minimum of 6 characters” is right.

4. Never email a password; instead, send a link containing a password-reset token, with a time limit on the link.

5. You may use a CAPTCHA to prevent attackers from automating the process.

6. Providing a good secret question is important (see PayPal’s secret questions for reference). Be sure to securely hash the secret question and answer, just as you do the password.

7. Use two-factor authentication for password reset, either via RSA SecurID or SMS (see how Google does 2FA: first it sends a URL to your email with a uniquely generated token; when you click the link, it sends a code via SMS to your registered mobile and then redirects you to the password-reset page). Not foolproof if you lose your smartphone.

8. Notify the account owner when the password is changed.

9. Log every action of the account.
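The time-limited reset link in point 4 can be sketched as a signed, expiring token. This is a minimal illustration only, assuming an HMAC over the user id and an expiry timestamp; the secret key name and TTL are hypothetical, and in practice the key would come from server-side configuration.

```python
import hashlib
import hmac
import time

SECRET = b"server-side-secret"  # hypothetical; load from secure config in practice
TOKEN_TTL = 3600  # link is valid for one hour

def make_reset_token(user_id, now=None):
    """Token = expiry timestamp + HMAC over (user_id, expiry)."""
    expires = int((now if now is not None else time.time()) + TOKEN_TTL)
    sig = hmac.new(SECRET, f"{user_id}:{expires}".encode(), hashlib.sha256).hexdigest()
    return f"{expires}:{sig}"

def check_reset_token(user_id, token, now=None):
    """Reject malformed, expired, or tampered tokens."""
    try:
        expires_str, sig = token.split(":")
        expires = int(expires_str)
    except ValueError:
        return False
    if (now if now is not None else time.time()) > expires:
        return False  # the link has expired
    expected = hmac.new(SECRET, f"{user_id}:{expires}".encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)
```

Because the expiry is covered by the signature, an attacker cannot extend a token’s lifetime by editing the timestamp.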

Thanks to Troy Hunt for this beautiful flow chart.



October 8, 2013 / binidas

Saga Pattern

When working on distributed systems, we focus our effort on designing for low latency and managing consistency across the systems. Following standard SOA practice, the single responsibility principle is best implemented by breaking a monolithic system down into multiple systems, each with its own unit of work, service, or business process.

This boils down to the need to manage distributed state, including long-running processes or workflows that span multiple services. Such reactive systems with state machines have to manage fault tolerance, ensuring that at no point in time is data inconsistent across services, and have to reduce long-running transactions. This is done by keeping communication short and handling failures with compensation. The Saga architecture pattern can be introduced to design such a state machine.
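The compensation idea can be sketched in a few lines of Python. This is a hypothetical minimal illustration, not code from any of the implementations discussed: each step pairs an action with a compensating action, and when a step fails, the completed steps are undone in reverse order.

```python
def run_saga(steps):
    """steps: iterable of (action, compensate) pairs; returns True on success."""
    done = []
    try:
        for action, compensate in steps:
            action()
            done.append(compensate)
    except Exception:
        # A step failed: run the compensations for completed steps in reverse.
        for compensate in reversed(done):
            compensate()
        return False
    return True

def failing_step():
    # Stand-in for a step whose downstream call fails.
    raise RuntimeError("downstream service unavailable")
```

For example, a booking saga that reserves a seat and charges a card would, on a later failure, refund the card first and then release the seat.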

Jimmy Bogard explains it with a pretty good example that can be applied in real-life scenarios. In his blog he describes the ObserverController and Command patterns used in designing a Saga implementation.

He also details how the Process Manager pattern from Enterprise Integration Patterns brings on resource contention and starvation.

Having said that, routing slips can be used when no centralised controller is required. When the steps of the process are known up front, a routing slip provides a better solution that avoids the shared state of the process manager pattern.
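A routing slip can be sketched as the message carrying its own ordered list of remaining steps, so no central controller holds shared state. This is a hypothetical illustration, with each step modelled as a plain function:

```python
def process(message, slip):
    """Pass the message through each step on the slip in turn.

    The slip (the list of remaining steps) travels with the message,
    so there is no central coordinator and no shared state.
    """
    while slip:
        step = slip.pop(0)       # take the next step off the slip
        message = step(message)  # hand the message to that step
    return message
```

Each service processes the message, pops itself off the slip, and forwards the rest; in a real system the slip would travel in the message headers rather than a Python list.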

October 2, 2013 / binidas

Hue vs HDFS users and groups

While exploring Hue and HDFS, I found that one thing to be aware of is user accounts; it is a bit confusing when you try to upload files to HDFS. Hue users and HDFS users are different, so if you try to change file permissions from the Hue interface, you will get a permission-denied error, as only the hdfs user has the privileges to do so.

The blog post authorization-and-authentication-in-hadoop explains user authentication in the Hadoop ecosystem in detail. Essentially, the users and groups created by the superuser (i.e. hdfs, mapred, etc.) are nothing but strings of characters.

Ideally, the following commands will assign dummy_user and dummy_group to the directory /test_dir created by the superuser ‘hdfs’.

[cloudera@localhost bin]$ sudo -u hdfs hadoop fs -mkdir /test_dir
[cloudera@localhost bin]$ sudo -u hdfs hadoop fs -chown dummy_user:dummy_group /test_dir

HDFS user permission – allows access to files in HDFS
MapReduce user permission – defines which user can run a job, etc.
Authorization is checked by escalation: the system first verifies whether the user has permission, and otherwise checks the group permission. You can check which groups a user belongs to with the command ‘id -Gn dummy_user’. The system can be configured to use LDAP/Active Directory for users and groups.
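The escalation order described above can be sketched as follows. This is an illustration of the idea only, not Hadoop’s actual permission code; the function and parameter names are made up.

```python
def can_access(user, user_groups, file_owner, file_group,
               owner_ok, group_ok, other_ok):
    """POSIX-style permission escalation, as HDFS applies it to files.

    Check the owner's permission bit first; if the user is not the
    owner, fall back to the group bit; otherwise use the 'other' bit.
    """
    if user == file_owner:
        return owner_ok
    if file_group in user_groups:
        return group_ok
    return other_ok
```

So after the chown above, dummy_user gets the owner bits on /test_dir, members of dummy_group get the group bits, and everyone else gets the ‘other’ bits.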

So finally I got a simple MR application running on Cloudera. You can find it on GitHub.

October 2, 2013 / binidas

Running Cloudera on CentOS and a list of issues encountered

I downloaded the full-blown standard edition of Cloudera to install on a 64-bit CentOS 6.4 server. Installation of the components was much more straightforward than with the vanilla Hadoop ecosystem, thanks to Cloudera Manager, which makes it easy to install each component.

However, I must say it wasn’t a one-click installation of the Cloudera cluster; I had to tweak a few configurations to get it working. Below are the issues I have encountered so far and resolved.

1. Open the ports as given below, or the installation will error out.

[root@localhost ~]# vi /etc/sysconfig/iptables
-A INPUT -m state --state NEW -m tcp -p tcp --dport 7180 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 7182 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 7183 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 7184 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 7432 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 9000 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 9001 -j ACCEPT
-A INPUT -m state --state NEW -m tcp -p tcp --dport 8888 -j ACCEPT


[root@localhost ~]# /etc/init.d/iptables restart

iptables: Flushing firewall rules: [ OK ]
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Unloading modules: [ OK ]
iptables: Applying firewall rules: [ OK ]
iptables: Loading additional modules: nf_conntrack_ftp [ OK ]
2. WARNING hostname localhost.localdomain differs from the canonical name localhost

Verify what the hostname resolves to with “host -v -t A `hostname`”.

Edit hostname mapping in /etc/sysconfig/network. Change localhost.localdomain to localhost or your domain name.

/etc/init.d/network restart
Hue currently requires that the machines within your cluster can connect to each other freely over TCP. The machines outside your cluster must be able to open TCP port 8888 on the Hue Server (or the configured Hue web HTTP port) to interact with the system.

3. Monitoring services do not work; they error out with: Could not find a HOST_MONITORING nozzle from SCM
Solution: re-run the Cloudera Manager upgrade wizard.

4. Oozie monitoring web UI not working – http://masternode:11000/oozie/
As per the instructions, ext-2.2 must be extracted to /var/lib/oozie/libext.

But that did not work: the symbolic link on the deployed oozie-server pointed to a different path, /var/lib/oozie/ext-2.2, so make sure to unzip the ext-2.2 files to this location.

/opt/cloudera/parcels/CDH-4.4.0-1.cdh4.4.0.p0.39/lib/oozie/oozie-server/webapps/oozie/ext-2.2 -> /var/lib/oozie/ext-2.2

July 2, 2013 / binidas

First steps to explore Hadoop

Having a baby is the most wonderful experience; it changes your perspective on life altogether. I gave birth to a beautiful baby boy and took a year off from work. Although having a baby takes all your time, I decided to use whatever spare time I had to hone my skills in big-data computing.

Hadoop is what I decided to try my hand at for the data crunching in my personal project. Hadoop is a distributed big-data solution in which vast volumes of data are crunched by a cluster of machines working in parallel on the same problem to obtain the desired solution. I found a brilliant presentation on HDFS by Sameer Farooqui, Intro to HDFS, which clearly explains the fundamentals behind Hadoop HDFS and gave me an overview of the underlying Hadoop architecture. The Hadoop ecosystem has other projects, tools, and frameworks, which you can find on the Apache website.

Now I was thrilled to write a Hello World application in Hadoop and, as usual, downloaded the software, set it up manually on a single node, and ran the example application. I wasn’t very pleased with the vanilla setup; I needed something with which I could configure a cluster environment, and hence came across a blog that mentions Cloudera.

To get started, Cloudera has a very simple setup that can boot up in a VM environment using the QuickStart VM installable. Instructions on how to set it up are available in the INSTALLATION guide.
With the VM up and running, I wanted a quick guide on how to use Cloudera without attending the training course. I found a presentation on YouTube from Lynn Langit, which is a very basic tutorial on MapReduce using Cloudera. I must say she is an excellent presenter/tutor. Check out her videos, Hadoop MapReduce Fundamentals 1 of 5, on YouTube.

Next is to set up the enterprise version of Cloudera on CentOS, and to get a good book that can guide production application development and suggest the best practices to follow.