Required:
- Working knowledge of Linux and distributed systems.
- Hands-on experience writing Spark or MapReduce jobs, with a strong understanding of distributed computing principles.
- Proficiency in more than one of Python, Java, Scala, and shell scripting.
- Experience integrating data from multiple data sources.

Preferred:
- Understanding of Spark and HDFS internals.
- Experience with Big Data querying tools such as Hive, Pig, and Impala.
- Experience with NoSQL databases such as HBase, Cassandra, and MongoDB.
- Experience with resource managers such as YARN and Mesos.
- Experience with stream-processing systems such as Storm and Spark Streaming.
- Knowledge of Lucene, Solr, Elasticsearch, or similar search technology is a plus.
- Prior experience developing segmentation and recommendation systems is a plus.
- Experience with reporting tools such as Apache Zeppelin.

Note: Candidates should be from IIT or NIT only.