Required:
- Working knowledge of Linux systems and distributed systems is a must.
- Knowledge of a scripting language such as Python or Scala.
- Hands-on experience writing Spark or MapReduce jobs and a proficient understanding of distributed computing principles.
- Experience with Lambda Architecture and building the required infrastructure.
- Proficiency in more than one of Python, Java, Scala, and shell scripting.
- Experience integrating data from multiple data sources.
- Experience with messaging systems such as Kafka.

Preferred:
- Understanding of Spark and HDFS internals.
- Experience with Big Data querying tools such as Hive, Pig, and Impala.
- Experience with NoSQL databases such as HBase, Cassandra, and MongoDB.
- Experience with resource managers such as YARN and Mesos.
- Experience with stream-processing systems such as Storm and Spark Streaming.
- Knowledge of Lucene, Solr, Elasticsearch, or any similar technology is a plus.
- Prior experience developing segmentation and recommendation systems is recommended.
- Experience with reporting tools such as Apache Zeppelin.

** Candidates should be from IIT or NIT only.