• 7+ years of technical expertise across domains including finance and telecom, with hands-on experience in Big Data analytics design and development
• 4+ years of experience in Big Data analytics and data manipulation using Hadoop ecosystem tools such as MapReduce, HDFS, YARN/MRv2, Pig, Hive, HBase, Spark, Kafka, Sqoop, Oozie, Avro, and Kerberos, including Spark integration with Cassandra and ZooKeeper
• Rich experience designing and developing applications in Apache Spark, Scala, Java, Kafka, and Hive within the Hadoop ecosystem
• Strong experience with Hadoop distributions such as Cloudera and Hortonworks
• Hands-on experience with the Spark RDD, DataFrame, and Dataset APIs for processing structured and unstructured data
• Proficient in writing real-time stream-processing jobs using Spark Structured Streaming with Kafka as the data pipeline
• Well versed in writing jobs using Spark, Pig, and Hive for data extraction, transformation, and aggregation across multiple file formats, including Parquet, Avro, ORC, XML, JSON, and CSV, and compression codecs such as GZIP, Snappy, and LZO
• Experienced in working with Spring MVC, Java, and Spark to create ETL workflows, and with Power BI for dashboard visualization