- Involved in extracting data from various sources into Hadoop HDFS for processing.
- Experience in analyzing log files for Hadoop and ecosystem services to find root causes.
- Worked extensively on the Linux platform to set up CDH clusters of 40-plus nodes per customers’ requirements.
- Experience in commissioning, decommissioning, balancing, and managing nodes, and in tuning servers for optimal cluster performance.
- Integrated external vendor tools such as Unravel and SAS Viya with Hadoop for performance and monitoring.
- Involved in setting up Kerberos authentication.
- Developed build pipelines and wrote Python scripts to perform transformations on existing tables and new sources.
- Tested new releases and Hadoop components in lab clusters.
- Configured StreamSets to store the converted data in SQL Server using JDBC drivers.
- Experienced in using Ansible scripts to deploy Cloudera CDH 5.12.1 and set up a Hadoop cluster.
- Performed dry runs of the Ansible playbook to provision the OpenStack cluster and deploy CDH parcels.
- Participated in requirements gathering for the project and in documenting the business requirements.
Bachelor’s degree in Computer Science or a closely related field