Total experience: 4.10 years, including 2.10 years as a Hadoop Developer delivering data-driven solutions on MS Azure that increase the efficiency, accuracy and utility of external and internal data processing, and 2.8 years as a Hadoop/Data Engineer in data migration (IBM Mainframe / MS SQL Server) and Azure Cloud. Performed operations such as data cleaning, data analysis, validation and ETL processing.

Tools: Airflow, InfluxDB, Snowflake, Hive, Sqoop, Oozie, Python, PySpark, PySpark SQL, Git, GitHub, Informatica, IBM CDC and Microsoft Azure.

Handled more than 50 TB of data across numerous data types. Strong command of Azure and of multiple file formats for storing, analyzing and processing data: text (regular text, fixed-length text, internet text, etc.), CSV with a variety of delimiters, XLS, Parquet and ORC. Processed 4.2 million web pages, approximately 30 GB of raw internet text.

Used InfluxDB (a time-series database) to manage transactional data within and outside SCB. Transactions are stored with timestamp, SSN, account number and beneficiary name so that transactions within and outside the country can be authenticated and later audited to prevent fraud and invalid transactions.

Provide technical leadership and governance for the Big Data / Data Science team and for the implementation of the solution architecture across the Hadoop ecosystem (Hortonworks / Cloudera / MapR). Configure and tune production and development Hadoop environments and their interacting Hadoop components. Develop technical presentations and proposals, and deliver customer presentations.

Certifications
-> Python – Data Science
-> Container & Kubernetes
-> Big Data Hadoop
-> Alteryx Core Designer - Alteryx
-> AHM 250 – HealthCare