Lead Software Engineer (PySpark/Python)
Locations:
Noida, Uttar Pradesh, India
Gurgaon, Haryana, India
Hyderabad, Telangana, India
Indore, Madhya Pradesh, India
Bangalore, Karnataka, India
Experience: 8 to 10 years
Job Reference Number: 13025
Qualifications
5–7 years of strong hands-on experience with Big Data technologies – PySpark (DataFrame and SparkSQL), Hadoop, and Hive.
Proficient in Python and Bash scripting.
Solid understanding of SQL and data warehouse concepts.
Strong analytical and problem-solving skills, with the ability to perform in-depth data analysis.
Innovative thinker able to devise custom solutions rather than relying solely on off-the-shelf tools.
Excellent communication, presentation, and interpersonal skills.
Good to Have:
Hands-on experience with AWS Big Data services (IAM, Glue, EMR, Redshift, S3, Kinesis).
Experience with workflow orchestration tools like Apache Airflow.
Experience in workload migration from on-premises to cloud and from cloud to cloud.
Skills Required
Python
PySpark
AWS (S3, Glue, EMR, Redshift, IAM)
Role & Responsibilities
Develop efficient ETL pipelines tailored to business requirements while following best development practices.
Perform integration testing of pipelines in the AWS environment.
Provide time estimates for development, testing, and deployment tasks.
Participate in code reviews to ensure compliance with coding standards.
Design cost-effective AWS pipelines using services such as S3, IAM, Glue, EMR, and Redshift.