Responsibilities
- Extracting and combining data from heterogeneous data sources
- Building and migrating complex ETL pipelines from source systems to S3, Redshift, and/or Elastic MapReduce so the system can scale elastically
- Building an understanding of the full project life cycle: analysis, design, development, test, release, and support
- Manipulating, processing, and extracting value from large datasets
Desired Profile (Experience, Key Skills)
- Experience with Life Sciences or Pharma data is a significant plus
- Demonstrated ability in data modeling, ETL development, and data warehousing
- Demonstrated experience manipulating, processing, and extracting value from large datasets
- Excellent problem-solving, analytical, technical, interpersonal, and communication skills; an effective team player with the ability to take on new roles
- 4-5 years of hands-on experience with AWS services such as S3, Glue ETL, Glue Catalog, Athena, EMR with PySpark, and Redshift Spectrum
- 2-3 years of hands-on experience with Hadoop ecosystem components such as HDFS, Hive, Spark, Sqoop, MapReduce, and YARN
- Hands-on experience with ETL tools such as Talend and AWS Glue
- Good working experience with databases such as MS SQL Server, Oracle, and MySQL
- Working experience with AWS data warehousing and database platforms (Redshift, Athena, Aurora)
- Hands-on experience with Python, including functional programming
- Deep SQL coding experience