Sr. Big Data Engineer with Spark
Pearl Consulting Services
Sterling, Virginia, United States
Job type: Full-time
Job industry: I.T. & Communications
- The Sr. Data Engineer should be a technical contributor with hands-on knowledge of all phases of building large-scale, cloud-based distributed data processing systems and applications. You will be part of the Global Data & Analytics engineering technology team and will partner closely with a team of data scientists, business analysts & data engineers leading Discovery's cloud-based Big Data & Analytics strategy.
- You'll work on implementing complex AWS-based big data projects with a focus on collecting, parsing, managing, analyzing and visualizing large sets of data to turn information into insights using multiple technology platforms. Therefore, this role requires an understanding of how a secure big data cloud environment is architected to gain real insights faster, with less friction and complexity. The Sr. Data Engineer should be passionate about using cutting-edge technologies to solve problems and about developing prototypes with different open source tools for the selected solutions.
- You'll need to be an innovative forward-thinker who will help lead end-to-end execution of data engineering initiatives and contribute directly to existing and emerging business strategies and goals. Creativity, attention to detail, and the ability to work in a collaborative team environment are essential.
- The Sr. Data Engineer will work closely with the Data Engineering Manager to decide on the required infrastructure architecture and software design, and will then implement those decisions.
Key Areas of Responsibility
- Lead the design, implementation, and continuous delivery of pipelines using distributed AWS-based big data technologies supporting data processing initiatives across batch and streaming datasets
- Develop in Scala and Python using big data frameworks and services such as Spark, EMR, Presto, AWS Athena, Kafka, Zeppelin, and Kinesis
- Provide administrative support on deployed AWS platform components
- Identify, evaluate and implement cutting-edge big data pipelines and frameworks required to provide requested capabilities and to integrate external data sources and APIs
- Review, analyse and evaluate market requirements, business requirements and project briefs in order to design the most appropriate end-to-end technology solutions
- Process and manage high-volume, real-time customer interaction streams
- Provide architectural support by building proofs of concept and prototypes
- Act as a self-starter who delivers data engineering solutions that optimize both the cost and the performance of existing solutions
- Stay current with emerging technologies and industry trends
Requirements
- Bachelor's degree or higher in Computer Science or a similar field
- Minimum of 5-6 years of software industry experience
- 3+ years of development experience with AWS services. Must have: EC2, EMR, Redshift, Data Pipeline or Airflow, S3, CloudFormation, CLI, and Jenkins
- 3+ years of development experience with Apache Spark, Presto, SQL, notebooks, and NoSQL implementations
- 4+ years of extensive working knowledge of different programming languages: Scala (must), Shell, and Python (must)
- Proficiency working with structured, semi-structured and unstructured data sets, including social data, web logs and real-time streaming data feeds
- Able to tune Big Data solutions to improve performance and end-user experience
- Knowledge of visualization and data science tools
- Expert-level experience with Jenkins and GitHub is preferred
- Spark developer certification is a plus
- Ability and eagerness to constantly learn and teach others
- Experience in the media industry is a plus