Job Description
Contribute to the business value of data-oriented products built on an on-premises data lake or in cloud environments by implementing end-to-end data processing chains, from ingestion to API exposure and data visualization.
General responsibility: ensuring the quality of data transformed in the data lake, the proper functioning of data processing chains, and the efficient use of on-premises or cloud cluster resources by those chains.
General skills: experience in implementing end-to-end data processing chains and Big Data architectures; mastery of the languages and frameworks for processing massive data, in particular in streaming mode (Spark/Scala); practice of agile methods.
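By way of illustration, a minimal sketch of the kind of streaming chain meant here, in Spark/Scala. The Kafka topic name, broker address, and console sink are hypothetical, and the spark-sql-kafka connector is assumed to be on the classpath:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object StreamingIngestSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("streaming-ingest-sketch")
      .getOrCreate()
    import spark.implicits._

    // Ingest a stream of raw events from Kafka (topic and broker are hypothetical).
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "events")
      .load()

    // Minimal preparation step: decode the payload and count events per minute.
    val counts = raw
      .selectExpr("CAST(value AS STRING) AS payload", "timestamp")
      .withWatermark("timestamp", "10 minutes")
      .groupBy(window($"timestamp", "1 minute"))
      .count()

    // Write to the console for illustration; a real chain would target a proper
    // sink (e.g. a NoSQL store or an exposed data product).
    counts.writeStream
      .outputMode("append")
      .format("console")
      .start()
      .awaitTermination()
  }
}
```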
Role
You will set up end-to-end data processing chains in cloud environments and in a DevOps culture. You will work on brand-new products across a wide variety of functional areas (Engineering, Connected Vehicle, Manufacturing, IoT, Commerce, Quality, Finance), with a solid team to support you.
Main responsibilities
During the definition of the project
- Design of data ingestion chains
- Design of data preparation chains
- Design of basic ML algorithms
- Data product design
- Design of NoSQL data models
- Data visualization design
- Participation in the selection of services / solutions to be used according to usage
- Participation in the development of a data toolbox
During the iterative realization phase
- Implementation of data ingestion chains
- Implementation of data preparation chains (see the sketch after this list)
- Implementation of basic ML algorithms
- Implementation of data visualizations
- Use of ML frameworks
- Implementation of data products
- Exposure of data products
- Configuration of NoSQL databases
- Implementation of distributed processing
- Use of functional languages
- Debugging of distributed processing and algorithms
- Identification and cataloging of reusable items
- Contribution to the evolution of work standards
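A minimal sketch of the kind of batch preparation step listed above; the paths, column names, and cleaning rules are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object PreparationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("preparation-sketch")
      .getOrCreate()

    // Ingest raw records from the lake (path and schema are hypothetical).
    val raw = spark.read
      .option("header", "true")
      .csv("/datalake/raw/vehicles.csv")

    // Typical preparation: drop malformed rows, normalize a column, deduplicate.
    val prepared = raw
      .na.drop(Seq("vehicle_id"))
      .withColumn("country", upper(trim(col("country"))))
      .dropDuplicates("vehicle_id")

    // Persist the prepared dataset for downstream data products.
    prepared.write
      .mode("overwrite")
      .parquet("/datalake/prepared/vehicles")
  }
}
```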
During integration and deployment
- Participation in problem solving
During serial life
- Participation in the monitoring of operations
- Participation in problem solving
Required Skills
- Expertise in the implementation of end-to-end data processing chains
- Mastery of distributed development
- Basic knowledge of and interest in the development of ML algorithms
- Knowledge of ingestion frameworks
- Knowledge of Spark and its different modules
- Mastery of Scala
- Knowledge of the use of Solace
- Knowledge of the NoSQL database ecosystem
- Knowledge of building data product APIs
- Knowledge of data visualization tools and libraries
- Ease in debugging Spark and distributed systems
- Ability to explain complex systems in simple terms
- Command of data notebooks
- Expertise in data testing strategies (see the sketch below)
- Strong problem-solving skills, intelligence, initiative, and the ability to work under pressure
- Excellent interpersonal and communication skills (ability to go into detail)
For the future architecture from next year:
- Nice to have: knowledge of the GCP ecosystem (DataProc, DataFlow, BigQuery, Pub/Sub, PostgreSQL, Composer, Cloud Functions, StackDriver)
- Nice to have: knowledge of Beam and its different execution modes on DataFlow
- Nice to have: knowledge of Java
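On testing strategies, a minimal sketch of a unit test for a preparation step, assuming ScalaTest and a local SparkSession; the transformation mirrors the preparation sketch above and its column names are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.scalatest.funsuite.AnyFunSuite

// A minimal data-testing sketch: the transformation under test is exercised
// against a small, local, in-memory dataset rather than cluster data.
class PreparationSpec extends AnyFunSuite {
  private val spark = SparkSession.builder()
    .master("local[2]")
    .appName("preparation-spec")
    .getOrCreate()
  import spark.implicits._

  test("preparation drops rows without an id and deduplicates") {
    val raw = Seq(("v1", "fr"), ("v1", "fr"), (null, "de"))
      .toDF("vehicle_id", "country")

    // Same cleaning rules as the preparation sketch: drop null ids, deduplicate.
    val prepared = raw.na.drop(Seq("vehicle_id")).dropDuplicates("vehicle_id")

    assert(prepared.count() == 1)
  }
}
```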
Please send your CV and cover letter (in English) to:
Crystal System Group Ltd.
Attn.: HR Department
Str. Jiului 153, RO-013218 Bucharest
Email: talents@crystal-system.eu