    Senior Data Engineer (AI and ML frameworks)

    Type of work: Full-time
    Experience: Senior
    Employment Type: B2B
    Operating mode: Remote
    Sigma Software

    Sigma Software is a global software development company that enables enterprises, startups, and product houses to meet their technology needs through end-to-end delivery. We have been operating since 2002, with teams all over the world.


    Tech stack

      • Python: advanced
      • Big Data: advanced
      • Kubernetes: advanced
      • GCP: advanced
      • AI: regular
      • English: B2

    Job description

    Online interview
    Friendly offer

    We are looking for a talented Senior Data Engineer with a strong background in developing or contributing to microservices-based applications built on a Kappa architecture. The project aims to unify data sourced from different EHR systems in the healthcare domain, using the FHIR data format.


    Customer

    Our client is a leading analytics company operating at the intersection of technology, artificial intelligence, and big data. They support manufacturers and retailers in the fast-moving consumer goods sector, helping them better understand market dynamics, uncover consumer behavior insights, and make data-driven business decisions.


    Project

    The project aims to unify data sourced from various EHR systems in the healthcare domain using the FHIR data format. The company’s proprietary technology platform combines high-quality data, deep industry expertise, and advanced predictive algorithms built over decades of experience in the field.


    Responsibilities

    1. Data Standardization and Transformation:
    • Convert diverse data structures from various EHR systems into a unified format based on FHIR standards
    • Map and normalize incoming data to the FHIR data model, ensuring consistency and completeness
    2. Kafka Integration:
    • Consume and process events from the Kafka stream produced by the Data Writer Module
    • Deserialize and validate incoming data to ensure adherence to required standards
    3. Data Segmentation:
    • Separate data streams for warehousing and AI model training, applying specific preprocessing steps for each purpose
    • Prepare and validate data for storage and machine learning model training
    4. Error Handling and Logging:
    • Implement robust error handling mechanisms to track and resolve data mapping issues
    • Maintain detailed logs for auditing and troubleshooting purposes
    5. Data Ingestion and Processing:
    • Use LLMs to extract structured data from EHRs, research articles, and clinical notes
    • Ensure semantic consistency and interoperability during data ingestion
    6. Knowledge Graph Construction:
    • Integrate extracted data into a knowledge graph, representing entities and relationships for semantic data integration
    • Implement contextual understanding and querying of complex relationships within the knowledge graph (KG)
    7. Advanced Predictive Modeling:
    • Leverage KGs and LLMs to enhance data interoperability and predictive analytics
    • Develop frameworks for contextualized insights and personalized medicine recommendations
    8. Feedback Loop:
    • Continuously update the knowledge graph with new data using LLMs, ensuring up-to-date and relevant insights
    9. Cross-Functional Collaboration:
    • Work closely with data scientists, AI specialists, and software engineers to design and implement data processing solutions
    • Communicate effectively with stakeholders to align on goals and deliverables
    10. Contribute to Engineering Culture:
    • Foster a culture of innovation, collaboration, and continuous improvement within the engineering team
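    To give a feel for the Kafka integration and FHIR standardization work described above, here is a minimal, self-contained sketch. All field names, the event shape, and the source system are hypothetical; a real pipeline would use a Kafka consumer client and a schema registry rather than an in-memory payload.

    ```python
    import json

    # Hypothetical raw event as it might arrive from the Data Writer
    # Module's Kafka topic (source-specific field names, JSON-encoded).
    RAW_EVENT = json.dumps({
        "source_system": "ehr_a",
        "patient_id": "12345",
        "family_name": "Kowalski",
        "given_name": "Jan",
        "birth_date": "1980-04-02",
    }).encode("utf-8")

    REQUIRED_FIELDS = {"patient_id", "family_name", "given_name", "birth_date"}

    def deserialize_and_validate(payload: bytes) -> dict:
        """Deserialize a message and check that required fields are present."""
        record = json.loads(payload)
        missing = REQUIRED_FIELDS - record.keys()
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
        return record

    def to_fhir_patient(record: dict) -> dict:
        """Map a source-specific record to a minimal FHIR Patient resource."""
        return {
            "resourceType": "Patient",
            "id": record["patient_id"],
            "name": [{"family": record["family_name"],
                      "given": [record["given_name"]]}],
            "birthDate": record["birth_date"],
        }

    patient = to_fhir_patient(deserialize_and_validate(RAW_EVENT))
    ```

    The same consume → validate → normalize shape applies whatever the serialization format; with Avro the deserialization step would consult the schema registry instead of `json.loads`.
    
    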


    Requirements

    • Deep understanding of patterns and software development practices for event-driven architectures
    • Hands-on experience with stateful stream data processing solutions (Kafka or similar streaming platforms)
    • Strong knowledge of data serialization/deserialization using various data formats (at minimum JSON and Avro), and integration with schema registries
    • Proven Python software development expertise, with experience in data processing and integration (most of the software is written in Python)
    • Practical experience building end-to-end solutions with Apache Flink or a similar platform
    • Experience with containerization and orchestration using Kubernetes (K8s) and Helm, especially on Google Kubernetes Engine (GKE)
    • Familiarity with Google Cloud Platform (GCP) or a similar cloud platform
    • Hands-on experience implementing data quality solutions for schema-on-read or schema-less data
    • Hands-on experience integrating with Apache Kafka, particularly the Confluent Platform
    • Familiarity with AI and ML frameworks
    • Proficiency in SQL and experience with both relational and NoSQL databases
    • Experience with graph databases like Neo4j or RDF-based systems
    • Experience in the healthcare domain and familiarity with healthcare standards such as FHIR and HL7 for data interoperability
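    The data-quality requirement for schema-on-read data can be illustrated with a short sketch: since no schema is enforced on write, each record is checked as it is read, and records that fail are routed to a dead-letter collection. The field names and expected types here are hypothetical.

    ```python
    # Hypothetical expected types for a schema-on-read feed; records may
    # arrive with missing fields or wrong types, so each one is checked.
    EXPECTED_TYPES = {"patient_id": str, "age": int, "active": bool}

    def check_record(record: dict) -> list:
        """Return a list of data-quality issues for one record (empty if clean)."""
        issues = []
        for field, expected in EXPECTED_TYPES.items():
            if field not in record:
                issues.append(f"missing: {field}")
            elif not isinstance(record[field], expected):
                issues.append(f"wrong type: {field}")
        return issues

    records = [
        {"patient_id": "a1", "age": 41, "active": True},
        {"patient_id": "a2", "age": "n/a", "active": True},  # bad type for age
    ]
    clean = [r for r in records if not check_record(r)]
    dead_letter = [r for r in records if check_record(r)]
    ```

    In production the same idea is usually expressed through a validation library or a schema registry's compatibility checks rather than hand-rolled type maps.
    
    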


    Would be a plus:

    • Experience with web data scraping


    Personal Profile

    • Strong problem-solving skills, with the ability to design innovative solutions for complex data integration and processing challenges
    • Excellent communication skills, with the ability to articulate complex technical concepts and work effectively with various stakeholders
    • Commitment to improving healthcare through data-driven solutions and technology
    • Commitment to staying abreast of the latest technologies and industry trends while continually improving your skills and knowledge
    • Ability to work in a collaborative environment, valuing diverse perspectives and contributing to a positive team culture


    Undisclosed Salary
