As a Big Data Administrator with a focus on Hadoop, you will be responsible for the deployment, configuration, management, and optimization of our Hadoop ecosystem. You will ensure the smooth operation of our big data platforms, manage data workflows, and provide support for data analytics initiatives.
Key Responsibilities:
- Hadoop Administration: Deploy, configure, and maintain Hadoop clusters, including HDFS, YARN, and MapReduce components. Ensure high availability, performance, and reliability of the Hadoop environment.
- System Monitoring: Monitor the health and performance of Hadoop clusters, diagnose and resolve issues, and perform regular maintenance tasks to ensure optimal operation.
- Data Management: Manage data ingestion, storage, and processing workflows, ensuring data integrity and security. Implement data lifecycle management practices.
- Performance Tuning: Optimize Hadoop cluster performance by tuning configurations, managing resources, and implementing best practices for efficient data processing.
- Security: Implement and manage security policies for Hadoop, including user access controls, data encryption, and compliance with data protection regulations.
- Backup and Recovery: Develop and maintain backup and recovery procedures for Hadoop data. Ensure data is backed up regularly and can be restored in case of failure.
- Collaboration: Work closely with data engineers, data scientists, and application developers to support data-related projects and provide insights into system performance and data accessibility.
- Documentation: Create and maintain comprehensive documentation for the Hadoop infrastructure, including configuration details, procedures, and troubleshooting guides.
- Upgrades and Patches: Plan and execute system upgrades and patches for Hadoop components to keep the environment current with the latest features and security updates.