Job Title: Database Architect Intern (ETL, Data Lakes, Graph SQL)
Location: Thiruvananthapuram, Kerala
Job Type: Full-time, Fresher, Internship
Job Description:
Join an innovative company at the forefront of developing cutting-edge AI-based decision intelligence products. We are seeking a talented Database Architect Intern to design, implement, and optimize data infrastructure for machine learning and AI applications. In this role, you will focus on building scalable ETL pipelines, managing large-scale data environments, and modeling complex data relationships, including graph databases, to power AI-driven solutions.
Key Responsibilities:
- ETL & Data Pipeline Design: Architect and implement efficient ETL pipelines to ingest, transform, and load structured, semi-structured, and unstructured data from diverse sources.
- Graph SQL Modeling: Develop and maintain graph-based data models (e.g., Neo4j, Amazon Neptune) to represent data relationships supporting AI-driven recommendations and knowledge graphs.
- Data Lake & Warehouse Management: Manage large volumes of data in data lakes and warehouses, ensuring efficient organization and accessibility for analysis and ML model development.
- Data Modeling: Create flexible data models for machine learning needs, including feature stores and entity-relationship structures.
- ETL Optimization: Optimize ETL workflows to reduce latency and improve data transformation accuracy for ML applications.
- Collaboration with ML Teams: Work closely with data scientists, ML engineers, and analysts to structure data for model training and inference.
- Data Quality & Integration: Ensure data integration from multiple sources with robust validation and ongoing monitoring.
- Performance Tuning: Tune graph and relational database performance to support real-time ML/AI applications.
- Documentation & Data Governance: Develop documentation for ETL processes, data lakes, and graph models while implementing best practices for data governance.
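As a sketch of the extract-transform-load pattern named in the responsibilities above, the following toy pipeline cleans a CSV source and loads it into a relational store. It is illustrative only: the table, column names, and sample data are hypothetical, and a real pipeline would use an orchestration tool such as Airflow rather than in-memory SQLite.

```python
import csv
import io
import sqlite3

# Hypothetical raw source: note the missing score on user 2.
RAW_CSV = """user_id,score
1, 85
2,
3, 92
"""

def extract(raw: str):
    """Extract: parse raw CSV text into row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw)))

def transform(rows):
    """Transform: drop rows with missing scores, cast fields to int."""
    cleaned = []
    for row in rows:
        score = row["score"].strip()
        if score:  # skip records with missing values
            cleaned.append((int(row["user_id"]), int(score)))
    return cleaned

def load(records, conn):
    """Load: insert cleaned records into a relational table."""
    conn.execute("CREATE TABLE IF NOT EXISTS scores (user_id INTEGER, score INTEGER)")
    conn.executemany("INSERT INTO scores VALUES (?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
total = conn.execute("SELECT COUNT(*), SUM(score) FROM scores").fetchone()
print(total)  # (2, 177) -- one invalid row dropped during transform
```

Keeping each stage a separate function mirrors how production pipelines isolate validation (the "Data Quality" responsibility) from loading, so a failed transform never writes partial data.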
Qualifications:
- Hands-on experience in ETL, data integration, and pipeline management.
- Familiarity with graph databases (e.g., Neo4j, Amazon Neptune, TigerGraph) and Graph SQL.
- Proficient in using ETL tools (e.g., Apache NiFi, Talend, Airflow) and automating workflows.
- Strong knowledge of big data frameworks (e.g., Hadoop, Spark, Kafka) and cloud data-lake and warehouse platforms (AWS S3, Azure Data Lake Storage, Google BigQuery).
- Skilled in both SQL and NoSQL databases (e.g., PostgreSQL, MongoDB).
- Programming skills in Python or SQL for ETL and data manipulation.
- Experience with cloud-based data services (AWS, GCP, Azure) and data orchestration.
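To illustrate the kind of graph modeling the qualifications above refer to, here is a minimal in-memory sketch of a knowledge graph as (subject, relation, object) triples, with a one-hop co-purchase traversal of the sort that backs AI-driven recommendations. The node names and relation are invented for the example; a production system would express this as a query against a graph database such as Neo4j or Amazon Neptune.

```python
from collections import defaultdict

# Hypothetical edge list: each edge is a (subject, relation, object) triple.
triples = [
    ("alice", "PURCHASED", "laptop"),
    ("bob", "PURCHASED", "laptop"),
    ("bob", "PURCHASED", "mouse"),
]

# Index edges in both directions by (node, relation) for fast traversal.
out_edges = defaultdict(set)
in_edges = defaultdict(set)
for subj, rel, obj in triples:
    out_edges[(subj, rel)].add(obj)
    in_edges[(obj, rel)].add(subj)

def recommend(user):
    """Items bought by users who share a purchase with `user`."""
    recs = set()
    for item in out_edges[(user, "PURCHASED")]:          # user -> item
        for other in in_edges[(item, "PURCHASED")]:      # item <- other users
            if other != user:
                recs |= out_edges[(other, "PURCHASED")]  # their other items
    return recs - out_edges[(user, "PURCHASED")]         # exclude already owned

print(recommend("alice"))  # {'mouse'}
```

The double index (outgoing and incoming edges) is the core design choice: it makes relationship traversal constant-time per hop, which is the property graph databases optimize for and relational joins struggle with at scale.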
Preferred Qualifications:
- Relevant certifications in data engineering, cloud data services, or graph databases.
- Experience with MLOps tools like MLflow or Kubeflow for integrating data transformation into ML workflows.
Application Questions:
- What DB architecture and platforms would you prefer when dealing with complex, highly customized, or large-scale continuously changing data? Provide examples from recent projects.
- In a startup environment, what role would you like to play in platform selection, infrastructure installation, and database design?
- What is the biggest challenge you have faced as a Database Architect, and how did you overcome it?
- What do you consider the most important value for an ideal Database Architect and why?
Work Location: In person