Data Architect – M4

Description

Company Description:
Krish is committed to helping our customers achieve their technology goals. We always
treat our customers’ success as our top priority and focus on building long-term,
productive relationships. Krish’s goal of adding the best value for its customers with a
combination of the right technology, the right people, and the right costs is achieved through
the experience and integrity of our consultants and our custom delivery processes.

Profile overview:
Player-coach role. Azure-primary. Builds the data foundation that AI runs on. We’re hiring a
Data Architect to design and lead the data foundations that make our AI work — both
inside our CoE and across the client engagements we deliver. You’ll own data modeling,
ETL/ELT design, migration strategy, and the Data Readiness for AI service offering our
consultants take to market. Azure is our primary cloud (Microsoft Fabric, Synapse, Azure
Data Factory, ADLS, Azure SQL, Purview), with hands-on open-source fluency where it earns
its keep (Spark, Airflow, dbt, Kafka). You’ll partner closely with the AI Architect on the data
layers behind RAG, agents, and ML systems; you’ll lead heterogeneous data migrations
including SQL Server and MariaDB transitions; and you’ll grow a small team of data
engineers along the way.

Job Description

Roles and Responsibilities:
• Reference data architectures. Canonical Azure-based blueprints — medallion
lakehouse on Microsoft Fabric / Synapse + ADLS Gen2, operational stores on Azure
SQL / Cosmos DB / Azure Database for MariaDB and PostgreSQL, governance and
lineage on Microsoft Purview. Versioned, opinionated, with working sample
implementations.
• Data modeling end-to-end. Conceptual → logical → physical. Dimensional
(Kimball), Data Vault, medallion (bronze/silver/gold), and normalized OLTP schemas.
You pick the right one for the job and document the why.
• ETL/ELT and data pipelines. Design and oversee pipelines using Azure Data
Factory, Synapse pipelines, Fabric Data Engineering, and Azure Databricks — with
dbt, Spark, and Airflow where they’re a better fit. Streaming with Event Hubs, Kafka,
or Stream Analytics where the use case demands it.
• Migration leadership. Plan and execute heterogeneous data migrations —
including SQL Server ↔ MariaDB / MySQL / PostgreSQL — covering schema
conversion, CDC-based low-downtime cutover (Azure Database Migration Service,
Debezium), reconciliation, and post-migration validation. Build the migration
playbooks the CoE reuses.
• Data Readiness for AI — the service offering. Build out and run the productized
client offering: data quality assessments, schema rationalization, master/reference
data, lineage and cataloging on Purview, governance scaffolding, and the playbook
that gets a client’s data ready to power AI/ML workloads. This is something the
practice sells; you’re the technical owner.
• AI/ML data layer in the CoE. Partner with the AI Architect on the data layer behind
production AI: retrieval indices on Azure AI Search and Cosmos DB vector,
document chunking and embedding pipelines, feature stores, training and eval
dataset curation, and the data quality bar required for AI to actually work.
• Governance, quality, and lineage. Cataloging, classification, sensitive-data
handling, and lineage on Microsoft Purview. Data quality frameworks (Great
Expectations, Soda, or native Fabric/Synapse checks). The governance posture that
lets us operate in regulated client environments.
• R&D and team enablement. Track what’s shipping in Azure data (Fabric, OneLake,
Purview, AI Search) and the open-source ecosystem (Iceberg, Delta, Hudi, dbt, Spark,
Airflow), prototype what matters, run internal workshops and labs, and level up the
data engineering team.
• Presales partnership. Senior data voice on qualified pursuits — technical
discovery, scoping data engagements, shaping solution architectures, contributing
to SOWs and proposals, presenting to client data leaders and CDOs.
• Team leadership. Lead and grow a small team of data engineers. Run architecture
reviews, code reviews, and design sessions across the CoE, and partner closely with
client delivery teams.

What we’re looking for
• 12+ years in data engineering and architecture roles, with hands-on production
experience across OLTP, OLAP / analytics, and modern lakehouse workloads.
• Deep on relational systems. Production-grade experience with SQL Server (T-SQL,
indexing, query tuning, Always On / HA topologies, security models) and MariaDB /
MySQL. You’ve designed schemas, tuned slow queries, and recovered systems
under pressure.
• Migration experience. You’ve led at least one substantial heterogeneous data
migration (cross-RDBMS, on-prem-to-cloud, or platform-to-platform), including schema
translation, data movement strategy, CDC, cutover planning, and validation.
• Deep on the Azure data stack. Microsoft Fabric (OneLake, Data Factory, Synapse
Data Engineering, Real-Time Intelligence, Power BI), Azure Synapse Analytics, Azure
Data Factory, Azure Databricks, Azure SQL / Managed Instance, Azure Database for
MariaDB / MySQL / PostgreSQL, ADLS Gen2, Cosmos DB, Microsoft Purview, Event
Hubs, Stream Analytics.
• Modeling depth. Dimensional (Kimball), Data Vault, medallion / lakehouse,
normalized OLTP. You can defend the choice in front of a client architect.
• Strong open-source fluency. Apache Spark, Airflow, dbt, Kafka / Debezium, open
table formats (Iceberg, Delta, Hudi), PostgreSQL. Data quality tooling like Great
Expectations or Soda.
• AI/ML data prep experience. Building training and eval datasets, feature
engineering, vector embedding pipelines, RAG-ready document processing, and
partnering closely with data scientists and ML engineers.
• Production discipline. Infrastructure-as-code (Bicep / Terraform), CI/CD for data
pipelines, observability, testing — the unglamorous things that separate prototypes
from production data platforms.
• Presales-capable. Comfortable in front of clients: discovery, whiteboarding
solutions, defending architecture decisions, contributing to proposals and SOWs.
• Experience leading small engineering teams. You’ve managed at least 2–3
engineers before, balanced shipping with people development, and know when to
step in versus step back.
• Clear technical communication. You can talk shop with a CDO and pair with a
junior engineer in the same afternoon.

What success looks like in year one
• A published set of CoE reference data architectures — Fabric lakehouse blueprint,
migration playbook, governance pattern — with working Azure sample
implementations.
• The Data Readiness for AI service offering is built, sold, and delivered on at least a
handful of client engagements.
• Ownership of a data migration project (for example, a SQL Server / MariaDB transition or
a similar heterogeneous move) executed under your architecture leadership.
• The data layer powering CoE AI systems — retrieval indices, feature data, training
and eval sets — is measurably more reliable, better-governed, and easier to extend
than it was when you started.
• A running R&D and training cadence: regular evaluations of new Azure data and OSS
releases, an internal workshop / lab program, and visible upskilling across the data
engineering team.
• Direct contribution to won deals — you’ve been the senior data architect on at least
a handful of presales pursuits that closed.
• The engineering bar for data work across the CoE is visibly higher because of the
patterns, reviews, and standards you’ve established.

Nice to have
Microsoft certifications (Azure Data Engineer Associate, Fabric Analytics Engineer Associate,
Azure Solutions Architect Expert) or equivalent depth. Open-source contributions in the
data / ML space. Experience operating data platforms under regulatory constraints (HIPAA,
SOC 2, PCI, financial services). Background in services or consulting environments where
the same architecture serves multiple clients.
