Responsibilities
- Architect and deploy code-based governance systems leveraging native Google Cloud Platform tools
- Develop automated pipelines for classifying data, tracking lineage, validating quality, and generating compliance reports aligned with LGPD, GDPR, and ISO 27001
- Construct modular Infrastructure-as-Code components using Terraform to standardize project layouts, IAM configurations, security rules, audit logging, and data retention settings
- Operationalize the Governance Seal framework by automating checks that ensure datasets and pipelines meet baseline requirements in quality, security, documentation, and cost before production deployment
- Integrate financial operations (FinOps) best practices into Center of Excellence (CoE) templates, including data partitioning, clustering methods, storage tier selection, lifecycle rules, query efficiency, and compute scaling
- Create systems to monitor and alert on usage costs across Google Cloud data services
- Detect cost inefficiencies such as idle tables, duplicate workflows, suboptimal queries, and oversized resources, then resolve them directly or guide the owning teams through remediation
- Convert technical design documents, infrastructure modules, pipeline blueprints, and knowledge transfer sessions into standardized, tested, version-controlled, and reusable CoE assets
- Assemble a library of pipeline templates for frequent scenarios, including batch imports from databases, APIs, and files; real-time streaming via Pub/Sub or Kafka; change data capture; data contracts; and serving layers optimized for low-latency access and analytics
- Develop self-service tools for data stewards and engineers, including project initialization scripts, data contract generators, quality rule presets, lineage visualizers, and metadata enhancers
- Establish continuous integration and delivery workflows for data pipelines featuring automated testing for schema integrity, data quality, and security, along with deployment gates and rollback procedures
- Design data models with built-in governance, including defined ownership, semantic clarity, access controls, retention policies, and end-to-end lineage tracking
- Tune storage and compute layers for optimal performance and cost through partitioning, clustering, indexing, compression, efficient file formats such as Parquet and Avro, and modern table formats such as Delta Lake and Apache Iceberg
- Set specifications for data contracts covering schema, service-level agreements, ownership, and change processes, and implement them programmatically using Avro, Protobuf, dbt, or validation frameworks
- Develop and integrate automated data quality validation mechanisms
- Implement comprehensive data lineage tracking from source systems to downstream analytics and AI applications using tools like Dataplex, OpenLineage, or custom metadata extractors
- Build monitoring and alerting systems for governance metrics including policy adherence, data timeliness, quality violations, access irregularities, and budget overages
- Define and enforce service-level agreements and objectives for data delivery, accuracy, and uptime across raw, trusted, and curated data layers
- Serve as a technical advisor and thought partner for data stewards, engineers, and analytics teams adopting governance standards
- Lead technical design reviews and deep-dive sessions on governance, cost optimization, and platform architecture
- Create practical documentation such as playbooks, runbooks, and training labs covering topics like governed project setup, BigQuery cost management, and dynamic data masking
- Organize and facilitate community knowledge-sharing sessions to disseminate best practices, resolve challenges, recognize achievements, and incorporate feedback