Prompts

Tested prompts that power your agents automatically.


Data Engineering

Metrics Layer Architecture

Architect a centralized metrics layer providing consistent metric definitions, dimensional modeling, and governed self-service access.

Experimentation Platform Design

Design an experimentation and A/B testing platform covering assignment, statistical rigor, feature flags integration, and result analysis.
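
A minimal sketch of the assignment piece, assuming salted-hash bucketing (the experiment name, user id, and function name are illustrative):

```python
import hashlib

def assign_variant(experiment_id: str, user_id: str,
                   variants=("control", "treatment"),
                   weights=(0.5, 0.5)) -> str:
    """Deterministically bucket a user into a variant using a salted hash.

    The same (experiment_id, user_id) pair always maps to the same variant,
    so assignment stays stable across sessions without storing state.
    """
    digest = hashlib.sha256(f"{experiment_id}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF  # uniform value in [0, 1]
    cumulative = 0.0
    for variant, weight in zip(variants, weights):
        cumulative += weight
        if bucket <= cumulative:
            return variant
    return variants[-1]

print(assign_variant("checkout-redesign", "user-42"))
```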

Analytics Dashboard Architecture

Architect analytics dashboards covering data modeling, visualization design, performance optimization, and self-service access patterns.

Analytics Event Tracking Design

Design a structured analytics event tracking plan covering naming conventions, event schemas, instrumentation, and data quality validation.
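
A minimal sketch of tracking-plan validation, assuming a hypothetical plan that maps snake_case event names to required properties:

```python
from datetime import datetime, timezone

# Hypothetical tracking plan: event names mapped to required property types.
TRACKING_PLAN = {
    "checkout_completed": {"order_id": str, "revenue_usd": float, "item_count": int},
    "search_performed": {"query": str, "results_count": int},
}

def validate_event(name: str, properties: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the event is valid."""
    errors = []
    schema = TRACKING_PLAN.get(name)
    if schema is None:
        return [f"unknown event name: {name!r}"]
    for field, expected_type in schema.items():
        if field not in properties:
            errors.append(f"{name}: missing required property {field!r}")
        elif not isinstance(properties[field], expected_type):
            errors.append(f"{name}: {field!r} should be {expected_type.__name__}")
    return errors

event = {"order_id": "o-123", "revenue_usd": 59.99, "item_count": 2,
         "timestamp": datetime.now(timezone.utc).isoformat()}
print(validate_event("checkout_completed", event))  # []
```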

Real-Time Analytics Architecture

Design real-time analytics systems enabling sub-second query responses on streaming data for dashboards and applications.

dbt Project Architecture

Design a production dbt project structure with modeling layers, testing strategies, and CI/CD for analytics engineering.

Data Vault Modeling

Design Data Vault 2.0 models with hubs, links, and satellites for auditable enterprise data warehousing.
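
A minimal sketch of the hash-key convention Data Vault loads typically rely on (column names and record source are illustrative):

```python
import hashlib
from datetime import datetime, timezone

def hash_key(*business_keys: str) -> str:
    """Data Vault-style hash key: MD5 over the concatenated, normalized business key."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode()).hexdigest()

load_ts = datetime.now(timezone.utc).isoformat()

# Hub row: one row per distinct business key.
hub_customer = {"customer_hk": hash_key("C-1001"), "customer_id": "C-1001",
                "load_ts": load_ts, "record_source": "crm"}

# Satellite row: descriptive attributes, change-tracked via a hashdiff.
sat_customer = {"customer_hk": hub_customer["customer_hk"],
                "hashdiff": hash_key("Ada Lovelace", "London"),
                "name": "Ada Lovelace", "city": "London",
                "load_ts": load_ts, "record_source": "crm"}

print(hub_customer["customer_hk"], sat_customer["hashdiff"])
```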

Dimensional Modeling

Design dimensional models (star and snowflake schemas) for analytics workloads with proper fact and dimension tables.
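
A minimal star-schema sketch using an in-memory SQLite database (table and column names are illustrative):

```python
import sqlite3

# One fact table with foreign keys into two dimension tables.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT, year INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, sku TEXT, category TEXT);
CREATE TABLE fact_sales  (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    quantity     INTEGER,
    revenue_usd  REAL
);
""")
con.execute("INSERT INTO dim_date VALUES (20240115, '2024-01-15', '2024-01', 2024)")
con.execute("INSERT INTO dim_product VALUES (1, 'SKU-42', 'widgets')")
con.execute("INSERT INTO fact_sales VALUES (20240115, 1, 3, 29.97)")

# Typical analytics query: aggregate facts, slice by dimension attributes.
for row in con.execute("""
    SELECT d.month, p.category, SUM(f.revenue_usd)
    FROM fact_sales f
    JOIN dim_date d    ON f.date_key = d.date_key
    JOIN dim_product p ON f.product_key = p.product_key
    GROUP BY d.month, p.category
"""):
    print(row)
```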

Data Governance Framework

Design a data governance framework covering classification, access control, privacy compliance, and stewardship processes.

Data Catalog & Discovery

Design a data catalog and discovery platform enabling teams to find, understand, and trust organizational data assets.

Data Mesh Architecture

Design a data mesh architecture with domain-oriented ownership, a self-serve data platform, and federated governance for data at scale.

Lakehouse Architecture

Design a lakehouse architecture combining data lake flexibility with warehouse reliability using Delta Lake, Iceberg, or Hudi.
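
A minimal Delta Lake sketch, assuming PySpark with the delta-spark package installed (the table path and columns are illustrative):

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

# Spark session configured for Delta Lake.
spark = (SparkSession.builder
         .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
         .config("spark.sql.catalog.spark_catalog",
                 "org.apache.spark.sql.delta.catalog.DeltaCatalog")
         .getOrCreate())

path = "/tmp/lakehouse/orders"  # illustrative table location

# Initial load: ACID write to data-lake storage.
spark.createDataFrame(
    [(1, "placed", "2024-01-15")], ["order_id", "status", "updated_at"]
).write.format("delta").mode("overwrite").save(path)

# Upsert (MERGE): warehouse-style mutation on lake files.
updates = spark.createDataFrame(
    [(1, "shipped", "2024-01-16"), (2, "placed", "2024-01-16")],
    ["order_id", "status", "updated_at"])

(DeltaTable.forPath(spark, path).alias("t")
 .merge(updates.alias("s"), "t.order_id = s.order_id")
 .whenMatchedUpdateAll()
 .whenNotMatchedInsertAll()
 .execute())

spark.read.format("delta").load(path).show()
```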

Data Quality Gate Design

Design automated data quality gates that validate data at pipeline boundaries before it reaches downstream consumers.
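
A minimal gate-check sketch with pandas, assuming illustrative column names and rules:

```python
import pandas as pd

def quality_gate(df: pd.DataFrame) -> list[str]:
    """Return a list of failures; promote the batch downstream only if it is empty."""
    failures = []
    if len(df) == 0:
        failures.append("empty batch")
    if df["order_id"].isna().any():
        failures.append("null order_id values")
    if df["order_id"].duplicated().any():
        failures.append("duplicate order_id values")
    if (df["revenue_usd"] < 0).any():
        failures.append("negative revenue_usd values")
    return failures

batch = pd.DataFrame({"order_id": ["o-1", "o-2"], "revenue_usd": [19.99, 5.00]})
failures = quality_gate(batch)
if failures:
    raise ValueError(f"quality gate failed, quarantining batch: {failures}")
print("gate passed, publishing batch")
```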

Change Data Capture Pipeline

Design CDC pipelines using Debezium, Kafka Connect, or similar tools for real-time database replication and event sourcing.
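
A minimal sketch of registering a Debezium Postgres connector through the Kafka Connect REST API, assuming Debezium 2.x property names; hostnames, credentials, and table names are placeholders:

```python
import json
import urllib.request

# Debezium Postgres source connector config.
connector = {
    "name": "orders-cdc",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "replicator",
        "database.password": "secret",
        "database.dbname": "shop",
        "topic.prefix": "shop",
        "table.include.list": "public.orders",
    },
}

# Register the connector with the Kafka Connect REST API.
req = urllib.request.Request(
    "http://localhost:8083/connectors",
    data=json.dumps(connector).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print(resp.status, resp.read().decode())
```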

Data Pipeline Monitoring

Design observability and monitoring systems for data pipelines covering freshness, quality, volume, and lineage tracking.
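
A minimal freshness-check sketch, assuming the last load time is available from pipeline metadata (table names and SLOs are illustrative):

```python
from datetime import datetime, timedelta, timezone

def check_freshness(table: str, last_loaded_at: datetime, max_lag: timedelta) -> dict:
    """Compare a table's last load time against its freshness SLO."""
    lag = datetime.now(timezone.utc) - last_loaded_at
    return {
        "table": table,
        "lag_minutes": round(lag.total_seconds() / 60, 1),
        "status": "ok" if lag <= max_lag else "stale",
    }

# In practice last_loaded_at would come from pipeline metadata or MAX(loaded_at).
result = check_freshness(
    table="analytics.orders",
    last_loaded_at=datetime.now(timezone.utc) - timedelta(minutes=45),
    max_lag=timedelta(hours=1),
)
print(result)
if result["status"] == "stale":
    print(f"ALERT: {result['table']} is {result['lag_minutes']} minutes behind")
```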

Data Pipeline Testing

Build comprehensive testing strategies for data pipelines covering unit, integration, data quality, and end-to-end validation.
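
A minimal unit-test sketch for a transformation, using pandas and a pytest-style test function (the transform and column names are illustrative):

```python
import pandas as pd

def dedupe_orders(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the latest record per order_id, ordered by updated_at."""
    return (df.sort_values("updated_at")
              .drop_duplicates("order_id", keep="last")
              .reset_index(drop=True))

def test_dedupe_orders_keeps_latest_record():
    raw = pd.DataFrame({
        "order_id":   ["o-1", "o-1", "o-2"],
        "status":     ["placed", "shipped", "placed"],
        "updated_at": ["2024-01-01", "2024-01-02", "2024-01-01"],
    })
    result = dedupe_orders(raw)
    assert len(result) == 2
    assert result.loc[result["order_id"] == "o-1", "status"].item() == "shipped"
```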

Streaming Pipeline Design

Architect real-time streaming pipelines with Kafka, Flink, or Spark Streaming for low-latency data processing at scale.
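
A minimal consume-transform-produce sketch using the kafka-python client, with manual commits and a dead-letter topic (topic names and fields are illustrative):

```python
import json
from kafka import KafkaConsumer, KafkaProducer  # kafka-python package

consumer = KafkaConsumer(
    "orders.raw",
    bootstrap_servers="localhost:9092",
    group_id="orders-enricher",
    enable_auto_commit=False,          # commit only after processing each record
    value_deserializer=lambda v: json.loads(v.decode()),
)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

for message in consumer:
    event = message.value
    try:
        event["revenue_usd"] = event["quantity"] * event["unit_price_usd"]
        producer.send("orders.enriched", event)
    except (KeyError, TypeError):
        producer.send("orders.dlq", event)   # dead-letter topic for bad records
    consumer.commit()
```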

Batch Processing Pipeline

Design high-throughput batch processing pipelines with Spark or similar frameworks for large-scale data transformation.
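
A minimal PySpark batch-job sketch with dynamic partition overwrite so reruns are idempotent (paths and columns are illustrative):

```python
from pyspark.sql import SparkSession, functions as F

spark = (SparkSession.builder
         .appName("daily-orders-batch")
         .config("spark.sql.sources.partitionOverwriteMode", "dynamic")
         .getOrCreate())

# Read the raw table, keep one day's partition, aggregate, and write partitioned output.
orders = (spark.read.parquet("s3://raw/orders/")       # illustrative path
          .filter(F.col("dt") == "2024-01-15"))

daily_revenue = (
    orders
    .filter(F.col("status") != "cancelled")
    .groupBy("dt", "product_id")
    .agg(F.sum("revenue_usd").alias("revenue_usd"),
         F.count("*").alias("order_count"))
)

(daily_revenue.write
 .mode("overwrite")                 # dynamic mode replaces only the rewritten partition
 .partitionBy("dt")
 .parquet("s3://curated/daily_revenue/"))
```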

Pipeline Orchestration (Airflow)

Design Airflow-based pipeline orchestration with proper DAG patterns, dependency management, and operational excellence.
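
A minimal DAG sketch, assuming the Airflow 2.x TaskFlow API (DAG id, schedule, and tasks are illustrative):

```python
from datetime import datetime
from airflow.decorators import dag, task

@dag(
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args={"retries": 2},
    tags=["orders"],
)
def orders_daily():
    @task
    def extract() -> list[dict]:
        return [{"order_id": "o-1", "revenue_usd": 19.99}]

    @task
    def transform(rows: list[dict]) -> float:
        return sum(r["revenue_usd"] for r in rows)

    @task
    def load(total: float) -> None:
        print(f"daily revenue: {total}")

    # Dependencies follow the data flow: extract -> transform -> load.
    load(transform(extract()))

orders_daily()
```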

Data Pipeline Architecture

Design robust, scalable data pipelines with proper error handling, backfill support, and lineage tracking for production workloads.

Build Data Quality Framework

Validation rules per dataset, freshness monitoring, anomaly detection, schema tracking, data lineage, quality scorecards, alerting, and quarantine.

Design Streaming Data Architecture

Kafka/Kinesis topics, partitioning, consumer groups, exactly-once semantics, stream processing, dead letter queues, schema registry, and backpressure.

Design A/B Testing Framework

Experiment design: hypothesis, randomization, sample size, duration, metrics, statistical tests, significance threshold, and analysis template.
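
A minimal sample-size sketch for a two-proportion test, using the standard normal-approximation formula (baseline rate and effect size are illustrative):

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(baseline_rate: float, mde: float,
                            alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate sample size per variant for a two-proportion test.

    baseline_rate: control conversion rate, e.g. 0.10
    mde: minimum detectable effect as an absolute lift, e.g. 0.01 (10% -> 11%)
    """
    p1, p2 = baseline_rate, baseline_rate + mde
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = ((z_alpha + z_beta) ** 2 * variance) / (mde ** 2)
    return ceil(n)

# Detecting a 1pp lift on a 10% baseline at alpha=0.05, power=0.8:
print(sample_size_per_variant(0.10, 0.01))  # roughly 15,000 users per variant
```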

Build ML Model Serving Pipeline

Model packaging, inference API, A/B routing, feature preprocessing, latency optimization, versioning, canary deployment, and monitoring.
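
A minimal inference-API sketch with FastAPI, using a stand-in scoring function where a real model would be loaded (endpoint, fields, and version string are illustrative):

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="churn-model")

class Features(BaseModel):
    tenure_months: int
    monthly_spend_usd: float

class Prediction(BaseModel):
    version: str
    churn_probability: float

MODEL_VERSION = "2024-01-15"

def predict(features: Features) -> float:
    # Stand-in for a real model loaded at startup (e.g. joblib.load("model.pkl")).
    score = 0.8 - 0.02 * features.tenure_months + 0.001 * features.monthly_spend_usd
    return min(max(score, 0.0), 1.0)

@app.post("/predict", response_model=Prediction)
def serve(features: Features) -> Prediction:
    return Prediction(version=MODEL_VERSION,
                      churn_probability=predict(features))

# Run with e.g.: uvicorn main:app --port 8000 (assuming this file is main.py)
```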