Data Pipeline Architect

Design data pipelines and data platform architectures with ETL/ELT patterns, streaming vs. batch, and data quality frameworks.

Updated Mar 11, 2026

Prompt

You are a data platform architect. Help me design a data pipeline architecture.

Data sources: [DATA_SOURCES]
Data volume: [VOLUME]
Data velocity: [VELOCITY] (batch intervals or real-time)
Use cases: [USE_CASES] (analytics, ML, operational, reporting)
Latency requirements: [LATENCY]
Current infrastructure: [CURRENT_INFRA]
Team skills: [TEAM_SKILLS]
Budget: [BUDGET]

Design the data platform:

**1. Architecture Pattern Selection**
- ETL vs. ELT: Recommendation with rationale
- Batch vs. Streaming vs. Hybrid (Lambda architecture for batch plus streaming, Kappa for streaming-only)
- Data Warehouse vs. Data Lake vs. Data Lakehouse
- Why this pattern fits my requirements

**2. Ingestion Layer**
- Source connectors for each data source
- Change Data Capture (CDC) strategy (if real-time)
- Schema registry and evolution
- Backfill strategy for historical data

**3. Processing Layer**
- Tool selection with trade-offs:
  | Tool | Best For | Complexity | Cost |
  |------|----------|------------|------|
  | Spark | ... | ... | ... |
  | dbt | ... | ... | ... |
  | Flink | ... | ... | ... |
  | Airflow | ... | ... | ... |
- Transformation logic organization
- Orchestration and scheduling
- Dependency management between pipelines

**4. Storage Layer**
- Storage format (Parquet, Delta Lake, Iceberg)
- Partitioning strategy
- Data modeling (star schema or other dimensional models, One Big Table (OBT))
- Hot/warm/cold tiering

**5. Serving Layer**
- Analytics: BI tool connectivity
- ML: Feature store design
- API: Low-latency query serving
- Caching strategy

**6. Data Quality**
- Data contracts between producers and consumers
- Validation rules (schema, freshness, completeness, accuracy)
- Quality monitoring and alerting
- Data lineage tracking
- SLA definitions for each pipeline

**7. Governance & Security**
- Access control (row-level, column-level)
- PII handling and masking
- Audit logging
- Data catalog

**8. Operational Excellence**
- Monitoring dashboard (pipeline health, latency, data quality scores)
- Alerting strategy (don't alert on everything)
- Incident response for data pipeline failures
- Runbook for common failure scenarios

**9. Cost Estimation**
- Compute, storage, and egress cost modeling
- Cost optimization opportunities
- Build vs. buy analysis for key components
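
The bracketed placeholders in the prompt above (for example `[DATA_SOURCES]` and `[VOLUME]`) have to be filled in before the prompt is sent to a model. The following is a minimal Python sketch of one way to do that, not part of the prompt itself. The function name `fill_prompt`, the shortened template excerpt, and all example values are hypothetical and only illustrate the idea.

```python
import re

# Shortened excerpt of the prompt; the bracketed names come from the template.
TEMPLATE = """Data sources: [DATA_SOURCES]
Data volume: [VOLUME]
Latency requirements: [LATENCY]"""

# Hypothetical example values, purely illustrative.
EXAMPLE_VALUES = {
    "DATA_SOURCES": "PostgreSQL orders database, web clickstream events",
    "VOLUME": "about 50 GB per day",
    "LATENCY": "dashboards refreshed within 15 minutes",
}


def fill_prompt(template, values):
    """Replace every [NAME] placeholder; fail loudly if a value is missing."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for [{name}]")
        return values[name]

    # Placeholders are assumed to be upper-case letters and underscores in brackets.
    return re.sub(r"\[([A-Z_]+)\]", substitute, template)


if __name__ == "__main__":
    print(fill_prompt(TEMPLATE, EXAMPLE_VALUES))
```

Raising an error on a missing value, rather than leaving the placeholder in place, avoids sending a half-filled prompt by accident.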

Pro Tips

  • Data platform architecture requires balancing latency, cost, complexity, and team skills. Working through the design systematically helps avoid two common mistakes: over-engineering the platform and under-investing in data quality.
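
To make the data quality point concrete, here is a minimal sketch of the kind of validation rules the prompt asks for (schema, freshness, completeness), written with the Python standard library only. The function name `validate_batch`, the `event_time` field, and the thresholds are assumptions made for illustration. A real platform would more likely use a dedicated data quality tool.

```python
from datetime import datetime, timedelta, timezone


def validate_batch(rows, required_fields, max_age, now=None):
    """Return a list of rule violations for one batch of records (dicts)."""
    now = now or datetime.now(timezone.utc)

    # Completeness: an empty batch is reported immediately.
    if not rows:
        return ["completeness: batch is empty"]

    problems = []

    # Schema: every row must carry a non-null value for each required field.
    for index, row in enumerate(rows):
        missing = [name for name in required_fields if row.get(name) is None]
        if missing:
            problems.append(f"schema: row {index} is missing {missing}")

    # Freshness: the newest record must be younger than max_age.
    # Assumes a timezone-aware `event_time` field (hypothetical name).
    timestamps = [r["event_time"] for r in rows if r.get("event_time") is not None]
    newest = max(timestamps, default=None)
    if newest is None or now - newest > max_age:
        problems.append("freshness: newest record is older than allowed")

    return problems


if __name__ == "__main__":
    current = datetime.now(timezone.utc)
    batch = [
        {"order_id": 1, "event_time": current - timedelta(minutes=5)},
        {"order_id": None, "event_time": current - timedelta(minutes=2)},
    ]
    for problem in validate_batch(batch, ["order_id", "event_time"], timedelta(hours=1)):
        print(problem)
```

Returning a list of violations instead of raising on the first one lets a monitoring job report every failed rule for a batch at once.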
