Build copy activity pipeline JSON with source and sink configurations.
Last verified: May 2026
Required Fields
name, properties, properties.activities
The Azure Data Factory Pipeline Builder helps you create Data Factory pipeline definitions with activities, datasets, linked services, and triggers. Data Factory orchestrates data movement and transformation across cloud and on-premises sources. This tool provides a structured interface for building pipelines with copy activities, data flows, conditional logic, and iteration, generating the JSON definition for deployment via ARM templates, Bicep, or the Data Factory SDK.
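As a rough illustration of the three required fields (name, properties, properties.activities), the sketch below checks a pipeline definition held as a Python dict. The helper name missing_required_fields and the pipeline name CopyPipeline are hypothetical and not part of the tool.

```python
import json


def missing_required_fields(pipeline: dict) -> list:
    """Return the dotted paths of required fields that are absent.

    Hypothetical helper: it only checks presence and basic shape,
    not whether Data Factory would accept the definition.
    """
    missing = []
    if "name" not in pipeline:
        missing.append("name")
    props = pipeline.get("properties")
    if not isinstance(props, dict):
        # Without properties there can be no properties.activities either.
        missing.extend(["properties", "properties.activities"])
    elif not isinstance(props.get("activities"), list):
        missing.append("properties.activities")
    return missing


# Smallest definition that has all three required fields.
minimal_pipeline = {
    "name": "CopyPipeline",  # hypothetical pipeline name
    "properties": {"activities": []},
}

print(json.dumps(minimal_pipeline, indent=2))
print(missing_required_fields(minimal_pipeline))
print(missing_required_fields({"name": "NoProperties"}))
```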
Suppose your team is building a daily ETL that ingests 50 source tables from on-premises SQL Server into a Synapse data lake. The builder helps you generate a pipeline in which a ForEach iterates over a control table containing the 50 source/destination pairs, with batchCount=10 to cap parallelism at 10 and avoid overloading SQL Server. Each iteration runs a Copy activity that uses a self-hosted integration runtime (SHIR) on the on-premises SQL Server side and AutoResolveIntegrationRuntime on the Synapse side. The end-to-end pipeline definition takes about 30 minutes, versus an estimated day when working from scratch.
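A minimal sketch of what such a definition might look like, built as a Python dict and printed as JSON. Several things here are assumptions rather than output of the builder: a Lookup activity is used to read the control table, and every dataset name, column name (sourceTable, sinkPath), pipeline name, and the source and sink types are hypothetical. The integration runtimes do not appear because they are normally attached to the linked services behind the datasets, not to the Copy activity itself.

```python
import json


def build_ingest_pipeline(batch_count: int = 10) -> dict:
    """Build a Lookup -> ForEach -> Copy pipeline definition (illustrative only)."""
    # Reads all rows of the control table (assumed to live in Azure SQL).
    lookup = {
        "name": "LookupControlTable",
        "type": "Lookup",
        "typeProperties": {
            "source": {"type": "AzureSqlSource"},
            "dataset": {"referenceName": "ControlTableDataset", "type": "DatasetReference"},
            "firstRowOnly": False,
        },
    }
    # One copy per control-table row; dataset parameters come from the current item.
    copy = {
        "name": "CopyOneTable",
        "type": "Copy",
        "inputs": [{
            "referenceName": "OnPremSqlTable",
            "type": "DatasetReference",
            "parameters": {"tableName": "@item().sourceTable"},
        }],
        "outputs": [{
            "referenceName": "LakeParquetFolder",
            "type": "DatasetReference",
            "parameters": {"folderPath": "@item().sinkPath"},
        }],
        "typeProperties": {
            "source": {"type": "SqlServerSource"},
            "sink": {"type": "ParquetSink"},
        },
    }
    # Parallel ForEach capped at batch_count concurrent iterations.
    for_each = {
        "name": "ForEachTable",
        "type": "ForEach",
        "dependsOn": [{"activity": "LookupControlTable", "dependencyConditions": ["Succeeded"]}],
        "typeProperties": {
            "items": {"value": "@activity('LookupControlTable').output.value", "type": "Expression"},
            "isSequential": False,
            "batchCount": batch_count,
            "activities": [copy],
        },
    }
    return {"name": "DailyIngest", "properties": {"activities": [lookup, for_each]}}


pipeline = build_ingest_pipeline()
print(json.dumps(pipeline, indent=2))
```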
Data Factory bills pipelines per activity run and, for data movement, per DIU-hour consumed on the integration runtime. Many teams optimize for fewer activities but ignore that consumption. Copying 1 TB with 4 DIUs takes longer than the same copy with 16 DIUs, and because the charge is DIUs multiplied by duration, the higher hourly rate of the larger setting does not necessarily mean a higher bill. Right-size DIUs based on data volume, not activity count.
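A back-of-the-envelope sketch of the DIU trade-off. The throughput and price figures are hypothetical placeholders, not Azure list prices, and the model assumes throughput scales linearly with DIUs, which real copies often do not achieve.

```python
def copy_estimate(data_gb: float, dius: int,
                  gb_per_hour_per_diu: float,
                  price_per_diu_hour: float) -> tuple:
    """Estimate (hours, cost) for one copy under a linear-scaling assumption."""
    hours = data_gb / (dius * gb_per_hour_per_diu)
    cost = dius * hours * price_per_diu_hour  # billed as DIU-hours
    return hours, cost


# Hypothetical inputs: 1 TB of data, 50 GB/h per DIU, 0.25 currency units per DIU-hour.
for dius in (4, 16):
    hours, cost = copy_estimate(1024, dius, 50.0, 0.25)
    print(f"{dius:>2} DIUs: {hours:.2f} h, {cost:.2f} currency units")
```

Under these assumptions the 16-DIU run finishes four times sooner for the same DIU-hour charge. Sub-linear scaling or a source that cannot keep up shifts the comparison, so it is worth measuring actual throughput before settling on a size.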
For ETL/ELT workloads where the source and destination are both Azure storage, Data Factory's mapping data flows, which run on Spark, are typically more cost-effective than copy activities followed by a separate transformation step. The catch is the start-up cost of the Spark cluster: data flows are only worth it for jobs with more than 5 minutes of transformation.
ForEach activities run in parallel by default: isSequential is false, and batchCount defaults to 20 with a maximum of 50. For source systems with rate limits, set sequential execution or a small batchCount; otherwise you will hammer the source and trigger throttling errors that cost you both runtime and reliability.
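The two ways of throttling a ForEach can be sketched as a small helper that produces the relevant typeProperties fragment. The function name foreach_concurrency is hypothetical; isSequential and batchCount are the ForEach settings discussed above.

```python
from typing import Optional


def foreach_concurrency(sequential: bool = False,
                        batch_count: Optional[int] = None) -> dict:
    """Return the concurrency part of a ForEach activity's typeProperties."""
    if sequential:
        # One iteration at a time; batchCount does not apply.
        return {"isSequential": True}
    settings = {"isSequential": False}
    if batch_count is not None:
        if not 1 <= batch_count <= 50:
            raise ValueError("batchCount must be between 1 and 50")
        settings["batchCount"] = batch_count
    return settings


print(foreach_concurrency(batch_count=5))    # parallel, at most 5 at once
print(foreach_concurrency(sequential=True))  # strictly one at a time
```

Merging the returned dict into the ForEach activity's typeProperties alongside items and activities keeps the throttling decision in one place.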
The builder constructs Data Factory pipeline JSON with activities (Copy, DataFlow, ForEach, IfCondition, Switch, ExecutePipeline, Lookup, etc.), datasets (source and sink definitions referencing linked services), and parameters. Output is generated as the JSON definition you'd put under Microsoft.DataFactory/factories/pipelines in ARM/Bicep, plus the sequence of az datafactory pipeline create commands.
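As a sketch of how a generated definition might be placed under Microsoft.DataFactory/factories/pipelines, the helper below wraps a pipeline dict as an ARM template resource. The function name to_arm_resource and the factory and pipeline names are hypothetical, and a real template would usually build the name from a template parameter rather than a literal.

```python
import json


def to_arm_resource(factory_name: str, pipeline: dict) -> dict:
    """Wrap a pipeline definition as an ARM template resource (illustrative)."""
    return {
        "type": "Microsoft.DataFactory/factories/pipelines",
        "apiVersion": "2018-06-01",
        # Child resources are named "<factory>/<pipeline>".
        "name": f"{factory_name}/{pipeline['name']}",
        "properties": pipeline["properties"],
    }


example_pipeline = {
    "name": "CopyPipeline",  # hypothetical pipeline name
    "properties": {"activities": [], "parameters": {}},
}

resource = to_arm_resource("my-data-factory", example_pipeline)
print(json.dumps(resource, indent=2))
```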
Disclaimer: This tool runs entirely in your browser. No data is sent to our servers. Always verify outputs before using them in production. AWS, Azure, and GCP are trademarks of their respective owners.