Azure Data Factory Power Query vs. Power BI Dataflows
Can the new #PowerQuery tool embedded in #Azure_Data_Factory replace the Power BI #Dataflow available in the Power BI service?
Here are some differences between these two components:
💡Architecture difference:
*Power BI Dataflow uses the Power Query Online Mashup Editor for data transformations.
It processes data in parallel on a single server. Depending on the Power BI capacity, we can scale up the processing power by moving to a larger capacity and increasing the container memory.
*ADF Power Query also uses the Power Query Online Mashup Editor, but it translates the generated M queries into Spark code that runs at scale.
Spark is a distributed processing engine, so data is processed across multiple worker nodes. We can scale out by selecting a cluster with a higher number of nodes.
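To make the scale-out idea concrete, here is a minimal sketch in plain Python. It is not Spark or ADF code: `split_into_partitions`, `process_partition` and `run_job` are hypothetical names used only for illustration. It shows rows being split across a chosen number of workers, each share being processed independently, and the partial results being combined.

```python
from typing import List


def split_into_partitions(rows: List[int], workers: int) -> List[List[int]]:
    """Distribute rows round-robin across a hypothetical number of worker nodes."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return [rows[i::workers] for i in range(workers)]


def process_partition(partition: List[int]) -> int:
    """Stand-in for the work a single node does on its share of the data."""
    return sum(value * 2 for value in partition)


def run_job(rows: List[int], workers: int) -> int:
    """Process every partition independently, then combine the partial results."""
    partitions = split_into_partitions(rows, workers)
    return sum(process_partition(p) for p in partitions)


rows = list(range(1, 101))
single_server = run_job(rows, workers=1)  # one node handles all rows
cluster = run_job(rows, workers=4)        # four nodes each handle a quarter
```

Both runs return the same result. Adding workers only changes how much data each node has to handle, which is the point of scaling out instead of scaling up.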
💡Difference in Data processing:
*Power BI Dataflow is an ETL tool within a self-service BI experience: it Extracts data from many sources, Transforms it and Loads it into an Azure Data Lake Storage Gen2 account.
*ADF Power Query follows an ELT approach and does not support on-premises data sources.
We need to use ADF Copy activities to Extract and Load the data into a supported data store, and then use Power Query to Transform the loaded data, which can then be persisted into a storage account or a cloud data warehouse.
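The ELT order described above can be sketched in plain Python. This is only an illustration of the sequence of steps, not the ADF SDK: `copy_activity`, `power_query_transform`, the sample rows and the staging list are all hypothetical stand-ins.

```python
from typing import Dict, List

Row = Dict[str, str]


def copy_activity(source_rows: List[Row], staging: List[Row]) -> None:
    """Extract and Load: land the source rows in the staging store unchanged."""
    staging.extend(dict(row) for row in source_rows)


def power_query_transform(staging: List[Row]) -> List[Row]:
    """Transform the already-loaded rows: drop empty names, trim, upper-case country."""
    return [
        {"name": row["name"].strip(), "country": row["country"].upper()}
        for row in staging
        if row["name"].strip()
    ]


# Hypothetical on-premises rows that Power Query cannot read directly.
on_prem_rows = [
    {"name": " Alice ", "country": "fr"},
    {"name": "", "country": "de"},
    {"name": "Bob", "country": "us"},
]

staging_store: List[Row] = []
copy_activity(on_prem_rows, staging_store)           # steps 1 and 2: Extract + Load
sink_rows = power_query_transform(staging_store)     # step 3: Transform after loading
```

The raw data lands in staging exactly as it was in the source, and the cleaning happens only afterwards. That ordering is what separates ELT from the ETL flow of a Power BI Dataflow.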