How do I loop over the results of a data copy in Data Factory?
Hi guys, I'm struggling with a data pipeline.
I have a pipeline where I first fetch some data from an API. This data contains, among other things, a column of IDs. I've set up a Copy activity and I'm saving the JSON result to a blob.
What I want to do next is iterate over all the IDs and make an API call for each of them.
But I can't for the life of me figure out how to do that iteration. I've looked into using a Lookup and a ForEach, but it seems the Lookup activity is limited to 5,000 rows, and I have just over 70k IDs.
Any pointers for me?
As a workaround, you can partition the API results into smaller JSON files, then iterate over those files with a child pipeline per file.
The ForEach activity itself is not the problem: it supports a batchCount of up to 50 for parallel processing and up to 100,000 items. The workaround is only needed for the Lookup part.
Design a two-level pipeline where the outer pipeline iterates over the files and invokes an inner pipeline, whose Lookup retrieves one chunk that doesn't exceed the maximum row count or size.
Example:
First, fetch the details from the API and store them as a number of JSON
blobs, so each one feeds a small chunk of data to the Lookup activity later on.
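A minimal sketch of such a Copy activity sink. Note that `maxRowsPerFile` and `fileNamePrefix` are available in `DelimitedTextWriteSettings`; if you need a JSON sink, check whether your format settings support an equivalent. All names here (`CopyApiResults`, `ids-part`) are placeholders:

```json
{
    "name": "CopyApiResults",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "RestSource" },
        "sink": {
            "type": "DelimitedTextSink",
            "storeSettings": { "type": "AzureBlobStorageWriteSettings" },
            "formatSettings": {
                "type": "DelimitedTextWriteSettings",
                "maxRowsPerFile": 4000,
                "fileNamePrefix": "ids-part"
            }
        }
    }
}
```

Keeping `maxRowsPerFile` below 5,000 ensures each partition stays under the Lookup activity's row limit; 70k rows would produce about 18 files.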
Use a Get Metadata activity to find out how many partitioned files there are and to get their names, which you pass on to the parameterized source dataset of the Lookup activity.
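In the outer pipeline, that looks roughly like this: a Get Metadata activity requesting `childItems` on the folder, and a ForEach that iterates the result and calls the child pipeline once per file. Activity, dataset, and pipeline names here are illustrative:

```json
[
    {
        "name": "GetPartitionNames",
        "type": "GetMetadata",
        "typeProperties": {
            "dataset": { "referenceName": "PartitionedFolder", "type": "DatasetReference" },
            "fieldList": [ "childItems" ]
        }
    },
    {
        "name": "ForEachFile",
        "type": "ForEach",
        "dependsOn": [
            { "activity": "GetPartitionNames", "dependencyConditions": [ "Succeeded" ] }
        ],
        "typeProperties": {
            "items": {
                "value": "@activity('GetPartitionNames').output.childItems",
                "type": "Expression"
            },
            "activities": [
                {
                    "name": "RunChild",
                    "type": "ExecutePipeline",
                    "typeProperties": {
                        "pipeline": { "referenceName": "LookupChildPipeline", "type": "PipelineReference" },
                        "parameters": { "fileName": "@item().name" }
                    }
                }
            ]
        }
    }
]
```

Each element of `childItems` has a `name` and a `type`, so `@item().name` gives the file name to hand to the child pipeline.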
Use an Execute Pipeline activity to call a child pipeline containing the Lookup activity and a Web activity that calls the API for the IDs.
Inside the child pipeline, the Lookup activity uses a parameterized source dataset. As the outer ForEach iterates, the child pipeline is triggered once per file, with that file as the source of the Lookup activity. This works around the row limit.
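The child pipeline's Lookup could be sketched like this, assuming the child pipeline has a `fileName` parameter and the dataset `PartitionedFile` exposes a matching dataset parameter used in its file path (names are placeholders). `firstRowOnly` must be `false` so the Lookup returns all rows of the file:

```json
{
    "name": "LookupIds",
    "type": "Lookup",
    "typeProperties": {
        "source": { "type": "DelimitedTextSource" },
        "dataset": {
            "referenceName": "PartitionedFile",
            "type": "DatasetReference",
            "parameters": { "fileName": "@pipeline().parameters.fileName" }
        },
        "firstRowOnly": false
    }
}
```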
You can store the Lookup result in a variable or use it directly in a dynamic expression.
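For instance, a ForEach in the child pipeline can iterate the Lookup output directly and issue one Web activity call per ID. The URL and the `id` column name are hypothetical examples; adjust them to your API and data:

```json
{
    "name": "ForEachId",
    "type": "ForEach",
    "dependsOn": [
        { "activity": "LookupIds", "dependencyConditions": [ "Succeeded" ] }
    ],
    "typeProperties": {
        "items": {
            "value": "@activity('LookupIds').output.value",
            "type": "Expression"
        },
        "batchCount": 50,
        "activities": [
            {
                "name": "CallApiForId",
                "type": "WebActivity",
                "typeProperties": {
                    "method": "GET",
                    "url": {
                        "value": "@concat('https://api.example.com/items/', item().id)",
                        "type": "Expression"
                    }
                }
            }
        ]
    }
}
```

`batchCount: 50` is the ForEach maximum for parallel execution mentioned above; lower it if the target API can't handle that much concurrency.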