I was recently tasked with migrating local integration jobs to the cloud. The first job I tackled was an integration between Salesforce and a local financial system. The workflow first downloads the newest entries of a specific object from Salesforce, then pushes them into an on-premises SQL Server staging table. Once the data is copied locally, another job within the financial system processes the staged data.
To accomplish this, I decided to use Azure Data Factory with the on-premises gateway to connect to the local SQL Server database. In some of my other posts I used Azure Data Lake Store to land my data; for this pipeline I decided to use Azure Blob Storage. Since the Azure Blob Storage API supports Append Blobs, I was able to follow a pattern similar to the one I used with Azure Data Lake Store and append the Salesforce data to a blob as I fetched it.
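For readers unfamiliar with the append pattern, here is a minimal sketch in Python. It is not the code from my actual job: it assumes the `azure-storage-blob` v12 SDK, and the function name `append_records` and the newline-delimited JSON layout are hypothetical choices made for illustration.

```python
import json


def append_records(blob_client, records):
    """Append records to an append blob as newline-delimited JSON.

    Assumption: blob_client behaves like azure.storage.blob.BlobClient
    (v12 SDK), i.e. it offers exists(), create_append_blob() and
    append_block(). Returns the number of bytes appended.
    """
    if not records:
        return 0

    # An append blob has to be created once before blocks can be appended.
    if not blob_client.exists():
        blob_client.create_append_blob()

    # One JSON document per line, so each fetched batch is a single append.
    payload = "".join(json.dumps(r) + "\n" for r in records).encode("utf-8")
    blob_client.append_block(payload)
    return len(payload)


# Hypothetical usage (connection string, container and blob names are placeholders):
# from azure.storage.blob import BlobClient
# client = BlobClient.from_connection_string(
#     conn_str, container_name="staging", blob_name="salesforce.jsonl")
# append_records(client, [{"Id": "001", "Name": "Example"}])
```

Each batch fetched from Salesforce becomes one appended block, so the blob grows as the download progresses.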
Once I had some sample Append Blobs in my container, my next step was to set up the Azure Data Factory copy activity to transfer that data to the on-premises SQL Server staging tables. This is where I began to run into issues. After a lot of verification and testing, it turned out that Append Blobs are not supported in Azure Data Factory.
Here are a few things to look out for to rule out this issue:
- The blob container blade shows the BlobType of each blob. Check the type of the blobs you are trying to work with in Azure Data Factory.
- I also ran into an issue where the dataset pointing to the Append Blob would not validate.
- When running the Azure Data Factory copy activity against an Append Blob you will see the following error:
Copy activity met storage operation failure at ‘Source’ side. Error message from storage execution: Requested value ‘AppendBlob’ was not found.
This can be a bit misleading if you are not aware that Append Blobs are not supported.
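If you would rather check blob types from a script than from the portal, something like the following sketch should work. It assumes the `azure-storage-blob` v12 SDK, where each item returned by `ContainerClient.list_blobs()` carries a `name` and a `blob_type`; the helper name `find_append_blobs` is hypothetical.

```python
def find_append_blobs(blobs):
    """Return the names of all blobs whose type is AppendBlob.

    Assumption: each item has .name and .blob_type attributes, as the
    BlobProperties objects from ContainerClient.list_blobs() do.
    blob_type may be an enum or a plain string, so both are handled.
    """
    names = []
    for blob in blobs:
        # Unwrap enum members to their string value; leave strings as they are.
        blob_type = getattr(blob.blob_type, "value", blob.blob_type)
        if str(blob_type) == "AppendBlob":
            names.append(blob.name)
    return names


# Hypothetical usage (connection string and container name are placeholders):
# from azure.storage.blob import ContainerClient
# container = ContainerClient.from_connection_string(
#     conn_str, container_name="staging")
# print(find_append_blobs(container.list_blobs()))
```

Any name this returns is a blob the copy activity will likely reject with the error above.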