Nightly process for a dataset of 150k entries

Hi,

I have a dataset of 150k records that needs to be updated nightly. We receive this data through an API. From the payload we parse the data, map it to 5 individual CDTs, and then write it to the database.

How should I design this whole process?


  • Typically you'd get the data, load it to a "staging table" in its raw, unchanged form (so that you can decouple the receiving of the data from the subsequent processing). Once it's in a staging table you can then process it in small batches so that you don't create a memory spike in your environment (there's a rough sketch of that pattern below).

    There is the "Transaction Manager" application in the AppMarket that is specifically designed to manage queues of work, to provide multi-queue processing and throttling of transactions, and other patterns that you'd typically want in this type of scenario.
