Best way to migrate a large amount of documents (70-80 GB)

Hi All,

I'm looking for the best way to move a large number of documents from an external system into Appian. We are replacing an existing mainframe system, and as part of that we are doing a document and data migration. The documents are organized into a logical folder structure, so I tried the zip upload feature, but Appian only allows folders under 1 GB. I then split the folders into smaller ones, but even at 600 MB per folder the system was still getting choked up. They are in the cloud. Any ideas or approaches for executing a large document migration?
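For anyone trying the same split-into-smaller-folders approach, here is a minimal Python sketch of planning the batches up front instead of splitting by hand. It only groups file paths by total size; the cap value and the `(path, size)` input are assumptions for illustration, sizes are uncompressed bytes, and nothing here calls Appian.

```python
from typing import Iterable, List, Tuple


def plan_batches(files: Iterable[Tuple[str, int]], max_bytes: int) -> List[List[str]]:
    """Group (path, size_in_bytes) pairs into ordered batches of at most max_bytes each."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for path, size in files:
        if size > max_bytes:
            # A single file over the cap can never fit in any batch.
            raise ValueError(f"{path} ({size} bytes) exceeds the batch cap of {max_bytes}")
        if current and current_size + size > max_bytes:
            # Adding this file would overflow the batch, so close it and start a new one.
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

Because files are kept in their original order, each batch stays close to the existing logical structure. The cap can be lowered well below 1 GB if the environment struggles with larger uploads.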


  • This thread touches on a related topic: scaling Appian's document management. I'd recommend taking a look at the official responses in the threads here https://community.appian.com/discussions/f/data/12291/number-of-documents-effect-the-performance-in-system and here https://community.appian.com/discussions/f/data/14324/how-many-documents-can-we-store-in-appian . Both of those threads talk more in terms of number of documents than size, but I'll add to that by saying that there are environments in production using Appian to manage upwards of 1 TB worth of documents.

    I think the advice ericg329 gave is an excellent place to start in terms of taking advantage of Appian's content management capabilities. 

  • Certified Lead Developer
    in reply to Eliot Gerson

    Now I will say that the majority of the time, the one little content engine we have still runs like the dickens even with several, several million documents and folders.   It seems that multiple concurrent time-consuming queries on the content engine can eventually cause slowdown and node-stoppage.  If your design limits how frequently multiple users might be looking for a document at the same time, or if you limit the number of times folders get moved, renamed, created, deleted, added to knowledge centers, removed from knowledge centers, or the number of times the security changes on those objects, you'll probably feel less heat from the content engine.

    To that end, I would take the time to migrate your documents one at a time.  I would avoid concurrency as much as possible, because tiny hiccups and delays might begin to compound over that many documents, and as slowdown increases, the likelihood of a node timing out and grinding your process to a halt goes up.  Slow and steady wins the race.
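The one-at-a-time approach above could look something like the following Python sketch. It is a generic driver loop under stated assumptions, not an Appian API: `upload` is a hypothetical callable you would supply (for example, a wrapper around whatever upload mechanism your environment exposes), and the retry count and delays are illustrative defaults.

```python
import time
from typing import Callable, Iterable, List, Optional, Set


def migrate_sequentially(
    paths: Iterable[str],
    upload: Callable[[str], None],  # hypothetical: uploads one document, raises on failure
    done: Optional[Set[str]] = None,
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Upload documents strictly one at a time and return the paths that never succeeded."""
    done = set() if done is None else done
    failed: List[str] = []
    for path in paths:
        if path in done:
            continue  # already migrated in an earlier run, so skip it
        for attempt in range(1, max_attempts + 1):
            try:
                upload(path)
                done.add(path)
                break
            except Exception:
                if attempt == max_attempts:
                    failed.append(path)  # give up on this document and move on
                else:
                    # Back off (1x, 2x, 4x, ... base_delay) before retrying.
                    sleep(base_delay * 2 ** (attempt - 1))
    return failed
```

Persisting the `done` set between runs (to a file or table of your choosing) lets a stalled migration resume where it left off, and the returned list gives you the documents to investigate afterwards.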

  • If you're running into performance issues, I would encourage you to create a ticket with our support team. They will be able to give suggestions based on the symptoms you're seeing. For example, if you're seeing bottlenecks, they may recommend adding replicas of your content engine to improve throughput.
