Best way to get all data for user from table that contains 8 million rows

Certified Associate Developer

Hi,

What is the most efficient way to retrieve data from a large table (more than 4 million rows) in Appian?

When I use queryEntity to fetch data from this table, it takes a long time to execute, which negatively impacts Appian’s performance. How can I improve the data retrieval process?

My current approach to solve the issue:

I split the data retrieval into smaller batches. Specifically, I:

  1. Created multiple records that point to the same source table.

  2. Used the foreach function to query each record in batches of 5,000 rows.

  3. Used the union function to combine the results from each batch.

  4. Used the union function to combine the results from each query record.

  5. Then, combined the results across all the records to get the complete dataset.

I’m concerned about future system complexity.
Do I really need to create multiple records and change the system design just to retrieve all the data from a large table?
Or is there a better, more efficient way to handle large data retrieval in Appian without adding complexity?

Thanks,

 

 

  Discussion posts and replies are publicly visible

Parents
  • 0
    Certified Lead Developer

    What are you trying to accomplish? In pretty much every use case you should be paging through the data.

  • 0
    Certified Associate Developer
    in reply to Mathieu Drouin

    Thanks for the response!

    I completely agree that in most scenarios, paging is the ideal solution — especially for improving performance and user experience.

    To clarify, the goal here is to display a full task list for certain users who have over 100,000 tasks assigned to them. Due to business requirements, we need to load all of these tasks into a table for review and reporting purposes, not just page through a limited subset.

    When I try to use queryEntity to retrieve all tasks in one go, it becomes very slow and degrades system performance. To work around this, I’ve been splitting the queries into multiple 5,000-row batches using foreach, and combining results using union. This works, but I’m worried it’s not scalable and adds a lot of complexity — especially since it involves creating multiple records pointing to the same table.

    So my question is:
    Is there a better way to retrieve large datasets in Appian (like 100k+ rows) efficiently, without redesigning the system or creating multiple records?

    Would appreciate any suggestions — maybe something database-side or a smarter way to load/display the data?

    Thanks again!

  • 0
    Certified Lead Developer
    in reply to Mohammed Dyab

    1. How can a single person have 100K tasks?

    2. Why can't you paginate the data?

    3. Why are you using forEach (should never use that on big datasets)

    4. Same goes for union.

    If you need to crunch numbers and report on this data, I would suggest materialized views if performance is a concern. But if its just listing the tasks, there's not much more you can do.

  • 0
    Certified Lead Developer
    in reply to Mohammed Dyab

    To be honest, this feels more like a business problem than a technical problem.

    You could just connect something like PowerBI to that database.

  • 0
    Certified Associate Developer
    in reply to Mathieu Drouin

    Thanks for following up — let me clarify the use case in more detail:

    I’m working with two tables:

    • Tickets table – around 700,000 rows

    • Tasks table – over 8 million rows

    There’s a one-to-many relationship between them: each ticket can have multiple tasks. A single employee might be assigned to the same ticket multiple times through different tasks.

    What I'm trying to do:

    I need to retrieve all tickets that a specific employee has been involved with, based on their task assignments.

    So while an employee might not have 100,000 active tasks in the usual “to-do” sense, the join between these tables can result in very large result sets, especially for employees who have worked on many tickets over time.

    Why pagination is challenging:

    Normally, I would definitely use pagination. But in this case, business users want to see and analyze the full list of tickets for an employee — it’s more of a reporting/auditing/Follow-up-on-tickets scenario than a task management one. They often export this data or perform cross-referencing, so partial views aren’t helpful.

    Regarding forEach and union:

    Totally fair point — I know they’re not ideal for large datasets. I’ve been using them to break down the queries into 5,000-row chunks (which is Appian’s record limit per call), just to avoid timeouts. But I agree this isn’t a sustainable or clean long-term solution.

    Thanks again!

  • 0
    Certified Associate Developer
    in reply to Mathieu Drouin

    Thanks for following up — let me clarify the use case in more detail:

    I’m working with two tables:

    • Tickets table – around 700,000 rows

    • Tasks table – over 8 million rows

    There’s a one-to-many relationship between them: each ticket can have multiple tasks. A single employee might be assigned to the same ticket multiple times through different tasks.

    What I'm trying to do:

    I need to retrieve all tickets that a specific employee has been involved with, based on their task assignments.

    So while an employee might not have 100,000 active tasks in the usual “to-do” sense, the join between these tables can result in very large result sets, especially for employees who have worked on many tickets over time.

    Why pagination is challenging:

    Normally, I would definitely use pagination. But in this case, business users want to see and analyze the full list of tickets for an employee — it’s more of a reporting/auditing/Follow-up-on-tickets scenario than a task management one. They often export this data or perform cross-referencing, so partial views aren’t helpful.

    Regarding forEach and union:

    Totally fair point — I know they’re not ideal for large datasets. I’ve been using them to break down the queries into 5,000-row chunks (which is Appian’s record limit per call), just to avoid timeouts. But I agree this isn’t a sustainable or clean long-term solution.

    Thanks again!

Reply
  • 0
    Certified Associate Developer
    in reply to Mathieu Drouin

    Thanks for following up — let me clarify the use case in more detail:

    I’m working with two tables:

    • Tickets table – around 700,000 rows

    • Tasks table – over 8 million rows

    There’s a one-to-many relationship between them: each ticket can have multiple tasks. A single employee might be assigned to the same ticket multiple times through different tasks.

    What I'm trying to do:

    I need to retrieve all tickets that a specific employee has been involved with, based on their task assignments.

    So while an employee might not have 100,000 active tasks in the usual “to-do” sense, the join between these tables can result in very large result sets, especially for employees who have worked on many tickets over time.

    Why pagination is challenging:

    Normally, I would definitely use pagination. But in this case, business users want to see and analyze the full list of tickets for an employee — it’s more of a reporting/auditing/Follow-up-on-tickets scenario than a task management one. They often export this data or perform cross-referencing, so partial views aren’t helpful.

    Regarding forEach and union:

    Totally fair point — I know they’re not ideal for large datasets. I’ve been using them to break down the queries into 5,000-row chunks (which is Appian’s record limit per call), just to avoid timeouts. But I agree this isn’t a sustainable or clean long-term solution.

    Thanks again!

Children
No Data