I have handle two XML files daily, each with a size of 50MB or more. These files are transferred via SFTP or FTP. Our goal is to extract specific values from these files, check their existence in our database, and either update existing records or insert new ones as necessary.Requirements: File Retrieval: Automate the retrieval of XML files from an SFTP or FTP server. Store the files temporarily within Appian or an external file storage service.
Data Extraction: Parse the XML files to extract a list of values. Given the large file sizes, efficient parsing and processing strategies are needed.
Data Validation and Update: Compare extracted values against existing database records. Update existing records with the latest values or insert new records as required.
Performance Considerations: Handle large files efficiently, possibly by processing in chunks. Ensure database operations (updates and inserts) are optimized for performance.can any one suggest some best approach to do this.
Discussion posts and replies are publicly visible
Agree with Stefan Helzle and Mathieu Drouin .... with such data volumes it's not the best aproach..
It's not possible that you work with that files outside appian before read it in Appian, or receive smaller files via FTP?
I'll got the file from external system into Appian using FTP than I'll process that file
Which is fine, but you might want to look for some other software or service to perform the actual processing. The process controlling this, can easily be done in Appian.
Woudl it be possible by the external system to generate smaller files?
I think this would be key if you want Appian to process the files. Biggest issue is the larger files, you'll need something upstream to split the files into smaller files.
I'll have to read the list of data from xml file and compare extracted values against existing database records. Update existing records with the latest values or insert new records as requiredbut at one time I have receive around 1000 to 3000 list of xml string which i have read it from xml file
Exactly, with smaller files, it's easy in Appian to handle this situation
but what will be the approach for this
I can easily read the xml file. the issue is in process that reader data.
I suggest to use the xpathdocumentfunction to directly work with the file without first reading it into memory.
https://docs.appian.com/suite/help/24.2/fnc_scripting_xpathdocument.html