Humans have generated more data in the past two years than in all of previous human history combined. An estimated 2.5 quintillion bytes of data are created each day, and that number keeps growing. Data is commonly shared between source and target applications as large files, and these files need to be processed quickly, efficiently and without errors. Failures in transferring and processing large files have caused significant information loss for businesses, along with outages and delays in mission-critical applications.
Large file processing is a common use case across verticals. While processing large files, a number of factors need to be considered.
With all these considerations in mind, we have developed an efficient solution to process large files in Mule 4. Mule 4 allows streaming and processing large files with out-of-the-box capabilities, and it is easier than ever!
Consider this use case: a large file (1GB+) with customer data needs to be picked up from a file location and sent to Salesforce Marketing Cloud (SFMC). If one tries to read this file, perform basic transformations and send it to SFMC without any processing strategy, the Mule application will almost certainly run out of memory and crash. To make this work, we must use streaming and design the flow to process the data in “chunks” rather than as one full file.
For the solution, I used a simple Scheduler to trigger the flow. I initially had an “On New or Updated File” source to trigger the flow, but once I deployed the application to CloudHub, I noticed the trigger would intermittently throw errors. To solve this, I added a Fixed Frequency Scheduler and, as the first step in the flow, listed the available files in the directory I was polling. This strategy worked because the flow would only continue if there were files to process.
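A minimal sketch of that trigger is shown here, assuming an SFTP connector configuration named `SFTP_Config`, an `/inbound` directory, a five-minute interval and a downstream flow called `read-and-process-files`. All of these names and values are hypothetical, and the `<mule>` root element with its namespace declarations is omitted for brevity.

```xml
<flow name="poll-large-files-flow">
    <!-- Fixed Frequency Scheduler used in place of the On New or Updated File source -->
    <scheduler>
        <scheduling-strategy>
            <fixed-frequency frequency="5" timeUnit="MINUTES"/>
        </scheduling-strategy>
    </scheduler>

    <!-- First step: list the files waiting in the polled directory -->
    <sftp:list config-ref="SFTP_Config" directoryPath="/inbound"/>

    <!-- Continue only when the listing returned at least one file -->
    <choice>
        <when expression="#[!isEmpty(payload)]">
            <flow-ref name="read-and-process-files"/>
        </when>
        <otherwise>
            <logger level="DEBUG" message="No files to process"/>
        </otherwise>
    </choice>
</flow>
```

The polling interval is a trade-off: a shorter interval picks files up sooner, while a longer one reduces the number of empty listings.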
Once the files were ready to process, I read them from the directory using an SFTP Read operation and chose the Repeatable file store stream option.
Depending on the use case, I have found two options to work best. The default, Repeatable file store stream, streams the file while keeping only a configurable buffer in memory, and lets you read the payload multiple times during processing. The Non-repeatable stream simply streams the file once for you to do the processing. Which option you choose depends on the individual case. If you have logic that runs before the file is processed, you may want to use a non-repeatable stream to trigger the event, save the file path, and read the file when you are ready to process its data.
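The two streaming strategies could be configured on the SFTP Read operation roughly as follows. This is a sketch only: `SFTP_Config` and `vars.filePath` are hypothetical names, the 512 KB buffer is an illustrative value rather than a recommendation, and namespace declarations are omitted.

```xml
<!-- Option 1: repeatable file store stream.
     Up to inMemorySize is buffered in memory; anything beyond that is
     spilled to disk, so the payload can be read more than once. -->
<sftp:read config-ref="SFTP_Config" path="#[vars.filePath]">
    <repeatable-file-store-stream inMemorySize="512" bufferUnit="KB"/>
</sftp:read>

<!-- Option 2: non-repeatable stream.
     The content can be consumed only once, with no buffering overhead. -->
<sftp:read config-ref="SFTP_Config" path="#[vars.filePath]">
    <non-repeatable-stream/>
</sftp:read>
```

With the non-repeatable option, any component that reads the payload (including a logger that prints it) consumes the stream, so later processors would see no data.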
Once you are ready to process the file, its content is still a stream at this point. As soon as you trigger a transformation, however, the data is loaded in its entirety, which can cause major issues when transforming a large amount of data. There are several ways around this. If you must transform the data in its entirety, you can stream the transformation as well by using the @StreamCapable() annotation in the transform.
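A streamed transformation might look like the sketch below. It assumes the file is CSV and that the customer records have `email` and `first_name` columns; the article does not state the file format or field names, so these are purely illustrative. It also assumes the reader was put into streaming mode where the file is read (for example with `outputMimeType="application/csv; streaming=true"` on the read operation), since the annotation checks that the script accesses the input sequentially rather than enabling streaming on its own.

```xml
<ee:transform>
    <ee:message>
        <ee:set-payload><![CDATA[%dw 2.0
// Ask DataWeave to verify that this script can run against a streamed input
@StreamCapable()
input payload application/csv
// deferred=true lets the writer emit output as a stream as well
output application/json deferred=true
---
// Hypothetical field mapping: rows are visited once, in order
payload map (row) -> {
    email: row.email,
    firstName: row.first_name
}]]></ee:set-payload>
    </ee:message>
</ee:transform>
```

Scripts that need random access to the input, such as sorting or grouping over the whole payload, cannot be processed sequentially and would not benefit from this approach.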
This can make a significant difference: with a streamed transformation, 1GB of data can be processed in about a minute. We will discuss optimization a little later.
To then send this transformed data using an HTTP Request connector, keep in mind that, depending on what the target system or System API will accept, we may not be able to send the data all at once. If you are using the HTTP Request connector, there is an option to stream the request body.
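On the HTTP Request operation, the streaming option is the `requestStreamingMode` attribute. The sketch below assumes a request configuration named `HTTP_Request_Config` and a `/customers` path, both hypothetical; the real endpoint depends on the target system or System API.

```xml
<!-- ALWAYS sends the body with chunked transfer encoding instead of
     buffering it to compute a Content-Length header.
     The other accepted values are AUTO (the default) and NEVER. -->
<http:request method="POST"
              config-ref="HTTP_Request_Config"
              path="/customers"
              requestStreamingMode="ALWAYS"/>
```

This only helps if the receiving API accepts chunked requests, which is worth confirming before relying on it.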
In addition, we must process the data in chunks, using either a “For Each” with an optimized collection size or a Batch Process. I have found both to be useful; which one fits depends on the use case. How fast and how efficiently the data is processed comes down largely to the batch size or collection size.
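A “For Each” that processes the records in fixed-size groups could be sketched like this. The `batchSize` of 1000, the `HTTP_Request_Config` name and the `/customers` path are assumptions for illustration; the right size depends on record size and on what the target accepts per call.

```xml
<foreach collection="#[payload]" batchSize="1000">
    <!-- Inside the scope, payload is a list of up to 1000 records -->
    <ee:transform>
        <ee:message>
            <ee:set-payload><![CDATA[%dw 2.0
output application/json
---
payload]]></ee:set-payload>
        </ee:message>
    </ee:transform>

    <!-- One request per group of records rather than one per record -->
    <http:request method="POST"
                  config-ref="HTTP_Request_Config"
                  path="/customers"/>
</foreach>
```

Larger groups mean fewer calls but more memory per iteration, so this value is usually tuned by testing with realistic files.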
If you choose to use a Batch Process, I highly recommend putting the transformation inside the batch process if the use case allows it.
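A Batch Job with the transformation placed inside a batch step might look like the following sketch. The job name, the `blockSize` and aggregator `size` values, the field names, and the HTTP configuration are all hypothetical and would need tuning for a real workload.

```xml
<batch:job jobName="customers-to-sfmc-batch" blockSize="500">
    <batch:process-records>
        <batch:step name="transform-and-send">
            <!-- Per-record transformation: payload is a single record here -->
            <ee:transform>
                <ee:message>
                    <ee:set-payload><![CDATA[%dw 2.0
output application/java
---
{
    email: payload.email,
    firstName: payload.first_name
}]]></ee:set-payload>
                </ee:message>
            </ee:transform>

            <!-- Collect transformed records and send them in groups of 200 -->
            <batch:aggregator size="200">
                <ee:transform>
                    <ee:message>
                        <ee:set-payload><![CDATA[%dw 2.0
output application/json
---
payload]]></ee:set-payload>
                    </ee:message>
                </ee:transform>
                <http:request method="POST"
                              config-ref="HTTP_Request_Config"
                              path="/customers"/>
            </batch:aggregator>
        </batch:step>
    </batch:process-records>
    <batch:on-complete>
        <!-- In this phase the payload is the batch job result summary -->
        <logger level="INFO"
                message="#['Records processed: ' ++ (payload.processedRecords as String)]"/>
    </batch:on-complete>
</batch:job>
```

Keeping the transformation inside the step means each record is transformed as its block is processed, instead of transforming the whole file before the batch starts.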
In the end, each case is different, but streaming and processing large files in Mule 4 has become much easier thanks to the out-of-the-box capabilities in most of the provided connectors.