150+ MB file gives "500: Internal Server Error"


Support_Julie Support Specialist

Posts: 91

Joined: Thu Mar 05, 2009 3:49 pm

Location: Ashland, NE USA

by Support_Julie » Mon Mar 23, 2009 1:08 pm

Question:
I am setting up a project that reads data from a large, pipe-delimited CSV file (150+ MB). The project fails and the message "null" is shown on the screen. The job log for that project shows the following message in the stack trace.

Caused by: java.lang.OutOfMemoryError
at java.lang.Throwable.<init>(Throwable.java:181)
at java.lang.Error.<init>(Error.java:37)
at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:25)

Also, after I receive this error, GoAnywhere fails to function properly and I get a "500: Internal Server Error" message.

What does this mean?

Answer:
This means that the Java Virtual Machine (JVM) has reached its memory limit. An error like this may require a restart of the GoAnywhere subsystem (System i), GoAnywhere service (Windows), or Tomcat Web Server (Linux/Unix) before GoAnywhere's web interface will function properly again.

The likely cause of the issue in this case is the Record Delimiter setting in the CSV Read Task configuration.

How the Reader Tasks (Read CSV, Read Excel, Read Fixed-Width, & Read XML) work:
When a file is parsed, a RowSet variable is created for that file which defines how the data is formatted. The first row of the data is read into memory to ensure the Task's Data Options and Column settings are configured properly. No other rows are read into memory until the RowSet is used by another Task, such as the SQL or Writer (CSV, Excel, Fixed-Width, or XML) Tasks. Each line of data in the RowSet is read into memory just before it is written, so only one record of data is in memory at any given time.
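To illustrate the "one record in memory at a time" idea, here is a minimal sketch in Python. It is not GoAnywhere's implementation; the function names iter_records and copy_records are hypothetical, and the code only shows the general streaming pattern described above.

```python
import csv
import io


def iter_records(stream, delimiter="|"):
    """Yield one parsed record at a time instead of loading the whole file."""
    for row in csv.reader(stream, delimiter=delimiter):
        yield row


def copy_records(source, sink, delimiter="|"):
    """Stream records from source to sink; only the current row is held."""
    writer = csv.writer(sink, delimiter=delimiter, lineterminator="\n")
    count = 0
    for row in iter_records(source, delimiter):
        writer.writerow(row)  # each row is written as soon as it is read
        count += 1
    return count


# Small in-memory demonstration (a real job would pass open file objects).
source = io.StringIO("id|name\n1|alpha\n2|beta\n")
sink = io.StringIO()
copied = copy_records(source, sink)
```

This pattern only keeps memory low if the reader can actually find the record boundaries. If it cannot, the first "record" is the entire file.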

In this case, however, the Record Delimiter setting in the Read CSV Task is most likely incorrect. By default, GoAnywhere uses a record delimiter of CRLF (Carriage Return + Line Feed) to separate the records. If the file, however, has LF (Line Feed Only) as the end of line character, the data will fail to be separated into multiple lines and the first record in the RowSet will contain all 150+ MB worth of data. When the RowSet is created and a comparison is made between the Task configuration and the first record in the RowSet, all 150+ MB of data will be read into memory, potentially causing the JVM to reach its memory limit.
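One way to check which end-of-line character a file uses before configuring the Record Delimiter is to inspect a small sample of its bytes. The following Python sketch is an assumption-laden illustration, not a GoAnywhere feature; the names classify_line_endings and detect_record_delimiter are hypothetical, and the 64 KB sample size is an arbitrary choice.

```python
def classify_line_endings(sample):
    """Return "CRLF", "LF", "CR", or None for a bytes sample.

    Counts each style and returns the most common one. A CRLF pair that is
    split at the very end of the sample may be miscounted once as CR.
    """
    crlf = sample.count(b"\r\n")
    lf_only = sample.count(b"\n") - crlf  # LF not preceded by CR
    cr_only = sample.count(b"\r") - crlf  # CR not followed by LF
    counts = {"CRLF": crlf, "LF": lf_only, "CR": cr_only}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def detect_record_delimiter(path, sample_size=64 * 1024):
    """Read only the first sample_size bytes of the file and classify them."""
    with open(path, "rb") as f:
        return classify_line_endings(f.read(sample_size))
```

If this reports LF for a file while the Read CSV Task is still set to the default CRLF, that mismatch would be consistent with the whole file being treated as a single record.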

When setting up your GoAnywhere projects, it is highly recommended that you work with a small sample of the input file data instead of the full 150+ MB file.
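A small sample can be cut from the large file with a few lines of script. The Python sketch below is one possible approach under stated assumptions: copy_head and write_sample are hypothetical names, the default of 100 records is arbitrary, and the file is assumed to end its records with LF or CRLF (a CR-only file would not be split by this code). It works in binary mode so the original line endings are preserved in the sample.

```python
import io
from itertools import islice


def copy_head(src, dst, max_records=100):
    """Copy at most max_records lines from binary stream src to dst."""
    copied = 0
    for line in islice(src, max_records):  # reads one line at a time
        dst.write(line)  # bytes are written unchanged, endings included
        copied += 1
    return copied


def write_sample(src_path, dst_path, max_records=100):
    """Create a sample file containing the first max_records lines."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        return copy_head(src, dst, max_records)


# In-memory demonstration with CRLF-terminated, pipe-delimited data.
big = io.BytesIO(b"id|name\r\n1|a\r\n2|b\r\n3|c\r\n")
small = io.BytesIO()
copy_head(big, small, max_records=2)
```

Because the sample keeps the same record delimiter as the full file, a delimiter setting that works against the sample should behave the same way against the full input.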

Julie Rosenbaum
Sr Support Analyst
e. [email protected]
p. 1.800.949.4696
w. HelpSystems.com
