
Can't write a single row from a RowSet to file

Posted: Mon Mar 24, 2014 9:45 am
by asmund
I'm trying to download a bunch of files, remove some lines from each file, and save them all in one file.

The code below works. It writes each line by calling /bin/echo through bash, but that is extremely slow, on the order of one second per line. I tried the writeFixedWidth task (disabled in the code below), but it didn't work as expected: it consumed the entire rowset on the first iteration, so my line-number check had no effect and the lines I wanted to exclude ended up in the file.

Is this a bug or a feature? How can I do this within Director?

I'm using Director version 4.1.1.

Code:
<project name="Geosat solar F10.7 flux" mainModule="Main" version="2.0">
	<description>Download Geosat F10.7 solar flux data</description>

	<module name="Main">

		<ftp label="FTP to NGDC" resourceId="ftp.ngdc.noaa.gov" version="1.0" disabled="false">
			<get label="Get files" destinationDir=" /tmp" whenFileExists="overwrite" destinationFilesVariable="downloaded_files">
				<fileset dir="/STP/space-weather/solar-data/solar-features/solar-radio/noontime-flux/penticton/penticton_observed/tables/">
					<wildcardFilter>
						<include pattern="drao_noontime-flux-observed_199?.txt" caseSensitive="false" />
						<include pattern="drao_noontime-flux-observed_20*.txt" caseSensitive="false" />
					</wildcardFilter>
				</fileset>
			</get>
		</ftp>


		<print label="print ${downloaded_files}" version="1.0" disabled="true">
			<![CDATA[Downloaded:
${downloaded_files}]]>
		</print>


		<rename label="Move old merged file out of the way" inputFile=" /tmp/merged.txt" newName="merged 2.txt" whenFileExists="rename" version="1.0" executeOnlyIf="${FileInfo(&quot; /tmp/merged.txt&quot;):exists}" />

		<forLoop label="Loop over years" beginIndex="1992" endIndex="2001" step="1" currentIndexVariable="year" disabled="false">

			<readFlatFile label="Read a F10.7 file" outputRowSetVariable="input_file" recordDelimiter="LF" processedInputFilesVariable="filename" version="1.0">
				<fileset dir=" /tmp">
					<wildcardFilter>
						<include pattern="*${year}.txt" />
					</wildcardFilter>
				</fileset>
			</readFlatFile>


			<print label="Print processing file" version="1.0">
				<![CDATA[Processing ${filename}]]>
			</print>

			<forEachLoop label="Loop over file lines" itemsVariable="${input_file}" currentItemVariable="line" currentIterationVariable="lineno">

				<print label="print lineno" version="1.0">
					<![CDATA[lineno ${lineno}]]>
				</print>


				<setVariable label="line_deleted = False" name="line_deleted" value="False" version="2.0" />

				<if label="If line number is one to be deleted" condition="${lineno == 2 or lineno == 3 or lineno == 4 or lineno == 5 or lineno > 41}">

					<print label="print deleted line number" version="1.0">
						<![CDATA[deleted line ${lineno}]]>
					</print>


					<setVariable label="line_deleted = True" name="line_deleted" value="True" version="2.0" />

				</if>
				<if label="else" condition="${line_deleted == False}">

					<setVariable label="set linetext" name="linetext" value="${line[1]}" version="2.0" disabled="true" />


					<print label="print line" version="1.0" disabled="false">
						<![CDATA[using ${lineno}, ${line[1]}
line: ${line}]]>
					</print>


					<writeFixedWidth label="Write to merged file" inputRowSetVariable="${line}" outputFile=" /tmp/merged.txt" whenFileExists="append" includeHeadings="false" recordDelimiter="LF" version="1.0" disabled="true" />


					<exec label="bash: echo to merged file" executable="/bin/bash" version="1.0">
						<arg value="-c" />
						<arg value="/bin/echo &apos;lineno ${lineno}, ${line[1]}&apos; >>  /tmp/merged.txt" />
					</exec>

				</if>
			</forEachLoop>
		</forLoop>
	</module>

</project>

Re: Can't write a single row from a RowSet to file

Posted: Thu Apr 10, 2014 5:29 pm
by Support_Jon
asmund,

By design, the Write Fixed Width task consumes the entire rowset (as passed in) and iterates through it to write to the specified output file. If you want to write out the rows of a rowset one by one, then you should use our Print Task instead. Just specify the output file and you should see a nice improvement in your performance.
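As a rough sketch of that suggestion applied to the loop in the first post, the exec call to bash could be swapped for a Print task that appends to the output file. This uses only attributes that appear elsewhere in this thread (`file` and `append` on print, `${system.carriageReturn}` from a later reply); the output path `/tmp/merged.txt` and the label text are assumptions, and whether Print adds its own line ending should be checked against the product documentation.

```xml
<forEachLoop label="Loop over file lines" itemsVariable="${input_file}" currentItemVariable="line" currentIterationVariable="lineno">

	<setVariable label="line_deleted = False" name="line_deleted" value="False" version="2.0" />

	<!-- Same exclusion rule as the original project: lines 2-5 and everything after line 41 -->
	<if label="If line number is one to be deleted" condition="${lineno == 2 or lineno == 3 or lineno == 4 or lineno == 5 or lineno > 41}">
		<setVariable label="line_deleted = True" name="line_deleted" value="True" version="2.0" />
	</if>

	<if label="else" condition="${line_deleted == False}">
		<!-- Print with file/append stands in for the exec call to /bin/bash; the path is an assumption -->
		<print label="Append line to merged file" file="/tmp/merged.txt" append="true" version="1.0">
			<![CDATA[${line[1]}${system.carriageReturn}]]>
		</print>
	</if>

</forEachLoop>
```

Because the Print task receives a single row per iteration, the line-number check runs before anything is written, which is what the writeFixedWidth attempt could not do.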

We have a new task coming in the next release of GoAnywhere Director that I think you will like. It is called ModifyRowset, and it lets you modify an existing rowset and perform data translation, manipulation, and filtering. With this task you will be able to easily exclude the rows you don't want and process only the ones you want to be included. GoAnywhere Director 4.6.0, which will include this new feature, is slated for release in May of this year.

Thanks - Jon

Re: Can't write a single row from a RowSet to file

Posted: Tue Sep 30, 2014 1:41 pm
by monahanks
Hi Jon,
I need to process a rowset one record at a time. I have added the Print task, but all I got in my output was
"com.linoma.dpa.tasks.converters.flatfile.FlatFileRowSet@77095432"
My XML looks like this:
Code:
<project name="Build_Control_xml_new" mainModule="Main" version="2.0">
	<variable name="File_In" value="DXLG_DEMO_20140930.txt" description="File name passed in from calling project " />
	<variable name="defaultfront" value="casua2_ibe" />
	<variable name="defaultback" value="_in_.ctl" />
	<variable name="Control_Name" value="" />

	<module name="Main">

		<createWorkspace version="1.0" />

		<if label="If_filename_DEMO" condition="${Substring(File_In, 6,4) eq &apos;DEMO&apos;}">

			<setVariable label="Set control file name" name="Control_Name" value="${defaultfront}M${defaultback}" version="2.0" />

		</if>
		<if label="If_filename_FULL" condition="${Substring(File_In, 6,4) eq &apos;FULL&apos;}">

			<setVariable label="Set control file name" name="Control_Name" value="${defaultfront}Y${defaultback}" version="2.0" />

		</if>
		<if label="If_filename_NCOA" condition="${Substring(File_In, 6,4) eq &apos;NCOA&apos;}">

			<setVariable label="Set control file name" name="Control_Name" value="${defaultfront}Q${defaultback}" version="2.0" />

		</if>
		<!--D:\GoAnywhere\userdata\projects\Acxiom_ctl.txt-->

		<readFlatFile label="Read file" inputFile="resource:smb://CMRG_fs_Shared/FTPDATA/Acxiom_ctl.txt" outputRowSetVariable="lineread" recordDelimiter="CR" version="1.0" logLevel="debug" disabled="false" />

		<print label="Write Control file" file="${system.job.workspace}\${Control_Name}" append="true" version="1.0">
			<![CDATA[${lineread}]]>
		</print>

		<writeFixedWidth label="Write Control File" inputRowSetVariable="${lineread}" outputFile="${system.job.workspace}\${Control_Name}" whenFileExists="append" includeHeadings="false" version="1.0" disabled="true" />
		<deleteWorkspace version="1.0" disabled="true" />
	</module>
</project>



The input file is a text document with 8 lines, and I want to be able to modify (append a variable into) a couple of the lines as I write the output file.
We are running version 4.6.1.

Re: Can't write a single row from a RowSet to file

Posted: Tue Sep 30, 2014 4:38 pm
by Support_Rick
Monahanks,

You need to put your OutputRowsetVariable (lineread) into a ForEach loop so that you can get access to the RowSet data by record. In pseudo terms...
Code:
Main
  If...
  If...
  If...

  Read <MyFile> outputRowSetVariable="lineread"
  forEach itemsVariable="${lineread}" currentItemVariable="line"

    <!-- Process your Record here using ${line[1]} -->

    Print file="${Control_Name}" append="true"
      <![CDATA[${line[1]}${system.carriageReturn}]]>
    /Print

  /forEach

  deleteWorkspace
/Main
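
For reference, the outline above might translate into project XML along these lines. It is a sketch only, built from elements already shown in this thread; the `outline` variable, the `lineno` counter, and the rule "append `${File_In}` to record 3" are hypothetical stand-ins for whatever modification is actually needed.

```xml
<module name="Main">

	<!-- createWorkspace and the three <if> tasks that set Control_Name stay as in the original project -->

	<readFlatFile label="Read file" inputFile="resource:smb://CMRG_fs_Shared/FTPDATA/Acxiom_ctl.txt" outputRowSetVariable="lineread" recordDelimiter="CR" version="1.0" />

	<forEachLoop label="Loop over records" itemsVariable="${lineread}" currentItemVariable="line" currentIterationVariable="lineno">

		<!-- Start from the unmodified record text -->
		<setVariable label="Set outline" name="outline" value="${line[1]}" version="2.0" />

		<!-- Hypothetical rule: append File_In to record 3 only -->
		<if label="If record 3" condition="${lineno == 3}">
			<setVariable label="Append file name" name="outline" value="${line[1]}${File_In}" version="2.0" />
		</if>

		<!-- Write one record per iteration instead of printing the whole rowset object -->
		<print label="Write Control file" file="${system.job.workspace}\${Control_Name}" append="true" version="1.0">
			<![CDATA[${outline}${system.carriageReturn}]]>
		</print>

	</forEachLoop>

</module>
```

Printing `${lineread}` directly is what produced the `FlatFileRowSet@...` text; inside the loop, `${line[1]}` refers to the first column of the current record.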

Re: Can't write a single row from a RowSet to file

Posted: Thu Feb 02, 2017 5:19 am
by falak
Read <MyFile> outputRowSetVariable="lineread"
forEach itemsVariable="${lineread}" currentItemVariable="line"

<!-- Process your Record here using ${line[1]} -->

Print file="${Control_Name}" append="true"
<![CDATA[${line[1]}${system.carriageReturn}]]>
/Print

/forEach

What if we use an "insert into" statement instead of Print inside the forEach loop, with 400 columns and 1000 rows?

Re: Can't write a single row from a RowSet to file

Posted: Thu Feb 02, 2017 9:00 am
by Support_Rick
Searching for "Insert Into" in the search field above, you'll find the following link...

How to Insert data into a Database

You should find what you're looking for there.