Can't write a single row from a RowSet to file


asmund

Posts: 2
Joined: Mon Mar 24, 2014 8:25 am

Post by asmund » Mon Mar 24, 2014 9:45 am
I'm trying to download a bunch of files, remove some lines from each file, and save them all in one file.

The code below works. It appends each line to the output file by calling echo through bash, but it is extremely slow, on the order of one second per line. I tried the writeFixedWidth task, which is disabled in the code below, but it didn't work as expected: it consumed the entire rowset on the first iteration, so my line-number check never took effect and the lines I wanted to exclude ended up in the file.

Is this a bug or a feature? How can I do this within Director?

I'm using Director version 4.1.1.

Code:
<project name="Geosat solar F10.7 flux" mainModule="Main" version="2.0">
	<description>Download Geosat F10.7 solar flux data</description>

	<module name="Main">

		<ftp label="FTP to NGDC" resourceId="ftp.ngdc.noaa.gov" version="1.0" disabled="false">
			<get label="Get files" destinationDir="/tmp" whenFileExists="overwrite" destinationFilesVariable="downloaded_files">
				<fileset dir="/STP/space-weather/solar-data/solar-features/solar-radio/noontime-flux/penticton/penticton_observed/tables/">
					<wildcardFilter>
						<include pattern="drao_noontime-flux-observed_199?.txt" caseSensitive="false" />
						<include pattern="drao_noontime-flux-observed_20*.txt" caseSensitive="false" />
					</wildcardFilter>
				</fileset>
			</get>
		</ftp>


		<print label="print ${downloaded_files}" version="1.0" disabled="true">
			<![CDATA[Downloaded:
${downloaded_files}]]>
		</print>


		<rename label="Move old merged file out of the way" inputFile="/tmp/merged.txt" newName="merged 2.txt" whenFileExists="rename" version="1.0" executeOnlyIf="${FileInfo(&apos;/tmp/merged.txt&apos;):exists}" />

		<forLoop label="Loop over years" beginIndex="1992" endIndex="2001" step="1" currentIndexVariable="year" disabled="false">

			<readFlatFile label="Read a F10.7 file" outputRowSetVariable="input_file" recordDelimiter="LF" processedInputFilesVariable="filename" version="1.0">
				<fileset dir="/tmp">
					<wildcardFilter>
						<include pattern="*${year}.txt" />
					</wildcardFilter>
				</fileset>
			</readFlatFile>


			<print label="Print processing file" version="1.0">
				<![CDATA[Processing ${filename}]]>
			</print>

			<forEachLoop label="Loop over file lines" itemsVariable="${input_file}" currentItemVariable="line" currentIterationVariable="lineno">

				<print label="print lineno" version="1.0">
					<![CDATA[lineno ${lineno}]]>
				</print>


				<setVariable label="line_deleted = False" name="line_deleted" value="False" version="2.0" />

				<if label="If line number is one to be deleted" condition="${lineno == 2 or lineno == 3 or lineno == 4 or lineno == 5 or lineno > 41}">

					<print label="print deleted line number" version="1.0">
						<![CDATA[deleted line ${lineno}]]>
					</print>


					<setVariable label="line_deleted = True" name="line_deleted" value="True" version="2.0" />

				</if>
				<if label="else" condition="${line_deleted == False}">

					<setVariable label="set linetext" name="linetext" value="${line[1]}" version="2.0" disabled="true" />


					<print label="print line" version="1.0" disabled="false">
						<![CDATA[using ${lineno}, ${line[1]}
line: ${line}]]>
					</print>


					<writeFixedWidth label="Write to merged file" inputRowSetVariable="${line}" outputFile="/tmp/merged.txt" whenFileExists="append" includeHeadings="false" recordDelimiter="LF" version="1.0" disabled="true" />


					<exec label="bash: echo to merged file" executable="/bin/bash" version="1.0">
						<arg value="-c" />
						<arg value="/bin/echo &apos;lineno ${lineno}, ${line[1]}&apos; >>  /tmp/merged.txt" />
					</exec>

				</if>
			</forEachLoop>
		</forLoop>
	</module>

</project>

Support_Jon

Support Specialist
Posts: 62
Joined: Thu Jul 19, 2012 9:15 am
Location: Ashland, NE

Post by Support_Jon » Thu Apr 10, 2014 5:29 pm
asmund,

By design, the Write Fixed Width task consumes the entire rowset (as passed in) and iterates through it to write to the specified output file. If you want to write out the rows of a rowset one by one, then you should use our Print Task instead. Just specify the output file and you should see a nice improvement in your performance.
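
As a rough sketch of that suggestion (not a tested project), the inner loop from the original post could use a Print task with an output file in place of the exec/bash call. The `file` and `append` attributes and the `${system.carriageReturn}` variable are taken from other posts in this thread; whether that variable produces the LF ending wanted here, and whether Print adds a line ending on its own, are assumptions to verify on your installation.

```xml
<forEachLoop label="Loop over file lines" itemsVariable="${input_file}" currentItemVariable="line" currentIterationVariable="lineno">

	<setVariable label="line_deleted = False" name="line_deleted" value="False" version="2.0" />

	<if label="If line number is one to be deleted" condition="${lineno == 2 or lineno == 3 or lineno == 4 or lineno == 5 or lineno > 41}">
		<setVariable label="line_deleted = True" name="line_deleted" value="True" version="2.0" />
	</if>

	<if label="else" condition="${line_deleted == False}">
		<!-- Print to a file instead of spawning bash once per line -->
		<print label="Append line to merged file" file="/tmp/merged.txt" append="true" version="1.0">
			<![CDATA[${line[1]}${system.carriageReturn}]]>
		</print>
	</if>

</forEachLoop>
```

Because the Print task only sees the current row, the line-number check still decides which rows reach the file, which is the behavior the Write Fixed Width task could not provide when handed the rowset.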

We have a new task coming out with our next release of GoAnywhere Director that I think you will like. It is called ModifyRowset, and it provides a means to modify an existing rowset and perform data translation, manipulation, and filtering. With this task you will be able to easily exclude the rows you don't want and process just the ones you want to be included. Our GoAnywhere Director 4.6.0 release, which will include this new feature, is slated to be released in May of this year.

Thanks - Jon

monahanks

Posts: 41
Joined: Wed Mar 30, 2011 10:19 am

Post by monahanks » Tue Sep 30, 2014 1:41 pm
Hi Jon,
I need to process a rowset one record at a time. I have added the Print task, but all I got in my output was
"com.linoma.dpa.tasks.converters.flatfile.FlatFileRowSet@77095432"
My xml looks like this:
Code:
<project name="Build_Control_xml_new" mainModule="Main" version="2.0">
	<variable name="File_In" value="DXLG_DEMO_20140930.txt" description="File name passed in from calling project " />
	<variable name="defaultfront" value="casua2_ibe" />
	<variable name="defaultback" value="_in_.ctl" />
	<variable name="Control_Name" value="" />

	<module name="Main">

		<createWorkspace version="1.0" />

		<if label="If_filename_DEMO" condition="${Substring(File_In, 6,4) eq &apos;DEMO&apos;}">

			<setVariable label="Set control file name" name="Control_Name" value="${defaultfront}M${defaultback}" version="2.0" />

		</if>
		<if label="If_filename_FULL" condition="${Substring(File_In, 6,4) eq &apos;FULL&apos;}">

			<setVariable label="Set control file name" name="Control_Name" value="${defaultfront}Y${defaultback}" version="2.0" />

		</if>
		<if label="If_filename_NCOA" condition="${Substring(File_In, 6,4) eq &apos;NCOA&apos;}">

			<setVariable label="Set control file name" name="Control_Name" value="${defaultfront}Q${defaultback}" version="2.0" />

		</if>
		<!--D:\GoAnywhere\userdata\projects\Acxiom_ctl.txt-->

		<readFlatFile label="Read file" inputFile="resource:smb://CMRG_fs_Shared/FTPDATA/Acxiom_ctl.txt" outputRowSetVariable="lineread" recordDelimiter="CR" version="1.0" logLevel="debug" disabled="false" />

		<print label="Write Control file" file="${system.job.workspace}\${Control_Name}" append="true" version="1.0">
			<![CDATA[${lineread}]]>
		</print>

		<writeFixedWidth label="Write Control File" inputRowSetVariable="${lineread}" outputFile="${system.job.workspace}\${Control_Name}" whenFileExists="append" includeHeadings="false" version="1.0" disabled="true" />
		<deleteWorkspace version="1.0" disabled="true" />
	</module>
</project>



The input file is a text document with 8 lines, and I want to be able to modify (append a variable into) a couple of the lines as I write the output file.
We are running version 4.6.1

Support_Rick

Support Specialist
Posts: 590
Joined: Tue Jul 17, 2012 2:12 pm
Location: Phoenix, AZ

Post by Support_Rick » Tue Sep 30, 2014 4:38 pm
Monahanks,

You need to put your OutputRowsetVariable (lineread) into a ForEach loop so that you can get access to the RowSet data by record. In pseudo terms...
Code:
Main
  If...
  If...
  If...

  Read <MyFile> outputRowSetVariable="lineread"
  forEach itemsVariable="${lineread}" currentItemVariable="line"

    <!-- Process your Record here using ${line[1]} -->

    Print file="${Control_Name}" append="true"
      <![CDATA[${line[1]}${system.carriageReturn}]]>
    /Print

  /forEach

  deleteWorkspace
/Main
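
In project XML, the pseudocode above might look roughly like the fragment below. This is an untested sketch built only from elements that appear elsewhere in this thread; the choice of line 3 as the line to modify, the labels, and appending `${File_In}` to it are hypothetical and stand in for whatever change is actually needed.

```xml
<readFlatFile label="Read file" inputFile="resource:smb://CMRG_fs_Shared/FTPDATA/Acxiom_ctl.txt" outputRowSetVariable="lineread" recordDelimiter="CR" version="1.0" />

<forEachLoop label="Loop over control lines" itemsVariable="${lineread}" currentItemVariable="line" currentIterationVariable="lineno">

	<!-- Hypothetical: line 3 gets the input file name appended -->
	<if label="Modified line" condition="${lineno == 3}">
		<print label="Write modified line" file="${system.job.workspace}\${Control_Name}" append="true" version="1.0">
			<![CDATA[${line[1]}${File_In}${system.carriageReturn}]]>
		</print>
	</if>

	<!-- All other lines are copied through unchanged -->
	<if label="Unmodified line" condition="${lineno == 1 or lineno == 2 or lineno > 3}">
		<print label="Write line as read" file="${system.job.workspace}\${Control_Name}" append="true" version="1.0">
			<![CDATA[${line[1]}${system.carriageReturn}]]>
		</print>
	</if>

</forEachLoop>
```

Referencing `${line[1]}` rather than `${lineread}` is what avoids the `FlatFileRowSet@...` output: the first prints the first column of the current row, while the second prints the rowset object itself.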
Rick Elliott
Lead Solutions Consultant
(402) 944.4242
(800) 949-4696

falak

Posts: 8
Joined: Tue Dec 27, 2016 3:03 am

Post by falak » Thu Feb 02, 2017 5:19 am
Read <MyFile> outputRowSetVariable="lineread"
forEach itemsVariable="${lineread}" currentItemVariable="line"

<!-- Process your Record here using ${line[1]} -->

Print file="${Control_Name}" append="true"
<![CDATA[${line[1]}${system.carriageReturn}]]>
/Print

/forEach

What if we use an "insert into" statement instead of Print inside the forEach loop, with 400 columns and 1000 rows?

Support_Rick

Support Specialist
Posts: 590
Joined: Tue Jul 17, 2012 2:12 pm
Location: Phoenix, AZ

Post by Support_Rick » Thu Feb 02, 2017 9:00 am
Searching for "Insert Into" in the search field above, you'll find the following link...

How to Insert data into a Database

You should find what you're looking for there.