I have to select records from a file which has duplicates. The file layout is below:
Client no ID No Account no
===========================
123344 1 0123344
234455 1 0123344
324521 1 0123344
234531 2 1236789
456879 3 1236789
784322 3 1236789
The file is sorted based on Account no.
For the same Account no and ID no, I have to select the records.
As in the example for Account no '0123344' and ID no '1' there are 3 rows, so this should be written to the file. But in case of Account no '1236789' the ID no changes as '2' and '3' so this should not be written to output.
Only when the ID no is same with an Account no it should be written to output file. If the ID no changes within the same Account no it should not be written to output.
Please let me know if this be achieved with DFSORT. I will be grateful for your ideas.
Joined: 10 May 2007 Posts: 2455 Location: Hampshire, UK
I am not clear about what your output is like - is it one example of the duplicates of an account or is it all duplicates of the account or is it ALL records for an account with duplicates? I.E for account 0123344 do you want 1, 2 or 3 records written? If 1, which one? If 2, which one is not to be written?
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Murali,
So basically are you saying that you only want to output the records for each account number that has only a single ID (e.g. every ID is 1 for that account number, or every ID is 2 for that account number, etc)?
What is the RECFM and LRECL of the input file?
What is the starting position, length and format of each field?
Is the ID always already in ascending order by account number? Or could you have IDs out of order, for example:
Yes, you are correct. I want to write the records to output for the same Account number that has single ID.
RECFM=FB, LRECL=80.
The fields start position,length and format :
CLIENT NO --> 3,9, Numeric
ID NO --> 29,7,Alphanumeric
ACC NO --> 38,14,Numeric
WIP NO --> 60,9,Numeric
The input file is sorted based on Account number, ID No. So within the same account number, the ID NO will be in sorted order.
In the input, for the Account number '00000000001329' as the ID No changes. This Account number records should not be written to output and remaining records should be written.
Almost I was able to understand the job provided by you.But could you please explain me 'SUM FIELDS=(81,3,ZD)'.
I ran the job without SUM FIELDS.For each record in input, there were two records written to the output file.
SUM Fields example :
--------------------
SORT FIELDS=(1,5,CH,A)
SUM FIELDS=(11,4,ZD)
This will sort the fields 1,5 in Ascending order and then If there are any duplicates in 1,5 it will SUM the values in 11,4. Please correct me if I am wrong.
As per the SUM fields in the job, position 81,3 will have '000' since overlayed,when 81,3 and 84,3 are equal.Please let me know how SUM FIELDS=(81,3,ZD) identifies the duplicates and eliminates it without using any SORT fields.
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
murali3955 wrote:
As per the SUM fields in the job, position 81,3 will have '000' since overlayed,when 81,3 and 84,3 are equal.Please let me know how SUM FIELDS=(81,3,ZD) identifies the duplicates and eliminates it without using any SORT fields.
Thanks,
Murali
murali3955,
You need to look at the JNF2JMSG output and you will see that it used the following SORT statement
Code:
SORT FORMAT=BI,FIELDS=(38,14,A)
for a better understanding of the job run the following steps individually and check out the contents in pos 81 and 85.