IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

Hashing in PL/I


IBM Mainframe Forums -> PL/I & Assembler
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
Casper656

New User


Joined: 10 Mar 2014
Posts: 8
Location: INDIA

PostPosted: Mon Mar 10, 2014 2:58 pm
Reply with quote

Hi,

I am trying to create Hash Table in Pl/I. I want generate a hash value which will be served as index to an array. The input to the hash generating function is a unique alpha numeric string and the output hash I require is an integer.

ex. My input is FXXB320 and output should be a number , any number that uniquely represent FXXB320, collision is acceptable.

What are your suggestion on how to achieve this ?
Back to top
View user's profile Send private message
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

PostPosted: Mon Mar 10, 2014 3:14 pm
Reply with quote

where are You facing problems ...
the algorithm or the coding ???
Back to top
View user's profile Send private message
Akatsukami

Global Moderator


Joined: 03 Oct 2009
Posts: 1788
Location: Bloomington, IL

PostPosted: Mon Mar 10, 2014 3:15 pm
Reply with quote

If the input is no more than eight characters, have you considered interpreting UNSPEC(input) as UNSIGNED FIXED BIN?
Back to top
View user's profile Send private message
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

PostPosted: Mon Mar 10, 2014 3:18 pm
Reply with quote

and ...
possibly modulus against the table size...
but how are You going to handle the collisions
( sequential search for a free slot after the hashed one )

how many entries in the table ?
Back to top
View user's profile Send private message
steve-myers

Active Member


Joined: 30 Nov 2013
Posts: 917
Location: The Universe

PostPosted: Mon Mar 10, 2014 6:46 pm
Reply with quote

Mr. Sorrichetti's advice is good. I do this in Assembler In two steps:
Code:
         PACK  AWORD(5),KEY(9)
         L     1,AWORD
         SR    0,0
         D     0,HTABSIZE
         LR    1,0
         SLL   1,2
         LA    1,HASHTAB(1)


AWORD    DC    F'0',AL1(0)
KEY      DC    CL8'FXXB320',C' '
I don't know how to compact the key to 4 bytes in PL/I, though I'm sure some PL/I expert can chime in. I handle collisions by making each hash table entry the address of a chained data area; a collision just adds a new entry to the chain. The number of entries in the table (in HTABSIZE in my example) is important, but to some extent it depends on the number of elements the hash table is pointing to. I'm not saying the hash table should be large enough so that there is one entry for each possible key as Mr. Sorrichetti implies, but you don't want the alias chains to get very long.
Back to top
View user's profile Send private message
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

PostPosted: Mon Mar 10, 2014 7:02 pm
Reply with quote

since hashing/randomizing is NOT a beginner's thing

there is no reason for the TS not to write a small simulation with a parametrized table size to find out the best table dimension
and the longest sibling chain for the input key distribution
Back to top
View user's profile Send private message
steve-myers

Active Member


Joined: 30 Nov 2013
Posts: 917
Location: The Universe

PostPosted: Tue Mar 11, 2014 4:31 am
Reply with quote

Quote:
... hashing/randomizing is NOT a beginner's thing
Hard to know. Done properly it is much better than a search of a complete chain, and it doesn't have the size constraints implied by a binary search. Even poorly done it is very helpful.
Quote:
there is no reason for the TS not to write a small simulation with a parametrized [sic] table size to find out the best table dimension and the longest sibling chain for the input key distribution
I've done studies like this after the fact, mainly to see how well the hash arithmetic worked, but even when I did it I was never very happy with the results. I've come to the conclusion my hash tables tend to be too big.

Over the past couple of years once the table has filled I tend to build a monolithic chain from the hash table and sort the chain by the key for reporting purposes. The way I do it clears the hash table, so I can't easily do post-mortem studies.
Back to top
View user's profile Send private message
Casper656

New User


Joined: 10 Mar 2014
Posts: 8
Location: INDIA

PostPosted: Tue Mar 11, 2014 11:58 am
Reply with quote

Akatsukami wrote:
If the input is no more than eight characters, have you considered interpreting UNSPEC(input) as UNSIGNED FIXED BIN?


How do you suggest to do this. I am sorry, I am not a PL/I programmer. I am new to it. It would be helpful if you provide me an example.

Let me explain, I have a flat file. I need to read the input from the file and build Hash Table. So how should I read the character string in to FIXED BIN ?
Back to top
View user's profile Send private message
Casper656

New User


Joined: 10 Mar 2014
Posts: 8
Location: INDIA

PostPosted: Tue Mar 11, 2014 12:03 pm
Reply with quote

enrico-sorichetti wrote:
where are You facing problems ...
the algorithm or the coding ???


Problem is with coding. I am not flexible with PL/I.
Back to top
View user's profile Send private message
Nic Clouston

Global Moderator


Joined: 10 May 2007
Posts: 2455
Location: Hampshire, UK

PostPosted: Tue Mar 11, 2014 2:40 pm
Reply with quote

With the wonders of Google this may be of help.
Back to top
View user's profile Send private message
prino

Senior Member


Joined: 07 Feb 2009
Posts: 1306
Location: Vilnius, Lithuania

PostPosted: Wed Mar 12, 2014 2:27 am
Reply with quote

Nic Clouston wrote:
With the wonders of Google this may be of help.

I believe SuperC uses a similar approach, simplified it takes three characters at a time, adds a zero byte in front of them and treats them as a PL/I fixed bin (31) that is summed. Something like:

Code:
dcl 1 * union,
      2 f   fixed bin (31) unal,
      2 *,
        3 * char       (1) init (low(1)),
        3 c char       (3);

hash = 0;
do i = 1 to length(c) by 3;
  c    = substr(whatever, i, 3);
  hash = hash + f;
end;
Back to top
View user's profile Send private message
PeterHolland

Global Moderator


Joined: 27 Oct 2009
Posts: 2481
Location: Netherlands, Amstelveen

PostPosted: Wed Mar 12, 2014 4:03 pm
Reply with quote

There is kind of an example in :

Enterprise PL/I for z/OS
Programming Guide
SC27-1457-09
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> PL/I & Assembler

 


Similar Topics
Topic Forum Replies
No new posts Cobol component calls MHASH lib for h... COBOL Programming 2
Search our Forums:

Back to Top