Raramorph: The Ruby Arabic Morphological Analyzer


We have released Raramorph, a Ruby 1.9 gem that ports AraMorph, the Java Arabic morphological analyzer by Pierrick Brihaye, which is itself based on the Buckwalter Arabic Morphological Analyzer Version 1.0. This first release of Raramorph is a full morphological analysis tool, providing both an executable and basic developer tools that will be extended in the next version.

Thanks to algorithmic and data-structure optimizations, Raramorph loads the dictionaries in 4-5 seconds on average, compared with 10-11 seconds in the Java implementation, where dictionary loading was the main bottleneck. Parsing a medium-sized file and writing the output file takes 2-3 seconds on average.

Installing Raramorph:

gem install raramorph

(Note that Raramorph requires Ruby 1.9.)

Usage, for the executable:

raramorph input_file_name output_file_name  -v -a
  -v verbose mode ( optional )
  -a arabic output ( optional )
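If you want to drive the executable from a Ruby script, the command line above can be assembled programmatically. The following is only a sketch: `raramorph_command` is a hypothetical helper, not part of the gem, and it relies solely on the two flags listed above.

```ruby
# Hypothetical helper (not part of the raramorph gem): builds the argument
# list for the raramorph executable from the flags documented above.
# An options hash is used instead of keyword arguments to stay Ruby 1.9 friendly.
def raramorph_command(input, output, options = {})
  cmd = ['raramorph', input, output]
  cmd << '-v' if options[:verbose] # verbose mode (optional)
  cmd << '-a' if options[:arabic]  # Arabic output (optional)
  cmd
end

# To actually run the analyzer (requires the gem to be installed):
#   system(*raramorph_command('input.txt', 'output.txt', :verbose => true))
```

Passing the arguments to `system` as separate strings avoids shell quoting problems with file names that contain spaces.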

In code:

require 'raramorph'

# Analyze a file
Raramorph.execute(input_filename, output_filename,
                  verbose = false, not_arabic = true)

Raramorph also provides static methods such as:

  • analyze_token
  • tokenize
  • segement_word
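To give a feel for what word segmentation means in a Buckwalter-style analyzer, here is a toy sketch. It is not Raramorph's actual code: the method names and the three tiny lexicons are hypothetical, the words are in Buckwalter transliteration, and the length limits (prefix up to 4 characters, suffix up to 6) reflect our understanding of the Buckwalter approach. The real analyzer also checks prefix/stem/suffix compatibility tables, which this sketch leaves out.

```ruby
# Toy lexicons in Buckwalter transliteration (hypothetical; the real
# dictionaries are far larger and carry morphological categories).
PREFIXES = { '' => true, 'w' => true, 'Al' => true, 'wAl' => true }
STEMS    = { 'ktb' => true, 'ktAb' => true }
SUFFIXES = { '' => true, 'At' => true, 'p' => true }

MAX_PREFIX = 4 # assumed maximum prefix length
MAX_SUFFIX = 6 # assumed maximum suffix length

# Enumerate every prefix/stem/suffix split with a non-empty stem.
def candidate_segmentations(word)
  results = []
  max_p = [MAX_PREFIX, word.length - 1].min
  (0..max_p).each do |p_len|
    max_s = [MAX_SUFFIX, word.length - p_len - 1].min
    (0..max_s).each do |s_len|
      stem_len = word.length - p_len - s_len
      results << [word[0, p_len],
                  word[p_len, stem_len],
                  word[p_len + stem_len, s_len]]
    end
  end
  results
end

# Keep only the splits whose three parts all appear in the toy lexicons.
def known_segmentations(word)
  candidate_segmentations(word).select do |prefix, stem, suffix|
    PREFIXES.key?(prefix) && STEMS.key?(stem) && SUFFIXES.key?(suffix)
  end
end

p known_segmentations('wAlktAb')
```

With these toy lexicons, the only split of 'wAlktAb' that survives the lookup is the prefix 'wAl', the stem 'ktAb', and an empty suffix.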

 

In the next release, the developer tools will be more stable and will include more features for analysis and for finding morphological solutions, which can be used in search and mining engines.

A Raramorph-Ferret adapter will also be released soon to support morphological analysis in search engines.

The Raramorph source code is available at:

http://github.com/espace/raramorph/tree/master

 

The Raramorph documentation is available here.

Written By:

Moustafa A. Emara ( moustafaemara.wordpress.com )

Comments

1

I would like to see a detailed analysis. What are the tradeoffs of using Raramorph from a performance point of view?
Large repositories may be dramatically affected by even a one-second slowdown in the parsing steps.

2

@Fizous: A detailed analysis will be posted soon, with benchmarks and a comparison between Raramorph and the older Java implementation.

3

Hi, guys,

I appreciate this new port of AraMorph that Tim Buckwalter developed originally. I am however concerned about the license under which you publish your software.

Your work is derived from the AraMorph implementation and in fact re-uses and re-distributes the original lexicon files. As the original work is published under the GNU General Public License, you are required to publish your work also under this very license. You are not free to publish it under the MIT license or any other one.

Please, read and follow the license and the guidelines to it, available at http://www.gnu.org/licenses/old-licenses/gpl-2.0.html.

It is important that the community of developers respect the open-source licenses!

Thank you very much for correcting this issue immediately.

Best wishes,

Otakar

4

Hi Otakar, point taken, we will review this and update the license accordingly.

Thanks for the note

5

Thanks, Otakar. We have updated the license file.
