Thursday, November 20, 2014

Menhir



This blog post about  -  Menhir.


According to Wikipedia:

A menhir (French, from Middle Breton: men, "stone" and hir, "long"[1]), standing stone, orthostat, or lith is a large upright standing stone.
Coincidently, Menhir, is also the name for LR(1) parser generator for OCaml.
 
I followed the recommendation in the Real World OCaml to use it, rather than ocamlyacc:
"Menhir is an alternative parser generator that is generally superior to the venerable ocamlyacc, which dates back quite a few years. Menhir is mostly compatible with ocamlyacc grammars, and so you can usually just switch to Menhir and expect older code to work (with some minor differences described in the Menhir manual)

I found a few sources online that help you understand Menhir but it took me some time to get my head around it.

This blog (and this post in particular) is mainly for me to record my activities and a way to understand things better. Nevertheless, I hope that by going through and discussing the code I've written, will shorten the learning curve for some of you -  or the very least entertain you :)

On y vas! For my purposes, I started by parsing a simple n-tuple string, for the MoanaML code.
(?x,hasColor,?y,context)
Following the instructions in the Real World OCaml I knew that I had to create two files, a namely, Parser.mly and Lexer.mll

The parser file is used to construct and parse the grammar. You can define tokens and describe their required sequences.

For example, for the Moana tuple, I defined the following tokens:
%token STRING
%token VAR
%token LEFT_BRACE
%token RIGHT_BRACE
%token START
%token END
%token COMMA
%token EOF
 I the used them to define the required parsing sequence:
LEFT_BRACE; s = elem; COMMA; p = elem; COMMA; o = elem; COMMA; c  = elem; RIGHT_BRACE 
elem:
 | v = VAR {Variable v}
 | c = STRING {Constant c} 

The elem is there to differentiate constants and variables and consequently pass the parsed value into a relevant type constructor.

You also define the parsing function and its return type
 start < config .tuple > parse
Now once we have the parser, we can move to the lexer file.  Here we define rules using regular expressions in order to match, capture and convert strings into the previously defined tokens.

In my case:

rule lex = parse
  | [' ' '\t' '\n']      { lex lexbuf }
  | newline         { next_line lexbuf; lex lexbuf }
  | ","                 { COMMA }
  | "("                 { LEFT_BRACE }
  | "{"                {START}
  | eof                {EOF }

To do the actual parsing, we use:
Parser.parse_tuple Lexer.lex (Lexing.from_string s)
the Real World OCaml: helps us understand what's happening
[Lexing.from_string] function is used to construct a lexbuf [from a string], which is passed with the lexing function [Lexer.lex] to the [Parser.parse] functions.
That's all, folks!

Pitfalls.

I was expecting Menhir to provide me with nice exceptions to debug my code, as was promised in the Real World OCaml:
The biggest advantage of Menhir is that its error messages are generally more human-comprehensible, ...
but it didn't. At least, I couldn't find the way to invoke it.
In any case, the book does provide you with some code to make the debugging a bit easier.

Anyway, online regular expression editor is your friend - I used http://www.regexr.com. Use it to test your regular expression and adjust the lexer accordingly.

One more thing, the order of your lexer rules matters!

Finally, in addition to the the Real World OCaml book chapter, I also found very useful this example and this Mini Ocaml tutorial.

That's about it, very simple and elegant once you get your head around it.

Disclaimer: As I mentioned in the beginning, I am a newbie, so please let me know if I got some of the stuff wrong or there is a more efficient way of doing things. I am more than happy to hear from you guys!

Thursday, November 13, 2014

"Good news, everyone!" - OCaml.org teaching page is live!


I am helping out with a great initiative to create a Ocaml teaching page containing resources for anyone who would like to teach Ocaml.
 
The initial page is now up, however the work is far from over.

We would like to open it to the community to contribute any additional resources that we might have missed as well as providing feedback on the currently listed resources.

The best way to do that is by editing this wiki page or editing the page itself and creating a pull request.

Mailing list for teachers

We also created a mailing list for the teachers to exchange their experiences and common practices.

Look forward to hearing from you!


Friday, October 31, 2014

Irmin Irmin on the wall, who is the pretiest of them all - Moana

I am currently working on MoanaML.

My recent challenge was to code a backend based on Mirage/Irmin.

Ok, first thing first, what is Moana:

Moana, an information-centric middleware for distributed services and applications. Moana offers a shared graph-based storage abstraction through which applications communicate with each other by appending and observing the shared graph.
Moana supports two basic operations of ADD and MAP; add allows an application to persistently extend the global graph, and map provides a dynamically maintained mapping of the global graph to an application specific, internal sub-graph.
MoanaML is an implementation of the Moana primitives in OCaml.

Now, what is Irmin:

Irmin is a distributed database with built-in snapshot, branch and revert mechanisms. It is designed to use a large variety of backends, although it is optimized for append-only store.
 
Just  to recap, at the moment, MoanaML contains:

Here I wanted to summaries some of my initial progress on developing the Irmin backend for Moana.

To code Irmin-based backed we need to implement two Moana signatures, namely STORE and GRAPH

I am a noobie when it comes to Irmin, I am still trying to figure out all of its functionalities so proceed with caution :)

I recommend reading this blog post to get a better understanding on Irmin, also I found this short paper very insightful. For better code understanding, make sure you familiarise yourself with the lwt library.

So to use Irmin, I chose to use blocks of type String, just because it is provided; to have different types of values you need to implement your own IrminContents signature. This, from my understanding, involves letting Irmin know how to merge your values.

Ultimately, I want to use Irmin to store Moana tuples, not strings, so I will have to implement the Contents signature. For now, I convert Moana tuples into JSON string and store them in Irmin.

Now to the code...


Following Irmin examples, I first implemented t_of_view  and view_to_t functions to convert list of tuples to view and take list of tuples from the view, respectively. I later use these functions to store the views inside Irmin.

The cool thing about Irmin that all  my data is stored in git.

Thanks to this:

module Git =
  IrminGit.FS(struct let root = Some "/tmp/irmin/moana"
                        let bare = true
                           end)
 
(* Values are Strings * *)
module Store = Git.Make(IrminKey.SHA1)(IrminContents.String)(IrminTag.String)

You can also visualise it!