I discovered that programming in Point-Free Style can be addictive! Today, I wrote two perverse functions using PFS before discovering I had spent almost two hours just two add two lines to my code!!!
Here's the one of the two:
foo = uncurry (.) . (map . second . drop . length &&& (filter . (flip $ flip isPrefixOf . snd)))
Sorry if it didn't fit in your screen. I used the name foo purposedly: can you imagine what this function does?
If you can't come up with a solution in less than 1 minute, the above code is a failure as it means nothing to you (i.e. its meaning is not apparent).
Try with the following version:
foo p [] = []
foo p ( (k,l):os ) | p `isPrefixOf` l = (k,take (length p) l) : foo p os
| otherwise = foo p os
Sorry for the bad layout.
It took two hours to me to figure out how and where to put all those . and $ in the first version but it was fun, just like winning a game.
On the other hand, I wrote the second version as I was writing this post. I think the reason why I write PFS code much much slower lies both in my lack of knowledge of PFS programming and my way of reasoning about programs.
Nevertheless, PFS is appealing to me because I'm used to write shell scripts to automate things on the systems I manage: PFS is all about composing and piping functions, mostly like you compose and pipe shell programs.
I wonder if I'll be able to guess the meaning of the PFS functions I wrote after some months: we'll see.
As a side note, I noticed that the following pattern is coming up rather often when coding in FPS:
goo = uncurry g . (foo &&& bar)
which means:
goo i = foo i `g` bar i
So, I wonder whether the following combinator would help to making FPS code more readable:
f `interspersedBy` g = uncurry g . f
So, the above snippet would become:
goo = (foo &&& bar) `interspersedBy` g
December 05, 2007
December 04, 2007
Arrows first encounter
UPDATE: My post appeared on Reddit's programming feed. Thank you! Also, my post's title should've been "Point-free programming style first encounter", but I'm too lazy to change it ^_^
I was really busy lately so no post around for a while. Surprisingly, I wasn't busy wrinting code but actually thinking about the problem: this is feasible in Haskell as the code is more concise and the typing system keeps you from doing stupid design mistakes (or at least make them appear earlier than in other programming language).
Now, done with the ritual Haskell-awe prologue, here my post for today.
Lately, I've been working to a program manipulating objects which turned out to be monomials, which I represented as a pair of (Double,[Int]), i.e.:
(4.0,[1,2,-1,-2]) -> 4.0 x(1,1) x(2,2) x(3,-1) x(4,-2)
where x(i,j) are unknowns indexed by a pair. Since I assume that the first component of the unknowns' pairs will be the sequence 1,2,3,... I only need to specify the second component. I know, it's kinda weird, but it has to do with my domain problem which is related to finite difference operators for PDE.
Now I want to define a "product" operator which works like this:
(k,[b,c,d]) * (h,[e,f,g]) = (k*h,[a+e, c+f, d+g])
I can define this like follows:
prod (k,ll) (h,rl) = (k*h, zipWith (+) ll rl)
This is perfectly correct and good as it's very expressive.
Now, I want to apply the same function monadically to list arguments, i.e.:
prod :: [(Double,[Int])] -> [(Double,[(Double,[Int])] -> [(Double,[Int])]
To do this, I'll use liftM2 which take a function of two arguments and applies it to every pair formed taking an element from the first list and an element from the second. So:
prodM = liftM2 prod
But I'll never use prod by itself, so let's put this in a lambda abstraction and rename prodM to prod:
prod = liftM2 $ \(k,ll) (h,rl) -> (k*h, zipWith (+) ll rl)
Again, this is correct but my Haskell soul is becoming more strict and such a formulation sounds bad to me (I could've felt perfectly right with some month ago). I don't like lambdas when used this way, I see as an intrusive element in the perfect world of function combinators/operators (and liftM2 is such a combinator). More, I had to specify that prod requires two tuples while it'd be neat if the compiler could infere this automatically: I see this as code replication issue.
The solution to this dilemma is using point-free style programming. I'll abuse arrow combinators a bit. Specifically, there's the (***) operator which takes two function f and g and applies them to a tuple:
(f *** g) (a,b) = (f a, g b)
What if I need to combine two tuples using two separate binary operators? I.e.:
(f ***' g) (a,b) (c,d) = (f a c, g b d)
Arrows module doesn't provides such a combinator but we could build one. First, if we combine f and g using (***), we get a pair of two functions as f and g are partially evaluated. What we need now is to interpose (***) between these two functions. How to do this? Simple, we just need to uncurry (***). So, now:
(f ***' g) = uncurry (***) . (f *** g)
and we're done as f ***' g returns a function taking two tuples and apply f and g separately to each component. This is good as we can rewrite prod as:
prod = liftM2 $ uncurry (***) . ((*) *** (zipWith (+)))
which is essentially a smart combination of functions. Notice that no arguments of prod has been specified: this is where point-free programming style really lies. Changing the way prod is written will change the type of its arguments, in single place, without requiring you to rewrite them in two places (i.e. the argument list).
I was really busy lately so no post around for a while. Surprisingly, I wasn't busy wrinting code but actually thinking about the problem: this is feasible in Haskell as the code is more concise and the typing system keeps you from doing stupid design mistakes (or at least make them appear earlier than in other programming language).
Now, done with the ritual Haskell-awe prologue, here my post for today.
Lately, I've been working to a program manipulating objects which turned out to be monomials, which I represented as a pair of (Double,[Int]), i.e.:
(4.0,[1,2,-1,-2]) -> 4.0 x(1,1) x(2,2) x(3,-1) x(4,-2)
where x(i,j) are unknowns indexed by a pair. Since I assume that the first component of the unknowns' pairs will be the sequence 1,2,3,... I only need to specify the second component. I know, it's kinda weird, but it has to do with my domain problem which is related to finite difference operators for PDE.
Now I want to define a "product" operator which works like this:
(k,[b,c,d]) * (h,[e,f,g]) = (k*h,[a+e, c+f, d+g])
I can define this like follows:
prod (k,ll) (h,rl) = (k*h, zipWith (+) ll rl)
This is perfectly correct and good as it's very expressive.
Now, I want to apply the same function monadically to list arguments, i.e.:
prod :: [(Double,[Int])] -> [(Double,[(Double,[Int])] -> [(Double,[Int])]
To do this, I'll use liftM2 which take a function of two arguments and applies it to every pair formed taking an element from the first list and an element from the second. So:
prodM = liftM2 prod
But I'll never use prod by itself, so let's put this in a lambda abstraction and rename prodM to prod:
prod = liftM2 $ \(k,ll) (h,rl) -> (k*h, zipWith (+) ll rl)
Again, this is correct but my Haskell soul is becoming more strict and such a formulation sounds bad to me (I could've felt perfectly right with some month ago). I don't like lambdas when used this way, I see as an intrusive element in the perfect world of function combinators/operators (and liftM2 is such a combinator). More, I had to specify that prod requires two tuples while it'd be neat if the compiler could infere this automatically: I see this as code replication issue.
The solution to this dilemma is using point-free style programming. I'll abuse arrow combinators a bit. Specifically, there's the (***) operator which takes two function f and g and applies them to a tuple:
(f *** g) (a,b) = (f a, g b)
What if I need to combine two tuples using two separate binary operators? I.e.:
(f ***' g) (a,b) (c,d) = (f a c, g b d)
Arrows module doesn't provides such a combinator but we could build one. First, if we combine f and g using (***), we get a pair of two functions as f and g are partially evaluated. What we need now is to interpose (***) between these two functions. How to do this? Simple, we just need to uncurry (***). So, now:
(f ***' g) = uncurry (***) . (f *** g)
and we're done as f ***' g returns a function taking two tuples and apply f and g separately to each component. This is good as we can rewrite prod as:
prod = liftM2 $ uncurry (***) . ((*) *** (zipWith (+)))
which is essentially a smart combination of functions. Notice that no arguments of prod has been specified: this is where point-free programming style really lies. Changing the way prod is written will change the type of its arguments, in single place, without requiring you to rewrite them in two places (i.e. the argument list).
November 14, 2007
Random numbers (where it all started)
At first, Haskell seemed a reasonable language... until Monads appeared! Until then, purity was a clear plus of Haskell but... what about a random number generator (RNG)?
A pure function is deterministic: it depends only on its inputs and, as computers are deterministic as well, its output must be the same given the same input data.
Conversely, an RNG function (at least as it's known in the imperative world) must be non-deterministic and return a different outcome whenever it's called (without arguments).
So, how to code an RNG function in Haskell? Enter Monads.
Haskell can't give up on purity, so the solution is simple: instead of returning a value, an RNG function returns a computation.
A computation is like a machine with some pipes going in. By itself, the machine doesn't do anything. But, if you run it, providing fuel through the pipes, it will eventually produce something.
The machine is pure, while the fuel is impurity (or side-effects). The fuel comes from the environment under which the machine is run.
So, an RNG function is actually a machine which, when run under the special enviroment of the IO Monad, provides a number. More, you can build a special RNG machine (i.e. write an RNG computation) producing an infinite list of random numbers: just fill in sufficient fuel until more random numbers are needed!
In this way, Haskell let's you stay on the pure side when building the computation, letting you do all sort of things a computer theorist would do to a program (applying transformations, proving properties...), and getting you dirty when running your computation.
Sometimes it's hard to catch up with the meaning of Monads at first because the first Monad you encounter is the IO Monad, which is implicitly run by the Haskell run-time environment. So, you lose the concept that a Monad computation must be run to produce something useful.
For instance, the State Monad has runState function. But, as I said in one of my previous post, IO has its own run function: unsafePerformIO, from Foreign module.
So, let's see how to build an RNG computation:
randomList r = do g <- getStdGen
return $ randomRs r g
randomList takes a pair representing a range of numbers and return a computation that, when run, provides an infinite list of numbers. Indeed, randomList's type is:
Num a => (a,a) -> IO [a]
Specifically, when you state:
g <- getStdGen
you are saying something like: "g is a pipe. When you run this computation, please provide a StdGen value here". randomRs is a plain pure function taking a StdGen (which is basically a seed with a next function iterated to provide an infinite list of random number from it) and the range to which random numbers must be belong to.
How do we run this machine? Simple: put unsafePerformIO in front of it.
randomList r = unsafePerformIO $
do g <- getStdGen
return $ randomRs r g
Now randomList produces an infinite list of random numbers because it'll run the computation inside it.
But, beware! randomList is now and impure function masked as a pure function. Haskell could tricked by this masking and make the wrong thing misled by the purity assumption.
So, I promised myself not to write another Monad tutorial, but I did, at last!
A pure function is deterministic: it depends only on its inputs and, as computers are deterministic as well, its output must be the same given the same input data.
Conversely, an RNG function (at least as it's known in the imperative world) must be non-deterministic and return a different outcome whenever it's called (without arguments).
So, how to code an RNG function in Haskell? Enter Monads.
Haskell can't give up on purity, so the solution is simple: instead of returning a value, an RNG function returns a computation.
A computation is like a machine with some pipes going in. By itself, the machine doesn't do anything. But, if you run it, providing fuel through the pipes, it will eventually produce something.
The machine is pure, while the fuel is impurity (or side-effects). The fuel comes from the environment under which the machine is run.
So, an RNG function is actually a machine which, when run under the special enviroment of the IO Monad, provides a number. More, you can build a special RNG machine (i.e. write an RNG computation) producing an infinite list of random numbers: just fill in sufficient fuel until more random numbers are needed!
In this way, Haskell let's you stay on the pure side when building the computation, letting you do all sort of things a computer theorist would do to a program (applying transformations, proving properties...), and getting you dirty when running your computation.
Sometimes it's hard to catch up with the meaning of Monads at first because the first Monad you encounter is the IO Monad, which is implicitly run by the Haskell run-time environment. So, you lose the concept that a Monad computation must be run to produce something useful.
For instance, the State Monad has runState function. But, as I said in one of my previous post, IO has its own run function: unsafePerformIO, from Foreign module.
So, let's see how to build an RNG computation:
randomList r = do g <- getStdGen
return $ randomRs r g
randomList takes a pair representing a range of numbers and return a computation that, when run, provides an infinite list of numbers. Indeed, randomList's type is:
Num a => (a,a) -> IO [a]
Specifically, when you state:
g <- getStdGen
you are saying something like: "g is a pipe. When you run this computation, please provide a StdGen value here". randomRs is a plain pure function taking a StdGen (which is basically a seed with a next function iterated to provide an infinite list of random number from it) and the range to which random numbers must be belong to.
How do we run this machine? Simple: put unsafePerformIO in front of it.
randomList r = unsafePerformIO $
do g <- getStdGen
return $ randomRs r g
Now randomList produces an infinite list of random numbers because it'll run the computation inside it.
But, beware! randomList is now and impure function masked as a pure function. Haskell could tricked by this masking and make the wrong thing misled by the purity assumption.
So, I promised myself not to write another Monad tutorial, but I did, at last!
November 07, 2007
Again a post about Haskell performances
I've found an interesting article describing how to make an Haskell program run as fast as the OCaml version, which is often regarded as the workhorse among FP languages. Three points grabbed my attention the most:
- ghc 6.8 seems to boost the performances of Haskell programs: in some cases the benchmarks show programs run twice as faster as those compiled with ghc 6.6.
- Faster Haskell programs are written similarly to OCaml version.
- OCamls versions are no longer than their Haskell versions.
November 05, 2007
"Making Haskell programs faster and smaller"
November 02, 2007
October 31, 2007
The typical Haskell programmer's customs
When learning a new programming language, it's very important to learn the best practices arising from the community. For example, in Java you learn pretty soon to name your stuff in CamelCase: you'll make your code more readable by others and, at the same time, you're likely to avoid the mistakes others did before.
Haskell is no different and I found a good article by Bryan O’Sullivan, somewhat related to this.
Haskell is no different and I found a good article by Bryan O’Sullivan, somewhat related to this.
October 25, 2007
Bye bye, Ubuntu
A couple of years ago, I started using a new and (at that time) unknown GNU/Linux dsitribution. It was called Ubuntu Warty 4.10. The things I liked in that distro were the hardware detection module and the fact it stemmed from Debian, which I always wanted to give a try.
Two days ago, I decided to upgrade my system (a Feisty Fawn) to the latest version, Gutsy Gibbon.
The upgrade process died at 60%, more or less, due to a problem in the nvidia-driver install script which managed to give up after seeing a diversion from the base package (ah! total functions!).
Now, I have a half-upgraded system and I can't finish the upgrade as apt-get says everything is current, and have xorg doing all sort of funny things, the Gnome applets not working anymore, mplayer segfaulting... I even filed a help request in the Ubuntu forums, like any other newbie out there, but got nothing in return: perhaps the community has become so large that a rather common help request now pass unobserved...
So, time to change. It happened in the past and is likely going to happen again in the future: I switched from Slackware to Red Hat to Slackaware to Gentoo to Ubuntu. Now it's time for ArchLinux. I've been using it for a while now, on a VMWare VM and, apart from being quite satisfied, I'm impressed as well.
pacman works like a charm: simple and effective. Everything gets configured neatly and almost everything worked out of the box.
Of course, this means that I'll have to configure my devices and xorg myself, insted of relying on some Ubuntuish automatic tool. But, at least, I'd know what's going on.
So, bye bye Ubuntu. Welcome ArchLinux.
Two days ago, I decided to upgrade my system (a Feisty Fawn) to the latest version, Gutsy Gibbon.
The upgrade process died at 60%, more or less, due to a problem in the nvidia-driver install script which managed to give up after seeing a diversion from the base package (ah! total functions!).
Now, I have a half-upgraded system and I can't finish the upgrade as apt-get says everything is current, and have xorg doing all sort of funny things, the Gnome applets not working anymore, mplayer segfaulting... I even filed a help request in the Ubuntu forums, like any other newbie out there, but got nothing in return: perhaps the community has become so large that a rather common help request now pass unobserved...
So, time to change. It happened in the past and is likely going to happen again in the future: I switched from Slackware to Red Hat to Slackaware to Gentoo to Ubuntu. Now it's time for ArchLinux. I've been using it for a while now, on a VMWare VM and, apart from being quite satisfied, I'm impressed as well.
pacman works like a charm: simple and effective. Everything gets configured neatly and almost everything worked out of the box.
Of course, this means that I'll have to configure my devices and xorg myself, insted of relying on some Ubuntuish automatic tool. But, at least, I'd know what's going on.
So, bye bye Ubuntu. Welcome ArchLinux.
October 22, 2007
Is Haskell really expressive?
My idea of an expressive program is one whose meaning (or at least the meaning of its functions when taken apart) is kind of intuitive.
Haskell is often remarkably pointed out as an expressive language. Nevertheless, I'd say that a language is simply a weapon in the hand of the developer and that expressiveness mostly depend solely on him.
Indeed, consider the following Haskell piece of code:
cartesian = (foldl (liftM2 $ flip (:)) [[]]).(map (\n -> [0..n-1]))
Apart from its name, would you tell that this function evaluates a multiple cartesian product? Actually you need to have an intermediate knowledge of Haskell and its libraries to catch the meaning of this function:
Haskell is often remarkably pointed out as an expressive language. Nevertheless, I'd say that a language is simply a weapon in the hand of the developer and that expressiveness mostly depend solely on him.
Indeed, consider the following Haskell piece of code:
cartesian = (foldl (liftM2 $ flip (:)) [[]]).(map (\n -> [0..n-1]))
Apart from its name, would you tell that this function evaluates a multiple cartesian product? Actually you need to have an intermediate knowledge of Haskell and its libraries to catch the meaning of this function:
- You should know that foldl is like a generic "summation"
- You should know that liftM2 will lift a function taking two arguments into a Monad (the List Monad actually).
- You should know that semantics of the List Monad has to do with non-determinism, i.e. passing two lists to a lifted function will result in applying that function to each and every pair built from them.
Writing programs this way is really concise but I sometimes feel like it's not really intuitive. My understanding of Haskell programs improves everyday and lots of Haskell snippets I've read in the past are now less obscure. But I think my most importat breakthrough in understanding the language will happen when I'm able to make the above snippet's meaning apparent even for the almost occasional (and non-Haskell) programmer.
Haskell concurrency
Here's an article introducing Haskell concurrency constructs for the two paradigms out there: message passing (à-la Erlang) and shared-memory (the rest of us).
October 16, 2007
Algebraic rules for GHC
As I'm still experimenting with Haskell, I'm not into Haskell code optimization yet.
Anyhow, I think the intermediate ghc user might find the following document interesting:
Playing by the rules
It sort of explains how to get advantage of the equational-like form of Haskell programs to teach ghc simple rules to optimize an Haskell program.
Anyhow, I think the intermediate ghc user might find the following document interesting:
Playing by the rules
It sort of explains how to get advantage of the equational-like form of Haskell programs to teach ghc simple rules to optimize an Haskell program.
October 15, 2007
Boilerplates
An interesting application of generic programming and data structures:
Haskell and Web Applications
Here is the documentation about Data.Generics.Basics which is useful to understand what the above article is all about.
Haskell and Web Applications
Here is the documentation about Data.Generics.Basics which is useful to understand what the above article is all about.
October 09, 2007
Control.Monad.Fix: recursive lambda abstractions
http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Monad-Fix.html
October 08, 2007
Tracing your code... the dirty way!
Problem: you want to trace your Haskell program but you're scared that touching anything might thrash your type hackery.
Solution: Use Debug.Trace.trace wherever you want to print something on the console:
Before: map f (x:xs) = (f x):(map f xs)
After: map f (x:xs) = trace "Here I am" $ (f x):(map f xs)
Explanation: I was trying to understand how HAppS worked and I really missed the possibility to put a harmless print statemente here and there in the code just to get the feeling of what was happening in the behinds, as I'd do in an imperative language like python.
The problem is that we have to deal with Haskell purity so if you want to print something in a Haskell program, you have to do it inside the IO Monad.
So far so good, but IO is not available everytime you need it... at least not the "official" one. In fact Haskell allows you to enter the IO Monad anytime you need it and perform an IO action straight away using the unsafePerformIO function.
"Unsafe" means that you use it at your own risk, since you're doing something impure in a pure environment. Of course, trace has nothing to do with the real meaning of the traced funcion, so it's perfectly safe to use it that way.
I spent almost an hour to figure out how to do this, so I hope this trick will save you time.
Solution: Use Debug.Trace.trace wherever you want to print something on the console:
Before: map f (x:xs) = (f x):(map f xs)
After: map f (x:xs) = trace "Here I am" $ (f x):(map f xs)
Explanation: I was trying to understand how HAppS worked and I really missed the possibility to put a harmless print statemente here and there in the code just to get the feeling of what was happening in the behinds, as I'd do in an imperative language like python.
The problem is that we have to deal with Haskell purity so if you want to print something in a Haskell program, you have to do it inside the IO Monad.
So far so good, but IO is not available everytime you need it... at least not the "official" one. In fact Haskell allows you to enter the IO Monad anytime you need it and perform an IO action straight away using the unsafePerformIO function.
"Unsafe" means that you use it at your own risk, since you're doing something impure in a pure environment. Of course, trace has nothing to do with the real meaning of the traced funcion, so it's perfectly safe to use it that way.
I spent almost an hour to figure out how to do this, so I hope this trick will save you time.
A new programming blog
I start this new blog for the following purposes:
- I'm learning new programming paradigms. Actually, I'm focusing on Haskell but I take a peek to other langs from time to time, like Lisp, Erlang and so on. Having a blog is a great way to get support from the community (provided I write interesting posts) and have a place to store my findings so I can look them up when needed.
- A place to post things I find on the web and worth mentioning.
- I don't know yet, but I'll add something in lazy-haskelish way as soon as I find it.
Subscribe to:
Posts (Atom)