Archive for python

Psyco x64, GIL

Posted in Uncategorized on December 19, 2008 by Bartosz Radaczyński

So why exactly is psyco not available for the x64 architecture? Because it was meant to be incorporated into the PyPy project… Strange, since:

  1. PyPy is twice as slow as CPython
  2. the main goal of the PyPy project is not to provide a faster Python implementation

Can anyone explain this to me? I’d really like to see some of the psyco improvements on my 64-bit machine.

Another thing – why the GIL? Is it there on purpose, to make some market room for Jython with its native threads? Is there some other motivation behind it? Is it to force people to use processes (as the post and Python 2.6’s multiprocessing model suggest)? How are processes better than threads? To me it seems that forking adds more overhead than threads ever do (I guess that is why threads are called lightweight processes). Is there some explanation for this phenomenon?
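To make the threads-versus-processes question concrete, here is a minimal sketch (not a benchmark) that runs the same CPU-bound function on a thread pool and on a process pool. The workload `count_primes`, the helper `run` and the input sizes are made up for illustration, and the code uses `concurrent.futures`, which only arrived in Python 3.2, long after the Python 2.6 mentioned above.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound toy workload (hypothetical): count the primes below limit."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in 2..sqrt(n) divides it.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run(executor_cls, limits, workers=2):
    """Run count_primes over limits on the given pool type.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(count_primes, limits))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    limits = [30000, 30000]  # arbitrary sizes, assumed for the demo
    for cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        results, seconds = run(cls, limits)
        print(cls.__name__, results, round(seconds, 2))
```

On a standard CPython build with the GIL, the thread pool should take roughly as long as running the calls one after another, because only one thread executes Python bytecode at a time. The process pool can use both cores, but it pays for starting the worker processes and for pickling arguments and results, so for small workloads it may well come out slower.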

Parsing and stuff

Posted in Uncategorized on December 1, 2008 by Bartosz Radaczyński

So, for the past two weeks or so I’ve been trying to get this small Python thingy up and running. But (as they always do) this “small thingy” suddenly turned into piles and piles of code. I guess this is what they mean when they make up rules like “when a programmer gives you an estimate, add 1 and take the unit immediately larger than the one given” (so that two weeks actually means three months).

So, anyway, this little project started off as a code analyser for COBOL programs + DB2 SQL. It was meant to provide some sort of data flow analysis. It all seemed pretty straightforward, and the idea of doing that kind of analysis on 300 programs blew my mind, so I figured that an automated tool would do a much better job than I ever would doing it by hand. I sort of wandered around to see what choices there are for writing your own custom parsers in Python. As it turns out, there are at least two good ones out there. The first one is called pyparsing. This is the one I started off with. But after carefully converting the COBOL grammar from EBNF to the pyparsing model, it turned out that parsing just one program took like forever to complete. On the other hand, it turned out to work pretty well on the SQL, but after a while I decided to throw that out and reimplement it… I know, worst idea ever, but still, I was not very time-constrained, so I could afford that. And mind you, I threw away roughly three days of work, so not much harm was done there.

On my second take at the parsing issue I thought it would be nice to actually be able to write the grammar as EBNF, since EBNF grammars are much more readable than the pyparsing representation and are also easier to change. After all, we CS guys are used to math-like symbols… So, with the application of simpleparse, things really took off. It took me about 10 workdays to get the COBOL grammar to parse, and some 2 more days to add the DB2 SQL (maybe not complete, but good enough for the programs here). So, anyway, the main thing was that simpleparse is really a simple parser thingy. It does not support longest (most successful) match, only first match. This is crucial when defining the grammar: you’ve gotta list the longest alternative first. The main problem was in the relational conditions, which in COBOL take the form of

IF ABC=1 OR 2 OR 3 OR DEF <= 123 AND WS-SOME-VAR IS NOT GREATER THAN 'ABC'

Now this is really strange to parse, especially the abbreviated condition (ABC=1 OR 2 OR 3, which actually means ABC=1 OR ABC=2 OR ABC=3). But you can get by somehow – at least I did. Anyway, the performance increase is dramatic. On my dual-core laptop the pyparsing stuff took two days to parse a simple program! With simpleparse it takes several seconds… Well, naturally this is due to the parser’s implementation being way simpler (the first match cuts off any further comparisons), but if you’re careful enough this thingy is capable of doing soooooooo much!
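The “longest alternative first” rule is easy to see without simpleparse at all. The sketch below uses Python’s `re` module instead, whose `|` alternation is also first-match: alternatives are tried left to right and the first one that matches wins. The pattern names and the `match_operator` helper are made up for this illustration and are not part of the COBOL grammar described above.

```python
import re

# Ordered (first-match) choice: the alternatives are tried left to right,
# and the first one that matches is taken, even if a later one is longer.
SHORT_FIRST = re.compile(r"<|<=|>|>=|=")
LONG_FIRST = re.compile(r"<=|>=|<|>|=")


def match_operator(pattern, text):
    """Return the operator matched at the start of text, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None


# With the short alternative listed first, "<=" is cut down to "<"
# and the "=" is left behind for the rest of the grammar to choke on.
print(match_operator(SHORT_FIRST, "<= 123"))  # <
print(match_operator(LONG_FIRST, "<= 123"))   # <=
```

The same ordering concern applies to keyword phrases such as IS NOT GREATER THAN versus IS GREATER THAN: in a first-match grammar, the more specific, longer alternative has to be tried before its shorter relatives.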

So in the end I guess that Steve Yegge was right when he told people to learn that stuff about compilers. It definitely pays off to be aware that it is easier to make a parser than to use regexes… Or at least it seems so 😉

Ruby vs Python

Posted in languages on October 14, 2008 by Bartosz Radaczyński

For a while now I’ve been looking for a thorough comparison of the two (I know that Ruby has blocks, believe you me). I finally found one here. Another extremely interesting thing is the speed comparison of various programs written in both. As it seems (at least for now), Python is about 3 times faster than Ruby (not to mention psyco). On the other hand, Ruby 1.9 is supposedly faster (the benchmark does not support this 😦). Yeah, I know that processor cycles are cheaper than a developer’s brain cycles (and it is true), but since coding in Python is about as productive as it is in Ruby, and the code can run three times faster, why not use Python?

Elixir (and sqlalchemy) models validations

Posted in web development on October 13, 2008 by Bartosz Radaczyński

For those of you who miss Rails-like validations in Pylons – there is a solution. The current version is available from the Cheese Shop:

easy_install sqlalchemy_elixir_validations

and then follow the description.

First bitbucket fork

Posted in web development on October 3, 2008 by Bartosz Radaczyński

For a while now I’ve been trying to get around to using a DVCS (distributed version control system). I’ve somehow convinced myself that git is great (cool name and the Rails guys use it and so on). I even created a GitHub account, but somehow could not find the time to check it out… And eventually I took Mercurial for a test drive. It is written in Python and, as Mike Orr said, there is probably no Python project that uses git instead of Mercurial, so there it was – Mercurial. There’s a great tutorial for Mercurial over here. And as it turns out, DVCSs rock! You don’t need a server, and branching and merging are easy, which encourages experimenting with code. And after an hour or two you can just merge the final working version and no one will know that you had sixty-two gazillion broken versions in your local repo. This is really cool! It’s like svn on steroids.

At the same time Graham Higgins sent me a link to his project Shabti, which is a bit too much for what I need right now. For a while now I’ve been playing around with Pylons and I really missed the scaffolding, so I created that for Shabti. The code and a tutorial are here.