Tuesday, April 29, 2008

Special guest for next SoCal Piggies meeting

We'll have the SoCal Piggies meeting this Thursday, May 1st, at the Gorilla Nation office in Culver City. Our special guest will be Ben Bangert, the creator of Pylons, who will give us an introduction to his framework. We'll also have a presentation from Pablo Noego from Gorilla Nation on a chat application he wrote using Google App Engine. We'll probably also have an informal discussion on Python mock testing tools and techniques.

BTW, I am putting together a Google code project for mock testing techniques in Python, in preparation for a presentation I would like to give to the group at some point. I called the project moctep, in honor of that ancient Egyptian deity, the protector of testers (or mockers, or maybe both). It doesn't have much so far, but there's some sample code you can browse through in the svn repository if you're curious. I'll be adding more meat to it soon.

Anyway, if you're a Pythonista who happens to be in the L.A. area on Thursday, please consider attending our meeting. It will be lots of fun, guaranteed.

Tuesday, April 22, 2008

"OLPC Automated Testing" project accepted for SoC

I'm happy to say that Zach Riggle's application for this year's Google Summer of Code, "OLPC Project Automated Testing", was accepted. I'm looking forward to mentoring Zach, and having Titus as a backup mentor. There's some very cool stuff that can be done in this area, and I hope that at the end of the summer we'll have some solid automated testing techniques and tools that can be applied to any Python project, not only to the OLPC Sugar environment. Stay tuned for more info on this project. BTW, here is the list of PSF-sponsored applications accepted for this year's SoC.

Thursday, April 17, 2008

Come work for RIS Technology

We just posted this on craigslist, but it never hurts to blog about it too. If you're interested, send an email to techjobs at ristech.net. You and I might get to work together on the same team!

Open Source Tech Top Guns Wanted

Are you a passionate Linux user? Are you running the latest Ubuntu alpha release on your laptop just because you can? Are you wired to the latest technologies -- things like Amazon EC2/S3 and Google App Engine? Are you a virtuoso when it comes to virtualization (Xen/VMware)?

Do you program in Python? Do you take hard problems as personal challenges and don't give up until you solve them?

RIS Technology Inc. is a rapidly growing Los Angeles-based premium managed hosting provider that hosts and manages internet applications for medium to large size organizations nationwide. We have grown consistently at 100% each of the past four years and are currently hiring for additional growth at our corporate operations center near LAX, in Los Angeles, CA. We have immediate openings for dedicated and knowledgeable technology engineers. If the answer to the questions above is YES, then we'd like to extend an invitation to interview with us.

We are an equal opportunity employer and have excellent benefits. We realize that one of the main things that makes us excellent is the people we choose to work with. We look for the best and brightest and our goal is to make work less "work" and more fun.

Wednesday, April 16, 2008

Google App Engine feels constrictive

I've been toying a bit with Google App Engine. I was lucky enough to score one of the 10,000 developer accounts. I first went through their tutorial, which was fine. Then I tried to port a simple application that I used to run from the command line, which queried a range of IP addresses for their reverse DNS names. No luck. I was using the dnspython module, which in turn uses the Python socket module -- and socket is not available within the Google App Engine sandbox environment.
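For the curious, here is a minimal sketch of that kind of command-line tool. It is not the original script: it swaps dnspython for the standard library's socket.gethostbyaddr, and the function names (ip_range, reverse_name, reverse_range) are made up for illustration. Since it still depends on socket, it would hit the same sandbox restriction described above.

```python
import ipaddress
import socket


def ip_range(first, last):
    """Yield every IPv4 address from first to last, inclusive."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    for n in range(start, end + 1):
        yield ipaddress.IPv4Address(n)


def reverse_name(addr, resolver=socket.gethostbyaddr):
    """Return the reverse DNS name for addr, or None if the lookup fails.

    resolver is injectable (an assumption of this sketch) so the logic
    can be exercised without touching the network.
    """
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
        return resolver(str(addr))[0]
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError
        return None


def reverse_range(first, last, resolver=socket.gethostbyaddr):
    """Map each address in the range to its reverse DNS name (or None)."""
    return {str(a): reverse_name(a, resolver) for a in ip_range(first, last)}
```

Calling reverse_range("192.0.2.1", "192.0.2.4") returns a dict keyed by IP address, with None for addresses that have no PTR record.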

Also, I was talking to Michał about rewriting the Cheesecake service to run on Google App Engine, but he pointed out that cron jobs are not allowed, so that won't work either... It seems that with everything I've tried with GAE I've run into a wall so far. I know it's a 'paradigm change' for Web development, but still, I can't help wishing I had my favorite Python modules to play with.

What has your experience been with GAE so far? I know Kumar wrote a cool PyPI mirror in GAE, but I haven't seen many other 'real life' applications mentioned on Planet Python.

Friday, April 11, 2008

Ubuntu Gutsy woes with Intel 810 graphics card

I just upgraded my Dell Inspiron 6000 laptop to Ubuntu Gutsy last night. My graphics card is based on the Intel 810 chipset. After the upgrade, everything graphics-related was dog-slow. Scrolling in Firefox was choppy, IM-ing was choppy, even typing at the console was choppy. Surprisingly, I didn't find a lot of solutions to this problem. But many people on Ubuntu forums suggested disabling compiz/xgl, so that's what I ended up doing. In fact, I uninstalled all compiz and xgl-related packages, rebooted, and graphics became snappy again. Now back to trying to write an application to run on THE GOOGLE.

Thursday, April 10, 2008

Meme du jour: shell history

Here's mine from my Ubuntu laptop:

$ history|awk '{a[$2]++ } END{for(i in a){print a[i] " " i}}' |sort -rn|head
121 cd
91 ssh
82 ls
46 vi
28 python
26 scp
16 dig
12 more
7 twistd
6 rm
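Since this is mostly a Python blog, here is a rough Python equivalent of that awk tally, as a sketch only. The function name top_commands is made up for illustration, and it assumes each input line looks like the output of the shell's history builtin (an entry number followed by the command). Unlike the awk version, it skips lines that have no command field.

```python
from collections import Counter


def top_commands(history_lines, n=10):
    """Count the command word (second field) of each history line."""
    counts = Counter()
    for line in history_lines:
        fields = line.split()
        # Expected shape: "<entry number> <command> <args...>"
        if len(fields) >= 2:
            counts[fields[1]] += 1
    # most_common sorts by count, highest first
    return counts.most_common(n)
```

One way to feed it is to save the output of history to a file and pass the open file object, since iterating over a file yields its lines.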

Thursday, April 03, 2008

Steve Loughran on 'Farms, Fabrics and Clouds'

Yesterday my colleagues at RIS Technology and I had the pleasure of attending a remote presentation given to us by Steve Loughran, who works as a researcher at HP Labs and is also a committer on the Ant project. I had seen Steve's slides from a presentation he gave at the University of Bristol on 'Farms, Fabrics and Clouds' back in December 2007, and I have been pestering him via email ever since, hoping to have him release a screencast. After much back and forth, Steve offered to simply present for now directly to us via Skype. He did it out of the goodness of his heart, but both he and I realized that there's a nice little business opportunity in this type of presentation: you release the slides with no audio, then you get hired to present to interested parties in person, remotely, via Skype and a shared set of slides, with a Q&A session at the end. Everybody wins in this scenario. Filing it in the 'ideas worth trying' category.

To come back to Steve's presentation -- here are the slides from a previous version. I hope he will soon post the updated version we saw yesterday, but the differences are not major. The co-author of the talk is Julio Guijarro. Their area of interest within HP Labs is the deployment of large applications across distributed resources and the management of these apps/resources with an eye to maximizing their output and minimizing their cost. A familiar (and hard) problem for everybody who works in the hosting industry.

Steve talked about how infrastructure architectures have changed over the years from a single web server talking to a single database server, to clustering, and finally to server farms and computing-on-demand. The challenge for us 'server farmers' is to figure out a way to manage thousands of servers, heaps of storage, a myriad of network infrastructure devices, and large distributed applications on top of that -- all while keeping everything purring and happy, running at its maximum potential. Sounds impossible, but Amazon seems to be doing a decent job at it. And in fact Steve spent quite some time talking about how Amazon changed the game with their S3 and EC2 offerings. Even though they're not quite ready for prime time in terms of production deployments, Amazon will soon get there. As proof, see their recent introduction of static IP addresses in EC2, and of the possibility of running your application in different data centers.

In my opinion, the best of Steve's slides are the 'Assumptions that are now invalid' ones. They really turn the 'established facts and best practices' of infrastructure and application design on their heads. Here are some examples of assumptions that don't hold anymore in our day and time:
  • it is expensive to create, deploy and duplicate a new system, running a Linux image of your choice (see Instalinux as a counter-example)
  • system failure is unusual and 100% availability can be achieved
  • databases are the best form of storage
  • you need physical access to the data center
  • a single server farm needs to scale to infinity
My other favorite part, which is not in the online slides yet, is the concept of 'agile infrastructure'. I haven't seen this concept applied to server hosting before, but Steve has a great point here. If you look at something like Amazon EC2, where you can pay as you go, you can test your application in a smaller environment and then scale it up, you can move your application between data centers -- this is indeed an agile environment that also imposes some new demands on your application.

I really recommend that you check out Steve's slides. There's a lot to chew on, but you can't afford not to chew on it, if you have anything to do with the IT industry these days.

Here are a couple more links that might prove useful:
  • Anubis: a tuple-space implementation that uses multicast to share information between hosts within a site
  • SmartFrog: a technology from HP used to distribute and manage applications (think puppet but geared towards application deployment); see also Google video
Thanks again to Steve for presenting to us. Now, as a server farmer, I need to go back to my plow and try to improve it (maybe buy a tractor?)

Update: Steve has some more thoughts on the Agile Infrastructure concept. Intriguing. This is something I'll definitely keep a very close eye on and tinker with.

Wednesday, April 02, 2008

For you students interested in GSoC

If you're a student and you want to apply for a Python-related project for Google Summer of Code 2008, Matt Harrison has just the project for you. The project has to do with branch coverage analysis and reporting. Matt is willing to mentor too. It's a really good opportunity, so don't hesitate to apply. Hurry up though, the deadline is April 8th.

Tuesday, April 01, 2008

TurboGears and Pylons finally merging

This has been a long time coming, and fans of both projects have been eagerly waiting for it, but it's finally happened. Not sure if you've seen the announcements from Kevin Dangoor, Mark Ramm and Ben Bangert on their projects' mailing lists, but basically they boil down to "we feel like after the sprints at PyCon we made enough progress so that we can pull the trigger on merging the source code from the 2 projects in one common trunk." They make it sound like it was purely a technological problem, but I have my doubts about that. I think it was driven in part by the increasing popularity of Django. Unifying TurboGears and Pylons is a somewhat desperate measure to chip away at the Django market share. We'll see if it works or not. Check out the brand new page of the TurboPylons project.
