Capturing Output from PUTS in Ruby
When writing unit tests for my simplesem interpreter, one test in particular was problematic. In simplesem, the set write instruction prints output to the screen.
    // place Hello World! on the 'write' buffer
    set write, "Hello World!"
Internally, the interpreter is just passing the second parameter, “Hello World!”, to the puts method in Ruby. This makes it difficult to use traditional test/unit assertions to check that the simplesem instruction is working.
I eventually found two solutions for this. The first, suggested by David Stevenson at Pivotal Labs, is to use mocha to check that puts was called on the object.
    require 'test/unit'
    require 'rubygems'
    require 'mocha'

    class SimpleSemParserTest < Test::Unit::TestCase
      def test_set_stmt_write
        parser = SimpleSemParser.new
        parser.expects(:puts).with("Hello World!")
        parser.parse('set write, "Hello World!"').execute
      end
    end
This solution is fine for most situations; Mocha will throw an exception if puts is not called. However, in my case it was unsuitable because puts was not being called on the SimpleSemParser object but instead on a Treetop syntax node that I did not easily have access to within the unit test.
I knew that if I could capture the output from the puts method into a variable, I would be able to write the test using a standard assert_equal. After some googling I discovered that this functionality is built into the ZenTest gem. After rewriting, the test looked like this:
    require 'test/zentest_assertions'

    class SimpleSemParserTest < Test::Unit::TestCase
      def test_set_stmt_write
        out, err = util_capture do
          parser = SimpleSemParser.new
          parser.parse('set write, "Hello World!"').execute
        end
        assert_equal "Hello World!\n", out.string
      end
    end
This works great. However, should we decide that we do not want to use an external gem, with a little effort we can bypass ZenTest and implement util_capture ourselves.
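Here is a minimal sketch of what a hand-rolled util_capture might look like. It is my own approximation, not ZenTest's actual source: it assumes that temporarily pointing the global $stdout and $stderr at StringIO buffers is enough, which holds for puts because Kernel#puts writes to whatever $stdout currently references.

```ruby
require 'stringio'

# Hypothetical replacement for ZenTest's util_capture (not its real source).
# Runs the given block with $stdout and $stderr swapped for StringIO buffers
# and returns both buffers so the caller can inspect what was written.
def util_capture
  original_stdout, original_stderr = $stdout, $stderr
  out = StringIO.new
  err = StringIO.new
  $stdout = out
  $stderr = err
  yield
  [out, err]
ensure
  # Always restore the real streams, even if the block raises.
  $stdout = original_stdout
  $stderr = original_stderr
end

out, err = util_capture { puts "Hello World!" }
out.string # => "Hello World!\n"
```

Because the method has the same name and return shape as the ZenTest helper, the test above should work unchanged once the require for test/zentest_assertions is removed. Note that this only catches output written through $stdout and $stderr; anything written directly to the STDOUT constant or by a child process would slip past it.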
A SIMPLESEM Interpreter
In my Programming Languages course, taught by Shannon Tauro, we have been using a fake assembly language of sorts called SIMPLESEM to gain experience translating the semantics of a high-level programming language to a simple processor.
Since SIMPLESEM is a made-up language created just for our textbook, there was no way for me to execute the SIMPLESEM programs that I was writing for the homework assignments. This was annoying because SIMPLESEM is a low-level language, which makes it hard to notice mistakes. It is easy to make a mistake any time you are programming, but those mistakes are even harder to catch when you are working close to the assembly level.
As a fun exercise I implemented an interpreter for SIMPLESEM using Ruby and published it as a RubyGem. Fortunately, I chose to use Nathan Sobo’s Treetop gem to aid in the development. Using Treetop, I wrote a parsing expression grammar to parse SIMPLESEM commands. This resulted in my SIMPLESEM interpreter being a lot more flexible than I had originally anticipated. After I familiarized myself with the basics of writing Treetop grammars, I found it very easy to make changes to my grammar definitions to add language features one by one.
The Case For Replacing Java With Python In Education
Around 2003 all of the colleges and high schools in the United States switched from teaching Computer Science courses in C++ to teaching them in Java. The intention was to make it easier for students to pick up programming. Schools were finding that many students were struggling to cope with low level tasks in C++ like manual memory management and pointer references. Instead of learning algorithms, data structures and object oriented programming, students were stuck for hours trying to track down incorrect pointer references.
About four years after the education system made the switch to Java from C++, the whole software industry started complaining about the degrading quality of the Computer Science graduates in the US, which is a topic I explored recently. Ultimately, schools have made the switch away from C++, and they are unlikely to go back, nor do I think they should. Instead, what I want to discuss is why we had to replace C++ with Java. I can see that at the time it may have seemed the obvious choice, but looking around the language landscape now, there are several choices that I think are better suited for the task. Namely, Python.
Java > Python How???
First, I want to think carefully about what advantage learning to program with Java has over using a modern scripting language like Python. Python equally hides the things that make programming in C++ laborious and has many of the nice features that the JVM provides, like garbage collection, Unicode strings, and threads. The difference is that Java is miserable for web programming (Java EE) and equally overly complex for building GUIs (Swing). As a scripting language, Python is far easier to pick up and learn.
What CS students would lose from switching from Java to Python
- Static Typing – The difference between an Integer and a Float is harder to understand when you do not have to declare variable types.
- Compilation – In my opinion this is not a big deal. They are still going to learn how to debug runtime errors which is where the substance is. Advanced IDEs like Eclipse are making dealing with compiler errors almost a thing of the past in Java. Additionally, the upper division compilers class will still be there to teach them the ins and outs of compilers.
- Performance – This is a moot point. I only put it down because it is what most people think of first, but very rarely do the projects assigned in a classroom require serious computation. I would also wager that even the ones that do can be solved in a satisfactory manner with Python. If anything, it will force students to be more creative about their solutions instead of just relying on brute force.
What CS students will gain from switching from Java to Python
- Dynamic Typing – There are tradeoffs to both approaches, but nearly everyone who programs in a dynamically typed language for a while finds that the type safety provided by static typing is unnecessary and gets in your way more often than it helps you (e.g. casting).
- Interpretation – Becoming familiar with interpretation is just as important as compilation. Interpretation opens several doors for you when programming and is becoming more common. In addition it gives you helpful tools like an interactive console.
Overall, there is no big loss in Computer Science concepts when moving from Java to Python like there was when we moved away from C++. You trade static typing for dynamic typing and compilation for interpretation but everything else is just about the same and you gain Python’s simplicity.
One of the real problems with Java is that many students do not like to use it when programming for fun. Since the majority of students only become competent in Java, the only code they write is for their homework assignments. These are type A CS students. There is a second, type B, group of CS students. Type B students pick up another language like PHP, Python, Ruby, Clojure, etc. and are the ones who spend time coding and creating cool things outside of their schoolwork. These students find programming on the side to be the most enriching and also the most educational. Employers often cite type B students, the self starters, as the ones they are most interested in hiring. If the only programming a student does while attending school is for their class projects, it is more than likely that they will continue this practice once leaving school, only writing code for their job. By making the switch away from Java in education, more type B CS students would emerge from American universities, enormously benefiting the software industry.
What if I went to a Java school Joel?
Way back in 2005, I read an essay by Joel Spolsky titled The Perils of JavaSchools. When I read the essay the first time, I suspected Joel was right about Java trivializing several aspects of the traditional Computer Science education, but I didn’t really care. At the time I was just starting my second quarter of college in the CS program at the University of California at Irvine, which definitely falls under the “Java school” classification. At that point I had only ever really programmed in Java and I liked it a lot. Joel was right, but I was happy with my relative Java proficiency at a Java school, so I was largely indifferent.
Fast forward to the present. Last week, I borrowed Joel’s new book “More Joel on Software” from Michael Arrington, which contains the “Perils of JavaSchools” essay. Reading the essay again I was blown away. What a difference a few years of CS education makes! This time, instead of just feeling that Joel is right, I know he is right. Learning Computer Science completely in Java and learning it in C with a healthy dose of functional languages like Scheme are two different worlds. [1]
All CS degrees are not created equal
Is the value of my CS education less than that of the traditional CS education Joel reminisced about? All other things being equal, I would absolutely say “Yes”. I cannot help but agree with Joel when he says that an all-Java education can never be of the same caliber as the CS programs that preceded the Java “revolution”. People like Guido van Rossum, Paul Graham, Steve Yegge, Linus Torvalds and many other great hackers all received their degrees before the notion of a Java school existed. And as Joel puts it, they all went “stark, raving mad trying to pack things into bits”. The bit level is considered foreign territory for students at Java schools; a place we dare to venture only once or twice before quickly returning to the safety of the Java virtual machine.
Joel’s essay is best summarized by this paragraph:
You used to start out in college with a course in data structures, with linked lists and hash tables and whatnot, with extensive use of pointers. These courses were often used as weedout courses: they were so hard that anyone that couldn’t handle the mental challenge of a CS degree would give up, which was a good thing, because if you thought pointers are hard, wait until you try and prove things about fixed point theory.
I have never even heard of fixed point theory!
Finally we come to my dilemma. I have one year left at my Java school and I desperately want to avoid mediocrity. Since adopting Ruby as my primary programming language last summer, I have experienced several small victories. Ruby is a very powerful language which is gradually breaking me away from the rigid programming practices I picked up from programming in Java for 3 years. Ruby has introduced me to things like metaprogramming, reflection, DSLs, anonymous methods, and several aspects of functional programming. All are things that I never would have been able to fathom had I stayed inside my Java bubble. I will say that CS students at Irvine take a Programming Languages class which introduces unfamiliar languages like Haskell. Unfortunately, the class doesn’t make up much ground. Like all classes it is only 10 weeks long, and students only get a brief look at the various languages they are introduced to. To top it off, most of the projects are still done with Java.
Ruby is not the Answer to Life, the Universe, and Everything
Ruby is great, but it’s not going to teach me any of the low-level knowledge I am lacking. I can program in Ruby for another decade and still not achieve a full understanding of pointers. The only thing I can do to understand why while (*s++ = *t++); copies a string is to actually program in C.
Why I am not learning C now
Learning C is something that you generally need to be forced into. In today’s world you are not going to be able to write very much software if you are coding the entire thing in C from top to bottom. It is a very anti “Get Things Done” programming language. In addition, there are very few things that actually need to be written in C. Operating systems and compilers are the two big areas where use of C is nearly always required. Both are territory that I am not interested in venturing into at this point. The most common use for C among software developers is to optimize slow chunks of code by rewriting them in C. However, the code I write does not need to “scale”, so while I always do my best to not write inefficient code, I cannot be bothered to rewrite any of it in C when it is “fast enough” already.
What I am going to do about it
This is a problem that I have not thought of a solution for. I cannot bring myself to sacrifice productivity in order to use C. At first I thought I would learn Objective-C, which is based on C, in order to create Cocoa applications for Mac OS X. That solution is flawed, however. Just as Joel says learning C++ is not a substitute for learning C, the same is true for Objective-C. My best chance is if something in my school work for next year comes up that requires the use of C. If it does happen it will be an elective; the chance of a required course using C at Irvine is very slim.
I am just one person, but what I am experiencing is true for Computer Science students across the country at Java schools. We are collectively being handed a disadvantageous education, and in the long run it will have a direct impact on the level of software engineering that is being done in the United States. For instance, I would guess that a graduate from a Java school is far less likely to ever contribute to the Linux kernel, GCC, or a similar project. This is of great consequence because we need innovation at the lowest level of software in order to continue innovating at the top.
Obviously, if American Universities are going to keep up, they need to switch back to the “middle ages” of Computer Science and resume using C in the classroom. For those of us that are already in the system, or recently graduated, individual crusades are required to attain the level of understanding that is obligatory for Computer Science graduates. For myself, this will likely include working through the exercises in Structure and Interpretation of Computer Programs and the accompanying lectures. I am still looking for a good, practical way to learn C. Just reading Kernighan and Ritchie is not going to be sufficient. There is a big distinction between learning to program and learning a language. I need to learn to program—for real this time.
[1] Read “The Perils of JavaSchools” for an explanation of why they are different.
Rails-doc.org is my new Rails Reference
When working with a framework as large as Ruby on Rails it’s necessary to have a reference close by for… well, just about everything. Until recently I was a big fan of gotAPI.com because I really appreciated the Ruby and Rails reference tied together. However, the JavaScript autocomplete on their search box is broken in Firefox 3, so I decided to try the new Rails reference site, Rails-doc.org.
Rails-doc.org is a fairly ambitious project to create a community-driven Rails documentation site. Basically, they let users sign up and contribute notes to the existing Rails documentation. This certainly has the potential to be very useful, especially for new Rails hackers, because sometimes the people who have been around the framework for a while just take things for granted.
Take the documentation for strftime in the Date class for example. There is no documentation listed for that method, despite the fact that you clearly need documentation of the strftime options in order to use it. Instead you have to know to look under strftime in the Time class for the documentation. While this is an example specific to Ruby documentation, these are the kinds of obvious problems that a community documentation website can help solve.
Right now Rails-doc.org has only been live for a month, so the amount of community documentation feels very low. In the meantime I’ll be using the site for the official documentation that is already in place. Plus it’s the best looking rendering of the Rails documentation out there. The Nodeta guys did a good job with the design. They also have a nice looking blog.
It will be interesting to see how many of the notes contributed to Rails-doc get ported to the official Rails documentation. This certainly feels like the easiest and most straightforward way to contribute to Rails documentation, and I can see it becoming a testing ground for future contributions to the official docs.
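To make the strftime case concrete, here is a small sketch of the kind of note I would want attached to the Date docs. The date and format strings are just ones I picked for illustration; the directives themselves are the same ones documented under Time#strftime.

```ruby
require 'date'

# An arbitrary example date, chosen only for illustration.
date = Date.new(2008, 7, 4)

date.strftime("%Y-%m-%d")   # => "2008-07-04"  (4-digit year, month, day)
date.strftime("%B %d, %Y")  # => "July 04, 2008"  (full month name)
date.strftime("%m/%d/%y")   # => "07/04/08"  (2-digit year)
```

None of this is hard once you have seen the directives, but without them listed next to the method you are left guessing.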
