The Problem With MINASWAN

The most harmful thing in Ruby is the MINASWAN meme: Matz is nice and so we are nice. I propose we put that meme to rest.

I originally wrote this in the very-not-nice aftermath of my post about RubySpec over a year ago. Now that the Ruby community is attempting to have a debate about adopting a code of conduct, the MINASWAN meme is being trotted out either as a reason why a code of conduct is not needed, or as a sufficient code of conduct in itself.

Why do I say MINASWAN is the most harmful thing in Ruby? There are many reasons that I’ll enumerate here. After reading this, you may notice other reasons I’ve missed.

  1. Why are we nice? We are nice to people because a) they deserve respect, and b) we make ourselves less human when we treat people disrespectfully.

    It’s good to have role models. I’m not discounting that at all. It’s important to emulate someone who displays attributes we value and admire. But Matz is a person, fallible just like everyone else. Matz is mean to people at times, just like everyone else. The reason we are nice is not because Matz is nice.

    We are nice because we value treating people with respect and if Matz is ever not nice, we’ll hold him accountable.

    I realize some people do think MINASWAN means we emulate Matz being nice, not that we are nice because Matz is nice. However, the meme puts it backwards and permits people to justify their behavior based on emulating Matz. When they emulate Matz not being nice, they aren’t really responsible for their own behavior, they’re just emulating Matz.

    Overall, this first point is probably the least harmful aspect of MINASWAN. The others are serious.

  2. The cult of personality. People are imperfect, fallible, petty, confused, even mean. Normal, average people, who also do heroic, kind, wonderful things.

    Cults of personality distort people into unrealistic false images. This has two very harmful consequences: a) questioning the cult leader is discouraged or punished because no one wants to see their idol knocked from their pedestal, and b) those who are not popular either feel themselves to not be as worthy, or are treated that way by others.

    The first consequence means that we disconnect from reality. We don’t require our esteemed personality to justify their opinions. In fact, we freely substitute their wrong opinions about reality for facts.

    The second consequence means that people either strive to join the cult of personality, actively perpetuating the disconnect from reality, or they recede, and the community is deprived of their valuable interaction.

  3. Disagreeing is not considered nice. Telling someone their code is poor quality and needs improvement, their idea is logically unsound, their effort is deficient; these are not “nice” things. But they are necessary and we need to get better at doing them.

    In fact, being genuinely concerned and honest with people, even when it’s something that may hurt their feelings, is the nicest thing you can do. You can craft your message with care, but in some cases, no amount of sugar makes the medicine go down. In other words, it does not matter how you give feedback to some people, they will take offense regardless. That’s not to say don’t try to be kind, but to recognize that every interaction has two responsible parties. No one makes you upset.

    This is absolutely not to say that you can harass someone and then claim to just be disagreeing. Disagreement and harassment, or making personal attacks, are completely different things.

    The consequence of “nice is agreeable”, spoken or unspoken, is most harmful to bystanders. They may know something isn’t right, but they silence themselves out of fear of being excluded for “not playing nice”.

  4. It erases or hides actual not-nice things. There are real, heart-breaking problems in the Ruby community and in technology generally.

    When racist, sexist, misogynist, homophobic, transphobic, etc. actions and comments are made by community leaders or members, we struggle to deal with the events in a healthy manner. We over-, or more usually, under-react because we resist being disabused of our comfortable and cherished view of a “nice” community. We don’t have the hard discussions.

    The erasure of the reality of harmful and abusive acts, by plastering “we are nice” on top, can be extremely painful for the community members who experience them.

    Again, we should aspire to be nice to others, where nice means respectful. A nice community will be one that shows visible and earnest support for the many diverse people who inhabit it. There will be loud and emotionally fulfilling debate and arguments from the wealth of differing opinions and perspectives involved. A nice community won’t be parading a charade of “nice” but instead elevating and amplifying the many voices that are silenced in other communities.

  5. It masks the bad behavior by people who are “nice”. It is a fallacy to assume that a person we label as nice wouldn’t be mean or abusive to someone else. They may not, but it’s a fallacy to assume they would not. Unfortunately, this reaction is far too common: “This person is nice, they would never do that.”

    Perpetuating this fallacy can be severely damaging to people, especially when they have been victimized by a popular person and that victimization was partially or completely private. Examples of this include sexual abuse or harassment by a popular or prominent member of a community. But the behavior does not need to be so severe as to be criminal. There are many ways to treat people abusively that are not criminal.

    The best way to avoid these damaging situations is to treat every person, regardless of how high or low their position, simply as fallible people deserving of both our respect and our skepticism. Never substitute an attribute we may assign to them, be it “nice” or “grumpy” or “weird” or any other thing, for them as a person. Matz may behave nicely toward some people, and we can say, “I saw Matz treat this person nicely”. But if we say, “Matz is a nice person”, we start substituting this attribute for Matz, who is simply a person that behaves this way and that way, sometimes in a way we like, and sometimes in a way we don’t.

  6. It enfeebles discourse in the community. Conflicts are inevitable. If we paper over conflicts with “being nice”, then when the conflicts inevitably happen: a) people don’t know how to argue well, and b) bystanders do not learn how to evaluate arguments.

    Arguing a point is a skill to be learned like any other. The terrible behavior we witness daily on the internet amidst some conflict is a consequence of not practicing how to argue. Which is not to say there is a magic formula that prevents people getting carried away with emotion. But it’s foolish to neglect to practice for something that is as inevitable, common and important as arguments. The way to practice is, of course, to argue.

    I separate this point from the one above about disagreeing because you can disagree with someone and never tell them about it. Disagreement can be entirely passive. Arguing is an active engagement. You’re putting yourself on the line, your thoughts, your emotions, and maybe even your reputation. You’re making yourself vulnerable to attacks from people who misunderstand, and sometimes willfully misunderstand.

Those are the reasons MINASWAN is harmful.

There are a lot of hard problems that we need to solve. To solve hard problems requires a lot of discussion and, inevitably, conflict. That’s because all hard problems involve some tension between opposing forces. The solution is a balance point. It’s unattainable without conflict. The goal must be to have productive conflict.

Similar to the idea that the tech industry is a meritocracy, the MINASWAN meme seems positive on the surface. When you start to dissect it, though, its harmful aspects become visible. Ultimately, it impoverishes Ruby. Rather than improving the atmosphere, it limits our ability to learn about and engage in useful conflict.

So, let’s use this occasion to study and practice having an argument. First, we must consider who the audience is. Second, we need some guidelines for engaging in an argument.

The Importance Of Audience

Every argument has an audience. To be effective, you must argue for your audience. On the other side, as a member of an audience, you must use your skepticism to the fullest extent.

There are four important aspects to an argument. The purpose of arguing is to express a truth or dispute a falsehood. Never enter an argument with the intention of changing the belief or opinion of your opponent. Your audience is always the people who are open-minded and curious, and who have something to gain from understanding your position.

Without open-minded people, you spend time futilely trying to change someone’s mind. It’s impossible to change someone’s mind. You can provide facts and concepts, but people make up their own minds. Having the goal of changing someone’s mind virtually ensures you will argue ineffectively.

Without people who have something to gain from understanding your position, expressing a truth or disputing a falsehood simply doesn’t matter. People are busy. They don’t have extra time to care about every falsehood perpetuated in the world. If they have something to gain, they may listen to you. Even if they have a lot to gain, they may not.

To repeat: Your audience is the people who are open-minded about the topic and have something to gain from understanding your message (even if they disagree with it).

Arguments Are Healthy

Arguments are healthy if we abide by some very simple rules: treat others with respect, and try to discover facts.

Don’t make personal attacks, don’t assume you know their motives, don’t contradict their stated intentions or motives, and try to describe behavior as an event, not a personal characteristic. Express your own emotions, not the other person’s.

Likewise, search for facts. Attempt to show independent evidence for facts. “All my friends know that X is true” does not make X a fact.

Having arguments is fine, important and healthy. It’s not drama and it should not be dismissed. Thinking someone is mistaken is not being “un-nice”. If the audience considerations above are met, you should consider having an argument. In fact, it may be the most helpful action you can take.

Adopt a Code of Conduct and Help the Community Grow

The MINASWAN meme is simplistic, reductionistic, and insufficient to encourage a healthy and welcoming community. It doesn’t provide guidance for how people should interact. It reduces people to some simple attribute, like nice or not nice. And it doesn’t provide any means to ensure that people treat each other respectfully.

A community of an arbitrary number of people from different cultures, beliefs, desires and goals is complex and messy. Some people usually behave kindly and respectfully. Some people act selfishly and some people harass others. People are also complex and change, for better or worse, over time.

A code of conduct sets the basis of interactions in a community and provides guidance and procedures to deal with deviance from that basis.

It’s time to let go of the MINASWAN fallacy and let the community grow up.

Banning Mr. Nutter For Repeated Harassment

Due to repeated harassment, personal attacks, and unwanted contact directed at me over nearly 10 years, I have decided to ban Charles Nutter from any participation in any forum or space, online or in-person, that is primarily devoted to the Rubinius project or any project primarily maintained by the Rubinius project.

I'm writing this post to tell you what happened, why I am taking this action, and to answer some of the questions I've received about it when discussing this with various people.

I'm posting this publicly because of the nature of Mr. Nutter's position in the Ruby community and the fact that repeated requests over many years to stop making personal attacks, and even to stop making contact, have been treated dismissively and have not been respected.

This post is not for Mr. Nutter's benefit. This post is for two different groups of people participating in the community: 1. for people who may observe Mr. Nutter's behavior and think it is acceptable: this post is to make clear that such behavior _is not_ acceptable and _will not be_ tolerated; 2. for people who observe Mr. Nutter's behavior and fear they may also be subjected to it: this post is to make clear that we do not accept harassment and we will take action against it.

What happened?

On January 2nd, Mr. Nutter inserted himself into a public Twitter conversation among some people who were discussing, and who disagreed with or questioned, the Rubinius versioning scheme and release process.

 

https://twitter.com/headius/status/683510909902393348

 

One of the conversation participants had mentioned the Rubinius Twitter account in a tweet that was not obviously related to any Rubinius tweet. Clicking into the tweet to view the conversation, I discovered Mr. Nutter's tweet above.

I then sent Mr. Nutter the following email:

 

Charles,

I have many, many times requested that when you disagree with [me]1, you not resort to personal attacks. Such behavior is unwanted, unprofessional, and unacceptable in any community. Repeatedly, you have agreed to stop making personal attacks, yet continue to do so.

Recently, you inserted yourself into discussions about the Rubinius versioning and release process, something that has absolutely nothing to do with you, and to which you were not invited, and have again resorted to making personal attacks.

https://twitter.com/headius/status/683510909902393348

(Attached is a screenshot if you delete the tweet.)

I have reported your behavior to Evan Phoenix and Sarah Mei. I'll also take the following actions:

1. I will write a Rubinius blog post calling out your behavior as a violation of the Rubinius Code of Conduct as referenced in this post ( http://rubinius.com/2014/11/10/rubinius-3-0-part-1-the-rubinius-team/) and shortly linked on the website. *You are banned from participating in any Rubinius-related project, space, and event. This includes any thread on any public forum, mailing list, or issue that is specifically related to Rubinius.* I'll be publishing this email in that post.

2. As I have a company that is intimately related to the Rubinius project, and on which your actions can have deleterious economic consequences, I will investigate any legal actions that may be available against you now or in the future.

You have engaged in these unprofessional and unacceptable behaviors for nearly ten years. You've repeatedly acknowledged they are unacceptable, promised to cease, yet continue to do so.

Regards,

Brian

 

Mr. Nutter's response to my email was the following:

Don't email me again.

Some questions and answers

Q. Why are you making such a big deal about this? Isn't this a relatively minor incident?

Yes, there is a vast distance between the sort of behaviors Mr. Nutter has displayed over the years and the level of abusive harassment that exists online.

I am well aware of the privilege that I enjoy. I am aware of a significant number of women on Twitter who receive astonishing, brutal, despicable harassment every day. I am not in any way suggesting that these different types of harassment are equivalent.

However, we are not going to tolerate the standard, "Well, if it's not something like a rape threat, it's not really harassment". Harassment of any kind absolutely and unquestionably needs to stop. Period.

Q. You only recently updated the website and project README to include a Code of Conduct. Isn't banning Mr. Nutter unfair?

No, it is not unfair. I have had numerous conversations with Mr. Nutter about this for years. His repeated behavior makes it clear that he thinks his behavior is justified and acceptable. More importantly, Mr. Nutter readily plays the civility card when he desires. He's not confused about what is and is not appropriate behavior.

Q. Won't taking this action scare other people away from contributing to the project? Won't people fear harsh consequences for a "minor" conduct infraction?

No, it won't. No one should be "scared" of contributing to a project that expects everyone to be treated with respect. If anyone thinks they may have trouble maintaining respectful interactions with others, they are not a good "culture fit" for the Rubinius project.

Q. Why did you report this to Evan Phoenix and Sarah Mei? What do they have to do with this?

Evan Phoenix and Sarah Mei are directors of Ruby Central, the organization that runs the two most important Ruby-related conferences in the world: RubyConf and RailsConf. They take responsibility for establishing the codes of conduct at those conferences. Mr. Nutter is a frequent speaker at Ruby conferences, and I think they need to be aware of situations like this.

Q. Why did you threaten legal action?

The biggest reason that so much harassment exists online is because the people doing it don't think there are consequences for their behavior.

It is true that often the harassment does not violate criminal codes. However, civil damages are a different area of law with different remedies. The criminal and civil legal systems exist to make society as a whole better for everyone. Harassment doesn't make society better and if it had legal consequences, it would be far less common.

Q. What can I do to contribute to a more respectful community?

Simple: demonstrate respect for others and expect to be treated with respect in return.

There are lots of problems to solve, and lots of things to have disagreements about, and conflict can be sustained without harassment and personal attacks.

Consider this tweet on Principles of Conversation that I read in Andrew Zolli's feed:

  1. Together, we know more.
  2. Be tough on ideas, gentle on people.
  3. Avoid jargon.
  4. Threads beat points.
  5. Proceed with generosity.

1 Edited for clarity.

What, How, Why? - Rubinius Logs, Metrics & Analysis

Knowing how your application is functioning when it is running is critically important. How much memory is it using? What part of the code is running? How much time is it taking? What was the context of an error that occurred?

To answer these questions effectively requires a system specifically built for inspection. Just tossing in set_trace_func (or its “object-oriented” incarnation TracePoint) is not sufficient.

For Rubinius, we’ve been building in the system components for inspectability over the past year. These include logging, metrics, and the essential components for always-on analysis tools. These capabilities form the foundation for understanding what your application is doing, how well the application is doing it, and why the application is behaving a particular way.

An Inspectable System

An inspectable system is one that is designed to provide system information in a usable and efficient manner. Inspectability is not something that can be retrofitted onto an existing system.

DTrace is an example of an inspectable system. It is comprised of two parts: the DTrace subsystem and the application probes that provide an interface between the application and the DTrace subsystem.

The DTrace subsystem includes a kernel component that executes a very specifically constrained language to gather application runtime data. The application probes are added to provide the DTrace system with specific access to application data.

DTrace separates monitoring the system from the system’s functionality, but still provides sufficient flexibility to gather relevant information in a specific context.

Three principles guided the design of DTrace: zero cost when not running; stability and security (or it would not be allowed in production); and reasonable cost when running (i.e., not just turning on a firehose that requires expensive post-processing, but rather emitting only the specifically desired information and possibly doing initial data processing at the collection site).

Logs

If there were negligible overhead, all we would need are logs.

From logs, we can understand exactly what an application is doing at a particular point in time. A typical log line gives 1. a timestamp, 2. some categorization of the event, and 3. some description.

From these we can derive metrics: periodicity, duration, frequency, and other measures. On these measures we could layer some sort of alerting system.

We could also derive sequence (in what order events occurred), adjacency (which events occurred together), and direct causality (which events state they triggered other events). This gives the ability to analyze the system behavior at a later time.
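As a rough illustration of deriving these measures from log lines, here is a minimal Ruby sketch. The log format, the sample entries, and the names (log, Entry) are invented for this example and are not actual Rubinius log output.

```ruby
require "time"

# Hypothetical log lines: ISO 8601 timestamp, category, description.
log = <<~LINES
  2016-01-05T10:00:00Z gc collection started
  2016-01-05T10:00:02Z gc collection finished
  2016-01-05T10:00:05Z jit method compiled
  2016-01-05T10:00:09Z gc collection started
LINES

Entry = Struct.new(:time, :category, :description)

entries = log.each_line.map do |line|
  stamp, category, description = line.chomp.split(" ", 3)
  Entry.new(Time.iso8601(stamp), category, description)
end

# Frequency: how many events were logged in each category.
frequency = entries.group_by(&:category).transform_values(&:size)

# Periodicity: seconds between consecutive "collection started" events.
starts = entries.select { |e| e.description == "collection started" }.map(&:time)
periods = starts.each_cons(2).map { |earlier, later| later - earlier }

# Sequence: the order in which events occurred.
sequence = entries.sort_by(&:time).map(&:description)
```

Everything here is computed after the fact from the text of the log, which is what makes logs so general, and also what makes them expensive.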

Unfortunately, the overhead of logging is definitely not negligible for most applications. This requires us to make trade-offs between knowing enough to understand basic application behavior and still getting reasonable application performance.

The first trade-off is one of content versus runtime expense. By eliminating most of the information in the logging of an event, we can replace many log entries with a single value.

For example, to know how many items are in a queue, we could log every event adding an item to the queue, or we could simply count every time an item is added to a queue.

In this way, metrics give us extremely valuable insight into an application’s performance, but at the expense of a loss of generality and loss of information (only specific events are counted and we lose the distribution of the events within a reporting interval).
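The queue example can be sketched in a few lines of Ruby. CountedQueue and its log_events option are hypothetical names used only to contrast the two approaches; this is not how Rubinius implements its counters.

```ruby
# Hypothetical queue wrapper, for illustration only.
class CountedQueue
  attr_reader :added, :log

  def initialize(log_events: false)
    @items = []
    @added = 0
    @log = []
    @log_events = log_events
  end

  def push(item)
    @items << item
    if @log_events
      # Full information, but one new entry (and String) per event.
      @log << "#{Time.now.utc} queue.add #{item.inspect}"
    else
      # A single integer: cheap, but what was added, and when, is lost.
      @added += 1
    end
    self
  end
end

logged = CountedQueue.new(log_events: true)
counted = CountedQueue.new
1_000.times { |i| logged.push(i); counted.push(i) }
```

After the loop, logged.log holds 1,000 strings while counted.added is just the integer 1000.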

The second trade-off is one of system complexity versus runtime expense. Logging events is probably the simplest (next to doing nothing) way to make a system inspectable. However, when the runtime cost is too great, replacing the analysis capability of logging requires many other, more complex, mechanisms.

We’ll look at metrics and analysis in Rubinius in the next two sections.

Metrics

Rubinius builds in a set of performance counters on various subsystems. You can view the counters currently defined in the source code.

The Rubinius metrics are monotonic, which makes the data more robust under sampling.
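To see why a monotonic counter is robust under sampling, consider this small Ruby sketch. The sample values are made up; the point is only that a rate computed from two readings of a running total does not depend on the readings in between.

```ruby
# Hypothetical samples of one monotonic counter: [seconds elapsed, running total].
samples = [[0.0, 0], [0.5, 40], [1.0, 90], [1.5, 150]]

# Events per second between two readings of a monotonic counter.
def rate(earlier, later)
  (later[1] - earlier[1]) / (later[0] - earlier[0])
end

# Using every sample gives a rate for each interval.
per_interval = samples.each_cons(2).map { |a, b| rate(a, b) }

# If the two middle samples are never collected, the overall rate
# is still recoverable from the first and last readings alone.
overall = rate(samples.first, samples.last)
```

If each report instead carried only the number of events since the previous report, losing a report would silently lose those events.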

The counters are contained in thread-local data to improve performance by eliminating data contention. A background thread runs asynchronously with the application running on Rubinius, constantly aggregating the counters and possibly emitting their values to either StatsD or the file system. (Writing an emitter is relatively easy and more emitters are planned.)
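The thread-local design can be approximated in plain Ruby. This is only a sketch of the idea, not the Rubinius implementation (which lives inside the virtual machine); ThreadLocalCounters and the counter name are hypothetical. Copying another thread's Hash while it may be updating is tolerable for a demonstration, but a real implementation would use fixed counter slots.

```ruby
# Hypothetical sketch of per-thread counters with a separate aggregation step.
class ThreadLocalCounters
  def initialize
    @slots = []          # one counter Hash per participating thread
    @lock = Mutex.new    # taken only when registering a thread or aggregating
    @key = :"metrics_#{object_id}"
  end

  # Hot path: touches only the calling thread's own Hash, so no lock is taken.
  def increment(name, by = 1)
    slot = Thread.current.thread_variable_get(@key) || register
    slot[name] += by
  end

  # What a background thread would do once per reporting interval.
  def aggregate
    snapshots = @lock.synchronize { @slots.map(&:dup) }
    snapshots.each_with_object(Hash.new(0)) do |slot, totals|
      slot.each { |name, value| totals[name] += value }
    end
  end

  private

  def register
    slot = Hash.new(0)
    Thread.current.thread_variable_set(@key, slot)
    @lock.synchronize { @slots << slot }
    slot
  end
end

counters = ThreadLocalCounters.new
workers = 4.times.map do
  Thread.new { 1_000.times { counters.increment(:"memory.allocations") } }
end
workers.each(&:join)
totals = counters.aggregate
```

Each worker only ever writes to its own Hash, so the increments never contend with each other; the cost of combining them is paid once per interval by the aggregator.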

Emitting and consuming these metrics is extremely easy.

To emit the metrics, either use a Rubinius command line option, or put the option in the RBXOPT environment variable. In this case, I’m turning on the file emitter and setting the reporting interval to 500 milliseconds (see rbx -Xhelp):

$ rbx -Xsystem.metrics.target=./metrics.dat.1 \
      -Xsystem.metrics.interval=500 \
      some_script.rb

I’ve written a quick script that I’ll be packaging up as the Rubinius Grapher gem to create a quick terminal ASCII graph from the data.

There’s a lot of work still needed on the grapher to improve the utility of the graphs, but it’s a great tool to quickly visualize the behavior of your application (e.g., for getting a quick handle on some issue you are debugging).

For a more useful system in production, check out a previous post by Jose Narvaez, “Rubinius Metrics meet InfluxDB part II”, on using InfluxDB and Grafana in a Docker container.

(If anyone would like to write the third edition of this post that updates the Docker instructions and fixes up the Docker image to use the current Rubinius metrics, that would be a much-appreciated contribution.)

So, Rubinius has built-in, easily accessible metrics on system components. We’re now focused on building out the analysis capability.

Analysis

From the above discussion of inspectable systems, we have a clear view of what is required for Rubinius to provide good analysis support: no cost when not in use, safe and secure, and reasonable overhead when in use.

We’ve started to build this system and it will be the topic of many future posts, so I’m only introducing it briefly here.

One of the major aspects of analysis in Rubinius is rooted in how we have built our object memory. Typically, when we think of memory, we think of our Ruby objects, but not usually of how our Ruby methods are running.

In Rubinius, the object graph (i.e., the Ruby objects and the graph created by these objects referencing each other) also contains information about the Ruby methods. The inline cache objects that record the type of object at a method call site are just plain Ruby objects. For example, in the method meow below, the type of repeat is Fixnum (because we pass in 5), so when meow runs, the inline cache for repeat.times records that repeat is a Fixnum, and also where the times method is found.

def meow(repeat)
  repeat.times { puts "Meow" }
end

meow 5

Consequently, analyzing the execution of our example script is possible by analyzing the object graph. There is extremely powerful application analysis possible just by looking at which methods call other methods. If you’re interested to learn more about this, take a look at Elemental Design Patterns.
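To make the idea of a call-site cache as a plain object more concrete, here is a small Ruby sketch. CallSiteCache and record_call_site are hypothetical stand-ins; in Rubinius the inline cache is created and updated by the virtual machine, not by code like this.

```ruby
# Hypothetical stand-in for an inline cache entry.
CallSiteCache = Struct.new(:name, :receiver_class, :method_owner)

# Record what a call site would learn the first time it runs:
# the class of the receiver and the module where the method was found.
def record_call_site(receiver, name)
  CallSiteCache.new(name, receiver.class, receiver.method(name).owner)
end

cache = record_call_site(5, :times)
cache.receiver_class  # Fixnum on the Rubies of this era, Integer on Ruby 2.4 and later
cache.method_owner    # Integer, where #times is defined
```

Because entries like this are ordinary objects, a tool can walk them the same way it walks any other part of the object graph.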

We hope this very brief introduction to application analysis in Rubinius will spark your curiosity. There’s lots more to come.

nil Is Not NULL, and Other Tales

With the 2.3 release, Ruby has introduced a new operator. Designated the “lonely operator”, this new Ruby syntax (&.) adds unnecessary complexity, inconsistency, and additional confusion for developers.

Edit: Here is the Ruby 2.3 News file describing the “lonely operator” (aka the “safe navigation operator”). And this is the Ruby Redmine feature ticket.

Ruby is often criticized for code that developers cannot easily understand or reason about. This new operator creates a second way to call a method that doesn’t improve code, nor does it improve the ability of a developer to reason about the code.

This post dissects some of the confusion about nil and explains the significant downsides of this new operator.

Fundamentally, there are two problems to solve: 1. Developers want their programs to be reasonably deterministic, and random runtime exceptions from an errant nil value are one of the most obvious things that interfere with that determinism; 2. Understanding the behavior of a program written with a dynamically-typed language requires specific tools.

The “lonely operator” partially and inconsistently addresses the first problem, but with significant and unnecessary complexity that does not pay for itself. It does nothing to address the second problem.

nil is an Object, NULL is a memory pointer

Sometimes developers from Ruby visit a mysterious and magical land, called Static Typing, and return with deliriously happy tales of programs that always work, are easy and inexpensive to write, and run for decades. Sometimes developers from that land visit Ruby and laugh. This makes some Ruby developers sad and they start thinking that Maybe if Ruby had Option “types”, they could easily write perfect code that runs for decades, too.

Sadly, when this happens, Ruby developers are confusing a simple little Ruby object for something that’s usually radically different in a “blub” language. Often, this other thing is a memory pointer, sometimes called NULL, which traditionally has the value 0. When 0 is used as a memory pointer, most computers very sternly complain, abort your program, and send you packing to the principal’s office.

But in Ruby, nil is an object. You can use it as a value, you can call methods on it, you can define methods for it. It’s not NULL and it doesn’t make your programs vulnerable to things that NULL makes your program vulnerable to.
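A few lines of Ruby show each of these claims. The describe method is a hypothetical name, added only to demonstrate that NilClass can be reopened like any other class.

```ruby
# nil is an ordinary value: it can sit in a collection or a variable.
readings = [21.5, nil, 19.0]
present = readings.compact   # drops the nil entries

# nil responds to methods like any other object.
is_nil = nil.nil?            # true
shown = nil.inspect          # "nil"
anded = nil & true           # false

# NilClass can be reopened to define new methods for nil.
class NilClass
  def describe
    "no value here"
  end
end

description = nil.describe
```

None of these operations touch a memory address; they are ordinary method calls on an ordinary object.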

Even when Ruby developers understand that nil is not NULL, they still often perceive it as some evil thing to be destroyed. This perspective usually comes from the fact that calling a method on nil that it doesn’t implement raises a runtime exception, which causes your program to abort.

To be absolutely clear, a segmentation fault and a runtime exception are radically different things. But the consequence, your program aborts, makes them seem quite similar.

In a statically-typed language, even if the implementation doesn’t treat its “nil-like” value as NULL, it still must resolve, at compile time, that a particular function or method will be called, and if its “nil” doesn’t support that, something must be done. This characteristic of a statically-typed language makes nil seem dangerous.

In Ruby, the fact that calling a method on nil that it doesn’t understand results in a runtime exception is not a fundamental aspect of the language. Instead, it’s a simple decision that was made, and it’s possible to make a different decision, and you can do this in your own program at any time. We’ll explore that aspect of nil later.

nil is mathematically realistic

In mathematics, and specifically with functions and sets, the idea of nil is both necessary and useful (just as it is in Ruby, but that’s discussed later).

A partial function is a function where every input may not map to an output. A value like nil signals that “no mapping is available for that input”. It’s a very useful concept.
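Ruby's core library already behaves this way in many places. The following sketch uses a Hash and Array#index as two everyday partial functions; the data is invented for the example.

```ruby
# A partial function from country to capital, modelled as a Hash.
capitals = { "France" => "Paris", "Japan" => "Tokyo" }

mapped = capitals["Japan"]        # "Tokyo"
unmapped = capitals["Atlantis"]   # nil: no mapping is available for that input

# Array#index is partial as well: not every value has a position.
position = [10, 20, 30].index(20)  # 1
no_position = [10, 20, 30].index(99)  # nil
```

In both cases nil is the answer to the question, not a sign that the question failed.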

With sets, we also know how to compute with such a concept. For example, consider set intersection, the operation that produces from two input sets, the set of members that both have in common. For example, { 1, 2, 3 } intersect { 3, 9, 12 } would yield the set { 3 }. In the case of two sets that don’t share any members, the result of the intersection operation is the null set, often denoted by { }.

For an operation like A intersect B union C, an exception is not raised if the result of A intersect B is the null set. There’s nothing particularly odd or special about this; it’s entirely natural.
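The same computation can be written with Ruby's Set class from the standard library. The variable names are arbitrary.

```ruby
require "set"

a = Set[1, 2, 3]
b = Set[4, 5]
c = Set[3, 9, 12]

common = a & c          # the members both sets share
nothing = a & b         # the null set: a and b share no members
combined = (a & b) | c  # computing with the null set just works
```

Here nothing is an empty Set, and combined is simply equal to c; no special case is needed anywhere.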

So why is nil such a big deal in Ruby? Ah, now that’s a good question.

nil is mistreated in Ruby

There are two problems with the way nil is treated in Ruby. One problem, listed above, is that nil is misunderstood and considered an unwanted nuisance. The other, bigger problem, is that nil is ill-defined in Ruby. It’s handicapped for no good reason.

In the first case, developers tend to see nil as an error, and hence, it is quite unwelcome. “Oh look, that method returned a nil, something must be wrong!”

No, not at all! Nothing need be wrong. The method returned a nil to say, “no value here!”

Related to the problem of seeing nil as an error condition, some Ruby APIs treat nil as a marker (or sentinel) value. In Rubinius, this is such a problem that we had to introduce a special value we called “undefined” to be able to implement the Ruby core library in Ruby. The problem is that some methods take nil as a value, but also have a default value. So, we were unable to distinguish between some_method() and some_method(nil).
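In plain Ruby, the same distinction can be approximated with a private sentinel object. This is not the Rubinius “undefined” value, which is a special value in the virtual machine; NOT_GIVEN and fill are hypothetical names used to show the shape of the problem.

```ruby
# A unique object that no caller can pass by accident.
NOT_GIVEN = Object.new.freeze

# Hypothetical method where nil is a meaningful argument,
# so nil cannot also serve as the "no argument" marker.
def fill(value = NOT_GIVEN)
  if value.equal?(NOT_GIVEN)
    :no_argument
  else
    [:argument, value]
  end
end

without_argument = fill       # :no_argument
with_nil = fill(nil)          # [:argument, nil]
```

With a default of nil instead of the sentinel, both calls would look identical inside the method.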

Finally, because nil is ill-defined, and because developers are inclined to see nil as an error condition, when a method is called on nil that it doesn’t understand and a runtime exception is raised, it reinforces this dysfunctional relationship with nil.

This brings us to the ill-conceived “lonely operator”.

The “lonely operator” is an unnecessary mistake

The “lonely operator” supposedly solves the problem of calling a method on nil that then raises a runtime exception causing the program to abort.

Unfortunately, it only partially solves this problem, further requiring the newly introduced #dig methods (also unnecessary). It isn’t necessary, and doesn’t even solve the problem. What a mess.

Essentially, the “lonely operator” adds unavoidable complexity to syntax and programs, adds cognitive load to developers, and is an incomplete solution to the problem. Let’s look at each of these in turn.

  1. It increases system and code complexity: At every place a method is called, there must now be a decision about whether to use . or &.. The lonely operator doubles the complexity of making a method call and due to interaction with other aspects of coding, significantly more than doubles overall complexity.
  2. It makes communication about code difficult: How do we communicate to other developers when to use . and when to use &.? Why is it acceptable to ignore exceptions in one area of the code, but not in another? What happens when that decision is distant from the code you are looking at? What happens when those assumptions change?
  3. It doesn’t solve the underlying problem: The real, and legitimately painful, problem that developers need help with is understanding what their programs are actually doing when they run them. We’ll look at this problem below.

nil is A Good Thing™

The object nil in Ruby is neither a dangerous thing, nor a bad thing. It’s actually a good thing!

If we focus on behavior, and objects inter-operating based on the behaviors they support, nil is a useful concept. It corresponds to “nothing”, “no behavior here”. It doesn’t need to interfere with code functioning, and is only relevant when delivering a result to the user. We need to be able to say, “Hello there, that thing you requested doesn’t actually have any representation”. If that even matters. Sometimes it doesn’t matter at all, and nil is just a blank.

The simple alternative to the “lonely operator”

The only thing that the “lonely operator”, and the new #dig methods, provide is the ability to ignore runtime exceptions from calling a method on nil.

A very simple alternative with no special syntax has existed in Ruby forever. Let’s see how that works.

First, we recall that nil is a singleton value of NilClass. Nothing special, just an object that already responds to a few methods, like #to_s, #to_h, #to_c, #to_a.
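A quick check of those conversions in plain Ruby (NIL_CONVERSIONS is just a name picked for this sketch):

```ruby
# Each conversion on nil returns the "empty" value of the target type.
NIL_CONVERSIONS = {
  to_s: nil.to_s,   # ""
  to_a: nil.to_a,   # []
  to_h: nil.to_h,   # {}
  to_c: nil.to_c    # Complex zero
}

NIL_CONVERSIONS.each { |name, value| puts "nil.#{name} => #{value.inspect}" }
```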

In Rubinius, you can find where a method in the core library is defined by calling #inspect on the Method object:

irb(main):001:0> nil.method(:method_missing)
=> #<Method: NilClass#method_missing (defined in Kernel at kernel/delta/kernel.rb:46)>

We can see that NilClass has inherited #method_missing from Kernel. Simple, we’ll just open NilClass, define our own #method_missing, and see how that works.

class NilClass
  def method_missing(*)
    self
  end
end

Here’s the way to look at it: nil is the value (or object) that turns every method into the identity method.

The concept of an identity function is fundamental in math. A function f(x) = x for all x is the identity function; it returns its input unchanged.

We could reverse this perspective and talk about the value instead of the function. We could say, NIL is the value that when passed to any function, the result is NIL: f(NIL) = NIL for all functions f.

In Ruby, we can do this with nil: nil.m => nil for (almost) any method m.

So, with the very simple addition above, let’s compare some code, first on Ruby 2.3.0 and then on Rubinius 3.5:

$ ruby -v
ruby 2.3.0p0 (2015-12-25 revision 53290) [x86_64-darwin15]
$ irb
irb(main):001:0> a = nil
=> nil
irb(main):002:0> a&.+2 * 3 + 5
=> nil
irb(main):003:0> h = {a: 1}
=> {:a=>1}
irb(main):004:0> h[:b][:c][1]
NoMethodError: undefined method `[]' for nil:NilClass
	from (irb):4
	from /Users/brianshirai/.rubies/ruby-2.3.0/bin/irb:11:in `<main>'
irb(main):005:0> h.dig(:b, :c, 1)
=> nil

Now, for Rubinius:

$ ruby -v
rubinius 3.5 (2.2.0 1453a0a5 2016-01-10 3.5.1 JI) [x86_64-darwin15.2.0]
$ irb
irb(main):001:0> class NilClass
irb(main):002:1> def method_missing(*)
irb(main):003:2> self
irb(main):004:2> end
irb(main):005:1> end
=> :method_missing
irb(main):006:0> a = nil
=> nil
irb(main):007:0> a + 2 * 3 + 5
=> nil
irb(main):008:0> h = {a: 1}
=> {:a=>1}
irb(main):009:0> h[:b][:c][1]
=> nil

We can see that we need to use undeniably more complex syntax (a&.+2), and inconsistent syntax (#dig), to achieve the same thing on Ruby 2.3.0. In contrast, with no syntax changes on Rubinius 3.5, by merely using fundamental Ruby features (i.e., defining a method), we have consistent syntax and the same result.

Avoiding runtime exceptions when calling methods on nil is both easy and natural in Ruby. No special syntax and no extra confusion required. But solving this part of the problem isn’t that important. It’s the second part of the problem that is way more important to solve for developers.

Ruby developers need to see where nil is

The real problem that developers need Ruby to solve is knowing where nil values come from when they are not desired. As demonstrated above, the simple solution to computing with nil without causing runtime exceptions has existed in Ruby forever.

This problem of knowing where a value comes from is much bigger than nil. It is a result of the fundamental tradeoff that a late-bound language (usually called a dynamically-typed language) makes relative to an eagerly-bound language (usually called a statically-typed language).

The tradeoff has a massive benefit that is under-appreciated. Late binding provides a malleable system that easily manages the complexity of high-uncertainty contexts. Objects can interact with other objects that provide certain behaviors. They do not have to be specific kinds of objects. The lessening of the constraints that the developer’s assumptions impose on the system can increase the utility and resilience of the system.

Unfortunately, as is often the case, we are seeing the world in black and white, picking one side, and missing half the picture. Late bound languages make simple code able to manage a lot of runtime complexity, but also potentially add a lot of confusion for the developer trying to understand what the runtime behavior actually is in a particular case.

The solution to the problem of understanding runtime behavior is a system that provides rich analysis features for the developer. Rubinius is building a system like this. Having a general solution is great, but we can get a lot of benefit from focusing specifically on nil.

Traceable nils in Rubinius

In Rubinius, there are two types of “objects”. There are objects like an Array, Object, or Hash instance, or an instance of some class in your code. These objects have two parts: 1. the object’s data, which lives somewhere in memory, and 2. the object’s reference, or pointer to the location of the object’s data.

There’s another kind of object in Rubinius. These are called immediate values because their data and their “reference” are the same thing. These are also called tagged pointers because the value is essentially a memory pointer where we’ve set one or more “tag” bits.

Values like 1, 0xcafe, true, false, and nil are immediate values, or tagged pointers, in Rubinius. This is what nil looks like as a (binary formatted) pointer value: 0b11010. If the least significant five bits of a pointer match that value, the value is considered by Rubinius to be nil.

In the past, we have only ever used that precise value. In other words, all the other bits are zero. But nothing requires this, and those other bits don’t need to be wasted. On 64-bit architectures, this gives Rubinius approximately 2^59 values of “nil”. That’s more than enough for a typical Rails app, I’m sure.

So, how can we use this abundance of nil values? Easy! When a method returns nil and that nil is not a value that was propagated from a value passed to the method, we can return a nil that is tagged for that specific method. The value behaves exactly as nil, when nil was a singleton value. But now we can find the source of the nil when we encounter it later. We can trace various paths of specific nils through code and help the developer understand why a particular value is nil.
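The idea can be sketched with plain integer arithmetic. This is only a model of the bit layout described above, not Rubinius code: the encoding of the upper bits, the NIL_SOURCES table, and all helper names are assumptions made for illustration.

```ruby
# Low five bits mark a value as nil (0b11010, as described above).
# Assumption: the remaining bits carry an ID for the method that produced the nil.
NIL_TAG      = 0b11010
NIL_TAG_BITS = 5
NIL_TAG_MASK = (1 << NIL_TAG_BITS) - 1

# Illustrative registry of source IDs; not a Rubinius data structure.
NIL_SOURCES = ["(untagged)", "Hash#[]", "Array#first"].freeze

# Build a nil "pointer" that remembers where it came from.
def tagged_nil(source_id)
  (source_id << NIL_TAG_BITS) | NIL_TAG
end

# Any word whose low five bits match the tag is nil, whatever the upper bits hold.
def nil_value?(word)
  (word & NIL_TAG_MASK) == NIL_TAG
end

# Recover the source from the upper bits.
def nil_source(word)
  NIL_SOURCES.fetch(word >> NIL_TAG_BITS)
end

word = tagged_nil(1)
puts nil_value?(word)   # true
puts nil_source(word)   # Hash#[]
```

With a source ID of 0, tagged_nil returns exactly 0b11010, so the classic singleton nil is just the untagged case.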

This is the problem that developers need Ruby to solve. It’s already possible with zero extra complexity of syntax, communication, and cognitive load to avoid runtime exceptions when calling methods on nil. But the system needs to help developers understand how their code is functioning, and why it is functioning that way.

There’s only one other feature we need to add to Rubinius: return value type caching. We already cache the type of values at method call sites, and those caches help the JIT generate more efficient machine code. With return value type caching in place, the JIT will improve and Rubinius runtime analysis tools will be even more powerful.

A system must support writing AND running code

Everything that we’ve been looking at in this post points to a fundamental concern with programming: Our systems have been woefully incomplete. There are extremely few problems that we know everything about up front. Most “interesting” problems have significant novelty or they would already have simple solutions. To support writing programs for these problems, we need to help the author write code and we need to help the author understand the code as it runs.

This is the fundamental problem that Rubinius is focused on. If that interests you, come hang out with us and talk about it in our Gitter chat.

Rubinius 3.0: The Third Epoch

Happy New Year!

A little more than a year ago, I published a series of posts about the focus for Rubinius 3.0 (part 1, part 2, part 3, part 4, part 5). We spent a lot of time last year on some architecture improvements, so now it’s time to get busy on those 3.0 goals. This post will catch you up to date.

What’s In A Version Number

I wrote two posts recently on the Rubinius versioning scheme and release process. Review those posts for details, but I’ll summarize the main ideas here.

First, Rubinius uses a versioning scheme that associates a “version number”, in the form of EPOCH.SEQ, with a particular git commit SHA via a git tag. The first part of the version number, the EPOCH, signifies “a period of time marked by notable events or particular characteristics” (see the dictionary definition). This post explains the Rubinius 3.x epoch.

The SEQ is a monotonically increasing number that has no other meaning than to signal that newer code is available. The Rubinius versioning scheme is emphatically not SemVer.
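As a rough sketch, comparing two versions under this scheme only needs the two numbers. The helpers parse_version and newer? are hypothetical names, and the sketch assumes labels written like "v3.5".

```ruby
# Hypothetical helper: split a label such as "v3.5" into [epoch, seq].
def parse_version(label)
  match = /\Av(\d+)\.(\d+)\z/.match(label)
  raise ArgumentError, "not an EPOCH.SEQ label: #{label.inspect}" unless match
  [match[1].to_i, match[2].to_i]
end

# SEQ is compared as a number, so v3.10 is newer than v3.9.
def newer?(a, b)
  (parse_version(a) <=> parse_version(b)) > 0
end

puts newer?("v3.10", "v3.9")   # true
puts newer?("v3.5", "v4.0")    # false
```

Nothing in the comparison implies compatibility; a larger number only signals that newer code is available.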

A new Rubinius release should only do some combination of these three things: 1. introduce completely new code; 2. add a deprecation notice; 3. remove previously deprecated code.

Second, the Rubinius release process is fully automated and initiated by pushing a git tag of the form “vX.Y”. The best part is that any Rubinius contributor is now empowered to create a Rubinius release.

Since the git tag does not exist independent of the git SHA that it references, a Rubinius release is a function from git tags (or the “release version” or label) to git SHAs. From math class, we may remember that a function can be represented by a set of pairs of the form (input, output). In the case of Rubinius releases, that would be (version label, git SHA).

The Rubinius versioning scheme makes trade-offs in favor of simplifying and accelerating the delivery of new features and fixes. The Rubinius release process makes trade-offs in favor of distributed, collaborative work without forcing people to synchronize and agree in advance. If a bad commit is tagged, any contributor can remove the git tag and push a new one. This provides resilience in the process and facilitates constant improvement.
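The “set of pairs” view can be modeled with a Hash. ReleaseMap and its methods are hypothetical names for this sketch, and the SHAs are made up.

```ruby
# A release as a function from version label to git SHA, modeled as a Hash.
class ReleaseMap
  def initialize
    @releases = {}   # version label => git SHA
  end

  # A function maps each input to exactly one output, so a label is tagged once.
  def tag(label, sha)
    raise ArgumentError, "#{label} is already tagged" if @releases.key?(label)
    @releases[label] = sha
  end

  # Removing a bad tag removes the pair.
  def untag(label)
    @releases.delete(label)
  end

  def sha_for(label)
    @releases.fetch(label)
  end

  def labels
    @releases.keys
  end
end

releases = ReleaseMap.new
releases.tag("v3.5", "aaaa111")
releases.tag("v3.6", "bbbb222")   # suppose this commit turns out to be bad
releases.untag("v3.6")
releases.tag("v3.7", "cccc333")   # a new tag on a good commit
puts releases.labels.inspect      # ["v3.5", "v3.7"]
```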

Finally, the delta from pre-3.0 to 3.0 is inconsequential. Everyone should immediately update to the current 3.x and continue updating as quickly as we release new versions. As part of 3.x, we’ll be introducing the facility to automatically update. I’m so confident that this is the correct way to deliver software today that I’ll be dog-fooding automatic updates in production. If you haven’t heard of working this way, I recommend checking out Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation and Designing Delivery: Rethinking IT in the Digital Service Economy.

All-in On LLVM

With Rubinius 3.x, we are focusing exclusively on leveraging the amazing, and constantly improving, existing technology in LLVM. We will also only support clang/clang++ for building Rubinius.

Rubinius has been using LLVM for a long time. Initially, it was complicated to do so because clang/clang++ were not yet mature and passing the C++ spec, and almost no platform had LLVM packages. That’s changed dramatically in the past several years with clang/clang++ making huge strides, LLVM packages being nearly ubiquitous, and a massive on-going investment in LLVM by Apple to support the Swift programming language.

The top things we’re focused on with LLVM are these:

  1. Leveraging the full LLVM feature set to rewrite FFI to more completely interoperate with foreign code and to do so more naturally from Rubinius features (like the Rubinius instruction set).
  2. Expanding the artifacts that we can produce with LLVM from only JIT-compiled methods to full ahead-of-time (AoT) executables. A lot of people really like Go because it generates executables, but there are not many actual differences between Go and Ruby. With a little help, we’ll be generating executables for the same sorts of use cases that Go serves well. Don’t expect to compile your massive Rails monolith, but if you’re already working with simple, small services, you may be in luck.
  3. Integrating LLVM functionality more completely into Rubinius. For example, integrating lldb to fluidly switch between Ruby source level debugging and machine code debugging (either external libraries or JIT-compiled code).
  4. Creating a more useful set of intermediate representations (IRs) for describing the semantics of dynamic code before reducing to the semantics of LLVM’s IR.

These areas of focus will help make Rubinius more useful across problems that often appear to be in opposition, like having an executable versus a flexible managed runtime that supports quick prototyping and experimentation. These usually represent different phases in a program’s evolution, and they tend to be cyclical, not linear. Being forced to choose up-front to use a compiled language or managed runtime is unnecessary.

Eliminating The Ruby

Two priorities for Rubinius 3.x are removing Ruby as a build dependency and de-coupling Ruby from the Rubinius core systems.

Building Without Ruby

Using Ruby for the Rubinius build system was convenient and usually helpful, but it has a massive downside: installing Ruby is a major pain. So much so that Homebrew kicked Rubinius out as a project primarily because the Ruby build dependency was so difficult to manage. It has also frustrated the lives of numerous other package maintainers, from Gentoo to Arch to FreeBSD.

I’ve already started rewriting the build machinery to use Bash instead, which is a reasonable enough common denominator. The deploy automation is already using Bash. Once that is all working, we can look at supporting something like zsh, but for now Bash is way more ubiquitous and way easier to install than Ruby, so we’re going to start there.

Running Without Ruby

The second area where we’re removing Ruby is the core components of Rubinius itself. It is already possible to boot Rubinius using a completely different core library instead of the Ruby one. However, there are still Ruby assumptions baked into the instruction set, managed object system (object memory and garbage collectors), and especially in the hundred or so primitives, C++ code that implements operations that are undefined in Ruby semantics (like adding two machine integers or doubles). As I’ll explain below, these primitives will be completely removed.

Performance, Memory Use, Startup Speed

This is the part that I’m most excited about. We will finally be seriously focused on performance and demonstrating the capabilities of Rubinius based on the years of work we’ve done to build a good architecture for efficiently running dynamic code with full support for multi-core parallelism.

New Object Types

Things like bytes and primitive sequences of things don’t need to “interoperate” and don’t need to be “object-oriented”. So we’re adding non-object-oriented objects to the managed object system. This will support using “data” with “functions”, enabling us to write better performing code in places and, more importantly, supporting languages with semantics that differ from Ruby.

New Instructions

As mentioned previously, we’re removing all the “primitives” in favor of an instruction set that completely maps to the language semantics we aim to support. This lets us work with those semantics across the entire compiler tool chain. Even more important, this lets us transfer all our investment to another implementation of the instruction set. Two very promising targets for this are WebAssembly and the IBM J9 system.

Ultimately, we are re-imagining the instruction set from an internal implementation detail to a protocol for communicating language semantics between the system supporting authoring the program and the mechanism executing the program.

As part of evolving the instruction set, we’ll be splitting method dispatch, inline caching, and method invocation to support things like static method invocation, multi-method dispatch, and arbitrary method dispatch semantics. The inline caching mechanism is already the basis of our type-profiling JIT compiler and generalizing the caching will enable building more powerful, always-available code analysis.
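To illustrate how a call-site cache doubles as a type profile, here is a toy model in plain Ruby. CallSite and its methods are hypothetical names; this is not how Rubinius implements inline caches, and it ignores singleton methods and cache invalidation.

```ruby
# Toy call site that separates lookup (dispatch), caching, and invocation.
class CallSite
  attr_reader :hits, :misses

  def initialize(name)
    @name = name
    @cache = {}      # receiver class => UnboundMethod
    @hits = 0
    @misses = 0
  end

  def call(receiver, *args)
    klass = receiver.class
    method = @cache[klass]
    if method
      @hits += 1
    else
      @misses += 1
      method = @cache[klass] = klass.instance_method(@name)   # dispatch
    end
    method.bind(receiver).call(*args)                         # invocation
  end

  # The classes seen at this call site: a tiny type profile.
  def seen_types
    @cache.keys
  end
end

site = CallSite.new(:to_s)
[1, 2, :a, 3].each { |value| site.call(value) }
puts site.hits     # 2
puts site.misses   # 2
puts site.seen_types.inspect
```

Because the three steps are separate, the lookup step could be swapped for a different dispatch rule without touching the cache or the invocation.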

Better Garbage Collection

The Rubinius garbage collector is already generational and precise, but the young generation semi-space collector has two downsides: 1. it requires a full stop-the-world phase to execute, and 2. it wastes half the memory allocated to it.

We’ll be replacing the single young generation semi-space collector with a set of Immix collectors per Thread. (Immix is the collector we use in the mature generation.) The Immix-based collector has fast allocation, supports concurrent marking, can optionally compact, and compares very favorably in overall overhead with the semi-space collector.

Evented Semantics

Rubinius already does threading very well, supporting excellent parallelism on multi-core hardware. However, threading isn’t the only way to manage concurrency and as we’ve added more functionality, like the built-in performance counters that can stream out StatsD, we’ve seen a need for evented concurrency. So, we’ll be building a first-class event loop into Rubinius with the ability to run multiple event loops, each on their own Thread.
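A minimal sketch of “multiple event loops, each on their own Thread” in plain Ruby follows. EventLoop is a hypothetical class for illustration only; it handles queued callbacks and nothing else (no I/O readiness, no timers), and it is not the Rubinius design.

```ruby
require "thread"

# One Thread draining one Queue of callbacks.
class EventLoop
  def initialize
    @events = Queue.new
    @thread = Thread.new { run }
  end

  # Schedule a callback to run on this loop's Thread.
  def post(&callback)
    @events << callback
  end

  # Ask the loop to finish after the events already queued, then wait for it.
  def stop
    @events << :stop
    @thread.join
  end

  private

  def run
    while (event = @events.pop) != :stop
      event.call
    end
  end
end

results = Queue.new
event_loops = 2.times.map { EventLoop.new }
event_loops.each_with_index do |event_loop, index|
  event_loop.post { results << "loop #{index} handled an event" }
end
event_loops.each(&:stop)
puts results.size   # 2
```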

Blurring Writing and Running Code

One of the key components for supporting powerful runtime analysis, as well as the ability to AoT compile code, is the CodeDB we’re building. There’s a branch with some initial code, as well as a proof-of-concept I implemented almost five years ago.

The CodeDB assigns a unique ID to every executable context (in Ruby terms, scripts, class/module bodies, method bodies, block bodies). These IDs provide a “foreign key” of sorts to associate arbitrary dimensions of data (bytecode, LLVM IR, profiling data, coverage data, type profiles, etc.) with every executable context. This data can be used by other tools or analyzed to provide insight into program operation or performance.

The CodeDB also gives the ability to lazily load code and even to evict already loaded code that hasn’t been used for some period of time, since reloading the code is possible at any time if it were to be needed.
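Here is a rough sketch of those two ideas, the ID as a “foreign key” and lazy loading with eviction, in plain Ruby. The class below and every method on it are hypothetical; they are not the API of the Rubinius CodeDB branch.

```ruby
require "securerandom"

# Toy model: every executable context gets an ID; data hangs off that ID.
class CodeDB
  def initialize
    @dimensions = Hash.new { |hash, id| hash[id] = {} }   # ID => { dimension => data }
    @loaders = {}   # ID => callable that produces the code
    @loaded = {}    # ID => code currently in memory
  end

  # Register a context without loading it; returns its unique ID.
  def register(&loader)
    id = SecureRandom.uuid
    @loaders[id] = loader
    id
  end

  # Attach an arbitrary dimension of data (coverage, profile, ...) to an ID.
  def annotate(id, dimension, data)
    @dimensions[id][dimension] = data
  end

  def data_for(id, dimension)
    @dimensions[id][dimension]
  end

  # Lazy load: run the loader only on first use, or again after eviction.
  def code(id)
    @loaded[id] ||= @loaders.fetch(id).call
  end

  # Evicting is safe because the loader can always produce the code again.
  def evict(id)
    @loaded.delete(id)
  end

  def loaded?(id)
    @loaded.key?(id)
  end
end

db = CodeDB.new
id = db.register { "bytecode for Foo#bar" }
db.annotate(id, :coverage, 0.8)
puts db.loaded?(id)   # false
puts db.code(id)      # bytecode for Foo#bar
db.evict(id)
puts db.loaded?(id)   # false
```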

We’ll be building the CodeDB incrementally, focusing initially on lazy loading and basic program analysis features, then expanding the tooling support to enable more powerful analysis.

Your Alternative for Ruby

When I started contributing to Rubinius over nine years ago, the alternatives for writing programs were much fewer. We’ve all witnessed the rise of languages and systems like Go, Rust, Clojure, Scala, Elixir, Crystal, Julia, Node.js and others. Each of these is driven by some need that existing systems did not adequately address (in the author’s experience anyway).

Unfortunately, while these newer systems often have compelling features, using them requires discarding the previous investment in systems and applications. None of these systems encourage or support incrementally building better functionality in one portion of your existing system (unless you’ve already done the often significant work of building a microservice architecture).

In Rubinius, we want to provide a solid foundation for your existing investment, while opening up a world of possibilities to fix the most critical issues and continue to deliver value to your users or customers. Preserving your existing investment to the greatest extent possible maximizes the new value you can create.

So, this is what the Rubinius 3.x epoch looks like. If you’re interested, keep an eye on the Rubinius releases and drop into our Gitter channel for a chat.

Rubinius Compute - Programming for the Internet

Today, I’m announcing Rubinius Compute, a platform for computation inspired by Amazon Lambda that builds on the Rubinius language platform.

We see two major trends converging: First, more devices are being connected and those devices need to communicate, driving a constant rise in network use. Second, more data is being created, processed, and stored.

The problem is, there’s a massive asymmetry in the network. A relatively tiny number of “smart” nodes do all the work and receive most of the communication, while an enormous number of nodes mostly just add load to the network.

The solution requires changing the way we build apps, moving towards distributed, resilient networks of collaborating nodes where computation and data co-exist.

IPFS is working on distributing content. We need to pair this with distributing computation. To do so, we need to move beyond the dominant abstraction of a “desktop computer”, with operating system, libraries and packages, disk drives, etc. out in “the cloud”.

Rubinius Compute is a foundation for building apps in a familiar way, leveraging Rubinius Analyst to understand and evolve them, and distributing them to the network without messing with irrelevant details from leaky “system” abstractions.

None of these ideas are new: distributed content, raw compute nodes without operating system abstractions, and apps as distributed networks of collaborating agents. But combined, they represent the most important shift yet seen in how we use computation.

Let’s build the future.