What’s merge? It’s what you do after a “diff”. What’s diff? It’s something that shows you the differences between two files in a human-readable way. More specifically, suppose that you and I are both working on a program. We’re sitting in front of different machines, trying to fix different bugs or add different features, and it just so happens that we both need to change graphics.java. After we’ve both made our changes, the world looks like this:
At this point, we need to combine our changes. We could scroll through two copies of the file side by side, copying edits from one to the other, but we’d almost certainly miss something or make a mistake. What we should do is use a program like diff to highlight the changes for us. Or better still, we should use a tool like merge to show your version of the file on the left, mine on the right, and the merge in between:
When we’re done merging, what we have is the best of both worlds—the best of your ideas combined with the best of mine. The biological term for this is recombination, and it’s at least as important to evolution as its more famous cousin, mutation, because it lets good genes (or ideas) cooperate.
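To make the idea concrete, here is a minimal sketch in Python. The file contents are invented for illustration, and `merge_same_length` is a hypothetical toy, not how real merge tools work: it assumes both of us only edited lines in place, so all three versions have the same number of lines. The diff itself comes from the standard library's `difflib.unified_diff`.

```python
import difflib

# Hypothetical contents of graphics.java, invented for illustration.
original = [
    "void draw() {",
    "    setColor(BLACK);",
    "    drawLine(0, 0, 10, 10);",
    "}",
]
yours = [
    "void draw() {",
    "    setColor(RED);",            # your change
    "    drawLine(0, 0, 10, 10);",
    "}",
]
mine = [
    "void draw() {",
    "    setColor(BLACK);",
    "    drawLine(0, 0, 20, 20);",   # my change
    "}",
]


def diff(old, new, old_name, new_name):
    """Return unified-diff lines describing how to turn old into new."""
    return list(difflib.unified_diff(old, new, old_name, new_name, lineterm=""))


def merge_same_length(base, a, b):
    """Toy three-way merge.

    Assumes all three versions have the same number of lines, i.e. edits
    only change lines in place. Returns (merged_lines, conflict_indices).
    """
    if not (len(base) == len(a) == len(b)):
        raise ValueError("toy merge only handles in-place line edits")
    merged, conflicts = [], []
    for i, (o, x, y) in enumerate(zip(base, a, b)):
        if x == y or y == o:
            merged.append(x)    # both agree, or only `a` touched this line
        elif x == o:
            merged.append(y)    # only `b` touched this line
        else:
            merged.append(o)    # both changed it differently: keep base, flag it
            conflicts.append(i)
    return merged, conflicts


diff_lines = diff(original, yours, "original", "yours")
print("\n".join(diff_lines))

merged, conflicts = merge_same_length(original, yours, mine)
print("\n".join(merged))
```

The merged result keeps your color change and my longer line. If we had both edited the same line in different ways, its index would show up in `conflicts` for a human to sort out, which is the part a side-by-side merge tool makes pleasant.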
Diff and merge make open source possible. They let dozens, hundreds, or thousands of people remix their work—not just take what others have done and build on it, but give back their own changes and ideas to be stirred back into the original for further remixing:
When remixing is hard, open collaboration doesn’t take root [1]. Education is a prime example: at some point in their career, every teacher has picked up someone else’s PowerPoint slides and used them as a starting point for their own lecture on the subject, but hardly anyone ever gives their changes back to the author of the slides they started from. It’s easy to say that’s because remixing isn’t part of educational culture, but there’s a reason it isn’t: PowerPoint decks can’t be diffed and merged [2]. If it takes me an hour to scroll through my slides, comparing them one by one with yours and copying changes back by hand, I’m not going to use what you send me, so you’re not going to send it in the first place [3]. Going back to our biological metaphor, people who can’t merge are stuck in a universe that has mutation but not recombination, and that’s a really inefficient way to improve fitness.
I’m thinking about all of this now because of the IPython Notebook and Mozilla Thimble. They’re both really exciting tools, but neither makes collaboration easy [4]. If I want to merge your changes to a project into my copy, I can’t view them side by side in the browser and pick the pieces I want from each. Instead, I have to merge two JSON files if I’m using the Notebook and, well, I’m not sure what I’d do with Thimble. I could view the differences in the text of the HTML and CSS, but anyone who can do that can build web pages without Thimble in the first place.
More to the point, people shouldn’t have to drop down a cognitive level or two in order to collaborate this way. Lots of graphic design tools can highlight and merge the differences between two photographs; DiffEngineX does it for Excel spreadsheets (though you need a pretty wide screen to use it effectively), and so on. There’s no technical reason we can’t diff and merge all our files; it’s just that programmers mostly work with text, so they haven’t built merging tools for other formats. (And increasingly, I believe they work with text because it’s what they can diff and merge in version control…)
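As a rough sketch of what diffing at the right level might look like for a notebook, the Python below compares two documents cell by cell instead of as raw JSON text. The two documents are hypothetical and heavily simplified: the `cells` and `source` keys are assumptions made for this example, not the full notebook format, and `cell_changes` is a helper invented here on top of the standard library's `difflib.SequenceMatcher`.

```python
import difflib
import json

# Two hypothetical, heavily simplified notebook documents.
# The "cells"/"source" layout is an assumption for illustration only.
yours_json = json.dumps({"cells": [
    {"source": "import numpy"},
    {"source": "x = 1"},
    {"source": "plot(x)"},
]})
mine_json = json.dumps({"cells": [
    {"source": "import numpy"},
    {"source": "x = 2"},
    {"source": "plot(x)"},
    {"source": "save()"},
]})


def cell_sources(text):
    """Extract the source text of each cell from a simplified notebook document."""
    return [cell["source"] for cell in json.loads(text)["cells"]]


def cell_changes(old_cells, new_cells):
    """Return (tag, old_cells, new_cells) for every run of cells that differs."""
    matcher = difflib.SequenceMatcher(None, old_cells, new_cells, autojunk=False)
    return [
        (tag, old_cells[i1:i2], new_cells[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


changes = cell_changes(cell_sources(yours_json), cell_sources(mine_json))
for tag, old, new in changes:
    print(tag, old, "->", new)
```

Here the report says one cell was replaced and one was added, which is the vocabulary a notebook user already thinks in. Showing those changes side by side in the browser is the part that still needs building.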
We’re smarter when we work together. It’s more fun, too, so I think tools ought to make collaboration as easy as adding a caption to a picture of a cat:
And that, my friends, would be a revolution.
- The exception is systems like Wikipedia that have just one copy of the document which everyone edits simultaneously, but like Google Docs and Etherpad, that clearly doesn’t work for programming, slide decks, or other situations in which people want to try different things at the same time.
- PowerPoint “merging” tools like these two just concatenate multiple presentations into one, or generate a specialized deck from a template by filling in blanks with names and dates (rather like spam generators).
- At this point programmers often say, “Then write your slides in Markdown or LaTeX or HTML5 or some other text-based format so that merging is easy,” but that’s like saying, “If you take all the pictures out of your book, it’ll compress much better.” PowerPoint, LibreOffice, Keynote, and other WYSIWYG presentation tools have survived and thrived because they make it easy for people to mix graphics and text however they want, just as they would on a whiteboard. As this blog post shows, it’s a lot harder to do this with text-based tools: I had to switch from my editor to a drawing package to create the diagrams included above, then upload them, and if you ask your browser to search for “Original Version”, it still won’t find that label in either of the diagrams. Given the choice between whiteboarding (which they take for granted) and merging (which they’ve never done before, and whose value they don’t yet understand), almost everyone will choose the former.
- More precisely, neither makes asynchronous collaboration easy. TowTruck lets people share dynamic browser sessions in real time, which is really cool, but as noted in , that’s a very different model than forking and merging.
Dr. Cameron Neylon
Director of Open Access Advocacy, Public Library of Science
4:00 p.m., Wednesday, May 1, 2013
Room 205, Bissell Building, University of Toronto, 140 St George St
The web, like all network technologies before it from the mobile phone to writing itself, has the potential to enable a qualitative change in our capacity as people, organizations and societies. We are starting to see the first glimmerings of how our research capacity might change with projects like Galaxy Zoo and Polymath but these remain isolated examples. What will it take to exploit the network capacity that the web brings us to enable a step change in the efficiency and effectiveness of our research?
This seminar will be the first in a series highlighting new opportunities to network knowledge through application of knowledge media design values and methodologies.
Software Carpentry is pleased to announce a two-day software skills boot camp for women in science and engineering, to be held in Boston this June. We’re currently trying to raise the $6000 needed to give 120 grad students (and others) a chance to improve their research computing skills while networking with peers; donations would be very welcome.
Why a boot camp specifically aimed at women? Because a large body of research has shown that without initiatives like this, the cycle of low participation today leading to low participation tomorrow will continue unchecked. For example, WiT reports:
In the Bayer Facts of Science Education XIV survey, women and minorities raised a number of barriers in their path to STEM careers, including:
- Lack of mentors (50%)
- Lack of role models (49%)
- Stereotypes adversely affecting women and minorities (39%)
- Lack of communication from STEM industry (39%)
- Self doubt (35%)
- Cost of education (31%)
- “Sense of isolation” (29%)
- A lack of solid math and science education in poorer schools (24%)
Issues like the lack of role models, lack of mentors, stereotypes, and a sense of isolation are effectively addressed by getting a bunch of women together in one room. We’re not just presenting the Software Carpentry material, we are also creating a community of women who will support each other in tangible and intangible ways. If you would like to learn more, one of the most thorough and most readable pieces of research in this area remains Margolis and Fisher’s Unlocking the Clubhouse, which reports their work in the late 1990s and early 2000s at Carnegie-Mellon.
U of Toronto PhD student Christian Muise created an application that was selected as a winner of Google’s Places API Challenge. The competition brought together 87 developers from 27 countries, challenging them to build apps that address some of the most pressing needs in our communities. The top three applications were declared winners, and will be showcased at Google I/O, the company’s annual developer conference.
Muise’s award-winning app, TTC Pass, is “a website that allows for collaborative editing of the locations for purchasing various transit fares” in Toronto. See a video of the app, or the Google Places API Developer Challenge webpage.
Clay Shirky’s latest piece, which points out that offline colleges are broken, has been attracting a fair bit of attention. What isn’t attracting attention is the way he and others are trying to frame the debate, which goes something like, “Yeah, MOOCs aren’t perfect, but have you actually looked at what most non-elite colleges actually offer?” This boils down to, “Our bad is slightly less bad than their bad,” but blithely ignores the fact that something much better than either exists. As Anu Partanen pointed out over a year ago, though, that “something better” would demand some uncomfortable soul-searching…
G.W. French, J.R. Kennaway, and A.M. Day: “Programs as visual, interactive documents.” Software – Practice and Experience (2013), DOI: 10.1002/spe.2182.
We present a novel approach to combined textual and visual programming by allowing visual, interactive objects to be embedded within textual source code and segments of source code to be further embedded within those objects. We retain the strengths of text-based source code, while enabling visual programming where it is beneficial. Additionally, embedded objects and code provide a simple object-oriented approach to adding a visual form of LISP-style macros to a language. The ability to freely combine source code and visual, interactive objects with one another allows for the construction of interactive programming tools and experimentation with novel programming language extensions. Our visual programming system is supported by a type coercion-based presentation protocol that displays normal Java and Python objects in a visual, interactive form. We have implemented our system within a prototype interactive programming environment called ‘The Larch Environment’.
This is cool: see their site for more information.
Releng 2013 is a one-day workshop (co-located with ICSE) to bring together release engineers and researchers to discuss the challenges in release engineering and develop areas for further research. Areas of discussion include research and practice of all activities in between regular development and actual usage of a software product by the end user, i.e., integration, build, test execution, packaging and delivery of software. The conference is May 20 in San Francisco, and the deadline for submission of talks is February 7th. For more information see http://releng.polymtl.ca.