Planet Scheme

Syndicate content
Updated: 4 hours 34 min ago

Dominique Boucher: ECMAScript - turning a binary function into a variadic one

Wed, 08/25/2010 - 14:42
In my previous post, I mentioned that it's easy to take a binary JavaScript function and turn it into a variadic one. Here's the helper function to do that:
function variadic(fun, val0f, val1f) {  return function() {    var len = arguments.length;    if (len == 0) {       return val0f();    } else if (len == 1) {       if (val1f != undefined) {         return val1f(arguments[0]);       }       else {         return arguments[0];       }    } else {      var tmp = fun(arguments[0], arguments[1]);      for (index = 2; index < len; index++) {        tmp = fun(tmp, arguments[index]);      }      return tmp;    }  };}
The 'variadic' function takes three parameters:
  1. the binary function;
  2. a function that returns the value for the variadic version when called with zero arguments
  3. a function that returns the value for the variadic version when called on a single. If not specified, the first argument to the variadic function is returned as is when called on a single argument.
For example, the summation operator is defined as
var sum = variadic(function(x,y) {return x+y;},                   function() { return 0;},                   function(x) { return x;});
while the equivalent of Scheme's / function is defined as
var div = variadic(function(x,y) {return x/y; },                   function() { throw "not enough arguments to div"; },                   function(x) { return 1/x;});
It is now possible to call
div.apply(this, [1,2,3,4,5])
to get 0.008333333333333333.

Programming Praxis: Daniel Shanks’ Square Form Factorization Algorithm

Tue, 08/24/2010 - 10:00

In the mid-1970s, Daniel Shanks exploited the binary quadratic forms introduced by Legendre in the late eighteenth century and perfected by Gauss in the early nineteenth century to create a method of factoring integers. Binary quadratic forms are expressions of the form Ax2 + Bxy + Cy2 with A, B and C integers, normally represented as the triple (A, B, C). Binary quadratic forms are often called square forms.

We won’t discuss the mathematics of square forms, but we do need to know how to reduce a square form. A discriminant delta is calculated as Δ = b2 − 4ac. A single reduction step is called a rho reduction, and is given by ρ(a, b, c) = (c, r, (r2−Δ)/4c) where r is that unique integer b mod 2c such that −|c| < r <= |c|, if √Δ < |c|, or √Δ − 2|c| < r < √Δ, if |c| < √Δ.

A square form is in reduced form if |√Δ−2|a|| < b < √Δ; this reduction is accomplished by continually applying ρ until the reduced form is achieved.

Shanks’ method uses two loops. First, an initial square form is calculated, based on the number to be factored, which must be odd, composite, and not a perfect square; the details of that initialization are tedious, and appear in the solution on the next page. Then the square form is repeatedly reduced (this is the “forward” loop) until it is in proper form, which is described below. Then the reduced inverse square root of the square form calculated in the forward loop is reduced in a “backward” loop until a factor is identified when two successive square forms (A, B, C=c2) have the same B; the factor is then c, if c is odd, or c/2, if c is even. The reduced inverse square root of (A, B, C=c2) is initially (−c, B, −cA), which is then put into reduced form as described above.

Shanks defined a square form as proper using a queue of values derived in a rather complicated way. We will use a simpler method known as a sufficient list. Let L be the integer square root of the square root of the discriminant. If a square form (A, B, C) is encountered with |C| < L with C even, or |C| < L/2 with C odd, then C is placed on the sufficient list. Then a square form (A, B, C=c2) is proper if c is not on the sufficient list.

Your task is to write a function to factor integers using Shanks’ square form algorithm. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


Grant Rettke: Wrapping up the 2009-2010 School Year

Tue, 08/24/2010 - 04:36

This past May, I completed the Simulation and Parallel and Distributed Systems class that I was attending. While taking two classes while working full time was challenging; the pure fun of it all more than made up for the challenge! I will have fond memories of that semester for a long time.

I can’t wait to get started with Applied Mathematical Analysis next week.

Grant Rettke: Pondering repairing this rust

Mon, 08/23/2010 - 21:32

Options:

  1. Sand. Use light grit paper and paint.
  2. Seal. Use something like POR-15 to seal it and paint over that.

Jose Antonio Ortega: At the workshop

Mon, 08/23/2010 - 19:52

I just came back from the Scheme and Functional Programming Workshop at Montréal, hosted by Marc Feeley and an excellent organisation team. It’s been a fun couple of days putting faces to emails and IRC nicks and attending a handful of pretty interesting talks. Here’s a quick report.

My favourites were the two invited papers. The first day, Olin Shivers presented a pretty cool hack in a delicious talk consisting in actually writing the code he was explaining. Under the title Eager parsing and user interaction with call/cc, he showed us how call/cc is not just an academic toy, but can be put to good use in writing a self-correcting reader for s-expressions on top of the host scheme read (or any other parser, for that matter). He started from the very basics, explaining how input handling and buffering is usually delegated to the terminal driver, which offers a rather dumb, line-oriented service. Wouldn’t it be nice if, as soon as you typed an invalid character (say, a misplaced close paren) the reader complained, before waiting for the whole line to be submitted to read? Well, all we need to do is to implement the input driver in scheme, and he proceeded to show us how. As you know, reading s-expressions is a recursive task, meaning that when you detect invalid input in the middle of a partial s-expression, or want to delete a character, you might find yourself somewhere deep inside a stack of recursive calls and you’ll need to backtrack to a previous checkpoint. That’s an almost canonical use case for call/cc, provided you use it intelligently. Let me tell you that Olin is quite capable of using call/cc as it’s meant to be used, as he immediately demonstrated. I’m skipping the details in the hope that a paper will be available any time soon. As i mentioned, his talk was a beautiful example of live coding: he showed us the skeleton of the implementation and filled it up as he explained how it should work. Olin does know how to write good code, and it was a pleasure (and a lesson) seeing him doing just that. It was all so schemish: a terminal and emacs in the venerable twm: that’s all you need to create beauty.

The second invited talk was by Robby Findler, who gave us a tour of Racket’s contract system, how to use it and the subtleties of implementing it properly. The basic idea dates back to Meyer’s design by contract methodology of the early nineties, and was subsequently explored further by several authors, including Robby. Simple as they sound at first sight, good contracts are not trivial to implement. For instance, it’s vital to assign blame where’s blame is due, and Robby gave examples of how tricky that can get (and how Racket’s contracts do the right thing). Another subtlety arises when you try to write contracts assessing a property of an input data structure (say, you want to ensure that an argument is actually a binary search tree). The problem here is that checking the contract can alter the asymptotic complexity of the wrapped function (e.g., you can go from O(log n) to O(n) in a lookup, an exponential degradation). Racket provides an ingenious fix for that problem, by means of lazy contracts that are checked as the input is traversed by the “real” function.

There was also real-time scheme in Robby’s talk, although in a much more sophisticated way, thanks to Slideshow’s magic, which lets you embed code files and snippets in a presentation, evaluate them and show the results in the same or a new slide. Very elegant. He also used DrRacket a bit during the introductory part of his talk, and i’m starting to understand why some people are so happy with Racket’s IDE: it definitely felt, in his hands, professional and productive. And also kind of fun.

There were also lightning talks. I’m of course biased, but the one i enjoyed most was Andy Wingo’s Guile is OK!, where he showed us how Guile has overcome the problems, perceptual and real, of its first dozen years. For instance, he reminded us how Guile was traditionally a “defmacro scheme”, and he himself a “defmacro guy”… until he studied in earnest Dyvbig’s work and ported his syntax-case implementation to Guile, to become a “syntax-case man” as Guile gained full syntax-case support (i hope i’ll reach that nirvana some day; i still find syntax-case too complex and plagued by unintuitive corner cases (one of them was showed by Aaron Hsu in another lightning talk, where apparently none of us was able to correctly interpret 10 lines of scheme) that make me uneasy; but that’s surely just ignorance on my part). There are many other things that make Guile a respectable citizen of the Scheme Underground, which were also listed in Andy’s talk: i’ll ask him for a PDF, but in the meantime you can just try Guile and see :).

Although this time we didn’t have a talk by Will Clinger, to me it’s always a pleasure to listen to what he has to say, even if only as comments to other people’s talks. For instance, i enjoyed his introduction to Alex Shinn’s R7RS progress report. Will showed us three one dollar coins, of the same size and shape, but different, as he described, in almost everything else. And yet, all three were useful and recognised as (invalid) dollars by the Canadian vending machines at the entrance. He thinks that says something about standards, but he left to us to decide exactly what.

Finally, let me mention that this workshop has alleviated all my quibbles with what i’ve sometimes perceived as a fragmented, almost dysfunctional, community, made up of separate factions following their own path in relative isolation. My feeling during the workshop was nothing of the sort; rather, i’m back with the conviction that there’s much more uniting us that breaking us apart, and that there’s such a thing as a scheme underground ready to take over the world. Some day.


Filed under: Scheme

Grant Rettke: Delimited continuations on OCaml

Sat, 08/21/2010 - 16:20


Via Oleg:

The library delimcc directly implementing delimited continuations on
OCaml now supports native code (ocamlopt) on selected platforms.

http://okmij.org/ftp/continuations/caml-shift.tar.gz

The library delimcc implements shift/reset, prompt/control, shift0,
control0 delimited continuation operators with multiple, arbitrarily
typed prompts. The delimcc implementation is direct: it copies only
the relevant fragment of the OCaml stack. The implementation is fully
integrated with OCaml exceptions: exception handlers may be captured
in delimited continuations (and re-instated when the captured
continuation is installed); exceptions remove the prompts along the
way. The implementation has no typing problems, no bizarre ‘a cont
types, and no use for magic.

The library delimcc does *no* patching of the OCaml system and is a
regular (static or dynamically-loaded) library. The library is
therefore _perfectly_ compatible with any existing OCaml code, source
or binary.

The native- and byte-code versions of the library implement the
identical interface, described in delimcc.mli. Furthermore, the OCaml
code for the two versions is identical. Only the C code files,
implementing scAPI, differ for native-code.

Using the native-code version is identical to the byte-code version of
delimcc. No source code has to be changed; it has to be compiled using
ocamlopt rather than ocamlc.

This is the first release of the library, for OCaml 3.11. It has been
tested on i386 (x86_32) platform, on FreeBSD and Linux. The library
could perhaps be used on other platforms (on which stack grows
downwards); x86_32 is the only platform available to me. At any write,
the library contains no custom-written assembly code (although I
couldn’t avoid reading a lot of assembly code).

This is the first release of the native-code version, hence stressing
correctness at the expense of performance. A notable performance drain
is emulating data types with a custom GC scanning
function. Custom-scanned data types are possible without any changes
to OCaml, thanks to the provided GC hooks. Alas, the emulation doesn’t
seem to be efficient. A co-operation from the OCaml system would be
greatly appreciated. I have a hunch that custom-scan data types could
be useful for many other applications, for heap-allocating structures
with OCaml values intermixed with unboxed data.

It should be noted that the very operation of capturing and
reinstalling a delimited continuations will always be faster in
byte-code than in the native code. First of all, the byte-code
delimited continuation is smaller. Second, it is a uniform a sequence
of values or code pointers. In contrast, the corresponding captured
native-code delimited continuation — the portion of the C stack — is
not only bigger, but also contains unboxed values. We have to use
frame tables to figure out which stack slots contain live heap
pointers. Saving native-code delimited continuation is inherently more
complex. In fact, all the complexity of the native-code delimcc was
tidying the stack to please GC.

One should keep in mind the fundamental limitation of native-code
delimited continuations — the limitation that also holds for Scheme
and other systems with delimited control. The limitation concerns
capturing continuations in the OCaml code invoked as a call-back from
C code, which was, in turn, called from OCaml. It is safe to capture a
delimited continuation in the callback _provided_ the extent of the
captured continuation is limited to the call-back. The continuation
captured in the callback may be resumed within the call-back or after
the call-back has exited. One must not capture the continuation
through the C code that invoked the call-back. In short, no C frames
may be captured in delimited continuations. The library detects C
frames in captured continuations, and throws a run-time error. Such a
behavior of delimcc is consistent with the behavior of native-code
Scheme systems.

(via Caml-list)

Grant Rettke: Yaron’s Effective ML Lessons

Sat, 08/21/2010 - 06:28

* Favor readers over writers
* Create uniform interfaces
* Make illegal states unrepresentable
* Code for exhaustiveness
* Open few modules
* Make common errors obvious
* Avoid boilerplate
* Avoid complex type-hackery
* Don’t be puritanical about purity

(via janestreet)

Grant Rettke: How to Measure Anything

Sat, 08/21/2010 - 05:28

This looks like a great book.

Grant Rettke: Welcome New Connie Owners

Fri, 08/20/2010 - 23:50

Via John McCain:

This is a message I used to post to a Connie mail list monthly. If you just bought a Connie, it’s got some pretty good tips that might help you.

We sometimes forget to welcome the new Connie owners here, especially if you don’t introduce yourself by at least saying hi. I’ve been here a couple years and believe that the Concours owners community is one of the best things about this bike. It more than makes up for the “old” technology in my Connie. (well, I’d still like to have FI and hydraulic valves).

Here are a few suggestions, things I’ve learned along the way, that might help you enjoy your bike. Sometimes, I’m a slow learner, but these have sure helped me.

The first few things a new Connie owner should do while enjoying this great bike are (perhaps not in this order)…

1. Join COG (Concours Owners Group) http://www.concours.org . That gets you a great book of hints and tips (Best of Chalkdust) and fellowship (along with assistance) with some wonderful Connie owners. I’ve yet to attend an event other than a ride-to-eat meal, but rumors are that they’re great.

2. Buy the official Kaw manual for the bike. Your Kaw dealer or ebay.

3. Buy a Clymers (or equivalent) manual for the bike. Any motorcycle shop or ebay.

I have both manuals, and as a beginning motorcycle mechanic, actually need both. Clymers gives step-by-step instructions for tasks, while the Kaw manual simply says “do it” in too many cases. But the Kay manual has more technical information (torque values and such) and covers some heavy mechanic’n better. You can often pick them up on ebay for 1/2 price… but it’s worth paying full price for these.

4. If it’s a used bike that didn’t come with the owner’s manual. Buy one. Read it a couple of times. Take notes. It contains lots of useful information. For example, there are adjustments to the front and rear suspension (that make a HUGE difference in the ride), location of accessory power points (in front and under the seat), and a maintenance schedule and how to find those hidden helmet locks.

5. Join the newsgroups/mail lists… Guess you’ve already joined at least one :). There are several worth noting:

Concourstech@yahoogroups (supposedly for more technical assistance type discussions). http://autos.groups.yahoo.com/group/concourstech/

Concourse Owners ( cog@micapeak.com ) for general questions and discussions (including sheep, tank bag clocks, and Connie sightings)

The COG forum at http://www.concours.org/forum/default.asp THE Connie owners’ group.

Many of us only participate in one or two of these regularly, but there’s a wealth of info in all of them. If you only participate in one regularly, it’s worth checking the others from time-to-time.

6. Start doing your own maintenance if you have any mechanical aptitude or wannabe attitude. It’s easy with those manuals, saves a ton of money, and there’s plenty of help here. By saving receipts and making notes in a log, the warranty is honored if you do your own work, and you’ll probably do a better job than most bike shops. If it’s a used bike, you can start by replacing all the fluids and checking the maintenance items such as grease, air pressures, brake pads, etc.

7. Check out some of the commonly used references. This is my list, but there are also others out there. The regional COG sites are also often full of tips and tricks.

* http://www.ldrider.ca/techpages Technical info and others’ experience
* http://www.automated-design.com/concours/parts.htm Generic part numbers
* http://www.concours.org/list.faq.html COG technical pages
* http://www.carveyparker.com/concours.htm Whoo Hoo’s web Connie web page
* http://www.mindspring.com/~gbyoung2/ Hidden under “Misc. Technical stuff” are some real gems.
* http://home.austin.rr.com/mrandol/Connie/index.htm Has a great tire index amongst other technical musings
* http://www.concours.org/sc/kawasaki_ap_manual.htm Here’s a list of things that your dealer(if a new bike) or previous owner (if used) forgot to do. Go through this list and check this stuff. It’s worth the time.

There are so many Connie oriented web pages out there that Google returns 357,000+ hits for “Kawasaki Concours”. You can consume a lot of time reading them when you should be out riding.

For any other questions, ask them on any of these mail lists or forums. Connie owners are a friendly bunch and a huge resource with lots of information from both the educational perspective and the practical experience perspective.

Oh, and I’m assuming that the Connie isn’t your first bike. If it is, you have a bit of a steep learning curve. The Connie, with all it’s great points, isn’t known as an easy bike for the beginning rider. I can’t imagine taking a DMV test on one, and it takes getting used to before you’re comfortable at low speeds. So if you decided on a Connie as a first bike, get some serious riding instruction from a MSF type course. Sooner or later, you’ll probably drop her, and that plastic IS expensive (but it can often be glued back together). Yes it’s a tall, top-heavy bike, but she quickly loses that weight when you get her moving.

Enjoy your new bike!

Joe Marshall: INTERNATIONAL LISP CONFERENCE 2010 - HIGHLIGHTS and CALL for PAPERS

Fri, 08/20/2010 - 16:53
With the usual apologies to those who receive multiple copies of this...

 INTERNATIONAL LISP CONFERENCE 2010 - HIGHLIGHTS and CALL for PAPERS 

Important Dates:
~~~~~~~~~~~~~~~~

 * Deadline for all submissions (FIRM): September 6, 2010
 * Early registration: September 16, 2010
 * Author notification: September 20, 2010
 * Final paper due (in electronic form): October 5, 2010
 * Conference: October 19-21, 2010


Invited Speakers:
~~~~~~~~~~~~~~~~~

We are proud to announce that, for the 2010 edition, we will have the
following invited talks:

 * Lawrence Hunter
     Building a Mind for Life

 * Jans Aasman
     AllegroGraph and the Linked Open Data Cloud

 * Marc Feeley
     Gambit Scheme: Inside Out

 * Peter Seibel
     Common Lisp Standardization: The Good, the Bad, and the Ugly

 * Don Syme
     F#: Taking Succinct, Efficient, Typed Functional Programming 
     into the Mainstream

 * Lowel Hawkinson
     Lisp for Breakthrough Products

More information about speakers and talks is available at
http://www.international-lisp-conference.org/2010/speakers


Registration:
~~~~~~~~~~~~~

Rates for Early Registration (before or on September 16, 2010)

ALU member
 & ACM member      $355
 & ACM non member  $390
 & Student         $150

ALU Individual Sponsors
 & ACM member      $280
 & ACM non-member  $315
 & Student         n/a

non-member of ALU
 & ACM member      $430
 & ACM non-member  $465
 & Student         $165

Rates for Late Registration (after September 16, 2010 & Onsite)

ALU member
 & ACM member      $440
 & ACM non member  $485
 & Student         $185

ALU Individual Sponsors
 & ACM member      $365
 & ACM non-member  $410
 & Student         n/a

non-member of ALU
 & ACM member      $515
 & ACM non-member  $560
 & Student         $200

Due to colocation, registration must be done using ILC/SPLASH'10
unified registration forms available at http://splashcon.org/

Please note that the registration page (page 3) has the option "SPLASH
(OOPSLA/Onward!)" selected by default.  If you are only planning to
attend ILC, don't forget to deselect that option.


Travel and Accommodation:
~~~~~~~~~~~~~~~~~~~~~~~~~

SouthWest Airlines offers low fares into Reno but requires booking
online at www.southwest.com

John Ascuaga's Nugget offers reduced rates for ILC participants.
Please, visit http://splashcon.org/ to obtain the group code.


Scope:
~~~~~~

Lisp is one of the greatest ideas from computer science and a major
influence for almost all programming languages and for all
sufficiently complex software applications.

The International Lisp Conference is a forum for the discussion of
Lisp and, in particular, the design, implementation and application of
any of the Lisp dialects.  We encourage everyone interested in Lisp to
participate.

We invite high quality submissions in all areas involving Lisp
dialects and any other languages in the Lisp family, including, but
not limited to, ACL2, AutoLisp, Clojure, Common Lisp, ECMAScript,
Dylan, Emacs Lisp, ISLISP, Racket, Scheme, etc.

Topics may include any and all combinations of Lisp and:

 * Language design and implementation
 * Language critique
 * Language integration, inter-operation and deployment
 * Applications (especially commercial)
 * 'Pearls' (of wisdom)
 * Experience reports and case studies
 * Reflection, meta-object protocols, meta-programming
 * Domain-specific languages
 * Programming paradigms and environments
 * Parallel and distributed computing
 * Software evolution
 * Theorem proving
 * Scientific computing
 * Data mining
 * Semantic web

We also encourage submissions about known ideas as long as they are
presented in a new setting and/or in a highly elegant way.

Authors concerned about the appropriateness of a topic may communicate
by electronic mail with the program chair prior to submission.

Accepted papers will be published in the ACM Digital Library (PENDING).

Papers must be written in English and submitted electronically at
http://www.easychair.org/conferences?conf=ilc2010 in PDF or WORD
format.  Final submissions must not exceed 15 pages and need to use
the ACM format, for which templates can be found at:
http://www.acm.org/sigs/pubs/proceed/template.html.

Each paper should explain its contributions in both general and
technical terms, identifying what has been accomplished, explaining
why it is significant, and comparing it with previous work. Authors
should strive to make their papers understandable to a broad audience.
Each paper will be judged according to its significance, novelty,
correctness, clarity, and elegance.

The official language of the conference is English.  Some further
information is available at the conference web site, with more details
added later.  See: http://www.international-lisp-conference.org/

Technical Program:
~~~~~~~~~~~~~~~~~~

Original submissions in all areas related to the conference themes are
invited for the following categories.

 * Papers: Technical papers of up to 15 pages that describe original
   results or explain known ideas in new and elegant ways.

 * Demonstrations: Abstracts of up to 4 pages for demonstrations of
   tools, libraries, and applications.

 * Tutorials: Abstracts of up to 4 pages for in-depth presentations
   about topics of special interest for at least 90 minutes and up to
   180 minutes.

 * Workshops: Abstracts of up to 4 pages for groups of people who
   intend to work on a focused topic for half a day.

 * Panel discussions: Abstracts of up to 4 pages for discussions about
   current themes. Panel discussion proposals must mention panel
   members who are willing to partake in a discussion.

 * Lightning talks: Abstracts of up to one page for talks to last for
   no more than 5 minutes.

Depending on the technical content, each submitted paper will be
classified by the program committee as either a technical paper or as
an experience paper; and authors will be informed about this
classification.  Note that all interesting submissions are considered
valuable contributions to the success of the ILC series of
conferences.  As in past ILC's since 2007, accepted papers in both
categories will be presented at the conference, included in the
proceedings, and submitted to the ACM digital library.


Organizing Committee:
~~~~~~~~~~~~~~~~~~~~~

 * General Chair:
   JonL White - The Ginger IceCream Factory of Palo Alto, ALU

 * Program Chair:
   Antonio Leitao - Instituto Superior Tecnico/INESC-ID

 * Conference Treasurer:
   Duane Rettig - Franz, Inc., ALU Director

 * Publicity Chair:
   Daniel Herring - ALU Director

 * ALU Treasurer:
   Rusty Johnson - TASC, Inc., ALU Director


Program Committee:
~~~~~~~~~~~~~~~~~~

 * Antonio Leitao - Instituto Superior Tecnico/INESC-ID, Portugal
 * Alex Fukunaga - University of Tokyo, Japan
 * Charlotte Herzeel - Vrije Universiteit Brussel, Belgium
 * Christophe Rhodes - Goldsmiths College, University of London, UK
 * Didier Verna - EPITA Research and Development Laboratory, France
 * Duane Rettig - Franz, Inc., USA
 * Giuseppe Attardi - University of Pisa, Italy
 * Jeff Shrager - Symbolic Systems Program, Stanford University, USA
 * Joe Marshall - Google, Inc., USA
 * Julian Padget - University of Bath, UK
 * Keith Corbett - Clozure Associates, USA
 * Kent Pitman - PTC, USA
 * Manuel Serrano - INRIA Sophia Antipolis, France
 * Marc Feeley - University of Montreal, Canada
 * Marie Beurton-Aimar University of Bordeaux 1, France
 * Mark Stickel - SRI International, USA
 * Matthias Felleisen - Northeastern University, USA
 * Scott McKay - ITA Software, USA


Contacts:
~~~~~~~~~

 * Questions: ilc10-organizing-committee at alu.org

 * Program Chair: ilc2010 at easychair.org

For more information, see http://www.international-lisp-conference.org/

Will Donnelly: Announcing the UCL Portability Libraries

Fri, 08/20/2010 - 16:33

I spent the last year learning Haskell, and it is a very cool language. That said, I get the feeling that it also gains a gigantic competitive advantage from cabal-install and HackageDB. Package repositories and install tools for languages are not a new thing, but it was Hackage that made me realize how effectively a common package infrastructure can help to raise the popularity of a language.

And here's the thing: to a sufficiently fuzzy approximation, all the R6RS Schemes out there are interchangeable. They all offer a foreign function interface with roughly the same capabilities, they all offer the ability to launch and control subprocesses, and just about anything else can be built up from there. All of the cool things that can be built from these primitives, there is no fundamental reason they can't be portable.

So in April I started working on a set of portability packages, followed by an install tool, followed by a web-based package upload and browsing interface. And now it's all sufficiently done that I feel like I should release it.

So here you go.

Caveats

It supports Ikarus, Larceny, Mosh, Racket, and Ypsilon, at least on my machine.

The install tool could certainly use more features, like a package upgrade ability, code precompilation for the various Schemes, and complex version constraints on packages. The web interface is horrible code, and I'm just happy it seems to work as well as it does.

But what I am proud of are the simple but extensible package format, with myriad available extension points for new features, the simple repository format which allows anyone to start serving up packages if they don't like my offering, and the libraries which allow FFI and process manipulation to operate identically across Schemes.

Similar Projects

This code doesn't exist in a vacuum, and there are other projects whose scope overlaps with mine.

The most noticeable, to me, is Marco Maggi's Nausicaa, a gargantuan library distribution for R6RS Schemes. I regret to say that I have no experience using it, but the sheer number of libraries is impressive. I would like nothing more than to see some of those libraries available through UCL Install at some point.

The only other R6RS package manager that I have heard of is Dorodango. It does not seem to have been officially released yet, and if I understand the documentation correctly, it only supports Ikarus and Ypsilon at this time. It does have an upgrade command, unlike UCL Install at this point.  I envy the awesome choice of name, and I look forward to seeing what the author does with this one in the future.

Finally we come to Snowfort, an older attempt at a Scheme packaging system. I do not believe it is R6RS compatible, as it defines its own module system, and it appears to have no FFI or other portability layers, so I believe packages are limited to pure Scheme code. I would like to look at either manually or automatically porting some of these packages at some point, if the authors are amenable.

Digg This  Reddit This  Stumble Now!  Buzz This  Vote on DZone  Share on Facebook  Bookmark this on Delicious  Kick It on DotNetKicks.com  Shout it  Share on LinkedIn  Bookmark this on Technorati  Post on Twitter  Google Buzz (aka. Google Reader)  

Programming Praxis: Marriage Sort

Fri, 08/20/2010 - 10:00

Today’s exercise is about a relatively new sorting algorithm. We start with an article Optimizing Your Wife by Kevin Brown, which proposes that the best way for a man to find a wife is to decide how many women he is willing to date before he chooses a wife, we’ll call that N, determine which of the first √N women is “best,” according to whatever matters to him, and then choose the next woman after the first √N that is better than any of the first √N women. For instance, to find the marriageable woman in a batch of a hundred, date ten of them, then marry the next one that is better than any of those ten. You may not find the optimal woman, but you’ll be close.

Eric Burnett turned Brown’s idea into a sorting algorithm. First, sample the first √N values at the beginning of an array, then swap any of the remaining values that are better than the greatest value of the sample to the end of the array, swap the greatest value of the sample just before those at the end, then recur on the smaller array before those greatest values. Finish the sort by performing insertion sort on the entire array; that will be quick, since most values are near their final positions.

Burnett’s algorithm requires three pointers: the current location of the end of the sample, the current location of the end of the array still under consideration, and a pointer that sweeps through the array. The time complexity is O(n1.5), which is similar to other sorting methods like shell sort and comb sort that have a first stage that nearly sorts the input followed by insertion sort to clean up the rest.

Your task is to write a function that sorts an array using marriage sort. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


Joe Marshall: P ≠ NP (part 2 correction)

Wed, 08/18/2010 - 17:41
In the previous post I said that one example of an NP problem is “Check if two computer programs always produce the same output.” jkff commented:The "check if two computer programs always produce the same output" is not NP. It's simply undecidable and cannot be solved in any amount of time. I blew it and oversimplified the statement. At Wikipedia, they have a list of commonly known NP-complete problems and on that list is “Inequivalence of simple functions.” I seriously mangled the meaning because I was daydreaming about assembly code. What I should have done is put some restrictions on what sort of computer program I meant.

First, we have to restrict ourselves to programs that we can prove will always return an answer within a given amount of time. A computer program that does not have any loops, backward jumps (gotos), subroutine calls, or exceptions should simply run from from start to finish. Second, we do not allow the program to save any state between runs; the output must be a pure function of the input. Furthermore, we restrict the input to a finite set (for example, integers that expressible in a single 32-bit word). Now suppose we have two such programs. To prove they are not equivalent, we need to find a 32-bit integer input that makes each program give a different answer. By trial and error we will eventually find such an input or exhaustively show that there is none. However, if someone were to claim that a particular input produced different output, then it is easy to check by simply trying it out.

Joe Marshall: P ≠ NP (part 2)

Wed, 08/18/2010 - 16:57
As I mentioned in my last post, P and NP are commonly encountered complexity classes. Knowing what complexity class a problem is in can give us some idea of how difficult it may be to solve the problem. So what's the difference between P and NP? I don't want to get into the technical definition of P and NP, so I'm going to make broad generalizations that aren't technically rigorous, but will give you a feel for the difference.

Problems in class P are generally considered “tractable” by computer scientists. By this we mean that a computer can probably solve such a problem efficiently and relatively quickly. Recall that the complexity classes relate the ‘number of pieces’ to the difficulty. If you can solve a particular problem in P, then a similiar problem with a ‘larger number of pieces’ is likely within your grasp (and if not, it is usually a question of upgrading to a better computer). Problems in class P are the ‘meat and potatoes’ of computer programs.

Before I discuss NP, let me introduce the complexity class EXPTIME. EXPTIME is where you find some really hard problems, like computer chess or Go. It is true that there are computers that can play chess really well, but if we were to modify the rules of chess to add a couple of new pieces and make the board 9x9 instead of 8x8, it would greatly increase the difficulty of the game. Problems in EXPTIME are generally considered “intractable”. By this we mean that a computer will have a hard time solving such a problem efficiently and quickly. And if you are able to solve a particular problem in EXPTIME, then a similar problem with just ‘a few more pieces’ is likely beyond your grasp, no matter how fancy a computer you might buy. Problems in class EXPTIME take a lot of time and money to solve.

So what about NP? Problems in NP are quite curious. They seem to be very difficult to solve, much like EXPTIME problems, but they have the unusual property that it is very easy to check if a solution is correct. Let me illustrate: if I showed you a chess position and claimed that it was a good position for white, you'd have to do a lot of work to verify whether my claim was true. In fact, it would have to be about the same amount of work it took me to come up with the position in the first place. On the other hand, if I were to show you a jigsaw puzzle and claim that I had finished it, you could tell at a glance whether my claim were true. Problems in NP seem to be much harder to solve than problems in P, but as easy to verify as problems in P. That is a little weird.

Problems in NP are often encountered in computer programs, and many of these kind of problems, although very difficult to solve, have approximations that are relatively easy. In some cases, when a perfect solution is not needed, one that is ‘good enough’ will be a lot easier to compute. Another weird thing is that a lot of problems in NP don't sound at all like they'd be hard to solve. Here are some examples:
  • Take a group of people and divide them into two subgroups such that both subgroups have exactly the same amount of change in their pockets. (No, you aren't allowed to move the change around.)
  • Find the shortest path that visits all the major landmarks in a city.
  • Check if two computer programs always produce the same output.  Whoops!  I blew this one.  Check the next posting.
There is one final bit of weirdness. It is easy to prove that EXPTIME problems are much harder to solve that P problems, but no one has proven that NP problems are harder than P problems. (or that they aren't harder than P problems!) This isn't for lack of trying. Many smart people have worked on a proof for some time. There's a million dollar prize for the first person to prove this one way or the other. Recently, Vinay Deolalikar of HP Labs claimed that he has a proof. Unfortunately, other experts in the field have pointed out flaws in the proof that may invalidate it.

Programming Praxis: Cut

Tue, 08/17/2010 - 10:00

Unix V7 provided a utility called cut that reads its input a line at a time and selectively copies portions of each input line to standard output. Portions are selected either by character position or by character-delimited field. Cut is invoked as cut -clist [file ...] or cut -flist [-dchar] [file ...].

Character mode, invoked with the -c option, retains in the output those character positions that are mentioned in the list, which may contain column numbers, or ranges of column numbers separated by a dash, all separated by commas; counting starts from one. Field mode, invoked with the -f option, specifies a list of fields in a similar manner to character mode; fields are delimited by tab characters unless the field delimiter is changed by the -d option.

For example, the command cut -f1,3 -d: /etc/passwd prints user names and userid numbers from the password file.

Your task is to write a program to implement cut. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


Doug Williams: Simple Simulation in Racket

Sat, 08/14/2010 - 21:13
I updated the simple simulation example from the simulation collection to Racket. The simple simulation example shows how a basic discrete-event simulation engine can be written in pure Racket (specifically, the racket/base language). Fully commented it is only about two and a half pages of code and does not contain any references to external packages. If you're curious about the usage of continuations, this might be a good example to look at. [If you're an expert in continuations, any critique would be welcome - particularly on the event loop in the start-simulation procedure.]

The code uses parameters for its global variables, which may be unfamiliar to some people. For example, the simulation clock is maintained by the current-time parameter. This parameter is created by (make-parameter current-time 0.0). A call to (current-time) will return its current value. The call (current-time (event-time (current-event))) in start-simulation sets its value to the time of the next event. Finally, the parameterize call in run-simulation creates new instances of the parameters and (I think) is cleaner than just resetting their values.

And, here is the actual code.


#lang racket/base
;;; Simplified Simulation System

;;; Global simulation control variables

;;; future-event-list : (parameter/c (list? event?))
;;; current-time : (parameter/c real?)
;;; current-event : (parameter/c (or/c false/c event?))
;;; event-loop-exit : (parameter/c (or/c false/c continuation?))
;;; event-loop-next : (parameter/c (or/c false/c continuation?))
(define future-event-list (make-parameter '()))
(define current-time (make-parameter 0.0))
(define current-event (make-parameter #f))
(define event-loop-exit (make-parameter #f))
(define event-loop-next (make-parameter #f))

;;; Event definition and scheduling

;;; (struct event (time function arguments))
;;; time : (>=/c 0.0)
;;; function : (or/c false/c procedure?)
;;; arguments : list?
;;; Each event has a time the event is to be executed, the function to
;;; be executed, and the (evaluated) arguments to the function.
(struct event (time function arguments))

;;; (schedule event) -> any
;;; event : event?
;;; Add an event to the event list.
(define (schedule event)
(future-event-list (event-schedule event (future-event-list))))

;;; (event-schedule event event-list) -> (list-of event?)
;;; event : event?
;;; event-list : (list-of event?)
;;; Return a new list of events corresponding to the given event added
;;; to the given list of events.
(define (event-schedule event event-list)
(cond ((null? event-list)
(list event))
((< (event-time event)
(event-time (car event-list)))
(cons event event-list))
(else
(cons (car event-list)
(event-schedule event (cdr event-list))))))
;;; Simulation control routines

;;; (wait/work delay) -> any
;;; delay : (>=/c 0.0)
;;; Simulate the delay while work is being done. Add an event to
;;; execute the current continuation to the event list.
(define (wait/work delay)
(let/cc continue
;; Add an event to execute the current continuation
(schedule (event (+ (current-time) delay) continue '()))
;; Return to the main loop
((event-loop-next))))

;;; (start-simulation) -> any
;;; This is the main simulation loop. As long as there are events to
;;; be executed (or until the simulation is explicitly stopped), remove
;;; the next event from the event list, advance the clock to the time
;;; of the event, and apply the event's functions to its arguments.
(define (start-simulation)
(let/ec exit
;; Save the event loop exit continuation
(event-loop-exit exit)
;; Event loop
(let loop ()
;; Exit if no more events
(when (null? (future-event-list))
((event-loop-exit)))
(let/cc next
;; Save the event loop next continuation
(event-loop-next next)
;; Execute the next event
(current-event (car (future-event-list)))
(future-event-list (cdr (future-event-list)))
(current-time (event-time (current-event)))
(apply (event-function (current-event))
(event-arguments (current-event))))
(loop))))

;;; (stop-simulation) -> any
;;; Stop the execution of the current simulation (by jumping to its
;;; exit continuation).
(define (stop-simulation)
((event-loop-exit)))

;;; Random Distributions (to remove external dependencies)

;;; (random-flat a b) -> inexact-real?
;;; a : real?
;;; b : real?
;;; Returns a random real number from a uniform distribution between a
;;; and b.
(define (random-flat a b)
(+ a (* (random) (- b a))))

;;; (random-exponential mu) -> inexact-real?
;;; mu : real?
;;; Returns a random real number from an exponential distribution with
;;; mean mu.
(define (random-exponential mu)
(* (- mu) (log (random))))

;;; Example Simulation Model

;;; (generator n) -> any
;;; n : exact-positive-integer?
;;; Process to generate n customers arriving into the system.
(define (generator n)
(do ((i 0 (+ i 1)))
((= i n) (void))
(wait/work (random-exponential 4.0))
(schedule (event (current-time) customer (list i)))))

;;; (customer i) -> any
;;; i : exact-nonnegative-integer?
;;; The ith customer into the system. The customer is in the system
;;; 2 to 10 minutes and then leaves.
(define (customer i)
(printf "~a: customer ~a enters~n" (current-time) i)
(wait/work (random-flat 2.0 10.0))
(printf "~a: customer ~a leaves~n" (current-time) i))

;;; (run-simulation n) -> any
;;; n : exact-positive-integer?
;;; Run the simulation for n customers (or until explicitly stopped at
;;; some specified time).
(define (run-simulation n)
;; Create new global values
(parameterize ((future-event-list '())
(current-time 0.0)
(current-event #f)
(event-loop-exit #f)
(event-loop-next #f))
;; Schedule the customer generator
(schedule (event 0.0 generator (list n)))
;; Stop the simulation at the specified time (optional)
;(schedule (event 50.0 stop-simulation '()))
;; Start the simulation main loop
(start-simulation)))

;;; Run the simulation for 10 customers.
(run-simulation 10)


An example of the output looks like:


2.3985738870908615: customer 0 enters
5.395876334125138: customer 1 enters
7.716207787604494: customer 2 enters
8.062394508850671: customer 1 leaves
9.204092461998275: customer 0 leaves
14.431739507072376: customer 3 enters
15.15611565215575: customer 2 leaves
19.107487138771972: customer 4 enters
19.67563344212178: customer 3 leaves
26.51109574171283: customer 5 enters
27.060305975366514: customer 4 leaves
31.04777263526701: customer 6 enters
32.83444054122948: customer 5 leaves
36.31758748274069: customer 7 enters
39.27687483880874: customer 6 leaves
40.643350655011744: customer 8 enters
41.76876758081738: customer 7 leaves
42.82235216823591: customer 8 leaves
47.63479457664656: customer 9 enters
57.31661407095231: customer 9 leaves


Your output will vary slightly because of the random numbers. That is, this is a stochastic model.

Doug Williams: Status Update

Sat, 08/14/2010 - 20:23
I haven't posted for a while, so I just wanted to provide an update on the status and (hopeful) direction of my various PLaneT packages. This post will cover the science, simulation, and inference collections, which work together to provide a knowledge-based simulation capability.

We at SET Corporation are using the science, simulation, and inference collections as part of our Multi-Agent, Dynamic NEtwork Simulation System (MADNESS). The inference collection is also used in our COntent GENeration from Templates (COGENT) system. These are used on a daily basis in our agent-based simulation work.

Science Collection

The current release of the science collection is version 3.10. I only expect to do bug fixes to the version 3 code. The next release should be version 4.0. For version 4.0 I plan to convert the code to Typed Racket, which has gotten to the point that typed code generally runs as fast or faster than my hand-optimized numeric code.

The only major new functionality planned for version 4.0 are fast Fourier transforms. This code is already written and ready for release, but the documentation has not been updated. Vincent St-Amour at Northeastern University converted the FFT code to Typed Racket to test the optimized code generation for numeric processing. The results were extremely impressive and I need to make a separate post on this.

Simulation Collection

The current version of the simulation collection is version 3.4. I only expect to do bug fixes to the version 3 code. At some point I will release a version 4.0, which may (or may not) be in Typed Racket. The decision to use Typed Racket will be based on my experience in converting the science collection. I will also rewrite the simulation language using syntax-parse to provide better syntax error handling.

Inference Collection

The current version of the inference collection is version 2.7. There was a flurry of releases earlier this year as we really began to use the inference collection in MADNESS and COGENT. Unfortunately, the documentation has not yet caught up with the code changes.

I made earlier posts on the upcoming version 3 rewrite. As with the simulation collection, this may (or may not) be in Typed Scheme. I will definitely use syntax-parse to implement a rule compiler for the rule language.

Joe Marshall: P ≠ NP (part 1)

Fri, 08/13/2010 - 22:22
I'm sure most people who read this blog know a bit about what this means, but I want to try to explain it in a way my mom could understand. I won't go into excruciating detail, and I'll probably gloss over interesting points.

Let's start with an analogy. Suppose you go to the store to buy a present for your nephew. He likes jigsaw puzzles, so you want to get him one. He's going to be twelve years old, so you need to find one that has the appropriate challenge. A simple one with four pieces might be great for a two year-old, but your nephew would find that boring. A big one with ten thousand pieces is probably too much for him. You know he just finished one with five hundred pieces, so you pick one with six hundred pieces because it won't be too easy, but not too frustrating. What you are doing is estimating the complexity of the puzzle.

When you get to the store you find that they are sold out of jigsaw puzzles. They have a variety of other puzzles, though. They have a Rubik's cube, a Sudoku book, some cryptograms, a Tower of Hanoi, etc. They have models you can put together as well. But these things aren't jigsaw puzzles. You can't estimate the complexity the same way. A model with five hundred pieces would be quite a bit more difficult to assemble than a jigsaw puzzle with the same number of pieces.

Perhaps you look at the age ratings for the various puzzles. The eight-piece Tower of Hanoi puzzle is rated for ages ten and up, so maybe a twelve piece? The standard 3x3x3 Rubik's cube is rated for ages eight and up, but would the 4x4x4 be too hard or too easy? What about the 5x5x5?

The kind of rating system you use to describe the difficulty of a puzzle is analogous to the “complexity class” of a problem. The complexity class roughly describes how hard a problem will be to solve, and more importantly, how much harder ‘big’ problems are compared to the ‘small’ ones. Here's what I mean: if I take a jigsaw puzzle, any jigsaw puzzle, and double the number of pieces in it, I'll make it roughly twice as hard (more or less). If I take a cryptogram and double the number of words in it, I'll actually make it a lot easier. If I take the Tower of Hanoi and double the number of discs, I'll make it incredibly harder, maybe even impossible. These puzzles are in different complexity classes.

As it turns out, a lot of puzzles are in the same complexity class as the jigsaw puzzle: the difficulty is reasonably proportional to the number of pieces. Adding a piece or two doesn't change things that much, and it is easy to figure out how long it will take to solve if you've done a few of them. A lot of puzzles are more like the Tower of Hanoi. Adding a single piece will double the amount of time it takes to solve them.

P and NP are complexity classes for problems you might want to solve on a computer. There are all sorts of rating systems, but everyone is familiar with the MPAA movie ratings. There are all sorts of complexity classes, but most computer scientists are familiar with P and NP.

I'll pause here because the analogy is wearing thin. The main point is that “P ≠ NP” is a statement about complexity classes. In particular, it talks about two of the most popular complexity classes and that is part of the reason it is interesting.

Programming Praxis: E

Fri, 08/13/2010 - 10:00

This puzzle made the rounds of the programmer mathematician blogs a while ago:

Given a random number generator that returns a real number between zero and one, how many random numbers must be selected, on average, to make their accumulated total greater than one?

Your task is to write a program that answers the question above. When you are finished, you are welcome to read or run a suggested solution, or to post your own solution or discuss the exercise in the comments below.


Joe Marshall: Rambling

Thu, 08/12/2010 - 18:00
Writing a blog is hard. I need practice so I'm going to ramble a bit.

First, I'm very pleased that Blogger has improved their comment spam detection. Every time I made a post I'd have to come back a day or two later and remove a bunch of pointless comment spam.

Second, I told my daughter that someone claims to have a proof that P ≠ NP. She replied “Well, now we know that N doesn't equal 1.”

I read through the proof and I confess that I don't get it. On the other hand, I think I want to get it, so I'll have to learn some interesting things.

Here's another thing I don't get.`` E8 is any of several closely related exceptional simple Lie groups and Lie algebras of dimension 248''

``a simple Lie group is a connected non-abelian Lie group G which does not have nontrivial connected normal subgroups.''

``a Lie group is a group which is also a differentiable manifold, with the property that the group operations are compatible with the smooth structure.''

``a non-abelian group is a group (G, * ) in which there are at least two elements a and b of G such that a * b ≠ b * a.''

``a connected space is a topological space which cannot be represented as the union of two or more disjoint nonempty open subsets''