Rice’s Theorem (in a nutshell):
Unless everything is specified, anything non-trivial (not directly provable from the partial specification you have) can’t be proved
AI Implications (in a nutshell):
You can have either unbounded learning (Turing-completeness) or provability – but never both at the same time!
I have been semi-publicly called out by Eliezer Yudkowsky. He posted the following on IEET* Director James Hughes’ Facebook Wall in response to a post that referenced my last article (Coherent Extrapolated Volition: The Next Generation):
Really? All of his ideas?
This was in response to the fact that, in a clarification as to whether I was mis-characterizing the SIAI’s position in an earlier article (The “Wicked Problem” of Existential Risk with AI (Artificial Intelligence)) by using the phrase 0% risk, I made the following statement:
In response, I have to ask “What percentage risk do you associate with something that is ‘provably safe’?” I recognize that the SIAI recognizes and would like to mitigate implementation risk – BUT they clearly insist upon a design with 0% risk. This is not a mis-characterization and, again, I have discussed it with the principals numerous times and they have never objected to the phrasing “insist on zero risk.”
And I certainly can’t let stand the claim that I “misrepresent” him (“misunderstand” might be barely tolerable since that is the game that he and the SIAI normally play, but “misrepresent” is an entirely different kettle of fish).
I also ensured that I posted his statement as an update to the article. I am a firm believer in the knowledge discovery process of science as opposed to the rhetorical process of argumentation. As such, I abhor misrepresentation along with all other attempts to obscure – like this strawman that appeared three minutes later.
That’s funny. I never ever claimed that he said such a thing. In fact, the clarification that he was responding to emphasized (by bolding the word design) the fact that I was only talking about design risk (after also specifically noting that the SIAI recognizes and would like to mitigate implementation risk). Isn’t all of this obvious from the first quote in this article?
At this point, he wants us to believe that his strawman successfully dismisses the entire article. I’d normally argue this immediately but there is a bigger, better fish in there to be pursued. Eli constantly refers to “AIs that prove changes correct”. Unless he cares to dispute this, it has always been quite apparent that he means “prove” in the sense of a logical/mathematical proof (i.e. guaranteed 100% correct). Yet, now, he is suddenly using the word reduce instead of eliminate. *That* is certainly worth exploring . . . and in less than a minute . . . .
Mark Waser – Does it reduce one kind of risk or eliminate it?
And after an hour-long pause:
This is arguably a bit rude/obnoxious as I tend to be overly impatient with “drive-bys”, rhetoric and argumentation (and I only repeat it here for completeness) but it does attempt to advance the conversation. H+ Magazine Editor Peter Rothman, in particular, has been trying to start numerous conversations about the impact of Rice’s theorem on Yudkowsky’s claims about provability and his “Friendly AI”. Successfully engaging Eli in such a conversation could clarify or resolve many issues that a number of people have with SIAI and “Friendly AI”.
Rice’s Theorem states that unless the input to a program is completely specified and the transition from input to output is completely specified, you can’t prove non-trivial properties about the output of the program (with non-trivial meaning anything that isn’t directly provable from the specification). Or, more simply, it is the relatively clear English statement above, which is basically true by definition (a definitional tautology) and should be comprehensible by anyone. Rice’s theorem uses the specific “term of art” partial functions to refer to programs, procedures, etc. that are not “total” (i.e. fully specified).
Inputs can only be completely specified if they, or the complete set of their ranges, are countable (enumerable). For example, the integers between one and ten are countable. Infinity is not countable. The integers between one and any concrete number are countable (though possibly not during a lifetime). The real numbers between 0 and 1 are infinite and uncountable despite being bounded; however, they can be divided into a countable number of sets that cover that complete range (for example, a set 0 <= x < 0.5 and a set 0.5 <= y <= 1) which can then be specified.
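As a rough illustration of the range-partitioning idea above, here is a minimal Python sketch. The function name `classify` and the labels "lower" and "upper" are hypothetical, chosen only for this example; the point is simply that every input either falls into one of the two specified sets or is rejected outright.

```python
def classify(x: float) -> str:
    """Map an input in [0, 1] to one of two specified ranges.

    Hypothetical example: "lower" stands for 0 <= x < 0.5 and
    "upper" stands for 0.5 <= x <= 1. Anything else is rejected.
    """
    # Reject everything outside the specified range (this also rejects NaN,
    # because every comparison with NaN is False).
    if not (0.0 <= x <= 1.0):
        raise ValueError("input outside the specified range [0, 1]")
    return "lower" if x < 0.5 else "upper"
```

There are uncountably many possible inputs, but only two cases (plus rejection) that the specification has to cover.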
CPU designs can be fully specified because they only accept binary inputs of specific fixed lengths (eminently countable). Some operating systems are provably safe because they only accept specified input and reject the set/range of all unspecified input. Similarly, certain programming languages (Agda, Charity, Epigram, etc.) disallow partial functions and, therefore, all programs written in them can be proved correct, but this disallowing carries a HEAVY price. Requiring that the transition from input to output be completely specified disallows any learning and change that is not completely specified. The “term of art” for unbounded learning systems is that they are “Turing-complete” (based upon Turing’s model of a state machine and an infinite input tape which, being infinite, cannot be counted/specified).
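To make the fixed-length-input case concrete, here is a small Python sketch under stated assumptions: an 8-bit word size (`WIDTH`) and a toy adder (`ripple_add`) that I made up for illustration, not any real CPU design. Because there are only 65,536 possible input pairs, the property "the adder agrees with addition modulo 256" can be checked exhaustively rather than sampled.

```python
from itertools import product

WIDTH = 8                 # hypothetical word size, chosen for illustration
MASK = (1 << WIDTH) - 1   # 0xFF for an 8-bit word


def ripple_add(a: int, b: int) -> int:
    """Add two WIDTH-bit values using only bitwise operations."""
    while b:
        carry = (a & b) << 1   # bits that carry into the next position
        a = (a ^ b) & MASK     # sum without the carries
        b = carry & MASK       # carries out of the top bit are dropped
    return a


def verify_adder() -> bool:
    """Check every possible input pair against the specification."""
    return all(
        ripple_add(a, b) == (a + b) & MASK
        for a, b in product(range(1 << WIDTH), repeat=2)
    )
```

Calling `verify_adder()` walks the entire input space. That is only possible because the input space is finite and fully enumerated, which is exactly what is lost once inputs are unbounded.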
The “killer consequence” of Rice’s theorem is that you can EITHER be Turing-complete (and have the capability of unbounded learning) OR have provable properties (like safety) BUT NOT BOTH.
So, his statement about what Rice’s Theorem says is a bit wonky but he is definitely very aware that CPUs are members of the class of “special cases for which the facts are provable”—and deploys that as a red herring. Notice, however, that he makes *NO* attempt to argue that his design is in that same class – I tend to assume because he is aware that it isn’t. He also deploys the standard SIAI post-obfuscation “this is so complicated that obviously no one except me can understand it”. Then, he continues to insist on conflating operational error/risk via cosmic rays with design errors before throwing in a huge new (to this thread) claim that “non-change-proving AIs are effectively *guaranteed to fail*.”
Of course, the most interesting facet of this last claim is that it means that if he can’t get around Rice’s theorem, he is claiming that his design is guaranteed to fail. My speculation has long been that this may well be the reason why he has decided to stop all work on implementing AI. So, I couldn’t resist replying in hopes of drawing him back.
Unfortunately, that is where the conversation stands. It would be excellent if this article could draw Eliezer into contact with the numerous others who agree that AI Safety is a critical issue but who don’t agree with many of his contentions. I am sure that Hank Pellissier would be absolutely delighted to publish any rebuttal that Eli would be willing to send his way. I similarly suspect that Peter Rothman would be very happy to publish a good rebuttal in H+ Magazine. And I will post an update to this article pointing to any rebuttal that I become aware of, even if it is only in the LessWrong echo chamber.
So, Eliezer, let me return the favor. I claim that you either misunderstand my position or are deliberately misrepresenting it. I claim that you either do not understand Rice’s Theorem’s implications for “Friendly AI” or that you are deliberately distorting and dodging them. Are you capable of maintaining your positions in the face of a rational scientific discourse without resorting to rhetoric, misrepresentations or other forms of the Dark Arts (as you yourself term them)? Or are you going to continue to “duck and weave” and avoid all meaningful engagement with those who honestly disagree with you?
*The Institute for Ethics & Emerging Technologies is an excellent non-profit, non-fear-mongering organization that includes “existential risk” in its portfolio of interests
* Image from http://yttalk.com/threads/big-vision.29714/