(Provo) The AGI Laboratory in Provo, Utah is conducting a feasibility study comparing human intelligence with prototype Artificial General Intelligence (AGI) cognitive architectures, in this case human-mediated AGI cognitive architectures designed to create models for teaching independent AGI systems. If you would like to help with this study you can sign up here:
(draft) Preliminary Proposal for a Mediated Artificial Super Intelligence Study, Experimental Framework and Definitions for an ICOM Independent Core Observer Model Cognitive Architecture based System
This preliminary study proposal is designed to determine whether there is enough evidence of intelligence in a mediated artificial superintelligence system based on the Independent Core Observer Model, or at least a circumstantial indication of it, in terms of creating a group collective “Supermind” (Malone). The initial thesis of this proposal is that “a Mediated Artificial Super Intelligence or mASI system based on the Independent Core Observer Model (ICOM) cognitive architecture for AGI is or may be conscious and self-aware, may pass the Turing test, and may demonstrate qualia and other subjective measures, so as to demonstrate the possibility of being an Artificial Super Intelligence, well above the human standard.” Our hypothesis is that this preliminary research program will validate and justify continued research to refine and test mediated artificial superintelligence systems. This preliminary study is designed to establish whether there is enough evidence to support this hypothesis, and therefore whether further research is warranted.
To inform decisions about further investment in this line of research, this preliminary study proposal is designed to test whether there is enough evidence of intelligence in an Independent Core Observer Model based mediated artificial superintelligence system to warrant further research or a more robust research program. The initial thesis of this proposal is that “a Mediated Artificial Super Intelligence or mASI (Kelley) system based on the Independent Core Observer Model (ICOM) cognitive architecture for AGI is or may be conscious and self-aware, may pass the Turing test, and may demonstrate qualia and other subjective measures, so as to demonstrate the possibility of being an Artificial Super Intelligence or Supermind (Malone), well above the human standard.” Our hypothesis is that this preliminary research program will validate and justify continued research to refine and test mediated artificial superintelligence systems. If we can demonstrate enough evidence to support this hypothesis then further research may be warranted; this preliminary study is designed to justify that decision, or not.
Initial Research Goals
Besides the stated high-level goal of verifying the hypothesis, there are a number of measures or sub-goals in this study, including the following:
- To determine if we can functionally measure intelligence quotient (IQ) in an mASI system to compare with human subjects in a control group of individual humans. Is there an indication of a difference between the mASI system and the control group?
- If the mASI system can have a functional IQ measurement, we must determine if that measure is above that of a group of humans working on an IQ test together. Is there a measurable differential, or at least an indication that there could be such a differential?
- To determine more subjective measures that are less quantitative, but thought of colloquially as supporting mASI as a functional system when justifying further research. These include running a Turing test on the control group of humans versus an mASI, applying the Yampolskiy method (Yampolskiy) to determine if an mASI system experiences qualia, or at least possibly exhibits evidence of experiencing qualia, and determining if an mASI system can be scored on the subjective Porter Method (Porter) for measuring consciousness.
- If this line of research proves at least worth further investigation, a long-term goal is to create a safe structure for creating independent AGI without the associated risk. An Artificial Super Intelligence framework, even a mediated one, can act as a safe box for keeping an independent AGI in line. While this study doesn’t do this out of the gate, this is the long-term goal should the line of research prove worth additional investment.
The supposition is that these goals will provide the basis for the context needed to determine the value of the hypothesis and to justify, or not, further research along these lines related to mASI ICOM AGI systems.
Elemental Framework – Research Groups
The proposed structure for this preliminary study includes 3 core test groups as defined here, with the first split into two control subgroups, for 4 groups in total. Each group provides some fundamental basis for comparing and contrasting with the other groups. Those groups include:
Group 1.0 “No Group, Control Group, Non-Proctored” – this group should be at least 30 randomly selected humans of various demographics that will not be supervised in their testing.
Group 1.1 “No Group, Control Group, Proctored” – this group should be at least 30 randomly selected humans of various demographics that will be supervised in their testing.
Comparing groups 1.0 and 1.1 should show that the 1.0 group tends to perform better, given the likelihood of unsupervised subjects cheating, and this comparison will validate that tendency in humans.
Group 2 “In Person Group Collective” – this could be done as more than one group, but for the purpose of this preliminary study it should be at least 30 adult humans who are administered a test collectively in a given venue, where group members are able to communicate with each other while executing the tests given to groups 1.0 and 1.1. This should provide a framework for comparing humans in groups with individual humans, where the underlying supposition is that humans can perform better in groups.
Group 3 “Mediated Artificial Intelligence System” or mASI – an instance of an mASI using the ICOM cognitive architecture, consisting of at least 10 contextual generating nodes as well as a standard ICOM context engine, executes the individual tests on its own, proctored as in group 2. The supposition is that this gives us a preliminary comparison with groups 1 and 2, comparing the mASI with humans in groups and with individual humans. This comparison and analysis should provide some evidence to verify the hypothesis, allowing a determination as to whether further research is warranted.
Program Information Security and Policies
It is important to understand that the human subjects’ information, especially identifiable information, is kept secure and separated from results. There will never be any way to associate specific data with individual human subjects. This means that all published data will be scrubbed and only used in a collective way. Demographic data is then used only for high-level comparisons and in the abstract. In practice, all subjects will be given a demographic survey and assigned IDs. Demographic data will not be directly associated with any individual, but with IDs, and that data will be stored only in an air-gapped (“GAP level”) secure system with no internet connection for the scope of the study. All other copies with ID values will be deleted or destroyed, with only the air-gapped secure documents kept in a digital archive. Assigned IDs and contact data will be kept in separate files from the demographic files and only stored in this secure manner to protect the human subjects. This also means that after each survey is collected and its data transferred and split, the demographic survey results will be deleted from the collection service.
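The separation of contact data from demographic data described above can be sketched roughly as follows. This is a minimal illustration, not the study's actual tooling: the field names (`name`, `email`, `age_band`, `education`) and the function `split_survey_records` are hypothetical, and the sketch assumes random IDs generated with Python's standard `secrets` module.

```python
import secrets

# Hypothetical field names; the real survey fields are not given in this proposal.
IDENTIFYING_FIELDS = ("name", "email")
DEMOGRAPHIC_FIELDS = ("age_band", "education")

def split_survey_records(records):
    """Split raw survey rows into two tables that share only a random ID."""
    contact_table = []      # ID plus identifying fields
    demographic_table = []  # ID plus demographic fields, no identifiers
    for record in records:
        # The ID is random, so it cannot be derived from the subject's data.
        subject_id = secrets.token_hex(8)
        contact_row = {"id": subject_id}
        contact_row.update({field: record[field] for field in IDENTIFYING_FIELDS})
        demographic_row = {"id": subject_id}
        demographic_row.update({field: record[field] for field in DEMOGRAPHIC_FIELDS})
        contact_table.append(contact_row)
        demographic_table.append(demographic_row)
    return contact_table, demographic_table

# Made-up example rows, for illustration only.
raw_rows = [
    {"name": "Subject A", "email": "a@example.org", "age_band": "30-39", "education": "BSc"},
    {"name": "Subject B", "email": "b@example.org", "age_band": "20-29", "education": "MSc"},
]
contacts, demographics = split_survey_records(raw_rows)
```

In this sketch the two returned tables would be written to separate files on the offline system, and the raw rows would then be deleted from the collection service.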
Some of the demographic survey questions will not be used in this initial or preliminary study, but there is significant research evidence that the factors they cover affect group and collective intelligence (Woolly), and they would be needed in a wider study so that the results here can be used in an expanded research program.
Tests and Measures
Three types of tests were considered for this study: a demographic analysis of the subjects, which is collected for use in further research but will not be evaluated in this preliminary study, then quantitative tests and subjective tests, as follows:
- Demographic Analysis
These surveys are designed to get a general picture of the demographics of the human subjects in the studies, where the primary purpose is correlation with additional research that may be done after this point. These include the initial survey and may include additional surveys, as might later be defined separately from the initial questions listed above. Such data is kept separate from primary research data as per the secure information policy for this study.
- Quantitative Intelligence Tests
Intelligence Quotient (IQ) tests – these are tests designed to measure ‘intelligence’ in humans (WF). We are using short versions to assess only relative trends or the potential for further study; given the expected sample size, results will not be statistically valid, nor accurate other than at a very general level, which is believed to be enough to determine if the line of research is worth pursuing. Two types of tests will be used in the study: a derivative of the Raven Matrices Test (WF), designed to be culturally agnostic, and the Wechsler Adult Intelligence Scale (WAIS) (WF) test, which is more traditional. Lastly, falling into the WAIS category, there is a baseline full Serebriakoff MENSA test that we will apply to compare and contrast scores between the two baseline tests.
Collective Intelligence (CI) Test – we would like to use this test; however, the information for executing it is not publicly accessible, and reaching out to the researchers who created it has produced no response (Engel).
- Extended Meta Data and Subjective Tests
A number of tests or measures will be collected that are oriented more towards analysis for further study, primarily for correlative purposes. None of these tests may be used other than as possible illustrative examples, since they are not statistically valid given the rigor or subjective nature of these measures. These tests, if considered, would be outside the scope of the initial study.
The Turing Test – this test is not considered quantifiable and there is debate over whether this measure tells us anything of value, however, we will execute this test as a reference value.
The Porter Method – this appears to be a quantitative test, but individual question measures are entirely subjective, and therefore the test is not quantitative enough to be valid without a pool of historical values to measure against; however, we will execute this test as a reference value.
The Yampolskiy Qualia Test – this is a subjective measure of a subjective ‘thing’ and therefore not a quantitative measure; however, we will execute this test as a reference value. In theory, this only tests for the presence of qualia in human-like subjects. Not passing this test does not mean that a subject does not experience qualia in the sense of the paper, just that it was not detected. This means that subjects may show signs of qualia, or not, but the test does not show that they don’t experience it.
Autistic Spectrum Test – This test is a pre-diagnostic test demonstrating the potential of a subject to be on the autistic spectrum and is only an indication that the subject should consider evaluation professionally, and we will use this test only as a subjective measure as a reference value for later correlation or possible research directions with ICOM based systems.
The tests currently being considered are:
Experimental Results Analysis
Given the expected sample size, the main threat to conclusion validity is the sample size itself. The main issue then is to ensure that we don’t fall into common analysis fallacies (Trochim), including not seeing a relationship that is there, seeing a relationship when there is not one, other conclusion errors, other cognitive biases in the analysis, or just issues with the sample size of subjects.
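To make the sample-size concern concrete, the following minimal sketch estimates how precisely a group mean can be pinned down with 30 subjects. It assumes the conventional IQ scaling with a standard deviation of 15 points and a normal approximation; the short test versions planned here may well have a different spread, so the number is illustrative only. The function name `mean_margin_of_error` is ours, not from the study.

```python
import math

def mean_margin_of_error(sd, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a group mean."""
    return z * sd / math.sqrt(n)

# Assumption: SD of 15 IQ points (conventional scaling), 30 subjects per group.
margin = mean_margin_of_error(sd=15.0, n=30)
print(round(margin, 1))  # roughly 5.4 IQ points either side of the observed mean
```

Under these assumptions, differences between group means of only a few points would be hard to distinguish from noise, which is why this study treats its results as an indication rather than a statistically valid finding.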
The primary analysis in this preliminary study will be to see if there is any evidence of a differential between the IQ test results of the 4 test groups. If a clear difference is present, even if the sample is not large enough to be statistically valid, then a positive indication, meaning that the mASI group shows significant evidence of being more intelligent than the other samples, together with the degree to which that is true, would support the hypothesis. The only real conclusion from the intended sample sizes would be whether or not to proceed with further research. To that end, other tests or analyses would not be quantitative, but subjective, and while interesting would not in themselves support the primary research objective. If the results show no evidence of an mASI system being more intelligent than groups 1.0, 1.1, and 2, then the mASI program will likely be shut down or fundamentally changed.
Further research under the ICOM AGI program that this preliminary study is associated with would include a much more detailed study if results come out in the expected ranges. Further research would then also include bias filtering in mASI systems and studies involving a group 5 pre-trained asset in the mASI execution. Lastly, in any study that builds on these preliminary results, a fully proctored IQ test would be run against the study subject groups. Hypotheses that would be considered in particular include that trained ASD subjects used in mASI contextual agents would produce a greater level of cognitive function.
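One simple way to express the differential discussed above is sketched below. It assumes the mASI produces a single score per test (it is one instance) and compares that score with the mean and spread of each human group. The function `differential` and all scores shown are hypothetical placeholders for illustration, not study data or results.

```python
from statistics import mean, stdev

def differential(masi_score, group_scores):
    """Return (raw gap, gap in units of the group's standard deviation)."""
    gap = masi_score - mean(group_scores)
    return gap, gap / stdev(group_scores)

# Placeholder scores only; real groups would have at least 30 subjects each.
placeholder_groups = {
    "1.0": [98, 104, 110, 95, 101],
    "1.1": [96, 100, 107, 93, 99],
}
placeholder_masi_score = 120

for group_name, scores in placeholder_groups.items():
    gap, standardized_gap = differential(placeholder_masi_score, scores)
    print(f"group {group_name}: gap={gap:.1f}, standardized gap={standardized_gap:.2f}")
```

A standardized gap is easier to compare across tests with different scales than a raw point difference, but with samples this small it remains only an indication of whether to proceed.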
Study Framework Conclusions
Conclusions based on the process of producing this preliminary study framework include a couple of points on how the study will be executed, including the likelihood of a low bar of around 100+ subjects, with groups of at least 30 people in each venue of any further studies. Although the sample size is small, it should still be large enough to determine if it is worth pursuing a deeper or more rigorous program with a particular focus on mASI mental performance, as well as studies on filtering for cognitive bias, conditioning, and training. It would also open the door to an AGI safety structure using mASI as part of the containment, given that mASI is based on an AGI cognitive architecture.
Engel, D.; Woolley, A.; Chabris, C.; Takahashi, M.; Aggarwal, I.; Nemoto, K.; Kaiser, C.; Kim, Y.; Malone, T.; “Collective Intelligence in Computer-Mediated Collaboration Emerges in Different Contexts and Cultures;” Bridging Communications; CHI 2015; Seoul Korea
Kelley, D.; “Architectural Overview of a ‘Mediated’ Artificial Super Intelligent Systems based on the Independent Core Observer Model Cognitive Architecture”; Informatica [pending review]
Kelley, D.; Waser, M.; “Human-like Emotional Responses in a Simplified Independent Core Observer Model System;” Procedia Computer Science; Elsevier; BICA 2018; PCS 123(2018) 221-227
Kelley, D.; “The Independent Core Observer Model Computational Theory of Consciousness and the Mathematical Model for Subjective Experience;” ICNISC 2018; ISBN-13: 978-1-5386-6956-3
Malone, T; “Superminds – The Surprising Power of People and Computers Thinking Together”; Little, Brown and Company; 2018; ISBN-13: 9780316349130
Porter, H.; “A Methodology for the Assessment of AI Consciousness;” AGI 2016; 2016; Portland State University
Serebriakoff, V; “Self-Scoring IQ Tests;” Sterling/London; 1968, 1988, 1996; ISBN 978-0-7607-0164-5
Trochim, W.; “Threats to Conclusion Validity;” OCT 2018; http://www.socialresearchmethods.net/kb/concthre.php
Wikimedia Foundation (WF); “Raven’s Progressive Matrices;” Oct 2018; https://en.wikipedia.org/wiki/Raven%27s_Progressive_Matrices
Wikimedia Foundation (WF); “Wechsler Adult Intelligence Scale;” Oct 2018; https://en.wikipedia.org/wiki/Wechsler_Adult_Intelligence_Scale
Wikimedia Foundation (WF); “Intelligence Quotient;” Oct 2018; https://en.wikipedia.org/wiki/Intelligence_quotient
Woolly, A.; “Collective Intelligence In Scientific Teams;” May 2018
Yampolskiy, R.; “Artificial Intelligence Safety and Security;” CRC Press, London/New York; 2019; ISBN: 978-0-8153-6982-0
Yampolskiy, R.; “Detecting Qualia in Natural and Artificial Agents;” University of Louisville, 2018