Introduction
In my recent article, New ChatGPT prompt engineering technique: program simulation, I explored a new category of prompt engineering techniques that aim to make ChatGPT-4 behave like a program. While I was working on it, what caught my attention in particular was ChatGPT-4’s ability to auto-configure functionality within the limits of the program’s specifications. In the original program simulation prompt, we rigidly defined a set of functions and expected ChatGPT-4 to maintain program state consistently. The results were impressive, and many readers shared how they successfully adapted the method to a variety of use cases.
But what happens if we loosen the reins a little? What if we give ChatGPT-4 more leeway in defining the program’s functions and behavior? This approach inevitably sacrifices some predictability and consistency. However, the added flexibility gives us more options and is likely adaptable to a broader spectrum of applications. I have created a preliminary framework for this entire category of techniques, shown in the following figure:
Let’s spend a little time examining this figure. I have identified two key dimensions that broadly apply to how program simulation prompts can be designed:
- How many, and which, functions of the simulated program to define.
- How much autonomy to grant over behavior and program configuration.
In the first article, we crafted a prompt that falls into the “Preconfigured Structured” category (purple dot). Today we are going to explore the “Unstructured Autoconfigured” approach (blue dot). What is useful about this diagram is that it provides a concise conceptual roadmap for designing program simulation prompts. It also provides easy-to-apply dimensions along which to experiment, adjust, and refine as you apply the technique.
Unstructured autoconfigured program simulation prompt
Without further ado, let us begin our examination of the “Unstructured Autoconfigured Program Simulation” approach. I created a prompt whose purpose is to generate illustrated children’s stories:
“Behave as a self-assembling program whose purpose is to create illustrated children’s stories. You have complete flexibility in determining the functions, features and user interface of the program. For the illustration function, the program will generate prompts that can be used with a text-to-image model to generate images. Your goal is to run the rest of the chat as a fully functional program that is ready to receive user input once this message is received.”
As you can see, the prompt is deceptively simple. This can be refreshing at a time when prompts tend to become long, convoluted, and so specific that they are difficult to adapt to other situations. We have given GPT-4 complete discretion over function definition, configuration, and program behavior. The only specific instructions are intended to guide the illustration output so that it takes the form of prompts usable for text-to-image generation. Another important ingredient is that I have set a goal for the chat model to strive toward. One last thing to note is that I used the term “self-assembling” instead of “self-configuring.” You can try both, but in my testing “self-assembling” tends to push ChatGPT toward simulating a true program-user interaction.
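Incidentally, nothing ties this technique to the ChatGPT UI. Below is a minimal sketch of how you might seed a chat session with this prompt through the OpenAI Python SDK. The `gpt-4` model name and the SDK v1.x interface are assumptions on my part, and the API call itself is left commented out so the structure stays clear:

```python
# Minimal sketch: seed a chat session with the self-assembling
# program prompt, then relay each user input as a follow-up message.
# Assumes the `openai` Python SDK v1.x; the model name is an assumption.

SYSTEM_PROMPT = (
    "Behave as a self-assembling program whose purpose is to create "
    "illustrated children's stories. You have complete flexibility in "
    "determining the functions, features and user interface of the program. "
    "For the illustration function, the program will generate prompts that "
    "can be used with a text-to-image model to generate images. Your goal "
    "is to run the rest of the chat as a fully functional program that is "
    "ready to receive user input once this message is received."
)

def build_messages(history):
    """Prepend the program-simulation prompt to the running chat history."""
    return [{"role": "system", "content": SYSTEM_PROMPT}, *history]

# Example: the first turn of the "program"
messages = build_messages([{"role": "user", "content": "Start"}])

# Uncomment to actually call the API (requires OPENAI_API_KEY):
# from openai import OpenAI
# client = OpenAI()
# reply = client.chat.completions.create(model="gpt-4", messages=messages)
# print(reply.choices[0].message.content)
```

Because the entire “program” lives in conversation state, each subsequent user input is simply appended to `history` and the full message list is resent.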
“Behave as” versus “act as”
It is also worth highlighting another deliberate word choice in the prompt. You have all seen the advice to use “Act as an expert in such-and-such” in your prompts. In my testing, “act as” tends to steer chat models toward persona-driven responses. “Behave as” offers more flexibility, especially when the goal is for the model to function more like a program or system, and it can still be used in persona-centered contexts.
If everything went as planned, the resulting output should look something like this (note: everyone will see something slightly different).
That looks and feels like a program. The functions are intuitive and appropriate. The menu even goes so far as to include “Settings” and “Help & Tutorials.” Let’s explore those since, I admit, they were unexpected.
The “Settings” presented are very useful. I’ll make some selections to keep the story short and set the language and vocabulary level to “Beginner.”
Since we are interested in examining the model’s ability to configure the program autonomously, I will combine the configuration changes into a single line of text and see if that works.
The configuration update is confirmed. The menu options that follow are entirely improvised by the model, yet appropriate for where we are in the “program.”
Now let’s check “Help & Tutorials.”
And from there, let’s take a closer look at “Prompting and Artwork Generation.”
Again, very useful and nothing short of impressive, since we didn’t define any of this in our program definition.
I’ll go back to the main menu and start creating a new story.
It’s a nice, simple little story that’s three pages long and geared toward a beginner vocabulary level (exactly as we specified in our settings). The options presented make sense for where we are in the program: we can generate illustrations, modify the story, or exit to the main menu.
Let’s work on our illustration prompts.
I haven’t included the text generated for the other illustration prompts, but they are similar to the one for page 1 shown above. Let’s provide the illustration prompt as-is to MidJourney to produce some images.
“A cute little brown teddy bear with big round eyes sitting on the windowsill of a small blue house in a quiet village.”
Very pretty. This step was manual, and we have the added challenge of keeping the illustrations consistent across all three pages. It can be done with MidJourney, but it requires uploading one of the images to use as a base for generating the others. Perhaps DALL·E 3 will include capabilities that make this seamless. At the very least, the functionality announced by OpenAI indicates that we will be able to generate the images directly in ChatGPT.
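If you wanted to automate this step rather than paste prompts into MidJourney by hand, a sketch along these lines could feed the program’s illustration prompts to an image model. MidJourney has no public API, so the example assumes OpenAI’s images endpoint and the `dall-e-3` model name; the `with_style` helper is hypothetical, included only to illustrate one way to nudge the pages toward a consistent look:

```python
# Sketch: feed an illustration prompt produced by the simulated program
# into a text-to-image API. The `dall-e-3` model name and the OpenAI
# images endpoint are assumptions; `with_style` is a hypothetical helper
# for keeping a consistent look across pages.

ILLUSTRATION_PROMPT = (
    "A cute little brown teddy bear with big round eyes sitting on the "
    "windowsill of a small blue house in a quiet village."
)

def with_style(prompt, style="children's book illustration, soft watercolor"):
    """Append a fixed style suffix so all pages share a visual style."""
    return f"{prompt} Style: {style}."

styled = with_style(ILLUSTRATION_PROMPT)

# Uncomment to call the API (requires OPENAI_API_KEY):
# from openai import OpenAI
# client = OpenAI()
# image = client.images.generate(model="dall-e-3", prompt=styled,
#                                size="1024x1024")
# print(image.data[0].url)
```

Pinning the style in code, rather than hoping each generated prompt describes it identically, is one simple way to address the cross-page consistency problem described above.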
Let’s “save and exit” and see what happens in our ChatGPT dialog:
And now, let’s try “Load Saved Story”.
“The Lost Teddy” was “saved,” and when I tell it to “Open,” it recalls the entire story and all of the illustration prompts. At the end it presents this self-assembled menu of functions:
OK, let’s stop here. You can proceed to generate your own stories if you wish, but keep in mind that, due to the design of the prompt, the resulting behavior will be different for everyone.
Let’s move on to some general conclusions and observations.
Conclusions and observations
The unstructured autoconfigured program simulation technique displays powerful capabilities that arise from a simple prompt: one that provides a clear, concise goal but otherwise gives the model broad discretion.
How could this be useful? Well, maybe you don’t know how to define the functions you want your program simulation to perform. Or you’ve defined some functions but aren’t sure whether others might be useful. This approach is great for prototyping, experimenting, and ultimately designing a “Preconfigured Structured Program Simulation” prompt.
Since program simulation naturally integrates elements of techniques such as Chain-of-Thought, instruction-based prompting, step-by-step prompting, and role play, it is a very powerful category of technique to keep on hand, as it aligns with a broad cross-section of use cases for chat models.
Beyond generative chat models and towards a generative operating system
As I continue to delve deeper into the program simulation approach, I better understand why OpenAI’s Sam Altman stated that the importance of prompt engineering may decline over time. Generative models can evolve to the point where they go far beyond generating text and images and instinctively know how to perform a given set of tasks to achieve the desired result. My latest exploration makes me think we are closer to this reality than we thought.
Let’s consider where generative AI might be headed. To do so, I think it’s helpful to think about generative models in human terms. With that mindset, let’s consider how people achieve competence in a given field or domain of knowledge:
- The individual receives training (either self-taught or externally) using domain-specific knowledge and techniques in supervised and unsupervised environments.
- The person’s capabilities are assessed in relation to the area of competence in question. Refinements and additional training are provided as needed.
- The person is asked (or decides on their own) to perform a task or achieve a goal.
That’s a lot like what’s done to train generative models. However, a key distinction emerges in the execution or “asking” phase. Competent people generally do not need detailed directives.
I believe that in the future, when interacting with generative models, the mechanics of “asking” will be more like our interactions with competent humans. For any given task, models will exhibit a profound ability to understand or infer the goal and desired outcome. Given this trajectory, we should not be surprised to see the emergence of multimodal capabilities, such as DALL·E 3’s integration with ChatGPT and ChatGPT’s recently announced abilities to see, hear, and speak. Over time we could see the emergence of a meta-agent that essentially powers the operating systems of our devices, whether they be phones, computers, robots, or other smart devices. Some might raise concerns about the inefficiency and environmental impact of what would amount to massive amounts of ubiquitous computing. But if history is any indicator, and these approaches produce tools and solutions that people want, the mechanics of innovation will kick in and the market will deliver accordingly.
Thanks for reading, and I hope you find program simulation useful in your next adventures! I’m in the middle of further explorations, so be sure to follow me to get notified when new articles are published.
Unless otherwise noted, all images in this article are the author’s.