One of the major shortcomings of Smalltalk that people often mention is its lack of good interoperability with the operating system and external libraries.
I know that Smalltalk was originally designed as a fully functional environment for personal computers, with very little (if any) need for low-level support from what we today call an “operating system”, or simply the “environment”. Modern examples such as SqueakNOS demonstrate very clearly that a Smalltalk image can run without any OS and still be quite functional and even useful.
But here I do not want to analyze what prevented the Smalltalk environment from reaching the masses (if it had been successful, there would be no need to talk about interoperability with the non-Smalltalk world in the first place). What I want to say is that it is a mistake today to keep sticking to a paradigm where things happen only inside the Smalltalk environment and nothing exists beyond it.
If a Smalltalk environment runs inside another environment (such as an OS), it is a mistake not to have decent interoperability with it.
So, what do we have today? The Squeak and Cog VMs are more or less the same in this regard (sure, Cog is more advanced 😉).
We have an FFI. Okay, maybe it is not as good or complete as the analogous implementations in other similar languages, but in my opinion it is quite decent, and we are not standing still here. So on this front, I think, we are doing fine.
But FFI is just one side of the coin: it allows applications written in Smalltalk to talk to external libraries, effectively supporting a model where the Smalltalk environment is the host, while external modules are servants (i.e. embedded).
But if we flip the coin over, we can see that we have literally nothing to support running Smalltalk as an embedded language in a host application. And I see very little movement (and attention) towards changing this situation.
Personally, I think that Smalltalk as an embedded language could be the best thing I could dream of.
So, if we would like to do it, where should we start?
First is the VM, of course. We should turn the VM from a self-sustaining, all-knowing OS-level process into a library, so that host applications can link with it either statically or dynamically.
Second, we need an API that allows the host application to communicate with the VM and to control the execution of Smalltalk code, as well as the object memory and the available VM capabilities.
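To make this a bit more concrete, here is a minimal sketch in C of what the surface of such an API might look like. Everything in it is hypothetical: the names `EmbVM`, `emb_vm_new`, `emb_vm_send` and `emb_vm_free` do not exist in Squeak or Cog, and the "interpreter" is a toy stand-in that only understands `+` and `*` on integers. The point is only the shape: the host creates a VM with a memory budget it chooses, sends a message, gets an answer or an error code back, and disposes of the VM.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical embedding API. None of these names exist in Squeak or Cog. */
typedef enum {
    EMB_OK = 0,
    EMB_ERR_BAD_ARGUMENT,
    EMB_ERR_UNKNOWN_SELECTOR
} EmbStatus;

/* The host talks to the VM only through this handle. */
typedef struct EmbVM {
    size_t heap_limit;   /* object memory budget chosen by the host, in bytes */
    unsigned long sends; /* number of messages the host has sent so far */
} EmbVM;

/* Create a VM instance; returns NULL if allocation fails. */
EmbVM *emb_vm_new(size_t heap_limit) {
    EmbVM *vm = malloc(sizeof *vm);
    if (vm == NULL) return NULL;
    vm->heap_limit = heap_limit;
    vm->sends = 0;
    return vm;
}

/* Release a VM instance created by emb_vm_new. */
void emb_vm_free(EmbVM *vm) {
    free(vm);
}

/* Send a binary message to an integer receiver and wait for the answer.
   A real VM would look up and run a method here; this stand-in only
   understands "+" and "*". */
EmbStatus emb_vm_send(EmbVM *vm, long receiver, const char *selector,
                      long arg, long *result) {
    if (vm == NULL || selector == NULL || result == NULL)
        return EMB_ERR_BAD_ARGUMENT;
    vm->sends++;
    if (strcmp(selector, "+") == 0) { *result = receiver + arg; return EMB_OK; }
    if (strcmp(selector, "*") == 0) { *result = receiver * arg; return EMB_OK; }
    return EMB_ERR_UNKNOWN_SELECTOR;
}
```

A host would call `emb_vm_send(vm, 3, "+", 4, &r)` and check the returned status before using `r`. Returning an error code rather than aborting the process is a deliberate choice in this sketch: an embedded VM should never take the host down with it.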
There are many things hardcoded in the VM that assume it is running as a standalone OS process. One example is FilePlugin, a plugin that works with files: despite having “plugin” in its name, it is not optional. When a host application uses Smalltalk as an embedded language, it is easy to imagine that in many situations the Smalltalk part may not require direct file manipulation at all, since the host application can take care of providing data to the Smalltalk side in its own, often more efficient, ways.
So we should make sure that the VM does not assume it runs as a full-fledged environment, because when embedded, this is not the VM’s concern at all but up to the host application.
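One way to picture this, purely as an illustration, is a table of capabilities that the host hands to the VM. The sketch below is in C and all of its names (`HostCapabilities`, `prim_file_read`, `host_read_from_memory`) are made up for this example; it is not how FilePlugin works today. If the host leaves `read_file` as NULL, the file primitive simply fails, and if the host provides a callback, it decides where the bytes really come from.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical capability table supplied by the host application. */
typedef struct {
    /* Returns the number of bytes read, or -1 on failure.
       NULL means the host grants no file access at all. */
    long (*read_file)(const char *path, char *buf, size_t cap);
} HostCapabilities;

typedef enum { PRIM_OK, PRIM_FAILED } PrimStatus;

/* A file-reading primitive that never touches the OS itself:
   it only forwards the request to whatever the host provided. */
PrimStatus prim_file_read(const HostCapabilities *host, const char *path,
                          char *buf, size_t cap, long *nread) {
    if (host == NULL || host->read_file == NULL)
        return PRIM_FAILED;            /* capability not available */
    long n = host->read_file(path, buf, cap);
    if (n < 0)
        return PRIM_FAILED;            /* host reported an error */
    *nread = n;
    return PRIM_OK;
}

/* Example host callback: serves data from memory, ignoring the path. */
long host_read_from_memory(const char *path, char *buf, size_t cap) {
    static const char data[] = "hello";
    size_t n = sizeof data - 1;        /* length without the trailing NUL */
    (void)path;
    if (n > cap) n = cap;
    memcpy(buf, data, n);
    return (long)n;
}
```

On the Smalltalk side a failed primitive is nothing unusual, so an image running inside a host that grants no file access would see an ordinary primitive failure instead of a VM that cannot start.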
All of the above is relatively easy to fix, because it is more or less about the same thing: we should make the VM code more modular and ensure that the VM can be used as a library with a well-defined API (like Lua does).
There is only one thing that worries me, and whose solution might be quite tricky: scheduling. Again, since Smalltalk was envisioned as a self-sustaining environment, it has to deal (by itself) with multiple processes in order to run many things in parallel, and therefore the VM has to support scheduling.
But again, this is not absolutely necessary when Smalltalk is used as an embedded language. I have shown that scheduling can be moved completely to the language side, so that the VM knows very little about things like Process and Semaphore. It will still know about things like signals and contexts, but much less. So I think this step is necessary if we want to get to the point where we can use Smalltalk as an embedded language.
For example, imagine that the host application has sent a message (through the VM API, of course) and is waiting for an answer. Because of scheduling, the context evaluating that message can be interrupted, switched out for another one, and then even killed or lost, resulting in a situation where the host application never regains control.
VM-level scheduling makes it very difficult to have a simple call scheme where the host application “calls” the VM, the VM runs some piece of Smalltalk code, returns the result(s), and completely stops all activity until the next call. Even the interpreter itself is implemented as an infinite loop: once entered, it is never left.
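To illustrate the kind of call scheme I mean, here is a toy interpreter loop in C that always hands control back to the host. It is only a sketch under my own assumptions: the three-instruction bytecode set, the `Activation` struct and `run_until_return` are invented for this example and have nothing to do with the real Squeak bytecodes. The loop ends either when the code returns a value or when the step budget given by the host runs out, so the host can never be locked out.

```c
#include <stddef.h>

/* Toy bytecode set, invented for this sketch. */
typedef enum { OP_PUSH, OP_ADD, OP_RETURN } Op;
typedef struct { Op op; long operand; } Instr;

typedef enum { RUN_RETURNED, RUN_OUT_OF_BUDGET, RUN_ERROR } RunStatus;

#define STACK_SLOTS 16

/* State of one activation; it survives between calls into the VM. */
typedef struct {
    const Instr *code;
    size_t len;               /* number of instructions in code */
    size_t pc;                /* index of the next instruction */
    long stack[STACK_SLOTS];
    size_t sp;                /* number of values on the stack */
} Activation;

/* Run at most `budget` instructions, then return to the caller.
   RUN_OUT_OF_BUDGET means "call me again to continue". */
RunStatus run_until_return(Activation *a, size_t budget, long *result) {
    while (budget-- > 0) {
        if (a->pc >= a->len) return RUN_ERROR;   /* ran off the end */
        Instr in = a->code[a->pc++];
        switch (in.op) {
        case OP_PUSH:
            if (a->sp >= STACK_SLOTS) return RUN_ERROR;
            a->stack[a->sp++] = in.operand;
            break;
        case OP_ADD:
            if (a->sp < 2) return RUN_ERROR;
            a->sp--;
            a->stack[a->sp - 1] += a->stack[a->sp];
            break;
        case OP_RETURN:
            if (a->sp < 1) return RUN_ERROR;
            *result = a->stack[--a->sp];
            return RUN_RETURNED;                 /* answer goes to the host */
        default:
            return RUN_ERROR;
        }
    }
    return RUN_OUT_OF_BUDGET;
}
```

With a budget, the host decides how long a single call may take, and it can resume the same activation later by calling again. Whether a step budget is the right mechanism for a real VM is an open question; the sketch only shows that "enter, run, return" is a very different shape from "enter and never leave".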
Another aspect of the same problem is that we don’t have a good abstraction around VM and interpreter state. If I wanted to use multiple interpreters with different object memories running in parallel, I could not do that, because current VMs are too centered around the idea that they control everything and nothing happens outside. I demonstrated with HydraVM that it is relatively easy to make a VM maintain multiple interpreter states and run two (or more) object memories. But the idea needs further development and more attention.
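The core of the idea can be shown in a few lines of C. This is not HydraVM code, just a hypothetical illustration with made-up names (`Interp`, `interp_init`, `interp_store`, `interp_destroy`): all interpreter state lives in a struct that is passed explicitly to every function, instead of in global variables, so any number of interpreters with separate object memories can coexist in one process.

```c
#include <stdlib.h>

/* Hypothetical per-interpreter state; no global variables are used. */
typedef struct {
    long  *heap;      /* private object memory, as an array of slots */
    size_t capacity;  /* total number of slots */
    size_t used;      /* slots allocated so far */
} Interp;

/* Returns 0 on success, -1 if the object memory could not be allocated. */
int interp_init(Interp *in, size_t capacity) {
    in->heap = calloc(capacity, sizeof *in->heap);
    in->used = 0;
    in->capacity = (in->heap != NULL) ? capacity : 0;
    return (in->heap != NULL) ? 0 : -1;
}

/* Stores value in a fresh slot of this interpreter's memory.
   Returns the slot index, or -1 when the memory is full. */
long interp_store(Interp *in, long value) {
    if (in->used >= in->capacity) return -1;
    in->heap[in->used] = value;
    return (long)in->used++;
}

/* Releases the object memory and resets the state. */
void interp_destroy(Interp *in) {
    free(in->heap);
    in->heap = NULL;
    in->capacity = 0;
    in->used = 0;
}
```

Filling up the memory of one `Interp` has no effect on another one, which is exactly the property that is missing when the interpreter keeps its state in globals.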
I am not mentioning language-side changes, because obviously, when we speak about embedding, in most cases we will run quite specialized image(s), far, far less feature-bloated than the images we use today. Yes, we need decent tool support for generating such small images, either by bootstrapping them or by shrinking existing images, but for me the VM is the more important and the harder part of the story.
I’d like to know what others think about this. Do you think we need to be able to use Smalltalk as an embedded language at all? Or can we live with just FFI?