One problem with my little system for taking English and turning its ideas into some sort of neutral storage format is that English loves to modify words so that they fit different parts of a sentence. We can see that Chinese does that too, as does Cuneiform. The difference is that Chinese and Cuneiform are a little more obvious about how they do it.
Let's take the sentence "I am going to the store." Now let us also place that sentence in the past and in the future, and change the actor from "I" to "you."
|tense||actor||English||Chinese|
|present||I||I am going to the store||我要去商店|
|past||I||I went to the store||我去了商店|
|future||I||I will go to the store||我要去商店|
|present||you||You are going to the store||你要去商店|
|past||you||You went to the store||你去了商店|
|future||you||You will go to the store||你要去商店|
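To make the table concrete, here is a minimal sketch of storing the thought once and rendering it per language. The `Thought` class, the lookup tables, and the function names are all made up for illustration; the English and Chinese strings are copied straight from the table above, and nothing here handles sentences beyond those six.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thought:
    """Hypothetical neutral record: who is going to the store, and when."""
    actor: str  # "I" or "you"
    tense: str  # "present", "past", or "future"


# English changes the verb phrase with both tense and actor.
ENGLISH_GO = {
    ("present", "I"): "am going",
    ("present", "you"): "are going",
    ("past", "I"): "went",
    ("past", "you"): "went",
    ("future", "I"): "will go",
    ("future", "you"): "will go",
}

# The Chinese column of the table only varies with tense, never with actor.
CHINESE_ACTOR = {"I": "我", "you": "你"}
CHINESE_GO = {"present": "要去", "past": "去了", "future": "要去"}


def to_english(thought: Thought) -> str:
    subject = "I" if thought.actor == "I" else "You"
    return f"{subject} {ENGLISH_GO[(thought.tense, thought.actor)]} to the store"


def to_chinese(thought: Thought) -> str:
    return f"{CHINESE_ACTOR[thought.actor]}{CHINESE_GO[thought.tense]}商店"


if __name__ == "__main__":
    for tense in ("present", "past", "future"):
        for actor in ("I", "you"):
            t = Thought(actor, tense)
            print(tense, actor, to_english(t), to_chinese(t), sep=" | ")
```

Notice that the English table needs six entries keyed by tense and actor, while the Chinese side gets by with one small table per dimension. That is the "more obvious" part in miniature.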
mouse wearing a cape - MOUSE WEAR A CAPE
mouse that wore a cape - MOUSE THAT WORE A CAPE
the anticipation is killing me - THE ANTICIP I KILL ME
the anticipation killed me - THE ANTICIP KILL ME
nobody anticipated my being killed - NOBODY ANTICIP MY BE KILL
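The text does not say how the reduced forms above were produced. As an assumption, a crude suffix stripper like the sketch below happens to reproduce all five of them; the suffix list and function names are invented for illustration and this is not a real stemming algorithm.

```python
# Hypothetical suffix list, checked in order; chosen only to match the examples.
SUFFIXES = ("ation", "ated", "ing", "ed", "s")


def crude_stem(word: str) -> str:
    """Strip the first matching suffix, if the word is longer than the suffix."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)]
    return word


def reduce_sentence(sentence: str) -> str:
    """Stem every word and return the result in upper case."""
    return " ".join(crude_stem(w) for w in sentence.lower().split()).upper()


if __name__ == "__main__":
    for s in (
        "mouse wearing a cape",
        "mouse that wore a cape",
        "the anticipation is killing me",
        "the anticipation killed me",
        "nobody anticipated my being killed",
    ):
        print(s, "-", reduce_sentence(s))
```

A stripper like this throws away tense ("killing" and "killed" both end up as KILL), mangles short words ("is" becomes I), and lets irregular forms such as "wore" pass through untouched.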
The computer has to know what the complete thought actually is to be able to formulate that thought into proper English, and possibly into any other spoken or written language. I'm not looking for perfect prose. If the text sounds like it's coming from a caveman, or an individual with a very thick accent, that could actually work for our purposes. Even better would be to discover that the algorithm could be tuned to compose everything from Country Bumpkin to William Shakespeare.
Virtually every English sentence needs a subject and a predicate. In Chinese, each sentence has a subject, a verb, and an object. And then things get a bit murky.
Both languages have their own rules that allow us to implement what news folks like to call the 5 Ws: Who, What, When, Where, Why... and How. (But they still call it the 5 Ws.) Not every sentence has all of these elements, but a complete story will have a sentence delving into each. English and Chinese just have different ways of going about that process.
Long story short, I'm a software engineer. I'm not a linguist. So I'm going to throw away the technical terms that linguists use to describe the process of communication, and make up terms that better describe what this project is trying to do: namely, store thoughts in a form that can be expressed at a later time, and store those thoughts in a manner so simple and consistent that a computer can convert those ideas back into sentences.
What I've been finding is that many of the existing linguistic and grammatical terms that describe language bake in a lot of concepts that don't cross from one language to another, because each language organizes ideas in a different way, one that makes sense for its particular way of doing things.
In a way this project is actually a bit like chemistry. You have compounds that break into atoms. Then you find that atoms themselves are made up of subatomic particles, and that those subatomic particles are themselves made of quarks. The sentence structure you are used to dealing with from grammar school is on the atom level, with a confusing soup of subatomic particles. The computer is going to need to work on the quark level, with a different set of rules to make English atoms vs. Chinese atoms.
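As a rough sketch of the 5 Ws idea, each stored idea could carry six optional slots, and a story could be checked for which slots no sentence has filled yet. The `Idea` class and both helper functions are hypothetical names, not part of any existing design.

```python
from dataclasses import dataclass, fields
from typing import List, Optional, Set


@dataclass
class Idea:
    """Hypothetical record with one optional slot per W (plus How)."""
    who: Optional[str] = None
    what: Optional[str] = None
    when: Optional[str] = None
    where: Optional[str] = None
    why: Optional[str] = None
    how: Optional[str] = None


def filled_slots(idea: Idea) -> Set[str]:
    """Names of the slots this single idea actually fills."""
    return {f.name for f in fields(idea) if getattr(idea, f.name) is not None}


def missing_from_story(story: List[Idea]) -> Set[str]:
    """Slots that no idea in the story has filled."""
    covered: Set[str] = set()
    for idea in story:
        covered |= filled_slots(idea)
    return {f.name for f in fields(Idea)} - covered


if __name__ == "__main__":
    story = [
        Idea(who="I", what="go", where="store"),
        Idea(when="yesterday"),
    ]
    print(sorted(missing_from_story(story)))
```

Each sentence is allowed to leave slots empty; only the story as a whole is expected to cover all six.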
Ok, maybe chemistry isn't the best analogy for people. I promise there won't be equations.
For my next article, I will be taking different sentence patterns and figuring out what sort of arrangement of blocks would be needed to accommodate all of those patterns without resorting to reading into the context. Computers can't do that. For our game purposes, the computer agent will have databases full of these ideas, and will use a grammar synthesizer to form those ideas into sentences. The system will also have a way to take a player's commands and turn those commands into one of these data structures. It will be clearer when I get some more examples together.
Basically I'm trying to get a cluster of ideas that will snap together like Lego, with different bricks connecting in designated places to form a structure. As I said... more on this as I work out specifics.
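Until those examples exist, here is a toy sketch of the command side only, assuming commands are verb-first and that a fixed word list is enough to recognize them. The `Command` record, the `KNOWN_ACTIONS` and `FILLER` sets, and `parse_command` are all placeholders invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabulary; a real game would load this from its databases.
KNOWN_ACTIONS = {"go", "take", "open"}
FILLER = {"to", "the", "a", "an"}


@dataclass
class Command:
    """Hypothetical data structure for one player command."""
    actor: str
    action: str
    target: Optional[str]


def parse_command(text: str, actor: str = "player") -> Optional[Command]:
    """Turn a verb-first command into a Command, or None if not understood."""
    words = [w for w in text.lower().split() if w not in FILLER]
    if not words or words[0] not in KNOWN_ACTIONS:
        return None
    target = " ".join(words[1:]) or None
    return Command(actor=actor, action=words[0], target=target)


if __name__ == "__main__":
    print(parse_command("Go to the store"))
    print(parse_command("dance"))
```

The parser never guesses from context: anything outside its word lists comes back as `None` instead of a half-understood command.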