I have a weak understanding here. Chapter 7 of BoI is related to this discussion.
What are the limits of machine learning? My understanding is that all machine learning methods currently in use try to create knowledge by induction. That goes against one of the core ideas in BoI, which is that one cannot create knowledge using induction. But we know of programs that have created knowledge using induction, like AlphaZero, which created knowledge of how to play chess.
I can think of some possible lines of reasoning to find the errors in my reasoning here but I cannot progress further. Possible lines are:
The knowledge in the program was put there by the programmers so the knowledge was indeed created by creative processes.
There are indeed some methods to create knowledge by non-creative processes which is how knowledge in that chess playing program got there. The process by which knowledge was created by that program as described by the program writers might be sounding like induction to me but is actually not.
Creativity isn’t necessarily required in the process of knowledge creation. Knowledge is created by evolution. Evolution requires a population of replicators subjected to variation and a selection process. When knowledge is created by biological evolution the variance comes from mutation. When knowledge is created by idea evolution the variance comes from creativity.
Disclaimer: I have never heard of AlphaZero or even Monte Carlo tree search before your post.
AlphaZero works by undergoing a training period where it does “reinforcement learning” from games of self-play. It learns from a “tabula rasa” in the sense that it starts with only the rules of the game of chess. It is not provided with any “domain-specific human knowledge or data”. The initial parameters of the neural network are randomized. These parameters are updated throughout the training period.
Here’s a wikipedia explanation of MCTS:
“The application of Monte Carlo tree search in games is based on many playouts, also called roll-outs. In each playout, the game is played out to the very end by selecting moves at random. The final game result of each playout is then used to weight the nodes in the game tree so that better nodes are more likely to be chosen in future playouts.”
My understanding is that AZ doesn’t just select moves at random. Moves are selected in proportion to the root visit count which is stored in the parameters of the neural network. (this part I need to do more research on. I have a friend who works with neural networks I can ask, I can update after I talk with him)
In this case I am thinking of “good” or “winning” chess moves as the replicators. “Good” moves are moves which were identified as having a high probability of winning (or at least drawing) during the training process. These moves are stored in the neural network and reused during gameplay.
There is also variation happening during the training period. For a given game state s, random moves (not entirely random since they are based on the visit counts of the root state) are simulated until an end state is reached (win, loss, or draw). A score is then given to the move based on the result (+1 for win, 0 for draw, -1 for loss). This is the selection process that allows AZ to “learn” good moves from a distribution of random moves.
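To make the playout-and-score idea concrete, here is a minimal sketch of flat Monte Carlo move evaluation. It is not AlphaZero's algorithm and it skips the tree-weighting step from the Wikipedia description: it only plays random games after each candidate move and averages the +1/-1 results. The game is a stand-in chosen for brevity (a pile of stones, each player takes 1 to 3, whoever takes the last stone wins, so there are no draws), and all function names are made up for the illustration.

```python
import random

MAX_TAKE = 3  # assumed toy rule: a player may take 1, 2 or 3 stones per turn


def legal_moves(stones):
    """All legal takes from a pile with `stones` stones left."""
    return list(range(1, min(MAX_TAKE, stones) + 1))


def random_playout(stones, rng):
    """Play uniformly random moves until the pile is empty.

    Returns +1 if the player who moves first in this playout takes the
    last stone (a win), otherwise -1 (a loss).
    """
    first_player_to_move = True
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return 1 if first_player_to_move else -1
        first_player_to_move = not first_player_to_move


def evaluate_moves(stones, playouts=2000, seed=0):
    """Average playout score for each legal move, from the mover's view."""
    rng = random.Random(seed)
    scores = {}
    for move in legal_moves(stones):
        remaining = stones - move
        if remaining == 0:
            scores[move] = 1.0  # taking the last stone wins outright
            continue
        # After our move the opponent moves first in the playout,
        # so the playout result is negated to get our score.
        total = sum(-random_playout(remaining, rng) for _ in range(playouts))
        scores[move] = total / playouts
    return scores


if __name__ == "__main__":
    for move, score in evaluate_moves(5).items():
        print(f"take {move}: average score {score:+.3f}")
```

In this toy game the averages are enough to separate the moves from a pile of 5. The code never invents a move that was not already legal; it only attaches numbers to the existing options.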
The end result of the training (variation and selection) process is a population of very good chess moves (replicators). Based on my understanding it seems that AZ is creating knowledge through the process of evolution. But it is relying on random variance rather than creativity.
Happy to hear feedback on where I went wrong =) The parts I need to research more are how the moves are selected based on the root visit count, and also whether good chess moves can be considered replicators (I have made mistakes with replicators in the past).
The rules of the game are domain-specific knowledge. But anyway:
Why was there a significant delay going from playing Go to Chess? Coding the rules of chess should only take them a few days. And they should have it playing hundreds of board games (and Starcraft) by now if the only effort needed is to code the game rules since nothing but the rules is domain-specific. It should be easy to use with stuff where the rules and outcomes are well-defined and reasonably simple.
Why were there different versions of AlphaGo? What did they change between versions to make it better at Go? Whatever the answer, were there human beings thinking about which changes would make it better at Go, and trying to do those changes? If so, that is domain specific knowledge that they’re adding.
Is AZ creating better chess moves during training? No. All possible moves are known at the start. Chess moves like ne4 (or c3-e4, possibly with a color specification) are not being created or removed. What it does during training is change node weightings. And the nodes are not chess moves. It doesn’t end training with the information that ne4 is a higher weighted move than ng4. The weighted nodes mean something else.
This is obvious. I’m surprised this didn’t occur to me. Evolution is itself mentioned as the other process which can create knowledge. I think BoI says that (human) creativity is the only process which can create explanatory knowledge. I think that shouldn’t matter to this discussion because I’m interested in understanding what kind of knowledge can be created by processes called machine learning.
This was poor of me. I was depending on other people to do the work of finding out what machine learning is and what kind of knowledge it is creating. I should’ve posted some article that I was ready to explain. I don’t understand this stuff myself, so I was asking others to do the work for me.
I think your analysis of AZ creating knowledge through evolutionary processes is wrong for reasons that Elliot mentioned. I don’t see how a move can be a replicator.
You’re talking about the nodes in neural networks right? I think you are. I think the nodes in Monte Carlo tree search are indeed positions and the edges are the moves.
There wasn’t much delay in AlphaZero. AZ is the algorithm they created after the success of AlphaGo, which was specialized for Go. AZ was playing Go, chess, and shogi pretty soon. That is to say that it wasn’t given domain specific knowledge.
AlphaGo was indeed given domain specific knowledge. It was specialized for a domain.
IDK if there are any good, general ideas about that (tho I suspect not outside trivial cases).
WRT neural nets, I’d say the collection of weighted edges + nodes are a sort of image or digest of a few things: the neural net algorithm, training data, fitness function(s), programmer choices like what the input nodes connect to and what output nodes mean, any injected randomness/entropy, etc.
Can you think of a counter-example to this idea? You do say the following, but you don’t explain it:
The rules of the game are domain-specific knowledge.
Note they specifically called out domain-specific human knowledge. I’ll just provide you with their full quote instead of taking it out of context:
“Our results demonstrate that a general-purpose reinforcement learning algorithm can learn, tabula rasa—without domain-specific human knowledge or data, as evidenced by the same algorithm succeeding in multiple domains—superhuman performance across multiple challenging games.”
For reference here is the “domain knowledge” they supply the program:
1. The input features describing the position, and the output features describing the move, are structured as a set of planes; i.e. the neural network architecture is matched to the grid-structure of the board.
2. AlphaZero is provided with perfect knowledge of the game rules. These are used during MCTS, to simulate the positions resulting from a sequence of moves, to determine game termination, and to score any simulations that reach a terminal state.
3. Knowledge of the rules is also used to encode the input planes (i.e. castling, repetition, no-progress) and output planes (how pieces move, promotions, and piece drops in shogi).
4. The typical number of legal moves is used to scale the exploration noise.
5. Chess and shogi games exceeding 512 steps were terminated and assigned a drawn outcome; Go games exceeding 722 steps were terminated and scored with Tromp-Taylor rules, similarly to previous work.
And they should have it playing hundreds of board games (and Starcraft)
“AlphaStar was ranked above 99.8% of active players on Battle.net, and achieved a Grandmaster level for all three StarCraft II races: Protoss, Terran, and Zerg. We expect these methods could be applied to many other domains.”
I haven’t looked into the difference between AlphaStar and AlphaZero but they are made by the same people.
Also looks like it’s fairly adaptable to any kind of game:
Why were there different versions of AlphaGo?
I didn’t read much about AlphaGo but there was only one version of AlphaZero:
“The hyperparameters of AlphaGo Zero were tuned by Bayesian optimization. In AlphaZero, we reuse the same hyperparameters, algorithm settings, and network architecture for all games without game-specific tuning.”
Is AZ creating better chess moves during training? No. All possible moves are known at the start.
That’s true, they aren’t creating new moves. I should have been more specific in my original comment. When I said “good moves” I probably could have called them “good tactics” or “good strategies”. A move on its own can’t be good or bad; it depends on the game state. The same move could be game-winning or game-losing in different game states.
I think that good tactics are replicators. Imagine a room full of amateur chess players watching two grandmasters play. The grandmaster playing black starts with a strong opening sequence consisting of 3 moves. The amateur players are impressed with the opening and take notes on that sequence of moves. In their future games they might try to replicate that sequence against their opponents.
Now consider the same situation but with two amateurs being watched by a room full of grandmasters. The amateur playing black uses a novel opening that none of the grandmasters have seen before. However, his opening results in a strong positional disadvantage and he ends up losing the game. None of the grandmasters take note of his sequence of moves and they do not replicate it in any future games.
AlphaZero tests out many different tactics and strategies based on all kinds of different game states. Winning tactics and strategies are “stored” (I think, still need to talk with my neural network friend) in the neural network and are replicated in future live/training games.
Here are some interesting quotes on the gameplay of AZ:
“I always wondered how it would be if a superior species landed on earth and showed us how they play chess. I feel now I know”. — Peter Heine Nielsen
“It doesn’t play like a human, and it doesn’t play like a program. It plays in a third, almost alien, way.” — Demis Hassabis
They also provide some pseudocode for the AZ algorithm. I think it would be fun/worthwhile to try and understand the pseudocode and how it works. I looked over it myself but need to learn more about MCTS and neural networks to understand it fully.
Moves aren’t stored in the NN like you would store them in a tree for searching. A naive implementation might use a straight mapping of the output layer of a NN to some encoding of the move. e.g., with a NN to play chess, mb you have 8 + 8 + 8 + 8 + 5 output nodes (move from rank + file (8+8), move to rank + file(8+8), promotion or none(4+1); nb: not sure if this is enough to encode all possible moves, but it’s demonstrative; also, more efficient encodings are definitely possible).
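As a rough illustration of that naive 8 + 8 + 8 + 8 + 5 output layout, here is a sketch that turns a move into a 37-element vector and back. The UCI-style move strings (like `e2e4` or `e7e8q`), the file-then-rank ordering, and the function names are assumptions made for the example. This is not how AlphaZero encodes moves; per the paper it uses stacked planes.

```python
FILES = "abcdefgh"
RANKS = "12345678"
PROMOTIONS = ["", "q", "r", "b", "n"]  # "" stands for "no promotion"


def one_hot(index, size):
    """Vector of zeros with a single 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def encode_move(move):
    """Encode a move such as 'e2e4' or 'e7e8q' as 8+8+8+8+5 = 37 values."""
    return (
        one_hot(FILES.index(move[0]), 8)        # from-file
        + one_hot(RANKS.index(move[1]), 8)      # from-rank
        + one_hot(FILES.index(move[2]), 8)      # to-file
        + one_hot(RANKS.index(move[3]), 8)      # to-rank
        + one_hot(PROMOTIONS.index(move[4:]), 5)  # promotion piece or none
    )


def decode_move(outputs):
    """Read a move back by taking the largest value in each group."""
    bounds = [(0, 8), (8, 16), (16, 24), (24, 32), (32, 37)]
    picks = []
    for start, end in bounds:
        group = outputs[start:end]
        picks.append(max(range(len(group)), key=group.__getitem__))
    return (
        FILES[picks[0]] + RANKS[picks[1]]
        + FILES[picks[2]] + RANKS[picks[3]]
        + PROMOTIONS[picks[4]]
    )


if __name__ == "__main__":
    vector = encode_move("e7e8q")
    print(len(vector), decode_move(vector))
```

Note that `decode_move` will happily return illegal moves if the network's outputs point at one, so a real program would still need to filter against the legal move list.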
A general simplification is like: when good stuff happens increase the weights + biases of paths between input and output nodes, and when bad stuff happens decrease weights + biases of paths between input and output nodes.
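A very stripped-down sketch of that "reward up, punishment down" idea is below. It assumes a toy setting with no inputs at all: just three output values (think of them as biases), a softmax to turn them into move probabilities, and a REINFORCE-style update. The reward rule (action 2 is "good", the others are "bad") and all names are invented for the illustration; a real NN would also have weights on paths from the inputs.

```python
import math
import random


def softmax(logits):
    """Turn raw output values into probabilities that sum to 1."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def update(logits, action, reward, lr=0.1):
    """Nudge the outputs: a positive reward makes `action` more likely,
    a negative reward makes it less likely (REINFORCE-style step)."""
    probs = softmax(logits)
    for i in range(len(logits)):
        grad = (1.0 if i == action else 0.0) - probs[i]
        logits[i] += lr * reward * grad


def train(steps=2000, seed=0):
    """Sample actions, reward action 2, punish the rest, return final probs."""
    rng = random.Random(seed)
    logits = [0.0, 0.0, 0.0]
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(3), weights=probs)[0]
        reward = 1.0 if action == 2 else -1.0  # assumed toy reward rule
        update(logits, action, reward)
    return softmax(logits)


if __name__ == "__main__":
    print([round(p, 3) for p in train()])
```

Nothing in the final numbers says why action 2 is good. They only record that it was rewarded, which fits the point that the weights are a digest of the algorithm, the reward function, and the randomness that went in.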
I have two suggestions that might help you make some progress:
ctrl+f for “induction” in BoI and closely re-read what DD has to say.
explain why you think knowledge creation is happening. Some things to consider: Could it be possible that ML is not actually creating knowledge, just reorganizing it in some way (or something like that)? Where is the boundary between when knowledge creation happens and when it doesn’t? Does BFS, DFS, A*, etc create knowledge? Does an “evolutionary” search algorithm create knowledge?
My feeling is that the discussion is a bit off topic at this point and it might be good to reduce the complexity a bit and focus on the core of the topic.
Do you think it would still be accurate to say the “tactics” are stored in the NN? It seems like they are stored in the NN in the form of weights, biases of paths, etc.
If they aren’t stored in the NN where would they be stored?
My feeling is that the discussion is a bit off topic at this point and it might be good to reduce the complexity a bit and focus on the core of the topic.
You’re not wrong. Do you think that’s ok for the “friendly” board though? One of the things I liked about discord was that we could have informal, unstructured discussions. I think allowing those kinds of discussions would be a good fit for the “friendly” board.
Some knowledge is encoded in them, somehow. Just like knowledge is encoded in our brains somewhere (mb NN structure, mb RNA, mb combinations of that and other things). It’s also like how knowledge is encoded in genes, or printed words, or the states of electrons in a computer’s memory. Maybe a better way to say it would be like a shadow or projection of some knowledge is encoded in those things. But we don’t know how to read or understand those encodings generally. (It’d be a big deal if we did)
Yes, it’s up to you and/or @doubtingthomas. It depends on your goals. I think Elliot’s comments in Friendly Category Posting Policies Question give some good background on the Friendly category and what’s okay/not okay. In this case, reducing complexity was just my suggestion, but I’m not going to be pushy about it or criticize you for talking about AlphaZero or things. (Well, I’m going to try to be like that, and it’s fair for someone to call me out if I’m not being like that)
I agree. And based on my reading on AZ it doesn’t seem like the tactics were encoded into the algorithm or otherwise put there by the programmers. The same algorithm was used for go, chess and shogi. So were the optimal strategies for all three of those games somehow encoded into a single, generic algorithm? That seems unlikely to me especially considering we know exactly what domain knowledge was provided to the algorithm (rules of the game, legal moves, etc.).
So if the tactics are stored or encoded into the NN where did they come from? I haven’t seen any evidence or explanation of how they could have been encoded into the algorithm. So my conclusion is that the tactics (which I think are knowledge) were created by AZ (using evolution, but not creativity).
The knowledge in the program was put there by the programmers so the knowledge was indeed created by creative processes.
This one? I’m a bit confused so I will write down as much as I can to make things clear. Are you asking me if I can think of a counter example to this idea? I think you are. What does this idea say? I think it says: The knowledge in AZ was put there by the programmers so knowledge in that program was indeed created by people writing that program. The knowledge in AZ was put there by Deepmind people. You are asking me if I can think of a counter example of this idea. What does that mean? An example which can counter this idea?
Let’s presume that in some way all the knowledge about playing chess/go/etc was external to the “learning” that AZ did, i.e., no new knowledge was created. If that’s a false presumption, then there should be stuff about AZ’s performance/output that can’t be explained. In essence, let’s look for a contradiction.
In the context of the assumption above, that observation just means that the underlying AZ algorithm has some reach. That’s not unexpected. Do you have a criticism of the idea in this case, the apparent differences between Chess, Go, and Shogi are just parochial misconceptions?
I don’t think that AZ finds optimal strategies (it might, but there’s no guarantee). Certainly there are some good and novel ‘strategies’ used by AZ when playing the games, tho. I put ‘strategies’ in quotes b/c when a person thinks about a strategy they’re using explanatory knowledge. The result of AZ’s training doesn’t create explanatory knowledge, tho. If it did, then AZ would be a person according to BoI, ch3:
The ability to create and use explanatory knowledge gives people a power to transform nature which is ultimately not limited by parochial factors, as all other adaptations are, but only by universal laws. This is the cosmic significance of explanatory knowledge – and hence of people, whom I shall henceforward define as entities that can create explanatory knowledge.
“Likeliness” shouldn’t be how you judge something correct or not. IMO a better way to think about it is Which decisive criticisms are unanswered?
Broadly, there are two parts for AZ: training and playing. Provided that the training side has some universality (e.g., given any board game specification it can ‘learn’ to play it), then I don’t think there’s anything particularly interesting going on there. If AZ’s training part doesn’t have some universality, then we should be able to come up with an explanation about places/situations/games where it won’t work. For example, with Stockfish we know that it doesn’t have universality over board games b/c it’s programmed specifically to play chess. It would be very surprising if it worked for Go, too, without significant modifications.
I think you might be confusing two things here. I don’t think it’s contentious that there is some knowledge about playing those games embedded in the output of AZ’s training phase. A better q might be like “so if the programmers didn’t (directly or indirectly) put the encoded knowledge in the NN, where did it come from?”
From BoI ch7:
The task of ruling out the possibility that the knowledge was created by the programmer in the case of ‘artificial evolution’ has the same logic as checking that a program is an AI – but harder, because the amount of knowledge that the ‘evolution’ purportedly creates is vastly less. Even if you yourself are the programmer, you are in no position to judge whether you created that relatively small amount of knowledge or not. For one thing, some of the knowledge that you packed into that language during those many months of design will have reach, because it encoded some general truths about the laws of geometry, mechanics and so on. For another, when designing the language you had constantly in mind what sorts of abilities it would eventually be used to express.
The Turing-test idea makes us think that, if it is given enough standard reply templates, an Eliza program will automatically be creating knowledge; artificial evolution makes us think that if we have variation and selection, then evolution (of adaptations) will automatically happen. But neither is necessarily so. In both cases, another possibility is that no knowledge at all will be created during the running of the program, only during its development by the programmer.
My commentary on AZ:
I think AZ’s training output (the NN) is a program. That program is generated via a sort of metaprogramming, and the relevant logic (and explanatory knowledge) is put there by the programmers. Additional knowledge is put there via the implementation of stuff like MCTS, which again contains embedded knowledge that the programmers already had. AZ’s algorithms contain generic and universal knowledge about how to play games (possibly only some particular types of games), and playing games is a long-time preexisting field of mathematics, which is where a lot of the knowledge embedded in AZ comes from.
When good chess players watch AZ’s output program playing chess, and think something like huh, that’s an interesting move, then those chess players will start doing creative thinking about why that move might be good and how it will affect the game, and how each player will react and strategize after that. Those chess players might learn something from that creative thinking, but the knowledge that they create (which could be considered a strategy/tactic proper) wasn’t in AZ’s output NN program; it was something that the chess players had to come up with so that they understood what was going on. That understanding (that the chess players now have) doesn’t need to line up with AZ’s behavior. After such creative thought, the chess players might be able to beat AZ (and each other) in new and interesting ways b/c they created explanatory knowledge. AZ’s output NN’s behavior was inspiration for that, but it isn’t the source of the new explanatory knowledge the chess players obtained.
From the linked wiki page on metaprogramming:
Metaprogramming is a programming technique in which computer programs have the ability to treat other programs as their data. It means that a program can be designed to read, generate, analyze or transform other programs, and even modify itself while running. … It also allows programs greater flexibility to efficiently handle new situations without recompilation.
Metaprogramming was popular in the 1970s and 1980s using list processing languages such as LISP. LISP hardware machines were popular in the 1980s and enabled applications that could process code. They were frequently used for artificial intelligence applications.
edit: I said this above and I think it might be pressuring and worded in an authoritative tone, which I don’t think I want to do in Friendly.
“Likeliness” shouldn’t be how you judge something correct or not.
Yeah. Specifically, I mean a counter-example to the conjecture that AZ’s training algorithm doesn’t create knowledge and all appearances of new knowledge are misconceptions.
Here’s a relevant BoI quote from ch5:
Whenever a high-level explanation does follow logically from low-level ones, that also means that the high-level one implies something about the low-level ones. Thus, additional high-level theories, provided that they were all consistent, would place more and more constraints on what the low-level theories could be. So it could be that all the high-level explanations that exist, taken together, imply all the low-level ones, as well as vice versa. Or it could be that some low-level, some intermediate-level and some high-level explanations, taken together, imply all explanations. I guess that that is so.
When I was reading BoI I remember realizing that some subset of objectively true knowledge implies all other objectively true knowledge. The above quote is the closest thing I found to that searching BoI just now, tho maybe there’s something about it in FoR, too.
Somewhat related is one of my favorite quotes; from ch10:
SOCRATES: I also see why you urge me always to bear human fallibility in mind. In fact, since you mentioned that some moral truths follow logically from epistemological considerations, I am now wondering whether they all do. Could it be that the moral imperative not to destroy the means of correcting mistakes is the only moral imperative? That all other moral truths follow from it?
Do you have a criticism of the idea in this case, the apparent differences between Chess, Go, and Shogi are just parochial misconceptions?
So I did some research and I think you’re right. The reason chess, go and shogi can all use the same algorithm is because they are “perfect information” games. They don’t contain hidden information. As opposed to games like StarCraft or Hearthstone which do contain hidden information.
I don’t think that AZ finds optimal strategies
I think that would depend on the game being played. For example, in the “perfect information” game of tic-tac-toe it’s quite easy to find the optimal move (assuming the opponent is perfectly rational and also takes optimal actions) by exploring the “game tree” with a minimax strategy. But for a game like chess the “game tree” is too large to be explored by even our fastest computers. So we need to balance the exploration of the tree with limited time and memory constraints. AZ uses a variant of the Upper Confidence bounds applied to Trees (UCT) algorithm to select moves in the MCTS. The UCT algorithm is what lets AZ find a good choice within a given time limit. With infinite time and memory the UCT algorithm should converge to minimax.
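For the tic-tac-toe case, here is a minimal sketch of exhaustive minimax. It is plain minimax over the full game tree, not UCT or anything AlphaZero does. The board is assumed to be a 9-character string read row by row, with `.` for an empty square, and the function names are invented for the example.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play(board, index, player):
    """Board after `player` marks square `index`."""
    return board[:index] + player + board[index + 1:]


@lru_cache(maxsize=None)
def minimax(board, player):
    """Game value from X's view with `player` to move:
    +1 if X can force a win, 0 for a draw, -1 if O can force a win."""
    won = winner(board)
    if won == "X":
        return 1
    if won == "O":
        return -1
    if "." not in board:
        return 0
    other = "O" if player == "X" else "X"
    values = [minimax(play(board, i, player), other)
              for i, square in enumerate(board) if square == "."]
    return max(values) if player == "X" else min(values)


def best_move(board, player):
    """Square index that is optimal for `player` (first one on ties)."""
    other = "O" if player == "X" else "X"
    moves = [i for i, square in enumerate(board) if square == "."]
    pick = max if player == "X" else min
    return pick(moves, key=lambda i: minimax(play(board, i, player), other))


if __name__ == "__main__":
    print("value of the empty board:", minimax("." * 9, "X"))
    print("X to move on XX.OO....:", best_move("XX.OO....", "X"))
```

This works because tic-tac-toe has only a few thousand reachable positions. The same function applied to chess would be correct in principle but would never finish, which is the gap that sampling methods like MCTS are meant to cover.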
I don’t think it’s contentious that there is some knowledge about playing those games embedded in the output of AZ’s training phase. A better q might be like “so if the programmers didn’t (directly or indirectly) put the encoded knowledge in the NN, where did it come from?”
I would say that the knowledge could have been created by evolution during the training process of AZ.
The relevant logic (and explanatory knowledge) is put there by the programmers. Additional knowledge is put there via the implementation of stuff like MCTS, which again contains embedded knowledge that the programmers already had. AZ’s algorithms contain generic and universal knowledge about how to play games (possibly only some particular types of games), and playing games is a long-time preexisting field of mathematics, which is where a lot of the knowledge embedded in AZ comes from.
There is definitely some knowledge embedded in the algorithms by the programmers (or whoever wrote the algorithms). For example, the knowledge that playing to maximize your chances of winning while minimizing your opponent’s chances of winning is a good strategy. Or the idea that we can assign values to wins, losses or draws and that we should try to pick moves with high expected values.
What isn’t embedded in the AlphaZero code is the knowledge of which moves have high expected values, or which moves maximize our chances of winning while minimizing our chances of losing. Imagine that aliens from another galaxy have created an extremely complex (by human standards) two player, zero sum, perfect information game. We could imagine some convoluted rules which make it very difficult to determine at first glance what the “optimal” move would be for a given game state. And the games could be so long that they would take years for a human to complete.
If someone explained the rules to you and then entered you into a game tournament it would not be useful for you to know that you should pick moves which have high expected values. Or that you should pick moves which maximize your chances of winning while minimizing your chances of losing. Because you would have no way of knowing which moves are high expected value and which are low expected value.
Yet we could give that same information to AlphaZero and it would be able to come up with very strong moves (assuming enough time and memory). Moves which the programmers of AlphaZero never considered because they had never heard of our hypothetical alien game. But AZ can only come up with those moves after it has gone through the “training” process. I think this is when the knowledge of which moves are “good” (based on a specific game state) is created by evolution.
During the training process AZ begins by taking random moves and playing out the games. It has to be random at first because AZ doesn’t know which moves lead to wins or losses and so the expected value for all of the moves is the same. The results from the random variations in the playouts are used to “train” the neural network so that it eventually can output a “good” move based on a given game state.
The programmers of AZ have told AZ to only store information on winning moves. There is no need to explore and store all of the possible playout variations of moves which are known to lead to losses. So a move will only cause AZ to copy it (by updating weights in the NN) if it leads to winning outcomes.
Consider a particular game state being evaluated by AZ. In this game state there is one move which leads to an eventual win, while all other moves lead eventually to a loss. Only the winning move will be copied into the NN and replicated by AZ in future games. Any other randomly selected move would not be copied because they would all lead to an eventual game loss.
So there is a process of variation (randomly selecting moves and playing out the games) and selection (keeping track of moves that lead to game wins). And AZ has to go through this process of variation and selection before it can win any games. If AZ went to play a game with just the knowledge embedded in it by the programmers (without doing the training process) it could do nothing more than select moves at random. The knowledge of which moves have high expected values has to be created in the training process before AZ can start winning games of chess against other people or programs.
In general, it makes sense to me that knowledge is created by alternating variation and selection on a population of replicators. The variation can be random (like random mutations in DNA) or it can come from creative thought. If we have any given population of replicators and we alternate processes of random variation and then selection we should expect evolution to happen. Biological evolution by natural selection is one example of knowledge creation by random variance. But there’s no reason we couldn’t recreate that process in say a lab at Google. I’m not certain that we have recreated that process but I think it’s definitely physically possible. And it explains how the knowledge of expected values for moves in a given board state is created.
It’s possible that all of that knowledge was embedded in the algorithm from the beginning but it’s not clear to me how exactly that would work. Which parts of the AZ code contain the knowledge of high value chess moves, go moves, hypothetical alien game moves, etc.? And why would AZ have to go through a training process if all of that knowledge was already embedded in the code?
I said this above and I think it might be pressuring and worded in an authoritative tone, which I don’t think I want to do in Friendly.
No worries, your comment was very insightful. Appreciate any further feedback, I’m pretty tired and will look this over tomorrow for errors or things I should clarify.
The programmers don’t embed which moves are good (at least I’m willing to grant that for discussion; IRL maybe some of them do know stuff about the game and embed some of it). They embed a method of finding out which moves are good. Consider two hypotheticals to see how this works.
I embed into an app a brute force approach to finding the best moves in games. I have plenty of compute power to brute force chess or I focus on some other games with fewer possible moves. I don’t personally know anything about which chess moves are good, but I do know a way to find out (by brute force calculations that check all possible moves). The app does the labor but isn’t creative, it’s just mechanically doing what I told (coded) it to.
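A sketch of what such a brute force app could look like for a game small enough to check completely. The game is an assumed stand-in: one pile of stones, each player takes 1 to 3 per turn, and whoever takes the last stone wins. The names are hypothetical. The point is that the code contains a method for checking every line of play, but no list of good moves.

```python
from functools import lru_cache

MAX_TAKE = 3  # assumed toy rule: take 1, 2 or 3 stones per turn


@lru_cache(maxsize=None)
def mover_wins(stones):
    """True if the player to move can force taking the last stone.

    Checks every legal move: the mover wins if some move leaves the
    opponent in a position from which the opponent cannot win.
    With 0 stones there are no moves, so the player to move has lost.
    """
    return any(not mover_wins(stones - take)
               for take in range(1, min(MAX_TAKE, stones) + 1))


def winning_moves(stones):
    """Every take that leaves the opponent in a losing position."""
    return [take for take in range(1, min(MAX_TAKE, stones) + 1)
            if not mover_wins(stones - take)]


if __name__ == "__main__":
    for pile in range(1, 10):
        print(pile, winning_moves(pile))
```

The programmer here does not need to know which piles are bad to face. The output reveals that, but only by mechanically following the rule that was coded in.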
I make a robot which can move around, move objects around, visit a recharge station, visit a repair shop, use sensors, make sounds, and a few other things. It has physical capabilities similar to a human or better. I don’t know anything about how to build a cabin. I program a cabin building feature into the robot. How does it work? The robot connects to www.robotrecipes.com and gets instructions from the cabins section. Although I don’t know how to build a cabin, I do know steps to get a cabin built (1. look up recipe. 2. do what recipe said. This is similar to how I cook some foods, except I do the cooking work myself instead of having my robot do it.).
So to begin with, do you agree that my software isn’t creating knowledge in these cases?
Knowledge is information suited to solving one or more problem(s) in one or more context(s).
Creating knowledge is generating said information where it did not exist before.
Then regarding your hypotheticals:
The information regarding the best moves (or the best move for the context of a given game) did not exist before it was created by running the brute force algorithm. So it would seem that this hypothetical program is creating knowledge.
I think this one is underspecified. The key question is, does robotrecipes.com include literally all the information needed to build a cabin or rule out building a cabin at any specific site? “Recipe” type sites I have seen do not, and rely on lots of general knowledge and decision making. There’s a ton of stuff to deal with, like different slopes, soil types and composition, what if it’s raining or the temperature is below freezing or the wind is blowing 30 mph or there’s 15 feet of water covering the site, etc. Ex: If a step in the recipe calls for the ground to be level and there’s a 5-ton boulder sticking out of the ground, what do you do? Try to cut off the top of the boulder to match the ground? Use enough dynamite to fragment the boulder? Conclude the site is not suitable and look for another site (in which case, how big does the boulder have to be to trigger this)? Build dirt around the boulder to cover it all to a uniform level? If the robotrecipes site contained all this type of information for all possible building sites, including information to definitively rule out some sites, and information about how to tell which parts of the recipe to apply when/where on all suitable sites, then no knowledge would be created in building a specific cabin. Otherwise, it seems the robot in this hypothetical would have to create some knowledge - specifically how to apply the recipe to the specific context in which a cabin is to be built.
Neither of these algorithms would be universal knowledge creators. They are both bounded and not AGI. But that didn’t seem to be what you were asking about here.
I agree. My argument would be that they embed a method of finding out which moves are good via evolution.
I agree that knowledge isn’t being created in the 2nd example. The knowledge of how to build a cabin had to be uploaded to www.robotrecipes.com in advance. And the robot had to be programmed in advance to search the cabin section of that site when we enter the “build cabin” command.
In the 1st example I think that the “brute force approach” you embed into the app is using evolution to create knowledge. Using the minimax approach with tic-tac-toe should always give us the optimal move, assuming our opponent is rational and also uses optimal moves. To use the minimax approach you have to build out the game tree for tic-tac-toe, connect four, etc. Building out the game tree involves building out all variations of all possible game states. Then there is a selection process based on some evaluation function (like +1 for a win, -1 for a loss) to determine which moves should be played based on given game states.
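To make the tic-tac-toe case concrete, here is a minimal sketch of the minimax procedure described above, assuming the +1 / -1 evaluation (plus 0 for a draw). The board representation and the names `WIN_LINES`, `winner`, and `minimax` are my own choices for illustration, not anything from AZ or from a particular library.

```python
from functools import lru_cache

# Hypothetical representation: a tuple of 9 cells, each "X", "O", or " ".
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Return (value, move) with value from X's point of view.

    value is +1 if X wins, -1 if O wins, 0 for a draw, assuming both
    sides play optimally from this position. move is None at the end
    of the game.
    """
    w = winner(board)
    if w is not None:
        return (1 if w == "X" else -1), None
    moves = [i for i, cell in enumerate(board) if cell == " "]
    if not moves:
        return 0, None  # board full, nobody won

    other = "O" if player == "X" else "X"
    best_value, best_move = None, None
    for m in moves:
        # "Variation": try every legal move from this state.
        child = board[:m] + (player,) + board[m + 1:]
        value, _ = minimax(child, other)
        # "Selection": X keeps the highest value, O keeps the lowest.
        if best_value is None:
            better = True
        elif player == "X":
            better = value > best_value
        else:
            better = value < best_value
        if better:
            best_value, best_move = value, m
    return best_value, best_move


empty = (" ",) * 9
value, move = minimax(empty, "X")
```

Every line of this is fixed in advance by whoever wrote it: the full tree gets enumerated and the evaluation rule is applied mechanically at each node. Whether that enumeration counts as "variation and selection" in the evolutionary sense is exactly the point under dispute, so the sketch is only meant to show what the computation does.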
In chess we can’t run minimax over the full game tree because the tree is too large to enumerate. We can instead use Monte Carlo tree search, which uses random variation along with a selection process.
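As a rough illustration of the random-playout idea, here is a sketch that scores each legal move by playing many random games to the end and averaging the results. It is a simplification, sometimes called flat Monte Carlo: it has no tree of statistics and no neural network, so it is neither full MCTS nor what AZ does. I use tic-tac-toe only to keep it short, and `random_playout` and `score_moves` are names I made up.

```python
import random

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def random_playout(board, player, rng):
    """Play uniformly random legal moves to the end of the game.

    Returns "X" or "O" for a win, or None for a draw.
    """
    board = list(board)
    while True:
        w = winner(board)
        if w is not None:
            return w
        moves = [i for i, cell in enumerate(board) if cell == " "]
        if not moves:
            return None
        board[rng.choice(moves)] = player
        player = "O" if player == "X" else "X"


def score_moves(board, player, playouts=200, seed=0):
    """Estimate each legal move's average result for `player`.

    Each playout counts +1 for a win, -1 for a loss, 0 for a draw.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    other = "O" if player == "X" else "X"
    scores = {}
    for m in [i for i, cell in enumerate(board) if cell == " "]:
        child = list(board)
        child[m] = player
        total = 0
        for _ in range(playouts):
            result = random_playout(child, other, rng)
            if result == player:
                total += 1
            elif result == other:
                total -= 1
        scores[m] = total / playouts
    return scores


# X to move with an immediate win available at cell 2.
board = ("X", "X", " ", "O", "O", " ", " ", " ", " ")
scores = score_moves(board, "X")
best = max(scores, key=scores.get)
```

The individual playouts are random, but the selection rule (prefer the move with the better average) is written by the programmer, just as the evaluation rule is in minimax.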
The reason the application has to run the brute-force computations is that it needs to utilize the process of evolution to create the knowledge. You can’t get that knowledge without building out the game tree (which involves variance and selection).
In my recent post I specified that:
“Any other randomly selected move would not be copied because they would all lead to an eventual game loss.”
I wanted to point out that not any random, generic chess move is being selected. It’s a very specific variant of all the possible chess moves. Random moves would simply be ignored. And most of the possible moves are random moves. So most of the possible variants would be ignored.
The app does the labor but isn’t creative, it’s just mechanically doing what I told (coded) it to.
Evolution by natural selection isn’t doing anything creative when it creates knowledge.
So, I understand that you think brute force calculation going through all legal moves in chess or tic-tac-toe is “evolution” and “knowledge creation”. I disagree. I think that would need to be sorted out before discussing a more complicated case like AlphaZero. Do you agree? And have you read FoR?
Do you also think calculators create knowledge? Sometimes the information regarding the answer to some math problem didn’t exist before the calculator did some computations and output it. I think you’re either 1) attributing guiding and directing intelligence to tools (similar to giving a shovel significant credit for digging a hole, or maybe only an automated shovel with a motor) or 2) separating knowledge creation from intelligence or anything equivalent or similar to intelligence (so you don’t think the software does guiding or directing intelligence, but you want to give it credit anyway). But I don’t know which, and maybe it’s something else.