Are genetic algorithms suitable for problems like the Knuth problem?

Jan 23 2021

We all know that genetic algorithms can give an optimal or near-optimal solution. So, in some problems like NP-hard ones, where there is a trade-off between time and the optimal solution, the near-optimal solution is good enough.

Since there is no guarantee of finding the optimal solution, is a GA considered a good choice for solving the Knuth problem?

According to Artificial Intelligence: A Modern Approach (third edition), section 3.2 (p. 73):

Knuth conjectured that, starting with the number 4, a sequence of factorial, square root, and floor operations will reach any desired positive integer.

For example, 5 can be reached from 4:

floor(sqrt(sqrt(sqrt(sqrt(sqrt((4!)!))))))
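As a quick sanity check, the expression above can be evaluated with Python's standard `math` module. This is only an illustrative sketch; the variable names are mine and not from the book.

```python
import math

# (4!)! = 24!, computed exactly as a Python integer.
value = math.factorial(math.factorial(4))

# Apply the square root five times, as in the expression above.
for _ in range(5):
    value = math.sqrt(value)

result = math.floor(value)
print(result)  # 5
```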

So, suppose we have a target number (5) and we want to find the sequence of the 3 operations above that reaches it. Each gene of the chromosome would be a number that represents one of the operations, with an additional number for "no operation". The fitness function (to be minimized) would be the absolute difference between the target number and the number we get by applying the operations of the chromosome in order.

Now suppose that all the iterations (generations) finish without finding the optimal solution, and the nearest number we have is 4 (with fitness 1). The problem is that we can get 4 by applying no operation at all to 4, while reaching 5 requires many operations, so the near-optimal solution is not even close to the actual solution.
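A minimal sketch of the encoding and fitness function described above might look as follows. The gene codes, the cap on the factorial argument, and the choice to treat inapplicable operations as infinitely bad are all my assumptions for illustration, not part of the original description.

```python
import math

# Hypothetical gene codes: one per operation, plus "no operation".
NOOP, FACT, SQRT, FLOOR = 0, 1, 2, 3

# Assumed cap so that factorial results still fit in a float for sqrt.
MAX_FACTORIAL_ARG = 170


def decode(chromosome, start=4):
    """Apply the genes in order to `start`; return None if a step is invalid."""
    value = start
    for gene in chromosome:
        if gene == FACT:
            # Factorial is only applied to (not too large) integers.
            if value != int(value) or value > MAX_FACTORIAL_ARG:
                return None
            value = math.factorial(int(value))
        elif gene == SQRT:
            value = math.sqrt(value)
        elif gene == FLOOR:
            value = math.floor(value)
        # NOOP leaves the value unchanged.
    return value


def fitness(chromosome, target=5):
    """Absolute difference to the target (lower is better)."""
    value = decode(chromosome)
    if value is None:
        return float("inf")
    return abs(target - value)


print(fitness([NOOP] * 8))                                      # 1
print(fitness([FACT, FACT, SQRT, SQRT, SQRT, SQRT, SQRT, FLOOR]))  # 0
```

The all-`NOOP` chromosome already scores 1, while the chromosome that actually reaches 5 differs from it in every gene, which is the concern raised above.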

So, are GAs not suitable for this kind of problem? Or are the suggested chromosome representation and fitness function not good enough?

Answers

1 nbro Jan 23 2021 at 00:48

Before trying to answer your question more directly, let me clarify something.

People often use the term genetic algorithms (GAs), but, in many cases, what they really mean is evolutionary algorithms (EAs), which is a collection of population-based (i.e. multiple solutions are maintained at the same time) optimization algorithms and approaches that are inspired by Darwinism and survival of the fittest. GAs are one of these approaches, where the chromosomes are binary and you have both the mutation and the crossover operation. There are other approaches, such as evolution strategies or genetic programming.

As you also noticed, EAs are meta-heuristics, and, although there is some research on their convergence properties [1], in practice, they may not converge. However, when every other potential approach has failed, EAs can definitely be useful.

In your case, the problem is really to find a closed-form (or analytical) expression of a function, which is composed of other smaller functions. This really is what genetic programming (in particular, tree-based GP) was created for. In fact, the Knuth problem is a particular instance of the symbolic regression problem, which is a typical problem that GP is applied to. So, GP is probably the first approach you should try.

Meanwhile, I have implemented a simple program in DEAP that tries to solve the Knuth problem. Check it here. The fitness of the best solution it has found so far (with some seed) is 4, and that solution is floor(sqrt(float(sqrt(4)))) (here, float just converts the input to a floating-point number, to ensure type safety). I used the difference as the fitness function and ran the GP algorithm for 100 generations with 100 individuals per generation (which is not a lot!). I didn't tweak the hyper-parameters much, so, maybe, with the right seed and hyper-parameters, you can find the right solution.
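To make the reported result concrete, here is a small pure-Python sketch (not the DEAP program itself) that represents an expression as a nested tuple, roughly the way a tree-based GP individual can be viewed, and evaluates it. The `PRIMITIVES` table and the `evaluate` helper are hypothetical names, and the target of 5 is assumed from the question.

```python
import math

# Hypothetical primitive set: each name maps to a one-argument function.
PRIMITIVES = {
    "floor": math.floor,
    "sqrt": math.sqrt,
    "float": float,
    "factorial": lambda x: math.factorial(int(x)),
}


def evaluate(tree):
    """Evaluate a nested (name, child) tuple; anything else is a terminal."""
    if not isinstance(tree, tuple):
        return tree
    name, child = tree
    return PRIMITIVES[name](evaluate(child))


# The best individual reported above: floor(sqrt(float(sqrt(4)))).
best = ("floor", ("sqrt", ("float", ("sqrt", 4))))
value = evaluate(best)
error = abs(5 - value)  # assumed target: 5, as in the question
print(value, error)  # 1 4
```

The expression evaluates to 1, so its absolute difference from 5 is 4, which matches the fitness mentioned above.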

To address your concerns: in principle, you could use that encoding, but, as you note, the GA could indeed return $4$ as the best solution (which isn't actually that far away from $5$). You could avoid this by killing, at every generation, any individuals that evaluate to exactly that value.
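The culling idea could be sketched like this. It is only an illustration under my own assumptions: individuals are lists of operation names, and `evaluate` and `cull_trivial` are hypothetical helpers, not taken from any existing implementation.

```python
import math


def evaluate(ops, start=4):
    """Apply the named operations in order; return None if one is invalid."""
    value = start
    for op in ops:
        if op == "fact":
            # Assumed guard: only integers up to 170 get a factorial.
            if value != int(value) or value > 170:
                return None
            value = math.factorial(int(value))
        elif op == "sqrt":
            value = math.sqrt(value)
        elif op == "floor":
            value = math.floor(value)
        # "noop" leaves the value unchanged.
    return value


def cull_trivial(population, trivial_value=4):
    """Drop every individual whose result equals the starting value."""
    return [ind for ind in population if evaluate(ind) != trivial_value]


population = [
    ["noop", "noop"],           # evaluates to 4
    ["sqrt", "sqrt", "floor"],  # evaluates to 1
    ["fact", "sqrt", "floor"],  # floor(sqrt(24)) is also 4
    ["fact"],                   # evaluates to 24
]
survivors = cull_trivial(population)
print(survivors)  # [['sqrt', 'sqrt', 'floor'], ['fact']]
```

Note that culling by value also removes individuals that reach 4 through real operations, and that the population shrinks, so an actual loop would presumably need to refill it with new individuals.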

I didn't spend too much time on my implementation or on thinking about this problem, but, as I said above, even with genetic programming and using only Knuth's operations, the search could get stuck in local optima. You could try to augment my (or your) implementation with other operations, such as multiplication and addition, and see if something improves.