유전 알고리즘은 Knuth 문제와 같은 문제에 적합합니까?

Jan 23 2021

우리 모두는 유전 알고리즘이 최적 또는 거의 최적의 솔루션을 제공 할 수 있다는 것을 알고 있습니다. 따라서 NP-hard 문제와 같은 일부 문제에서 시간과 최적 솔루션 사이의 절충안은 거의 최적의 솔루션으로 충분합니다.

최적의 솔루션을 찾을 수 있다는 보장이 없기 때문에 GA가 Knuth 문제를 해결하는 데 좋은 선택으로 간주됩니까?

인공 지능에 따르면 : 현대적인 접근 방식 (제 3 판), 섹션 3.2 (p. 73) :

Knuth는 숫자 4부터 계승, 제곱근 및 바닥 연산의 시퀀스가 ​​원하는 양의 정수에 도달 할 것이라고 추측했습니다.

예를 들어 4에서 5에 도달 할 수 있습니다.

플로어 (sqrt (sqrt (sqrt (sqrt (sqrt ((4!)!))))))

따라서 우리가 숫자 (5)를 가지고 있고 주어진 숫자에 도달하기 위해 언급 된 3 개의 작동 순서를 알고 싶다면, 염색체의 각 유전자는 특정 작동을 나타내는 숫자가 될 것입니다. (작동 없음) 피트니스 함수는 주어진 숫자와 각 염색체에 대해 특정 순서로 연산을 적용하여 얻은 숫자 (최소) 간의 절대 차이입니다. 반복 횟수 (세대)가 최적의 솔루션없이 수행되고 가장 가까운 숫자가 4 (피트니스 1)라고 가정 해 봅시다. 문제는 4에 연산을 적용하지 않고 4를 얻을 수있는 반면 5에 필요한 것은 많은 작업을 수행하므로 거의 최적의 솔루션이 솔루션에 가깝지 않습니다.

그렇다면 GA는 이런 종류의 문제에 적합하지 않습니까? 아니면 제안 된 염색체 표현과 피트니스 기능이 충분하지 않습니까?

답변

1 nbro Jan 23 2021 at 00:48

질문에 더 직접적으로 대답하기 전에 명확하게 설명하겠습니다.

People often use the term genetic algorithms (GAs), but, in many cases, what they really mean is evolutionary algorithms (EAs), which is a collection of population-based (i.e. multiple solutions are maintained at the same time) optimization algorithms and approaches that are inspired by Darwinism and survival of the fittest. GAs is one of these approaches, where the chromosomes are binary and you have both the mutation and cross-over operation. There are other approaches, such as evolution strategies or genetic programming.

As you also noticed, EAs are meta-heuristics, and, although there is some research on their convergence properties [1], in practice, they may not converge. However, when any other potential approach has failed, EAs can be definitely useful.

In your case, the problem is really to find a closed-form (or analytical) expression of a function, which is composed of other smaller functions. This really is what genetic programming (in particular, tree-based GP) was created for. In fact, the Knuth problem is a particular instance of the symbolic regression problem, which is a typical problem that GP is applied to. So, GP is probably the first approach you should try.

Meanwhile, I have implemented a simple program in DEAP that tries to solve the Knuth problem. Check it here. The fitness of the best solution that it has found so far (with some seed) is 4 and the solution is floor(sqrt(float(sqrt(4)))) (here float just converts the input to a floating-point number, to ensure type safety). I used the difference as the fitness function and ran the GP algorithm for 100 generations with 100 individuals for each generation (which is not a lot!). I didn't tweak much the hyper-parameters, so, maybe, with the right seed and hyper-parameters, you can find the right solution.

To address your concerns, in principle, you could use that encoding, but, as you note, the GA could indeed return $4$ as the best solution (which isn't actually that far away from $5$), which you could avoid my killing, at every generation, any individuals that have just that value.

I didn't spend too much time on my implementation and thinking about this problem, but, as I said above, even with genetic programming and using only Knuth's operations, it could get stuck in local optima. You could try to augment my (or your) implementation with other operations, such as the multiplication and addition, and see if something improves.