Keywords—Single Objective Optimization, Evolutionary Algorithms, Environmental Adaptation Method, Community Detection Problem.
- INTRODUCTION
K. K. Mishra et al. [33] proposed the Environmental Adaptation Method (EAM) as an alternative to GA [25], PSO [27], and DE [53] for solving optimization problems. The algorithm builds on the theory of adaptive learning given by Mark Baldwin [6] and uses three operators, adaptation, alteration, and selection, to find optimal solutions. The adaptation operator updates the phenotypic structure of a solution in the search space on the basis of the current environmental fitness and the solution's own fitness. The alteration operator takes the updated positions as input and injects extra diversity by inverting each bit of every solution with some probability; after alteration, the offspring population is obtained. Finally, a selection operator selects the best solutions from the parent and offspring populations for the next generation. Although EAM's convergence rate and solution diversity were good and the algorithm was efficient for multi-modal functions, it was not efficient for uni-modal functions. Moreover, the best solution of the population made no contribution to the search for optimal points in the search space. Therefore, to make the algorithm efficient for both multi-modal and uni-modal functions, and to utilize the best solution in the search process, K. K. Mishra et al. [34], [35] proposed a variant of EAM named the Improved Environmental Adaptation Method (IEAM).
IEAM uses the same operators, with some changes in adaptation, and divides the population into two parts: one part contains only the best solution, and the other contains the remaining (N−1) solutions. The adaptation operator uses a different mechanism to change the positions of the solutions in each part, while the remaining two operators work as they did in EAM. The use of the best solution in the search process proved very useful, and IEAM performed well compared to EAM. However, both EAM and IEAM are binary coded, which introduces computational overhead and limitations; these algorithms cannot be applied to real-valued problems. To reduce the computational overhead and handle real-valued problems, a new version of IEAM named the "Improved Environmental Adaption Method with Real Parameter (IEAM-RP)" [52] was suggested in 2016. In IEAM-RP, the adaption operator was modified to deal with real-valued problems. The performance of the algorithm was checked on the 24 benchmark functions of the COCO [19] framework. IEAM-RP improved the convergence rate and was able to find optimal solutions in fewer function evaluations. Although its performance was very good in lower dimensions (2-D, 3-D, 5-D, and 10-D), it was not as efficient on higher dimensional optimization problems.
In this paper, a variant of IEAM-RP named "IEAM-RP1" is developed to improve the performance of the algorithm in higher dimensions. The proposed method uses two operators, adaptation and selection, to find optimal solutions, and the main focus is on providing extra exploration capability to the solutions. Based on the current environmental fitness, the population is divided into two parts: the first contains the solutions whose fitness values are lower than the current environmental fitness, and the second contains those whose fitness values are higher. Here, the current environmental fitness is the average fitness of the population. Different mechanisms are applied to change the phenotypic structure of the solutions in each part, and special attention has been given to balancing the exploration and exploitation capabilities of the operators.
The remaining part of the paper is organized as follows: Section II presents the background details and related work. The proposed algorithm is described in Section III. The experimental setup and benchmark functions are given in Section IV, and experimental results are described in Section V. The proposed algorithm is applied to a real-world problem of community detection in complex networks in Section VI. Experimental results for the community detection problem are discussed in Section VII, and the conclusion is given in Section VIII.
- BACKGROUND DETAILS AND RELATED WORK
- Environmental Adaptation Method (EAM)
K. K. Mishra et al. [33] proposed a new algorithm to solve single objective optimization problems. The algorithm applies the theory of adaptive learning in three steps, with one operator designed for each step: adaptation, alteration, and selection. These operators use a binary representation of the solutions, which represents the phenotypic structure of an individual. The adaptation operator takes the randomly initialized population as input and changes the phenotypic structure of its members in the search space. Mathematically, the adaptation operator explores other regions in the given range as follows:
Pi+1 = (α × Pi × F(Pi)/Favg + β) % (2^L − 1)    (1)
Where Pi and Pi+1 are the position vector and updated position vector of the ith member of the population, α and β are constant values that need to be chosen according to the optimization problem under consideration, F(Pi) and Favg are the fitness value of Pi and the environmental fitness value of the current population respectively, and L is the total number of bits in an individual. After adaptation, the alteration operator inverts each bit of every individual with a certain probability in order to provide extra diversity; this yields an offspring population of the same size. The selection operator then selects the best individuals from the parent and offspring populations. This process is repeated until the optimal solution is found or a maximum number of iterations has been reached. EAM has the following shortcomings:
- EAM works with binary encoding, so it cannot be applied to real-valued problems.
- In higher dimensions it needs a greater number of bits, and the convergence rate decreases, which makes it prone to stagnation.
- The best individual of the population makes no contribution to the search process in subsequent generations.
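The binary-coded EAM loop described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the toy objective, the values of L, α, β, and the mutation probability, and the +1 guards against division by zero are all assumptions made here.

```python
import random

L = 16                       # bits per individual (illustrative)
ALPHA, BETA = 1.0, 3.0       # problem-dependent constants (eq. 1)
MUT_P = 1.0 / L              # bit-inversion probability (alteration)

def decode(bits):            # bitstring -> integer phenotype
    return int(bits, 2)

def fitness(bits):           # toy objective: minimise distance to 12345
    return abs(decode(bits) - 12345)

def adaptation(bits, f_avg):
    # Eq. (1): new position from own fitness vs. environmental fitness;
    # +1 guards (an assumption here) avoid division by zero at the optimum
    p = decode(bits)
    new_p = int(ALPHA * p * (fitness(bits) + 1) / (f_avg + 1) + BETA) % (2**L - 1)
    return format(new_p, f'0{L}b')

def alteration(bits):
    # invert each bit with a small probability for extra diversity
    return ''.join(b if random.random() > MUT_P else str(1 - int(b)) for b in bits)

def selection(parents, offspring, n):
    # keep the n fittest individuals from both populations
    return sorted(parents + offspring, key=fitness)[:n]

pop = [format(random.randrange(2**L), f'0{L}b') for _ in range(20)]
for _ in range(200):
    f_avg = sum(fitness(b) for b in pop) / len(pop)
    children = [alteration(adaptation(b, f_avg)) for b in pop]
    pop = selection(pop, children, 20)
```

The elitist selection step guarantees the best-so-far fitness never worsens between generations, which is the behaviour the shortcomings above contrast against: the best individual survives, but never guides the other members.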
- Improved Environmental Adaptation Method (IEAM)
K. K. Mishra et al. [34] proposed a new version of EAM, named "Improved Environmental Adaptation Method (IEAM)", to incorporate the contribution of the best member of the population in the search process. The adaptation operator of EAM was designed to explore the problem search space, while the alteration operator exploited new solutions around those generated by adaptation. Since EAM performs little exploitation, it did not perform well on uni-modal benchmark functions. IEAM introduces a few changes in the operators as well as fine-tuning of parameters, with the main changes in the adaptation operator: the best member explores other good regions, while the remaining members exploit the problem search space. The best member uses its own fitness and the environmental fitness to explore other good regions in the problem search space as follows:
Pi+1 = [α × Pi × F(Xi)/Favg + β] % 2^l    (2)
Where Pi is the position vector of the member that wants to update its structure, l is the number of bits, and α and β are random numbers between 0 and 1. F(Xi) is the fitness value of the ith member and Favg is the current environmental fitness. The remaining members of the population are guided to update their position vectors in the direction of the best member, in the hope that the optimal solution lies somewhere close to the structure of the best solution. The adaptation operator uses an adaptation window of variable bandwidth and changes their structure as follows:
Pi+1 = [α × Pi × F(Xi)/Favg + β × ((Gb − Pi) + (Pb − Pi))] % 2^l    (3)
where Gb is the position vector of the best member, and Pb is the personal best position vector of the member that wants to update its position in the problem search space. The remaining two operators of IEAM, alteration and selection, were used in the same way as in EAM. IEAM, however, has the following drawbacks:
- The value of (Gb − Pi) + (Pb − Pi), known as the bandwidth, is different for each member of the population. Due to this, some members are not able to move properly in the search space, which decreases the performance of the algorithm.
- Both EAM and IEAM are binary encoded, which results in high computational time.
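The two adaptation rules of IEAM (eqs. 2 and 3) can be written compactly as below. This is an illustrative sketch only: the helper names, the integer encoding of individuals as l-bit values, and the example constants are assumptions, not the authors' code.

```python
l = 16  # bits per individual (illustrative)

def clampmod(x):
    # keep positions inside the l-bit range, as the % 2^l term does
    return int(x) % (2**l)

def adapt_best(p, f, f_avg, alpha, beta):
    # Eq. (2): the best member explores using its own fitness
    # relative to the environmental fitness f_avg
    return clampmod(alpha * p * f / f_avg + beta)

def adapt_other(p, f, f_avg, g_best, p_best, alpha, beta):
    # Eq. (3): remaining members are pulled toward the global best (g_best)
    # and their personal best (p_best); (g_best - p) + (p_best - p) is the
    # member-specific bandwidth criticised above
    return clampmod(alpha * p * f / f_avg + beta * ((g_best - p) + (p_best - p)))

example_best = adapt_best(100, 5, 10, 0.5, 0.3)                    # 25
example_other = adapt_other(100, 5, 10, 200, 150, 0.5, 0.3)        # 70
```

Note how the bandwidth term in `adapt_other` depends on each member's own distance to `g_best` and `p_best`, which is exactly why it differs per member, the first drawback listed above.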
- Improved Environmental Adaptation Method With Real Parameter (IEAM-RP)
To overcome the above problems, a new version of IEAM was proposed in 2016: the Improved Environmental Adaption Method with Real Parameter (IEAM-RP). IEAM-RP uses the basic framework of IEAM but, unlike IEAM, employs only two operators: adaption and selection. Like other optimization algorithms, IEAM-RP starts with a randomly initialized population in the search space. The adaption operator creates the offspring population from the parent population on the basis of an adaption factor and a mobility factor. The adaption factor is the ratio of an individual's fitness to the current environmental fitness, whereas the mobility factor is the difference between the best and worst position vectors, i.e., the positions of the members having the best and worst fitness respectively. The best member of the population uses the adaption factor to change its structure, whereas the remaining members use the mobility factor to attain the phenotypic structure of the best member. The offspring population produced by the adaption operator was diverse enough that the alteration operator could be excluded. Mathematically, the best member of the population changes its phenotypic structure as follows:
Pi+1 = Pi × F(Xi)/Favg + β    (4)
Where β is a random number between 0 and 1, Pi is the position vector of the best member that wants to update its phenotypic structure, F(Xi) is the fitness value of the best member, and Favg is the current environmental fitness. The remaining members of the population try to attain the phenotypic structure of the best member by using the mobility factor as follows:
Pi+1 = Pi + β × (best_position − worst_position)    (5)
Here Pi is the position vector of the ith member, and best_position − worst_position is the fixed bandwidth used by the non-best members to move in the problem search space. This difference is called the mobility factor; it is fixed for all non-best members in a particular generation, which makes the members move properly in the problem search space.
The performance of IEAM-RP was compared with other nature-inspired optimization algorithms on the 24 benchmark functions of the COCO [19] framework in 2-D, 3-D, 5-D, and 10-D. IEAM-RP was found to perform better than other state-of-the-art algorithms in many respects: it removed the limitations of IEAM and performed well in terms of convergence rate and diversity. Its main problem was that it did not perform well in higher dimensions, such as 20-D and 40-D, because it was unable to maintain the balance between exploration and exploitation in an efficient way.
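A compact sketch of IEAM-RP's adaption operator (eqs. 4 and 5) together with elitist selection is given below. The sphere objective, population size, and generation count are illustrative assumptions, not the paper's experimental settings.

```python
import random

def sphere(x):
    # toy minimisation objective (assumption for this sketch)
    return sum(v * v for v in x)

def adapt(pop):
    fits = [sphere(p) for p in pop]
    f_avg = sum(fits) / len(fits)
    order = sorted(range(len(pop)), key=lambda i: fits[i])
    best, worst = pop[order[0]], pop[order[-1]]
    beta = random.random()
    children = []
    for i, p in enumerate(pop):
        if i == order[0]:
            # Eq. (4): best member rescales itself by the adaption factor
            children.append([v * fits[i] / f_avg + beta for v in p])
        else:
            # Eq. (5): others move by the fixed mobility factor
            children.append([v + beta * (b - w) for v, b, w in zip(p, best, worst)])
    return children

pop = [[random.uniform(-5, 5) for _ in range(5)] for _ in range(30)]
for _ in range(100):
    children = adapt(pop)
    # selection: keep the best N of parents + offspring
    pop = sorted(pop + children, key=sphere)[:30]
```

Note that `best - worst` is shared by all non-best members within a generation, which is the "fixed bandwidth" property that distinguishes IEAM-RP from IEAM's per-member bandwidth.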
- Evolutionary Strategy (ES)
Ingo Rechenberg [47] proposed the ES algorithm for optimization by using the idea of adaptation and evolution. In this algorithm, one parent is selected to generate one offspring per generation; the offspring is produced by a mutation operator that remains the same in all generations. This first version was known as two-membered ES. After the first proposal, a number of updated versions of the algorithm were developed. A later version is known as (µ + λ)-ES, in which µ parents are used to create λ mutated offspring; its approach is similar to the real-parameter genetic algorithm [24], [60], [18]. The evolutionary process in ES was enhanced by using a distribution model of the population, resulting in a new algorithm called the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [21]. The distribution model helps CMA-ES learn the linkage between the variables involved. CMA-ES starts with the initialization of a mean vector m ∈ R^n and a covariance matrix cov, where n is the dimension of the problem:

mj = rand(Lj, Uj), such that Lj < mj < Uj, j = 1, 2, …, n, and cov = σ²C,

where C ∈ R^{n×n} and σ is the step size. The initial value of C is the identity matrix. In each iteration, λ new solutions are generated using the multivariate normal distribution N(m, cov). The weighted sum of the best µ solutions is used to update m, σ, and C. The generally accepted values of µ and λ are ⌊λ/4⌋ and 4 + ⌊3 ln(n)⌋ respectively [22]. This process is repeated until the terminating criterion is met.
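The initialization and sampling step described above can be sketched as follows. This shows only the first iteration with C = I; the bound values and the step-size heuristic are assumptions of this sketch, and a full CMA-ES would additionally update m, σ, and C each generation (requiring a matrix factorization of C to sample when C ≠ I).

```python
import math
import random

n = 10                                  # problem dimension (illustrative)
lam = 4 + int(3 * math.log(n))          # default offspring count, 4 + [3 ln n]
mu = lam // 4                           # parents used in the update, [lam/4]

L_bound, U_bound = -5.0, 5.0            # search bounds (assumption)
m = [random.uniform(L_bound, U_bound) for _ in range(n)]   # mean vector
sigma = 0.3 * (U_bound - L_bound)       # step size (a common heuristic)
C = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # identity

def sample(mean, step):
    # with C = I, N(m, sigma^2 C) reduces to independent Gaussians per axis
    return [mi + step * random.gauss(0.0, 1.0) for mi in mean]

offspring = [sample(m, sigma) for _ in range(lam)]
```

For n = 10 this gives λ = 10 and µ = 2, matching the cited defaults.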
- Differential Evolution (DE)
Storn and Price [54] proposed DE, one of the most powerful stochastic real-parameter optimization algorithms in current use. DE is a population-based optimization algorithm and follows computational steps similar to those of a standard evolutionary algorithm (EA) to find the optimal solution. Since its development in 1995, DE has drawn the attention of researchers all over the world, resulting in many variants of the basic algorithm with improved performance. It uses three operators (mutation, crossover, and selection), whose working is described as follows:
Mutation: For each target vector xi,G, i = 1, 2, …, NP, a mutant vector is generated according to

vi,G+1 = xr1,G + F × (xr2,G − xr3,G)

where r1, r2, and r3 are mutually distinct random indices and F is a constant scaling factor that controls the amplification of the differential variation and affects the speed of convergence [46].

Crossover: In order to improve the diversity, crossover is introduced. It forms the trial vector Ui,G+1 = (U1i,G+1, U2i,G+1, U3i,G+1, …, UDi,G+1), where

Uji,G+1 = vji,G+1 if randj(0,1) ≤ Cr; xji,G otherwise.

Selection: To decide whether or not it should become a member of generation G + 1, the trial vector Ui,G+1 is compared to the target vector xi,G. If Ui,G+1 yields a smaller cost function value than xi,G, then xi,G+1 is set to Ui,G+1; otherwise, the old value xi,G is retained. Here, the task of the optimizer is to search for the parameter vector X∗ that minimizes an objective function over R^D, a non-empty large finite set serving as the domain of the search.

DE uses this combination of mutation, crossover, and selection operators to target optimal solutions. Several mutation schemes to refine classical DE were compared by Mezura-Montes et al. [32] using eight different DE variants on a set of 13 benchmark problems. Liu et al. [30] proposed the Fuzzy Adaptive Differential Evolution algorithm, driven by a fuzzy logic controller that adapts during each generation of DE. Qin et al. [44] proposed Self-Adaptive Differential Evolution (SADE), in which both the trial vector generation scheme and the control parameters Cr and F are gradually self-adapted by learning from their previous behaviour. In order to tune the random parameters, other variants [39], [1], [9] were also proposed.

Zhang et al. [61] proposed JADE, an adaptive differential evolution that improves optimization performance through a new mutation strategy with an optional external archive. JADE uses a neighborhood-based mutation strategy to enhance the performance of DE, and the maximum size of its external archive A is bounded by the population size. Multiple best solutions are used to balance exploration and exploitation. The performance of JADE was evaluated on a set of 20 benchmark functions; for higher dimensional problems, JADE with an external archive shows promising results.

- Particle Swarm Optimization

Particle swarm optimization (PSO) is a population-based stochastic optimization technique developed by Eberhart and Kennedy [27] in 1995, inspired by the social behavior of bird flocking and fish schooling. PSO processes a set of randomly initialized potential solutions (particles) in the search space. Each point in the search space is associated with some objective function value. In order to find the optimal solution, particles fly in the search space with some velocity and change their location using the current velocity:

V(t+1) = w × V(t) + c1 × r1 × (pbest − X(t)) + c2 × r2 × (gbest − X(t))    (6)

On the basis of velocity, particles change their location in the search space as follows:

X(t+1) = X(t) + V(t+1)    (7)

Where
- pbest and gbest are the personal best and global best positions, respectively.
- X(t) is the current position of the particle.
- V(t) is the current velocity of the particle.
- w is the inertia weight.
- c1 and c2 are cognitive and social behavioural constants.
- r1 and r2 are random numbers generated between [0,1].

Unlike GA and DE, PSO does not have mutation, crossover, or selection operators. After the first proposal of PSO, many variants have been developed to improve its performance and to solve real-world problems. At present, PSO is used for neural network training [12], controller parameter setting [58], multiobjective optimization [11], and so on.
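As a concrete illustration of the DE operators summarised earlier in this section, a minimal DE/rand/1/bin loop is sketched below. The values of NP, F, Cr, and the sphere objective are typical illustrative choices, not values prescribed by any of the cited works.

```python
import random

NP, F, Cr, D = 20, 0.5, 0.9, 5   # population size, scale factor, crossover rate, dimension

def cost(x):
    # toy minimisation objective: sphere function (assumption for this sketch)
    return sum(v * v for v in x)

pop = [[random.uniform(-5, 5) for _ in range(D)] for _ in range(NP)]
for G in range(200):
    for i in range(NP):
        # mutation: three mutually distinct indices, all different from i
        r1, r2, r3 = random.sample([j for j in range(NP) if j != i], 3)
        v = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j]) for j in range(D)]
        # binomial crossover: take v_j with probability Cr, else keep x_j;
        # jrand guarantees at least one gene comes from the mutant
        jrand = random.randrange(D)
        u = [v[j] if (random.random() <= Cr or j == jrand) else pop[i][j]
             for j in range(D)]
        # selection: greedy one-to-one replacement
        if cost(u) <= cost(pop[i]):
            pop[i] = u

best = min(pop, key=cost)
```

The one-to-one greedy selection is what makes DE's population monotonically non-worsening per slot, in contrast to PSO, which, as noted above, has no selection operator at all.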
- DESCRIPTION OF THE PROPOSED ALGORITHM
In this paper, a variant of IEAM-RP is suggested to improve its performance in higher dimensions of a given problem. In the proposed algorithm, particular attention has been given to improving the operators so that the algorithm performs better on higher dimensional problems. Any optimization algorithm tries to find optimal solutions using three steps: exploration of the search space to identify new good solutions, exploitation around the good solutions, and selection of good solutions. Selection is essential to identify those locations in the search space where the probability of finding optimal solutions is high. These locations are then exploited to check whether the global optimum lies there; if the optimal solution is not found, the search space is explored so that other good locations can be found. These steps are repeated until the global optimum is identified or the maximum number of iterations is reached. In the proposed approach, the three steps are achieved using two operators (adaptation and selection), which are described in detail below.
- Adaptation Operator
The main purpose of this operator is to create the offspring population from the parent or intermediate population. The operator divides the population into two parts: one contains the solutions whose fitness values are smaller than the current environmental fitness, and the other contains the solutions whose fitness values are higher. This division measures the deviation of the solutions from the current environmental fitness. In the case of minimization, the solutions with smaller fitness than the current environmental fitness are better than those with higher fitness. In the proposed algorithm, different strategies are adopted for the two parts to generate the offspring population. The environmental fitness Favg is the average fitness of the population:
Favg = (Σ_{i=1}^{N} Fi) / N    (8)
Where N is the initial population size and Fi is the fitness value of the ith member of the population. Each member whose fitness value is higher than the current environmental fitness changes its location as follows:
PH+1 = permute(PH)    (9)

where PH is the position vector of a member that wants to change its genome structure, and permute(PH) is one of the possible orderings of that vector. To understand this, consider the position vector PH = {1.2, 3.7, 2.4}. For this position vector there are a total of 6 possibilities: {1.2, 3.7, 2.4}, {1.2, 2.4, 3.7}, {3.7, 1.2, 2.4}, {3.7, 2.4, 1.2}, {2.4, 1.2, 3.7}, and {2.4, 3.7, 1.2}. Permute(PH) returns one of these permutations at random.
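The permutation-based exploration of eq. (9) can be sketched directly; the helper name `permute` mirrors the notation above, and the example vector is the one from the text.

```python
import itertools
import random

def permute(position):
    # eq. (9): return one random rearrangement of the coordinates
    p = position[:]
    random.shuffle(p)        # picks one of the len(p)! orderings
    return p

P_H = [1.2, 3.7, 2.4]
all_orders = sorted(set(itertools.permutations(P_H)))   # the 6 possibilities
child = permute(P_H)
```

Because shuffling only reorders coordinates, the offspring keeps exactly the same multiset of coordinate values as the parent; exploration moves the solution to a different point of the search space without generating any new coordinate values.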
- Selection Operator
This operator selects the best N (initial population size) individuals from the parent and offspring populations for the next generation. To do this, it combines the parent and offspring populations and selects the top N individuals on the basis of fitness values. This process continues until the optimal solution is found or the maximum number of iterations is reached.
Algorithm 1: IEAM-RP1
1: Set initial values of parameters
2: N = 100 × DIM    ▷ DIM is the problem dimension
3: Initialize each solution of the population randomly
4: repeat
5:   for g = 1 to MaxGen do
6:     Evaluate the fitness value of each solution
7:     Generate the offspring population using Adaptation
8:     Update the population using Selection
9:   end for
10: until the stopping criterion is met or the optimal solution is found
Algorithm 2: Adaptation
1: Compute the average fitness Favg of the current population    ▷ use eq. 8
2: On the basis of Favg, divide the population into two parts
3: if IF ≥ Favg then    ▷ IF = individual fitness
4:   Apply random exploration using eq. 9 to find other promising solutions
5: else
6:   Apply exploitation using eq. 10 around the good solutions
7: end if
PL+1 = PL + β × (best_position − worst_position)    (10)

Here, in equation 10, best_position and worst_position are the position vectors of the solutions with the lowest and highest fitness values respectively, and β is a random number between 0 and 1. PL denotes the position vectors of the solutions whose fitness values are smaller than the current environmental fitness, and PL+1 the corresponding updated position vectors. The difference best_position − worst_position provides a fixed bandwidth to the members of the population so that they can evolve properly in the problem search space. PH+1 and PL+1 are then combined to form the offspring population created by the adaptation operator.
Algorithm 3: Selection
1: Combine the parent and offspring populations
2: Select the N distinct best individuals from the combined population
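Algorithms 1-3 can be combined into a single end-to-end sketch. This is an illustrative reading of the pseudocode, not the authors' code: the weighted-sphere objective and the small population (the paper prescribes N = 100 × DIM) are assumptions made here so the sketch runs quickly.

```python
import random

DIM, N, MAX_GEN = 5, 40, 150   # small illustrative settings

def fitness(x):
    # toy objective: weighted sphere, chosen so that permuting
    # coordinates (exploration) actually changes the fitness
    return sum((i + 1) * v * v for i, v in enumerate(x))

def adaptation(pop):
    fits = [fitness(p) for p in pop]
    f_avg = sum(fits) / len(fits)          # eq. (8)
    best = min(pop, key=fitness)
    worst = max(pop, key=fitness)
    beta = random.random()
    children = []
    for p, f in zip(pop, fits):
        if f >= f_avg:
            # eq. (9): random exploration by permuting coordinates
            q = p[:]
            random.shuffle(q)
            children.append(q)
        else:
            # eq. (10): exploitation with the fixed bandwidth best - worst
            children.append([v + beta * (b - w)
                             for v, b, w in zip(p, best, worst)])
    return children

pop = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(N)]
for g in range(MAX_GEN):
    # selection (Algorithm 3): keep the best N of parents + offspring
    pop = sorted(pop + adaptation(pop), key=fitness)[:N]

best = pop[0]
```

Roughly half the population (the worse-than-average part) explores while the other half exploits, which is the exploration/exploitation split the paper credits for the improved high-dimensional behaviour.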
- DESCRIPTION OF EXPERIMENTAL SETUP AND BENCHMARK FUNCTIONS
We evaluated the performance of IEAM-RP1 over the 24 Black-Box Optimization Benchmarking (BBOB) functions [20] using the COmparing Continuous Optimizers (COCO) framework [19], [14]. The results generated by IEAM-RP1 are compared with other existing state-of-the-art algorithms. To analyze the performance of the different algorithms, both uni-modal and multi-modal functions have been taken.
- Experimental Setup
The search domain for all functions is [−5, 5]^D. In the experiments, all functions are tested for dimensionalities D = 2, 3, 5, 10, and 20. If foptimal is the optimal fitness value defined for each benchmark function individually and ∆f is the required precision, then ftarget = foptimal + ∆f is the target value of the function. The number of trials for each function is 15. The maximum allowed function evaluations (FEs) for each trial are calculated using eq. 11.
Maximum_FEs = (1000 × Population_Size × Dimension) / 2    (11)

where

Population_Size = 100 × Dimension    (12)
All experiments are performed using MATLAB 2017a on a Windows 7 desktop with a Core i5 @ 2.80 GHz processor and 4 GB RAM.
- Benchmark Functions
The Black-Box Optimization Benchmarking (BBOB) suite provides 24 noiseless real-parameter single-objective benchmark functions. These functions are classified into five groups: separable functions (f1–f5), functions with low or moderate conditioning (f6–f9), uni-modal functions with high conditioning (f10–f14), multi-modal functions with adequate global structure (f15–f19), and multi-modal functions with weak global structure (f20–f24), as listed in Table I. All benchmark functions are scalable with the dimension D. Most functions have no specific value of their optimal solution; all functions have an artificially chosen optimal function value [20].
Table I: BBOB Benchmark Functions

| Group Name | f# | Function Name |
| --- | --- | --- |
| Separable functions | f1 | Sphere |
| | f2 | Ellipsoidal |
| | f3 | Rastrigin |
| | f4 | Büche-Rastrigin |
| | f5 | Linear Slope |
| Functions with low or moderate conditioning | f6 | Attractive Sector |
| | f7 | Step Ellipsoidal |
| | f8 | Rosenbrock, original |
| | f9 | Rosenbrock, rotated |
| Functions with high conditioning and unimodal | f10 | Ellipsoidal |
| | f11 | Discus |
| | f12 | Bent Cigar |
| | f13 | Sharp Ridge |
| | f14 | Different Powers |
| Multi-modal functions with adequate global structure | f15 | Rastrigin |
| | f16 | Weierstrass |
| | f17 | Schaffers F7 |
| | f18 | Schaffers F7, moderately ill-conditioned |
| | f19 | Composite Griewank-Rosenbrock F8F2 |
| Multi-modal functions with weak global structure | f20 | Schwefel |
| | f21 | Gallagher's Gaussian 101-me Peaks |
| | f22 | Gallagher's Gaussian 21-hi Peaks |
| | f23 | Katsuura |
| | f24 | Lunacek bi-Rastrigin |
- EXPERIMENTAL RESULTS AND DISCUSSION
The performance of IEAM-RP1 is compared with other state-of-the-art algorithms using convergence rate and accuracy of solutions as metrics. These algorithms are: JADEctpb [42], CMAES [21], BIPOP-CMAES [48], Simultaneous Optimistic Optimization (SOO) [36], CMA-CSA [4], IEAM-RP [52], GP1-CMAES and GP5-CMAES [5], and BSif [41]. Datasets for all the above-mentioned algorithms can be downloaded from http://coco.gforge.inria.fr. Figures 1–5 show the convergence graphs comparing the proposed algorithm against 14 other state-of-the-art algorithms on the BBOB benchmark functions. The experimental results and convergence graphs show that the proposed algorithm is very competitive with the other algorithms. In IEAM-RP, only one member of the population was used to explore and the remaining members were used for exploitation; due to this, the algorithm could not find the optimal solution efficiently. In the proposed algorithm (IEAM-RP1), approximately half the population explores the problem search space for better solutions while the remaining population exploits the search space around the good solutions. This technique proved very promising and improved the overall performance of the algorithm. In the convergence graphs, all values are divided by dimension and plotted as log10 values versus dimension, for ∆f = 10^{1, 0, −1, −2, −3, −5, −8}. Numbers above the ERT symbols (where they appear) indicate the number of trials reaching the respective target; the light thick line with diamonds indicates the respective best result of BBOB 2009 for ∆f = 10^−8.
In the proposed algorithm, the set of solutions whose fitness values are lower than the current environmental fitness guides the other solutions to improve their fitness values. Due to this, the algorithm finds a globally optimal solution as early as possible, which increases its convergence rate. To avoid stagnation on multi-modal benchmark functions, exploration is used: it covers a larger search space and helps discover new solutions in unexplored regions. Table II gives the position of IEAM-RP1 against 14 state-of-the-art algorithms; from the table, it is clear that the proposed algorithm is very competitive with the other algorithms.
Excessive exploitation can cause solutions to get stuck in local optima (premature convergence), while excessive exploration may waste time on weak solutions (random search) rather than utilizing the good solutions already found in the search space. A good algorithm should balance the probabilities of applying exploitation and exploration, and in the proposed algorithm careful attention has been given to this balance, making it more powerful for both uni-modal and multi-modal benchmark functions. The competitive results of the proposed algorithm can be attributed to this successful balance. Moreover, the exploration applied to struggling solutions is capable of reaching unexplored regions of the problem search space, while exploitation is applied to the solutions that have good fitness values, making them search around the good solutions. The way exploitation works and its mathematical expression in the
Figure 1: Convergence graphs of IEAM-RP1 and the compared algorithms on the 2-D BBOB functions, shown per function group and for all functions f1–f24 in panels (A)–(F); the x-axis is log10 of (# f-evals / dimension).
Figure 2: The same comparison on the 3-D BBOB functions (panels (A)–(F), as in Figure 1).
Figure 3: The same comparison on the 5-D BBOB functions (panels (A)–(F), as in Figure 1).
Table II: Position of IEAM-RP1 among the compared algorithms, per function group and dimension (Separ: separable; Lcond: low/moderate conditioning; Hcond: high conditioning; Multi: multi-modal, adequate global structure; Multi2: multi-modal, weak global structure)

| Dim / Func. Group | Separ | Lcond | Hcond | Multi | Multi2 | All |
| --- | --- | --- | --- | --- | --- | --- |
| 2-D | 3 | 5 | 7 | 9 | 6 | 6 |
| 3-D | 4 | 6 | 6 | 6 | 9 | 7 |
| 5-D | 4 | 5 | 9 | 6 | 8 | 6 |
| 10-D | 5 | 6 | 9 | 6 | 8 | 6 |
| 20-D | 8 | 5 | 11 | 4 | 9 | 6 |
algorithm are very effective for achieving the desired goal.
- APPLICATION OF IEAM-RP1 IN COMMUNITY DETECTION OF COMPLEX NETWORKS
- INTRODUCTION
Many real-world complex systems can be treated as networks, such as the world-wide web, the Internet, communication networks, biochemical networks, collaboration networks, biological networks, and transportation networks. These networks are called complex networks and are modeled as graphs, where nodes (vertices) represent objects and edges (links) represent the interactions among those objects.
An important problem in the study of complex networks is the detection of community structure [16], which has attracted much attention across interdisciplinary fields [59], [55]. These community structures, also referred to as clusters [15], are communities with dense intra-connections and sparse inter-connections. Community detection discovers the hidden community structure and is used for discovering a complex network's topology and understanding its functions [10], [7], [15]. The number of links inside a community should be higher than the number of links connecting it to the remaining nodes of the graph.
In particular, community detection is the process of dividing the nodes of a network into groups such that connections are dense within each group and sparse between groups.
[Figure: ECDF runtime distribution plots, panels (D)-(F) for f20-24,5-D and f1-24,5-D (all functions), followed by panels (A)-(C) for f6-9,10-D and f10-14,10-D; x-axis: log10 of (# f-evals / dimension); legend: IEAM-RP1, IEAM-RP, and the comparison algorithms (CMA-ES variants, JADE variants, SOO, GPSO1, BSif, best 2009).]
In the last two decades, much research effort has been devoted to discovering communities in networks, and numerous algorithms have been developed in fields including data mining, statistics, physics, and evolutionary computation. The approaches for detecting communities in a network can broadly be classified into three categories: divisive hierarchical methods, agglomerative hierarchical methods [26], and optimization methods. Divisive hierarchical methods for detecting communities are introduced in [8], [16], [31], and [38]; these methods take the complete network as input, detect the edges that connect different communities, and remove them. Agglomerative hierarchical methods consider each node as a cluster and then merge similar communities recursively until the whole graph is obtained [10], [29], [43], [57]. Optimization methods define an objective function over divisions of the graph into subgraphs and try to optimize it in order to obtain the best partitioning of the network [3], [28], [49]. Among the optimization methods, evolutionary algorithms have been found very competitive due to their population based search technique. In particular, [13], [23], [29], [40], and [56] have applied genetic algorithms to discover communities in a network.
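Optimization methods of this kind need a quality function to maximize over candidate partitions; a common choice is the Newman-Girvan modularity Q. The following is a minimal illustrative sketch (the toy graph and function names are ours, not from the paper):

```python
def modularity(adj, labels):
    """Newman-Girvan modularity Q of a partition of an undirected graph.
    adj: symmetric 0/1 adjacency matrix (list of lists);
    labels: community id per node."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)  # 2m = sum of all degrees
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # observed edge minus expected edge under a random null model
                q += adj[i][j] - deg[i] * deg[j] / two_m
    return q / two_m

# Toy graph: two triangles (nodes 0-2 and 3-5) joined by one bridge edge 2-3.
adj = [[0,1,1,0,0,0],
       [1,0,1,0,0,0],
       [1,1,0,1,0,0],
       [0,0,1,0,1,1],
       [0,0,0,1,0,1],
       [0,0,0,1,1,0]]
good = modularity(adj, [1,1,1,2,2,2])  # the natural two-triangle split
bad = modularity(adj, [1,2,1,2,1,2])   # mixes the two triangles
print(good > bad)  # → True
```

Partitions that keep densely connected nodes together score higher, which is why optimization methods can search over label vectors directly.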
In this paper, IEAM-RP1 is used to detect the community structure of complex networks.
[Figure: ECDF runtime distribution plots, panels (D)-(F) for f20-24,10-D and f1-24,10-D (all functions), followed by panels (A)-(C) for f6-9,20-D and f10-14,20-D; x-axis: log10 of (# f-evals / dimension); legend: IEAM-RP1, IEAM-RP, and the comparison algorithms (CMA-ES variants, JADE variants, SOO, GPSO1, BSif, best 2009).]
- Community Definition
Let A be the adjacency matrix of the given network: an entry at position (i, j) is 1 if there is an edge from node i to node j, and 0 otherwise. Let G be a graph and S a subgraph of G. If node i belongs to S, the degree of i in S can be split as

k_i(S) = k_i^{in}(S) + k_i^{out}(S)

where k_i^{in}(S) and k_i^{out}(S) are the internal and external degrees of i in S. Here,

k_i^{in}(S) = \sum_{j \in S} A_{ij}

is the number of edges connecting i to the other nodes in S, and

k_i^{out}(S) = \sum_{j \notin S} A_{ij}

is the number of edges connecting i to the rest of the graph. The total degree of node i is k_i = \sum_{j} A_{ij}.
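The internal/external degree split can be checked with a short sketch (the adjacency matrix and function name below are hypothetical, chosen for illustration only):

```python
def degree_split(adj, i, S):
    """Split node i's degree into internal and external parts
    with respect to a node set S (i is assumed to belong to S).
    adj: symmetric 0/1 adjacency matrix as a list of lists."""
    k_in = sum(adj[i][j] for j in range(len(adj)) if j in S)
    k_out = sum(adj[i][j] for j in range(len(adj)) if j not in S)
    return k_in, k_out

# Two triangles (nodes 0-2 and 3-5) joined by one bridge edge 2-3.
adj = [[0,1,1,0,0,0],
       [1,0,1,0,0,0],
       [1,1,0,1,0,0],
       [0,0,1,0,1,1],
       [0,0,0,1,0,1],
       [0,0,0,1,1,0]]
S = {0, 1, 2}
print(degree_split(adj, 2, S))  # → (2, 1): two internal edges, one bridge
```

Note that k_in + k_out always equals the ordinary degree k_i of the node.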
[Figure: ECDF runtime distribution plots, panels (D)-(F) for f20-24,20-D and f1-24,20-D (all functions); x-axis: log10 of (# f-evals / dimension); legend: IEAM-RP1, IEAM-RP, and the comparison algorithms (CMA-ES variants, JADE variants, SOO, GPSO1, BSif, best 2009).]
A subgraph S is said to be a weak community if

\sum_{i \in S} k_i^{in}(S) > \sum_{i \in S} k_i^{out}(S)

that is, in a weak community the sum of the internal degrees of its nodes is larger than the sum of their degrees toward the rest of the network.
- Genetic Representation
On the basis of the number of communities in a network, each individual of the population can be expressed as a vector of integer values. Hence, an individual in the population contains N genes and is represented as

P_i = (p_{i1}, p_{i2}, \ldots, p_{iN}) (13)

where p_{ij} (1 ≤ j ≤ N) is a random integer value within the range [1, C], and i = 1, 2, …, P. Here, N is the number of nodes in the network under consideration and P is the population size. The dimension of each individual is therefore the same as the number of nodes in the network, and the integer value in each dimension indicates the community to which the corresponding node belongs. If p_{ij} = p_{ik}, the jth and kth nodes of the ith individual belong to the same community; otherwise, they belong to different communities. An individual position vector thus represents a partition of the network. Fig. 6(a) shows a network of ten nodes partitioned into two communities, fig. 6(b) shows the genotype structure of this network, and the connected components for the encoded genotype of fig. 6(b) are shown in fig. 6(c).
(a)
Node | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
Individual | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 2 | 1 | 2 |
(b)
Figure 6: (a) Network of 10 nodes divided into two communities (1, 2, 3, 5, 7, 9) and (4, 6, 8, 10). (b) One possible individual of the network (a). (c) Graph-based representation of the individual.
- Initialization
Individuals of the population are initialized randomly. For a population of size P and a network with N nodes, random initialization yields a matrix of size P × N, in which each individual is a gene string of length N. The value in each cell of the gene is the community number to which the corresponding node belongs; these values lie in the range [1, C], where C is the number of communities in the given network.
- Description of the Fitness Function
In this paper, modularity optimization is taken as the fitness function. Consider a network with C partitions V_1, V_2, V_3, …, V_C, where V_i is the vertex set of subgraph G_i, and let A be the adjacency matrix of the given network. Mathematically, the modularity density is defined as

D = \sum_{i=1}^{C} \frac{L(V_i, V_i) - L(V_i, \bar{V}_i)}{|V_i|} (14)

where L(V_1, V_2) = \sum_{i \in V_1, j \in V_2} A_{ij}, L(V_1, \bar{V}_2) = \sum_{i \in V_1, j \in \bar{V}_2} A_{ij}, and \bar{V}_2 = V − V_2. The first term represents the degree of intra-connections inside communities, and the second term represents the degree of inter-connections between communities. The value of D varies from 0 to 1, and a value approaching 1 indicates a good partitioning of the network. In general, the modularity value of real networks typically falls in the range (0.3, 0.7). The problem of community detection can therefore be regarded as the problem of maximizing the modularity value given in equation 14.
- Steps of IEAM-RP1 for Community Detection
The steps of the IEAM-RP1 method for solving the community detection problem are given below:
Step 1: Initialize P (population size), the maximum number of iterations, and N (number of nodes). The dimension of the problem is the same as the number of nodes in the network.
Step 2: Initialize the population randomly. This gives a matrix of size P × N.
Step 3: Compute the fitness value of the initial population using equation 14.
Step 4: Initialize the iteration counter to 1.
Step 5: Apply adaptation operator 2 on the population to create the offspring population.
Step 6: Apply selection operator 3 to select the best P members for the next generation.
Step 7: Increment the iteration counter by 1.
Step 8: If the iteration counter is less than the maximum number of iterations, go to step 5; otherwise, go to step 9.
Step 9: The best member of the population gives the optimal partition of the network.
- EXPERIMENTAL RESULTS AND ANALYSIS
In this section, we analyze the effectiveness of the proposed approach on real-world networks for which the partitioning is known, and compare the results with existing state-of-the-art methods. The maximum number of function evaluations was set to 5000 for each algorithm. The proposed approach was found to be very competitive with the other existing algorithms.
- Description of the Real-life Data Sets
In this paper, four standard real-world networks have been taken into consideration: the Zachary's Karate Club [2], [17], the American College Football [2], [17], the Bottlenose Dolphins [2], [17], and the Krebs' books on American politics [2], [50]. The community structures of these networks are already known, and their details can also be found at "http://www-personal.umich.edu/∼mejn/netdata/". The description of the four networks is given in table III.
- Comparison of Experimental Results of IEAM-RP1 With Other Algorithms
In this subsection, IEAM-RP1 is compared with other single objective algorithms (GA [51], FM [10], BGLL [8], MODBSA/D [62], and IEAM-RP [52]) on the four real-world networks. Shi et al. [51] proposed a genetic algorithm based approach for modularity maximization in community detection, in which a locus-based adjacency encoding is used to represent the community partitions; this encoding enables the algorithm to determine the number of communities automatically. A hierarchical agglomeration based algorithm was developed by Clauset et al. [10] for community detection in complex networks. At the beginning of this algorithm, each node is treated as an isolated community, so the numbers of nodes and communities are initially the same; then two communities are merged whenever their amalgamation produces the largest modularity, and the process is repeated until no further improvement can be achieved. Blondel et al. [8] developed a heuristic method, based on modularity maximization, to detect the community structure of large networks. The method consists of two phases that are repeated iteratively: at each phase, the nodes or communities of the network are merged with the neighbors for which the gain in modularity is largest, until no further improvement can be achieved. Feng et al. [62] proposed a multiobjective discrete backtracking search optimization algorithm with decomposition (MODBSA/D) for community detection, in which the ratio cut (RC) and the negative ratio association (NRA) are optimized simultaneously; NRA and RC measure the density of intra-community and inter-community connections, respectively, while the modularity and the NMI are used as the two metrics to be maximized to uncover the community structure in complex networks.
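The pipeline above — integer encoding of equation 13, modularity-style fitness, and elitist selection — can be sketched as a generic population loop. The mutation used here is a simple random-reassignment placeholder, not the actual IEAM-RP1 adaptation operator (which, like the selection operator, is defined earlier in the paper); the toy graph and all names are ours:

```python
import random

def modularity(adj, labels):
    """Newman-Girvan modularity of a labelled partition."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)
    return sum(adj[i][j] - deg[i] * deg[j] / two_m
               for i in range(n) for j in range(n)
               if labels[i] == labels[j]) / two_m

def detect_communities(adj, C, pop_size=20, iters=200, seed=0):
    """Population-based search over integer-encoded partitions:
    each individual assigns every node a community id in [1, C].
    The single-gene reassignment below is a stand-in for the real
    IEAM-RP1 adaptation operator."""
    rng = random.Random(seed)
    n = len(adj)
    pop = [[rng.randint(1, C) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(iters):
        offspring = []
        for ind in pop:
            child = ind[:]
            child[rng.randrange(n)] = rng.randint(1, C)  # reassign one node
            offspring.append(child)
        # elitist selection: keep the best pop_size of parents + offspring
        pop = sorted(pop + offspring, key=lambda p: modularity(adj, p),
                     reverse=True)[:pop_size]
    return pop[0]

# Two triangles (nodes 0-2 and 3-5) joined by one bridge edge 2-3.
adj = [[0,1,1,0,0,0],
       [1,0,1,0,0,0],
       [1,1,0,1,0,0],
       [0,0,1,0,1,1],
       [0,0,0,1,0,1],
       [0,0,0,1,1,0]]
best = detect_communities(adj, C=2)
# The optimal two-triangle split has Q = 5/14 ≈ 0.357.
print(round(modularity(adj, best), 3))
```

On this tiny graph the loop reliably recovers the two-triangle split; the real algorithm replaces the placeholder mutation with its adaptation operator and uses equation 14 as the fitness.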
Table IV shows that the maximum and mean modularity values for all networks are the largest for IEAM-RP1. Moreover, it is evident from the table that the values obtained by IEAM-RP1 on all the networks differ considerably from those of the other algorithms. For the karate network, the best and worst modularity values were obtained by IEAM-RP1 and FM, respectively; thus, among all algorithms, IEAM-RP1 extracts the community structure of the karate network most efficiently, whereas FM can hardly obtain the optimal community partition for this network. For all the networks, the second best modularity value is achieved by IEAM-RP. For the Dolphin, Football, and Polbooks networks, the third best modularity value was obtained by MODBSA/D, MODBSA/D, and GA, respectively. Hence, the experimental modularity results show that IEAM-RP1 is a strong alternative algorithm for detecting community structure in complex networks. The better results of IEAM-RP1 in modularity optimization can be attributed to the convergence rate and diversity preservation capability that the algorithm exhibited on single objective optimization problems; its balance of exploitation and exploration during the search was found very effective for this objective. Figures 7 and 8 show the partitions of the karate and dolphin networks, respectively, obtained by IEAM-RP1 at the highest modularity value; these partitions are very close to the true networks.
Table III: Description of the real-world networks
Networks | #Nodes | #Edges | #Clusters | Description |
Karate | 34 | 78 | 2 | Zachary’s karate club |
Dolphins | 62 | 159 | 2 | Dolphin social network |
Polbooks | 105 | 441 | 2 | Books about US politics |
Football | 115 | 613 | 12 | American College football |
Table IV: Comparison of modularity values obtained on the real-world networks
Algorithm | Modularity (Q) | Karate | Dolphin | Football | Polbooks |
GA | Maximum | 0.4181 | 0.5015 | 0.5927 | 0.5254 |
 | Mean | 0.4143 | 0.4973 | 0.5881 | 0.5223 |
 | Std | 0.0038 | 0.0008 | 0.0071 | 0.0054 |
FM | Maximum | 0.3817 | 0.5023 | 0.5756 | 0.5026 |
 | Mean | 0.3627 | 0.5014 | 0.5712 | 0.5007 |
 | Std | 0.0032 | 0.0005 | 0.0056 | 0.0014 |
BGLL | Maximum | 0.4154 | 0.5157 | 0.6041 | 0.4921 |
 | Mean | 0.4123 | 0.5113 | 0.6012 | 0.4903 |
 | Std | 0.0009 | 0.0004 | 0.0087 | 0.0029 |
MODBSA/D | Maximum | 0.4197 | 0.5254 | 0.6047 | 0.5234 |
 | Mean | 0.4197 | 0.5202 | 0.6038 | 0.5216 |
 | Std | 0.0000 | 0.0008 | 0.0057 | 0.0027 |
IEAM-RP | Maximum | 0.6443 | 0.6218 | 0.6121 | 0.6045 |
 | Mean | 0.6211 | 0.6205 | 0.6109 | 0.5915 |
 | Std | 0.0122 | 0.0021 | 0.0061 | 0.0076 |
IEAM-RP1 | Maximum | 0.6919 | 0.6884 | 0.6681 | 0.6545 |
 | Mean | 0.6871 | 0.6705 | 0.6409 | 0.6493 |
 | Std | 0.0092 | 0.0345 | 0.0678 | 0.0064 |
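The claim that IEAM-RP1 attains the largest maximum modularity on every network can be checked mechanically against the values of table IV (the dictionary below simply transcribes the table's "Maximum" rows):

```python
# Maximum modularity values from table IV; columns: Karate, Dolphin,
# Football, Polbooks.
max_q = {
    "GA":       [0.4181, 0.5015, 0.5927, 0.5254],
    "FM":       [0.3817, 0.5023, 0.5756, 0.5026],
    "BGLL":     [0.4154, 0.5157, 0.6041, 0.4921],
    "MODBSA/D": [0.4197, 0.5254, 0.6047, 0.5234],
    "IEAM-RP":  [0.6443, 0.6218, 0.6121, 0.6045],
    "IEAM-RP1": [0.6919, 0.6884, 0.6681, 0.6545],
}
networks = ["Karate", "Dolphin", "Football", "Polbooks"]
for k, net in enumerate(networks):
    winner = max(max_q, key=lambda a: max_q[a][k])
    print(net, winner)  # IEAM-RP1 wins each column
```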
- CONCLUSION AND FUTURE WORK
In this paper, a variant of IEAM-RP named "IEAM-RP1" is developed to solve single objective optimization problems.
The main contribution of this algorithm is to improve the overall performance of IEAM-RP by balancing exploitation and exploration. The performance of the developed algorithm was tested on the 24 benchmark functions of the COCO framework. Further, the algorithm was applied to the real-world problem of community detection in complex networks. To analyze the performance of the proposed approach, four standard real-world networks with known cluster sizes were taken, and modularity maximization was used as the objective function.
The convergence graphs for the BBOB benchmark functions have shown the effectiveness of the proposed algorithm; in most cases its performance is better than that of IEAM-RP. In the real-world application of community detection, the proposed algorithm outperformed all the other compared algorithms, and the experimental values of table IV indicate that IEAM-RP1 can be considered an alternative algorithm for extracting community structure in complex networks. In the future, we intend to extend IEAM-RP1 to handle multiobjective optimization problems and then apply the multiobjective version to real-world problems.