Self-weighted Multiple Kernel Learning for Graph-based Clustering and Semi-supervised Classification
The paper proposes a novel multiple kernel learning framework for clustering and semi-supervised learning.
Input: kernel matrices $ \{H^i\}_{i=1}^r $, parameters $ \alpha, \beta, \gamma $
Initialize: random matrix $ S $, $ K = \sum_i H^i / r $
Repeat
(1) For each i, update the i-th column of \(S\) by $ S_i = (\gamma I + K) ^{-1} (K_{i,:} - \frac{\alpha G_i}{4} ) $
(2) Calculate \(K\) by $ K = \frac{2S^T - SS^T - I +2\beta \sum_i \omega_i H^i}{2\beta \sum_i \omega_i} $
(3) Update \(\omega\) by (9): $ \omega_i = \frac{1}{2||H^i - K||_F} $
(4) For clustering, calculate \(P\) as the eigenvectors of \(L = D - S\) corresponding to the \(c\) smallest eigenvalues. For SSL, calculate \(P\) according to $ P_u = -L_{uu}^{-1} L_{ul} Y_l $
Until stopping criterion is met.
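Put together, the loop above can be sketched in NumPy (clustering branch of step (4) only). This is an illustrative sketch, not the authors' implementation: the function name, parameter defaults, initial weights, and the vectorized one-solve column update are assumptions.

```python
import numpy as np

def smkl_cluster(Hs, c, alpha=1.0, beta=1.0, gamma=1.0, n_iter=20):
    """Sketch of the SMKL alternating optimization (clustering variant).

    Hs: list of r precomputed (n x n) kernel matrices H^i.
    """
    r, n = len(Hs), Hs[0].shape[0]
    K = sum(Hs) / r                          # K = sum_i H^i / r
    S = np.random.rand(n, n)                 # random similarity graph S
    P = np.zeros((n, c))
    w = np.full(r, 1.0 / r)                  # initial kernel weights (assumption)
    for _ in range(n_iter):
        # (1) column-wise S update, all columns in a single linear solve:
        #     S_i = (gamma*I + K)^{-1} (K_i - alpha * G_i / 4)
        G = np.square(P[:, None, :] - P[None, :, :]).sum(-1)  # G_jk = ||P_j - P_k||^2
        S = np.linalg.solve(gamma * np.eye(n) + K, K - alpha * G / 4)
        S = np.maximum(S, 0)                 # enforce the constraint S >= 0
        # (2) K update
        Hbar = sum(wi * H for wi, H in zip(w, Hs))
        K = (2 * S.T - S @ S.T - np.eye(n) + 2 * beta * Hbar) / (2 * beta * w.sum())
        # (3) weight update: w_i = 1 / (2 ||H^i - K||_F), small eps for safety
        w = np.array([1 / (2 * np.linalg.norm(H - K, 'fro') + 1e-12) for H in Hs])
        # (4) P: eigenvectors of L = D - S for the c smallest eigenvalues
        A = (S + S.T) / 2                    # symmetrize before building the Laplacian
        L = np.diag(A.sum(1)) - A
        P = np.linalg.eigh(L)[1][:, :c]      # eigh returns ascending eigenvalues
    return S, K, P
```

Final clusters would then be obtained by running, e.g., k-means on the rows of $P$, as is standard in spectral clustering.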
Self-weighted Multiple Kernel Learning
$ \min \limits_{S,P,K} T_r(K-2KS + S^TKS) + \gamma ||S||_F^2 + \alpha T_r(P^TLP) + \beta \sum_i \omega_i||H^i - K||_F^2 $
$ s.t. P^T P = I, S \geq 0 $
$ K $: target kernel matrix
$ S $: similarity graph matrix
$ \omega_i $: optimal weight
$ H^i $: the i-th kernel matrix (input)
$ \alpha, \beta,\gamma $: input parameters
$ P$: cluster indicator matrix or label matrix
Fix P, K, optimize S
Use $ \sum_{ij} \frac{1}{2} ||P_{i,:} - P_{j,:}||_2^2 s_{ij} = T_r(P^TLP) $ and set the first derivative with respect to \(S\) to zero, obtaining:
$ S_i = (\gamma I + K) ^{-1} (K_{i,:} - \frac{\alpha G_i}{4} ) $
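As a sanity check, the closed form can be verified numerically: it zeroes the gradient of the per-column objective $f(s) = s^T K s - 2 K_{:,i}^T s + \gamma s^T s + \frac{\alpha}{2} G_{:,i}^T s$ (nonnegativity of $S$ ignored here). All values below are toy data chosen for illustration, not from the paper.

```python
import numpy as np

# Toy problem: random features, a PSD kernel K, and G_jk = ||P_j - P_k||^2.
rng = np.random.default_rng(0)
n, c, alpha, gamma, i = 6, 2, 0.5, 0.1, 0
X = rng.standard_normal((n, 3))
K = X @ X.T                                  # any PSD kernel works
P = rng.standard_normal((n, c))
G = np.square(P[:, None, :] - P[None, :, :]).sum(-1)

# Closed-form column update: S_i = (gamma*I + K)^{-1} (K_{:,i} - alpha*G_i/4)
s_star = np.linalg.solve(gamma * np.eye(n) + K, K[:, i] - alpha * G[:, i] / 4)

# Gradient of the per-column objective at s_star should vanish.
grad = 2 * K @ s_star - 2 * K[:, i] + 2 * gamma * s_star + alpha * G[:, i] / 2
print(np.allclose(grad, 0, atol=1e-8))       # → True
```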
Fix S, P, optimize K
Setting the derivative with respect to \(K\) to zero gives: $ K = \frac{2S^T - SS^T - I +2\beta \sum_i \omega_i H^i}{2\beta \sum_i \omega_i} $
Fix S, K, optimize P
$ \min \limits_P T_r(P^TLP), s.t. P^TP = I $
The solution is given by the eigenvectors of \(L\) corresponding to the \(c\) smallest eigenvalues.
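A small numerical illustration of this eigenvector solution (the toy affinity matrix below is an assumption, not from the paper): the minimum of $T_r(P^TLP)$ over orthonormal $P$ equals the sum of the $c$ smallest eigenvalues of $L$, attained by the corresponding eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((8, 8)); A = (A + A.T) / 2       # symmetric toy affinity graph S
L = np.diag(A.sum(1)) - A                       # Laplacian L = D - S
c = 3
eigval, eigvec = np.linalg.eigh(L)              # eigenvalues in ascending order
P = eigvec[:, :c]                               # c smallest eigenvectors
# The objective value at this P equals the sum of the c smallest eigenvalues.
print(np.isclose(np.trace(P.T @ L @ P), eigval[:c].sum()))  # → True
```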
Unify the two fundamental components of SSL (graph construction and label inference) into a single framework. Reformulate SMKL for semi-supervised classification as:
$ \min \limits_{S,P,K}T_r(K-2KS + S^TKS) + \gamma ||S||_F^2 + \alpha T_r(P^TLP) + \beta \sum_i \omega_i||H^i - K||_F^2 $
$ s.t. S \geq 0, P_l = Y_l $
$ Y_l = [y_1,y_2,...,y_l]^T $ : label matrix
$ y_i \in R^{c \times 1} $: one-hot; $ y_{ij} = 1 $ means the i-th sample belongs to label j.
$ P = [P_l ; P_u] = [Y_l; P_u] $, where $ P_u $ is the prediction matrix for the unlabeled samples
$ P_u = - L_{uu}^{-1} L_{ul} Y_l $
Predict: $ y_i = \arg \max \limits_j P_{ij} $
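The label-inference step can be sketched as follows: given a learned graph $S$ and one-hot labels $Y_l$ for the first $l$ samples, solve $P_u = -L_{uu}^{-1} L_{ul} Y_l$ and predict by row-wise argmax. The two-cluster toy graph below is an assumption for illustration.

```python
import numpy as np

def infer_labels(S, Y_l):
    """Label propagation on graph S; labeled samples come first."""
    l = Y_l.shape[0]
    A = (S + S.T) / 2                    # symmetrize the learned graph
    L = np.diag(A.sum(1)) - A            # graph Laplacian L = D - S
    L_uu, L_ul = L[l:, l:], L[l:, :l]    # blocks for unlabeled/labeled nodes
    P_u = -np.linalg.solve(L_uu, L_ul @ Y_l)   # P_u = -L_uu^{-1} L_ul Y_l
    return P_u.argmax(1)                 # predict y_i = argmax_j P_ij

# Two triangles of three nodes each; one labeled node per cluster (0 and 1).
S = np.zeros((6, 6))
for a, b in [(0, 2), (0, 3), (2, 3), (1, 4), (1, 5), (4, 5)]:
    S[a, b] = S[b, a] = 1.0
Y_l = np.eye(2)                          # one-hot labels for nodes 0 and 1
print(infer_labels(S, Y_l))              # labels for unlabeled nodes 2-5: [0 0 1 1]
```

Nodes 2 and 3 connect only to the node labeled 0, and nodes 4 and 5 only to the node labeled 1, so propagation recovers the two clusters exactly.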
Existing parameter-weighted multiple kernel learning methods suffer from extra kernel-weight parameters that must be tuned; in contrast, SMKL assigns the weights $ \omega_i $ self-adaptively through the update in step (3).
Original post: https://www.cnblogs.com/chensy86/p/13623621.html