jmschrei / apricot

apricot implements submodular optimization for the purpose of selecting subsets of massive data sets to train machine learning models quickly. See the documentation page: https://apricot-select.readthedocs.io/en/latest/index.html

Have you ever considered implementing the multilinear extension and the continuous greedy algorithm in the apricot package?

athossun opened this issue · comments

Relaxation methods are currently one of the important approaches to submodular optimization, and they are easier to extend than highly tailored combinatorial algorithms.

btw, it is a fantastic job you have done!

Yes, I have considered it, but I didn't quite understand how it worked from the papers. This is, in part, because my time for adding new features to apricot dwindled with my new position. Do you have a good resource that could help? My development time is somewhat low but I agree that it's an important optimizer and would like to support it.

Thanks for the reply. I suggest starting with the multilinear extension of the submodular functions you already offer. At a high level, continuous greedy is a Frank-Wolfe-style algorithm that solves the relaxed problem iteratively, i.e., it maximizes F(x) over a given polytope constraint, where F is the multilinear extension and x is an n-dimensional vector (n is the size of the ground set).
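To make that concrete, here is a minimal sketch, not apricot code. It uses the standard definition F(x) = E[f(R)], where R contains each element i independently with probability x[i], and estimates it by sampling. The constraint is assumed to be a simple cardinality constraint (at most k elements), so the linear maximization step just picks the k largest estimated marginal gains. All names (`coverage`, `multilinear_estimate`, `continuous_greedy`) and the toy coverage function are hypothetical, and the final top-k rounding is a naive stand-in for a proper rounding scheme such as pipage or swap rounding.

```python
import numpy as np

def coverage(selected, sets):
    """Toy monotone submodular function: number of items covered by the chosen sets."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)

def multilinear_estimate(f, x, rng, n_draws=200):
    """Monte Carlo estimate of F(x) = E[f(R)], with i in R independently w.p. x[i]."""
    total = 0.0
    for _ in range(n_draws):
        R = np.flatnonzero(rng.random(len(x)) < x)
        total += f(R)
    return total / n_draws

def continuous_greedy(f, n, k, n_steps=20, n_draws=200, seed=0):
    """Sketch of continuous greedy under an assumed cardinality constraint |S| <= k."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for _ in range(n_steps):
        # Estimate w[i] = E[f(R + i) - f(R)] for a random set R drawn from x.
        w = np.zeros(n)
        for _ in range(n_draws):
            R = np.flatnonzero(rng.random(n) < x)
            base = f(R)
            for i in range(n):
                w[i] += f(np.append(R, i)) - base
        w /= n_draws
        # Linear maximization over {v in [0,1]^n : sum(v) <= k}: take the top-k weights.
        v = np.zeros(n)
        v[np.argsort(-w)[:k]] = 1.0
        # Move a small step toward that vertex (the Frank-Wolfe-style update).
        x += v / n_steps
    return np.clip(x, 0.0, 1.0)

# Usage on a tiny, made-up instance with 5 candidate sets and k = 2.
sets = [{0, 1, 2}, {2, 3}, {3, 4, 5}, {0, 1}, {5}]
f = lambda S: coverage(S, sets)
x = continuous_greedy(f, n=len(sets), k=2)
# Naive rounding (an assumption for illustration): keep the k largest coordinates.
selected = sorted(int(i) for i in np.argsort(-x)[:2])
estimate = multilinear_estimate(f, x, np.random.default_rng(1))
print(x, selected, estimate)
```

The sampling cost is the main practical issue: each step evaluates f roughly `n_draws * (n + 1)` times, so a real implementation would want to reuse apricot's incremental gain computations rather than call f from scratch.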

In the "Experiments" section of the paper "Differentially Private Decomposable Submodular Maximization" (AAAI-21) by Chaturvedi et al., the authors give a specific multilinear extension of the submodular function they use. I think it may help you.

btw, are all the submodular functions offered in apricot monotone? I checked the code in "base.py" and the parameter "n_samples" is required, but when maximizing a non-monotone submodular function the size of the solution generally cannot be fixed in advance.