
Maximal partition

  •  Charles Pehlivanian  ·  5 years ago

    Given an integer n and two sequences of real numbers { a_1 , ..., a_n } and { b_1 , ..., b_n }, with a_i , b_i > 0 for all i. For a given fixed m < n, let { P_1 , ..., P_m } be a partition of the set {1, ..., n}, i.e. P_1 ∪ ... ∪ P_m = {1, ..., n} with the P_i pairwise disjoint (empty intersections). I want to find a partition of size m that maximizes the expression

        sum_{j=1}^{m} ( sum_{i in P_j} a_i )^2 / ( sum_{i in P_j} b_i )
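
    In code, with a partition represented as a list of index lists (the same representation the brute-force code below uses), the objective is just the following. This is a minimal sketch; partition_score is a name I made up, and a, b are assumed to be numpy arrays:

    def partition_score(a, b, parts):
        # parts: index lists P_1, ..., P_m covering range(len(a))
        return sum(a[p].sum() ** 2 / b[p].sum() for p in parts)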

    The number of partitions of the set into m blocks is the Stirling number of the second kind S(n, m) (Bell_n_k in the code below), far too many to handle by brute force. Is there a better iterative or approximate solution?
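
    (For reference, S(n, m) = m * S(n-1, m) + S(n-1, m-1), which is exactly the recurrence Bell_n_k implements.)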

    To get a feel for the problem, the code block at the end solves it by brute force. For problems of realistic size ( n ~ 1e6, m ~ 20 ) it is unusable, but it is easy to tinker with.

    EDIT: Pre-sorting a , b according to the values of a/b always seems to give an optimal partition with increasing indices:

    a = rng.uniform(low=0.0, high=10.0, size=NUM_POINTS)
    b = rng.uniform(low=0.0, high=10.0, size=NUM_POINTS)
    
    ind = np.argsort(a/b)
    (a,b) = (seq[ind] for seq in (a,b))
    

    An example run with

    NUM_POINTS = 16
    PARTITION_SIZE = 3
    

    gives an optimal partition

    [[0, 1, 2, 3, 4, 5, 6, 7], [8, 9], [10, 11]]
    

    which is monotone in the indices. I think I can prove this. If so, the brute-force search can be restricted to contiguous partitions, of which there are only (n-1 choose m-1): still a long search, but a substantial saving. See the sketch below.
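
    A minimal sketch of that reduced, contiguous-blocks brute force (best_contiguous_partition is a name I made up; it assumes a and b are numpy arrays already sorted by a/b as above):

    from itertools import combinations

    def best_contiguous_partition(a, b, m):
        # Choose m-1 cut points in 1..n-1, i.e. (n-1 choose m-1) candidate
        # partitions into contiguous non-empty blocks, and keep the best.
        n = len(a)
        best_val, best_parts = float('-inf'), None
        for cuts in combinations(range(1, n), m - 1):
            bounds = (0,) + cuts + (n,)
            val = sum(a[lo:hi].sum() ** 2 / b[lo:hi].sum()
                      for lo, hi in zip(bounds[:-1], bounds[1:]))
            if val > best_val:
                best_val = val
                best_parts = [list(range(lo, hi))
                              for lo, hi in zip(bounds[:-1], bounds[1:])]
        return best_val, best_parts

    For NUM_POINTS = 16 and PARTITION_SIZE = 3 this enumerates only C(15, 2) = 105 candidates instead of the full set of partitions into 3 blocks.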

     import numpy as np
     import multiprocessing
     import concurrent.futures
     from functools import partial
     from itertools import islice
    
     rng = np.random.RandomState(55)
    
     def knuth_partition(ns, m):
         # Generate all partitions of the sequence ns into exactly m non-empty blocks.
         def visit(n, a):
             ps = [[] for i in range(m)]
             for j in range(n):
                 ps[a[j + 1]].append(ns[j])
             return ps
    
         def f(mu, nu, sigma, n, a):
             if mu == 2:
                 yield visit(n, a)
             else:
                 for v in f(mu - 1, nu - 1, (mu + sigma) % 2, n, a):
                     yield v
             if nu == mu + 1:
                 a[mu] = mu - 1
                 yield visit(n, a)
                 while a[nu] > 0:
                     a[nu] = a[nu] - 1
                     yield visit(n, a)
             elif nu > mu + 1:
                 if (mu + sigma) % 2 == 1:
                     a[nu - 1] = mu - 1
                 else:
                     a[mu] = mu - 1
                 if (a[nu] + sigma) % 2 == 1:
                     for v in b(mu, nu - 1, 0, n, a):
                         yield v
                 else:
                     for v in f(mu, nu - 1, 0, n, a):
                         yield v
                 while a[nu] > 0:
                     a[nu] = a[nu] - 1
                     if (a[nu] + sigma) % 2 == 1:
                         for v in b(mu, nu - 1, 0, n, a):
                             yield v
                     else:
                         for v in f(mu, nu - 1, 0, n, a):
                             yield v
    
         def b(mu, nu, sigma, n, a):
             if nu == mu + 1:
                 while a[nu] < mu - 1:
                     yield visit(n, a)
                     a[nu] = a[nu] + 1
                 yield visit(n, a)
                 a[mu] = 0
             elif nu > mu + 1:
                 if (a[nu] + sigma) % 2 == 1:
                     for v in f(mu, nu - 1, 0, n, a):
                         yield v
                 else:
                     for v in b(mu, nu - 1, 0, n, a):
                         yield v
                 while a[nu] < mu - 1:
                     a[nu] = a[nu] + 1
                     if (a[nu] + sigma) % 2 == 1:
                         for v in f(mu, nu - 1, 0, n, a):
                             yield v
                     else:
                         for v in b(mu, nu - 1, 0, n, a):
                             yield v
                 if (mu + sigma) % 2 == 1:
                     a[nu - 1] = 0
                 else:
                     a[mu] = 0
             if mu == 2:
                 yield visit(n, a)
             else:
                 for v in b(mu - 1, nu - 1, (mu + sigma) % 2, n, a):
                     yield v
    
         n = len(ns)
         a = [0] * (n + 1)
         for j in range(1, m + 1):
             a[n - m + j] = j - 1
         return f(m, n, 0, n, a)
    
     def Bell_n_k(n, k):
         ''' Number of partitions of {1,...,n} into
             k subsets, a restricted Bell number
         '''
         if (n == 0 or k == 0 or k > n): 
             return 0
         if (k == 1 or k == n): 
             return 1
    
         return (k * Bell_n_k(n - 1, k) + 
                     Bell_n_k(n - 1, k - 1)) 
    
     NUM_POINTS = 13
     PARTITION_SIZE = 4
     NUM_WORKERS = multiprocessing.cpu_count()
     INT_LIST= range(0, NUM_POINTS)
     REPORT_EACH = 10000
    
     partitions = knuth_partition(INT_LIST, PARTITION_SIZE)
     # Theoretical number of partitions, for accurate
     # division of labor
     num_partitions = Bell_n_k(NUM_POINTS, PARTITION_SIZE)
     bin_ends = list(range(0,num_partitions,int(num_partitions/NUM_WORKERS)))
     bin_ends = bin_ends + [num_partitions] if num_partitions/NUM_WORKERS else bin_ends
     islice_on = list(zip(bin_ends[:-1], bin_ends[1:]))
    
     # Have to consume it; can't split work on generator
     partitions = list(partitions)
     rng.shuffle(partitions)
     slices = [list(islice(partitions, *ind)) for ind in islice_on]
     return_values = [None] * len(slices)
     futures = [None] * len(slices)
    
     a = rng.uniform(low=0.0, high=10.0, size=NUM_POINTS)
     b = rng.uniform(low=0.0, high=10.0, size=NUM_POINTS)
     ind = np.argsort(a/b)
     (a,b) = (seq[ind] for seq in (a,b))
    
     def start_task():
         print('Starting ', multiprocessing.current_process().name)
    
     def _task(a, b, partitions, report_each=REPORT_EACH):
         # Score each partition as the sum over its blocks of (sum a)^2 / (sum b);
         # keep the best one seen in this slice.
         max_sum = float('-inf')
         arg_max = -1
         for ind,part in enumerate(partitions):
             val = 0
             for p in part:
                 val += sum(a[p])**2/sum(b[p])
             if val > max_sum:
                 max_sum = val
                 arg_max = part
             if not ind%report_each:
                 print('Percent complete: {:.{prec}f}'.
                       format(100*len(slices)*ind/num_partitions, prec=2))
         return (max_sum, arg_max)
    
     def reduce(return_values):
         return max(return_values, key=lambda x: x[0])
    
     task = partial(_task, a, b)
    
    
     with concurrent.futures.ThreadPoolExecutor() as executor:
         # Submit every slice first, then collect; calling result() inside the
         # submit loop would block and run the slices one at a time.
         for ind, work_slice in enumerate(slices):
             futures[ind] = executor.submit(task, work_slice)
         for ind, future in enumerate(futures):
             return_values[ind] = future.result()
    
    
     reduce(return_values)
    
    0 replies  |  5 years ago
  •  Zabir Al Nazi  ·  5 years ago

    I have tried to restate the problem simply with a sample input; please let me know if I missed anything.

    A = [1, 3, 2, 1, 4], B = [2, 1, 5, 3, 1], n = len(A) = len(B) = 5

    We have two lists of positive numbers.

    We need to find a set of indices S (a subset of N = {1, 2, 3, ..., n}), say S = {2, 3, 5}. That leaves a second set S' = N - S = {1, 4}.

    For S and S', the quantity (sum(A[S]))^2 / sum(B[S']) needs to be maximized.
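
    For the sample input above, S = {2, 3, 5} gives (3 + 2 + 4)^2 / (2 + 3) = 81 / 5 = 16.2.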

    As you said, an approximate solution would also work. One heuristic we can use is to choose S so that the values taken from list A are high and the values taken from list B are low.

    Since we take the square of the sum of the A subset, let's sort A and choose a sublist such that we get the maximum score.

    import numpy as np
    
    A = np.array([1, 2, 3, 4, 1, 2, 3])
    B = np.array([3, 3, 1, 2, 1, 3, 1])
    
    sorted_idx = sorted(range(len(A)), key=lambda k: A[k]) # also other sorting strategy can be used, A[k]/B[k]
    
    A_p = A[sorted_idx]
    B_p = B[sorted_idx]
    
    max_s = 0
    part_ans = -1
    
    for i in range(len(A_p)):
      cur_s = (sum(A_p[:i])**2)/sum(B_p[i:])  # prefix of the sorted A goes to S, the remaining B to S'
      if cur_s >= max_s:
        print(cur_s)
        max_s = cur_s
        part_ans = i
    
    print(f'The partitions are: {sorted_idx[:part_ans]} and {sorted_idx[part_ans:]}')  # use the best split index found, not the loop variable