
Anytime algorithm using Python multiprocessing

  •  2
  • Erel Segal-Halevi  · Tech community  · 7 years ago

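    I am trying to write a Python class that runs a particular algorithm for a given amount of time, then stops and returns the latest value it found before the timeout.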

    As an example, I wrote a simple class that finds the maximum value in a vector:

    import multiprocessing
    import random
    
    import numpy as np
    
    class AnytimeAlgorithm:
        def __init__(self, vector):
            self.vector = vector
            self.result = 0
    
        def update_forever(self):
            # Keep sampling random positions and remember the largest value seen so far.
            while True:
                i = random.randint(0, len(self.vector) - 1)
                if self.vector[i] > self.result:
                    self.result = self.vector[i]
                    print("self", self, "result", self.result)
    
        def result_after(self, seconds):
            # Run update_forever in a separate process and stop it after `seconds`.
            p = multiprocessing.Process(target=self.update_forever, name="update_forever", args=())
            p.start()
            p.join(seconds)
            if p.is_alive():
                p.terminate()
            p.join()
            print("self", self, "final result", self.result)
            return self.result
    
    
    if __name__ == "__main__":
        vector = np.random.rand(10000000)
        maximizer = AnytimeAlgorithm(vector)
        print(maximizer.result_after(0.01))
    

    When I run this class, the printed result increases over time, as expected. However, the returned value is always 0! Here is a typical output:

    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.420804014071
    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.444555804935
    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.852844624467
    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.915336332491
    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.964438367823
    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.975029317702
    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.975906346116
    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.987784181209
    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.996998726143
    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.999480015562
    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> result 0.999798469992
    self <__main__.AnytimeAlgorithm object at 0x7f8e94de1cf8> final result 0
    

    What am I doing wrong?

    1 answer  |  up to 7 years ago
  •  2
  •   javidcf    7 years ago

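    When you use multiprocessing in Python, it actually creates a new, separate Python process and runs what you ask for inside it. The fact that the API is simplified to look like multithreading should not confuse you. In your main process you create an AnytimeAlgorithm object, and then you create a Process that runs one of its methods; this creates a new process and copies the state of the interpreter, so there is an AnytimeAlgorithm object in the new process too. However, these two objects are not the same. They do not even live in the same process, so they cannot (directly) share any information. Changes made to the object in the new process only affect the copy in that process, not the original one.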
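
    To make the copy semantics concrete, here is a minimal sketch. The `Counter` class and the `demo` function are made up for illustration and are not part of the question's code. The child process increments its own copy of the object, and the parent's copy stays untouched:

```python
import multiprocessing


class Counter:
    """Hypothetical stand-in for AnytimeAlgorithm: holds plain, unshared state."""

    def __init__(self):
        self.value = 0

    def bump(self):
        self.value += 1


def demo():
    counter = Counter()
    # The child process works on its own copy of `counter` and calls bump() there.
    p = multiprocessing.Process(target=counter.bump)
    p.start()
    p.join()
    # The parent's object was never modified.
    return counter.value


if __name__ == "__main__":
    print(demo())  # prints 0
```

    Calling `counter.bump()` directly in the parent would of course set `value` to 1; it is only the call made inside the child that is lost.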

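    You can check the documentation on how to share information between the main process and a spawned process, using things like pipes, queues, or shared memory. Shared memory may be a good option here: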

    import multiprocessing
    import random
    import numpy as np
    
    class AnytimeAlgorithm:
        def __init__(self, vector):
            self.vector = vector
            self.result = multiprocessing.Value('d', 0.0)
    
        def update_forever(self):
            while True:
                i = random.randint(0, len(self.vector) - 1)
                if self.vector[i] > self.result.value:
                    self.result.value = self.vector[i]
                    print("self", self, "result", self.result.value)
    
        def result_after(self, seconds):
            p = multiprocessing.Process(target=self.update_forever, name="update_forever", args=())
            p.start()
            p.join(seconds)
            if p.is_alive():
                p.terminate()
            p.join()
            print("self", self, "final result", self.result.value)
            return self.result.value
    
    
    if __name__ == "__main__":
        vector = np.random.rand(10000000)
        maximizer = AnytimeAlgorithm(vector)
        print(maximizer.result_after(0.1))
    

    Output:

    self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.01491873361800522
    self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.060776471658675835
    self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.7476611733129928
    self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9468162088782311
    self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9531978645650057
    self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9992671080742871
    self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.999293465561661
    self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9996894825552965
    self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.9998511378366163
    self <__mp_main__.AnytimeAlgorithm object at 0x0000017D26AB7898> result 0.999933119926922
    self <__main__.AnytimeAlgorithm object at 0x00000195FBDC7908> final result 0.999933119926922
    0.999933119926922
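
    Pipes, mentioned above, are another way to get the result back. The following is only a sketch under a few assumptions: `search_max` and `result_after` are illustrative names, the vector is a plain list to keep the example dependency-free, and the worker stops by itself after `seconds` instead of being terminated, so that it is never killed in the middle of a `send`. The child sends every improvement through a one-way pipe and the parent keeps the last value it receives:

```python
import multiprocessing
import random
import time


def search_max(vector, conn, seconds):
    """Sample `vector` at random for about `seconds`, sending each new best value."""
    best = 0.0
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        candidate = vector[random.randrange(len(vector))]
        if candidate > best:
            best = candidate
            conn.send(best)
    conn.close()


def result_after(vector, seconds):
    # duplex=False returns a (receive-only, send-only) pair of connections.
    recv_conn, send_conn = multiprocessing.Pipe(duplex=False)
    p = multiprocessing.Process(target=search_max, args=(vector, send_conn, seconds))
    p.start()
    # Close the parent's copy of the sending end so recv() sees EOF once the child is done.
    send_conn.close()
    best = 0.0
    try:
        while True:
            best = recv_conn.recv()  # every message is larger than the previous one
    except EOFError:
        pass
    p.join()
    return best


if __name__ == "__main__":
    vector = [random.random() for _ in range(100000)]
    print(result_after(vector, 0.1))
```

    Since only improvements are sent, the pipe carries very little traffic once the search has found a value close to the maximum.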
    

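    Note that using Value incurs some additional overhead, because access to it is synchronized between processes. Read the documentation to learn how locking works for that class, and consider writing your algorithm in a way that minimizes access to shared resources (for example, by using a temporary local variable that you write to the shared value at the end of each computation).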
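
    Here is one way that last suggestion could look, as a rough sketch rather than a drop-in replacement (`search_max` and `result_after` are illustrative names, and the vector is a plain list to keep the example dependency-free). The worker compares against a plain local variable and only touches the shared `Value` when it finds an improvement. As a further assumption, the worker watches its own deadline, so the parent normally does not need to call `terminate()` while the child might be holding the lock:

```python
import multiprocessing
import random
import time


def search_max(vector, shared_best, seconds):
    """Sample `vector` at random for about `seconds`, publishing improvements."""
    best = shared_best.value  # read the shared value once, then work on a local copy
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        candidate = vector[random.randrange(len(vector))]
        if candidate > best:  # cheap local comparison, no lock involved
            best = candidate
            shared_best.value = best  # synchronized write, only on improvement


def result_after(vector, seconds):
    shared_best = multiprocessing.Value('d', 0.0)
    p = multiprocessing.Process(target=search_max, args=(vector, shared_best, seconds))
    p.start()
    p.join(seconds + 1.0)  # allow some grace time for process start-up
    if p.is_alive():  # fallback in case the worker overruns
        p.terminate()
        p.join()
    return shared_best.value


if __name__ == "__main__":
    vector = [random.random() for _ in range(100000)]
    print(result_after(vector, 0.1))
```

    Compared with reading `self.result.value` on every iteration, the loop here acquires the lock only a handful of times, since improvements become rare once the best value is close to the maximum.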