
Cassandra suddenly hangs and returns WindowsFileSystemException: "The process is inaccessible because the file is in use by another process"

  • junsung kang  · 8 months ago

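    I am running a product.

    I am using the following versions:

    The server has 128 GB of total memory, of which I am using 24 GB. It runs Windows Server 2019 Standard.
    In addition, "nodetool flush" runs every four hours.

    If you look at the Cassandra log below, you can see that it keeps working until 14:47:05.865.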

    [ 00:47:06.117 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 00:47:06,116 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 01:47:06.098 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 01:47:06,097 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 02:46:58.241 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [CompactionExecutor:9301] 2023-09-04 02:46:58,240 AutoSavingCache.java:395 - Saved KeyCache (527 items) in 7 ms ]
    [ 02:47:06.080 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 02:47:06,080 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 03:47:06.062 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 03:47:06,061 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 04:47:06.044 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 04:47:06,043 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 05:47:06.026 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 05:47:06,025 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 06:46:58.152 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [CompactionExecutor:9391] 2023-09-04 06:46:58,151 AutoSavingCache.java:395 - Saved KeyCache (527 items) in 7 ms ]
    [ 06:47:06.008 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 06:47:06,008 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 07:47:05.990 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 07:47:05,989 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 08:47:05.972 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 08:47:05,971 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 09:47:05.953 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 09:47:05,952 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 10:46:58.065 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [CompactionExecutor:9480] 2023-09-04 10:46:58,064 AutoSavingCache.java:395 - Saved KeyCache (527 items) in 7 ms ]
    [ 10:47:05.936 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 10:47:05,936 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 11:47:05.918 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 11:47:05,917 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 12:47:05.901 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 12:47:05,900 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 13:47:05.884 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 13:47:05,883 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 14:46:57.977 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [CompactionExecutor:9568] 2023-09-04 14:46:57,976 AutoSavingCache.java:395 - Saved KeyCache (527 items) in 7 ms ]
    [ 14:47:05.865 ] [ INFO  ] [ .g.n.e.c.a.Cassandra ] [ lambda$accept$0      ] [ 62  ] [ INFO  [IndexSummaryManager:1] 2023-09-04 14:47:05,865 IndexSummaryRedistribution.java:78 - Redistributing index summaries ]
    [ 15:21:18.913 ] [ INFO  ] [ .c.EmbeddedCassandra ] [ stop                 ] [ 85  ] [ Stops EmbeddedCassandra[name='cassandra-1', version='3.11.7'] ]
    [ 15:21:21.915 ] [ WARN  ] [ WindowsCassandraNode ] [ stop                 ] [ 111 ] [ java.lang.Process.destroyForcibly() has been called for 'WindowsCassandraNode[pid='-1', exitValue='not exited']'. The behavior of this method is undefined, hence Cassandra's node could be still alive ]
    [ 15:21:21.916 ] [ INFO  ] [ dedCassandraDatabase ] [ stop                 ] [ 131 ] [ EmbeddedCassandraDatabase[name='cassandra-1', version='3.11.7', node=WindowsCassandraNode[pid='-1', exitValue='1']] has been stopped ]
    [ 15:21:21.918 ] [ ERROR ] [ dedCassandraDatabase ] [ stop                 ] [ 137 ] [ Working Directory 'C:\Windows\TEMP\apache-cassandra-3.11.7-3621485632518011638' has not been deleted ]
    java.nio.file.FileSystemException: C:\Windows\TEMP\apache-cassandra-3.11.7-3621485632518011638\.toDelete: The process is inaccessible because the file is in use by another process.
    
        at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:86)
        at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:97)
        at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:102)
        at sun.nio.fs.WindowsFileSystemProvider.implDelete(WindowsFileSystemProvider.java:269)
        at sun.nio.fs.AbstractFileSystemProvider.deleteIfExists(AbstractFileSystemProvider.java:108)
        at java.nio.file.Files.deleteIfExists(Files.java:1165)
        at com.github.nosan.embedded.cassandra.commons.util.FileUtils$1.visitFile(FileUtils.java:82)
        at com.github.nosan.embedded.cassandra.commons.util.FileUtils$1.visitFile(FileUtils.java:78)
        at java.nio.file.Files.walkFileTree(Files.java:2670)
        at java.nio.file.Files.walkFileTree(Files.java:2742)
        at com.github.nosan.embedded.cassandra.commons.util.FileUtils.delete(FileUtils.java:78)
        at com.github.nosan.embedded.cassandra.EmbeddedCassandraDatabase.stop(EmbeddedCassandraDatabase.java:134)
        at com.github.nosan.embedded.cassandra.EmbeddedCassandra.doStop(EmbeddedCassandra.java:157)
        at com.github.nosan.embedded.cassandra.EmbeddedCassandra.stop(EmbeddedCassandra.java:86)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.springframework.beans.factory.support.DisposableBeanAdapter.invokeCustomDestroyMethod(DisposableBeanAdapter.java:339)
        at org.springframework.beans.factory.support.DisposableBeanAdapter.destroy(DisposableBeanAdapter.java:273)
        at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroyBean(DefaultSingletonBeanRegistry.java:571)
        at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingleton(DefaultSingletonBeanRegistry.java:543)
        at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingleton(DefaultListableBeanFactory.java:1072)
        at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.destroySingletons(DefaultSingletonBeanRegistry.java:504)
        at org.springframework.beans.factory.support.DefaultListableBeanFactory.destroySingletons(DefaultListableBeanFactory.java:1065)
        at org.springframework.context.support.AbstractApplicationContext.destroyBeans(AbstractApplicationContext.java:1060)
        at org.springframework.context.support.AbstractApplicationContext.doClose(AbstractApplicationContext.java:1029)
        at org.springframework.context.support.AbstractApplicationContext.close(AbstractApplicationContext.java:978)
        at org.springframework.boot.SpringApplication.close(SpringApplication.java:1284)
        at org.springframework.boot.SpringApplication.exit(SpringApplication.java:1271)
        at kr.co.wisenut.manager.sf1.SearchManagerApplication.main(SearchManagerApplication.java:72)
    [ 15:21:21.920 ] [ INFO  ] [ .c.EmbeddedCassandra ] [ stop                 ] [ 87  ] [ EmbeddedCassandra[name='cassandra-1', version='3.11.7'] has been stopped ]
    
    

    However, if I look at the log of the application I am running, I can see that it could not reach Cassandra from around 08:44:48.596.

    [ 08:44:48.596 ] [ ERROR ] [ c.e.ExceptionHandler ] [ renderErrorResponse  ] [ 73  ] [ Query; CQL [SELECT * FROM manager_common_user;]; All host(s) tried for query failed (no host was tried); nested exception is com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried) ]
    [ 08:45:37.999 ] [ ERROR ] [ c.e.ExceptionHandler ] [ renderErrorResponse  ] [ 73  ] [ Query; CQL [SELECT * FROM manager_common_user;]; All host(s) tried for query failed (no host was tried); nested exception is com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried) ]
    [ 15:18:56.148 ] [ ERROR ] [ c.e.ExceptionHandler ] [ renderErrorResponse  ] [ 73  ] [ Query; CQL [SELECT * FROM manager_common_user;]; All host(s) tried for query failed (no host was tried); nested exception is com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried) ]
    [ 15:21:14.880 ] [ INFO  ] [ .CustomTaskScheduler ] [ shutdown             ] [ 208 ] [ Shutting down ExecutorService 'customTaskScheduler' ]
    
    

    Also, is the outage or hang related to the log entry below?

    [ 15:21:21.918 ] [ ERROR ] [ dedCassandraDatabase ] [ stop                 ] [ 137 ] [ Working Directory 'C:\Windows\TEMP\apache-cassandra-3.11.7-3621485632518011638' has not been deleted ]
    java.nio.file.FileSystemException: C:\Windows\TEMP\apache-cassandra-3.11.7-3621485632518011638\.toDelete: The process is inaccessible because the file is in use by another process.
    

    In between, no error entries were written to the Cassandra log.
    Please help me figure out what I am missing.
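
    The two logs above disagree: the application reports NoHostAvailableException from 08:44:48, while the embedded Cassandra node keeps writing routine INFO entries until 14:47:05. One way to narrow this down the next time it happens is to probe the node's native transport port from the application host. The sketch below is only an illustration: the class name `CassandraPortProbe` is hypothetical, and port 9042 is Cassandra's default native transport port, which the embedded node here may not be using.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper: checks whether a TCP port accepts connections. */
final class CassandraPortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host could not be resolved.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: 9042 is only the default; pass the real port as the second argument.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9042;
        System.out.println(host + ":" + port + " reachable = " + isReachable(host, port, 2000));
    }
}
```

    A successful connection only shows that something is listening on the port. It does not prove that the node is answering CQL queries, so treat the result as one data point next to the driver and server logs.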

    1 Answer
  •   Erick Ramirez    8 months ago

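    Based on the log entries you posted, there appears to be a problem at the filesystem level: a temporary file is locked by one process, so another process cannot access it.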
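
    To illustrate the lock described above: in the posted stack trace, the cleanup fails inside `Files.deleteIfExists` while the embedded-cassandra library walks the working directory, right after the log warns that the node "could be still alive". The following is a minimal sketch, not the library's own code, of deleting a directory tree with a few retries so that a lock which is released shortly after shutdown does not leave the directory behind. The class and method names (`RetryingDelete`, `deleteWithRetry`) and the retry parameters are assumptions made for the example. It does not address why the node stopped answering queries.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Hypothetical helper: deletes a directory tree, retrying on I/O failures. */
final class RetryingDelete {

    /** Deletes the tree rooted at root, children before parents. */
    static void deleteTree(Path root) throws IOException {
        if (Files.notExists(root)) {
            return; // nothing to do
        }
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                // On Windows this throws FileSystemException while another process holds the file open.
                Files.deleteIfExists(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.deleteIfExists(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    /** Tries deleteTree up to maxAttempts times, sleeping delayMillis between failed attempts. */
    static boolean deleteWithRetry(Path root, int maxAttempts, long delayMillis) throws InterruptedException {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                deleteTree(root);
                return true;
            } catch (IOException e) {
                System.err.println("Delete attempt " + attempt + " failed: " + e);
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Demo on a throwaway directory; the file name mirrors the one in the stack trace.
        Path dir = Files.createTempDirectory("retry-delete-demo");
        Files.createFile(dir.resolve(".toDelete"));
        System.out.println("deleted = " + deleteWithRetry(dir, 5, 200L));
    }
}
```

    If the file is still locked after every attempt, the process holding it (most likely the Cassandra child process that was not fully terminated) has to exit before the directory can be removed.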

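    The Java file I/O library that Cassandra uses is known to be problematic on Windows, particularly on NTFS. It has been a recurring source of issues for users, and we felt it was not good for the project to keep supporting Windows (see the discussion in this thread).

    We eventually dropped Windows support completely in Cassandra 4.0 (CASSANDRA-16171).

    The recommended workarounds for Windows users are: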