Which is faster: 'find -exec' or 'find | xargs -0'?

Views: 165 Published: 2022-10-16 Tags: unix shell find xargs

Question

In my web application I render pages with a PHP script and then generate static HTML files from them. The static HTML files are served to users to improve performance. They eventually become stale and need to be deleted.

I am debating between two ways to write the eviction script.

The first uses a single find command, like

find /var/www/cache -type f -mmin +10 -exec rm {} \;

The second pipes the results through xargs, something like

find /var/www/cache -type f -mmin +10 -print0 | xargs -0 rm

The first form invokes rm once for each file it finds, while the second form sends all the file names to a single rm (but the file list might be very long).
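One way to check how often each form actually starts the command is to swap rm for a harmless command that prints one line per invocation. The sketch below is illustrative only: the scratch directory from mktemp, the "page N.html" file names, and the count of 50 are assumptions made for the demonstration and are not part of the original setup.

```shell
#!/bin/sh
# Hypothetical scratch directory standing in for /var/www/cache.
cache=$(mktemp -d)

# Create 50 empty files; the names contain a space on purpose.
i=1
while [ "$i" -le 50 ]; do
    : > "$cache/page $i.html"
    i=$((i + 1))
done

# Form 1: "-exec ... \;" starts the command once per matching file.
per_file=$(find "$cache" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# Form 2: xargs packs as many names as fit into each command line.
batched=$(find "$cache" -type f -print0 | xargs -0 sh -c 'echo run' sh | wc -l | tr -d ' ')

echo "per-file invocations: $per_file"
echo "batched invocations:  $batched"

# Clean up the scratch directory.
rm -rf "$cache"
```

With 50 short names, this should report 50 invocations for the first form and 1 for the second. With a much longer list, xargs would split the names across several invocations rather than a single one.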

Which form would be faster?

In my case, the cache directory is shared between a few web servers, so this is all done over NFS, if that matters for this issue.

Recommended answer

I expect the xargs version to be slightly faster, since you aren't spawning a process for each file name. That said, I would be surprised if there were much difference in practice. If you're worried about the long list xargs passes to each invocation of rm, you can use -n with xargs to limit the number of arguments per invocation. However, xargs already knows the maximum command-line length and won't exceed it.
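As a rough sketch of the batching behaviour, the script below counts invocations when xargs is capped with -n, and also tries find's own "-exec ... {} +" form, which batches names without a pipe. That second form is not mentioned in the answer itself and is shown only for comparison. The scratch directory, file names, and the cap of 10 are assumptions chosen for the demonstration.

```shell
#!/bin/sh
# Hypothetical scratch directory standing in for /var/www/cache.
cache=$(mktemp -d)

# Create 50 empty files to act as stale cache entries.
i=1
while [ "$i" -le 50 ]; do
    : > "$cache/page $i.html"
    i=$((i + 1))
done

# Cap each command line at 10 arguments; 50 files need 5 invocations.
capped=$(find "$cache" -type f -print0 | xargs -0 -n 10 sh -c 'echo run' sh | wc -l | tr -d ' ')

# "-exec ... {} +" lets find batch the names itself; "$#" is the
# number of file names handed to the single resulting command.
plus_args=$(find "$cache" -type f -exec sh -c 'echo "$#"' sh {} +)

echo "invocations with -n 10:      $capped"
echo "arguments per '-exec +' run: $plus_args"

# The system limit that xargs stays under (value varies by system).
echo "ARG_MAX: $(getconf ARG_MAX)"

# Clean up the scratch directory.
rm -rf "$cache"
```

Replacing the inner sh -c command with rm turns either pipeline back into a real eviction command. The echo stand-in only makes the batching visible without deleting anything outside the scratch directory.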
