Use DomDocument to replace all header tags with h4 tags

Popularity: 192 Published: 2022-10-16 Tags: php replace tags document dom

Question

I've used DomDocument's getElementById() to select a div. I need to replace all the header tags within that div with h4 tags.

Recommended answer

You have not made clear in your question what concrete problem you are running into. I assume there are two parts that could raise questions.

The first is how to get hold of all the elements you want to rename; the second is how to actually rename an element.

First things first: to select all the header elements, you need to select every heading element (h1 to h6). Combined with the condition that they must also be descendants of the div with a specific id attribute, this seems like a rather complicated thing to do. With an XPath query, however, it is fairly straightforward.

For my code examples I have chosen the id "content"; the following XPath expression queries all heading elements inside that div:

(
    //div[@id="content"]//h1
    |//div[@id="content"]//h2
    |//div[@id="content"]//h3
    |//div[@id="content"]//h4
    |//div[@id="content"]//h5
    |//div[@id="content"]//h6
)

If I run this against this very page (before I posted my answer), it produces the following list of tags:

Found 8 elements:
 #00: <h1>
 #01: <h2>
 #02: <h2>
 #03: <h3>
 #04: <h3>
 #05: <h3>
 #06: <h2>
 #07: <h4>

As this shows, a single XPath query can build a list of different elements under a specific condition, such as being a descendant of the div with the given id. Here is the code at a glance:

$url = 'http://stackoverflow.com/questions/16307103/use-domdocument-to-replace-all-header-tags-with-the-h4-tags';

$dom = new DOMDocument();
$internalErrorsState = libxml_use_internal_errors(true);
$dom->loadHTMLFile($url);
libxml_use_internal_errors($internalErrorsState);
$xpath = new DOMXPath($dom);

$expression = '
(
    //div[@id="content"]//h1
    |//div[@id="content"]//h2
    |//div[@id="content"]//h3
    |//div[@id="content"]//h4
    |//div[@id="content"]//h5
    |//div[@id="content"]//h6
)';

$elements = $xpath->query($expression);
echo "Found ", $elements->length, " elements:\n";
foreach ($elements as $index => $element) {
    printf(" #%02d: <%s>\n", $index, $element->tagName);
}
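The six-way union can also be written as a single location path with a predicate on the `self` axis. The following is a minimal sketch, not part of the original answer: the inline HTML string and the `sidebar` div are made-up sample data, used so the snippet runs without fetching a page.

```php
<?php
// Hypothetical sample markup standing in for the real page.
$html = '<div id="content"><h1>Title</h1><p>Text</p><h3>Sub</h3></div>'
      . '<div id="sidebar"><h2>Not selected</h2></div>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// One location path with a predicate instead of six paths joined by "|".
$expression = '//div[@id="content"]//*[self::h1 or self::h2 or self::h3'
            . ' or self::h4 or self::h5 or self::h6]';

// Collect the tag names of all matches in document order.
$tags = [];
foreach ($xpath->query($expression) as $element) {
    $tags[] = $element->tagName;
}

echo implode(', ', $tags), "\n"; // prints "h1, h3"
```

Both forms select the same nodes; the predicate form only names the div condition once, which helps if the id changes later.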

Renaming a DOMElement

So what about the second problem, renaming elements?

DOMDocument does not support this out of the box. There is a method stub (DOMDocument::renameNode(), undocumented in the current PHP manual), but if you call it you get a warning that it is not implemented:

Warning: DOMDocument::renameNode(): Not yet implemented

Instead, you need to roll your own version. Here is how it works: since you cannot rename an element with DOMDocument, all you can do is create a new element with the new name, copy all attributes of the original node onto it, move all of its children into it, and then replace the original node with the renamed one. The following function does this:

/**
 * Renames a node in a DOM Document.
 *
 * @param DOMElement $node
 * @param string     $name
 *
 * @return DOMNode
 */
function dom_rename_element(DOMElement $node, $name) {
    $renamed = $node->ownerDocument->createElement($name);

    foreach ($node->attributes as $attribute) {
        $renamed->setAttribute($attribute->nodeName, $attribute->nodeValue);
    }

    while ($node->firstChild) {
        $renamed->appendChild($node->firstChild);
    }

    return $node->parentNode->replaceChild($renamed, $node);
}
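To see the function on a single element, here is a small usage sketch. The markup, the `title` class, and the variable names are illustrative assumptions; the function body is repeated from above only so the snippet runs on its own.

```php
<?php
// Same function as above, repeated so this snippet is self-contained.
function dom_rename_element(DOMElement $node, $name) {
    $renamed = $node->ownerDocument->createElement($name);

    foreach ($node->attributes as $attribute) {
        $renamed->setAttribute($attribute->nodeName, $attribute->nodeValue);
    }

    while ($node->firstChild) {
        $renamed->appendChild($node->firstChild);
    }

    return $node->parentNode->replaceChild($renamed, $node);
}

// Hypothetical sample markup: one heading with an attribute and a child element.
$dom = new DOMDocument();
$dom->loadHTML('<div id="content"><h2 class="title">Hello <em>world</em></h2></div>');

$heading = $dom->getElementsByTagName('h2')->item(0);
dom_rename_element($heading, 'h4');

// Serialize only the div to inspect the result.
$div    = $dom->getElementsByTagName('div')->item(0);
$result = $dom->saveHTML($div);
echo $result, "\n";
```

The class attribute and the nested em element should both survive, because attributes are copied and child nodes are moved rather than re-created. Note that the function returns the old, now detached node (the return value of replaceChild), not the new h4 element.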

Combining this with the foreach loop from above, the elements can be renamed right after their tag names are printed:

$elements = $xpath->query($expression);
echo "Found ", $elements->length, " elements:\n";
foreach ($elements as $index => $element) {
    printf(" #%02d: <%s>\n", $index, $element->tagName);
    dom_rename_element($element, 'h4');
    ###################################
}

Afterwards, querying the XPath expression again returns h4 tags only:

$elements = $xpath->query($expression);
echo "Found ", $elements->length, " elements:\n";
foreach ($elements as $index => $element) {
    printf(" #%02d: <%s>\n", $index, $element->tagName);
}

Output:

Found 8 elements:
 #00: <h4>
 #01: <h4>
 #02: <h4>
 #03: <h4>
 #04: <h4>
 #05: <h4>
 #06: <h4>
 #07: <h4>
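The snippets here only print tag names. To get the modified markup back out, one option is to pass the div to DOMDocument::saveHTML(), which accepts an optional node argument. The sketch below runs the whole procedure offline; the inline HTML, the `sub` class, and the heading outside the div are made-up sample data.

```php
<?php
// Same function as in the answer, repeated so this snippet is self-contained.
function dom_rename_element(DOMElement $node, $name) {
    $renamed = $node->ownerDocument->createElement($name);
    foreach ($node->attributes as $attribute) {
        $renamed->setAttribute($attribute->nodeName, $attribute->nodeValue);
    }
    while ($node->firstChild) {
        $renamed->appendChild($node->firstChild);
    }
    return $node->parentNode->replaceChild($renamed, $node);
}

// Hypothetical sample markup: three headings inside the div, one outside.
$html = '<div id="content"><h1>One</h1><h2 class="sub">Two</h2><h4>Three</h4></div>'
      . '<h2>Outside</h2>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

$expression = '
(
    //div[@id="content"]//h1
    |//div[@id="content"]//h2
    |//div[@id="content"]//h3
    |//div[@id="content"]//h4
    |//div[@id="content"]//h5
    |//div[@id="content"]//h6
)';

// Rename every heading inside the div.
foreach ($xpath->query($expression) as $element) {
    dom_rename_element($element, 'h4');
}

$inside  = $xpath->query('//div[@id="content"]//h4')->length; // headings in the div
$outside = $xpath->query('//h2')->length;                     // untouched heading

// Serialize only the div, not the whole document.
$div    = $xpath->query('//div[@id="content"]')->item(0);
$output = $dom->saveHTML($div);

printf("h4 inside div: %d, h2 left in document: %d\n", $inside, $outside);
echo $output, "\n";
```

The heading outside the div keeps its h2 tag because the expression only matches descendants of the div with the given id.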

Full Code-Example

Here is the full code example at a glance:

<?php
/**
 * Use DomDocument to replace all header tags with the h4 tags
 * @link http://stackoverflow.com/q/16307103/367456
 */
$url = 'http://stackoverflow.com/questions/16307103/use-domdocument-to-replace-all-header-tags-with-the-h4-tags';

$dom = new DOMDocument();
$internalErrorsState = libxml_use_internal_errors(true);
$dom->loadHTMLFile($url);
libxml_use_internal_errors($internalErrorsState);
$xpath = new DOMXPath($dom);

$expression = '
(
    //div[@id="content"]//h1
    |//div[@id="content"]//h2
    |//div[@id="content"]//h3
    |//div[@id="content"]//h4
    |//div[@id="content"]//h5
    |//div[@id="content"]//h6
)';

$elements = $xpath->query($expression);
echo "Found ", $elements->length, " elements:\n";
foreach ($elements as $index => $element) {
    printf(" #%02d: <%s>\n", $index, $element->tagName);
    dom_rename_element($element, 'h4');
}

$elements = $xpath->query($expression);
echo "Found ", $elements->length, " elements:\n";
foreach ($elements as $index => $element) {
    printf(" #%02d: <%s>\n", $index, $element->tagName);
}

/**
 * Renames a node in a DOM Document.
 *
 * @param DOMElement $node
 * @param string     $name
 *
 * @return DOMNode
 */
function dom_rename_element(DOMElement $node, $name) {
    $renamed = $node->ownerDocument->createElement($name);

    foreach ($node->attributes as $attribute) {
        $renamed->setAttribute($attribute->nodeName, $attribute->nodeValue);
    }

    while ($node->firstChild) {
        $renamed->appendChild($node->firstChild);
    }

    return $node->parentNode->replaceChild($renamed, $node);
}

If you try it out, you might notice that now, after my answer, the number of heading elements has changed. I hope this is helpful!
